Journal of Analytical & Pharmaceutical Research (JAPLR), ISSN: 2473-0831
Mini Review
Volume 3 Issue 4 - 2016
New Player of ncRNAs: Long Non-coding RNAs
Elif Karlik1 and Nermin Gozukirmizi2*
1Department of Biotechnology, Istanbul University, Turkey
2Department of Molecular Biology and Genetics, Istanbul University, Turkey
Received: November 04, 2016 | Published: November 15, 2016
*Corresponding author: Nermin Gözükırmızı, Department of Molecular Biology and Genetics, Istanbul University, Faculty of Science, Istanbul, 34118 Vezneciler, Turkey, Email:
Citation: Karlik E, Gozukirmizi N (2016) New Player of ncRNAs: Long Non-coding RNAs. J Anal Pharm Res 3(4): 00064. DOI: 10.15406/japlr.2016.03.00064

Abstract

Long non-coding RNAs (lncRNAs) play important roles in a wide range of biological processes as regulatory factors at the epigenetic, transcriptional and post-transcriptional levels. In this review, we summarize current knowledge of lncRNAs, including their discovery, identification, classification and functions.

Keywords: Long non-coding RNAs; Discovery of lncRNAs; Classification of lncRNAs; Post-transcriptional levels; Epigenetic; Regulatory factors

Introduction

The mechanisms underlying the functions of non-protein-coding RNAs (ncRNAs or npcRNAs), which have little or no protein-coding potential, are a fascinating area of research [1]. Based on transcript length, ncRNAs are classified as short ncRNAs (<200 nt) and long ncRNAs (lncRNAs; >200 nt). Recent high-throughput analyses, such as in silico mining of cDNA/EST data, whole-genome tiling arrays and RNA sequencing (RNA-seq), have revealed that the transcriptional landscape in eukaryotes is much more complex than had been expected [2-4]. Transcriptome analyses estimate that transcripts cover 90% of the eukaryotic genome [5]. These approaches have facilitated the identification of thousands of novel ncRNAs in many organisms, including humans, other animals and plants [6-9].

LncRNAs are arbitrarily defined as RNA transcripts that are longer than 200 nt but lack protein-coding potential. They are transcribed by RNA polymerase II or III and, in plants, additionally by polymerase IV/V [10-12]. They may be spliced or unspliced, polyadenylated or non-polyadenylated, and can be located in the nucleus or the cytoplasm. Studies have revealed that lncRNAs may represent alternatively spliced forms of known genes [13], antisense RNAs [14-17], double-stranded RNAs [18], retained introns [13,19], transcripts carrying short open reading frames [1,20,21], RNA polymerase III-derived RNAs [22] and RNA decoys that mimic miRNA targets [23].

Discovery of lncRNAs

In the 1990s, the lncRNAs H19 and Xist (X-inactive specific transcript) were discovered using traditional gene mapping approaches [24-26]. Later, HOTAIR (HOX antisense intergenic RNA) and HOTTIP (HOXA transcript at the distal tip) were discovered using tiling arrays covering the homeobox gene regions (HOX clusters) [27,28]. Using a genome-wide approach, Guttman et al. identified about 1600 novel mouse lncRNAs [8]. Since then, thousands of lncRNAs have been identified using similar genome-wide approaches in human, mouse and plants [29-32].

Novel lncRNAs can be discovered by combining experimental methods (next-generation sequencing, NGS) with computational screening [33-35]. First, transcript fragments are obtained using NGS technologies or tiling microarrays. Then, the transcript sequences are mapped to the reference genome and the transcribed units are identified. The criteria for discriminating between coding and non-coding RNA sequences are based on similarity to known coding sequences or on codon-frequency statistics that indicate coding potential [36]. BLASTX is the most commonly used tool for detecting similarity to known sequences [37]. Alternatively, HMMER3 helps to detect homologous protein domains so that transcripts with protein-coding potential can be eliminated [38]. Many other tools are available for evaluating coding potential. Among the most widely used, CPC (Coding Potential Calculator) [39] and PORTRAIT [40] use pairwise comparisons, whereas PhyloCSF [41] and RNAcode [42] use multiple alignments. Another popular tool, the Coding-Potential Assessment Tool (CPAT), uses an alignment-free logistic regression model [43]. Besides these computational approaches, experimental methods such as ribosome profiling have been used to assess the protein-coding capacity of lncRNAs based on the periodicity of ribosome occupancy along short translated ORFs [44].
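The filtering logic described above can be illustrated with a minimal Python sketch. It applies the 200 nt length cutoff and then a crude longest-ORF check as a stand-in for a real coding-potential tool; it does not reproduce CPC, CPAT, PhyloCSF or any other tool cited here. The function names and the 100-amino-acid ORF cutoff (`max_orf_aa`) are illustrative assumptions, not values taken from this review.

```python
# Minimal lncRNA candidate filter: length cutoff plus a crude longest-ORF check.
# The ORF cutoff (max_orf_aa) is an illustrative assumption, not a value from the review.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_nt(seq: str) -> int:
    """Length in nt (start codon through stop codon) of the longest complete,
    ATG-initiated ORF on the forward strand, scanning all three frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def is_lncrna_candidate(seq: str, min_len: int = 200, max_orf_aa: int = 100) -> bool:
    """Keep transcripts longer than min_len nt whose longest ORF encodes
    fewer than max_orf_aa amino acids (hypothetical cutoff)."""
    if len(seq) <= min_len:
        return False
    orf_aa = max(longest_orf_nt(seq) // 3 - 1, 0)  # codons minus the stop codon
    return orf_aa < max_orf_aa
```

In practice, transcripts passing such a filter would still be screened with similarity searches and dedicated coding-potential tools, since a short ORF alone does not establish that a transcript is non-coding.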

About 1600 novel mouse lncRNAs were identified by a genome-wide approach that used gene expression data and the presence of chromatin marks at promoter regions [8]. A combination of chromatin marks and RNA-seq data sets has been used to generate the human long intervening non-coding RNA (lincRNA) catalog, which comprises 8000 lincRNAs from 24 different human cell types and tissues [45]. More than 13,500 human lncRNAs have been annotated by GENCODE, and data sets from the 1000 Genomes Project have been used to reveal the association between lncRNAs and prostate cancer [30,46]. Cunnington et al. reported associations between 56 lncRNAs and disease-related traits ranging from diabetes to multiple sclerosis and Alzheimer’s disease [47]. Both computational and experimental analyses have shown that 125 putative stress-responsive lncRNAs in wheat were tissue-specific and could be induced by powdery mildew infection and heat stress [48]. In addition, Zhang et al. systematically identified 2224 lncRNAs by performing strand-specific RNA sequencing of rice anthers, pistils, seeds and shoots and combining the results with analyses of other available rice RNA-seq data sets [32].

Classification of lncRNAs

LncRNAs are classified according to several properties, such as transcript length; sequence and structure conservation; genomic location; function exerted on DNA or RNA; functioning and targeting mechanisms; and association with annotated protein-coding genes, repeats, biochemical pathways, stability or subcellular structures [49,50]. Among these many criteria, the most commonly used are size, localization and function. Typically, a threshold of 200 bases is used to discriminate ncRNAs by length: transcripts shorter than 200 bases are considered small ncRNAs and those longer than 200 bases are classified as long ncRNAs [51]. After length, genomic location is also widely used for classification. Based on genomic location, GENCODE classifies lncRNAs into five groups:

  1. Antisense lncRNAs, which are transcribed from the antisense strand and either intersect any exon of a protein-coding locus on the opposite strand or have published evidence of antisense regulation of a coding gene. Their transcription has been found to overlap genes related to condition-specific or stress responses. Antisense lncRNAs, which are involved in genomic imprinting and in the regulation of alternative splicing and translation, are thought to act as on-off switches for these genes [52-54].
  2. Sense lncRNAs, which are transcribed from the sense strand of protein-coding genes; these overlapping transcripts contain a coding gene within an intron on the same strand.
  3. Sense intronic lncRNAs, which reside within the introns of a coding gene and do not overlap any of its exons. Differential expression studies have demonstrated that the expression levels of intronic lncRNAs, and their biological variation during a physiological time course or among different individuals of the same strain, are tightly correlated with those of their adjacent exons [55].
  4. Long intervening non-coding RNAs (lincRNAs), also called long “intergenic” non-coding RNAs, which are longer than 200 bp, do not overlap exons of protein-coding genes and lie within the genomic interval between two genes. Approximately 20% of lincRNAs are bound by polycomb repressive complex 2 (PRC2) or other chromatin-modifying complexes, which indicates that they exert enhancer-like functions by guiding chromatin-modifying complexes to specific genomic loci and by transmitting information from higher-order chromosomal looping into chromatin modifications to coordinate long-range gene activation [28,56].
  5. Processed transcripts, which do not contain any open reading frame (ORF) and cannot be placed in any of the other categories [57].

In addition to the GENCODE classification, two further categories have emerged: bidirectional and enhancer lncRNAs. Bidirectional lncRNAs, which tend to be highly conserved, are expressed within 1 kb of promoters in the opposite direction from the neighboring protein-coding gene [58,59]. Several studies have shown that bidirectional lncRNAs are associated with transcriptional regulatory genes implicated in cell differentiation and development [60]. Enhancer lncRNAs (elncRNAs or eRNAs), which are generally <2 kb, are transcribed from enhancer regions of the genome and may contribute to enhancer function [59]. eRNAs have been found to function in chromatin looping and long-range gene activation, playing an important role in system development and the establishment of homeostasis [61,62].
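The positional categories described in this section can be sketched as a simple rule-based classifier. The Python example below is a simplified illustration under stated assumptions, not the GENCODE annotation procedure: it compares one transcript against one protein-coding gene, uses half-open coordinates, and takes the gene start (plus strand) or gene end (minus strand) as the promoter position. The names `Gene`, `Transcript` and `classify_by_location` are hypothetical. The sense-overlapping category is omitted because it requires the exon structure of the lncRNA itself, and anything that fits none of the rules is returned as "unclassified".

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) genomic coordinates


@dataclass
class Gene:
    start: int
    end: int
    strand: str              # "+" or "-"
    exons: List[Interval]


@dataclass
class Transcript:
    start: int
    end: int
    strand: str              # "+" or "-"


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def classify_by_location(tx: Transcript, gene: Gene, bidir_window: int = 1000) -> str:
    """Assign a positional category to tx relative to a single coding gene."""
    hits_gene = overlaps(tx.start, tx.end, gene.start, gene.end)
    hits_exon = any(overlaps(tx.start, tx.end, s, e) for s, e in gene.exons)
    same_strand = tx.strand == gene.strand

    if not hits_gene:
        if not same_strand:
            # Divergent transcript: its 5' end faces the gene's promoter.
            if gene.strand == "+":
                gap = gene.start - tx.end
            else:
                gap = tx.start - gene.end
            if 0 <= gap <= bidir_window:
                return "bidirectional"
        return "intergenic"

    if not same_strand:
        # The text above defines antisense lncRNAs by exon intersection.
        return "antisense" if hits_exon else "unclassified"

    inside_gene = tx.start >= gene.start and tx.end <= gene.end
    if inside_gene and not hits_exon:
        return "sense intronic"
    return "unclassified"
```

For example, with a plus-strand gene spanning 1000-5000, a minus-strand transcript at 200-800 ends 200 bp upstream of the gene start and is therefore labeled bidirectional under the 1 kb window described above.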

Conclusion

LncRNAs play important roles in numerous biological processes as regulatory factors. Functional analyses of lncRNAs have indicated that they are effective cis- and trans-regulators of gene transcription and also act as scaffolds for chromatin-modifying complexes. LncRNAs are now considered major regulators of numerous cellular processes, including cell differentiation and development, chromosome dosage compensation, cell cycle control and adaptation to environmental changes [63-65]. Our group has been investigating the association between salinity stress metabolism and barley lncRNAs (unpublished data). Identification of novel lncRNAs is likely to provide new insight into the complicated gene regulatory networks involving lncRNAs, offer novel diagnostic opportunities and pinpoint novel therapeutic targets.

Acknowledgement

Dr. Nermin Gözükırmızı is a professor at the Department of Molecular Biology and Genetics, Faculty of Science, Istanbul University, Turkey. She received her Dr. rer. nat. degree in Botany-Genetics from Istanbul University, Turkey, in 1979. Her current research focuses on plant stress metabolism, transposons and epigenetic marks.

M. Sc. Elif Karlık is a PhD candidate at the Department of Biotechnology, Institute of Science, Istanbul University, Turkey. She received her master's degree in Molecular Biology and Genetics from Istanbul University, Turkey. Her current research focuses on plant stress metabolism and the regulatory networks of lncRNAs.

References

  1. Ben Amor B, Wirth S, Merchan F, Laporte P, d’Aubenton-Carafa Y, et al. (2009) Novel long non-protein coding RNAs involved in Arabidopsis differentiation and stress responses. Genome Res 19(1): 57-69.
  2. Maeda N, Kasukawa T, Oyama R, Gough J, Frith M, et al. (2006) Transcript annotation in FANTOM3: Mouse gene catalog based on physical cDNAs. PLoS Genet 2(4): e62.
  3. Jacquier A (2009) The complex eukaryotic transcriptome: Unexpected pervasive transcription and novel small RNAs. Nat Rev Genet 10(12): 833-844.
  4. Khachane AN, Harrison PM (2010) Mining mammalian transcript data for functional long non-coding RNAs. PLoS One 5(4): e10316.
  5. Taft RJ, Pheasant M, Mattick JS (2007) The relationship between nonprotein-coding DNA and eukaryotic complexity. Bioessays 29(3): 288-299.
  6. Ravasi T, Suzuki H, Pang KC, Katayama S, Furuno M, Okunishi R, et al. (2006) Experimental validation of the regulated expression of large numbers of non-coding RNAs from the mouse genome. Genome Res 16(1): 11-19.
  7. Consortium EP, Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, et al. (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447(7146): 799-816.
  8. Guttman M, Amit I, Garber M, French C, Lin MF, et al. (2009) Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature 458: 223-227.
  9. Ponting CP, Oliver PL, Reik W (2009) Evolution and functions of long noncoding RNAs. Cell 136(4): 629-641.
  10. Dinger ME, Pang KC, Mercer TR, Mattick JS (2008) Differentiating protein-coding and noncoding RNA: challenges and ambiguities. PLoS Comput Biol 4(11): e1000176.
  11. Wierzbicki AT, Haag JR, Pikaard CS (2008) Noncoding transcription by RNA polymerase Pol IVb/Pol V mediates transcriptional silencing of overlapping and adjacent genes. Cell 135(4): 635-648.
  12. Dinger ME, Pang KC, Mercer TR, Crowe ML, Grimmond SM, et al. (2009) NRED: a database of long noncoding RNA expression. Nucleic Acids Res 37: D122-D126.
  13. Filichkin SA, Priest HD, Givan SA, Shen R, Bryant DW, et al. (2010) Genome-wide mapping of alternative splicing in Arabidopsis thaliana. Genome Res 20(1): 45-58.
  14. Matsui A, Ishida J, Morosawa T, Mochizuki Y, Kaminuma E, et al. (2008) Arabidopsis transcriptome analysis under drought, cold, high-salinity and ABA treatment conditions using a tiling array. Plant Cell Physiol 49(8): 1135-1149.
  15. Zhang X, Lii Y, Wu Z, Polishko A, Zhang H, et al. (2013) Mechanisms of small RNA generation from cis-NATs in response to environmental and developmental cues. Mol Plant 6(3): 704-715.
  16. Swiezewski S, Liu F, Magusin A, Dean C (2009) Cold-induced silencing by long antisense transcripts of an Arabidopsis Polycomb target. Nature 462(7274): 799-802.
  17. Luo C, Sidote DJ, Zhang Y, Kerstetter RA, Michael TP, et al. (2013) Integrative analysis of chromatin states in Arabidopsis identified potential regulatory mechanisms for natural antisense transcript production. Plant J 73(1): 77-90.
  18. Rajeswaran R, Aregger M, Zvereva AS, Borah BK, Gubaeva EG, et al. (2012) Sequencing of RDR6-dependent double-stranded RNAs reveals novel features of plant siRNA biogenesis. Nucleic Acids Res 40: 6241-6254.
  19. Ner-Gaon H, Halachmi R, Goldstein SS, Rubin E, Ophir R, et al. (2004) Intron retention is a major phenomenon in alternative splicing in Arabidopsis. Plant J 39(6): 877-885.
  20. Hanada K, Takeuchi HM, Okamoto M, Yoshizumi T, Shimizu M, et al. (2013) Small open reading frames associated with morphogenesis are hidden in plant genomes. Proc Natl Acad Sci USA 110(6): 2395-2400.
  21. Moghe GD, Lehti-Shiu MD, Seddon AE, Yin S, Chen Y, et al. (2013) Characteristics and significance of intergenic polyadenylated RNA transcription in Arabidopsis. Plant Physiol 161(1): 1210-1224.
  22. Wu J, Okada T, Fukushima T, Tsudzuki T, Sugiura M, et al. (2012) A novel hypoxic stress-responsive long non-coding RNA transcribed by RNA polymerase III in Arabidopsis. RNA Biol 9(3): 302-313.
  23. Zorrilla FJM, Valli A, Todesco M, Mateos I, Puga MI, et al. (2007) Target mimicry provides a new mechanism for regulation of microRNA activity. Nat Genet 39(8): 1033-1037.
  24. Brannan CI, Dees EC, Ingram RS, Tilghman SM (1990) The product of the H19 gene may function as an RNA. Mol Cell Biol 10(1): 28-36.
  25. Borsani G, Tonlorenzi R, Simmler MC, Dandolo L, Arnaud D, et al. (1991) Characterization of a murine gene expressed from the inactive X chromosome. Nature 351(6324): 325-329.
  26. Brockdorff N, Ashworth A, Kay GF, McCabe VM, Norris DP, et al. (1992) The product of the mouse Xist gene is a 15 kb inactive X-specific transcript containing no conserved ORF and located in the nucleus. Cell 71(3): 515-526.
  27. Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, et al. (2007) Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell 129(7): 1311-1323.
  28. Wang KC, Yang YW, Liu B, Sanyal A, Zimmerman CR, et al. (2011) A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature 472(7341): 120-124.
  29. Khalil AM, Guttman M, Huarte M, Garber M, Raj A, et al. (2009) Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc Natl Acad Sci USA 106(28): 11667-11672.
  30. Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, et al. (2012) The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res 22(9): 1775-1789.
  31. Liu J, Jung C, Xu J, Wang H, Deng S, et al. (2012) Genome wide analysis uncovers regulation of long intergenic noncoding RNAs in Arabidopsis. Plant Cell 24(11): 4333-4345.
  32. Zhang YC, Liao JY, Li ZY, Yu Y, Zhang JP, et al. (2014) Genome-wide screening and functional analysis identify a large number of long noncoding RNAs involved in the sexual reproduction of rice. Genome Biology 15: 512.
  33. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, et al. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28(5): 511-515.
  34. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL (2013) TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology 14: R36.
  35. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, et al. (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29(1): 15-21.
  36. Iwakiri J, Hamada M, Asai K (2016) Bioinformatics tools for lncRNA research. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 1859(1): 23-30.
  37. Gish W, States DJ (1993) Identification of protein coding regions by database similarity search. Nat Genet 3(3): 266-272.
  38. Eddy SR (2011) Accelerated profile HMM searches. PLoS Comput Biol 7(10): e1002195.
  39. Kong L, Zhang Y, Ye ZQ, Liu XQ, Zhao SQ, et al. (2007) CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res 35: W345-W349.
  40. Arrial RT, Togawa RC, Brigido MM (2009) Screening non-coding RNAs in transcriptomes from neglected species using PORTRAIT: case study of the pathogenic fungus Paracoccidioides brasiliensis. BMC Bioinformatics 10: 239.
  41. Lin MF, Jungreis I, Kellis M (2011) PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics 27(13): i275-i282.
  42. Washietl S, Findeiss S, Muller SA, Kalkhof S, von Bergen M, et al. (2011) RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data. RNA 17(4): 578-594.
  43. Wang L, Park HJ, Dasari S, Wang S, Kocher JP, et al. (2013) CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model. Nucleic Acids Res 41: e74.
  44. Bazzini AA, Johnstone TG, Christiano R, Mackowiak SD, Obermayer B, et al. (2014) Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation. EMBO J 33(9): 981-993.
  45. Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, et al. (2011) Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev 25(18): 1915-1927.
  46. Jin G, Sun J, Isaacs SD, Wiley KE, Kim ST, et al. (2011) Human polymorphisms at long non-coding RNAs (lncRNAs) and association with prostate cancer risk. Carcinogenesis 32(11): 1655-1659.
  47. Cunnington MS, Santibanez Koref M, Mayosi BM, Burn J, Keavney B (2010) Chromosome 9p21 SNPs associated with multiple disease phenotypes correlate with ANRIL expression. PLoS Genet 6(4): e1000899.
  48. Xin M, Wang Y, Yao Y, Song N, Hu Z, et al. (2011) Identification and characterization of wheat long non-protein coding RNAs responsive to powdery mildew infection and heat stress by using microarray analysis and SBS sequencing. BMC Plant Biol 11: 61.
  49. Ma L, Bajic VB, Zhang Z (2013) On the classification of long noncoding RNAs. RNA Biol 10(6): 925-933.
  50. St Laurent G, Wahlestedt C, Kapranov P (2015) The Landscape of long non-coding RNA classification. Trends Genet 5: 239-251.
  51. Kapranov P, Cheng J, Dike S, Nix DA, Duttagupta R, et al. (2007) RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science 316(5830): 1484-1488.
  52. Berteaux N, Aptel N, Cathala G, Genton C, Coll J, et al. (2008) A novel H19 antisense RNA overexpressed in breast cancer contributes to paternal IGF2 expression. Mol Cell Biol 28(22): 6731-6745.
  53. Ni T, Tu K, Wang Z, Song S, Wu H, et al. (2010) The prevalence and regulation of antisense transcripts in Schizosaccharomyces pombe. PLoS One 5(12): e15271.
  54. Xu Z, Wei W, Gagneur J, Munster CS, Smolik M, et al. (2011) Antisense expression increases gene expression variability and locus interdependency. Mol Syst Biol 7: 468.
  55. St Laurent G, Shtokalo D, Tackett MR, Yang Z, Eremina T, et al. (2012) Intronic RNAs constitute the major fraction of the non-coding RNA in mammalian cells. BMC Genomics 13: 504.
  56. Zhang K, Shi ZM, Chang YN, Hu ZM, Qi HX, et al. (2014) The ways of action of long non-coding RNAs in cytoplasm and nucleus. Gene 547(1): 1-9.
  57. Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, et al. (2012) GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res 22(9): 1760-1774.
  58. Atkinson SR, Marguerat S, Bahler J (2012) Exploring long non-coding RNAs through sequencing. Semin Cell Dev Biol 23(2): 200-205.
  59. Devaux Y, Zangrando J, Schroen B, Creemers EE, Pedrazzini T, et al. (2015) Long noncoding RNAs in cardiac development and ageing. Nat Rev Cardiol 12(7): 415-425.
  60. Lepoivre C, Belhocine M, Bergon A, Griffon A, Yammine M, et al. (2013) Divergent transcription is associated with promoters of transcriptional regulators. BMC Genomics 14: 914.
  61. De Santa F, Barozzi I, Mietton F, Ghisletti S, Polletti S, et al. (2010) A large fraction of extragenic RNA pol II transcription sites overlap enhancers. PLoS Biol 8(5): e1000384.
  62. Licastro D, Gennarino VA, Petrera F, Sanges R, Banfi S, et al. (2010) Promiscuity of enhancer, coding and non-coding transcription functions in ultraconserved elements. BMC Genomics 11: 151.
  63. Wery M, Kwapisz M, Morillon A (2011) Noncoding RNAs in gene regulation. WIREs Systems Biology and Medicine 3(6): 728-738.
  64. Rinn JL, Chang HY (2012) Genome regulation by long noncoding RNAs. Annu Rev Biochem 81: 145-166.
  65. Sole C, Ribelles NM, de Nadal E, Posas F (2015) A novel role for lncRNAs in cell cycle control during stress adaptation. Curr Genet 61(3): 299-308.
© 2014-2016 MedCrave Group, All rights reserved. No part of this content may be reproduced or transmitted in any form or by any means as per the standard guidelines of fair use.
Creative Commons License Open Access by MedCrave Group is licensed under a Creative Commons Attribution 4.0 International License.