MOJ ISSN: 2374-6920 MOJPB

Proteomics & Bioinformatics
Volume 1 Issue 5 - 2014
Transcriptome and Proteome Analysis: A Perspective on Correlation
Ramaswamy Narayanan1* and Wim J M Van De Ven2
1Department of Biological Sciences, Florida Atlantic University, USA 2Department of Human Genetics, University of Leuven, Belgium
Received: September 07, 2014 | Published: September 29, 2014
*Corresponding author: Ramaswamy Narayanan, Department of Biological Sciences, Charles E. Schmidt College of Science, Florida Atlantic University, Boca Raton, FL 33431 USA, Tel: 561-297-2247; Fax: 561-297-3859; Email: @
Citation: Narayanan R, Van De Ven WJM (2014) Transcriptome and Proteome Analysis: A Perspective on Correlation. MOJ Proteomics Bioinform 1(5): 00027. DOI: 10.15406/mojpb.2014.01.00027


Druggable proteins (such as enzymes, receptors, transporters and channel proteins, signal transduction proteins, oncogenes and transcription factors) are powerful drug targets for the treatment of diverse diseases [1-4]. Novel drug targets are likely to emerge from among the approximately 22,000 human proteins [5]. Numerous bioinformatics tools are used to monitor gene expression at the mRNA level, including UniGene and Expressed Sequence Tags, EST [6,7], Serial Analysis of Gene Expression, SAGE [8], Digital Differential Display, DDD [9,10], and the Cancer Genome Anatomy Project X-Profiler and Digital Gene Expression Displayer [11,12]. High-throughput transcriptome analysis is often performed using microarrays [13-15] and next-generation RNA sequencing, NGS RNA-seq [16,17].
In recent years, proteome analysis has been greatly aided by various protein expression tools, including the Allen Brain Atlas [18,19], the Human Protein Atlas [20], the Multi-Omics Profiling Expression Database, MOPED [21], the Human Protein Reference Database, HPRD [22,23], and the recently described Human Proteome Map [24] and ProteomicsDB. Together, the last two studies identified >18,000 proteins, encoded by both known genes and uncharacterized Open Reading Frames, accounting for approximately 84% of the total annotated protein-coding genes in humans. The expression of over 20,000 protein isoforms was also characterized [25,26].
Numerous normal tissues, body fluids and cell lines were used in these two studies. These recently described protein expression databases greatly expand our ability to establish correlative evidence for mRNA and protein expression for most of the human proteome.
Interpretations about functional relevance, pathways, interacting proteins and the potential for drug therapy are often based on mRNA expression from high-throughput transcriptome analysis. Protein expression is often inferred, but a correlation of mRNA with protein levels is frequently missing from such studies. Protein expression is subject to complex regulation involving noncoding RNAs, DNA methylation and other epigenetic changes, gene amplifications, copy number variations, mutations, post-translational modifications (such as acetylation, amidation, myristoylation, phosphorylation, sumoylation and ubiquitination), protein stability and degradation, and interacting proteins, all of which complicate transcriptome-based interpretations [27,28].
For identifying candidate driver genes for therapeutic target discovery and functional elucidation, knowledge of the correlation between mRNA and protein levels is critical. The vast amount of data generated by microarray and NGS RNA-sequencing technologies can be reduced to a manageable number of putative targets by using the correlation between the mRNA and protein levels. In a recent study using The Cancer Genome Atlas (TCGA) tissues (n=87), Zhang et al. [29] demonstrated that mRNA transcript abundance and gene amplifications often did not correlate with protein abundance. Among 3,764 genes analyzed, only 32% showed a statistically significant mRNA-protein correlation. Further, copy number alterations had no significant effects on the protein levels. Our own results as well as several other reports support such a lack of correlation [30-34].
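The idea of reducing a large gene list to candidates whose mRNA and protein levels agree can be illustrated with a minimal sketch. The code below is not the method of any cited study: the choice of Spearman rank correlation, the one-sided permutation test, the 0.05 threshold, the function names and the toy expression values are all assumptions made for illustration. For each gene measured in both data sets, it correlates mRNA and protein abundance across samples and keeps only the genes showing a significant positive correlation.

```python
import random
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


def permutation_test(x, y, n_perm=2000, seed=0):
    """Return (rho, one-sided p-value for a positive correlation)."""
    rng = random.Random(seed)
    rx, ry = rank(x), rank(y)
    observed = pearson(rx, ry)
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(rx, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


def concordant_genes(mrna, protein, alpha=0.05):
    """Keep genes whose mRNA and protein levels correlate positively."""
    keep = []
    for gene in sorted(mrna.keys() & protein.keys()):
        rho, p = permutation_test(mrna[gene], protein[gene])
        if rho > 0 and p < alpha:
            keep.append(gene)
    return keep


# Hypothetical toy data: one list of abundances per gene, one value per sample.
toy_mrna = {
    "GENE_A": [1, 2, 3, 4, 5, 6, 7, 8],
    "GENE_B": [1, 2, 3, 4, 5, 6, 7, 8],
    "GENE_C": [1, 2, 3, 4, 5, 6, 7, 8],
}
toy_protein = {
    "GENE_A": [1.1, 2.0, 2.9, 4.2, 5.1, 5.9, 7.3, 8.0],  # tracks mRNA
    "GENE_B": [5, 1, 8, 3, 7, 2, 6, 4],                  # unrelated to mRNA
    "GENE_C": [8, 7, 6, 5, 4, 3, 2, 1],                  # opposite to mRNA
}

print(concordant_genes(toy_mrna, toy_protein))
```

In a real analysis the per-gene p-values would normally also be corrected for multiple testing, since thousands of genes are tested at once; that step is omitted here to keep the sketch short.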
During the early stages of microarray technology, vast data sets were generated without standards. The establishment of Minimum Information About a Microarray Experiment, MIAME [35], greatly reduced the noise in subsequent array-based datasets. Similar guidelines for correlating RNA and protein expression are needed to develop meaningful interpretations. Where possible, investigators and authors should be expected to demonstrate such a correlation using the protein expression tools, which are becoming increasingly available. This would greatly help reduce the noise in the published literature, verify functional roles for the proteins and optimize the chances of identifying druggable candidate driver genes.
Further, the development of single-cell transcriptome and proteome analysis capability (in contrast to the whole tissue-based analysis currently done) would allow a single cell-based correlation between mRNA and proteins. This would allow cellular heterogeneity within a tissue environment to be captured in expression profiling. An offshoot of the protein expression databases, such as a separate human proteome database showing correlation with the transcriptome, would greatly aid therapeutic target discovery.


We thank Jeanine Narayanan for editorial assistance.


  1. Hopkins AL, Groom CR (2002) The druggable genome. Nat Rev Drug Discov 1(9): 727-730.
  2. Landry Y, Gies JP (2008) Drugs and their molecular targets: an updated overview. Fundam Clin Pharmacol 22(1): 1-18.
  3. Russ AP, Lampel S (2005) The druggable genome: an update. Drug Discov Today 10(23-24): 1607-1610.
  4. Workman P, Al-Lazikani B, Clarke PA (2013) Genome-based cancer therapeutics: targets, kinase drug resistance and future strategies for precision oncology. Curr Opin Pharmacol 13(4): 486-496.
  5. Griffith M, Griffith OL, Coffman AC, Weible JV, McMichael JF, et al. (2013) DGIdb: mining the druggable genome. Nat Methods 10(12): 1209-1210.
  6. Wheeler DL, Church DM, Federhen S, Lash AE, Madden TL, et al. (2003) Database resources of the National Center for Biotechnology. Nucleic Acids Res 31(1): 28-33.
  7. Boguski MS, Schuler GD (1995) ESTablishing a human transcript map. Nat Genet 10(4): 369-371.
  8. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW (1995) Serial analysis of gene expression. Science 270(5235): 484-487.
  9. Strausberg RL, Buetow KH, Emmert-Buck MR, Klausner RD (2000) The cancer genome anatomy project: building an annotated gene index. Trends Genet 16(3): 103-106.
  10. Scheurle D, DeYoung MP, Binninger DM, Page H, Jahanzeb M, et al. (2000) Cancer gene discovery using digital differential display. Cancer Res 60(15): 4037-4043.
  11. Strausberg RL (2001) The Cancer Genome Anatomy Project: new resources for reading the molecular signatures of cancer. J Pathol 195(1): 31-40.
  12. Lauriola M, Ugolini G, Rosati G, Zanotti S, Montroni I, et al. (2010) Identification by a Digital Gene Expression Displayer (DGED) and test by RT-PCR analysis of new mRNA candidate markers for colorectal cancer in peripheral blood. Int J Oncol 37(2): 519-525.
  13. Rhodes DR, Kalyana-Sundaram S, Mahavisno V, Varambally R, Yu J, et al. (2007) Oncomine 3.0: genes, pathways, and networks in a collection of 18,000 cancer gene expression profiles. Neoplasia 9(2): 166-180.
  14. Liu F, White JA, Antonescu C, Gusenleitner D, Quackenbush J (2011) GCOD - GeneChip Oncology Database. BMC Bioinformatics 12: 46.
  15. Parkinson H, Sarkans U, Shojatalab M, Abeygunawardena N, Contrino S, et al. (2005) ArrayExpress--a public repository for microarray gene expression data at the EBI. Nucleic Acids Res 33(Database issue): D553-D555.
  16. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5(7): 621-628.
  17. Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, et al. (2008) The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320(5881): 1344-1349.
  18. Miller JA, Ding SL, Sunkin SM, Smith KA, Ng L, et al. (2014) Transcriptional landscape of the prenatal human brain. Nature 508(7495): 199-206.
  19. Shen EH, Overly CC, Jones AR (2012) The Allen Human Brain Atlas: comprehensive gene expression mapping of the human brain. Trends Neurosci 35(12): 711-714.
  20. Uhlen M, Oksvold P, Fagerberg L, Lundberg E, Jonasson K, et al. (2010) Towards a knowledge-based Human Protein Atlas. Nat Biotechnol 28(12): 1248-1250.
  21. Kolker E, Higdon R, Haynes W, Welch D, Broomall W, et al. (2012) MOPED: Model Organism Protein Expression Database. Nucleic Acids Res 40(Database issue): D1093-D1099.
  22. Mathivanan S, Ahmed M, Ahn NG, Alexandre H, Amanchy R, et al. (2008) Human Proteinpedia enables sharing of human protein data. Nat Biotechnol 26(2): 164-167.
  23. Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, et al. (2009) Human Protein Reference Database--2009 update. Nucleic Acids Res 37(Database issue): D767-D772.
  24. Kim MS, Pinto SM, Getnet D, Nirujogi RS, Manda SS, et al. (2014) A draft map of the human proteome. Nature 509(7502): 575-581.
  25. Wilhelm M, Schlegl J, Hahne H, Moghaddas Gholami A, Lieberenz M, et al. (2014) Mass-spectrometry-based draft of the human proteome. Nature 509(7502): 582-587.
  26. Lawrence RT, Villen J (2014) Drafts of the human proteome. Nat Biotechnol 32(8): 752-753.
  27. Deribe YL, Pawson T, Dikic I (2010) Post-translational modifications in signal integration. Nat Struct Mol Biol 17(6): 666-672.
  28. Vogel C, Marcotte EM (2012) Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat Rev Genet 13(4): 227-232.
  29. Zhang B, Wang J, Wang X, Zhu J, Liu Q, et al. (2014) Proteogenomic characterization of human colon and rectal cancer. Nature 513(7518): 382-387.
  30. Delgado AP, Brandao P, Chapado MJ, Hamid S, Narayanan R (2014) Open Reading Frames Associated with Cancer in the Dark Matter of the Human Genome. Cancer Genomics Proteomics 11(4): 201-213.
  31. Delgado AP, Brandao P, Narayanan R (2014) Diabetes Associated Genes from the Dark Matter of the Human Proteome. MOJ Proteomics Bioinform 1(4): 00020.
  32. Greenbaum D, Colangelo C, Williams K, Gerstein M (2003) Comparing protein abundance and mRNA expression levels on a genomic scale. Genome Biol 4(9): 117.
  33. Gedeon T, Bokes P (2012) Delayed protein synthesis reduces the correlation between mRNA and protein fluctuations. Biophys J 103(3): 377-385.
  34. Gry M, Rimini R, Stromberg S, Asplund A, Ponten F, et al. (2009) Correlations between RNA and protein expression profiles in 23 human cell lines. BMC Genomics 10: 365.
  35. Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, et al. (2001) Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet 29(4): 365-371.
© 2014-2016 MedCrave Group, All rights reserved. No part of this content may be reproduced or transmitted in any form or by any means as per the standard guidelines of fair use.
Creative Commons License Open Access by MedCrave Group is licensed under a Creative Commons Attribution 4.0 International License.