Human interactome explained
The human interactome is the set of protein–protein interactions that occur in human cells.[1] [2] The sequencing of reference genomes, in particular the Human Genome Project, has revolutionized human genetics, molecular biology, and clinical medicine. Genome-wide association study (GWAS) results have led to the association of genes with most Mendelian disorders,[3] and over 140 000 germline mutations have been associated with at least one genetic disease.[4] However, it became apparent that these studies inherently emphasize clinical outcome rather than a comprehensive understanding of human disease; indeed, to date the most significant contributions of GWAS have been restricted to the “low-hanging fruit” of direct single-mutation disorders, prompting a systems biology approach to genomic analysis.[5] [6] The connection between genotype and phenotype (how variation in genotype affects the disease or normal functioning of the cell and the human body) remains elusive, especially in the context of multigenic complex traits and cancer.[7] To assign functional context to genotypic changes, much recent research has been devoted to mapping the networks formed by interactions of cellular and genetic components in humans, as well as how these networks are altered by genetic and somatic disease.
Background
With the sequencing of the genomes of a diverse array of model organisms, it became clear that the number of genes does not correlate with the human perception of relative organism complexity: the human genome contains some 20 000 protein-coding genes,[8] fewer than in some species such as corn. A statistical approach to calculating the number of interactions in humans gives an estimate of around 650 000, one order of magnitude larger than that of Drosophila and 3 times larger than that of C. elegans.[2] As of 2008, fewer than 0.3% of all estimated interactions among human proteins had been identified,[9] although discovery has grown exponentially since then: as of 2015,[10] over 210 000 unique human positive protein–protein interactions had been catalogued, and the BioGRID database contained almost 750 000 literature-curated PPIs for 30 model organisms, 300 000 of which were verified or predicted human physical or genetic protein–protein interactions, a 50% increase from 2013.[11] The currently available information on the human interactome network originates from literature-curated interactions,[12] high-throughput experiments,[10] or potential interactions predicted from interactome data, whether through phylogenetic profiling (evolutionary similarity), statistical network inference,[13] or text/literature mining methods.[14]
Protein–protein interactions are only the raw material for networks. To form useful interactome databases and create integrated networks, protein–protein interactions can be combined with other types of data, including information on gene expression and co-expression, cellular co-localization of proteins (based on microscopy), genetic information, metabolic and signalling pathways, and more.[15] The ultimate goal of unravelling the human protein interactome is to understand mechanisms of disease and uncover previously unknown disease genes. It has been found that proteins with a high number of interactions (outward edges) are significantly more likely to be hubs in modules that correlate with disease,[10] [16] probably because proteins with more interactions are involved in more biological functions. Mapping disease alterations to the human interactome can give a much better understanding of the pathways and biological processes of disease.[17]
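The idea of a highly connected protein can be made concrete with a small sketch. The example below assumes an undirected toy network with placeholder protein names (P1 to P5, not real proteins) and an arbitrary hub threshold; it counts the number of distinct interaction partners (the degree) of each protein and flags those at or above the threshold. It is an illustration only, not the method used in the cited studies.

```python
from collections import defaultdict

# Toy undirected interaction list; protein names are placeholders, not real data.
interactions = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P4", "P5"),
]


def degree_by_protein(pairs):
    """Return the number of distinct interaction partners of each protein."""
    partners = defaultdict(set)
    for a, b in pairs:
        if a == b:
            continue  # ignore self-interactions in this sketch
        partners[a].add(b)
        partners[b].add(a)
    return {protein: len(neighbours) for protein, neighbours in partners.items()}


def find_hubs(pairs, min_degree=3):
    """Return proteins with at least min_degree partners.

    The threshold of 3 is an assumption chosen for the toy data.
    """
    degrees = degree_by_protein(pairs)
    return sorted(p for p, d in degrees.items() if d >= min_degree)


degrees = degree_by_protein(interactions)
hubs = find_hubs(interactions)
print(degrees)
print(hubs)
```

In a real analysis the threshold would normally be derived from the degree distribution of the whole network rather than fixed in advance.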
Studying the human interactome
Analysis of metabolic networks of proteins dates back to the 1940s, but it was not until the late 1990s and early 2000s that computational data-driven genomic analyses to predict functional context and networks of genetic associations appeared in earnest.[8] Since then, the interactomes of many model organisms have come to be considered well characterized, notably the Saccharomyces cerevisiae interactome[18] and the Drosophila interactome.[19]
High-throughput experimental approaches for discovering protein–protein interactions typically use a version of the two-hybrid screening approach or tandem affinity purification followed by mass spectrometry.[12] Information from experiments and literature curation is compiled into databases of protein interactions, such as DIP[20] and BioGRID.[11] A more recent effort, HINT-KB,[10] attempts to amalgamate most of the current PPI databases while filtering out systematically erroneous interactions and correcting for inherent sociological sampling biases in literature-curated datasets.
Smaller human interactome networks have been described in the specific context of important drivers of many different disorders, including neurodegenerative disorders,[21] autism and other psychiatric disorders,[22] and cancer. Cancer gene networks have been particularly well studied, due in part to large genome initiatives such as The Cancer Genome Atlas (TCGA).[23] A large portion of the mutational landscape including intra-tumoural heterogeneity has been mapped for most common types of cancers [24] (for example, breast cancer has been well studied),[25] and many studies have also investigated the difference between active driver genes and passive passenger mutations in the context of cancer interaction networks.[16]
The first attempts at large-scale integrative human interactome mapping occurred around 2005. Stelzl et al.[26] used a protein matrix of 4500 baits and 5600 preys in a yeast two-hybrid system to piece together the interactome, and Rual et al. performed a similar yeast two-hybrid study verified with co-affinity purification and correlation with other biological attributes, revealing more than 300 connections to 100 disease-associated proteins.[12] Since those pioneering efforts, hundreds of similar studies have been conducted. Compiled databases such as UniHI[27] provide a single point of entry to these data. Futschik et al.[28] performed a meta-analysis of eight interactome maps and found that of 57 000 interacting proteins in total, there was a small (albeit statistically significant) overlap between the different databases, indicating considerable selection and detection biases.
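One simple way to quantify the overlap between two interaction maps is the Jaccard index, the number of shared interactions divided by the number of interactions found in either map. The sketch below uses that measure on toy data with placeholder protein names; the choice of the Jaccard index is an assumption made for illustration and is not necessarily the statistic used in the cited meta-analysis.

```python
def normalise(pairs):
    """Store each interaction as an unordered pair so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in pairs if pair[0] != pair[1]}


def jaccard_overlap(map_a, map_b):
    """Return |A ∩ B| / |A ∪ B| for two collections of interactions."""
    a, b = normalise(map_a), normalise(map_b)
    union = a | b
    if not union:
        return 0.0  # two empty maps share nothing by convention here
    return len(a & b) / len(union)


# Toy interaction maps; protein names are placeholders, not real data.
map_a = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
map_b = [("P2", "P1"), ("P4", "P5"), ("P5", "P6")]

overlap = jaccard_overlap(map_a, map_b)
print(f"{overlap:.2f}")
```

Treating each interaction as an unordered pair matters in practice, because different databases may list the same two proteins in opposite order.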
In 2010, around 130 000 binary interactions in the interactome were described in the most popular databases, but many were verified by only one source.[15] Despite the rapid development of high-throughput methods, datasets still suffer from high rates of false positives and low coverage of the interactome. Tyagi et al.[29] described a novel framework for incorporating structural complexes and binding interfaces for verification. This was part of much larger efforts for PPI verification; interaction networks are typically validated further by using a combination of coexpression profiles, protein structural information, Gene Ontology terms, topological considerations, and colocalization[26] [30] before being considered “high-confidence”.
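The observation that many interactions rest on a single source suggests a basic filter: keep only interactions reported by at least two independent sources. The sketch below illustrates this on toy records; the protein names, source labels, and the two-source threshold are all hypothetical, and real pipelines combine many more kinds of evidence than a simple source count.

```python
from collections import defaultdict

# (protein_a, protein_b, source) records; all names are hypothetical.
records = [
    ("P1", "P2", "database_A"),
    ("P2", "P1", "database_B"),  # same interaction, reversed order
    ("P2", "P3", "database_A"),
    ("P3", "P4", "database_B"),
    ("P3", "P4", "database_B"),  # duplicate report from the same source
    ("P3", "P4", "database_C"),
]


def sources_per_interaction(rows):
    """Map each unordered protein pair to the set of sources reporting it."""
    support = defaultdict(set)
    for a, b, source in rows:
        support[frozenset((a, b))].add(source)
    return support


def multiply_supported(rows, min_sources=2):
    """Return interactions backed by at least min_sources distinct sources."""
    support = sources_per_interaction(rows)
    return sorted(
        tuple(sorted(pair))
        for pair, sources in support.items()
        if len(sources) >= min_sources
    )


kept = multiply_supported(records)
print(kept)
```

Counting distinct sources rather than raw records prevents a single database that lists an interaction twice from inflating its apparent support.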
A resource paper published in November 2014 [17] attempted to provide a more comprehensive proteome-level map of the human interactome. It found vast uncharted territory in the human interactome, and used diverse methods to build a new interactome map correcting for curation bias, including probing all pairwise combinations of 13 000 protein products for interaction using yeast two-hybrid and co-affinity purification, in a massive coordinated effort across research labs in Canada and the United States. However, this still confirms only a fraction of the expected interactions: around 30 000 of high confidence. Despite the coordinated efforts of many, the human interactome is still very much a work in progress.[17] [30]
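The scale of this screen can be checked with simple arithmetic. The sketch below takes the figures quoted in the article (13 000 protein products, about 30 000 high-confidence interactions, and the separate statistical estimate of about 650 000 interactions in total) and assumes that "all pairwise combinations" means unordered pairs of distinct proteins. Because the total estimate comes from a different study, the resulting coverage figure is only a rough illustration.

```python
from math import comb

proteins_screened = 13_000  # protein products probed pairwise (from the text)
high_confidence = 30_000    # approximate high-confidence interactions (from the text)
estimated_total = 650_000   # statistical estimate of interactome size (from the text)

# Unordered pairs of distinct proteins: n * (n - 1) / 2 (an assumption about
# what "all pairwise combinations" covers; self-pairs are excluded).
candidate_pairs = comb(proteins_screened, 2)

# Share of the estimated interactome confirmed at high confidence.
coverage = high_confidence / estimated_total

# Fraction of tested pairs that yielded a high-confidence interaction.
hit_rate = high_confidence / candidate_pairs

print(f"candidate pairs: {candidate_pairs:,}")  # 84,493,500
print(f"coverage of estimate: {coverage:.1%}")  # 4.6%
print(f"hit rate per tested pair: {hit_rate:.4%}")
```

The search space of roughly 84 million pairs against a yield of tens of thousands of interactions shows why false positives are such a concern: even a very small per-pair error rate could produce spurious hits on the same order as the true interactions.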
Notes and References
- Bonetta L . Protein-protein interactions: Interactome under construction . Nature . 468 . 7325 . 851–4 . December 2010 . 21150998 . 10.1038/468851a . 2010Natur.468..851B . 205060874 . free .
- Stumpf MP, Thorne T, de Silva E, Stewart R, An HJ, Lappe M, Wiuf C . Estimating the size of the human interactome . Proceedings of the National Academy of Sciences of the United States of America . 105 . 19 . 6959–64 . May 2008 . 18474861 . 2383957 . 10.1073/pnas.0708078105 . free .
- Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA . Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders . Nucleic Acids Research . 33 . Database issue . D514–7 . January 2005 . 15608251 . 539987 . 10.1093/nar/gki033 .
- Stenson PD, Mort M, Ball EV, Shaw K, Phillips A, Cooper DN . The Human Gene Mutation Database: building a comprehensive mutation repository for clinical and molecular genetics, diagnostic testing and personalized genomic medicine . Human Genetics . 133 . 1 . 1–9 . January 2014 . 24077912 . 3898141 . 10.1007/s00439-013-1358-4 .
- Chuang HY, Hofree M, Ideker T . A decade of systems biology . Annual Review of Cell and Developmental Biology . 26 . 721–44 . 2010 . 20604711 . 3371392 . 10.1146/annurev-cellbio-100109-104122 .
- Blow N . Systems biology: Untangling the protein web . Nature . 460 . 7253 . 415–8 . July 2009 . 19606149 . 10.1038/460415a . 2009Natur.460..415B . free .
- Vidal M, Cusick ME, Barabási AL . Interactome networks and human disease . Cell . 144 . 6 . 986–98 . March 2011 . 21414488 . 3102045 . 10.1016/j.cell.2011.02.016 .
- Amaral LA . A truer measure of our ignorance . Proceedings of the National Academy of Sciences of the United States of America . 105 . 19 . 6795–6 . May 2008 . 18474865 . 2383987 . 10.1073/pnas.0802459105 . 2008PNAS..105.6795A . free .
- Bork P, Jensen LJ, von Mering C, Ramani AK, Lee I, Marcotte EM . Protein interaction networks from yeast to human . Current Opinion in Structural Biology . 14 . 3 . 292–9 . June 2004 . 15193308 . 10.1016/j.sbi.2004.05.003 .
- Theofilatos K, Dimitrakopoulos C, Kleftogiannis D, Moschopoulos C, Papadimitriou S, Likothanassis S, Mavroudi S . HINT-KB: The Human Interactome Knowledge Base . Artificial Intelligence Review . 42 . 3 . 427–443 . 2014 . 10.1007/s10462-013-9409-8 . 16376962 .
- Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, Reguly T, Nixon J, Ramage L, Winter A, Sellam A, Chang C, Hirschman J, Theesfeld C, Rust J, Livstone MS, Dolinski K, Tyers M . The BioGRID interaction database: 2015 update . Nucleic Acids Research . 43 . Database issue . D470–8 . January 2015 . 25428363 . 4383984 . 10.1093/nar/gku1204 .
- Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, Klitgord N, Simon C, Boxem M, Milstein S, Rosenberg J, Goldberg DS, Zhang LV, Wong SL, Franklin G, Li S, Albala JS, Lim J, Fraughton C, Llamosas E, Cevik S, Bex C, Lamesch P, Sikorski RS, Vandenhaute J, Zoghbi HY, Smolyar A, Bosak S, Sequerra R, Doucette-Stamm L, Cusick ME, Hill DE, Roth FP, Vidal M . Towards a proteome-scale map of the human protein-protein interaction network . Nature . 437 . 7062 . 1173–8 . October 2005 . 16189514 . 10.1038/nature04209 . 2005Natur.437.1173R . 4427026 .
- Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A . ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context . BMC Bioinformatics . 7 . S7 . March 2006 . Suppl 1 . 16723010 . 1810318 . q-bio/0410037 . 10.1186/1471-2105-7-S1-S7 . free .
- Jaeger S, Gaudan S, Leser U, Rebholz-Schuhmann D . Integrating protein-protein interactions and text mining for protein function prediction . BMC Bioinformatics . 9 . S2 . July 2008 . Suppl 8 . 18673526 . 2500093 . 10.1186/1471-2105-9-S8-S2 . free .
- Bonetta L . Protein-protein interactions: Interactome under construction . Nature . 468 . 7325 . 851–4 . December 2010 . 21150998 . 10.1038/468851a . 2010Natur.468..851B . 205060874 . free .
- Reimand J, Bader GD . Systematic analysis of somatic mutations in phosphorylation signaling predicts novel cancer drivers . Molecular Systems Biology . 9 . 637 . 2013 . 23340843 . 3564258 . 10.1038/msb.2012.68 .
- Rolland T, Taşan M, Charloteaux B, Pevzner SJ, Zhong Q, Sahni N, Yi S, Lemmens I, Fontanillo C, Mosca R, Kamburov A, Ghiassian SD, Yang X, Ghamsari L, Balcha D, Begg BE, Braun P, Brehme M, Broly MP, Carvunis AR, Convery-Zupan D, Corominas R, Coulombe-Huntington J, Dann E, Dreze M, Dricot A, Fan C, Franzosa E, Gebreab F, Gutierrez BJ, Hardy MF, Jin M, Kang S, Kiros R, Lin GN, Luck K, MacWilliams A, Menche J, Murray RR, Palagi A, Poulin MM, Rambout X, Rasla J, Reichert P, Romero V, Ruyssinck E, Sahalie JM, Scholz A, Shah AA, Sharma A, Shen Y, Spirohn K, Tam S, Tejeda AO, Trigg SA, Twizere JC, Vega K, Walsh J, Cusick ME, Xia Y, Barabási AL, Iakoucheva LM, Aloy P, De Las Rivas J, Tavernier J, Calderwood MA, Hill DE, Hao T, Roth FP, Vidal M . A proteome-scale map of the human interactome network . Cell . 159 . 5 . 1212–1226 . November 2014 . 25416956 . 4266588 . 10.1016/j.cell.2014.10.050 .
- Yu H, Braun P, Yildirim MA, Lemmens I, Venkatesan K, Sahalie J, Hirozane-Kishikawa T, Gebreab F, Li N, Simonis N, Hao T, Rual JF, Dricot A, Vazquez A, Murray RR, Simon C, Tardivo L, Tam S, Svrzikapa N, Fan C, de Smet AS, Motyl A, Hudson ME, Park J, Xin X, Cusick ME, Moore T, Boone C, Snyder M, Roth FP, Barabási AL, Tavernier J, Hill DE, Vidal M . High-quality binary protein interaction map of the yeast interactome network . Science . 322 . 5898 . 104–10 . October 2008 . 18719252 . 2746753 . 10.1126/science.1158684 . 2008Sci...322..104Y .
- Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, Hao YL, Ooi CE, Godwin B, Vitols E, Vijayadamodar G, Pochart P, Machineni H, Welsh M, Kong Y, Zerhusen B, Malcolm R, Varrone Z, Collis A, Minto M, Burgess S, McDaniel L, Stimpson E, Spriggs F, Williams J, Neurath K, Ioime N, Agee M, Voss E, Furtak K, Renzulli R, Aanensen N, Carrolla S, Bickelhaupt E, Lazovatsky Y, DaSilva A, Zhong J, Stanyon CA, Finley RL, White KP, Braverman M, Jarvie T, Gold S, Leach M, Knight J, Shimkets RA, McKenna MP, Chant J, Rothberg JM . 1642026 . A protein interaction map of Drosophila melanogaster . Science . 302 . 5651 . 1727–36 . December 2003 . 14605208 . 10.1126/science.1090289 . 2003Sci...302.1727G . free .
- Xenarios I, Salwínski L, Duan XJ, Higney P, Kim SM, Eisenberg D . DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions . Nucleic Acids Research . 30 . 1 . 303–5 . January 2002 . 11752321 . 99070 . 10.1093/nar/30.1.303 .
- Lim J, Hao T, Shaw C, Patel AJ, Szabó G, Rual JF, Fisk CJ, Li N, Smolyar A, Hill DE, Barabási AL, Vidal M, Zoghbi HY . A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration . Cell . 125 . 4 . 801–14 . May 2006 . 16713569 . 10.1016/j.cell.2006.03.032 . free .
- Chang J, Gilman SR, Chiang AH, Sanders SJ, Vitkup D . Genotype to phenotype relationships in autism spectrum disorders . Nature Neuroscience . 18 . 2 . 191–8 . February 2015 . 25531569 . 4397214 . 10.1038/nn.3907 .
- Cancer Genome Atlas Research Network . Comprehensive genomic characterization of squamous cell lung cancers . Nature . 489 . 7417 . 519–25 . September 2012 . 22960745 . 3466113 . 10.1038/nature11404 . 2012Natur.489..519T .
- Gulati S, Cheng TM, Bates PA . Cancer networks and beyond: interpreting mutations using the human interactome and protein structure . Seminars in Cancer Biology . 23 . 4 . 219–26 . August 2013 . 23680723 . 10.1016/j.semcancer.2013.05.002 .
- Taylor IW, Linding R, Warde-Farley D, Liu Y, Pesquita C, Faria D, Bull S, Pawson T, Morris Q, Wrana JL . Dynamic modularity in protein interaction networks predicts breast cancer outcome . Nature Biotechnology . 27 . 2 . 199–204 . February 2009 . 19182785 . 10.1038/nbt.1522 . 11594017 .
- Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksöz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H, Wanker EE . A human protein-protein interaction network: a resource for annotating the proteome . Cell . 122 . 6 . 957–68 . September 2005 . 16169070 . 10.1016/j.cell.2005.08.029 . 11858/00-001M-0000-0010-8592-0 . 8235923 . free .
- Chaurasia G, Iqbal Y, Hänig C, Herzel H, Wanker EE, Futschik ME . UniHI: an entry gate to the human protein interactome . Nucleic Acids Research . 35 . Database issue . D590–4 . January 2007 . 17158159 . 1781159 . 10.1093/nar/gkl817 .
- Futschik ME, Chaurasia G, Herzel H . Comparison of human protein-protein interaction maps . Bioinformatics . 23 . 5 . 605–11 . March 2007 . 17237052 . 10.1093/bioinformatics/btl683 . free .
- Tyagi M, Hashimoto K, Shoemaker BA, Wuchty S, Panchenko AR . Large-scale mapping of human protein interactome using structural complexes . EMBO Reports . 13 . 3 . 266–71 . March 2012 . 22261719 . 3296913 . 10.1038/embor.2011.261 .
- De Las Rivas J, Fontanillo C . Protein-protein interactions essentials: key concepts to building and analyzing interactome networks . PLOS Computational Biology . 6 . 6 . e1000807 . June 2010 . 20589078 . 2891586 . 10.1371/journal.pcbi.1000807 . 2010PLSCB...6E0807D . free .