Enzyme Function Initiative Explained

Enzyme Function Initiative (EFI)
Formation: 2010
Purpose: Develop and disseminate a robust strategy to determine enzyme function
Headquarters: University of Illinois, Urbana-Champaign
Leader Title: Principal Investigator
Leader Name: John A. Gerlt, Ph.D.
Budget: Five-year NIGMS Glue Grant
Website: www.enzymefunction.org

The Enzyme Function Initiative (EFI) is a large-scale collaborative project that aims to develop and disseminate a robust strategy for determining enzyme function through an integrated sequence–structure-based approach.[1] The project was funded in May 2010 by the National Institute of General Medical Sciences as a Glue Grant, a mechanism that supports research on complex biological problems that cannot be solved by a single research group.[2] [3] The EFI was largely spurred by the need to develop methods to identify the functions of the enormous number of proteins discovered through genomic sequencing projects.[4]

Motivation

Dramatic advances in genome sequencing technology have caused the number of protein sequences deposited in public databases to grow at an apparently exponential rate.[5] To cope with the influx of sequences, databases use computational predictions to auto-annotate the functions of individual proteins. While these computational methods are extremely high-throughput and generally provide accurate broad classifications, their exclusive use has led to a significant level of misannotation of enzyme function in protein databases.[6] Thus, although the information now available represents an unprecedented opportunity to understand cellular metabolism across a wide variety of organisms, including the ability to identify molecules and reactions that may benefit human quality of life, this potential has not been fully realized.[7] The biological community's ability to characterize newly discovered proteins has been outstripped by the rate of genome sequencing, and the task of assigning function is now considered the rate-limiting step in understanding biological systems in detail.[8]

Integrated strategy for functional assignment

The EFI is developing an integrated sequence–structure-based strategy for functional assignment by predicting the substrate specificities of unknown members of mechanistically diverse enzyme superfamilies.[9] The approach leverages conserved features within a given superfamily, such as known chemistry, the identity of active-site functional groups, and the composition of specificity-determining residues, motifs, or structures, to predict function, but it relies on multidisciplinary expertise to streamline, refine, and test the predictions.[10] [11] [12] The integrated sequence–structure strategy under development will be generally applicable to deciphering the ligand specificities of any functionally unknown protein.[9]

Organization

By NIGMS program mandate, Glue Grant consortia must contain core resources and bridging projects.[3] The EFI consists of six scientific cores, which provide bioinformatic, structural, computational, and data management expertise to facilitate functional predictions for the enzymes of unknown function that the EFI targets. At the beginning of the grant, these predictions were tested by five Bridging Projects representing the amidohydrolase, enolase, glutathione S-transferase (GST), haloacid dehalogenase (HAD), and isoprenoid synthase enzyme superfamilies. Three Bridging Projects now remain.[9] In addition, the Anaerobic Enzymology Pilot Project was added in 2014 to explore the Radical SAM superfamily and the Glycyl Radical Enzyme superfamily.

Scientific cores

The bioinformatics core contributes bioinformatic analysis by collecting and curating complete sequence data sets, generating sequence similarity networks, and classifying superfamily members into subgroups and families for subsequent annotation transfer and evaluation as targets for functional characterization.
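The idea behind a sequence similarity network can be illustrated with a minimal sketch: sequences are nodes, an edge joins two sequences whose pairwise similarity score meets a chosen threshold, and the resulting connected components serve as candidate subgroups. The sketch below is not the EFI's software. The function names, the sequence labels, the scores, and the threshold are all hypothetical, and a real analysis would derive its scores from an all-by-all sequence comparison.

```python
from collections import defaultdict


def build_network(scores, threshold):
    """Link two sequences when their pairwise score meets the threshold."""
    adjacency = defaultdict(set)
    for (a, b), score in scores.items():
        # Touch both keys so that isolated sequences still appear as nodes.
        adjacency[a]
        adjacency[b]
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def clusters(adjacency):
    """Return the connected components of the network as sorted lists."""
    seen = set()
    groups = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


# Hypothetical pairwise similarity scores (higher means more similar).
toy_scores = {
    ("seqA", "seqB"): 80.0,
    ("seqB", "seqC"): 65.0,
    ("seqA", "seqC"): 20.0,
    ("seqC", "seqD"): 10.0,
    ("seqD", "seqE"): 90.0,
}

print(clusters(build_network(toy_scores, threshold=50.0)))
```

In this toy example, raising the threshold splits the network into more, smaller clusters, while lowering it merges them. This is why the choice of threshold matters when subgroups are used for annotation transfer.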

The protein core develops cloning, expression, and protein purification strategies for the enzymes targeted for study.

The structure core fulfills the structural biology component of the EFI by providing high-resolution structures of targeted enzymes.

The computation core performs in silico docking to generate rank-ordered lists of predicted substrates for targeted enzymes, using experimentally determined and/or homology-modeled protein structures.
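As a rough illustration of what a rank-ordered list of predicted substrates looks like, the sketch below sorts candidate ligands by a docking score. It assumes the common convention that a more negative score indicates more favorable predicted binding. The ligand names and scores are invented, and the function is hypothetical rather than part of any EFI tool.

```python
def rank_candidates(docking_scores):
    """Order candidate ligands from most to least favorable score.

    Assumes lower (more negative) scores are better, as in many
    energy-like docking scoring functions.
    """
    return sorted(docking_scores.items(), key=lambda item: item[1])


# Hypothetical docking scores for three candidate substrates.
toy_scores = {"ligand_1": -7.2, "ligand_2": -9.5, "ligand_3": -4.1}

for rank, (name, score) in enumerate(rank_candidates(toy_scores), start=1):
    print(rank, name, score)
```

A ranking of this kind is a prioritization aid: the top-ranked candidates are the ones tested experimentally first.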

The microbiology core examines in vivo functions using genetic techniques and metabolomics to complement in vitro functions determined by the Bridging Projects.

The data and dissemination core maintains a public database for experimental data (EFI-DB).[13] [14]

Bridging projects

The enolase superfamily contains evolutionarily related enzymes with a (β/α)7β‑barrel (TIM‑barrel) fold which primarily catalyze metal-assisted epimerization/racemization or β-elimination of carboxylate substrates.[15]

The haloacid dehalogenase (HAD) superfamily contains evolutionarily related enzymes with a Rossmannoid α/β fold and an inserted "cap" region, which primarily catalyze metal-assisted nucleophilic catalysis, most frequently resulting in phosphoryl group transfer.[16]

The isoprenoid synthase (I) superfamily contains evolutionarily related enzymes with a mostly all-α-helical fold, which primarily catalyze trans-prenyl transfer reactions to form elongated or cyclized isoprene products.[17]

The Anaerobic Enzymology Pilot Project will explore radical-dependent enzymology, which allows unusual chemical transformations to be carried out either via an iron-sulfur cluster that cleaves S-adenosyl methionine (SAM) to produce a radical intermediate or, alternatively, via abstraction of a hydrogen from glycine to produce a glycyl radical. The superfamilies containing these enzymes are largely unexplored and thus ripe with potential for functional discoveries. The acquisition of an anaerobic protein production pipeline, coupled with the installation of a Biosafety Level 2 anaerobic chamber for culturing human gut microbes, has readied the EFI to pursue anaerobic enzymology.

Participating investigators

Twelve investigators with expertise in various disciplines make up the EFI.[18]

Name | Institution | Role
Gerlt, John A. | University of Illinois, Urbana-Champaign | Program Director, Director of the Enolase Bridging Project, co-director of Data and Dissemination Core
Allen, Karen N. | Boston University | Director of the HAD Bridging Project
Almo, Steven C. | Albert Einstein College of Medicine | Director of the Protein Core and Structure Core
Cronan, John E. | University of Illinois, Urbana-Champaign | Co-director of the Microbiology Core
Jacobson, Matthew P. | University of California, San Francisco | Co-director of the Computation Core
Minor, Wladek | University of Virginia | Co-director of Data and Dissemination Core
Poulter, C. Dale | University of Utah | Director of the Isoprenoid Synthase Bridging Project
Sali, Andrej | University of California, San Francisco | Co-director of the Computation Core
Shoichet, Brian K. | University of California, San Francisco | Co-director of the Computation Core
Sweedler, Jonathan V. | University of Illinois, Urbana-Champaign | Co-director of the Microbiology Core
Pollard, Katherine S. | Gladstone Institutes | Director of the Sifting Families Pilot Project
Booker, Squire J. | Pennsylvania State University | Director of the Anaerobic Enzymology Pilot Project

Deliverables

The EFI's primary deliverable is the development and dissemination of an integrated sequence/structure strategy for functional assignment. The EFI now offers access to two high-throughput docking tools, a web tool for comparing protein sequences within entire protein families, and a web tool for composing a genome context inventory based on a protein sequence similarity network. Additionally, as the strategy is developed, data and clones generated by the EFI are made freely available via several online resources.[9]

Funding

The EFI was established in May 2010 with $33.9 million in funding over a five-year period (grant number GM093342).[19]

Notes and References

  1. New NIGMS 'Glue Grant' Takes Aim at Unknown Enzymes. NIGMS. 2010-05-20. Archived from the original on 2012-04-27: https://web.archive.org/web/20120427131544/http://www.nigms.nih.gov/News/Results/gluegrant_051510.htm. Retrieved 2012-04-27.
  2. Glue Grants. NIGMS. Archived from the original on 2013-06-03: https://web.archive.org/web/20130603062033/http://www.nigms.nih.gov/Research/FeaturedPrograms/Collaborative/GlueGrants/. Retrieved 2012-04-27.
  3. PAR-07-412: Large-Scale Collaborative Project Awards (R24/U54). NIH/NIGMS. Retrieved 2012-04-27.
  4. Researchers Awarded $33.9 Million Grant to Study Enzyme Functions. UIUC News Bureau. 2010-05-20. Retrieved 2012-04-27.
  5. UniProtKB/TrEMBL Protein Database Release Statistics. UniProtKB/TrEMBL Protein Database. Archived from the original on 2015-10-01: https://web.archive.org/web/20151001200723/http://www.ebi.ac.uk/uniprot/TrEMBLstats. Retrieved 2012-04-27.
  6. Schnoes, Alexandra M.; Brown, Shoshana D.; Dodevski, Igor; Babbitt, Patricia C. (2009). Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies. Valencia, Alfonso (ed.). PLOS Computational Biology 5 (12): e1000605. doi:10.1371/journal.pcbi.1000605. PMID 20011109. PMC 2781113. Bibcode 2009PLSCB...5E0605S.
  7. Saghatelian, Alan; Cravatt, Benjamin F (2005). Assignment of protein function in the postgenomic era. Nature Chemical Biology 1 (3): 130–42. doi:10.1038/nchembio0805-130. PMID 16408016. S2CID 86672970.
  8. Brown, Shoshana; Gerlt, John; Seffernick, Jennifer; Babbitt, Patricia (2006). A gold standard set of mechanistically diverse enzyme superfamilies. Genome Biology 7 (1): R8. doi:10.1186/gb-2006-7-1-r8. PMID 16507141. PMC 1431709.
  9. Gerlt JA, Allen KN, Almo SC, Armstrong RN, Babbitt PC, Cronan JE, Dunaway-Mariano D, Imker HJ, Jacobson MP, Minor W, Poulter CD, Raushel FM, Sali A, Shoichet BK, Sweedler JV (Nov 22, 2011). The Enzyme Function Initiative. Biochemistry 50 (46): 9950–62. doi:10.1021/bi201312u. PMID 21999478. PMC 3238057.
  10. Song, Ling; Kalyanaraman, Chakrapani; Fedorov, Alexander A; Fedorov, Elena V; Glasner, Margaret E; Brown, Shoshana; Imker, Heidi J; Babbitt, Patricia C; Almo, Steven C (2007). Prediction and assignment of function for a divergent N-succinyl amino acid racemase. Nature Chemical Biology 3 (8): 486–91. doi:10.1038/nchembio.2007.11. PMID 17603539. S2CID 28679225.
  11. Hermann, Johannes C.; Marti-Arbona, Ricardo; Fedorov, Alexander A.; Fedorov, Elena; Almo, Steven C.; Shoichet, Brian K.; Raushel, Frank M. (2007). Structure-based activity prediction for an enzyme of unknown function. Nature 448 (7155): 775–779. doi:10.1038/nature05981. PMID 17603473. PMC 2254328. Bibcode 2007Natur.448..775H.
  12. Kalyanaraman, C; Imker, H; Fedorov, A; Fedorov, E; Glasner, M; Babbitt, P; Almo, S; Gerlt, J; Jacobson, M (2008). Discovery of a Dipeptide Epimerase Enzymatic Function Guided by Homology Modeling and Virtual Screening. Structure 16 (11): 1668–77. doi:10.1016/j.str.2008.08.015. PMID 19000819. PMC 2714228.
  13. Pegg, Scott C.-H.; Brown, Shoshana D.; Ojha, Sunil; Seffernick, Jennifer; Meng, Elaine C.; Morris, John H.; Chang, Patricia J.; Huang, Conrad C.; Ferrin, Thomas E. (2006). Leveraging Enzyme Structure−Function Relationships for Functional Inference and Experimental Design: The Structure−Function Linkage Database. Biochemistry 45 (8): 2545–55. doi:10.1021/bi052101l. PMID 16489747.
  14. EFI-DB Experimental Database. Enzyme Function Initiative. Retrieved 2012-04-27.
  15. Gerlt, John A.; Babbitt, Patricia C.; Rayment, Ivan (2005). Divergent evolution in the enolase superfamily: The interplay of mechanism and specificity. Archives of Biochemistry and Biophysics 433 (1): 59–70. doi:10.1016/j.abb.2004.07.034. PMID 15581566.
  16. Burroughs, A. Maxwell; Allen, Karen N.; Dunaway-Mariano, Debra; Aravind, L. (2006). Evolutionary Genomics of the HAD Superfamily: Understanding the Structural Adaptations and Catalytic Diversity in a Superfamily of Phosphoesterases and Allied Enzymes. Journal of Molecular Biology 361 (5): 1003–34. doi:10.1016/j.jmb.2006.06.049. PMID 16889794. CiteSeerX 10.1.1.420.9551.
  17. Christianson, David W. (2006). Structural Biology and Chemistry of the Terpenoid Cyclases. Chemical Reviews 106 (8): 3412–42. doi:10.1021/cr050286w. PMID 16895335.
  18. People. Enzyme Function Initiative. Retrieved 2012-04-27.
  19. NIGMS Glue Grants Outcomes Assessment. NIGMS. Archived from the original on 2012-04-27: https://web.archive.org/web/20120427104459/http://www.nigms.nih.gov/Research/FeaturedPrograms/Collaborative/GlueGrants/OutcomeAssessment/. Retrieved 2012-04-27.