Chemical biology is a scientific discipline between the fields of chemistry and biology. The discipline involves the application of chemical techniques, analysis, and often small molecules produced through synthetic chemistry, to the study and manipulation of biological systems.[1] Although often confused with biochemistry, which studies the chemistry of biomolecules and regulation of biochemical pathways within and between cells, chemical biology remains distinct by focusing on the application of chemical tools to address biological questions.
Although considered a relatively new scientific field,[2] the term "chemical biology" has been in use since the early 20th century,[3] and has roots in scientific discovery from the early 19th century. The term 'chemical biology' can be traced back to an early appearance in a book published by Alonzo E. Taylor in 1907 titled "On Fermentation",[4] and was subsequently used in John B. Leathes' 1930 article titled "The Harveian Oration on The Birth of Chemical Biology".[5] However, it is unclear when the term was first used.
Friedrich Wöhler's 1828 synthesis of urea is an early example of the application of synthetic chemistry to advance biology.[6] It showed that biological compounds could be synthesized with inorganic starting materials and weakened the previous notion of vitalism, or that a 'living' source was required to produce organic compounds.[7] [8] Wöhler's work is often considered to be instrumental in the development of organic chemistry and natural product synthesis, both of which play a large part in modern chemical biology.[9]
Friedrich Miescher's work during the late 19th century investigating the cellular contents of human leukocytes led to the discovery of 'nuclein', which would later be renamed DNA. After isolating the nuclein from the nucleus of leukocytes through protease digestion, Miescher used chemical techniques such as elemental analysis and solubility tests to determine the composition of nuclein.[10] This work would lay the foundations for Watson and Crick's discovery of the double-helix structure of DNA.[11]
The rising interest in chemical biology has led to the setting up of several journals dedicated to the field. Nature Chemical Biology, created in 2005,[12] and ACS Chemical Biology, created in 2006,[13] are two of the most well-known journals in this field, with impact factors of 14.8[14] and 4.0[15] respectively.
Laureate(s) | Year | Discipline | Contribution
Paul Berg | 1980 | Chemistry | Recombinant DNA[16]
Walter Gilbert, Frederick Sanger | 1980 | Chemistry | Genome sequencing
Kary Mullis | 1993 | Chemistry | Polymerase chain reaction[17]
Michael Smith | 1993 | Chemistry | Site-directed mutagenesis
Venkatraman Ramakrishnan, Thomas A. Steitz, Ada E. Yonath | 2009 | Chemistry | Elucidation of ribosome structure and function[18]
Robert J. Lefkowitz, Brian K. Kobilka | 2012 | Chemistry | G-protein-coupled receptors[19]
Frances H. Arnold, George P. Smith, Gregory P. Winter | 2018 | Chemistry | Enzyme development through directed evolution[20]
Emmanuelle Charpentier, Jennifer A. Doudna | 2020 | Chemistry | CRISPR/Cas9 genetic scissors[21]
Barry Sharpless, Morten Meldal | 2022 | Chemistry | Click chemistry[22]
Carolyn Bertozzi | 2022 | Chemistry | Applications of click chemistry in living organisms
Glycobiology is the study of the structure and function of carbohydrates.[23] While DNA, RNA and proteins are encoded at the genetic level, carbohydrates are not encoded directly from the genome, and thus require different tools for their study.[24] By applying chemical principles to glycobiology, novel methods for analyzing and synthesizing carbohydrates can be developed.[25] For example, cells can be supplied with synthetic variants of natural sugars to probe their function. Carolyn Bertozzi's research group has developed methods for site-specifically reacting molecules at the surface of cells via synthetic sugars.[26]
Combinatorial chemistry involves simultaneously synthesizing a large number of related compounds for high-throughput analysis.[27] Chemical biologists are able to use principles from combinatorial chemistry in synthesizing active drug compounds and maximizing screening efficiency.[28] Similarly, these principles can be used in areas of agriculture and food research, specifically in the syntheses of unnatural products and in generating novel enzyme inhibitors.[29]
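The combinatorial idea can be illustrated with a minimal Python sketch that enumerates every member of a library built from one scaffold and several sets of substituents. The scaffold string and the substituent lists are hypothetical placeholders chosen for illustration; they do not come from any cited study, and a real workflow would use a cheminformatics toolkit rather than string formatting.

```python
from itertools import product


def enumerate_library(scaffold, substituent_sets):
    """Return one entry per combination, taking one substituent per position."""
    return [scaffold.format(*choice) for choice in product(*substituent_sets)]


# Hypothetical building blocks, for illustration only.
r1 = ["H", "CH3", "Cl"]
r2 = ["OH", "NH2"]
r3 = ["Ph", "Bn", "iPr", "tBu"]

library = enumerate_library("core({},{},{})", [r1, r2, r3])

# The library size is the product of the set sizes: 3 * 2 * 4.
print(len(library))
print(library[:3])
```

The multiplicative growth in library size is why combinatorial synthesis is usually paired with high-throughput screening.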
Chemical synthesis of proteins is a valuable tool in chemical biology as it allows for the introduction of non-natural amino acids as well as the residue-specific incorporation of post-translational modifications such as phosphorylation, glycosylation, acetylation, and even ubiquitination.[30] These properties are valuable for chemical biologists as non-natural amino acids can be used to probe and alter the functionality of proteins, while post-translational modifications are widely known to regulate the structure and activity of proteins.[31] Although strictly biological techniques have been developed to achieve these ends, the chemical synthesis of peptides often has a lower technical and practical barrier to obtaining small amounts of the desired protein.[32]
To make protein-sized polypeptide chains with the small peptide fragments made by synthesis, chemical biologists can use the process of native chemical ligation.[33] Native chemical ligation involves the coupling of a C-terminal thioester and an N-terminal cysteine residue, ultimately resulting in formation of a "native" amide bond.[34] Other strategies that have been used for the ligation of peptide fragments using the acyl transfer chemistry first introduced with native chemical ligation include expressed protein ligation,[35] sulfurization/desulfurization techniques,[36] and use of removable thiol auxiliaries.[37]
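Because native chemical ligation requires an N-terminal cysteine on every fragment after the first, candidate junctions in a target sequence can be located by simple inspection. The sketch below is a simplified illustration that splits a sequence at every cysteine. The example sequence is invented, and real synthetic planning would also weigh fragment length, solubility, and the identity of the C-terminal thioester residue, none of which are modelled here.

```python
def split_at_cysteines(sequence):
    """Split a one-letter protein sequence so each later fragment starts with Cys.

    Every cysteine except one at position 0 is treated as a possible
    ligation junction. This is an assumption made for illustration only.
    """
    cut_points = [i for i, aa in enumerate(sequence) if aa == "C" and i > 0]
    bounds = [0] + cut_points + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical target sequence, not taken from any real protein.
target = "MKTAYCIAKQRCQISFVK"
fragments = split_at_cysteines(target)
print(fragments)
```

Joining the fragments in order regenerates the target sequence, mirroring the way successive ligations rebuild the full-length chain.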
Chemical biologists work to improve proteomics through the development of enrichment strategies, chemical affinity tags, and new probes. Samples for proteomics often contain many peptide sequences, and the sequence of interest may be highly represented or of low abundance, which creates a barrier to its detection. Chemical biology methods can reduce sample complexity by selective enrichment using affinity chromatography. This involves targeting a peptide with a distinguishing feature such as a biotin label or a post-translational modification.[38] Methods that have been developed include the use of antibodies, lectins to capture glycoproteins, immobilized metal ions to capture phosphorylated peptides, and enzyme substrates to capture select enzymes.
To investigate enzymatic activity as opposed to total protein, activity-based reagents have been developed to label the enzymatically active form of proteins (see Activity-based proteomics). For example, serine hydrolase- and cysteine protease-inhibitors have been converted to suicide inhibitors.[39] This strategy enhances the ability to selectively analyze low abundance constituents through direct targeting.[40] Enzyme activity can also be monitored through converted substrate.[41] Identification of enzyme substrates is a problem of significant difficulty in proteomics and is vital to the understanding of signal transduction pathways in cells. A method that has been developed uses "analog-sensitive" kinases to label substrates using an unnatural ATP analog, facilitating visualization and identification through a unique handle.[42]
Many research programs are also focused on employing natural biomolecules to perform biological tasks or to support a new chemical method. In this regard, chemical biology researchers have shown that DNA can serve as a template for synthetic chemistry, self-assembling proteins can serve as a structural scaffold for new materials, and RNA can be evolved in vitro to produce new catalytic function. Additionally, heterobifunctional (two-sided) synthetic small molecules such as dimerizers or PROTACs bring two proteins together inside cells, which can synthetically induce important new biological functions such as targeted protein degradation.[43]
A primary goal of protein engineering is the design of novel peptides or proteins with a desired structure and chemical activity.[44] Because our knowledge of the relationship between primary sequence, structure, and function of proteins is limited, rational design of new proteins with engineered activities is extremely challenging.[45] In directed evolution, repeated cycles of genetic diversification followed by a screening or selection process can be used to mimic natural selection in the laboratory to design new proteins with a desired activity.[46]
Several methods exist for creating large libraries of sequence variants. Among the most widely used are subjecting DNA to UV radiation or chemical mutagens, error-prone PCR, degenerate codons, or recombination.[47] [48] Once a large library of variants is created, selection or screening techniques are used to find mutants with a desired attribute. Common selection/screening techniques include FACS,[49] mRNA display,[50] phage display, and in vitro compartmentalization.[51] Once useful variants are found, their DNA sequence is amplified and subjected to further rounds of diversification and selection.
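The diversify, screen, and repeat cycle can be sketched as a short simulation. This is a toy model under loud assumptions: the `TARGET` sequence, the `toy_fitness` function, and all parameter values are hypothetical stand-ins for a laboratory assay, and random point substitution stands in for error-prone PCR. It is not a model of any published experiment.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def mutate(seq, rate, rng):
    """Replace each residue with a random one at the given per-site rate."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else aa for aa in seq)


def directed_evolution(parent, fitness, rounds=10, library_size=200,
                       keep=5, rate=0.05, seed=0):
    """Run repeated rounds of diversification followed by selection."""
    rng = random.Random(seed)
    pool = [parent]
    history = [fitness(parent)]
    for _ in range(rounds):
        # Diversification: build a library of variants from the current pool.
        library = [mutate(rng.choice(pool), rate, rng) for _ in range(library_size)]
        # Screening: rank variants and parents, ties broken alphabetically.
        ranked = sorted(set(library + pool), key=lambda s: (-fitness(s), s))
        # Selection: carry the best variants into the next round.
        pool = ranked[:keep]
        history.append(fitness(pool[0]))
    return pool[0], history


# Hypothetical screen: count positions matching an arbitrary target sequence.
TARGET = "MKVLHEAGW"


def toy_fitness(seq):
    return sum(a == b for a, b in zip(seq, TARGET))


best, history = directed_evolution("AAAAAAAAA", toy_fitness)
print(best, history)
```

Keeping the parents in the ranked pool means the best score never decreases between rounds, a design choice of this sketch rather than a property of every laboratory protocol.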
The development of directed evolution methods was honored in 2018 with the awarding of the Nobel Prize in Chemistry to Frances Arnold for evolution of enzymes, and George Smith and Gregory Winter for phage display.[52]
Successful labeling of a molecule of interest requires specific functionalization of that molecule to react chemospecifically with an optical probe. For a labeling experiment to be considered robust, that functionalization must minimally perturb the system. Unfortunately, these requirements are often hard to meet. Many of the reactions normally available to organic chemists in the laboratory are unavailable in living systems.[53] Water- and redox-sensitive reactions would not proceed, reagents prone to nucleophilic attack would offer no chemospecificity, and any reactions with large kinetic barriers would not find enough energy in the relatively low-heat environment of a living cell.[54] Thus, chemists have recently developed a panel of bioorthogonal reactions that proceed chemospecifically, despite the milieu of distracting reactive materials in vivo.
The coupling of a probe to a molecule of interest must occur within a reasonably short time frame;[55] therefore, the kinetics of the coupling reaction should be highly favorable. Click chemistry is well suited to fill this niche, since click reactions are rapid, spontaneous, selective, and high-yielding. Unfortunately, the most famous "click reaction," a [3+2] cycloaddition between an azide and an acyclic alkyne, is copper-catalyzed, posing a serious problem for use in vivo due to copper's toxicity. To bypass the necessity for a catalyst, Carolyn R. Bertozzi's lab introduced inherent strain into the alkyne species by using a cyclic alkyne. In particular, cyclooctyne reacts with azido-molecules with distinctive vigor.
The advances in modern sequencing technologies in the late 1990s allowed scientists to investigate the DNA of communities of organisms in their natural environments ("eDNA") without culturing individual species in the lab. This metagenomic approach enabled scientists to study a wide selection of organisms that had not previously been characterized, due in part to the lack of suitable growth conditions. Sources of eDNA include soils, ocean, subsurface, hot springs, hydrothermal vents, polar ice caps, hypersaline habitats, and extreme pH environments.[56] Among the many applications of metagenomics, researchers such as Jo Handelsman, Jon Clardy, and Robert M. Goodman explored metagenomic approaches toward the discovery of biologically active molecules such as antibiotics.[57]
Functional or homology screening strategies have been used to identify genes that produce small bioactive molecules. Functional metagenomic studies are designed to search for specific phenotypes that are associated with molecules with specific characteristics. Homology metagenomic studies, on the other hand, are designed to examine genes to identify conserved sequences that were previously associated with the expression of biologically active molecules.[58] Functional metagenomic studies enable the discovery of novel genes that encode biologically active molecules. The assays used include top agar overlay assays, in which antibiotics generate zones of growth inhibition against test microbes, and pH assays, which use a pH indicator on an agar plate to screen for pH changes caused by newly synthesized molecules.[59] Substrate-induced gene expression screening (SIGEX), a method to screen for the expression of genes that are induced by chemical compounds, has also been used to search for genes with specific functions.[59] Homology-based metagenomic studies have led to the rapid discovery of genes whose sequences are homologous to previously known genes responsible for the biosynthesis of biologically active molecules. As soon as the genes are sequenced, scientists can compare thousands of bacterial genomes simultaneously.[58] The advantage over functional metagenomic assays is that homology metagenomic studies do not require a host organism to express the metagenomes, so this method can potentially save the time spent on analyzing nonfunctional genomes. These studies have also led to the discovery of several novel proteins and small molecules.[60] In addition, an in silico examination from the Global Ocean Metagenomic Survey found 20 new lantibiotic cyclases.[61]
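The principle behind homology screening, comparing new sequences against a gene already linked to a bioactive molecule, can be illustrated with a deliberately simple k-mer comparison. This sketch is an assumption-laden stand-in for the alignment tools used in practice (for example BLAST or profile hidden Markov model searches). The sequences, the function names, the k-mer length, and the threshold are all invented for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def homology_hits(reference, reads, k=8, threshold=0.5):
    """Score each read by the fraction of its k-mers found in the reference."""
    ref_kmers = kmers(reference, k)
    hits = []
    for name, seq in reads.items():
        read_kmers = kmers(seq, k)
        if not read_kmers:
            continue  # read shorter than k, nothing to compare
        score = len(read_kmers & ref_kmers) / len(read_kmers)
        if score >= threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical reference gene and environmental reads.
reference = "ATGGCGTACGTTAGCCTGAAGGCTTACGATCG"
reads = {
    "read_a": reference[5:25],         # fragment of the reference
    "read_b": "TTTTTTTTTTTTTTTTTTTT",  # unrelated sequence
}
print(homology_hits(reference, reads))
```

No expression host appears anywhere in this comparison, which reflects the advantage of homology-based screening noted above.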
Posttranslational modification of proteins with phosphate groups by kinases is a key regulatory step throughout all biological systems. Phosphorylation events, either phosphorylation by protein kinases or dephosphorylation by phosphatases, result in protein activation or deactivation. These events have an impact on the regulation of physiological pathways, which makes the ability to dissect and study these pathways integral to understanding the details of cellular processes. There exist a number of challenges—namely the sheer size of the phosphoproteome, the fleeting nature of phosphorylation events and related physical limitations of classical biological and biochemical techniques—that have limited the advancement of knowledge in this area.[62]
Through the use of small molecule modulators of protein kinases, chemical biologists have gained a better understanding of the effects of protein phosphorylation. For example, nonselective and selective kinase inhibitors, such as a class of pyridinylimidazole compounds,[63] are potent inhibitors useful in the dissection of MAP kinase signaling pathways. These pyridinylimidazole compounds function by targeting the ATP binding pocket. Although this approach, as well as related approaches,[64] [65] with slight modifications, has proven effective in a number of cases, these compounds lack adequate specificity for more general applications. Another class of compounds, mechanism-based inhibitors, combines knowledge of the kinase enzymology with previously utilized inhibition motifs. For example, a "bisubstrate analog" inhibits kinase action by binding both the conserved ATP binding pocket and a protein/peptide recognition site on the specific kinase.[66] Research groups have also utilized ATP analogs as chemical probes to study kinases and identify their substrates.[67] [68] [69]
The development of novel chemical means of incorporating phosphomimetic amino acids into proteins has provided important insight into the effects of phosphorylation events. Phosphorylation events have typically been studied by mutating an identified phosphorylation site (serine, threonine or tyrosine) to an amino acid, such as alanine, that cannot be phosphorylated. However, these techniques come with limitations, and chemical biologists have developed improved ways of investigating protein phosphorylation. By installing phospho-serine, phospho-threonine or analogous phosphonate mimics into native proteins, researchers are able to perform in vivo studies to investigate the effects of phosphorylation by extending the duration of a phosphorylation event while minimizing the often-unfavorable effects of mutations. Expressed protein ligation has proven to be a successful technique for synthetically producing proteins that contain phosphomimetic molecules at either terminus.[70] In addition, researchers have used unnatural amino acid mutagenesis at targeted sites within a peptide sequence.[71] [72]
Advances in chemical biology have also improved upon classical techniques of imaging kinase action. For example, the development of peptide biosensors (peptides containing incorporated fluorophores) improved the temporal resolution of in vitro binding assays.[73] One of the most useful techniques to study kinase action is Fluorescence Resonance Energy Transfer (FRET). To utilize FRET for phosphorylation studies, fluorescent proteins are coupled to both a phosphoamino acid binding domain and a peptide that can be phosphorylated. Upon phosphorylation or dephosphorylation of a substrate peptide, a conformational change occurs that results in a change in fluorescence.[74] FRET has also been used in tandem with Fluorescence Lifetime Imaging Microscopy (FLIM)[75] or fluorescently conjugated antibodies and flow cytometry[76] to provide quantitative results with excellent temporal and spatial resolution.
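The reason a conformational change alters the FRET signal is the steep distance dependence of energy transfer, commonly described by the Förster relation E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50% efficient. The sketch below evaluates this relation. The function name and the example R0 and distances are illustrative assumptions, not values from the cited studies.

```python
def fret_efficiency(r, r0):
    """Förster transfer efficiency for donor-acceptor distance r and radius r0.

    Both distances must be in the same units (for example nanometres).
    """
    if r < 0 or r0 <= 0:
        raise ValueError("r must be non-negative and r0 must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


# Illustrative values only: an assumed Förster radius of 5 nm.
R0 = 5.0
for distance in (2.5, 5.0, 10.0):
    print(distance, round(fret_efficiency(distance, R0), 4))
```

Because of the sixth-power term, halving or doubling the separation around R0 moves the efficiency most of the way between its limits of 1 and 0, which is what makes such biosensors sensitive to small structural rearrangements.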
Chemical biologists often study the functions of biological macromolecules using fluorescence techniques. The advantage of fluorescence over other techniques lies in its high sensitivity, non-invasiveness, safe detection, and the ability to modulate the fluorescence signal. In recent years, the development of green fluorescent protein (GFP) by Roger Y. Tsien and others, hybrid systems, and quantum dots has enabled protein location and function to be assessed more precisely.[77] Three main types of fluorophores are used: small organic dyes, green fluorescent proteins, and quantum dots. Small organic dyes are usually smaller than 1 kDa and have been modified to increase photostability and brightness and to reduce self-quenching. Quantum dots have very narrow emission spectra, high molar absorptivity, and high quantum yield. Neither organic dyes nor quantum dots can recognize the protein of interest without the aid of antibodies, so they must be used with immunolabeling. Fluorescent proteins are genetically encoded and can be fused to the protein of interest. Another genetic tagging technique is the tetracysteine biarsenical system, which requires modifying the targeted sequence to include four cysteines; this motif binds membrane-permeable biarsenical molecules, the green and red dyes "FlAsH" and "ReAsH", with picomolar affinity. Both fluorescent proteins and biarsenical tetracysteine tags can be expressed in live cells, but they present major limitations in ectopic expression and might cause a loss of function.
Fluorescent techniques have been used to assess a number of protein dynamics including protein tracking, conformational changes, protein–protein interactions, protein synthesis and turnover, and enzyme activity, among others. Three general approaches for measuring protein net redistribution and diffusion are single-particle tracking, correlation spectroscopy and photomarking methods. In single-particle tracking, the individual molecule must be both bright and sparse enough to be tracked from one video frame to the next. Correlation spectroscopy analyzes the intensity fluctuations resulting from migration of fluorescent objects into and out of a small volume at the focus of a laser. In photomarking, a fluorescent protein can be dequenched in a subcellular area with the use of intense local illumination and the fate of the marked molecule can be imaged directly. Michalet and coworkers used quantum dots for single-particle tracking using biotin-quantum dots in HeLa cells.[78] One of the best ways to detect conformational changes in proteins is to label the protein of interest with two fluorophores within close proximity. FRET will respond to internal conformational changes that result from reorientation of one fluorophore with respect to the other. One can also use fluorescence to visualize enzyme activity, typically by using a quenched activity-based probe (qABP). Covalent binding of a qABP to the active site of the targeted enzyme provides direct evidence of whether the enzyme is responsible for the signal upon release of the quencher and recovery of fluorescence.[79]
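The fluctuation analysis used in correlation spectroscopy can be illustrated with the normalized autocorrelation G(τ) = ⟨δI(t) δI(t+τ)⟩ / ⟨I⟩², where δI is the deviation of the intensity from its mean. The following is a minimal sketch under simplifying assumptions: the intensity trace is a made-up list of evenly spaced samples, and no fitting to a diffusion model is attempted.

```python
def autocorrelation(trace, max_lag):
    """Normalized fluctuation autocorrelation G(lag) for lag = 0..max_lag.

    trace is a list of intensity samples taken at equal time intervals.
    """
    n = len(trace)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the trace length")
    mean = sum(trace) / n
    if mean == 0:
        raise ValueError("mean intensity must be non-zero")
    deltas = [x - mean for x in trace]
    curve = []
    for lag in range(max_lag + 1):
        pairs = n - lag
        total = sum(deltas[i] * deltas[i + lag] for i in range(pairs))
        curve.append(total / pairs / mean ** 2)
    return curve


# Hypothetical photon-count trace, for illustration only.
trace = [10, 12, 10, 12, 10, 12, 10, 12]
print(autocorrelation(trace, 2))
```

In an experiment, the decay of this curve with increasing lag reflects how long fluorescent molecules stay in the focal volume, which is how diffusion is inferred from the fluctuations.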
Despite an increase in biological research within chemistry departments, attempts at integrating chemical biology into undergraduate curricula are lacking.[80] For example, although the American Chemical Society (ACS) requires the foundational courses of a Chemistry Bachelor's degree to include biochemistry, no other biology-related chemistry course is required.[81]
Although a chemical biology course is often not required for an undergraduate degree in Chemistry, many universities now provide introductory chemical biology courses for their undergraduate students. The University of British Columbia, for example, offers a fourth-year course in synthetic chemical biology.[82]