Expanded genetic code explained

An expanded genetic code is an artificially modified genetic code in which one or more specific codons have been re-allocated to encode an amino acid that is not among the 22 common naturally-encoded proteinogenic amino acids.^[1]

The key prerequisites to expand the genetic code are:

the non-standard amino acid to encode,
an unused codon to adopt,
a tRNA that recognizes this codon, and
a tRNA synthetase that recognizes only that tRNA and only the non-standard amino acid.

Expanding the genetic code is an area of research of synthetic biology, an applied biological discipline whose goal is to engineer living systems for useful purposes. The genetic code expansion enriches the repertoire of useful tools available to science.

In May 2019, researchers, in a milestone effort, reported the creation of a new synthetic (possibly artificial) form of viable life, a variant of the bacterium Escherichia coli, by reducing the natural number of 64 codons in the bacterial genome to 61 codons (eliminating two of the six codons coding for serine and one of the three stop codons), of which 59 are used to encode 20 amino acids.^[2] ^[3]

Introduction

It is noteworthy that the genetic code for all organisms is basically the same, so that all living beings use the same 'genetic language'.^[4] In general, the introduction of new functional unnatural amino acids into proteins of living cells breaks the universality of the genetic language, which ideally leads to alternative life forms.^[5] Proteins are produced thanks to the translational system molecules, which decode the RNA messages into a string of amino acids. The translation of genetic information contained in messenger RNA (mRNA) into a protein is catalysed by ribosomes. Transfer RNAs (tRNA) are used as keys to decode the mRNA into its encoded polypeptide. The tRNA recognizes a specific three nucleotide codon in the mRNA with a complementary sequence called the anticodon on one of its loops. Each three-nucleotide codon is translated into one of twenty naturally occurring amino acids. There is at least one tRNA for any codon, and sometimes multiple codons code for the same amino acid. Many tRNAs are compatible with several codons. An enzyme called an aminoacyl tRNA synthetase covalently attaches the amino acid to the appropriate tRNA.^[6] Most cells have a different synthetase for each amino acid (20 or more synthetases). On the other hand, some bacteria have fewer than 20 aminoacyl tRNA synthetases, and introduce the "missing" amino acid(s) by modification of a structurally related amino acid by an aminotransferase enzyme.^[7] A feature exploited in the expansion of the genetic code is the fact that the aminoacyl tRNA synthetase often does not recognize the anticodon, but another part of the tRNA, meaning that if the anticodon were to be mutated the encoding of that amino acid would change to a new codon. In the ribosome, the information in mRNA is translated into a specific amino acid when the mRNA codon matches with the complementary anticodon of a tRNA, and the attached amino acid is added onto a growing polypeptide chain. 
When it is released from the ribosome, the polypeptide chain folds into a functioning protein.

In order to incorporate a novel amino acid into the genetic code, several changes are required. First, for successful translation of a novel amino acid, the codon to which the novel amino acid is assigned cannot already code for one of the 20 natural amino acids. Usually a nonsense codon (stop codon) or a four-base codon is used.^[8] Second, a novel pair of tRNA and aminoacyl tRNA synthetase is required; this pair is called the orthogonal set. The orthogonal set must not crosstalk with the endogenous tRNA and synthetase sets, while still being functionally compatible with the ribosome and other components of the translation apparatus. The active site of the synthetase is modified to accept only the novel amino acid. Most often, a library of mutant synthetases is screened for one which charges the tRNA with the desired amino acid. The synthetase is also modified to recognize only the orthogonal tRNA. The tRNA synthetase pair is often engineered in other bacteria or eukaryotic cells.
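The effect of reassigning a stop codon can be illustrated with a minimal Python sketch. It is only a toy model: the codon table is a small subset of the standard code, the mRNA sequence is made up, and "NSAA" is a placeholder label for a hypothetical non-standard amino acid. Real translation also involves competition with release factors, which is ignored here.

```python
# Partial standard codon table (illustrative subset only).
STANDARD_CODE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def translate(mrna, code):
    """Read an mRNA in frame, three bases at a time, until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = code[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


# Expanded code: the amber codon UAG is reassigned to a hypothetical
# non-standard amino acid, labelled "NSAA" here.
EXPANDED_CODE = {**STANDARD_CODE, "UAG": "NSAA"}

mrna = "AUGUUUUAGGGCAAAUAA"  # made-up sequence with an in-frame UAG
standard_peptide = translate(mrna, STANDARD_CODE)
expanded_peptide = translate(mrna, EXPANDED_CODE)

print(standard_peptide)  # ['Met', 'Phe']
print(expanded_peptide)  # ['Met', 'Phe', 'NSAA', 'Gly', 'Lys']
```

With the standard table the in-frame UAG terminates the chain; with the expanded table it is read through and the chain continues to the next stop codon.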

In this area of research, the 20 encoded proteinogenic amino acids are referred to as standard amino acids, or alternatively as natural or canonical amino acids, while the added amino acids are called non-standard amino acids (NSAAs), or unnatural amino acids (uAAs; term not used in papers dealing with natural non-proteinogenic amino acids, such as phosphoserine), or non-canonical amino acids.

Non-standard amino acids

The first element of the system is the amino acid that is added to the genetic code of a certain strain of organism.

Over 71 different NSAAs have been added to different strains of E. coli, yeast or mammalian cells.^[9] Due to technical details (easier chemical synthesis of NSAAs, less crosstalk and easier evolution of the aminoacyl-tRNA synthetase), the NSAAs are generally larger than standard amino acids and most often have a phenylalanine core with a large variety of different substituents. These allow a large repertoire of new functions, such as labeling (see figure), acting as a fluorescent reporter (e.g. dansylalanine),^[10] or producing proteins in E. coli with eukaryotic post-translational modifications (e.g. phosphoserine, phosphothreonine, and phosphotyrosine).^[11]

The founding work was reported by Rolf Furter, who singlehandedly used the yeast tRNA^Phe/PheRS pair to incorporate p-iodophenylalanine in E. coli.^[12]

Unnatural amino acids incorporated into proteins include heavy atom-containing amino acids to facilitate certain x-ray crystallographic studies; amino acids with novel steric/packing and electronic properties; photocrosslinking amino acids which can be used to probe protein-protein interactions in vitro or in vivo; keto, acetylene, azide, and boronate-containing amino acids which can be used to selectively introduce a large number of biophysical probes, tags, and novel chemical functional groups into proteins in vitro or in vivo; redox active amino acids to probe and modulate electron transfer; photocaged and photoisomerizable amino acids to photoregulate biological processes; metal binding amino acids for catalysis and metal ion sensing; amino acids that contain fluorescent or infra-red active side chains to probe protein structure and dynamics; α-hydroxy acids and D-amino acids as probes of backbone conformation and hydrogen bonding interactions; and sulfated amino acids and mimetics of phosphorylated amino acids as probes of post-translational modifications.^[13] ^[14] ^[15]

Availability of the non-standard amino acid requires that the organism either import it from the medium or biosynthesize it. In the first case, the unnatural amino acid is first synthesized chemically in its optically pure L-form.^[16] It is then added to the growth medium of the cell. A library of compounds is usually tested for use in incorporation of the new amino acid, but this is not always necessary; for example, various transport systems can handle unnatural amino acids with apolar side-chains. In the second case, a biosynthetic pathway needs to be engineered. One example is an E. coli strain that biosynthesizes a novel amino acid (p-aminophenylalanine) from basic carbon sources and includes it in its genetic code.^[17] ^[18] Another example is phosphoserine: it is a natural metabolite, so its incorporation required altering the pathway flux to increase its production.

Codon assignment

Another element of the system is a codon to allocate to the new amino acid.

A major problem for genetic code expansion is that there are no free codons. The genetic code has a non-random layout that shows tell-tale signs of various phases of primordial evolution; however, it has since frozen into place and is near-universally conserved.^[19] Nevertheless, some codons are rarer than others. In E. coli (and all organisms) codon usage is not equal: several codons are rare (see table), the rarest being the amber stop codon (UAG).

Codon usage in *E. coli*^[20]
Codon	Amino acid	Abundance (%)
UUU	Phe (F)	1.9
UUC	Phe (F)	1.8
UUA	Leu (L)	1.0
UUG	Leu (L)	1.1
CUU	Leu (L)	1.0
CUC	Leu (L)	0.9
CUA	Leu (L)	0.3
CUG	Leu (L)	5.2
AUU	Ile (I)	2.7
AUC	Ile (I)	2.7
AUA	Ile (I)	0.4
AUG	Met (M)	2.6
GUU	Val (V)	2.0
GUC	Val (V)	1.4
GUA	Val (V)	1.2
GUG	Val (V)	2.4
UCU	Ser (S)	1.1
UCC	Ser (S)	1.0
UCA	Ser (S)	0.7
UCG	Ser (S)	0.8
CCU	Pro (P)	0.7
CCC	Pro (P)	0.4
CCA	Pro (P)	0.8
CCG	Pro (P)	2.4
ACU	Thr (T)	1.2
ACC	Thr (T)	2.4
ACA	Thr (T)	0.1
ACG	Thr (T)	1.3
GCU	Ala (A)	1.8
GCC	Ala (A)	2.3
GCA	Ala (A)	0.1
GCG	Ala (A)	3.2
UAU	Tyr (Y)	1.6
UAC	Tyr (Y)	1.4
UAA	Stop	0.2
UAG	Stop	0.03
CAU	His (H)	1.2
CAC	His (H)	1.1
CAA	Gln (Q)	1.3
CAG	Gln (Q)	2.9
AAU	Asn (N)	1.6
AAC	Asn (N)	2.6
AAA	Lys (K)	3.8
AAG	Lys (K)	1.2
GAU	Asp (D)	3.3
GAC	Asp (D)	2.3
GAA	Glu (E)	4.4
GAG	Glu (E)	1.9
UGU	Cys (C)	0.4
UGC	Cys (C)	0.6
UGA	Stop	0.1
UGG	Trp (W)	1.4
CGU	Arg (R)	2.4
CGC	Arg (R)	2.2
CGA	Arg (R)	0.3
CGG	Arg (R)	0.5
AGU	Ser (S)	0.7
AGC	Ser (S)	1.5
AGA	Arg (R)	0.2
AGG	Arg (R)	0.2
GGU	Gly (G)	2.8
GGC	Gly (G)	3.0
GGA	Gly (G)	0.7
GGG	Gly (G)	0.9
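As a rough illustration of how a reassignment candidate might be picked from such data, the following Python sketch ranks a handful of the rarest codons by abundance. The values are copied from the table above; the choice of which codons to include is arbitrary and made only for brevity.

```python
# Abundance (%) of selected rare codons, copied from the table above.
usage = {
    "UAA": 0.2, "UAG": 0.03, "UGA": 0.1,  # stop codons
    "CUA": 0.3, "AUA": 0.4, "CGA": 0.3, "AGA": 0.2, "AGG": 0.2,
}

# The codon with the lowest abundance is the least disruptive to reassign,
# at least by this single criterion.
rarest = min(usage, key=usage.get)
ranked = sorted(usage, key=usage.get)

print(rarest)      # UAG
print(ranked[:2])  # ['UAG', 'UGA']
```

Abundance is only one consideration; which release factor or tRNA reads the codon also matters.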

Amber codon suppression

The possibility of reassigning codons was realized by Normanly et al. in 1990, when a viable mutant strain of E. coli read through the UAG ("amber") stop codon.^[21] This was possible thanks to the rarity of this codon and the fact that release factor 1 alone makes the amber codon terminate translation. Later, in the Schultz lab, the tRNA^Tyr/tyrosyl-tRNA synthetase (TyrRS) pair from Methanococcus jannaschii, an archaeon, was used to introduce a tyrosine instead of STOP, the default value of the amber codon.^[22] This was possible because of the differences between the endogenous bacterial synthetases and the orthologous archaeal synthetase, which do not recognize each other's tRNAs. Subsequently, the group evolved the orthogonal tRNA/synthetase pair to utilize the non-standard amino acid O-methyltyrosine.^[8] This was followed by the larger naphthylalanine^[23] and the photocrosslinking benzoylphenylalanine,^[24] which proved the potential utility of the system.

The amber codon is the least used codon in Escherichia coli, but hijacking it results in a substantial loss of fitness. One study, in fact, found that there were at least 83 peptides majorly affected by the readthrough.^[25] Additionally, the labelling was incomplete. As a consequence, several strains have been made to reduce the fitness cost, including by removing all amber codons from the genome. In most E. coli K-12 strains (see Escherichia coli (molecular biology) for strain pedigrees) there are 314 UAG stop codons. Consequently, a great deal of work has gone into replacing them. One approach, pioneered by the group of Prof. George Church at Harvard, was dubbed MAGE in CAGE: it relied on multiplex transformation and subsequent strain recombination to remove all UAG codons. The latter part presented a halting point in a first paper,^[26] but was overcome. This resulted in the E. coli strain C321.ΔA, which lacks all UAG codons and RF1.^[27] This strain was then used in an experiment to make it "addicted" to the amino acid biphenylalanine by evolving several key enzymes to require it structurally, thereby putting its expanded genetic code under positive selection.^[28]

Rare sense codon reassignment

In addition to the amber codon, rare sense codons have also been considered for use. The AGG codon codes for arginine, but a strain has been successfully modified to make it code for 6-N-allyloxycarbonyl-lysine.^[29] Another candidate is the AUA codon, which is unusual in that its respective tRNA has to discriminate against AUG, which codes for methionine (primordially, isoleucine, hence its location). In order to do this, the AUA tRNA has a special base, lysidine. Deleting the lysidine synthetase gene (tilS) was possible thanks to the replacement of the native tRNA with that of Mycoplasma mobile (no lysidine). The reduced fitness is a first step towards pressuring the strain to lose all instances of AUA, allowing it to be used for genetic code expansion.^[30]

E. coli strain Syn61 is a variant where all uses of TCG (Ser), TCA (Ser), TAG (STOP) codons are eliminated using a synthetic genome (see below). By removing the unneeded tRNA genes and RF1, strain Syn61Δ3 was produced. The three freed codons then become available for adding three special residues, as demonstrated in strain "Syn61Δ3(ev4)".^[31]
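The idea of eliminating codons by synonymous replacement can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used to build Syn61: the text only states that TCG, TCA and TAG are eliminated, so the replacement codons below (AGC, AGT, TAA) are assumptions chosen because they encode the same meaning in the standard code, and the example sequence is made up. Real genome recoding must also handle overlapping genes and regulatory elements.

```python
# Illustrative synonymous replacements (DNA alphabet). The targets on the
# right-hand side are assumptions made for this example.
RECODING = {"TCG": "AGC", "TCA": "AGT", "TAG": "TAA"}


def recode_orf(orf, scheme=RECODING):
    """Replace targeted codons in an in-frame open reading frame."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return "".join(scheme.get(codon, codon) for codon in codons)


orf = "ATGTCGGCATCATAG"  # made-up ORF: ATG TCG GCA TCA TAG
recoded = recode_orf(orf)
print(recoded)  # ATGAGCGCAAGTTAA
```

Only in-frame codons are replaced, so the encoded protein is unchanged while the three targeted codons no longer occur in the reading frame.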

Four base (quadruplet) codons

While triplet codons are the basis of the genetic code in nature, programmed +1 frameshifting is a natural process that allows the use of a four-nucleotide sequence (quadruplet codon) to encode an amino acid.^[32] Recent developments in genetic code engineering have also shown that quadruplet codons can be used to encode non-standard amino acids under experimental conditions.^[33] ^[34] ^[35] This allowed the simultaneous usage of two unnatural amino acids, p-azidophenylalanine (pAzF) and N6-[(2-propynyloxy)carbonyl]lysine (CAK), which cross-link with each other by Huisgen cycloaddition.^[36] Quadruplet decoding in wild-type, non-recoded strains is very inefficient. This stems from the fact that the interaction of engineered tRNAs with ternary complexes or other translation components is not as favorable and strong as that of the cell's endogenous translation elements.^[37] This problem can be overcome by specifically engineering and evolving tRNAs that can decode quadruplet codons in non-recoded strains.^[38] Up to 4 different quadruplet orthogonal tRNA/tRNA synthetase pairs can be generated in this manner.^[39] The quadruplet codon decoding approach has also been applied to the construction of an HIV-1 vaccine.^[40]

tRNA/synthetase pair

Another key element is the tRNA/synthetase pair.

The orthogonal set of synthetase and tRNA can be mutated and screened through directed evolution to charge the tRNA with a different, even novel, amino acid. Mutations to the plasmid containing the pair can be introduced by error-prone PCR or through degenerate primers for the synthetase's active site. Selection involves multiple rounds of a two-step process. In the first step, the plasmid is transferred into cells expressing chloramphenicol acetyl transferase with a premature amber codon. In the presence of toxic chloramphenicol and the non-natural amino acid, the surviving cells will have overridden the amber codon using the orthogonal tRNA aminoacylated with either a standard amino acid or the non-natural one. To remove the former, the plasmid is then inserted into cells carrying a (toxic) barnase gene with a premature amber codon, grown without the non-natural amino acid; this removes all the orthogonal synthetases that do not specifically recognize the non-natural amino acid. In addition to the recoding of the tRNA to a different codon, tRNAs can be mutated to recognize a four-base codon, allowing additional free coding options.^[41] The non-natural amino acid, as a result, introduces diverse physicochemical and biological properties that can be used as a tool to explore protein structure and function or to create novel or enhanced proteins for practical purposes.
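The logic of the two-step selection can be expressed as a small Python sketch. It is a deliberately idealized model: each synthetase variant is reduced to two yes/no properties, the variant names are hypothetical, and survival is treated as all-or-nothing, whereas real selections are quantitative and are repeated over several rounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthetaseVariant:
    name: str
    charges_nsaa: bool      # aminoacylates the orthogonal tRNA with the NSAA
    charges_standard: bool  # also accepts a standard amino acid


def positive_selection(library):
    """Chloramphenicol plus NSAA: survive only if the amber codon is suppressed."""
    return [v for v in library if v.charges_nsaa or v.charges_standard]


def negative_selection(library):
    """Barnase without NSAA: suppression by a standard amino acid is lethal."""
    return [v for v in library if not v.charges_standard]


# Hypothetical library covering the four possible behaviours.
library = [
    SynthetaseVariant("inactive", False, False),
    SynthetaseVariant("promiscuous", True, True),
    SynthetaseVariant("standard-only", False, True),
    SynthetaseVariant("specific", True, False),
]

survivors = negative_selection(positive_selection(library))
print([v.name for v in survivors])  # ['specific']
```

The positive step removes inactive variants, and the negative step removes any variant that can suppress the amber codon without the non-natural amino acid, leaving only the specific one.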

Orthogonal sets in model organisms

The orthogonal pairs of synthetase and tRNA that work for one organism may not work for another, as the synthetase may mis-aminoacylate endogenous tRNAs or the tRNA be mis-aminoacylated itself by an endogenous synthetase. As a result, the sets created to date differ between organisms.

Pair	Source	Notes and references
tRNA^Tyr-TyrRS	Methanococcus jannaschii
tRNA^Lys–LysRS	Pyrococcus horikoshii	^[42]
tRNA^Glu–GluRS	Pyrococcus horikoshii	^[43]
tRNA^Leu–LeuRS	tRNA: mutant Halobacterium sp. RS: Methanobacterium thermoautotrophicum	^[44]
tRNA^Amber-PylRS	Methanosarcina barkeri and Methanosarcina mazei	^[45]
tRNA^Amber-3-iodotyrosyl-RS		^[46]
tRNA^Tyr/Amber-TyrRS	Escherichia coli	Reported in 2003,^[47] mentioned in 2014 LeuRS^[48]
tRNA^iMet-GlnRS	tRNA: human RS: Escherichia coli	Switched to Amber codon.^[49]
tRNA^ifMet-TyrRS	tRNA: Escherichia coli RS: S. cerevisiae	Switched to Amber codon.
tRNA^Leu/Amber-LeuRS	Escherichia coli	Reported in 2004 and mutated for 2-Aminooctanoic acid, o-methyl tyrosine, and o-nitrobenzyl cysteine. Evolved in yeast for 4,5-dimethoxy-2-nitrobenzyl serine,^[50] ^[51] tested in mice and mammalian cells with photosensitive 4,5-dimethoxy-2-nitrobenzyl-cysteine.^[52] ^[53]
tRNA^Tyr-TyrRS	Bacillus stearothermophilus	^[54]
tRNA^Trp-TrpRS	Bacillus subtilis, RS modified	New AA is 5-OH Trp.^[55]

In 2017, a mouse engineered with an extended genetic code that can produce proteins with unnatural amino acids was reported.^[56]

Orthogonal ribosomes

Similarly to orthogonal tRNAs and aminoacyl tRNA synthetases (aaRSs), orthogonal ribosomes have been engineered to work in parallel to the natural ribosomes. Orthogonal ribosomes ideally use different mRNA transcripts than their natural counterparts and ultimately should draw on a separate pool of tRNA as well. This should alleviate some of the loss of fitness which currently still arises from techniques such as Amber codon suppression. Additionally, orthogonal ribosomes can be mutated and optimized for particular tasks, like the recognition of quadruplet codons. Such an optimization is not possible, or highly disadvantageous for natural ribosomes.

o-Ribosome

In 2005, three sets of ribosomes were published, which did not recognize natural mRNA, but instead translated a separate pool of orthogonal mRNA (o-mRNA).^[57] This was achieved by changing the recognition sequence of the mRNA, the Shine-Dalgarno sequence, and the corresponding recognition sequence in the 16S rRNA of ribosomes, the so-called Anti-Shine-Dalgarno-Sequence. This way the base pairing, which is usually lost if either sequence is mutated, stays available. However the mutations in the 16S rRNA were not limited to the obviously base-pairing nucleotides of the classical Anti-Shine-Dalgarno sequence.

Ribo-X

In 2007, the group of Jason W. Chin presented an orthogonal ribosome, which was optimized for amber codon suppression.^[58] The 16S rRNA was mutated in such a way that it bound the release factor RF1 less strongly than the natural ribosome does. This ribosome did not eliminate the problem of lowered cell fitness caused by suppressed stop codons in natural proteins. However, through the improved specificity it raised the yields of correctly synthesized target protein significantly (from ~20% to >60% for one amber codon to be suppressed and from <1% to >20% for two amber codons).

Ribo-Q

In 2010, the group of Jason W. Chin presented a further optimized version of the orthogonal ribosome. Ribo-Q is a 16S rRNA optimized to recognize tRNAs that have quadruplet anticodons, which read quadruplet codons instead of the natural triplet codons.^[36] With this approach the number of possible codons rises from 64 to 256. Even accounting for a variety of stop codons, more than 200 different amino acids could potentially be encoded this way.
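The codon counts quoted here follow directly from the four-letter RNA alphabet, as this short Python check shows. The helper name `all_codons` is introduced only for this illustration.

```python
from itertools import product

BASES = "ACGU"


def all_codons(length):
    """Enumerate every possible codon of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


triplets = all_codons(3)
quadruplets = all_codons(4)

print(len(triplets))     # 64  (4 ** 3)
print(len(quadruplets))  # 256 (4 ** 4)
```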

Ribosome stapling

The orthogonal ribosomes described above all focus on optimizing the 16S rRNA. Thus far, this optimized 16S rRNA was combined with natural large subunits to form orthogonal ribosomes. If the 23S rRNA, the main RNA component of the large ribosomal subunit, is to be optimized as well, it has to be assured that there is no crosstalk in the assembly of orthogonal and natural ribosomes (see figure, panel B). To ensure that optimized 23S rRNA would only form ribosomes with the optimized 16S rRNA, the two rRNAs were combined into one transcript.^[59] By inserting the sequence for the 23S rRNA into a loop region of the 16S rRNA sequence, both subunits still adopt functioning folds. Since the two rRNAs are linked and thus in constant proximity, they preferentially bind each other rather than other free-floating ribosomal subunits.

Engineered peptidyl transferase center

In 2014, it was shown that by altering the peptidyl transferase center of the 23S rRNA, ribosomes could be created which draw on orthogonal pools of tRNA.^[60] The 3' end of tRNAs is universally conserved to be CCA. The two cytidines base pair with two guanines in the 23S rRNA to bind the tRNA to the ribosome. This interaction is required for translational fidelity. However, by co-mutating the binding nucleotides in such a way that they can still base pair, translational fidelity can be conserved. The 3'-end of the tRNA is mutated from CCA to CGA, while the guanine nucleotides that pair with this position in the ribosome's A- and P-sites are mutated to cytosine. This leads to ribosomes which do not accept naturally occurring tRNAs as substrates and to tRNAs which cannot be used as substrates by natural ribosomes.
To use such tRNAs effectively, they would have to be aminoacylated by specific, orthogonal aaRSs. Most naturally occurring aaRSs recognize the 3'-end of their corresponding tRNA.^[61] ^[62] aaRSs for these 3'-mutated tRNAs are not available yet. Thus far, this system has only been shown to work in an in vitro translation setting where the aminoacylation of the orthogonal tRNA was achieved using so-called "flexizymes". Flexizymes are ribozymes with tRNA-aminoacylation activity.^[63]

Applications

With an expanded genetic code, the unnatural amino acid can be genetically directed to any chosen site in the protein of interest. The high efficiency and fidelity of this process allow better control of the placement of the modification compared to modifying the protein post-translationally, which, in general, will target all amino acids of the same type, such as the thiol group of cysteine and the amino group of lysine.^[64] Also, an expanded genetic code allows modifications to be carried out in vivo. The ability to site-specifically direct lab-synthesized chemical moieties into proteins allows many types of studies that would otherwise be extremely difficult, such as:

Probing protein structure and function: By using amino acids with slightly different size such as O-methyltyrosine or dansylalanine instead of tyrosine, and by inserting genetically coded reporter moieties (color-changing and/or spin-active) into selected protein sites, chemical information about the protein's structure and function can be measured.
Probing the role of post-translational modifications in protein structure and function: By using amino acids that mimic post-translational modifications such as phosphoserine, biologically active protein can be obtained, and the site-specific nature of the amino acid incorporation can lead to information on how the position, density, and distribution of protein phosphorylation affect protein function.^[65] ^[66] ^[67] ^[68]
Identifying and regulating protein activity: By using photocaged amino acids, protein function can be "switched" on or off by illuminating the organism.
Changing the mode of action of a protein: One can start with the gene for a protein that binds a certain sequence of DNA and, by inserting a chemically active amino acid into the binding site, convert it to a protein that cuts the DNA rather than binding it.
Improving immunogenicity and overcoming self-tolerance: By replacing strategically chosen tyrosines with p-nitro phenylalanine, a tolerated self-protein can be made immunogenic.^[69]
Selective destruction of selected cellular components: Using an expanded genetic code, unnatural, destructive chemical moieties (sometimes called "chemical warheads") can be incorporated into proteins that target specific cellular components.^[70]
Producing better protein: The evolution of T7 bacteriophages on a non-evolving E. coli strain that encoded 3-iodotyrosine on the amber codon resulted in a population fitter than wild-type thanks to the presence of iodotyrosine in its proteome.^[71]
Probing protein localization and protein-protein interaction in bacteria.^[72]

Future

The expansion of the genetic code is still in its infancy. Current methodology uses only one non-standard amino acid at a time, whereas ideally multiple could be used. In fact, the group of Jason Chin has recently broken the record for a genetically recoded E. coli strain that can simultaneously incorporate up to 4 unnatural amino acids.^[73] Moreover, software has been developed that allows orthogonal ribosomes and unnatural tRNA/RS pairs to be combined in order to improve protein yield and fidelity.^[73]

Recoded synthetic genome

One way to achieve the encoding of multiple unnatural amino acids is by synthesising a rewritten genome.^[74] In 2010, an organism, Mycoplasma laboratorium, was constructed at a cost of $40 million; it was controlled by a synthetic, but not recoded, genome.^[75] The first genetically recoded organism was created by a collaboration between George Church's and Farren Isaacs' labs, when the wild type was recoded in such a way that all 321 known UAG stop codons were substituted with synonymous UAA codons and release factor 1 was knocked out in order to eliminate its interaction with the exogenous stop codon and improve unnatural protein synthesis.^[27] In 2019, Escherichia coli Syn61 was created, with a 4 megabase recoded genome consisting of only 61 codons instead of the natural 64. In addition to the elimination of the usage of rare codons, the specificity of the system needs to be increased, as many tRNAs recognise several codons.

Expanded genetic alphabet

Another approach is to expand the number of nucleobases to increase the coding capacity.

An unnatural base pair (UBP) is a designed subunit (or nucleobase) of DNA which is created in a laboratory and does not occur in nature. A demonstration of UBPs was achieved in vitro by Ichiro Hirao's group at the RIKEN institute in Japan. In 2002, they developed an unnatural base pair between 2-amino-8-(2-thienyl)purine (s) and pyridine-2-one (y) that functions in vitro in transcription and translation for the site-specific incorporation of non-standard amino acids into proteins.^[76] In 2006, they created 7-(2-thienyl)imidazo[4,5-b]pyridine (Ds) and pyrrole-2-carbaldehyde (Pa) as a third base pair for replication and transcription.^[77] Afterward, Ds and 4-[3-(6-aminohexanamido)-1-propynyl]-2-nitropyrrole (Px) were discovered to be a high-fidelity pair in PCR amplification.^[78] ^[79] In 2013, they applied the Ds-Px pair to DNA aptamer generation by in vitro selection (SELEX) and demonstrated that genetic alphabet expansion significantly augments DNA aptamer affinities to target proteins.^[80]

In 2012, a group of American scientists led by Floyd Romesberg, a chemical biologist at the Scripps Research Institute in San Diego, California, reported that they had designed an unnatural base pair (UBP).^[81] The two new artificial nucleotides were named "d5SICS" and "dNaM". More technically, these artificial nucleotides, which bear hydrophobic nucleobases, feature two fused aromatic rings that form a (d5SICS–dNaM) complex or base pair in DNA.^[82] In 2014, the same team reported that they had synthesized a stretch of circular DNA known as a plasmid, containing natural T-A and C-G base pairs along with the best-performing UBP Romesberg's laboratory had designed, and inserted it into cells of the common bacterium E. coli, which successfully replicated the unnatural base pairs through multiple generations.^[83] This is the first known example of a living organism passing along an expanded genetic code to subsequent generations.^[84] This was in part achieved by the addition of a supportive algal gene that expresses a nucleotide triphosphate transporter which efficiently imports the triphosphates of both d5SICS and dNaM (d5SICSTP and dNaMTP) into E. coli bacteria. Then, the natural bacterial replication pathways use them to accurately replicate the plasmid containing d5SICS–dNaM.

The successful incorporation of a third base pair into a living micro-organism is a significant breakthrough toward the goal of greatly expanding the number of amino acids which can be encoded by DNA, thereby expanding the potential for living organisms to produce novel proteins. The artificial strings of DNA do not encode for anything yet, but scientists speculate they could be designed to manufacture new proteins which could have industrial or pharmaceutical uses.^[85]

In May 2014, researchers announced that they had successfully introduced two new artificial nucleotides into bacterial DNA and, by including individual artificial nucleotides in the culture media, were able to induce amplification of the plasmids containing the artificial nucleotides by a factor of 2 × 10⁷ (24 doublings); they did not create mRNA or proteins able to use the artificial nucleotides.^[86] ^[87] ^[88] ^[89]
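The relationship between the quoted amplification factor and the number of doublings can be checked with a short calculation. This is only a back-of-the-envelope sketch that assumes each doubling exactly duplicates every plasmid.

```python
import math

doublings = 24
fold = 2 ** doublings
print(fold)  # 16777216, i.e. roughly 2 x 10^7 when rounded to one significant figure

# Conversely, the number of doublings implied by a 2 x 10^7-fold increase.
implied_doublings = math.log2(2e7)
print(round(implied_doublings))  # 24
```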

Related methods

Selective pressure incorporation (SPI) method for production of alloproteins

There have been many studies that have produced proteins with non-standard amino acids without altering the genetic code. These proteins, called alloproteins, are made by incubating cells with an unnatural amino acid in the absence of a similar coded amino acid, so that the former is incorporated into protein in place of the latter, for example L-2-aminohexanoic acid (Ahx) for methionine (Met).^[90]

These studies rely on the natural promiscuous activity of the aminoacyl tRNA synthetase to add to its target tRNA an unnatural amino acid (i.e. analog) similar to the natural substrate, for example methionyl-tRNA synthetase's mistaking isoleucine for methionine.^[91] In protein crystallography, for example, the addition of selenomethionine to the media of a culture of a methionine-auxotrophic strain results in proteins containing selenomethionine as opposed to methionine (see Multi-wavelength anomalous dispersion for the reason).^[92] Another example is that photoleucine and photomethionine are added instead of leucine and methionine to cross-label protein.^[93] Similarly, some tellurium-tolerant fungi can incorporate tellurocysteine and telluromethionine into their protein instead of cysteine and methionine.^[94] The objective of expanding the genetic code is more radical, as it does not replace an amino acid but adds one or more to the code. On the other hand, proteome-wide replacements are most efficiently performed by global amino acid substitutions. For example, global proteome-wide substitutions of natural amino acids with fluorinated analogs have been attempted in E. coli^[95] and B. subtilis.^[96] A complete tryptophan substitution with thienopyrrole-alanine in response to 20899 UGG codons in E. coli was reported in 2015 by Budisa and Söll.^[97] Moreover, many biological phenomena, such as protein folding and stability, are based on synergistic effects at many positions in the protein sequence.^[98]

In this context, the SPI method generates recombinant protein variants or alloproteins directly by substitution of natural amino acids with unnatural counterparts.^[99] An amino acid auxotrophic expression host is supplemented with an amino acid analog during target protein expression.^[100] This approach avoids the pitfalls of suppression-based methods^[101] and is superior to them in terms of efficiency, reproducibility and simplicity of the experimental setup.^[102] Numerous studies demonstrated how global substitution of canonical amino acids with various isosteric analogs caused minimal structural perturbations but dramatic changes in thermodynamic,^[103] folding,^[104] aggregation,^[105] and spectral properties^[106] ^[107] and in enzymatic activity.^[108]

In vitro synthesis

See main article: mRNA display. The genetic code expansion described above is in vivo. An alternative is to change the coding in in vitro translation experiments. This requires the depletion of all tRNAs and the selective reintroduction of certain aminoacylated tRNAs, some chemically aminoacylated.^[109]

Chemical synthesis

See main article: Peptide synthesis. Several techniques exist for producing peptides chemically; the most common is solid-phase synthesis using protecting-group chemistry. This means that any (protected) amino acid can be added to the nascent sequence.

In November 2017, a team from the Scripps Research Institute reported having constructed a semi-synthetic E. coli bacterium whose genome uses six different nucleotides (versus the four found in nature). The two extra 'letters' form a third, unnatural base pair, dNaM–dTPT3. The resulting organisms were able to thrive and synthesize proteins using "unnatural amino acids".^[110] ^[111] This unnatural base pair had been demonstrated previously,^[112] ^[113] but this was the first report of transcription and translation of proteins using an unnatural base pair.
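
The combinatorial effect of two extra letters can be worked out directly. The sketch below only counts the triplets that are possible in principle; it says nothing about how many of them were actually used or are decodable in the reported organism. The letters `X` and `Y` are placeholders chosen here for the two unnatural nucleotides, and `codons` is a hypothetical helper.

```python
from itertools import product

NATURAL = "ACGU"
EXPANDED = NATURAL + "XY"  # X and Y are placeholder letters for the unnatural pair


def codons(alphabet, length=3):
    """Enumerate every possible codon of the given length over an alphabet."""
    return ["".join(letters) for letters in product(alphabet, repeat=length)]


natural = codons(NATURAL)
expanded = codons(EXPANDED)
# Codons that contain at least one unnatural letter.
new_codons = [c for c in expanded if set(c) & set("XY")]

print(len(natural))     # 64  (4 ** 3)
print(len(expanded))    # 216 (6 ** 3)
print(len(new_codons))  # 152 (216 - 64)
```

These figures are a theoretical upper bound on the triplet codon space of a six-letter alphabet.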

Notes and References

Xie J, Schultz PG . Adding amino acids to the genetic repertoire . Current Opinion in Chemical Biology . 9 . 6 . 548–54 . December 2005 . 16260173 . 10.1016/j.cbpa.2005.10.011 .
News: Zimmer C . Scientists Created Bacteria With a Synthetic Genome. Is This Artificial Life? – In a milestone for synthetic biology, colonies of E. coli thrive with DNA constructed from scratch by humans, not nature. . 15 May 2019 . 16 May 2019 .
Fredens J, Wang K, de la Torre D, Funke LF, Robertson WE, Christova Y, Chia T, Schmied WH, Dunkelmann DL, Beránek V, Uttamapinant C, Llamazares AG, Elliott TS, Chin JW . Total synthesis of Escherichia coli with a recoded genome . Nature . 569 . 7757 . 514–518 . May 2019 . 31092918 . 7039709 . 10.1038/s41586-019-1192-5 . 2019Natur.569..514F .
Kubyshkin V, Acevedo-Rocha CG, Budisa N . On universal coding events in protein biogenesis . Bio Systems . 164 . 16–25 . February 2018 . 29030023 . 10.1016/j.biosystems.2017.10.004 . free . 2018BiSys.164...16K .
Kubyshkin V, Budisa N . Synthetic alienation of microbial organisms by using genetic code engineering: Why and how? . Biotechnology Journal . 12 . 8 . 1600097 . August 2017 . 28671771 . 10.1002/biot.201600097 .
Book: Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P . Molecular Biology of the Cell. 2008. Garland Science. New York. 978-0-8153-4105-5. 5th.
Woese CR, Olsen GJ, Ibba M, Söll D . Aminoacyl-tRNA synthetases, the genetic code, and the evolutionary process . Microbiology and Molecular Biology Reviews . 64 . 1 . 202–36 . March 2000 . 10704480 . 98992 . 10.1128/mmbr.64.1.202-236.2000 .
Wang L, Brock A, Herberich B, Schultz PG . Expanding the genetic code of Escherichia coli . Science . 292 . 5516 . 498–500 . April 2001 . 11313494 . 10.1126/science.1060077 . 2001Sci...292..498W . 6702011 .
Liu CC, Schultz PG . Adding new chemistries to the genetic code . Annual Review of Biochemistry . 79 . 413–44 . 2010 . 20307192 . 10.1146/annurev.biochem.052308.105824 .
Summerer D, Chen S, Wu N, Deiters A, Chin JW, Schultz PG . A genetically encoded fluorescent amino acid . Proceedings of the National Academy of Sciences of the United States of America . 103 . 26 . 9785–9 . June 2006 . 16785423 . 1502531 . 10.1073/pnas.0603965103 . 2006PNAS..103.9785S . free .
Steinfeld JB, Aerni HR, Rogulina S, Liu Y, Rinehart J . Expanded cellular amino acid pools containing phosphoserine, phosphothreonine, and phosphotyrosine . ACS Chemical Biology . 9 . 5 . 1104–12 . May 2014 . 24646179 . 4027946 . 10.1021/cb5000532 .
Furter R . Expansion of the genetic code: site-directed p-fluoro-phenylalanine incorporation in Escherichia coli . Protein Science . 7 . 2 . 419–26 . February 1998 . 9521119 . 2143905 . 10.1002/pro.5560070223 .
Wang L, Xie J, Schultz PG . Expanding the genetic code . Annual Review of Biophysics and Biomolecular Structure . 35 . 225–49 . 2006 . 16689635 . 10.1146/annurev.biophys.35.101105.121507 .
Young TS, Schultz PG . Beyond the canonical 20 amino acids: expanding the genetic lexicon . The Journal of Biological Chemistry . 285 . 15 . 11039–44 . April 2010 . 20147747 . 2856976 . 10.1074/jbc.R109.091306 . free .
Web site: The Peter G. Schultz Laboratory . Schultz.scripps.edu . 2015-05-05 . 2018-07-12 . https://web.archive.org/web/20180712054804/http://schultz.scripps.edu/research.php . dead .
Cardillo G, Gentilucci L, Tolomelli A . Unusual amino acids: synthesis and introduction into naturally occurring peptides and biologically active analogues . Mini Reviews in Medicinal Chemistry . 6 . 3 . 293–304 . March 2006 . 16515468 . 10.2174/138955706776073394 .
Mehl RA, Anderson JC, Santoro SW, Wang L, Martin AB, King DS, Horn DM, Schultz PG . Generation of a bacterium with a 21 amino acid genetic code . Journal of the American Chemical Society . 125 . 4 . 935–9 . January 2003 . 12537491 . 10.1021/ja0284153.
Web site: context :: 21-amino-acid bacteria: expanding the genetic code . Straddle3.net . 2015-05-05.
Koonin EV, Novozhilov AS . Origin and evolution of the genetic code: the universal enigma . IUBMB Life . 61 . 2 . 99–111 . February 2009 . 19117371 . 3293468 . 10.1002/iub.146 . 0807.4749 .
Book: Maloy SR, Stewart VJ, Taylor RK . Genetic analysis of pathogenic bacteria : a laboratory manual. 1996. Cold Spring Harbor Laboratory. New York. 978-0-87969-453-1.
Normanly J, Kleina LG, Masson JM, Abelson J, Miller JH . Construction of Escherichia coli amber suppressor tRNA genes. III. Determination of tRNA specificity . Journal of Molecular Biology . 213 . 4 . 719–26 . June 1990 . 2141650 . 10.1016/S0022-2836(05)80258-X .
Wang L, Magliery TJ, Liu DR, Schultz PG . A new functional suppressor tRNA/aminoacyl-tRNA synthetase pair for the in vivo incorporation of unnatural amino acids into proteins . Journal of the American Chemical Society . 122 . 20 . 5010–5011 . 2000 . 10.1021/ja000595y . 2010-12-02 . 2011-09-27 . https://web.archive.org/web/20110927042908/http://www.chemistry.ohio-state.edu/~magliery/pdfs/WangSchultz2000JACS.pdf . dead .
Wang L, Brock A, Schultz PG . Adding L-3-(2-Naphthyl)alanine to the genetic code of E. coli . Journal of the American Chemical Society . 124 . 9 . 1836–7 . March 2002 . 11866580 . 10.1021/ja012307j .
Chin JW, Martin AB, King DS, Wang L, Schultz PG . Addition of a photocrosslinking amino acid to the genetic code of Escherichia coli . Proceedings of the National Academy of Sciences of the United States of America . 99 . 17 . 11020–4 . August 2002 . 12154230 . 123203 . 10.1073/pnas.172226299 . 2002PNAS...9911020C . free .
Aerni HR, Shifman MA, Rogulina S, O'Donoghue P, Rinehart J . Revealing the amino acid composition of proteins within an expanded genetic code . Nucleic Acids Research . 43 . 2 . e8 . January 2015 . 25378305 . 4333366 . 10.1093/nar/gku1087 .
Isaacs FJ, Carr PA, Wang HH, Lajoie MJ, Sterling B, Kraal L, Tolonen AC, Gianoulis TA, Goodman DB, Reppas NB, Emig CJ, Bang D, Hwang SJ, Jewett MC, Jacobson JM, Church GM . Precise manipulation of chromosomes in vivo enables genome-wide codon replacement . Science . 333 . 6040 . 348–53 . July 2011 . 21764749 . 5472332 . 10.1126/science.1205822 . 2011Sci...333..348I .
Lajoie MJ, Rovner AJ, Goodman DB, Aerni HR, Haimovich AD, Kuznetsov G, Mercer JA, Wang HH, Carr PA, Mosberg JA, Rohland N, Schultz PG, Jacobson JM, Rinehart J, Church GM, Isaacs FJ . Genomically recoded organisms expand biological functions . Science . 342 . 6156 . 357–60 . October 2013 . 24136966 . 4924538 . 10.1126/science.1241459 . 2013Sci...342..357L .
Mandell DJ, Lajoie MJ, Mee MT, Takeuchi R, Kuznetsov G, Norville JE, Gregg CJ, Stoddard BL, Church GM . Biocontainment of genetically modified organisms by synthetic protein design . Nature . 518 . 7537 . 55–60 . February 2015 . 25607366 . 4422498 . 10.1038/nature14121 . 2015Natur.518...55M .
Zeng Y, Wang W, Liu WR . Towards reassigning the rare AGG codon in Escherichia coli . ChemBioChem . 15 . 12 . 1750–4 . August 2014 . 25044341 . 4167342 . 10.1002/cbic.201400075 .
Bohlke N, Budisa N . Sense codon emancipation for proteome-wide incorporation of noncanonical amino acids: rare isoleucine codon AUA as a target for genetic code expansion . FEMS Microbiology Letters . 351 . 2 . 133–44 . February 2014 . 24433543 . 4237120 . 10.1111/1574-6968.12371 .
Robertson WE, Funke LF, de la Torre D, Fredens J, Elliott TS, Spinck M, Christova Y, Cervettini D, Böge FL, Liu KC, Buse S, Maslen S, Salmond GP, Chin JW . Sense codon reassignment enables viral resistance and encoded polymer synthesis . Science . 372 . 6546 . 1057–1062 . June 2021 . 34083482 . 7611380 . 10.1126/science.abg3029 . 2021Sci...372.1057R .
Atkins JF, Bjoerk GR . A gripping tale of ribosomal frameshifting: extragenic suppressors of frameshift mutations spotlight P-site realignment . Microbiology and Molecular Biology Reviews . 73 . 178–210 . 2009 .
Anderson JC, Wu N, Santoro SW, Lakshman V, King DS, Schultz PG . An expanded genetic code with a functional quadruplet codon . Proceedings of the National Academy of Sciences of the United States of America . 101 . 7566–7571 . 2004 .
Neumann H, Wang K, Davis L, Garcia-Alai M, Chin JW . Encoding multiple unnatural amino acids via evolution of a quadruplet-decoding ribosome . Nature . 464 . 441–444 . 2010 .
Niu W, Schultz PG, Guo J . An expanded genetic code in mammalian cells with a functional quadruplet codon . ACS Chemical Biology . 8 . 1640–1645 . 2013 .
Neumann H, Wang K, Davis L, Garcia-Alai M, Chin JW . Encoding multiple unnatural amino acids via evolution of a quadruplet-decoding ribosome . Nature . 464 . 7287 . 441–4 . March 2010 . 20154731 . 10.1038/nature08817 . 4390989 . 2010Natur.464..441N .
Hong S, Sunita S, Maehigashi T, Hoffer ED, Dunkle JA, Dunham CM . Mechanism of tRNA-mediated +1 ribosomal frameshifting . Proceedings of the National Academy of Sciences of the United States of America . 115 . 44 . 11226–11231 . October 2018 . 30262649 . 6217423 . 10.1073/pnas.1809319115 . free . 2018PNAS..11511226H .
Niu W, Schultz PG, Guo J . An expanded genetic code in mammalian cells with a functional quadruplet codon . ACS Chemical Biology . 8 . 1640–1645 . 2013 .
DeBenedictis EA, Carver GD, Chung CZ, Söll D, Badran AH . Multiplex suppression of four quadruplet codons via tRNA directed evolution . Nature Communications . 12 . 1 . 5706 . September 2021 . 34588441 . 10.1038/s41467-021-25948-y. 8481270 . 2021NatCo..12.5706D .
Chen Y, Wan Y, Wang N, Yuan Z, Niu W, Li Q, Guo J . Controlling the Replication of a Genomically Recoded HIV-1 with a Functional Quadruplet Codon in Mammalian Cells . ACS Synthetic Biology . 7 . 1612–1617 . 2018 .
Watanabe T, Muranaka N, Hohsaka T . Four-base codon-mediated saturation mutagenesis in a cell-free translation system . Journal of Bioscience and Bioengineering . 105 . 3 . 211–5 . March 2008 . 18397770 . 10.1263/jbb.105.211 .
Anderson JC, Wu N, Santoro SW, Lakshman V, King DS, Schultz PG . An expanded genetic code with a functional quadruplet codon . Proceedings of the National Academy of Sciences of the United States of America . 101 . 20 . 7566–71 . May 2004 . 15138302 . 419646 . 10.1073/pnas.0401517101 . 2004PNAS..101.7566A . free .
Santoro SW, Anderson JC, Lakshman V, Schultz PG . An archaebacteria-derived glutamyl-tRNA synthetase and tRNA pair for unnatural amino acid mutagenesis of proteins in Escherichia coli . Nucleic Acids Research . 31 . 23 . 6700–9 . December 2003 . 14627803 . 290271 . 10.1093/nar/gkg903 .
Anderson JC, Schultz PG . Adaptation of an orthogonal archaeal leucyl-tRNA and synthetase pair for four-base, amber, and opal suppression . Biochemistry . 42 . 32 . 9598–608 . August 2003 . 12911301 . 10.1021/bi034550w .
Hancock SM, Uprety R, Deiters A, Chin JW . Expanding the genetic code of yeast for incorporation of diverse unnatural amino acids via a pyrrolysyl-tRNA synthetase/tRNA pair . Journal of the American Chemical Society . 132 . 42 . 14819–24 . October 2010 . 20925334 . 2956376 . 10.1021/ja104609m .
Minaba M, Kato Y . High-yield, zero-leakage expression system with a translational switch using site-specific unnatural amino Acid incorporation . Applied and Environmental Microbiology . 80 . 5 . 1718–25 . March 2014 . 24375139 . 3957627 . 10.1128/AEM.03417-13 . 2014ApEnM..80.1718M .
Chin JW, Cropp TA, Anderson JC, Mukherji M, Zhang Z, Schultz PG . An expanded eukaryotic genetic code . Science . 301 . 5635 . 964–7 . August 2003 . 12920298 . 10.1126/science.1084772 . 2003Sci...301..964C . 2376187 .
Wu N, Deiters A, Cropp TA, King D, Schultz PG . A genetically encoded photocaged amino acid . Journal of the American Chemical Society . 126 . 44 . 14306–7 . November 2004 . 15521721 . 10.1021/ja040175z .
Kowal AK, Kohrer C, RajBhandary UL . Twenty-first aminoacyl-tRNA synthetase-suppressor tRNA pairs for possible use in site-specific incorporation of amino acid analogues into proteins in eukaryotes and in eubacteria . Proceedings of the National Academy of Sciences of the United States of America . 98 . 5 . 2268–73 . February 2001 . 11226228 . 30127 . 10.1073/pnas.031488298 . 2001PNAS...98.2268K . free .
Lemke EA, Summerer D, Geierstanger BH, Brittain SM, Schultz PG . Control of protein phosphorylation with a genetically encoded photocaged amino acid . Nature Chemical Biology . 3 . 12 . 769–72 . December 2007 . 17965709 . 10.1038/nchembio.2007.44 .
Palei S, Buchmuller B, Wolffgramm J, Muñoz-Lopez Á, Jung S, Czodrowski P, Summerer D . Light-Activatable TET-Dioxygenases Reveal Dynamics of 5-Methylcytosine Oxidation and Transcriptome Reorganization . Journal of the American Chemical Society . 142 . 16 . 7289–7294 . April 2020 . 32286069 . 10.1021/jacs.0c01193 . 215757172 .
Kang JY, Kawaguchi D, Coin I, Xiang Z, O'Leary DD, Slesinger PA, Wang L . In vivo expression of a light-activatable potassium channel using unnatural amino acids . Neuron . 80 . 2 . 358–70 . October 2013 . 24139041 . 3815458 . 10.1016/j.neuron.2013.08.016 .
Wolffgramm J, Buchmuller B, Palei S, Muñoz-López Á, Kanne J, Janning P, Schweiger MR, Summerer D . Light-Activation of DNA-Methyltransferases . Angewandte Chemie . 60 . 24 . 13507–13512 . June 2021 . 33826797 . 8251764 . 10.1002/anie.202103945 . free .
Sakamoto K, Hayashi A, Sakamoto A, Kiga D, Nakayama H, Soma A, Kobayashi T, Kitabatake M, Takio K, Saito K, Shirouzu M, Hirao I, Yokoyama S . Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells . Nucleic Acids Research . 30 . 21 . 4692–9 . November 2002 . 12409460 . 135798 . 10.1093/nar/gkf589 .
Zhang Z, Alfonta L, Tian F, Bursulaya B, Uryu S, King DS, Schultz PG . Selective incorporation of 5-hydroxytryptophan into proteins in mammalian cells . Proceedings of the National Academy of Sciences of the United States of America . 101 . 24 . 8882–7 . June 2004 . 15187228 . 428441 . 10.1073/pnas.0307029101 . 2004PNAS..101.8882Z . free .
Han S, Yang A, Lee S, Lee HW, Park CB, Park HS . Expanding the genetic code of Mus musculus . Nature Communications . 8 . 14568 . February 2017 . 28220771 . 5321798 . 10.1038/ncomms14568 . 2017NatCo...814568H .
Rackham O, Chin JW . A network of orthogonal ribosome x mRNA pairs . Nature Chemical Biology . 1 . 3 . 159–66 . August 2005 . 16408021 . 10.1038/nchembio719 . 37181098 .
Wang K, Neumann H, Peak-Chew SY, Chin JW . Evolved orthogonal ribosomes enhance the efficiency of synthetic genetic code expansion . Nature Biotechnology . 25 . 7 . 770–7 . July 2007 . 17592474 . 10.1038/nbt1314 . 19683574 .
Fried SD, Schmied WH, Uttamapinant C, Chin JW . Ribosome Subunit Stapling for Orthogonal Translation in E. coli . Angewandte Chemie . 54 . 43 . 12791–4 . October 2015 . 26465656 . 4678508 . 10.1002/anie.201506311 .
Terasaka N, Hayashi G, Katoh T, Suga H . An orthogonal ribosome-tRNA pair via engineering of the peptidyl transferase center . Nature Chemical Biology . 10 . 7 . 555–7 . July 2014 . 24907900 . 10.1038/nchembio.1549 .
Cavarelli J, Moras D . Recognition of tRNAs by aminoacyl-tRNA synthetases . FASEB Journal . 7 . 1 . 79–86 . January 1993 . 8422978 . 10.1096/fasebj.7.1.8422978 . free . 46222849 .
Schimmel PR, Söll D . Aminoacyl-tRNA synthetases: general features and recognition of transfer RNAs . Annual Review of Biochemistry . 48 . 601–48 . 1979 . 382994 . 10.1146/annurev.bi.48.070179.003125 .
Ohuchi M, Murakami H, Suga H . The flexizyme system: a highly flexible tRNA aminoacylation tool for the translation apparatus . Current Opinion in Chemical Biology . 11 . 5 . 537–42 . October 2007 . 17884697 . 10.1016/j.cbpa.2007.08.011 .
Wang Q, Parrish AR, Wang L . Expanding the genetic code for biological studies . Chemistry & Biology . 16 . 3 . 323–36 . March 2009 . 19318213 . 2696486 . 10.1016/j.chembiol.2009.03.001 .
Park HS, Hohn MJ, Umehara T, Guo LT, Osborne EM, Benner J, Noren CJ, Rinehart J, Söll D . Expanding the genetic code of Escherichia coli with phosphoserine . Science . 333 . 6046 . 1151–4 . August 2011 . 21868676 . 5547737 . 10.1126/science.1207203 . 2011Sci...333.1151P .
Oza JP, Aerni HR, Pirman NL, Barber KW, Ter Haar CM, Rogulina S, Amrofell MB, Isaacs FJ, Rinehart J, Jewett MC . Robust production of recombinant phosphoproteins using cell-free protein synthesis . Nature Communications . 6 . 8168 . September 2015 . 26350765 . 4566161 . 10.1038/ncomms9168 . 2015NatCo...6.8168O .
Pirman NL, Barber KW, Aerni HR, Ma NJ, Haimovich AD, Rogulina S, Isaacs FJ, Rinehart J . A flexible codon in genomically recoded Escherichia coli permits programmable protein phosphorylation . Nature Communications . 6 . 8130 . September 2015 . 26350500 . 4566969 . 10.1038/ncomms9130 . 2015NatCo...6.8130P .
Rogerson DT, Sachdeva A, Wang K, Haq T, Kazlauskaite A, Hancock SM, Huguenin-Dezot N, Muqit MM, Fry AM, Bayliss R, Chin JW . Efficient genetic encoding of phosphoserine and its nonhydrolyzable analog . Nature Chemical Biology . 11 . 7 . 496–503 . July 2015 . 26030730 . 4830402 . 10.1038/nchembio.1823 .
Gauba V, Grünewald J, Gorney V, Deaton LM, Kang M, Bursulaya B, Ou W, Lerner RA, Schmedt C, Geierstanger BH, Schultz PG, Ramirez-Montagut T . Loss of CD4 T-cell-dependent tolerance to proteins with modified amino acids . Proceedings of the National Academy of Sciences of the United States of America . 108 . 31 . 12821–6 . August 2011 . 21768354 . 3150954 . 10.1073/pnas.1110042108 . 2011PNAS..10812821G . free .
Liu CC, Mack AV, Brustad EM, Mills JH, Groff D, Smider VV, Schultz PG . Evolution of proteins with genetically encoded "chemical warheads" . Journal of the American Chemical Society . 131 . 28 . 9616–7 . July 2009 . 19555063 . 2745334 . 10.1021/ja902985e .
Hammerling MJ, Ellefson JW, Boutz DR, Marcotte EM, Ellington AD, Barrick JE . Bacteriophages use an expanded genetic code on evolutionary paths to higher fitness . Nature Chemical Biology . 10 . 3 . 178–80 . March 2014 . 24487692 . 3932624 . 10.1038/nchembio.1450 .
Kipper K, Lundius EG, Ćurić V, Nikić I, Wiessler M, Lemke EA, Elf J . Application of Noncanonical Amino Acids for Protein Labeling in a Genomically Recoded Escherichia coli . ACS Synthetic Biology . 6 . 2 . 233–255 . February 2017 . 27775882 . 10.1021/acssynbio.6b00138 .
Dunkelmann DL, Oehm SB, Beattie AT, Chin JW . A 68-codon genetic code to incorporate four distinct non-canonical amino acids enabled by automated orthogonal mRNA design . Nature Chemistry . August 2021 . 13 . 11 . 1110–1117 . 34426682 . 10.1038/s41557-021-00764-5 . 237271721 . 7612796 . 2021NatCh..13.1110D .
Krishnakumar R, Ling J . Experimental challenges of sense codon reassignment: an innovative approach to genetic code expansion . FEBS Letters . 588 . 3 . 383–8 . January 2014 . 24333334 . 10.1016/j.febslet.2013.11.039 . 10152595 . free . 2014FEBSL.588..383K .
Gibson DG, Glass JI, Lartigue C, Noskov VN, Chuang RY, Algire MA, Benders GA, Montague MG, Ma L, Moodie MM, Merryman C, Vashee S, Krishnakumar R, Assad-Garcia N, Andrews-Pfannkoch C, Denisova EA, Young L, Qi ZQ, Segall-Shapiro TH, Calvey CH, Parmar PP, Hutchison CA, Smith HO, Venter JC . Creation of a bacterial cell controlled by a chemically synthesized genome . Science . 329 . 5987 . 52–6 . July 2010 . 20488990 . 10.1126/science.1190719 . free . 2010Sci...329...52G .
Hirao I, Ohtsuki T, Fujiwara T, Mitsui T, Yokogawa T, Okuni T, Nakayama H, Takio K, Yabuki T, Kigawa T, Kodama K, Yokogawa T, Nishikawa K, Yokoyama S . An unnatural base pair for incorporating amino acid analogs into proteins . Nature Biotechnology . 20 . 2 . 177–82 . February 2002 . 11821864 . 10.1038/nbt0202-177 . 22055476 .
Hirao I, Kimoto M, Mitsui T, Fujiwara T, Kawai R, Sato A, Harada Y, Yokoyama S . An unnatural hydrophobic base pair system: site-specific incorporation of nucleotide analogs into DNA and RNA . Nature Methods . 3 . 9 . 729–35 . September 2006 . 16929319 . 10.1038/nmeth915 . 6494156 .
Kimoto M, Kawai R, Mitsui T, Yokoyama S, Hirao I . An unnatural base pair system for efficient PCR amplification and functionalization of DNA molecules . Nucleic Acids Research . 37 . 2 . e14 . February 2009 . 19073696 . 2632903 . 10.1093/nar/gkn956 .
Yamashige R, Kimoto M, Takezawa Y, Sato A, Mitsui T, Yokoyama S, Hirao I . Highly specific unnatural base pair systems as a third base pair for PCR amplification . Nucleic Acids Research . 40 . 6 . 2793–806 . March 2012 . 22121213 . 3315302 . 10.1093/nar/gkr1068 .
Kimoto M, Yamashige R, Matsunaga K, Yokoyama S, Hirao I . Generation of high-affinity DNA aptamers using an expanded genetic alphabet . Nature Biotechnology . 31 . 5 . 453–7 . May 2013 . 23563318 . 10.1038/nbt.2556 . 23329867 .
Malyshev DA, Dhami K, Quach HT, Lavergne T, Ordoukhanian P, Torkamani A, Romesberg FE . Efficient and sequence-independent replication of DNA containing a third base pair establishes a functional six-letter genetic alphabet . Proceedings of the National Academy of Sciences of the United States of America . 109 . 30 . 12005–10 . July 2012 . 22773812 . 3409741 . 10.1073/pnas.1205176109 . 2012PNAS..10912005M . free .
News: Scientists Create First Living Organism With 'Artificial' DNA. Callaway E . May 7, 2014. Nature News. Huffington Post. 8 May 2014.
News: Life engineered with expanded genetic code . Fikes BJ . May 8, 2014 . San Diego Union Tribune . 8 May 2014 . dead . https://web.archive.org/web/20140509001048/http://www.utsandiego.com/news/2014/may/08/tp-life-engineered-with-expanded-genetic-code/ . 9 May 2014 .
News: First life forms to pass on artificial DNA engineered by US scientists. Sample I . May 7, 2014. The Guardian. 8 May 2014.
News: Scientists Add Letters to DNA's Alphabet, Raising Hope and Fear . Pollack A . May 7, 2014. The New York Times. 8 May 2014.
Malyshev DA, Dhami K, Lavergne T, Chen T, Dai N, Foster JM, Corrêa IR, Romesberg FE . A semi-synthetic organism with an expanded genetic alphabet . Nature . 509 . 7500 . 385–8 . May 2014 . 24805238 . 4058825 . 10.1038/nature13314 . 2014Natur.509..385M .
News: Pollack A . Researchers Report Breakthrough in Creating Artificial Genetic Code . May 7, 2014 . . May 7, 2014 .
Callaway E . First life with 'alien' DNA . May 7, 2014 . . 10.1038/nature.2014.15179 . 86967999 . May 7, 2014 .
News: Amos J . Semi-synthetic bug extends 'life's alphabet' . BBC News . 8 May 2014 . 2014-05-09 .
Koide H, Yokoyama S, Kawai G, Ha JM, Oka T, Kawai S, Miyake T, Fuwa T, Miyazawa T . Biosynthesis of a protein containing a nonprotein amino acid by Escherichia coli: L-2-aminohexanoic acid at position 21 in human epidermal growth factor . Proceedings of the National Academy of Sciences of the United States of America . 85 . 17 . 6237–41 . September 1988 . 3045813 . 281944 . 10.1073/pnas.85.17.6237 . 1988PNAS...85.6237K . free .
Ferla MP, Patrick WM . Bacterial methionine biosynthesis . Microbiology . 160 . Pt 8 . 1571–1584 . August 2014 . 24939187 . 10.1099/mic.0.077826-0 . free .
Book: 10.1007/978-1-59745-209-0_5. 17272838. Production of Selenomethionyl Proteins in Prokaryotic and Eukaryotic Expression Systems. Macromolecular Crystallography Protocols. 363. 91–108. Methods in Molecular Biology. 2007. Doublié S . 978-1-58829-292-6. registration. https://archive.org/details/macromolecularcr00sylv.
Suchanek M, Radzikowska A, Thiele C . Photo-leucine and photo-methionine allow identification of protein-protein interactions in living cells . Nature Methods . 2 . 4 . 261–7 . April 2005 . 15782218 . 10.1038/NMETH752 . free .
Ramadan SE, Razak AA, Ragab AM, el-Meleigy M . Incorporation of tellurium into amino acids and proteins in a tellurium-tolerant fungi . Biological Trace Element Research . 20 . 3 . 225–32 . June 1989 . 2484755 . 10.1007/BF02917437 . 9439946 .
Bacher JM, Ellington AD . Selection and characterization of Escherichia coli variants capable of growth on an otherwise toxic tryptophan analogue . Journal of Bacteriology . 183 . 18 . 5414–25 . September 2001 . 11514527 . 95426 . 10.1128/jb.183.18.5414-5425.2001 .
Wong JT . Membership mutation of the genetic code: loss of fitness by tryptophan . Proceedings of the National Academy of Sciences of the United States of America . 80 . 20 . 6303–6 . October 1983 . 6413975 . 394285 . 10.1073/pnas.80.20.6303 . 1983PNAS...80.6303W . free .
Hoesl MG, Oehm S, Durkin P, Darmon E, Peil L, Aerni HR, Rappsilber J, Rinehart J, Leach D, Söll D, Budisa N . Chemical Evolution of a Bacterial Proteome . Angewandte Chemie . 54 . 34 . 10030–4 . August 2015 . 26136259 . 4782924 . 10.1002/anie.201502868 . NIHMSID: NIHMS711205 .
Moroder L, Budisa N . Synthetic biology of protein folding . ChemPhysChem . 11 . 6 . 1181–7 . April 2010 . 20391526 . 10.1002/cphc.201000035 .
Budisa N . Prolegomena to future experimental efforts on genetic code engineering by expanding its amino acid repertoire . Angewandte Chemie . 43 . 47 . 6426–63 . December 2004 . 15578784 . 10.1002/anie.200300646 .
Link AJ, Mock ML, Tirrell DA . Non-canonical amino acids in protein engineering . Current Opinion in Biotechnology . 14 . 6 . 603–9 . December 2003 . 14662389 . 10.1016/j.copbio.2003.10.011 .
Nehring S, Budisa N, Wiltschi B . Performance analysis of orthogonal pairs designed for an expanded eukaryotic genetic code . PLOS ONE . 7 . 4 . e31992 . 2012 . 22493661 . 3320878 . 10.1371/journal.pone.0031992 . 2012PLoSO...731992N . free .
Agostini F, Völler JS, Koksch B, Acevedo-Rocha CG, Kubyshkin V, Budisa N . Biocatalysis with Unnatural Amino Acids: Enzymology Meets Xenobiology . Angewandte Chemie . 56 . 33 . 9680–9703 . August 2017 . 28085996 . 10.1002/anie.201610129 .
Rubini M, Lepthien S, Golbik R, Budisa N . Aminotryptophan-containing barstar: structure--function tradeoff in protein design and engineering with an expanded genetic code . Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics . 1764 . 7 . 1147–58 . July 2006 . 16782415 . 10.1016/j.bbapap.2006.04.012 .
Steiner T, Hess P, Bae JH, Wiltschi B, Moroder L, Budisa N . Synthetic biology of proteins: tuning GFPs folding and stability with fluoroproline . PLOS ONE . 3 . 2 . e1680 . February 2008 . 18301757 . 2243022 . 10.1371/journal.pone.0001680 . 2008PLoSO...3.1680S . free .
Wolschner C, Giese A, Kretzschmar HA, Huber R, Moroder L, Budisa N . Design of anti- and pro-aggregation variants to assess the effects of methionine oxidation in human prion protein . Proceedings of the National Academy of Sciences of the United States of America . 106 . 19 . 7756–61 . May 2009 . 19416900 . 2674404 . 10.1073/pnas.0902688106 . 2009PNAS..106.7756W . free .
Lepthien S, Hoesl MG, Merkel L, Budisa N . Azatryptophans endow proteins with intrinsic blue fluorescence . Proceedings of the National Academy of Sciences of the United States of America . 105 . 42 . 16095–100 . October 2008 . 18854410 . 2571030 . 10.1073/pnas.0802804105 . 2008PNAS..10516095L . free .
Bae JH, Rubini M, Jung G, Wiegand G, Seifert MH, Azim MK, Kim JS, Zumbusch A, Holak TA, Moroder L, Huber R, Budisa N . Expansion of the genetic code enables design of a novel "gold" class of green fluorescent proteins . Journal of Molecular Biology . 328 . 5 . 1071–81 . May 2003 . 12729742 . 10.1016/s0022-2836(03)00364-4 . 2017-08-11 . 2017-08-11 . https://web.archive.org/web/20170811061442/https://www.deepdyve.com/lp/elsevier/expansion-of-the-genetic-code-enables-design-of-a-novel-gold-class-of-L0Oc1mrVMi . dead .
Hoesl MG, Acevedo-Rocha CG, Nehring S, Royter M, Wolschner C, Wiltschi B, Budisa N, Antranikian G . 10.1002/cctc.201000253 . 2011 . Lipase Congeners Designed by Genetic Code Engineering . ChemCatChem . 3 . 1 . 213–221 . 86352672 . 1867-3880 .
Hong SH, Kwon YC, Jewett MC . Non-standard amino acid incorporation into proteins using Escherichia coli cell-free protein synthesis . Frontiers in Chemistry . 2 . 34 . 2014 . 24959531 . 4050362 . 10.3389/fchem.2014.00034 . 2014FrCh....2...34H . free .
News: 'Unnatural' microbe can make proteins . BBC News . https://www.bbc.com/news/science-environment-42167569 .
Zhang Y, Ptacin JL, Fischer EC, Aerni HR, Caffaro CE, San Jose K, Feldman AW, Turner CR, Romesberg FE . A semi-synthetic organism that stores and retrieves increased genetic information . Nature . 551 . 7682 . 644–647 . November 2017 . 29189780 . 5796663 . 10.1038/nature24659 . 2017Natur.551..644Z .
Web site: On stranger nucleotides . Howgego J . Chemistry World . February 2014.
Li L, Degardin M, Lavergne T, Malyshev DA, Dhami K, Ordoukhanian P, Romesberg FE . Natural-like replication of an unnatural base pair for the expansion of the genetic alphabet and biotechnology applications . Journal of the American Chemical Society . 136 . 3 . 826–9 . January 2014 . 24152106 . 3979842 . 10.1021/ja408814g .