Bioconjugation is a chemical strategy to form a stable covalent link between two molecules, at least one of which is a biomolecule.
Recent advances in the understanding of biomolecules have enabled their application in numerous fields such as medicine, diagnostics, biocatalysis, and materials. Synthetically modified biomolecules can have diverse functionalities, such as tracking cellular events, revealing enzyme function, determining protein biodistribution, imaging specific biomarkers, and delivering drugs to targeted cells.[1] [2] [3] [4] Bioconjugation is a crucial strategy that links these modified biomolecules with different substrates. Besides applications in biomedical research, bioconjugation has recently also gained importance in nanotechnology, for example in bioconjugated quantum dots.
The most common types of bioconjugation include coupling of a small molecule (such as biotin or a fluorescent dye) to a protein. Antibody-drug conjugates such as Brentuximab vedotin and Gemtuzumab ozogamicin are examples falling into this category.[5]
Protein-protein conjugations, such as the coupling of an antibody to an enzyme or the linkage of protein complexes, are also facilitated via bioconjugation.[6] [7]
Other less common molecules used in bioconjugation are oligosaccharides, nucleic acids, synthetic polymers such as polyethylene glycol,[8] and carbon nanotubes.[9]
Synthesis of bioconjugates involves a variety of challenges, ranging from the simple and nonspecific use of a fluorescent dye marker to the complex design of antibody-drug conjugates. Various bioconjugation reactions have been developed to chemically modify proteins. Common types of bioconjugation reactions on proteins are coupling of lysine, cysteine, and tyrosine amino acid residues, as well as modification of tryptophan residues and of the N- and C-termini.
However, these reactions often lack chemoselectivity and efficiency, because they depend on native amino acids, which are present in large quantities that hinder selectivity. There is an increasing need for chemical strategies that can effectively attach synthetic molecules site-specifically to proteins. One strategy is to first install a unique functional group onto a protein and then use a bioorthogonal reaction to couple a biomolecule with this unique functional group. Bioorthogonal reactions targeting non-native functional groups are widely used in bioconjugation chemistry. Some important reactions are modification of ketones and aldehydes, Staudinger ligation with organic azides, copper-catalyzed Huisgen cycloaddition of azides, and strain-promoted Huisgen cycloaddition of azides.[10] [11] [12] [13]
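The abundance problem noted above can be illustrated by counting the natively targeted residues in a protein sequence. The following is a minimal sketch only: the sequence is hypothetical and invented for illustration, and the function name is an assumption, not part of any library.

```python
from collections import Counter

# Hypothetical sequence, invented purely for illustration.
SEQUENCE = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVK"

# One-letter codes of the residues targeted by the native-residue
# chemistries named in the text.
TARGETS = {"K": "lysine", "C": "cysteine", "Y": "tyrosine", "W": "tryptophan"}


def count_target_residues(sequence: str) -> dict[str, int]:
    """Count how often each commonly targeted residue occurs in a sequence."""
    counts = Counter(sequence.upper())
    return {name: counts.get(code, 0) for code, name in TARGETS.items()}


if __name__ == "__main__":
    for residue, n in count_target_residues(SEQUENCE).items():
        print(f"{residue}: {n}")
```

A residue that appears many times offers many competing sites for a residue-selective reagent, whereas a residue that appears once (or a uniquely installed non-native group) offers a single site. The count ignores whether a residue is surface-exposed, which also affects reactivity.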
The nucleophilic lysine residue is a commonly targeted site in protein bioconjugation, typically through amine-reactive N-hydroxysuccinimidyl (NHS) esters. Only the deprotonated amine is nucleophilic, and the pKa of the lysine ammonium group is around 10.5. The reaction is typically run at pH 8 to 9, somewhat below this pKa: this deprotonates enough lysine side chains to react while limiting competing hydrolysis of the NHS ester, which accelerates at higher pH. The common reagent for the coupling reaction is an NHS ester (shown in the first reaction below in Figure 1), which reacts with nucleophilic lysine through a lysine acylation mechanism. Other similar reagents are isocyanates and isothiocyanates, which react through a similar mechanism (shown in the second and third reactions in Figure 1 below). Benzoyl fluorides (shown in the last reaction below in Figure 1), which allow for lysine modification of proteins under mild conditions (low temperature, physiological pH), were recently proposed as an alternative to classically used lysine-specific reagents.[14]
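The pH dependence of lysine reactivity can be estimated with the Henderson-Hasselbalch relation. The sketch below assumes a single pKa of 10.5 for every lysine side chain, as quoted in the text; in a real protein the pKa of each residue depends on its local environment, so the numbers are indicative only. The function name is an illustrative choice.

```python
def fraction_deprotonated(ph: float, pka: float = 10.5) -> float:
    """Fraction of an amine present as the neutral (nucleophilic) free base.

    Follows from the Henderson-Hasselbalch equation:
    [base] / ([base] + [acid]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    for ph in (7.0, 8.0, 9.0, 10.5):
        print(f"pH {ph:4.1f}: {fraction_deprotonated(ph):.2%} deprotonated")
```

Under this assumption roughly 0.3% of lysine side chains are deprotonated at pH 8 and about 3% at pH 9, so each unit increase in pH raises the reactive fraction about tenfold in this range.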
Because free cysteine rarely occurs on protein surfaces, it is an excellent choice for chemoselective modification.[15] Under basic conditions, cysteine residues are deprotonated to generate a thiolate nucleophile, which reacts with soft electrophiles such as maleimides and iodoacetamides (shown in the first two reactions in Figure 2 below). As a result, a carbon-sulfur bond is formed. Another modification of cysteine residues involves the formation of a disulfide bond (shown in the third reaction in Figure 2). The reduced cysteine residues react with exogenous disulfides, generating a new disulfide bond on the protein. An excess of disulfides is often used to drive the reaction, such as 2-thiopyridone and 3-carboxy-4-nitrothiophenol. Electron-deficient alkynes were demonstrated to react selectively with cysteine residues of proteins in the presence of other nucleophilic amino acid residues. Depending on the alkyne substitution, these reactions can produce either cleavable (when alkynone derivatives are used)[16] or hydrolytically stable bioconjugates (when 3-arylpropiolonitriles are used; the last reaction below in Figure 2).[17]
Tyrosine residues are relatively unreactive; therefore, they have not been popular targets for bioconjugation. Recent developments have shown that tyrosine can be modified through electrophilic aromatic substitution (EAS) reactions, which are selective for the aromatic carbon adjacent to the phenolic hydroxyl group. This becomes particularly useful when cysteine residues cannot be targeted. Specifically, diazonium salts couple effectively with tyrosine residues (diazonium salt shown as reagent in the first reaction in Figure 3 below), and an electron-withdrawing substituent in the 4-position of the diazonium salt can increase the efficiency of the reaction. Cyclic diazodicarboxyamide derivatives such as 4-phenyl-1,2,4-triazole-3,5-dione (PTAD) were reported for selective bioconjugation on tyrosine residues (the second reaction in Figure 3 below).[18] A three-component Mannich-type reaction with aldehydes and anilines (the last reaction in Figure 3) was also described to be relatively tyrosine-selective under mild, optimised reaction conditions.[19]
Since natural amino acid residues are usually present in large quantities, it is often difficult to modify a single site. Strategies targeting the termini of proteins have been developed because they greatly enhance the site selectivity of protein modification. One type of N-terminal modification involves the functionalization of the terminal amino acid. The oxidation of N-terminal serine and threonine residues generates an N-terminal aldehyde, which can undergo further bioorthogonal reactions (shown in the first reaction in Figure 4). Another type of modification involves the condensation of an N-terminal cysteine with an aldehyde, generating a thiazolidine that is stable at high pH (second reaction in Figure 4). Using pyridoxal phosphate (PLP), several N-terminal amino acids, such as glycine and aspartic acid, can undergo transamination to yield an N-terminal aldehyde (third reaction in Figure 4).
An example of C-terminal modification is native chemical ligation (NCL), which is the coupling between a C-terminal thioester and an N-terminal cysteine (Figure 5).
A ketone or aldehyde can be attached to a protein through the oxidation of N-terminal serine residues or transamination with PLP. Additionally, they can be introduced by incorporating unnatural amino acids via the Tirrell method or Schultz method. They then condense selectively with an alkoxyamine or a hydrazine, producing oxime and hydrazone derivatives (shown in the first and second reactions, respectively, in Figure 6). This reaction is highly chemoselective in terms of protein bioconjugation, but the reaction rate is slow. Mechanistic studies show that the rate-determining step is the dehydration of the tetrahedral intermediate, so a mildly acidic solution is often employed to accelerate the dehydration step.
The introduction of a nucleophilic catalyst can significantly enhance the reaction rate (shown in Figure 7). For example, with aniline as a nucleophilic catalyst, a sparsely populated protonated carbonyl is replaced by a highly populated protonated Schiff base.[20] In other words, the catalyst generates a high concentration of reactive electrophile. The oxime ligation can then occur readily, and rate increases of up to 400-fold have been reported under mildly acidic conditions. The key feature of this catalyst is that it generates a reactive electrophile without competing with the desired product.
Recent developments that exploit proximal functional groups have enabled hydrazone condensations[21] to operate at 20 M⁻¹ s⁻¹ at neutral pH, while oxime condensations have been discovered that proceed at 500 to 10000 M⁻¹ s⁻¹ at neutral pH without added catalysts.[22] [23]
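To give these second-order rate constants a practical meaning, they can be converted into a reaction half-life. The sketch below assumes pseudo-first-order conditions (one partner in large excess, so its concentration stays effectively constant) and an illustrative reagent concentration of 100 µM; both the concentration and the function name are assumptions, not values from the cited work.

```python
import math


def pseudo_first_order_half_life(k2: float, excess_conc: float) -> float:
    """Half-life in seconds of the limiting partner.

    k2          -- second-order rate constant in M^-1 s^-1
    excess_conc -- concentration (M) of the partner held in large excess

    Under pseudo-first-order conditions k_obs = k2 * [excess] and
    t_1/2 = ln(2) / k_obs.
    """
    if k2 <= 0 or excess_conc <= 0:
        raise ValueError("rate constant and concentration must be positive")
    return math.log(2) / (k2 * excess_conc)


if __name__ == "__main__":
    reagent = 100e-6  # assumed 100 µM excess reagent, for illustration only
    for k2 in (20.0, 500.0, 10000.0):
        t_half = pseudo_first_order_half_life(k2, reagent)
        print(f"k2 = {k2:>7.0f} M^-1 s^-1 -> t1/2 = {t_half:.1f} s")
```

Under these assumptions a rate constant of 20 M⁻¹ s⁻¹ corresponds to a half-life of several minutes, whereas 10000 M⁻¹ s⁻¹ corresponds to less than a second, which is why fast ligations matter at the low concentrations typical of biological samples.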
The Staudinger ligation of azides and phosphines has been used extensively in the field of chemical biology. Because it can form a stable amide bond in living cells and animals, it has been applied to the modification of cell membranes, in vivo imaging, and other bioconjugation studies.[24] [25] [26] [27]
In contrast to the classic Staudinger reaction, the Staudinger ligation is a second-order reaction in which the rate-limiting step is the formation of the phosphazide (specific reaction mechanism shown in Figure 9). The triphenylphosphine first reacts with the azide to yield an azaylide through a four-membered ring transition state, and then an intramolecular reaction leads to the iminophosphorane intermediate, which gives the amide linkage upon hydrolysis.[28]
See main article: Huisgen cycloaddition.
Azides have become popular targets for chemoselective protein modification because they are small and have a favorable thermodynamic reaction potential. One such reaction is the [3+2] cycloaddition with an alkyne, but it requires high temperature and often gives mixtures of regioisomers.
An improved reaction developed by chemist Karl Barry Sharpless uses a copper(I) catalyst to couple azides with terminal alkynes, giving only 1,4-disubstituted 1,2,3-triazoles in high yields (shown below in Figure 11). Mechanistic studies suggest a stepwise reaction: Cu(I) first couples with the acetylene, and the resulting species then reacts with the azide to generate a six-membered intermediate. The process is so robust that it occurs at pH values ranging from 4 to 12, and copper(II) sulfate is often used as the copper source in the presence of a reducing agent, which generates the active Cu(I) in situ.
Even though the Staudinger ligation is suitable for bioconjugation in living cells without major toxicity, the phosphine's sensitivity to air oxidation and its poor solubility in water significantly hinder its efficiency. The copper(I)-catalyzed azide-alkyne coupling has a reasonable reaction rate and efficiency under physiological conditions, but copper poses significant toxicity and sometimes interferes with protein functions in living cells. In 2004, chemist Carolyn R. Bertozzi's lab developed a metal-free [3+2] cycloaddition using a strained cyclooctyne and an azide. Cyclooctyne, which is the smallest stable cycloalkyne, can couple with an azide through [3+2] cycloaddition, leading to two regioisomeric triazoles (Figure 12). The reaction occurs readily at room temperature and therefore can be used to effectively modify living cells without negative effects. It has also been reported that the installation of fluorine substituents on a cyclic alkyne can greatly accelerate the reaction.[29]
Transition metal-based bioconjugation has been challenging because of the nature of biological conditions (aqueous solution, room temperature, mild pH, and low substrate concentrations), which are generally demanding for organometallic reactions. Recently, however, besides the copper-catalyzed [3+2] azide-alkyne cycloaddition, increasingly diverse transition metal-mediated chemical transformations have been applied to bioconjugation, including olefin metathesis, alkylation, C–H arylation, and C–C, C–S, and C–N cross-coupling reactions.[30] [31]
Using a RhII-carbenoid generated in situ by activation of vinyl-substituted diazo compounds with Rh2(OAc)4, tryptophans and cysteines were shown to be selectively alkylated in aqueous media.
However, this method is limited to surface tryptophans and cysteines possibly because of steric constraints.[34]
Imines formed from the condensation of aldehydes with lysines or the N-terminus can be reduced efficiently by a water-stable [Cp*Ir(bipy)(H2O)]SO4 complex in the presence of formate ions (serving as the hydride source). The reaction proceeds readily under physiologically relevant conditions and results in high conversion for various aromatic aldehydes.
By using a pre-formed electrophilic π-allylpalladium(II) reagent derived from allylic acetate or carbamate precursors, selective allylic alkylation of tyrosines can be achieved in aqueous solution at room temperature and in the presence of cysteines.
Cysteine-containing peptides have been shown to undergo 1,2-addition to allenes in the presence of gold(I) and/or silver(I) salts, producing hydroxyl-substituted vinyl thioethers. The reaction with peptides proceeds in high yields and is selective for cysteines over other nucleophilic residues.
However, the reactivity towards proteins is much decreased, potentially due to the coordination of gold to the protein backbone.
Multiple methods have been reported to achieve tryptophan C–H arylation, where diverse electrophiles such as aryl halides[38] [39] and aryl boronic acids[40] (an example shown below) have been used to transfer the aryl groups.
However, current tryptophan C–H arylation reaction conditions remain relatively harsh, requiring organic solvents, low pH and/or high temperatures.
Free thiols have been considered unfavorable for Pd-mediated reactions because they cause Pd-catalyst decomposition.[41] However, PdII oxidative addition complexes (OACs) supported by dialkylbiaryl phosphine ligands have been shown to work efficiently for cysteine S-arylation.
The first example is the use of a PdII OAC with RuPhos:[42] the PdII complex resulting from the oxidative addition of aryl halides or trifluoromethanesulfonates, with RuPhos as the ligand, could chemoselectively modify cysteines in various buffers with 5% organic co-solvent at neutral pH. This method has been shown to modify peptides and proteins, achieve peptide macrocyclization (by using a bis-palladium reagent and peptides with two unprotected cysteines),[43] and synthesize antibody-drug conjugates (ADCs). Changing the ligand to sSPhos renders the PdII complex sufficiently water-soluble to achieve cysteine S-arylation under cosolvent-free aqueous conditions.[44]
There are other applications of this method where the PdII complexes were generated as PdII-peptide OACs by introducing 4-halophenylalanine into peptides during SPPS to achieve peptide-peptide or peptide-protein ligation.[45]
As an alternative to direct oxidative addition on the peptide, the Pd OACs can also be transferred to the protein through an amine-selective acylation reaction via an NHS ester. The latter has been applied to selectively label surface lysine residues of a protein (forming PdII-protein OACs) and oligonucleotides (forming PdII-oligonucleotide OACs), which could then be linked to cysteine-containing peptides or proteins.[46]
Another example of protein-protein cross-coupling is achieved by converting cysteine residues into an electrophilic S-aryl–Pd–X OAC using an intramolecular oxidative addition strategy.[47]
Similar to cysteine, lysine N-arylation can be achieved through Pd OACs with different dialkylbiaryl phosphine ligands. Because lysine is less nucleophilic and undergoes reductive elimination more slowly than cysteine, the selection of the supporting ligand has been shown to be critical. The bulky BrettPhos and t-BuBrettPhos ligands, in conjunction with mildly basic sodium phenoxide, have been used to functionalize lysines on peptide substrates. The reaction proceeds under mild conditions and is selective over most other nucleophilic amino acid residues.
Pd-mediated Sonogashira, Heck, and Suzuki-Miyaura cross-coupling reactions have been applied widely to modify peptides and proteins, and diverse Pd reagents have been developed for use in aqueous solutions.[49] These reactions require the protein or peptide substrate to bear unnatural functional groups such as alkynes,[50] [51] [52] aryl halides,[53] [54] [55] [56] and aryl boronic acids,[57] which can be installed through genetic code expansion or post-translational modifications.
Bioconjugation of TGF-β to iron oxide nanoparticles and its activation through magnetic hyperthermia in vitro has been reported.[58] This was done by using 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide combined with N-hydroxysuccinimide to form amide bonds with the free primary amines on the growth factor. Carbon nanotubes have also been used successfully with bioconjugation to link TGF-β, followed by activation with near-infrared light.[59] Typically, these reactions have involved the use of a crosslinker, but some crosslinkers add molecular space between the compound of interest and the base material, which in turn causes higher degrees of non-specific binding and unwanted reactivity.[60]