Protein phosphorylation is a reversible post-translational modification in which a protein kinase covalently attaches a phosphate group to an amino acid residue. Phosphorylation alters the structural conformation of a protein, activating it, deactivating it, or otherwise modifying its function.[1] Approximately 13,000 human proteins have sites that are phosphorylated.
The reverse reaction of phosphorylation is called dephosphorylation, and is catalyzed by protein phosphatases. Protein kinases and phosphatases work independently and in a balance to regulate the function of proteins.[2]
The amino acids most commonly phosphorylated are serine, threonine, tyrosine, and histidine.[3] [4] These phosphorylations play important and well-characterized roles in signaling pathways and metabolism. However, other amino acids can also be phosphorylated post-translationally, including arginine, lysine, aspartic acid, glutamic acid and cysteine. These phosphorylated amino acids have been identified in human cell extracts and fixed human cells using a combination of antibody-based analysis (for pHis) and mass spectrometry (for all other amino acids).[5] [6] [7]
Protein phosphorylation was first reported in 1906 by Phoebus Levene at the Rockefeller Institute for Medical Research with the discovery of phosphorylated vitellin.[8] However, it was nearly 50 years until the enzymatic phosphorylation of proteins by protein kinases was discovered.[9]
In 1906, Phoebus Levene at the Rockefeller Institute for Medical Research identified phosphate in the protein vitellin (phosvitin),[8] and by 1933 he had detected phosphoserine in casein with Fritz Lipmann.[10] However, it took another 20 years before Eugene P. Kennedy described the first "enzymatic phosphorylation of proteins".[9] The first phosphorylase enzyme was discovered by Carl and Gerty Cori in the late 1930s. The Coris found two forms of glycogen phosphorylase, which they named A and B, but did not correctly understand the mechanism by which the B form is converted to the A form. The interconversion of phosphorylase b to phosphorylase a was later described by Edmond Fischer and Edwin Krebs, as well as by Wosilait and Sutherland, as involving a phosphorylation/dephosphorylation mechanism.[11] It was found that an enzyme named phosphorylase kinase and Mg-ATP were required to phosphorylate glycogen phosphorylase, by assisting in the transfer of the γ-phosphoryl group of ATP to a serine residue on phosphorylase b. Protein phosphatase 1 is able to catalyze the dephosphorylation of phosphorylated enzymes by removing the phosphate group. Earl Sutherland explained in 1950 that the activity of phosphorylase was increased, and thus glycogenolysis stimulated, when liver slices were incubated with adrenalin and glucagon. Phosphorylation was considered a specific control mechanism for one metabolic pathway until the 1970s, when Lester Reed discovered that the mitochondrial pyruvate dehydrogenase complex was inactivated by phosphorylation. Also in the 1970s, the term multisite phosphorylation was coined in response to the discovery of proteins that are phosphorylated on two or more residues by two or more kinases. In 1975, it was shown that cAMP-dependent protein kinases phosphorylate serine residues on specific amino acid sequence motifs.
In the 1970s, Ray Erikson discovered that v-Src was a kinase and Tony Hunter found that v-Src phosphorylated tyrosine residues on proteins.[12] In the early 1980s, the amino-acid sequence of the first protein kinase was determined, which helped geneticists understand the functions of regulatory genes. In the late 1980s and early 1990s, the first protein tyrosine phosphatase (PTP1B) was purified and the JAK kinases were discovered and cloned, which led many in the scientific community to name the 1990s the decade of protein kinase cascades.[13] [14] Edmond Fischer and Edwin Krebs were awarded the Nobel prize in 1992 "for their discoveries concerning reversible protein phosphorylation as a biological regulatory mechanism".[15]
Reversible phosphorylation of proteins is abundant in prokaryotic and, even more so, in eukaryotic organisms.[16] [17] [18] [19] For instance, in bacteria 5-10% of all proteins are thought to be phosphorylated.[20] [21] By contrast, it is estimated that one third of all human proteins are phosphorylated at any point in time, with 230,000, 156,000, and 40,000 unique phosphorylation sites existing in human, mouse, and yeast, respectively. In yeast, about 120 kinases (out of ~6,000 proteins total) cause 8,814 known regulated phosphorylation events, generating about 3,600 phosphoproteins (about 60% of all yeast proteins).[22] [23] Hence, phosphorylation is a universal regulatory mechanism that affects a large portion of proteins. Even if a protein is not phosphorylated itself, its interactions with other proteins may be regulated by phosphorylation of these interacting proteins.
Phosphorylation introduces a charged and hydrophilic group in the side chain of amino acids, possibly changing a protein's structure by altering interactions with nearby amino acids. Some proteins such as p53 contain multiple phosphorylation sites, facilitating complex, multi-level regulation. Because of the ease with which proteins can be phosphorylated and dephosphorylated, this type of modification is a flexible mechanism for cells to respond to external signals and environmental conditions.[24]
Kinases phosphorylate proteins and phosphatases dephosphorylate proteins. Many enzymes and receptors are switched "on" or "off" by phosphorylation and dephosphorylation. Reversible phosphorylation results in a conformational change in the structure of many enzymes and receptors, causing them to become activated or deactivated. Phosphorylation usually occurs on serine, threonine, tyrosine and histidine residues in eukaryotic proteins. Histidine phosphorylation of eukaryotic proteins appears to be much more frequent than tyrosine phosphorylation.[25] In prokaryotic proteins, phosphorylation occurs on serine, threonine, tyrosine, histidine, arginine or lysine residues.[16] [17] [25] [26] The addition of a phosphate (PO₄³⁻) molecule to a non-polar R group of an amino acid residue can turn a hydrophobic portion of a protein into a polar and extremely hydrophilic portion of a molecule. In this way protein dynamics can induce a conformational change in the structure of the protein via long-range allostery with other hydrophobic and hydrophilic residues in the protein.
One example of the regulatory role that phosphorylation plays is the p53 tumor suppressor protein. The p53 protein is heavily regulated[27] and contains more than 18 different phosphorylation sites. Activation of p53 can lead to cell cycle arrest, which can be reversed under some circumstances, or to apoptotic cell death.[28] In normal healthy individuals, this activity occurs only when the cell is damaged or its physiology is disturbed.
Upon the deactivating signal, the protein becomes dephosphorylated again and stops working.[29] This is the mechanism in many forms of signal transduction, for example the way in which incoming light is processed in the light-sensitive cells of the retina.
Regulatory roles of phosphorylation include:
Elucidating the phosphorylation events of complex signaling pathways can be difficult. In cellular signaling pathways, protein A phosphorylates protein B, and B phosphorylates C. However, in another signaling pathway, protein D phosphorylates A, or phosphorylates protein C. Global approaches such as phosphoproteomics (the study of phosphorylated proteins, a sub-branch of proteomics), combined with mass spectrometry-based proteomics, have been used to identify and quantify dynamic changes in phosphorylated proteins over time. These techniques are becoming increasingly important for the systematic analysis of complex phosphorylation networks.[38] They have been successfully used to identify dynamic changes in the phosphorylation status of more than 6,000 sites after stimulation with epidermal growth factor.[39] Another approach to understanding phosphorylation networks is to measure the genetic interactions between multiple phosphorylating proteins and their targets. This reveals recurring patterns of interactions known as network motifs.[40] Computational methods have been developed to model phosphorylation networks[41] [42] and predict their responses under different perturbations.[43]
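The A, B, C, D example above can be read as a small directed graph in which an edge points from a kinase to its substrate. The following is a minimal sketch under that assumption. The edge list is hypothetical: it takes both of the alternatives mentioned for protein D as edges, and the function names are illustrative rather than taken from any phosphoproteomics tool.

```python
from collections import deque

# Hypothetical kinase -> substrate edges based on the A/B/C/D example in the text.
# Assumption: D is given edges to both A and C for illustration.
NETWORK = {
    "A": ["B"],
    "B": ["C"],
    "C": [],
    "D": ["A", "C"],
}


def downstream_targets(network, kinase):
    """Return every protein reachable from `kinase` (breadth-first search)."""
    seen = set()
    queue = deque(network.get(kinase, []))
    while queue:
        protein = queue.popleft()
        if protein in seen:
            continue
        seen.add(protein)
        queue.extend(network.get(protein, []))
    return seen


def upstream_kinases(network, substrate):
    """Return the proteins that directly phosphorylate `substrate`, sorted by name."""
    return sorted(k for k, targets in network.items() if substrate in targets)


print(sorted(downstream_targets(NETWORK, "D")))
print(upstream_kinases(NETWORK, "C"))
```

In this toy network, protein C is reached from D both directly and through A and B, which illustrates why the same phosphorylation event can be hard to attribute to a single pathway.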
Eukaryotic DNA is organized with histone proteins in specific complexes called chromatin. The chromatin structure facilitates the packaging, organization and distribution of eukaryotic DNA. However, it has a negative impact on several fundamental biological processes such as transcription, replication and DNA repair by restricting the accessibility of certain enzymes and proteins. Post-translational modification of histones, such as histone phosphorylation, has been shown to modify the chromatin structure by changing protein:DNA or protein:protein interactions.[44] The most commonly associated histone phosphorylation occurs during cellular responses to DNA damage, when phosphorylated histone H2A separates large chromatin domains around the site of DNA breakage.[45] Researchers investigated whether modifications of histones directly impact RNA polymerase II directed transcription. They chose proteins that are known to modify histones to test their effects on transcription, and found that the stress-induced kinase MSK1 inhibits RNA synthesis. Transcription was most sensitive to inhibition by MSK1 when the template was in chromatin, since DNA templates not in chromatin were resistant to the effects of MSK1. It was shown that MSK1 phosphorylated histone H2A on serine 1, and mutation of serine 1 to alanine blocked the inhibition of transcription by MSK1. These results suggested that the acetylation of histones can stimulate transcription by suppressing an inhibitory phosphorylation by a kinase such as MSK1.[46]
Within a protein, phosphorylation can occur on several amino acids. Phosphorylation on serine is thought to be the most common, followed by threonine. Tyrosine phosphorylation is relatively rare but lies at the head of many protein phosphorylation signalling pathways (e.g. in tyrosine kinase-linked receptors) in most eukaryotes. Phosphorylation on amino acids such as serine, threonine, and tyrosine results in the formation of a phosphoprotein, in which the phosphate group is joined to the -OH group of a Ser, Thr, or Tyr side chain in an esterification reaction.[47] Because tyrosine-phosphorylated proteins are relatively easy to purify using antibodies, tyrosine phosphorylation sites are relatively well understood. Histidine and aspartate phosphorylation occurs in prokaryotes as part of two-component signaling and, in some cases, in eukaryotic signal transduction pathways. The analysis of phosphorylated histidine using standard biochemical and mass spectrometric approaches is much more challenging than that of Ser, Thr or Tyr.[48] [6] [4] [49] In prokaryotes, archaea, and some lower eukaryotes, histidine's nitrogen acts as a nucleophile and binds to a phosphate group.[50] Once histidine is phosphorylated, the regulatory domain of the response regulator catalyzes the transfer of the phosphate to aspartate.
While tyrosine phosphorylation is found in relatively low abundance, it is well studied due to the ease of purification of phosphotyrosine using antibodies. Receptor tyrosine kinases are an important family of cell surface receptors involved in the transduction of extracellular signals such as hormones, growth factors, and cytokines. Binding of a ligand to a monomeric receptor tyrosine kinase stabilizes interactions between two monomers to form a dimer, after which the two bound receptors phosphorylate tyrosine residues in trans. Phosphorylation and activation of the receptor activates a signaling pathway through enzymatic activity and interactions with adaptor proteins.[51] Signaling through the epidermal growth factor receptor (EGFR), a receptor tyrosine kinase, is critical for the development of multiple organ systems including the skin, lung, heart, and brain. Excessive signaling through the EGFR pathway is found in many human cancers.[52]
Cyclin-dependent kinases (CDKs) are serine-threonine kinases which regulate progression through the eukaryotic cell cycle. CDKs are catalytically active only when bound to a regulatory cyclin. Animal cells contain at least nine distinct CDKs which bind to various cyclins with considerable specificity. CDK inhibitors (CKIs) block kinase activity in the cyclin-CDK complex to halt the cell cycle in G1 or in response to environmental signals or DNA damage. The activity of different CDKs activates cell signaling pathways and transcription factors that regulate key events in the cell cycle, such as the G1/S phase transition. Earlier cyclin-CDK complexes provide the signal to activate subsequent cyclin-CDK complexes.[53]
There are thousands of distinct phosphorylation sites in a given cell since:
Since phosphorylation of any site on a given protein can change the function or localization of that protein, understanding the "state" of a cell requires knowing the phosphorylation state of its proteins. For example, generally, if amino acid Serine-473 in the protein AKT is phosphorylated, AKT is functionally active as a kinase, and if it is not phosphorylated, AKT is an inactive kinase.
Phosphorylation sites are crucial for proteins and for their transport and functions. They are the sites at which proteins are covalently modified through reversible phosphorylation. This enables proteins to remain within a cell, since the negatively charged phosphorylated site prevents them from permeating the cellular membrane. Protein dephosphorylation allows the cell to replenish phosphates through release of pyrophosphates, which saves ATP use in the cell.[55] An example is found in E. coli bacteria, which possess alkaline phosphatase in the periplasmic region of the membrane. The outermost membrane is permeable to phosphorylated molecules; however, the inner cytoplasmic membrane is impermeable to them because of their large negative charges.[56] In this way, E. coli stores proteins and pyrophosphates in its periplasmic region until either is needed within the cell.
Recent advances in phosphoproteomic identification have resulted in the discovery of a very large number of phosphorylation sites in proteins. This created a need for an integrated, accessible resource in which known phosphorylation sites of proteins are organized. A curated database, dbPAF, was created, containing known phosphorylation sites in H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, S. pombe and S. cerevisiae. The database currently holds 294,370 non-redundant phosphorylation sites of 40,432 proteins.[57] Tools for predicting phosphorylation in proteins include NetPhos[58] for eukaryotes, NetPhosBac[58] for bacteria, and ViralPhos[59] for viruses.
There are a large variety of serine residues, and the phosphorylation of each residue can lead to different metabolic consequences.
Phosphorylation of serine and threonine residues is known to crosstalk with O-GlcNAc modification of serine and threonine residues.
Tyrosine phosphorylation is a fast, reversible reaction, and one of the major regulatory mechanisms in signal transduction. Cell growth, differentiation, migration, and metabolic homeostasis are cellular processes maintained by tyrosine phosphorylation. The opposing activities of protein tyrosine kinases and protein tyrosine phosphatases balance the level of phosphotyrosine on any protein. The malfunctioning of specific chains of protein tyrosine kinases and protein tyrosine phosphatases has been linked to multiple human diseases such as obesity, insulin resistance, and type 2 diabetes mellitus.[64] Phosphorylation on tyrosine occurs in eukaryotes and is also present among prokaryotes, in select bacterial species. Phosphorylation on tyrosine maintains cellular regulation in bacteria in a way similar to its function in eukaryotes.[65]
Arginine phosphorylation in many Gram-positive bacteria marks proteins for degradation by a Clp protease.[33]
Widespread human protein phosphorylation occurs on multiple non-canonical amino acids, including motifs containing phosphorylated histidine (1 and 3 positions), aspartate, cysteine, glutamate, arginine, and lysine in HeLa cell extracts. Due to the chemical and thermal lability of these phosphorylated residues, special procedures and separation techniques are required for preservation alongside the heat stable 'classical' Ser, Thr and Tyr phosphorylation.[66]
Antibodies can be used as a powerful tool to detect whether a protein is phosphorylated at a particular site. Antibodies bind to and detect phosphorylation-induced conformational changes in the protein. Such antibodies are called phospho-specific antibodies; hundreds of such antibodies are now available. They are becoming critical reagents both for basic research and for clinical diagnosis.
Post-translational modification (PTM) isoforms are easily detected on 2D gels. Indeed, phosphorylation replaces neutral hydroxyl groups on serines, threonines, or tyrosines with negatively charged phosphates with pKs near 1.2 and 6.5. Thus, below pH 5.5, phosphates add a single negative charge; near pH 6.5, they add 1.5 negative charges; above pH 7.5, they add 2 negative charges. The relative amount of each isoform can also easily and rapidly be determined from staining intensity on 2D gels.
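The charge values quoted above follow from the two pKs. As a rough sketch, the mean charge contributed by one phosphate group can be estimated with the Henderson-Hasselbalch relation, treating the two ionizations as independent (a reasonable simplification when the pKs are as far apart as 1.2 and 6.5). The function name and the chosen pH values are illustrative assumptions.

```python
def phosphate_charge(ph, pka1=1.2, pka2=6.5):
    """Approximate mean charge added by one phosphate group at a given pH.

    Assumes the two ionizations are independent, each following the
    Henderson-Hasselbalch relation with the pK values quoted in the text.
    """
    # Fraction of groups that have lost the first and the second proton.
    first = 1.0 / (1.0 + 10 ** (pka1 - ph))
    second = 1.0 / (1.0 + 10 ** (pka2 - ph))
    return -(first + second)


for ph in (4.5, 5.5, 6.5, 7.5, 8.5):
    print(f"pH {ph}: {phosphate_charge(ph):+.2f}")
```

Under these assumptions the estimate is almost exactly -1.5 at pH 6.5, stays close to -1 below pH 5.5, and approaches -2 above pH 7.5, matching the behaviour described for 2D gels.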
In some very specific cases, phosphorylation can be detected as a shift in the protein's electrophoretic mobility on simple 1-dimensional SDS-PAGE gels, as described, for instance, for a transcriptional coactivator by Kovacs et al.[67] Strong phosphorylation-related conformational changes (that persist in detergent-containing solutions) are thought to underlie this phenomenon. Most of the phosphorylation sites for which such a mobility shift has been described fall in the category of SP and TP sites (i.e. a proline residue follows the phosphorylated serine or threonine residue).
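Candidate SP and TP sites can be located in a protein sequence with a simple pattern search. The sketch below assumes one-letter amino acid codes; the example sequence is made up for illustration, and the function only finds the motif, it does not predict whether a site is actually phosphorylated.

```python
import re


def find_sp_tp_sites(sequence):
    """Return (1-based position, residue) for each Ser/Thr directly followed by Pro."""
    # The lookahead (?=P) checks for the proline without consuming it.
    pattern = re.compile(r"[ST](?=P)")
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]


example = "MASPLTPKSTAPSS"  # hypothetical sequence, not a real protein
print(find_sp_tp_sites(example))
```

For the made-up sequence, the search reports the serine at position 3 and the threonine at position 6, the only two residues immediately followed by proline.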
Large-scale mass spectrometry analyses have been used to determine sites of protein phosphorylation. Dozens of studies have been published, each identifying thousands of sites, many of which were previously undescribed.[68] [69] Mass spectrometry is ideally suited for such analyses using HCD or ETD fragmentation, as the addition of a phosphate group results in an increase in the mass of the protein and the phosphorylated residue. Advanced, highly accurate mass spectrometers are needed for these studies, limiting the technology to labs with high-end mass spectrometers. However, the analysis of phosphorylated peptides by mass spectrometry is still not as straightforward as for "regular", unmodified peptides. EThcD has been developed by combining electron-transfer and higher-energy collision dissociation. Compared to the usual fragmentation methods, the EThcD scheme provides more informative MS/MS spectra for unambiguous phosphosite localization.[70]
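The mass increase mentioned above can be made concrete with a short calculation. The sketch below assumes the commonly used monoisotopic shift of about 79.966 Da per phosphate group (net addition of HPO3) and a proton mass of about 1.00728 Da. Neither value comes from this article, and the peptide mass in the example is hypothetical.

```python
PHOSPHO_MASS_SHIFT = 79.96633  # Da, monoisotopic mass of HPO3 (assumed standard value)
PROTON_MASS = 1.007276         # Da (assumed standard value)


def mz(neutral_mass, charge):
    """m/z of a positively charged ion carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def phospho_mz(neutral_mass, charge, n_phospho=1):
    """m/z of the same peptide carrying `n_phospho` phosphate groups."""
    return mz(neutral_mass + n_phospho * PHOSPHO_MASS_SHIFT, charge)


peptide_mass = 1000.0  # hypothetical neutral monoisotopic mass in Da
for z in (1, 2, 3):
    shift = phospho_mz(peptide_mass, z) - mz(peptide_mass, z)
    print(f"charge {z}+: m/z shift {shift:.4f}")
```

Because the observed shift is divided by the charge state, a singly phosphorylated peptide moves by roughly 80, 40 and 27 m/z units at charges 1+, 2+ and 3+, which is one reason accurate mass measurement matters for assigning phosphopeptides.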
A detailed characterization of the sites of phosphorylation is very difficult, and the quantitation of protein phosphorylation by mass spectrometry requires isotopic internal standard approaches.[71] A relative quantitation can be obtained with a variety of differential isotope labeling technologies.[72] There are also several quantitative protein phosphorylation methods, including fluorescence immunoassays, microscale thermophoresis, FRET, TRF, fluorescence polarization, fluorescence-quenching, mobility shift, bead-based detection, and cell-based formats.[73] [74]
Protein phosphorylation is common among all clades of life, including all animals, plants, fungi, bacteria, and archaea. The origins of protein phosphorylation mechanisms are ancestral and have diverged greatly between different species. In eukaryotes, it is estimated that between 30 – 65% of all proteins may be phosphorylated, with tens or even hundreds of thousands of distinct phosphorylation sites.[75] Some phosphorylation sites appear to have evolved as conditional "off" switches, blocking the active site of an enzyme, such as in the prokaryotic metabolic enzyme isocitrate dehydrogenase. However, in the case of proteins that must be phosphorylated to be active, it is less clear how they could have emerged from non-phosphorylated ancestors. It has been shown that a subset of serine phosphosites are often replaced by acidic residues such as aspartate and glutamate between different species. These anionic residues can interact with cationic residues such as lysine and arginine to form salt bridges, stable non-covalent interactions that alter a protein's structure. These phosphosites often participate in salt bridges, suggesting that some phosphorylation sites evolved as conditional "on" switches for salt bridges, allowing these proteins to adopt an active conformation only in response to a specific signal.[76]
There are around 600 known eukaryotic protein kinases, making them one of the largest eukaryotic gene families. Most phosphorylation is carried out by a single superfamily of protein kinases that share a conserved kinase domain. Protein phosphorylation is highly conserved in pathways central to cell survival, such as cell cycle progression relying on cyclin-dependent kinases (CDKs), but individual phosphorylation sites are often flexible. Targets of CDK phosphorylation often have phosphosites in disordered segments, which are found in non-identical locations even in closely related species. Conversely, targets of CDK phosphorylation in structurally defined regions are more highly conserved. While CDK activity is critical for cell growth and survival in all eukaryotes, only very few phosphosites show strong conservation of their precise positions. Positioning is likely to be highly important for phosphates that allosterically regulate protein structure, but much more flexible for phosphates that interact with phosphopeptide-binding domains to recruit regulatory proteins.[77]
Protein phosphorylation is a reversible post-translational modification of proteins. In eukaryotes, protein phosphorylation functions in cell signaling, gene expression, and differentiation. It is also involved in DNA replication during the cell cycle, and the mechanisms that cope with stress-induced replication blocks. Compared to eukaryotes, prokaryotes use Hanks-type kinases and phosphatases for signal transduction. Whether or not the phosphorylation of proteins in bacteria can also regulate processes like DNA repair or replication still remains unclear.[78]
Compared to the protein phosphorylation of prokaryotes, protein phosphorylation in eukaryotes, from yeast to human cells, has been studied rather extensively. It is known that eukaryotes rely on the phosphorylation of the hydroxyl group on the side chains of serine, threonine, and tyrosine for cell signaling. These are the main regulatory post-translational modifications in eukaryotic cells, but protein phosphorylation in prokaryotes is less intensely studied. While serine, threonine, and tyrosine are phosphorylated in eukaryotes, histidine and aspartate are phosphorylated in both prokaryotes and eukaryotes. In bacteria, histidine phosphorylation occurs in the phosphoenolpyruvate-dependent phosphotransferase systems (PTSs), which are involved in the internalization as well as the phosphorylation of sugars.[79]
Protein phosphorylation by protein kinase was first shown in E. coli and Salmonella typhimurium and has since been demonstrated in many other bacterial cells.[80] It was found that bacteria use histidine and aspartate phosphorylation as a model for bacterial signal transduction. Serine, threonine and tyrosine phosphorylation are also present in bacteria. Bacteria carry kinases and phosphatases similar to their eukaryotic equivalents and have also developed unique kinases and phosphatases not found in eukaryotes.[79]
Abnormal protein phosphorylation has been implicated in a number of diseases, including cancer, Alzheimer's disease, Parkinson's disease, and other degenerative disorders.
Tau protein belongs to a group of microtubule-associated proteins (MAPs) which help stabilize microtubules in cells, including neurons.[81] The association and stabilizing activity of tau protein depend on its phosphorylation state. In Alzheimer's disease, misfolding and abnormal conformational changes in tau protein structure render it ineffective at binding to microtubules and unable to keep the neural cytoskeletal structure organized during neural processes. Abnormal tau inhibits and disrupts microtubule organization and disengages normal tau from microtubules into the cytosolic phase.[82] The misfolding leads to abnormal aggregation into fibrillary tangles inside the neurons. The tau protein needs to be phosphorylated to function, but hyperphosphorylation of tau protein is one of the major contributors to its inability to associate with microtubules. Phosphatases PP1, PP2A, PP2B, and PP2C dephosphorylate tau protein in vitro, and their activities are reduced in areas of the brain in Alzheimer's patients.[83] Tau phosphoprotein is three- to fourfold hyperphosphorylated in an Alzheimer's patient compared to an aged non-afflicted individual. Alzheimer's disease tau seems to remove MAP1 and MAP2 (two other major microtubule-associated proteins) from microtubules, and this deleterious effect is reversed when dephosphorylation is performed, pointing to hyperphosphorylation as the sole cause of the crippling activity.
α-Synuclein is a protein that is associated with Parkinson's disease.[84] In humans, this protein is encoded by the SNCA gene.[85] α-Synuclein is involved in recycling synaptic vesicles that carry neurotransmitters and naturally occurs in an unfolded form. Elevated levels of α-Synuclein are found in patients with Parkinson's disease. There is a correlation between the concentration of unphosphorylated α-Synuclein present in the patient and the severity of Parkinson's disease.[86] Specifically, phosphorylation of Ser129 in α-Synuclein has an impact on severity. Healthy individuals have higher levels of unphosphorylated α-Synuclein than patients with Parkinson's disease. A change in the ratio of phosphorylated to unphosphorylated α-Synuclein within a patient could therefore serve as a marker of disease progression. Antibodies that target α-Synuclein at phosphorylated Ser129 are used to study the molecular aspects of synucleinopathies.[87] [88]
Phosphorylation of Ser129 is associated with the aggregation of the protein and further damage to the nervous system. The aggregation of phosphorylated α-Synuclein can be enhanced if a presynaptic scaffold protein, Sept4, is present in insufficient quantities. Direct interaction of α-Synuclein with Sept4 inhibits the phosphorylation of Ser129.[89] [90] [91] However, phosphorylation of Ser129 can be observed without synuclein aggregation in conditions of overexpression.[92]