Protein targeting or protein sorting is the biological mechanism by which proteins are transported to their appropriate destinations within or outside the cell.[1] [2] Proteins can be targeted to the inner space of an organelle, different intracellular membranes, the plasma membrane, or to the exterior of the cell via secretion. Information contained in the protein itself directs this delivery process.[3] Correct sorting is crucial for the cell; errors or dysfunction in sorting have been linked to multiple diseases.[4] [5]
In 1970, Günter Blobel conducted experiments on protein translocation across membranes. Blobel, then an assistant professor at Rockefeller University, built upon the work of his colleague George Palade.[6] Palade had previously demonstrated that non-secreted proteins were translated by free ribosomes in the cytosol, while secreted proteins (and targeted proteins in general) were translated by ribosomes bound to the endoplasmic reticulum. Candidate explanations at the time postulated a processing difference between free and ER-bound ribosomes, but Blobel hypothesized that protein targeting relied on characteristics inherent to the proteins, rather than a difference in ribosomes. Supporting his hypothesis, Blobel discovered that many proteins have a short amino acid sequence at one end that functions like a postal code specifying an intracellular or extracellular destination. He described these short sequences (generally 13 to 36 amino acid residues) as signal peptides or signal sequences and was awarded the 1999 Nobel Prize in Physiology or Medicine for this work.[7]
Signal peptides serve as targeting signals, enabling cellular transport machinery to direct proteins to specific intracellular or extracellular locations. While no consensus sequence has been identified for signal peptides, many nonetheless possess a characteristic tripartite structure: a positively charged region near the N-terminus, a hydrophobic core in the middle, and a more polar region near the C-terminus that precedes the cleavage site.
After a protein has reached its destination, the signal peptide is generally cleaved by a signal peptidase. Consequently, most mature proteins do not contain signal peptides. While most signal peptides are found at the N-terminus, in peroxisomal proteins the targeting sequence is typically located at the C-terminus.[8] Unlike signal peptides, signal patches are composed of amino acid residues that are discontinuous in the primary sequence but become functional when folding brings them together on the protein surface.[9] Unlike most signal sequences, signal patches are not cleaved after sorting is complete. In addition to intrinsic signaling sequences, protein modifications like glycosylation can also induce targeting to specific intracellular or extracellular regions.
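As a rough illustration of how an N-terminal signal peptide might be recognized computationally, the sketch below scans the first residues of a sequence for a hydrophobic stretch preceded by a positively charged residue. It uses the Kyte-Doolittle hydropathy scale. The window length, the 36-residue search limit (taken from the upper end of the length range quoted above), the score threshold, the requirement for a preceding lysine or arginine, and the function names are all assumptions chosen for illustration; this is not an established predictor.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophobic_window(seq, window=8, search=36):
    """Return (start, mean hydropathy) of the most hydrophobic window within
    the first `search` residues, or None if the sequence is too short."""
    region = seq[:search].upper()
    best = None
    for start in range(len(region) - window + 1):
        mean = sum(KYTE_DOOLITTLE[aa] for aa in region[start:start + window]) / window
        if best is None or mean > best[1]:
            best = (start, mean)
    return best


def looks_like_signal_peptide(seq, threshold=2.0):
    """Heuristic with assumed thresholds: a hydrophobic core near the
    N-terminus that is preceded by at least one lysine or arginine."""
    best = most_hydrophobic_window(seq)
    if best is None:
        return False
    start, mean = best
    has_basic_before_core = any(aa in "KR" for aa in seq[:start].upper())
    return mean >= threshold and has_basic_before_core
```

A sequence that begins with a short basic stretch followed by a run of alanine, isoleucine, valine and leucine passes this check, while a uniformly hydrophilic sequence does not. Non-standard residue codes raise a `KeyError`, which a real tool would need to handle.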
Since the translation of mRNA into protein by a ribosome takes place within the cytosol, proteins destined for secretion or a specific organelle must be translocated.[10] This process can occur during translation, known as co-translational translocation, or after translation is complete, known as post-translational translocation.[11]
Most secretory and membrane-bound proteins are co-translationally translocated. Proteins that reside in the endoplasmic reticulum (ER), Golgi or endosomes also use the co-translational translocation pathway. This process begins while the protein is being synthesized on the ribosome, when a signal recognition particle (SRP) recognizes an N-terminal signal peptide of the nascent protein.[12] Binding of the SRP temporarily pauses synthesis while the ribosome-protein complex is transferred to an SRP receptor on the ER in eukaryotes, or on the plasma membrane in prokaryotes.[13] There, the nascent protein is inserted into the translocon, a membrane-bound protein-conducting channel composed of the Sec61 translocation complex in eukaryotes and the homologous SecYEG complex in prokaryotes.[14] In secretory proteins and type I transmembrane proteins, the signal sequence is cleaved from the nascent polypeptide by signal peptidase as soon as it has been translocated into the membrane of the ER (eukaryotes) or the plasma membrane (prokaryotes). The signal sequences of type II membrane proteins and some polytopic membrane proteins are not cleaved off and are therefore referred to as signal-anchor sequences. Within the ER, the protein is first bound by a chaperone protein that shields it from the high concentration of other proteins in the ER, giving it time to fold correctly. Once folded, the protein is modified as needed (for example, by glycosylation) and then either transported to the Golgi for further processing before moving on to its target organelle, or retained in the ER by one of several ER retention mechanisms.
The amino acid chain of transmembrane proteins, which often are transmembrane receptors, passes through a membrane one or several times. These proteins are inserted into the membrane by translocation, until the process is interrupted by a stop-transfer sequence, also called a membrane anchor or signal-anchor sequence.[15] These complex membrane proteins are currently characterized using the same model of targeting that has been developed for secretory proteins. However, many complex multi-transmembrane proteins contain structural aspects that do not fit this model. Seven-transmembrane G protein-coupled receptors (which represent about 5% of the genes in humans) mostly do not have an amino-terminal signal sequence. In contrast to secretory proteins, the first transmembrane domain acts as the first signal sequence, which targets them to the ER membrane. This also results in the translocation of the amino terminus of the protein into the ER lumen. This translocation, which has been demonstrated for opsin in in vitro experiments,[16] [17] breaks the usual pattern of "co-translational" translocation that otherwise holds for mammalian proteins targeted to the ER. Much of the mechanics of transmembrane topology and folding remains to be elucidated.
Even though most secretory proteins are co-translationally translocated, some are translated in the cytosol and later transported to the ER or plasma membrane by a post-translational system. In prokaryotes this process requires cofactors such as SecA and SecB, while in eukaryotes it is facilitated by Sec62 and Sec63, two membrane-bound proteins.[18] The Sec63 complex, which is embedded in the ER membrane, causes hydrolysis of ATP, allowing chaperone proteins to bind to an exposed peptide chain and slide the polypeptide into the ER lumen. Once in the lumen, the polypeptide chain can be folded properly. This process only occurs for unfolded proteins located in the cytosol.[19]
In addition, proteins targeted to other cellular destinations, such as mitochondria, chloroplasts, or peroxisomes, use specialized post-translational pathways. Proteins targeted to the nucleus are also imported post-translationally: they carry a nuclear localization signal (NLS) that promotes passage through the nuclear envelope via nuclear pores.[20]
While some proteins in the mitochondria originate from mitochondrial DNA within the organelle, most mitochondrial proteins are synthesized as cytosolic precursors containing uptake peptide signals.[21] [22] [23] [24] Unfolded proteins that are bound by the cytosolic chaperone Hsp70 and targeted to the mitochondria may be localized to four different areas depending on their sequences.[25] They may be targeted to the mitochondrial matrix, the outer membrane, the intermembrane space, or the inner membrane. Defects in any one or more of these processes have been linked to disease.[26]
Proteins destined for the mitochondrial matrix have specific signal sequences at their beginning (N-terminus) that consist of a string of 20 to 50 amino acids. These sequences interact with receptors that guide the proteins to their correct location inside the mitochondria. The sequences have a characteristic structure with clusters of water-loving (hydrophilic) and water-avoiding (hydrophobic) amino acids, giving them a dual nature known as amphipathic. These amphipathic sequences typically form a spiral shape (alpha-helix) with the charged amino acids on one side and the hydrophobic ones on the opposite side. This structural feature is essential for the sequence to function correctly in directing proteins to the matrix. If mutations disrupt this dual nature, the protein often fails to reach its intended destination, although not all changes to the sequence have this effect. This indicates the importance of the amphipathic property for correct targeting to the mitochondrial matrix.
Targeting to the mitochondrial matrix begins with interactions between the matrix targeting sequence located at the N-terminus and the outer membrane import receptor complex TOM20/22,[27] in addition to the docking of internal sequences and cytosolic chaperones to TOM70 (TOM is an abbreviation for translocase of the outer membrane). Binding of the matrix targeting sequence to the import receptor triggers a handoff of the polypeptide to the general import pore (GIP), known as TOM40. TOM40 then feeds the polypeptide chain through the intermembrane space and into another translocase complex, TIM17/23/44, located on the inner mitochondrial membrane.[28] This is accompanied by the necessary release of the cytosolic chaperones that maintain an unfolded state prior to entry into the mitochondria.
As the polypeptide enters the matrix, the signal sequence is cleaved by a processing peptidase and the remaining sequences are bound by mitochondrial chaperones to await proper folding and activity. The push and pull of the polypeptide from the cytosol to the intermembrane space and then the matrix is achieved by an electrochemical gradient that the mitochondrion establishes during oxidative phosphorylation: a metabolically active mitochondrion generates a negative potential inside the matrix and a positive potential in the intermembrane space.[29] It is this negative potential inside the matrix that directs the positively charged regions of the targeting sequence to their destination.
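The amphipathic character of matrix-targeting sequences described above can be expressed as a number. One common measure is the hydrophobic moment, which sums per-residue hydrophobicity as vectors rotated by 100 degrees per residue, the spacing of an ideal alpha-helix. The sketch below is an illustration under stated assumptions: it substitutes the Kyte-Doolittle hydropathy scale for the scale used in the original definition of the measure, and the function name is hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, angle_deg=100.0):
    """Mean hydrophobic moment of `seq`, assuming an ideal alpha-helix in
    which successive residues are rotated by `angle_deg` around the axis."""
    if not seq:
        raise ValueError("sequence must not be empty")
    step = math.radians(angle_deg)
    residues = seq.upper()
    # Treat each residue's hydropathy as a vector pointing away from the helix axis.
    x = sum(KYTE_DOOLITTLE[aa] * math.cos(i * step) for i, aa in enumerate(residues))
    y = sum(KYTE_DOOLITTLE[aa] * math.sin(i * step) for i, aa in enumerate(residues))
    return math.hypot(x, y) / len(residues)
```

A helix with hydrophobic residues on one face and charged residues on the other gives a large moment, because the contributions point the same way. A helix of 18 identical residues gives a moment close to zero, since 18 steps of 100 degrees make exactly five full turns and the vectors cancel.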
Targeting of mitochondrial proteins to the inner membrane may follow three different pathways depending on their overall sequences. In all three, entry through the outer membrane is the same, using the import receptor complex TOM20/22 and the TOM40 general import pore. The first pathway follows the same steps as for proteins destined for the matrix: the protein contains a matrix targeting sequence that channels the polypeptide to the inner membrane translocase complex TIM17/23/44. The difference is that peptides destined for the inner membrane rather than the matrix also contain a stop-transfer-anchor sequence. This stop-transfer-anchor sequence is a hydrophobic region that embeds itself into the phospholipid bilayer of the inner membrane and prevents translocation further into the mitochondrion. The second pathway follows the matrix localization pathway in its entirety. However, instead of a stop-transfer-anchor sequence, the protein contains another sequence that, once inside the matrix, interacts with an inner membrane protein called Oxa-1, which embeds it into the inner membrane. The third pathway follows the same entry through the outer membrane as the others, but uses the translocase complex TIM22/54, assisted by the TIM9/10 complex in the intermembrane space, to anchor the incoming peptide into the membrane. The peptides for this last pathway do not contain a matrix targeting sequence, but instead contain several internal targeting sequences.
If instead the precursor protein is destined for the intermembrane space of the mitochondrion, there are two pathways by which this may occur, depending on the sequences being recognized. The first pathway follows the same steps as for an inner membrane targeted protein. However, once the protein is bound to the inner membrane, the C-terminus of the anchored protein is cleaved by a peptidase that liberates the preprotein into the intermembrane space so it can fold into its active state. A well-known example of a protein that follows this pathway is cytochrome b2, which upon being cleaved interacts with a heme cofactor and becomes active.[30] The second intermembrane space pathway does not use any inner membrane complexes, and proteins that follow it therefore do not contain a matrix targeting signal. Instead, the protein enters through the general import pore TOM40 and is further modified in the intermembrane space to achieve its active conformation. TIM9/10 is an example of a protein that follows this pathway to reach the location where it assists in inner membrane targeting.[31]
Outer membrane targeting involves the interaction of precursor proteins with outer membrane translocase complexes, which embed them into the membrane via internal targeting sequences that form hydrophobic alpha helices or beta barrels spanning the phospholipid bilayer. This may occur by two different routes depending on the internal sequences of the preprotein. If the preprotein contains internal hydrophobic regions capable of forming alpha helices, it uses the mitochondrial import complex (MIM) and is transferred laterally into the membrane. Preproteins containing hydrophobic internal sequences characteristic of beta-barrel-forming proteins are imported through the aforementioned outer membrane complex TOM20/22 into the intermembrane space. There they interact with the TIM9/10 intermembrane-space protein complex, which transfers them to the sorting and assembly machinery (SAM) in the outer membrane, which in turn inserts the targeted protein laterally into the membrane as a beta-barrel.
Chloroplasts are similar to mitochondria in that they contain their own DNA for production of some of their components. However, the majority of their proteins arise from nuclear genes and are obtained via post-translational translocation. Depending on their sequences, proteins may be targeted to several sites within the chloroplast, such as the outer envelope, inner envelope, stroma, thylakoid lumen, or thylakoid membrane. Proteins are targeted to thylakoids by mechanisms related to bacterial protein translocation.[32] Proteins targeted to the envelope of chloroplasts usually lack a cleavable sorting sequence and are laterally displaced via membrane sorting complexes. General import for the majority of preproteins requires translocation from the cytosol through the Toc and Tic complexes located within the chloroplast envelope, where Toc is an abbreviation for the translocase of the outer chloroplast envelope and Tic for the translocase of the inner chloroplast envelope. At least three proteins make up the Toc complex. Two of them, Toc159 and Toc34, are responsible for the docking of stromal import sequences, and both have GTPase activity. The third, Toc75, is the actual translocation channel that feeds the preprotein recognized by Toc159/34 into the chloroplast.[33]
Targeting to the stroma requires the preprotein to have a stromal import sequence that is recognized by the Tic complex of the inner envelope upon being translocated from the outer envelope by the Toc complex. The Tic complex is composed of at least five different Tic proteins that are required to form the translocation channel across the inner envelope.[34] Upon being delivered to the stroma, the stromal import sequence is cleaved off via a signal peptidase. This delivery process to the stroma is currently known to be driven by ATP hydrolysis via stromal HSP chaperones, instead of the transmembrane electrochemical gradient that is established in mitochondria to drive protein import. Further intra-chloroplast sorting depends on additional target sequences such as those designated to the thylakoid membrane or the thylakoid lumen.
If a protein is to be targeted to the thylakoid lumen, this may occur via four known routes that closely resemble bacterial protein transport mechanisms. The route taken depends on whether the protein delivered to the stroma is in an unfolded or a metal-bound folded state. In both cases the protein still contains a thylakoid targeting sequence, which is cleaved upon entry into the lumen. While protein import into the stroma is ATP-driven, the pathway that carries metal-bound proteins in a folded state to the thylakoid lumen has been shown to be driven by a pH gradient.
Proteins bound for the thylakoid membrane follow up to four known routes, which are illustrated in the corresponding figure. They may follow a co-translational insertion route that uses stromal ribosomes and the SecY/E transmembrane complex, the SRP-dependent pathway, the spontaneous insertion pathway, or the GET pathway. The last three are post-translational pathways for proteins originating from nuclear genes and therefore account for the majority of proteins targeted to the thylakoid membrane. According to recent review articles in the journal of biochemistry and molecular biology, the exact mechanisms are not yet fully understood.
Many proteins are needed in both mitochondria and chloroplasts.[35] In general, the dual-targeting peptide is intermediate in character between the two specific ones. The targeting peptides of these proteins have a high content of basic and hydrophobic amino acids and a low content of negatively charged amino acids. They have a lower content of alanine and a higher content of leucine and phenylalanine. Dual-targeted proteins have a more hydrophobic targeting peptide than both mitochondrial and chloroplastic ones. However, it is difficult to predict whether a peptide is dual-targeted based on its physicochemical characteristics.
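The compositional features listed above are simple residue fractions, which the following sketch computes for a candidate targeting peptide. The residue groupings (which amino acids count as hydrophobic, and the exclusion of histidine from the basic group) and the function name are assumptions made for illustration. As noted above, such a profile describes a peptide but does not reliably predict dual targeting.

```python
# Assumed residue groupings; other classifications are equally defensible.
BASIC = set("KR")
ACIDIC = set("DE")
HYDROPHOBIC = set("AVLIMFW")


def targeting_peptide_profile(peptide):
    """Return the fraction of residues in each class discussed in the text."""
    seq = peptide.upper()
    if not seq:
        raise ValueError("peptide must not be empty")

    def fraction(residues):
        return sum(aa in residues for aa in seq) / len(seq)

    return {
        "basic": fraction(BASIC),
        "acidic": fraction(ACIDIC),
        "hydrophobic": fraction(HYDROPHOBIC),
        "alanine": fraction({"A"}),
        "leucine": fraction({"L"}),
        "phenylalanine": fraction({"F"}),
    }
```

Profiles of known mitochondrial, chloroplast and dual-targeted peptides could then be compared side by side, for example to see whether a candidate falls between the two organelle-specific groups.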
The nucleus of a cell is surrounded by a nuclear envelope consisting of two layers, with the inner layer providing structural support and anchorage for chromosomes and the nuclear lamina. The outer layer is similar to the endoplasmic reticulum (ER) membrane. This envelope contains nuclear pores, which are complex structures made from around 30 different proteins. These pores act as selective gates that control the flow of molecules into and out of the nucleus.
While small molecules can pass through these pores without issue, larger molecules, like RNA and proteins destined for the nucleus, must have specific signals to be allowed through. These signals are known as nuclear localization signals, usually comprising short sequences rich in positively charged amino acids like lysine or arginine.
Proteins called nuclear import receptors recognize these signals and guide the large molecules through the nuclear pores by interacting with the disordered, mesh-like proteins that fill the pore. The process is dynamic, with the receptor moving the molecule through the meshwork until it reaches the nucleus.
Once inside, a GTPase enzyme called Ran, which can exist in two different forms (one bound to GTP and the other to GDP), facilitates the release of the cargo inside the nucleus and recycles the receptor back to the cytosol. The energy for this transport comes from the hydrolysis of GTP by Ran. Similarly, nuclear export receptors help move proteins and RNA out of the nucleus using a different signal and also harnessing Ran's energy conversion.
Overall, the nuclear pore complex works efficiently to transport macromolecules at high speed, allowing proteins to move in their folded state and ribosomal components as complete particles, which is distinct from how proteins are transported into most other organelles.
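Because nuclear localization signals are described above as short stretches rich in lysine and arginine, a first-pass search for candidates can be as simple as a sliding window that counts basic residues. The sketch below is such a heuristic. The window size, the minimum count and the function name are assumptions, and a hit only marks a candidate region, not a confirmed signal.

```python
def basic_clusters(seq, window=5, min_basic=4):
    """Return 0-based start positions of every window of length `window`
    that contains at least `min_basic` lysine or arginine residues."""
    residues = seq.upper()
    hits = []
    for start in range(len(residues) - window + 1):
        n_basic = sum(aa in "KR" for aa in residues[start:start + window])
        if n_basic >= min_basic:
            hits.append(start)
    return hits


print(basic_clusters("MAPKKKRKVED"))  # -> [2, 3, 4]
```

The example sequence embeds the stretch PKKKRKV, and the three overlapping hits all fall within it. Overlapping hits would normally be merged into one region before further analysis.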
The endoplasmic reticulum (ER) plays a key role in protein synthesis and distribution in eukaryotic cells. It is a vast network of membranes where proteins are processed and sorted to various destinations, including the ER itself, the cell surface, and other organelles like the Golgi apparatus, endosomes, and lysosomes. Unlike proteins targeted to other organelles, those headed for the ER start to be transferred across its membrane while they are still being made.
There are two types of proteins that move to the ER: water-soluble proteins, which completely cross into the ER lumen, and transmembrane proteins, which partly cross and embed themselves within the ER membrane. These proteins find their way to the ER with the help of an ER signal sequence, a short stretch of hydrophobic amino acids.
Proteins entering the ER are synthesized by ribosomes. There are two sets of ribosomes in the cell: those bound to the ER (making it look 'rough') and those floating freely in the cytosol. Both sets are identical but differ in the proteins they synthesize at a given moment. Ribosomes that are making proteins with an ER signal sequence attach to the ER membrane and start the translocation process. This process is energy-efficient because the growing protein chain itself pushes through the ER membrane as it elongates.
As the mRNA is translated into a protein, multiple ribosomes may attach to it, creating a structure called a polyribosome. If the mRNA is coding for a protein with an ER signal sequence, the polyribosome attaches to the ER membrane, and the protein begins to enter the ER while it is still being synthesized.
In eukaryotic cells, soluble proteins that are destined for the endoplasmic reticulum (ER) or for secretion out of the cell are guided to the ER by a two-part system. Firstly, a signal-recognition particle (SRP) in the cytosol attaches to the emerging protein's ER signal sequence and to the ribosome itself. Secondly, an SRP receptor located in the ER membrane recognizes and binds to the SRP. Binding of the SRP temporarily slows down protein synthesis until the SRP-ribosome complex binds to the SRP receptor on the ER.
Once this binding occurs, the SRP is released, and the ribosome is transferred to a protein translocator in the ER membrane, allowing protein synthesis to continue. The polypeptide chain of the protein is then threaded through a channel in the translocator into the ER lumen. The signal sequence of the protein, typically at the beginning (N-terminus) of the polypeptide chain, plays a dual role. It not only targets the ribosome to the ER but also triggers the opening of the translocator. As the protein is fed through the translocator, the signal sequence stays attached, allowing the rest of the protein to move through as a loop. A signal peptidase inside the ER then cuts off the signal sequence, which is subsequently discarded into the lipid bilayer of the ER membrane and broken down.
Finally, once the last part of the protein (the C-terminus) passes through the translocator, the entire soluble protein is released into the ER lumen, where it can then fold and undergo further modifications or be transported to its final destination.
Transmembrane proteins, which are partly integrated into the ER membrane rather than released into the ER lumen, have a complex assembly process. The initial stages are similar to soluble proteins: a signal sequence starts the insertion into the ER membrane. However, this process is interrupted by a stop-transfer sequence (a string of hydrophobic amino acids), which causes the translocator to halt and release the protein laterally into the membrane. This results in a single-pass transmembrane protein with one end inside the ER lumen and the other in the cytosol, and this orientation is permanent.
Some transmembrane proteins use an internal signal (start-transfer sequence) instead of one at the N-terminus, and unlike the initial signal sequence, this start-transfer sequence is not removed. It begins the transfer process, which continues until a stop-transfer sequence is encountered, at which point both sequences become anchored in the membrane as alpha-helical segments.
In more complex proteins that span the membrane multiple times, additional pairs of start- and stop-transfer sequences are used to weave the protein into the membrane in a fashion akin to a sewing machine. Each pair allows a new segment to cross the membrane and adds to the protein's structure, ensuring it is properly embedded with the correct arrangement of segments inside and outside the ER membrane.
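The start- and stop-transfer rule described above amounts to simple bookkeeping, sketched here. Reading the signals from N- to C-terminus, a cleaved signal sequence or a start-transfer sequence switches translocation on, and a stop-transfer sequence switches it off. The labels `"signal"`, `"start"` and `"stop"` and the function name are hypothetical. The model deliberately ignores details such as the orientation of individual start-transfer sequences and does not check that the order of signals is plausible.

```python
def trace_topology(signals):
    """Follow a polypeptide through the translocator.

    `signals` lists the topogenic sequences from N- to C-terminus:
      "signal" - cleaved N-terminal ER signal sequence
      "start"  - internal start-transfer sequence (retained in the membrane)
      "stop"   - stop-transfer sequence (retained in the membrane)
    Returns (number of membrane-spanning segments,
             compartment of the chain segment that follows each signal).
    """
    spans = 0
    translocating = False
    following = []
    for kind in signals:
        if kind == "signal":
            translocating = True   # opens the translocator; cleaved later, so no span
        elif kind == "start":
            translocating = True   # stays in the membrane as a helix
            spans += 1
        elif kind == "stop":
            translocating = False  # halts transfer and anchors the chain
            spans += 1
        else:
            raise ValueError(f"unknown signal type: {kind!r}")
        following.append("ER lumen" if translocating else "cytosol")
    return spans, following
```

For a cleaved signal followed by one stop-transfer sequence this gives a single span with the C-terminal part in the cytosol, matching the single-pass case above. Two start and stop pairs give four spans with loops alternating between lumen and cytosol.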
Peroxisomes are bounded by a single phospholipid bilayer that surrounds the peroxisomal matrix, which contains a wide variety of proteins and enzymes that participate in anabolism and catabolism. Peroxisomes are specialized cell organelles that carry out specific oxidative reactions using molecular oxygen. Their primary function is to remove hydrogen atoms from organic molecules, a process that results in the production of hydrogen peroxide (H2O2). Within peroxisomes, an enzyme called catalase plays a critical role. It uses the hydrogen peroxide generated in the earlier reaction to oxidize various other substances, including phenols, formic acid, formaldehyde, and alcohol. This is known as the "peroxidative" reaction.
Peroxisomes are particularly important in liver and kidney cells for detoxifying harmful substances that enter the bloodstream. For example, they are responsible for oxidizing about 25% of the ethanol we consume into acetaldehyde. Additionally, catalase within peroxisomes can break down excess hydrogen peroxide into water and oxygen, thus preventing potential damage from the build-up of hydrogen peroxide. Since peroxisomes contain no internal DNA like that of mitochondria or chloroplasts, all peroxisomal proteins are encoded by nuclear genes.[36] To date, two types of peroxisome targeting signal (PTS) are known: PTS1, a tripeptide at the C-terminus, and PTS2, a nonapeptide located near the N-terminus.
There are also proteins that possess neither of these signals. Their transport may be based on a so-called "piggy-back" mechanism: such proteins associate with PTS1-possessing matrix proteins and are translocated into the peroxisomal matrix together with them.[39]
Cytosolic proteins that are produced with the C-terminal PTS1 sequence depend for their import into the peroxisomal matrix on binding to another cytosolic protein called pex5 (peroxin 5).[40] Once bound to its cargo, pex5 interacts with the peroxisomal membrane protein pex14 to form a complex, which induces the release of the targeted protein into the matrix. After the cargo protein has been released, pex5 dissociates from pex14 through ubiquitination by a membrane complex comprising pex2, pex12, and pex10, followed by ATP-dependent removal involving the cytosolic protein complex of pex1 and pex6.[41] After the ATP-dependent removal of ubiquitin, pex5 is free to bind another protein containing a PTS1 sequence, and the import cycle starts again. Import of proteins containing a PTS2 targeting sequence is mediated by a different cytosolic protein but is believed to follow a mechanism similar to that for PTS1.
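Since PTS1 sits at the extreme C-terminus, checking a sequence for it reduces to inspecting its last three residues. The sketch below does this with a regular expression. The text does not spell out the motif, so the pattern used here, serine, alanine or cysteine, then lysine, arginine or histidine, then leucine (with Ser-Lys-Leu as the prototype), is an assumed simplification of the commonly quoted consensus, and the function name is hypothetical.

```python
import re

# Assumed, simplified PTS1 pattern: [S/A/C]-[K/R/H]-L as the final three residues.
PTS1_PATTERN = re.compile(r"[SAC][KRH]L$")


def has_pts1(seq):
    """Return True if the sequence ends in a tripeptide matching the assumed pattern."""
    return PTS1_PATTERN.search(seq.strip().upper()) is not None
```

A sequence ending in SKL matches, whereas the same tripeptide in the middle of a sequence does not, which reflects the positional requirement described above.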
Protein transport is defective in the following genetic diseases:
As discussed above (see protein translocation), most prokaryotic membrane-bound and secretory proteins are targeted to the plasma membrane by either a co-translational pathway that uses bacterial SRP or a post-translational pathway that requires SecA and SecB. At the plasma membrane, these two pathways deliver proteins to the SecYEG translocon for translocation. Bacteria may have a single plasma membrane (Gram-positive bacteria), or an inner membrane plus an outer membrane separated by the periplasm (Gram-negative bacteria). Apart from the plasma membrane, the majority of prokaryotes lack the membrane-bound organelles found in eukaryotes, but they may assemble proteins onto various types of inclusions such as gas vesicles and storage granules.
In Gram-negative bacteria, proteins may be incorporated into the plasma membrane, the outer membrane, or the periplasm, or secreted into the environment. Systems for secreting proteins across the bacterial outer membrane may be quite complex and play key roles in pathogenesis. These systems may be described as type I secretion, type II secretion, etc.
In most Gram-positive bacteria, certain proteins are targeted for export across the plasma membrane and subsequent covalent attachment to the bacterial cell wall. A specialized enzyme, sortase, cleaves the target protein at a characteristic recognition site near the protein C-terminus, such as an LPXTG motif (where X can be any amino acid), then transfers the protein onto the cell wall. Several analogous systems are found that likewise feature a signature motif on the extra-cytoplasmic face, a C-terminal transmembrane domain, and a cluster of basic residues on the cytosolic face at the protein's extreme C-terminus. The PEP-CTERM/exosortase system, found in many Gram-negative bacteria, seems to be related to extracellular polymeric substance production. The PGF-CTERM/archaeosortase A system in archaea is related to S-layer production. The GlyGly-CTERM/rhombosortase system, found in Shewanella, Vibrio, and a few other genera, seems to be involved in the release of proteases, nucleases, and other enzymes.
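The sortase recognition features described above (an LPXTG motif near the C-terminus and a cluster of basic residues at the very end) lend themselves to a simple pattern search, sketched below. The 50-residue tail limit, the six-residue end window and the function names are assumptions for illustration. The sketch does not test for the intervening transmembrane domain.

```python
import re

# LPXTG, where X can be any amino acid (one-letter codes).
LPXTG = re.compile(r"LP[A-Z]TG")


def find_lpxtg(seq, tail=50):
    """Return the 0-based start of the last LPXTG motif lying within the
    final `tail` residues, or None if there is no such motif."""
    residues = seq.upper()
    cutoff = max(len(residues) - tail, 0)
    starts = [m.start() for m in LPXTG.finditer(residues) if m.start() >= cutoff]
    return starts[-1] if starts else None


def c_terminal_basic_count(seq, n=6):
    """Count lysine and arginine residues among the last `n` residues."""
    return sum(aa in "KR" for aa in seq.upper()[-n:])
```

A candidate sortase substrate would be expected to return a motif position from `find_lpxtg` and a count of several basic residues from `c_terminal_basic_count`. Analogous searches for the other systems mentioned above would need their own signature patterns.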