A ribosomal protein (r-protein or rProtein[1] [2] [3]) is any of the proteins that, in conjunction with rRNA, make up the ribosomal subunits involved in the cellular process of translation. E. coli, other bacteria and Archaea have a 30S small subunit and a 50S large subunit, whereas humans and yeasts have a 40S small subunit and a 60S large subunit.[4] Equivalent subunits are frequently numbered differently between bacteria, Archaea, yeasts and humans.[5]
Much of what is known about these organic molecules comes from the study of E. coli ribosomes. All of the ribosomal proteins have been isolated, and many specific antibodies have been produced. These, together with electron microscopy and the use of certain reagents, have allowed the topography of the proteins in the ribosome to be determined. More recently, a near-complete, near-atomic picture of the ribosomal proteins has been emerging from high-resolution cryo-EM data.
Ribosomal proteins are among the most highly conserved proteins across all life forms.[5] Among the 40 proteins found in various small ribosomal subunits (RPSs), 15 are universally conserved across prokaryotes and eukaryotes. However, 7 are found only in bacteria (bS1, bS6, bS16, bS18, bS20, bS21, and bTHX), while 17 are found only in archaea and eukaryotes.[5] Typically 22 proteins are found in bacterial small subunits and 32 in those of yeast, humans and most likely most other eukaryotic species. Twenty-seven of the 32 eukaryotic small subunit proteins are also present in archaea (no ribosomal protein is found exclusively in archaea), which supports the view that archaea are more closely related to eukaryotes than to bacteria.[5]
Among the large ribosomal subunit proteins (RPLs), 18 are universal, i.e. found in bacteria, eukaryotes, and archaea. Fourteen are found only in bacteria, while 27 are found only in archaea and eukaryotes. Again, archaea have no proteins unique to them.[5]
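As a quick consistency check on the counts quoted above, the following minimal sketch adds the universal proteins to the domain-restricted ones. The dictionary and function names are illustrative only, the figures are just those stated in the text, and treating every "archaea and eukaryotes only" protein as part of one pool is a simplifying assumption.

```python
# Counts quoted in the text for the small (SSU) and large (LSU) subunits.
COUNTS = {
    "SSU": {"universal": 15, "bacteria_only": 7, "archaea_eukaryote_only": 17},
    "LSU": {"universal": 18, "bacteria_only": 14, "archaea_eukaryote_only": 27},
}


def expected_totals(subunit):
    """Return (bacterial pool, archaeal/eukaryotic pool) for one subunit."""
    c = COUNTS[subunit]
    bacterial = c["universal"] + c["bacteria_only"]
    non_bacterial = c["universal"] + c["archaea_eukaryote_only"]
    return bacterial, non_bacterial


for name in COUNTS:
    bacterial, non_bacterial = expected_totals(name)
    print(f"{name}: bacterial pool = {bacterial}, "
          f"archaeal/eukaryotic pool = {non_bacterial}")
```

For the small subunit this gives 22 and 32, which agrees with the typical bacterial and eukaryotic figures given in the text.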
Despite their high conservation over billions of years of evolution, the absence of several ribosomal proteins in certain species shows that ribosomal proteins have been added and lost over the course of evolution. This is also reflected by the fact that several ribosomal proteins do not appear to be essential when deleted.[6] For instance, in E. coli nine ribosomal proteins (uL15, bL21, uL24, bL27, uL29, uL30, bL34, uS9, and uS17) are nonessential for survival. Taken together with previous results, 22 of the 54 E. coli ribosomal protein genes can be individually deleted from the genome.[7] Similarly, 16 ribosomal proteins (uL1, bL9, uL15, uL22, uL23, bL28, uL29, bL32, bL33.1, bL33.2, bL34, bL35, bL36, bS6, bS20, and bS21) were successfully deleted in Bacillus subtilis. In conjunction with previous reports, 22 ribosomal proteins have been shown to be nonessential in B. subtilis, at least for cell proliferation.[8]
The ribosome of E. coli has about 22 proteins in the small subunit (labelled S1 to S22) and 33 proteins in the large subunit (somewhat counter-intuitively called L1 to L36). All of them are different, with three exceptions: one protein is found in both subunits (S20 and L26), L7 and L12 are acetylated and methylated forms of the same protein, and L8 is a complex of L7/L12 and L10. In addition, L31 is known to exist in two forms, the full-length form at 7.9 kilodaltons (kDa) and a fragmented form at 7.0 kDa. This is why the number of proteins in a ribosome is 56. Except for S1 (with a molecular weight of 61.2 kDa), the proteins range in weight between 4.4 and 29.7 kDa.[9]
Recent de novo proteomics experiments characterized in vivo ribosome-assembly intermediates and associated assembly factors from wild-type Escherichia coli cells using a general quantitative mass spectrometry (qMS) approach. They confirmed the presence of all the known small and large subunit components and identified a total of 21 known and potentially new ribosome-assembly factors that co-localise with various ribosomal particles.[10]
In the small (30S) subunit of E. coli ribosomes, the proteins denoted uS4, uS7, uS8, uS15, uS17, and bS20 bind independently to 16S rRNA. After assembly of these primary binding proteins, uS5, bS6, uS9, uS12, uS13, bS16, bS18, and uS19 bind to the growing ribosome. These proteins also potentiate the addition of uS2, uS3, uS10, uS11, uS14, and bS21. Protein binding to helical junctions is important for initiating the correct tertiary fold of the RNA and for organizing the overall structure. Nearly all the proteins contain one or more globular domains. Moreover, nearly all contain long extensions that can contact the RNA in far-reaching regions. Additional stabilization results from the proteins' basic residues, as these neutralize the charge repulsion of the RNA backbone. Protein–protein interactions also help to hold the structure together through electrostatic and hydrogen-bonding interactions. Theoretical investigations have pointed to correlated effects of protein binding on binding affinities during the assembly process.[11]
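The binding order described above can be written down as a small data structure. The sketch below is a deliberately coarse model: the stage labels "secondary" and "tertiary" and the rule that a protein may bind only once every earlier group is complete are simplifying assumptions made for illustration, not a statement of the real protein-by-protein dependencies.

```python
# Binding groups for E. coli 30S assembly, in the order given in the text.
PRIMARY = ["uS4", "uS7", "uS8", "uS15", "uS17", "bS20"]
SECONDARY = ["uS5", "bS6", "uS9", "uS12", "uS13", "bS16", "bS18", "uS19"]
TERTIARY = ["uS2", "uS3", "uS10", "uS11", "uS14", "bS21"]

# Stage labels other than "primary" are illustrative names, not from the text.
STAGES = [("primary", PRIMARY), ("secondary", SECONDARY), ("tertiary", TERTIARY)]


def binding_stage(protein):
    """Return the label of the group in which the protein is listed."""
    for label, members in STAGES:
        if protein in members:
            return label
    raise KeyError(f"{protein} is not in any listed binding group")


def can_bind(protein, bound):
    """Coarse rule (assumption): all proteins of earlier groups must be bound."""
    labels = [label for label, _ in STAGES]
    index = labels.index(binding_stage(protein))
    required = {p for _, members in STAGES[:index] for p in members}
    return required <= set(bound)


print(binding_stage("uS12"))              # group that lists uS12
print(can_bind("uS12", ["uS4", "uS7"]))   # primary group incomplete
print(can_bind("uS12", PRIMARY))          # primary group complete
```

A finer model would replace the all-or-nothing stage rule with explicit per-protein dependencies, which the text does not list.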
In one study, the net charges (at pH 7.4) of the ribosomal proteins comprising the highly conserved S10-spc cluster were found to have an inverse relationship with the halophilicity/halotolerance levels in bacteria and archaea.[12] In non-halophilic bacteria, the S10-spc proteins are generally basic, contrasting with the overall acidic whole proteomes of extreme halophiles. The universal uL2, which lies in the oldest part of the ribosome, is always positively charged irrespective of the strain or organism it belongs to.[12]
Ribosomes in eukaryotes contain 79–80 proteins and four ribosomal RNA (rRNA) molecules. General or specialized chaperones solubilize the ribosomal proteins and facilitate their import into the nucleus. In vivo, assembly of the eukaryotic ribosome appears to be driven by the ribosomal proteins and is also aided by chaperones. Most ribosomal proteins assemble with rRNA co-transcriptionally, becoming associated more stably as assembly proceeds, and the active sites of both subunits are constructed last.[5]
In the past, different nomenclatures were used for the same ribosomal protein in different organisms. Not only were the names inconsistent across domains; they also differed between organisms within a domain, such as humans and S. cerevisiae, both eukaryotes. This was because researchers assigned names before the sequences were known, which caused trouble for later research. The following tables use the unified nomenclature by Ban et al., 2014. The same nomenclature is used by UniProt's "family" curation.[5]
In general, cellular ribosomal proteins are referred to simply by their cross-domain name, e.g. "uL14" for what is currently called L23 in humans. A suffix is used for the organellar versions, so that "uL14m" refers to the human mitochondrial uL14 (MRPL14).[5] Organelle-specific proteins use their own cross-domain prefixes, for example "mS33" for MRPS33[13] and "cL37" for PSRP5.[14] (See the two preceding citations, also partially by Ban N, for the organelle nomenclatures.)
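The naming scheme described above is regular enough to parse mechanically. The following minimal sketch splits a name into prefix, subunit, number and optional organelle suffix. The function and class names are hypothetical, the meanings given to the prefixes "u", "b" and "e" are read off the taxonomic ranges in the tables, and the "c" suffix for chloroplast versions is an assumption by analogy with "m". Names that do not fit the pattern (for example bTHX, RACK1 or P1/P2) are rejected rather than guessed at.

```python
import re
from typing import NamedTuple

# Meanings of the one-letter prefixes (assumed from the tables' ranges).
PREFIXES = {
    "u": "universal",
    "b": "bacterial",
    "e": "archaeal/eukaryotic",
    "m": "mitochondrion-specific",
    "c": "chloroplast-specific",
}
SUBUNITS = {"S": "small", "L": "large"}
LOCATIONS = {"": "non-organellar", "m": "mitochondrial", "c": "chloroplast"}

# prefix, subunit letter, number, optional organelle suffix
NAME_RE = re.compile(r"^([ubemc])([SL])(\d+)([mc]?)$")


class ParsedName(NamedTuple):
    distribution: str
    subunit: str
    number: int
    location: str


def parse_name(name: str) -> ParsedName:
    """Parse a simple cross-domain name such as 'uL14' or 'uL14m'."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a simple cross-domain name: {name!r}")
    prefix, subunit, number, suffix = match.groups()
    # Organelle-specific prefixes imply the location even without a suffix.
    organelle = suffix or (prefix if prefix in ("m", "c") else "")
    return ParsedName(PREFIXES[prefix], SUBUNITS[subunit],
                      int(number), LOCATIONS[organelle])


for example in ("uL14", "uL14m", "mS33", "cL37"):
    print(example, parse_name(example))
```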
Small subunit proteins (taxonomic range: B = bacteria, A = archaea, E = eukaryotes; a hyphen marks an empty cell)
Cross-domain name | Taxonomic range | Bacterial name (E. coli) | Yeast name | Human name
bS1 | B | S1 | - | -
eS1 | A E | - | S1 | S3A
uS2 | B A E | S2 | S0 | SA
uS3 | B A E | S3 | S3 | S3
uS4 | B A E | S4 | S9 | S9
eS4 | A E | - | S4 | S4 (X, Y1, Y2)
uS5 | B A E | S5 | S2 | S2
bS6 | B | S6 | - | -
eS6 | A E | - | S6 | S6
uS7 | B A E | S7 | S5 | S5
eS7 | E | - | S7 | S7
uS8 | B A E | S8 | S22 | S15A
eS8 | A E | - | S8 | S8
uS9 | B A E | S9 | S16 | S16
uS10 | B A E | S10 | S20 | S20
eS10 | E | - | S10 | S10
uS11 | B A E | S11 | S14 | S14
uS12 | B A E | S12 | S23 | S23
eS12 | E | - | S12 | S12
uS13 | B A E | S13 | S18 | S18
uS14 | B A E | S14 | S29 | S29
uS15 | B A E | S15 | S13 | S13
bS16 | B | S16 | - | -
uS17 | B A E | S17 | S11 | S11
eS17 | A E | - | S17 | S17
bS18 | B | S18 | - | -
uS19 | B A E | S19 | S15 | S15
eS19 | A E | - | S19 | S19
bS20 | B | S20 | - | -
bS21 | B | S21 | - | -
bTHX | B | THX (missing from E. coli) | - | -
eS21 | E | - | S21 | S21
eS24 | A E | - | S24 | S24
eS25 | A E | - | S25 | S25
eS26 | E | - | S26 | S26
eS27 | A E | - | S27 | S27
eS28 | A E | - | S28 | S28
eS30 | A E | - | S30 | S30
eS31 | A E | - | S31 | S27A
RACK1 | E | - | Asc1 | RACK1
Large subunit proteins (taxonomic range: B = bacteria, A = archaea, E = eukaryotes; a hyphen marks an empty cell)
Cross-domain name | Taxonomic range | Bacterial name (E. coli) | Yeast name | Human name
uL1 | B A E | L1 | L1 | L10A
uL2 | B A E | L2 | L2 | L8
uL3 | B A E | L3 | L3 | L3
uL4 | B A E | L4 | L4 | L4
uL5 | B A E | L5 | L11 | L11
uL6 | B A E | L6 | L9 | L9
eL6 | E | - | L6 | L6
eL8 | A E | - | L8 | L7A
bL9 | B | L9 | - | -
uL10 | B A E | L10 | P0 | P0
uL11 | B A E | L11 | L12 | L12
bL12 | B | L7/L12 | - | -
uL13 | B A E | L13 | L16 | L13A
eL13 | A E | - | L13 | L13
uL14 | B A E | L14 | L23 | L23
eL14 | A E | - | L14 | L14
uL15 | B A E | L15 | L28 | L27A
eL15 | A E | - | L15 | L15
uL16 | B A E | L16 | L10 | L10
bL17 | B | L17 | - | -
uL18 | B A E | L18 | L5 | L5
eL18 | A E | - | L18 | L18
bL19 | B | L19 | - | -
eL19 | A E | - | L19 | L19
bL20 | B | L20 | - | -
eL20 | E | - | L20 | L18A
bL21 | B | L21 | - | -
eL21 | A E | - | L21 | L21
uL22 | B A E | L22 | L17 | L17
eL22 | E | - | L22 | L22
uL23 | B A E | L23 | L25 | L23A
uL24 | B A E | L24 | L26 | L26
eL24 | A E | - | L24 | L24
bL25 | B | L25 | - | -
bL27 | B | L27 | - | -
eL27 | E | - | L27 | L27
bL28 | B | L28 | - | -
eL28 | E | - | - | L28
uL29 | B A E | L29 | L35 | L35
eL29 | E | - | L29 | L29
uL30 | B A E | L30 | L7 | L7
eL30 | A E | - | L30 | L30
bL31 | B | L31 | - | -
eL31 | A E | - | L31 | L31
bL32 | B | L32 | - | -
eL32 | A E | - | L32 | L32
bL33 | B | L33 | - | -
eL33 | A E | - | L33 | L35A
bL34 | B | L34 | - | -
eL34 | A E | - | L34 | L34
bL35 | B | L35 | - | -
bL36 | B | L36 | - | -
eL36 | E | - | L36 | L36
eL37 | A E | - | L37 | L37
eL38 | A E | - | L38 | L38
eL39 | A E | - | L39 | L39
eL40 | A E | - | L40 | L40
eL41 | A E | - | L41 | L41
eL42 | A E | - | L42 | L36A
eL43 | A E | - | L43 | L37A
P1/P2 | A E | - | P1/P2 (AB) | P1/P2 (αβ)
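To illustrate why a unified nomenclature helps, the sketch below performs a reverse lookup from a legacy name to its cross-domain name. It uses only a four-row excerpt of the large subunit table, reads the three name columns as bacterial (E. coli), yeast and human names, and uses hypothetical variable and function names. The same legacy label can point to different proteins depending on the organism.

```python
# Excerpt of the large subunit table:
# cross-domain name -> (bacterial name, yeast name, human name).
# None marks an empty cell.
ROWS = {
    "uL2": ("L2", "L2", "L8"),
    "eL8": (None, "L8", "L7A"),
    "uL14": ("L14", "L23", "L23"),
    "uL23": ("L23", "L25", "L23A"),
}
COLUMNS = ("bacteria", "yeast", "human")


def cross_domain_names(legacy_name, organism):
    """Return the cross-domain names whose legacy name matches in one column."""
    column = COLUMNS.index(organism)
    return [name for name, legacy in ROWS.items()
            if legacy[column] == legacy_name]


# "L23" refers to different proteins in E. coli and in humans.
print(cross_domain_names("L23", "bacteria"))
print(cross_domain_names("L23", "human"))
# "L8" likewise differs between yeast and humans.
print(cross_domain_names("L8", "yeast"))
print(cross_domain_names("L8", "human"))
```

Returning a list rather than a single name keeps the function honest when a legacy label is absent from the excerpt: the result is simply empty.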