A ribosome binding site, or ribosomal binding site (RBS), is a sequence of nucleotides upstream of the start codon of an mRNA transcript that is responsible for the recruitment of a ribosome during the initiation of translation. The term RBS mostly refers to bacterial sequences, although internal ribosome entry sites (IRES) have been described in mRNAs of eukaryotic cells and of viruses that infect eukaryotes. Ribosome recruitment in eukaryotes is generally mediated by the 5' cap present on eukaryotic mRNAs.
The RBS in prokaryotes is a region upstream of the start codon. This region of the mRNA has the consensus 5'-AGGAGG-3', also called the Shine-Dalgarno (SD) sequence.[1] The complementary sequence (CCUCCU), called the anti-Shine-Dalgarno (ASD) sequence, is contained in the 3' end of the 16S rRNA of the small (30S) ribosomal subunit. Upon encountering the Shine-Dalgarno sequence, the ASD of the ribosome base pairs with it, after which translation is initiated.[2]
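The SD/ASD relationship described above can be checked with a few lines of Python. This is a minimal sketch that considers only Watson-Crick pairing (G-U wobble pairs are ignored); the function name `reverse_complement` is chosen here for illustration and does not come from any particular library.

```python
# Watson-Crick pairing rules for RNA (wobble pairs deliberately ignored).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs antiparallel with `rna`, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


sd_consensus = "AGGAGG"                      # Shine-Dalgarno consensus, 5'->3'
anti_sd = reverse_complement(sd_consensus)   # anti-Shine-Dalgarno, 5'->3'
print(anti_sd)  # CCUCCU
```

Because base pairing is antiparallel, the complement has to be reversed before it is written 5' to 3'; for this particular consensus the result is CCUCCU, as stated in the text.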
Variations of the 5'-AGGAGG-3' sequence have been found in Archaea as highly conserved 5'-GGTG-3' regions, 5 base pairs upstream of the start site. Additionally, some bacterial initiation regions, such as rpsA in E. coli, completely lack identifiable SD sequences.[3]
Prokaryotic ribosomes begin translating the mRNA transcript while it is still being transcribed from DNA, so translation and transcription proceed in parallel. Bacterial mRNAs are usually polycistronic and contain multiple ribosome binding sites. Translation initiation is the most highly regulated step of protein synthesis in prokaryotes.[4]
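The rate of translation depends on two factors: the rate at which a ribosome is recruited to the RBS, and the rate at which a recruited ribosome is able to initiate translation (the translation initiation efficiency).
The RBS sequence affects both of these factors.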
The ribosomal protein S1 binds to adenine-rich sequences upstream of the RBS. Increasing the adenine content upstream of the RBS increases the rate of ribosome recruitment.
The level of complementarity of the mRNA SD sequence to the ribosomal ASD greatly affects the efficiency of translation initiation. Greater complementarity results in higher initiation efficiency.[5] This holds only up to a point: excessive complementarity paradoxically decreases the rate of translation, because the ribosome is then bound too tightly to proceed downstream.
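As a rough illustration of what "level of complementarity" means, the following sketch counts how many positions of a candidate mRNA site can form Watson-Crick pairs with the ASD. The function `count_pairs` and its scoring rule are assumptions made for this example; it is a simple pair count, not a thermodynamic model, and it does not capture the penalty for overly tight binding.

```python
# Allowed Watson-Crick pairs between an mRNA base and an rRNA base.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_pairs(mrna_site: str, asd: str = "CCUCCU") -> int:
    """Count base pairs between an mRNA site and the ASD.

    Both sequences are written 5'->3'. Pairing is antiparallel, so the
    mRNA site is compared against the ASD read in reverse.
    """
    if len(mrna_site) != len(asd):
        raise ValueError("site and ASD must have the same length")
    return sum(
        (m, r) in WATSON_CRICK
        for m, r in zip(mrna_site.upper(), reversed(asd.upper()))
    )


print(count_pairs("AGGAGG"))  # 6: the full consensus pairs at every position
print(count_pairs("AGGAAA"))  # 4: two mismatches at the 3' end
```

A more realistic treatment would estimate the free energy of the mRNA-rRNA duplex rather than count pairs, but the count is enough to rank closely related sites.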
The optimal distance between the RBS and the start codon is variable: it depends on which portion of the SD sequence is encoded in the actual RBS and on its distance to the start site of a consensus SD sequence. Optimal spacing increases the rate of translation initiation once a ribosome has been bound. In one study, the nucleotide composition of the spacer region itself was also found to affect the rate of translation initiation.[6]
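The spacing discussed above can be measured directly on a sequence. The sketch below looks for the window most similar to the SD consensus in the region upstream of a given start codon and reports the spacer length, meaning the number of nucleotides between the end of that window and the start codon. The function `find_sd_spacing`, the 20-nucleotide search window, and the example sequence are all assumptions made for illustration.

```python
def find_sd_spacing(mrna, start_index, motif="AGGAGG", window=20):
    """Locate the best SD-like window upstream of a start codon.

    Returns (motif_position, spacer_length, matches) for the window with
    the most positions identical to `motif`, or None if the upstream
    region is shorter than the motif. Ties go to the 5'-most window.
    """
    mrna = mrna.upper()
    search_from = max(0, start_index - window)
    best = None
    for i in range(search_from, start_index - len(motif) + 1):
        candidate = mrna[i:i + len(motif)]
        matches = sum(a == b for a, b in zip(candidate, motif))
        spacer = start_index - (i + len(motif))
        if best is None or matches > best[2]:
            best = (i, spacer, matches)
    return best


# Hypothetical transcript: leader, SD consensus, 7-nt spacer, start codon.
example = "UUU" + "AGGAGG" + "UUUUUUU" + "AUGGCU"
start = example.index("AUG")
print(find_sd_spacing(example, start))  # (3, 7, 6)
```

Here the consensus begins at position 3, matches at all 6 positions, and is separated from the AUG by 7 nucleotides. When only part of the consensus is present, the reported spacer refers to the best-matching window, which is one reason the optimal distance depends on which portion of the SD sequence the RBS contains.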
Secondary structures formed by the RBS can affect the translational efficiency of mRNA, generally inhibiting translation. These secondary structures are formed by hydrogen bonding between complementary bases within the mRNA and are sensitive to temperature. At a higher-than-usual temperature (~42 °C), the RBS secondary structure in the mRNAs of heat shock proteins melts, allowing ribosomes to bind and initiate translation. This mechanism allows a cell to respond quickly to an increase in temperature.
Ribosome recruitment in eukaryotes happens when the eukaryotic initiation factor eIF4F and poly(A)-binding protein (PABP) recognize the 5'-capped mRNA and recruit the 43S ribosomal complex at that location.[7]
Translation initiation happens following recruitment of the ribosome, at the start codon (AUG) found within the Kozak consensus sequence ACCAUGG. Since the Kozak sequence itself is not involved in the recruitment of the ribosome, it is not considered a ribosome binding site.[8]
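To make the position of the start codon within the Kozak context concrete, the following sketch reports every AUG that sits inside an exact ACCAUGG match. Requiring an exact match is a simplification adopted for this example, since real Kozak contexts are degenerate; the function name `kozak_start_sites` is likewise hypothetical.

```python
KOZAK = "ACCAUGG"  # consensus as given in the text; AUG is the start codon


def kozak_start_sites(mrna: str) -> list:
    """Return the 0-based index of each AUG embedded in an exact Kozak match."""
    mrna = mrna.upper()
    offset = KOZAK.index("AUG")  # position of the start codon inside the motif
    sites = []
    hit = mrna.find(KOZAK)
    while hit != -1:
        sites.append(hit + offset)
        hit = mrna.find(KOZAK, hit + 1)
    return sites


print(kozak_start_sites("GGACCAUGGCU"))  # [5]
```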
Eukaryotic ribosomes can also bind transcripts by a mechanism that does not involve the 5' cap, at a sequence called the internal ribosome entry site. This process does not require the full set of translation initiation factors (which factors are needed depends on the specific IRES) and is commonly found in the translation of viral mRNA.[9]
The identification of RBSs is used to determine the site of translation initiation in an unannotated sequence, a task referred to as N-terminal prediction. It is especially useful when multiple start codons are situated around the potential start site of the protein-coding sequence.[10][11]
Identification of RBSs is particularly difficult because they tend to be highly degenerate.[12] One approach to identifying RBSs in E. coli is to use neural networks.[13] Another approach is the Gibbs sampling method.
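The idea behind N-terminal prediction can be illustrated with a much simpler heuristic than the neural network or Gibbs sampling approaches mentioned above: rank each candidate start codon by how closely its upstream region resembles the SD consensus. The function `rank_start_codons`, the 20-nucleotide window, and the identity-count score are assumptions made for this sketch, and it is not a substitute for the cited methods.

```python
def rank_start_codons(mrna, motif="AGGAGG", window=20, starts=("AUG",)):
    """Rank candidate start codons by upstream similarity to the SD consensus.

    Returns a list of (position, score) tuples, best score first, where
    score is the highest number of positions identical to `motif` in any
    window within `window` nt upstream of the codon.
    """
    mrna = mrna.upper()
    scored = []
    for pos in range(len(mrna) - 2):
        if mrna[pos:pos + 3] not in starts:
            continue
        upstream = mrna[max(0, pos - window):pos]
        best = 0
        for i in range(len(upstream) - len(motif) + 1):
            hits = sum(a == b for a, b in zip(upstream[i:i + len(motif)], motif))
            best = max(best, hits)
        scored.append((pos, best))
    # Highest score first; ties broken by the 5'-most position.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical sequence with two AUGs; only the second has an upstream SD.
sequence = "AUG" + "CCC" + "AGGAGG" + "UUUUUUU" + "AUG" + "GCU"
print(rank_start_codons(sequence))  # [(19, 6), (0, 0)]
```

In this example the AUG at position 19 is preferred because a perfect SD match lies upstream of it, whereas the AUG at position 0 has no upstream sequence at all. Because real RBSs are degenerate, a practical predictor would use a probabilistic or learned model of the motif and its spacing rather than a fixed consensus.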
The Shine-Dalgarno sequence of the prokaryotic RBS was discovered by John Shine and Lynn Dalgarno in 1975.[14] The Kozak consensus sequence was first identified by Marilyn Kozak in 1984,[15] while she was in the Department of Biological Sciences at the University of Pittsburgh.[16]