In molecular biology, messenger ribonucleic acid (mRNA) is a single-stranded molecule of RNA that corresponds to the genetic sequence of a gene, and is read by a ribosome in the process of synthesizing a protein.
mRNA is created during the process of transcription, where an enzyme (RNA polymerase) converts the gene into primary transcript mRNA (also known as pre-mRNA). This pre-mRNA usually still contains introns, regions that will not go on to code for the final amino acid sequence. These are removed in the process of RNA splicing, leaving only exons, regions that will encode the protein. This exon sequence constitutes mature mRNA. Mature mRNA is then read by the ribosome, and the ribosome creates the protein utilizing amino acids carried by transfer RNA (tRNA). This process is known as translation. All of these processes form part of the central dogma of molecular biology, which describes the flow of genetic information in a biological system.
As in DNA, genetic information in mRNA is contained in the sequence of nucleotides, which are arranged into codons consisting of three ribonucleotides each. Each codon codes for a specific amino acid, except the stop codons, which terminate protein synthesis. The translation of codons into amino acids requires two other types of RNA: transfer RNA, which recognizes the codon and provides the corresponding amino acid, and ribosomal RNA (rRNA), the central component of the ribosome's protein-manufacturing machinery.
The concept of mRNA was developed by Sydney Brenner and Francis Crick in 1960 during a conversation with François Jacob. In 1961, mRNA was identified and described independently by one team consisting of Brenner, Jacob, and Matthew Meselson, and another team led by James Watson. While analyzing the data in preparation for publication, Jacob and Jacques Monod coined the name "messenger RNA".
The brief existence of an mRNA molecule begins with transcription, and ultimately ends in degradation. During its life, an mRNA molecule may also be processed, edited, and transported prior to translation. Eukaryotic mRNA molecules often require extensive processing and transport, while prokaryotic mRNA molecules do not. A molecule of eukaryotic mRNA and the proteins surrounding it are together called a messenger RNP.
See main article: Transcription (genetics). Transcription is when RNA is copied from DNA. During transcription, RNA polymerase makes a copy of a gene from the DNA to mRNA as needed. This process differs slightly in eukaryotes and prokaryotes. One notable difference is that prokaryotic RNA polymerase associates with DNA-processing enzymes during transcription so that processing can proceed during transcription. Therefore, this causes the new mRNA strand to become double stranded by producing a complementary strand known as the tRNA strand, which when combined are unable to form structures from base-pairing. Moreover, the template for mRNA is the complementary strand of tRNA, which is identical in sequence to the anticodon sequence that the DNA binds to. The short-lived, unprocessed or partially processed product is termed precursor mRNA, or pre-mRNA; once completely processed, it is termed mature mRNA.
mRNA uses uracil (U) instead of thymine (T) in DNA. uracil (U) is the complementary base to adenine (A) during transcription instead of thymine (T). Thus, when using a template strand of DNA to build RNA, thymine is replaced with uracil. This substitution allows the mRNA to carry the appropriate genetic information from DNA to the ribosome for translation. Regarding the natural history, uracil came first then thymine; evidence suggests that RNA came before DNA in evolution.[1] The RNA World hypothesis proposes that life began with RNA molecules, before the emergence of DNA genomes and coded proteins. In DNA, the evolutionary substitution of thymine for uracil may have increased DNA stability and improved the efficiency of DNA replication.[2] [3]
See main article: Post-transcriptional modification. Processing of mRNA differs greatly among eukaryotes, bacteria, and archaea. Non-eukaryotic mRNA is, in essence, mature upon transcription and requires no processing, except in rare cases.[4] Eukaryotic pre-mRNA, however, requires several processing steps before its transport to the cytoplasm and its translation by the ribosome.
See main article: RNA splicing. The extensive processing of eukaryotic pre-mRNA that leads to the mature mRNA is the RNA splicing, a mechanism by which introns or outrons (non-coding regions) are removed and exons (coding regions) are joined.
See main article: 5' cap. A 5' cap (also termed an RNA cap, an RNA 7-methylguanosine cap, or an RNA m7G cap) is a modified guanine nucleotide that has been added to the "front" or 5' end of a eukaryotic messenger RNA shortly after the start of transcription. The 5' cap consists of a terminal 7-methylguanosine residue that is linked through a 5'-5'-triphosphate bond to the first transcribed nucleotide. Its presence is critical for recognition by the ribosome and protection from RNases.
Cap addition is coupled to transcription, and occurs co-transcriptionally, such that each influences the other. Shortly after the start of transcription, the 5' end of the mRNA being synthesized is bound by a cap-synthesizing complex associated with RNA polymerase. This enzymatic complex catalyzes the chemical reactions that are required for mRNA capping. Synthesis proceeds as a multi-step biochemical reaction.
In some instances, an mRNA will be edited, changing the nucleotide composition of that mRNA. An example in humans is the apolipoprotein B mRNA, which is edited in some tissues, but not others. The editing creates an early stop codon, which, upon translation, produces a shorter protein.
See main article: Polyadenylation. Polyadenylation is the covalent linkage of a polyadenylyl moiety to a messenger RNA molecule. In eukaryotic organisms most messenger RNA (mRNA) molecules are polyadenylated at the 3' end, but recent studies have shown that short stretches of uridine (oligouridylation) are also common.[5] The poly(A) tail and the protein bound to it aid in protecting mRNA from degradation by exonucleases. Polyadenylation is also important for transcription termination, export of the mRNA from the nucleus, and translation. mRNA can also be polyadenylated in prokaryotic organisms, where poly(A) tails act to facilitate, rather than impede, exonucleolytic degradation.
Polyadenylation occurs during and/or immediately after transcription of DNA into RNA. After transcription has been terminated, the mRNA chain is cleaved through the action of an endonuclease complex associated with RNA polymerase. After the mRNA has been cleaved, around 250 adenosine residues are added to the free 3' end at the cleavage site. This reaction is catalyzed by polyadenylate polymerase. Just as in alternative splicing, there can be more than one polyadenylation variant of an mRNA.
Polyadenylation site mutations also occur. The primary RNA transcript of a gene is cleaved at the poly-A addition site, and 100–200 A's are added to the 3' end of the RNA. If this site is altered, an abnormally long and unstable mRNA construct will be formed.
Another difference between eukaryotes and prokaryotes is mRNA transport. Because eukaryotic transcription and translation is compartmentally separated, eukaryotic mRNAs must be exported from the nucleus to the cytoplasm—a process that may be regulated by different signaling pathways.[6] Mature mRNAs are recognized by their processed modifications and then exported through the nuclear pore by binding to the cap-binding proteins CBP20 and CBP80,[7] as well as the transcription/export complex (TREX).[8] [9] Multiple mRNA export pathways have been identified in eukaryotes.[10]
In spatially complex cells, some mRNAs are transported to particular subcellular destinations. In mature neurons, certain mRNA are transported from the soma to dendrites. One site of mRNA translation is at polyribosomes selectively localized beneath synapses.[11] The mRNA for Arc/Arg3.1 is induced by synaptic activity and localizes selectively near active synapses based on signals generated by NMDA receptors.[12] Other mRNAs also move into dendrites in response to external stimuli, such as β-actin mRNA.[13] For export from the nucleus, actin mRNA associates with ZBP1[14] and later with 40S subunit. The complex is bound by a motor protein and is transported to the target location (neurite extension) along the cytoskeleton. Eventually ZBP1 is phosphorylated by Src in order for translation to be initiated.[15] In developing neurons, mRNAs are also transported into growing axons and especially growth cones. Many mRNAs are marked with so-called "zip codes", which target their transport to a specific location.[16] [17] mRNAs can also transfer between mammalian cells through structures called tunneling nanotubes.[18] [19]
See main article: Translation (biology). Because prokaryotic mRNA does not need to be processed or transported, translation by the ribosome can begin immediately after the end of transcription. Therefore, it can be said that prokaryotic translation is coupled to transcription and occurs co-transcriptionally.
Eukaryotic mRNA that has been processed and transported to the cytoplasm (i.e., mature mRNA) can then be translated by the ribosome. Translation may occur at ribosomes free-floating in the cytoplasm, or directed to the endoplasmic reticulum by the signal recognition particle. Therefore, unlike in prokaryotes, eukaryotic translation is not directly coupled to transcription. It is even possible in some contexts that reduced mRNA levels are accompanied by increased protein levels, as has been observed for mRNA/protein levels of EEF1A1 in breast cancer.[20]
See main article: Coding region. Coding regions are composed of codons, which are decoded and translated into proteins by the ribosome; in eukaryotes usually into one and in prokaryotes usually into several. Coding regions begin with the start codon and end with a stop codon. In general, the start codon is an AUG triplet and the stop codon is UAG ("amber"), UAA ("ochre"), or UGA ("opal"). The coding regions tend to be stabilised by internal base pairs; this impedes degradation.[21] [22] In addition to being protein-coding, portions of coding regions may serve as regulatory sequences in the pre-mRNA as exonic splicing enhancers or exonic splicing silencers.
See main article: 5' UTR and 3' UTR. Untranslated regions (UTRs) are sections of the mRNA before the start codon and after the stop codon that are not translated, termed the five prime untranslated region (5' UTR) and three prime untranslated region (3' UTR), respectively. These regions are transcribed with the coding region and thus are exonic as they are present in the mature mRNA. Several roles in gene expression have been attributed to the untranslated regions, including mRNA stability, mRNA localization, and translational efficiency. The ability of a UTR to perform these functions depends on the sequence of the UTR and can differ between mRNAs. Genetic variants in 3' UTR have also been implicated in disease susceptibility because of the change in RNA structure and protein translation.[23]
The stability of mRNAs may be controlled by the 5' UTR and/or 3' UTR due to varying affinity for RNA degrading enzymes called ribonucleases and for ancillary proteins that can promote or inhibit RNA degradation. (See also, C-rich stability element.)
Translational efficiency, including sometimes the complete inhibition of translation, can be controlled by UTRs. Proteins that bind to either the 3' or 5' UTR may affect translation by influencing the ribosome's ability to bind to the mRNA. MicroRNAs bound to the 3' UTR also may affect translational efficiency or mRNA stability.
Cytoplasmic localization of mRNA is thought to be a function of the 3' UTR. Proteins that are needed in a particular region of the cell can also be translated there; in such a case, the 3' UTR may contain sequences that allow the transcript to be localized to this region for translation.
Some of the elements contained in untranslated regions form a characteristic secondary structure when transcribed into RNA. These structural mRNA elements are involved in regulating the mRNA. Some, such as the SECIS element, are targets for proteins to bind. One class of mRNA element, the riboswitches, directly bind small molecules, changing their fold to modify levels of transcription or translation. In these cases, the mRNA regulates itself.
See main article: Polyadenylation. The 3' poly(A) tail is a long sequence of adenine nucleotides (often several hundred) added to the 3' end of the pre-mRNA. This tail promotes export from the nucleus and translation, and protects the mRNA from degradation.
See also: Cistron. An mRNA molecule is said to be monocistronic when it contains the genetic information to translate only a single protein chain (polypeptide). This is the case for most of the eukaryotic mRNAs.[24] [25] On the other hand, polycistronic mRNA carries several open reading frames (ORFs), each of which is translated into a polypeptide. These polypeptides usually have a related function (they often are the subunits composing a final complex protein) and their coding sequence is grouped and regulated together in a regulatory region, containing a promoter and an operator. Most of the mRNA found in bacteria and archaea is polycistronic,[24] as is the human mitochondrial genome.[26] Dicistronic or bicistronic mRNA encodes only two proteins.
In eukaryotes mRNA molecules form circular structures due to an interaction between the eIF4E and poly(A)-binding protein, which both bind to eIF4G, forming an mRNA-protein-mRNA bridge.[27] Circularization is thought to promote cycling of ribosomes on the mRNA leading to time-efficient translation, and may also function to ensure only intact mRNA are translated (partially degraded mRNA characteristically have no m7G cap, or no poly-A tail).[28]
Other mechanisms for circularization exist, particularly in virus mRNA. Poliovirus mRNA uses a cloverleaf section towards its 5' end to bind PCBP2, which binds poly(A)-binding protein, forming the familiar mRNA-protein-mRNA circle. Barley yellow dwarf virus has binding between mRNA segments on its 5' end and 3' end (called kissing stem loops), circularizing the mRNA without any proteins involved.
RNA virus genomes (the + strands of which are translated as mRNA) are also commonly circularized.[29] During genome replication the circularization acts to enhance genome replication speeds, cycling viral RNA-dependent RNA polymerase much the same as the ribosome is hypothesized to cycle.
Different mRNAs within the same cell have distinct lifetimes (stabilities). In bacterial cells, individual mRNAs can survive from seconds to more than an hour. However, the lifetime averages between 1 and 3 minutes, making bacterial mRNA much less stable than eukaryotic mRNA.[30] In mammalian cells, mRNA lifetimes range from several minutes to days.[31] The greater the stability of an mRNA the more protein may be produced from that mRNA. The limited lifetime of mRNA enables a cell to alter protein synthesis rapidly in response to its changing needs. There are many mechanisms that lead to the destruction of an mRNA, some of which are described below.
In general, in prokaryotes the lifetime of mRNA is much shorter than in eukaryotes. Prokaryotes degrade messages by using a combination of ribonucleases, including endonucleases, 3' exonucleases, and 5' exonucleases. In some instances, small RNA molecules (sRNA) tens to hundreds of nucleotides long can stimulate the degradation of specific mRNAs by base-pairing with complementary sequences and facilitating ribonuclease cleavage by RNase III. It was recently shown that bacteria also have a sort of 5' cap consisting of a triphosphate on the 5' end.[32] Removal of two of the phosphates leaves a 5' monophosphate, causing the message to be destroyed by the exonuclease RNase J, which degrades 5' to 3'.
Inside eukaryotic cells, there is a balance between the processes of translation and mRNA decay. Messages that are being actively translated are bound by ribosomes, the eukaryotic initiation factors eIF-4E and eIF-4G, and poly(A)-binding protein. eIF-4E and eIF-4G block the decapping enzyme (DCP2), and poly(A)-binding protein blocks the exosome complex, protecting the ends of the message. The balance between translation and decay is reflected in the size and abundance of cytoplasmic structures known as P-bodies.[33] The poly(A) tail of the mRNA is shortened by specialized exonucleases that are targeted to specific messenger RNAs by a combination of cis-regulatory sequences on the RNA and trans-acting RNA-binding proteins. Poly(A) tail removal is thought to disrupt the circular structure of the message and destabilize the cap binding complex. The message is then subject to degradation by either the exosome complex or the decapping complex. In this way, translationally inactive messages can be destroyed quickly, while active messages remain intact. The mechanism by which translation stops and the message is handed-off to decay complexes is not understood in detail.
The presence of AU-rich elements in some mammalian mRNAs tends to destabilize those transcripts through the action of cellular proteins that bind these sequences and stimulate poly(A) tail removal. Loss of the poly(A) tail is thought to promote mRNA degradation by facilitating attack by both the exosome complex[34] and the decapping complex.[35] Rapid mRNA degradation via AU-rich elements is a critical mechanism for preventing the overproduction of potent cytokines such as tumor necrosis factor (TNF) and granulocyte-macrophage colony stimulating factor (GM-CSF).[36] AU-rich elements also regulate the biosynthesis of proto-oncogenic transcription factors like c-Jun and c-Fos.[37]
See main article: Nonsense-mediated decay.
Eukaryotic messages are subject to surveillance by nonsense-mediated decay (NMD), which checks for the presence of premature stop codons (nonsense codons) in the message. These can arise via incomplete splicing, V(D)J recombination in the adaptive immune system, mutations in DNA, transcription errors, leaky scanning by the ribosome causing a frame shift, and other causes. Detection of a premature stop codon triggers mRNA degradation by 5' decapping, 3' poly(A) tail removal, or endonucleolytic cleavage.[38]
See main article: siRNA.
In metazoans, small interfering RNAs (siRNAs) processed by Dicer are incorporated into a complex known as the RNA-induced silencing complex or RISC. This complex contains an endonuclease that cleaves perfectly complementary messages to which the siRNA binds. The resulting mRNA fragments are then destroyed by exonucleases. siRNA is commonly used in laboratories to block the function of genes in cell culture. It is thought to be part of the innate immune system as a defense against double-stranded RNA viruses.[39]
MicroRNAs (miRNAs) are small RNAs that typically are partially complementary to sequences in metazoan messenger RNAs.[40] [41] Binding of a miRNA to a message can repress translation of that message and accelerate poly(A) tail removal, thereby hastening mRNA degradation. The mechanism of action of miRNAs is the subject of active research.[42] [43]
There are other ways by which messages can be degraded, including non-stop decay and silencing by Piwi-interacting RNA (piRNA), among others.
See also: mRNA vaccine and RNA therapeutics. The administration of a nucleoside-modified messenger RNA sequence can cause a cell to make a protein, which in turn could directly treat a disease or could function as a vaccine; more indirectly the protein could drive an endogenous stem cell to differentiate in a desired way.[44] [45]
The primary challenges of RNA therapy center on delivering the RNA to the appropriate cells.[46] Challenges include the fact that naked RNA sequences naturally degrade after preparation; they may trigger the body's immune system to attack them as an invader; and they are impermeable to the cell membrane.[45] Once within the cell, they must then leave the cell's transport mechanism to take action within the cytoplasm, which houses the necessary ribosomes.[44]
Overcoming these challenges, mRNA as a therapeutic was first put forward in 1989 "after the development of a broadly applicable in vitro transfection technique."[47] In the 1990s, mRNA vaccines for personalized cancer have been developed, relying on non-nucleoside modified mRNA. mRNA based therapies continue to be investigated as a method of treatment or therapy for both cancer as well as auto-immune, metabolic, and respiratory inflammatory diseases. Gene editing therapies such as CRISPR may also benefit from using mRNA to induce cells to make the desired Cas protein.[48]
Since the 2010s, RNA vaccines and other RNA therapeutics have been considered to be "a new class of drugs". The first mRNA-based vaccines received restricted authorization and were rolled out across the world during the COVID-19 pandemic by Pfizer–BioNTech COVID-19 vaccine and Moderna, for example.[49] The 2023 Nobel Prize in Physiology or Medicine was awarded to Katalin Karikó and Drew Weissman for the development of effective mRNA vaccines against COVID-19.[50] [51] [52]
Several molecular biology studies during the 1950s indicated that RNA played some kind of role in protein synthesis, but that role was not clearly understood. For instance, in one of the earliest reports, Jacques Monod and his team showed that RNA synthesis was necessary for protein synthesis, specifically during the production of the enzyme β-galactosidase in the bacterium E. coli.[53] Arthur Pardee also found similar RNA accumulation in 1954.[54] In 1953, Alfred Hershey, June Dixon, and Martha Chase described a certain cytosine-containing DNA (indicating it was RNA) that disappeared quickly after its synthesis in E. coli.[55] In hindsight, this may have been one of the first observations of the existence of mRNA but it was not recognized at the time as such.[56]
The idea of mRNA was first conceived by Sydney Brenner and Francis Crick on 15 April 1960 at King's College, Cambridge, while François Jacob was telling them about a recent experiment conducted by Arthur Pardee, himself, and Monod (the so-called PaJaMo experiment, which did not prove mRNA existed but suggested the possibility of its existence). With Crick's encouragement, Brenner and Jacob immediately set out to test this new hypothesis, and they contacted Matthew Meselson at the California Institute of Technology for assistance. During the summer of 1960, Brenner, Jacob, and Meselson conducted an experiment in Meselson's laboratory at Caltech which was the first to prove the existence of mRNA. That fall, Jacob and Monod coined the name "messenger RNA" and developed the first theoretical framework to explain its function.
In February 1961, James Watson revealed that his Harvard-based research group had been right behind them with a series of experiments whose results pointed in roughly the same direction. Brenner and the others agreed to Watson's request to delay publication of their research findings. As a result, the Brenner and Watson articles were published simultaneously in the same issue of Nature in May 1961, while that same month, Jacob and Monod published their theoretical framework for mRNA in the Journal of Molecular Biology.