A spliceosome is a large ribonucleoprotein (RNP) complex found primarily within the nucleus of eukaryotic cells. It is assembled from small nuclear RNAs (snRNAs) and numerous proteins: snRNA molecules bind to specific proteins to form small nuclear ribonucleoprotein complexes (snRNPs, pronounced "snurps"), which in turn combine with other snRNPs to form the spliceosome. The spliceosome removes introns from a transcribed pre-mRNA, a type of primary transcript. This process is generally referred to as splicing.[1] An analogy is a film editor, who selectively cuts out irrelevant or incorrect material (equivalent to the introns) from the initial film and sends the cleaned-up version to the director for the final cut.
However, sometimes the RNA within the intron acts as a ribozyme, splicing itself without the use of a spliceosome or protein enzymes.
See also: Splicing (genetics). In 1977, work by the Sharp and Roberts labs revealed that genes of higher organisms are "split" or present in several distinct segments along the DNA molecule.[2] [3] The coding regions of the gene are separated by non-coding DNA that is not involved in protein expression. The split gene structure was found when adenoviral mRNAs were hybridized to endonuclease cleavage fragments of single stranded viral DNA. It was observed that the mRNAs of the mRNA-DNA hybrids contained 5' and 3' tails of non-hydrogen bonded regions. When larger fragments of viral DNAs were used, forked structures of looped out DNA were observed when hybridized to the viral mRNAs. It was realised that the looped out regions, the introns, are excised from the precursor mRNAs in a process Sharp named "splicing". The split gene structure was subsequently found to be common to most eukaryotic genes. Phillip Sharp and Richard J. Roberts were awarded the 1993 Nobel Prize in Physiology or Medicine for the discovery of introns and the splicing process.
Each spliceosome is composed of five small nuclear RNAs (snRNAs) and a range of associated protein factors. When these small RNAs are combined with the protein factors, they make RNA-protein complexes called snRNPs (small nuclear ribonucleoproteins, pronounced "snurps"). The snRNAs that make up the major spliceosome are named U1, U2, U4, U5, and U6, so called because they are rich in uridine, and they participate in several RNA-RNA and RNA-protein interactions.[1]
The assembly of the spliceosome occurs on each pre-mRNA (also known as heterogeneous nuclear RNA, hnRNA) at each exon:intron junction. Pre-mRNA introns contain specific sequence elements that are recognized and utilized during spliceosome assembly. These include the 5' end splice site, the branch point sequence, the polypyrimidine tract, and the 3' end splice site. The spliceosome catalyzes the removal of introns and the ligation of the flanking exons.
Introns typically have a GU nucleotide sequence at the 5' end splice site, and an AG at the 3' end splice site. The 3' splice site can be further defined by a variable length of polypyrimidines, called the polypyrimidine tract (PPT), which serves the dual function of recruiting factors to the 3' splice site and possibly recruiting factors to the branch point sequence (BPS). The BPS contains the conserved adenosine required for the first step of splicing.
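The sequence elements described above can be illustrated with a minimal Python sketch that scans a pre-mRNA string for spans that begin with GU, end with AG, and carry a pyrimidine-rich stretch just upstream of the 3' AG. The function names, the minimum intron length, the window size, and the pyrimidine threshold are all illustrative assumptions rather than values from the literature, and the branch point is not modeled; real splice-site prediction relies on far richer models.

```python
def pyrimidine_fraction(seq):
    """Fraction of C/U residues in an RNA string (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "CU" for base in seq) / len(seq)


def find_candidate_introns(pre_mrna, min_len=20, ppt_window=10, ppt_min=0.7):
    """Return (start, end) pairs of GU...AG spans with a pyrimidine-rich 3' region.

    `end` is exclusive, so pre_mrna[start:end] is the candidate intron.
    All thresholds are illustrative assumptions, not measured values.
    """
    rna = pre_mrna.upper().replace("T", "U")  # accept DNA-style input too
    candidates = []
    for start in range(len(rna) - 1):
        if rna[start:start + 2] != "GU":      # 5' splice site dinucleotide
            continue
        for ag in range(start + min_len - 2, len(rna) - 1):
            if rna[ag:ag + 2] != "AG":        # 3' splice site dinucleotide
                continue
            # Stretch immediately upstream of the AG, standing in for the PPT.
            window = rna[max(start + 2, ag - ppt_window):ag]
            if pyrimidine_fraction(window) >= ppt_min:
                candidates.append((start, ag + 2))
    return candidates
```

Because GU and AG dinucleotides are very common, a scan like this over-predicts heavily on real sequences; it is meant only to make the positional relationship between the 5' splice site, the PPT, and the 3' splice site concrete.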
Many spliceosomal proteins exhibit a zinc-binding motif, which underscores the importance of zinc in the splicing mechanism.[4] [5] [6] The first molecular-resolution reconstruction of the U4/U6.U5 triple small nuclear ribonucleoprotein (tri-snRNP) complex was reported in 2016.[7]
Cryo-EM has been applied extensively by Shi et al. to elucidate the near-atomic and atomic structure of the spliceosome in both yeast[8] and humans.[9] The molecular framework of the spliceosome at near-atomic resolution demonstrates that the Spp42 component of U5 snRNP forms a central scaffold and anchors the catalytic center in yeast. The atomic structure of the human spliceosome illustrates that the step II component Slu7 adopts an extended structure, poised for selection of the 3' splice site. All five metals (assigned as Mg2+) in the yeast complex are preserved in the human complex.
See main article: Alternative splicing. Alternative splicing (the re-combination of different exons) is a major source of genetic diversity in eukaryotes. Splice variants have been used to account for the relatively small number of protein coding genes in the human genome, currently estimated at around 20,000. One particular Drosophila gene, Dscam, has been speculated to be alternatively spliced into 38,000 different mRNAs, assuming all of its exons can splice independently of each other.[10]
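The figure of roughly 38,000 mRNAs follows from multiplying the number of alternatives available at each variable exon cluster, under the stated assumption that the choices are independent. The short Python sketch below shows the arithmetic. The per-cluster counts (12, 48, 33, and 2 alternatives for exons 4, 6, 9, and 17) are the values commonly cited for Drosophila Dscam; they are not given in the text above and should be treated as an assumption, as should the helper name `count_isoforms`.

```python
from math import prod

# Commonly cited numbers of mutually exclusive alternatives per Dscam exon
# cluster (an assumption here; these counts are not given in the text above).
DSCAM_EXON_ALTERNATIVES = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def count_isoforms(alternatives_per_cluster):
    """Number of distinct mRNAs if one alternative is chosen independently per cluster."""
    return prod(alternatives_per_cluster.values())


print(count_isoforms(DSCAM_EXON_ALTERNATIVES))  # 12 * 48 * 33 * 2 = 38016
```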
The model for formation of the spliceosome active site involves an ordered, stepwise assembly of discrete snRNP particles on the pre-mRNA substrate. Initial recognition of the pre-mRNA involves U1 snRNP binding to the 5' end splice site, together with other non-snRNP associated factors, to form the commitment complex, or early (E) complex in mammals.[11] [12] The commitment complex is an ATP-independent complex that commits the pre-mRNA to the splicing pathway.[13] U2 snRNP is recruited to the branch region through interactions with the E complex component U2AF (U2 snRNP auxiliary factor) and possibly U1 snRNP. In an ATP-dependent reaction, U2 snRNP becomes tightly associated with the branch point sequence (BPS) to form complex A. A duplex formed between U2 snRNP and the pre-mRNA branch region bulges out the branch adenosine, specifying it as the nucleophile for the first transesterification.[14]
The presence of a pseudouridine residue in U2 snRNA, nearly opposite the branch site, results in an altered conformation of the RNA-RNA duplex upon U2 snRNP binding. Specifically, the altered structure of the duplex induced by the pseudouridine places the 2' OH of the bulged adenosine in a favorable position for the first step of splicing.[15] The U4/U6.U5 tri-snRNP (see Figure 1) is recruited to the assembling spliceosome to form complex B, and following several rearrangements, complex C is activated for catalysis.[16] [17] It is unclear how the tri-snRNP is recruited to complex A, but this process may be mediated through protein-protein interactions and/or base-pairing interactions between U2 snRNA and U6 snRNA.
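The stepwise assembly described above can be summarized as an ordered data structure. The following minimal Python sketch encodes only the complexes and snRNPs named in the text. The class and function names are hypothetical, ATP dependence is left as `None` where the text does not state it, and the model tracks only recruitment, not any later release of snRNPs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AssemblyStep:
    complex_name: str               # complex formed by this step
    recruited: Tuple[str, ...]      # snRNPs that join at this step
    atp_dependent: Optional[bool]   # None where the text does not say


# Order and ATP requirements follow the description in the text.
ASSEMBLY_PATHWAY = (
    AssemblyStep("E", ("U1",), False),          # commitment (early) complex
    AssemblyStep("A", ("U2",), True),           # U2 binds the branch point
    AssemblyStep("B", ("U4", "U6", "U5"), None),  # tri-snRNP joins
    AssemblyStep("C", (), None),                # rearranged, activated for catalysis
)


def snrnps_recruited_by(complex_name):
    """Return the snRNPs recruited up to and including the named complex."""
    recruited = []
    for step in ASSEMBLY_PATHWAY:
        recruited.extend(step.recruited)
        if step.complex_name == complex_name:
            return recruited
    raise ValueError(f"unknown complex: {complex_name}")
```

For example, `snrnps_recruited_by("A")` returns `["U1", "U2"]`, reflecting that U2 joins only after the U1-containing commitment complex has formed.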
The U5 snRNP interacts with sequences at the 5' and 3' splice sites via the invariant loop of U5 snRNA,[18] and U5 protein components interact with the 3' splice site region.[19]
Upon recruitment of the tri-snRNP, several RNA-RNA rearrangements precede the first catalytic step, and further rearrangements occur in the catalytically active spliceosome. Several of the RNA-RNA interactions are mutually exclusive; however, it is not known what triggers these interactions, nor the order of these rearrangements. The first rearrangement is probably the displacement of U1 snRNP from the 5' splice site and formation of a U6 snRNA interaction. It is known that U1 snRNP is only weakly associated with fully formed spliceosomes,[20] and U1 snRNP is inhibitory to the formation of a U6-5' splice site interaction on a model substrate oligonucleotide containing a short 5' exon and 5' splice site.[21] Binding of U2 snRNP to the branch point sequence (BPS) is one example of an RNA-RNA interaction displacing a protein-RNA interaction. Upon recruitment of U2 snRNP, the branch binding protein SF1 in the commitment complex is displaced, since binding of U2 snRNA and binding of SF1 to this site are mutually exclusive.
Within the U2 snRNA, there are other mutually exclusive rearrangements that occur between competing conformations. For example, in the active form, stem loop IIa is favored; in the inactive form, a mutually exclusive interaction between the loop and a downstream sequence predominates. It is unclear how U4 is displaced from U6 snRNA, although RNA helicases have been implicated in spliceosome assembly and may function to unwind U4/U6 and promote the formation of a U2/U6 snRNA interaction. The interactions of U4/U6 stem loops I and II dissociate, the freed stem loop II region of U6 folds on itself to form an intramolecular stem loop, and U4 is no longer required in further spliceosome assembly. The freed stem loop I region of U6 base pairs with U2 snRNA, forming the U2/U6 helix I. However, the helix I structure is mutually exclusive with the 3' half of an internal 5' stem loop region of U2 snRNA.
See also: Minor spliceosome. Some eukaryotes have a second spliceosome, the so-called minor spliceosome.[22] A group of less abundant snRNAs, U11, U12, U4atac, and U6atac, together with U5, are subunits of the minor spliceosome that splices a rare class of pre-mRNA introns, denoted U12-type. The minor spliceosome is located in the nucleus like its major counterpart,[23] though there are exceptions in some specialised cells including anucleate platelets[24] and the dendroplasm (dendrite cytoplasm) of neuronal cells.[25]