Group II catalytic intron, D5
Symbol: Intron_gpII
Rfam: RF00029
PDB: 6cih
Extra: Entry contains splicing domain V (D5) and some consensus 3' to it.

Group II catalytic intron, D1-D4
Symbol: group-II-D1D4
Rfam clan: CL00102
PDB: 4fb0
Extra: Entry contains D1-D4, parts 5' to D5.
Group II introns are a large class of self-catalytic ribozymes and mobile genetic elements found within the genes of all three domains of life. Ribozyme activity (e.g., self-splicing) can occur under high-salt conditions in vitro. However, assistance from proteins is required for in vivo splicing.[1] In contrast to group I introns, intron excision occurs in the absence of GTP and involves the formation of a lariat, with an A-residue branchpoint strongly resembling that found in lariats formed during splicing of nuclear pre-mRNA. It is hypothesized that pre-mRNA splicing (see spliceosome) may have evolved from group II introns, given the similar catalytic mechanism as well as the structural similarity of the group II domain V substructure to the U6/U2 extended snRNA.[2] [3] Finally, their ability to insert site-specifically into DNA has been exploited as a tool for biotechnology.[4] For example, group II introns can be modified to make site-specific genome insertions and deliver cargo DNA such as reporter genes or lox sites.[5]
The secondary structure of group II introns is characterized by six typical stem-loop structures, also called domains I to VI (DI to DVI, or D1 to D6). The domains radiate from a central core that brings the 5' and 3' splice junctions into close proximity. The proximal helices of the six domains are connected by a few nucleotides in the central region (linker or joiner sequences). Because of its large size, domain I is further divided into subdomains a, b, c, and d. Group II introns are divided into subgroups IIA, IIB and IIC on the basis of sequence differences, the distance of the bulged adenosine in domain VI (the prospective branch point forming the lariat) from the 3' splice site, and the inclusion or omission of structural elements such as a coordination loop in domain I, which is present in IIB and IIC introns but not IIA.[1] Group II introns also form complex RNA tertiary structures.
Group II introns possess very few conserved nucleotides, and the nucleotides important for catalytic function are spread over the complete intron structure. The few strictly conserved primary sequences are the consensus at the 5' and 3' splice sites (...↓GUGYG&... and ...AY↓..., with the Y representing a pyrimidine), some of the nucleotides of the central core (joiner sequences), a relatively high number of nucleotides of DV and some short sequence stretches of DI. The unpaired adenosine in DVI (marked by an asterisk in the figure and located 7 or 8 nt away from the 3' splice site) is also conserved and plays a central role in the splicing process. The 2' hydroxyl of the bulged adenosine attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. This results in a branched intron lariat connected by a 2'-5' phosphodiester linkage at the DVI adenosine.
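The boundary consensus described above can be illustrated with a short Python sketch. It is a toy check, not a real intron finder: the function names are hypothetical, the patterns cover only the GUGYG and AY motifs named in the text, and the distance convention (the last intron nucleotide counted as position 1) is an assumption made for illustration.

```python
import re

# IUPAC code Y = pyrimidine, i.e. C or U in RNA.
FIVE_PRIME = re.compile(r"^GUG[CU]G")  # intron begins with GUGYG
THREE_PRIME = re.compile(r"A[CU]$")    # intron ends with AY


def normalize(seq: str) -> str:
    """Upper-case the sequence and convert DNA-style T to RNA-style U."""
    return seq.upper().replace("T", "U")


def check_boundaries(intron: str) -> dict:
    """Report whether an intron sequence matches the two boundary motifs."""
    seq = normalize(intron)
    return {
        "five_prime_GUGYG": bool(FIVE_PRIME.search(seq)),
        "three_prime_AY": bool(THREE_PRIME.search(seq)),
    }


def candidate_branch_adenosines(intron: str, distances=(7, 8)) -> list:
    """Return 0-based indices of A residues 7 or 8 nt from the 3' end.

    Assumption: distance d means the d-th nucleotide counting back from
    the 3' end of the intron, with the last intron nucleotide as 1.
    """
    seq = normalize(intron)
    hits = []
    for d in distances:
        i = len(seq) - d
        if i >= 0 and seq[i] == "A":
            hits.append(i)
    return hits


# Made-up toy sequence, far shorter than any real group II intron.
toy = "GUGCGCCCCCAGGGGAC"
print(check_boundaries(toy))
print(candidate_branch_adenosines(toy))
```

A real search would rely on covariance models of the secondary structure (as Rfam does) rather than on these short motifs, which occur frequently by chance.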
Protein machinery is required for splicing in vivo. Long-range intron-intron and intron-exon interactions are important for splice site positioning, as are a number of tertiary contacts between motifs, including kissing-loop and tetraloop-receptor interactions. In 2005, A. De Lencastre et al. found that during splicing of group II introns, all reactants are preorganized before the initiation of splicing. The branch site, both exons, the catalytically essential regions of DV and J2/3, and ε−ε' are in close proximity before the first step of splicing occurs. In addition to the bulge and AGC triad regions of DV, the J2/3 linker region, the ε−ε' nucleotides and the coordination loop in DI are crucial for the architecture and function of the active site.[6]
The first crystal structure of a group II intron was solved in 2008 for the Oceanobacillus iheyensis group IIC catalytic intron, and was joined by the Pylaiella littoralis (P.li.LSUI2) group IIB intron in 2014. Attempts have been made to model the tertiary structure of other group II introns, such as the ai5γ group IIB intron, using a combination of programs for homology mapping onto known structures and de novo modeling of previously unresolved regions.[7] Group IIC introns are characterized by a CGC catalytic triad, while group IIA and group IIB introns have an AGC catalytic triad, which is more similar to the catalytic triad of the spliceosome. Group IIC introns are also believed to be smaller, more reactive and more ancient. The first step of splicing in group IIC introns is carried out by water, and the excised intron forms a linear structure instead of a lariat.[8]
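The triad distinction above can be written as a small lookup. This is only a sketch of the rule as stated in the text; the function name is hypothetical, and actual subgroup assignment depends on many additional sequence and structural features.

```python
def subgroups_for_triad(triad: str) -> list:
    """Map a domain V catalytic triad to the subgroups the text associates with it."""
    # Accept DNA-style input by converting T to U (no effect on CGC or AGC).
    key = triad.upper().replace("T", "U")
    table = {
        "CGC": ["IIC"],         # group IIC introns carry a CGC triad
        "AGC": ["IIA", "IIB"],  # group IIA and IIB introns carry an AGC triad
    }
    # An unrecognized triad yields an empty list rather than a guess.
    return table.get(key, [])


print(subgroups_for_triad("CGC"))
print(subgroups_for_triad("agc"))
```

Returning a list keeps the ambiguity explicit: an AGC triad alone cannot separate IIA from IIB.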
Group II introns are found in rRNA, tRNA, and mRNA of organelles (chloroplasts and mitochondria) in fungi, plants, and protists, and also in mRNA in bacteria. The first intron to be identified as distinct from group I was the ai5γ group IIB intron, which was isolated in 1986 from a pre-mRNA transcript of the oxi 3 mitochondrial gene of Saccharomyces cerevisiae.[9]
A subset of group II introns encode essential splicing proteins, known as intron-encoded proteins or IEPs, in intronic ORFs. As a result, these introns can be up to 3 kb long. Splicing occurs in almost identical fashion to nuclear pre-mRNA splicing: both proceed through two transesterification steps, and both use magnesium ions to stabilize the leaving group in each step. This has led some to theorize a phylogenetic link between group II introns and the nuclear spliceosome. Further evidence for this link includes structural similarity between the U2/U6 junction of spliceosomal RNA and domain V of group II introns, which contains the catalytic AGC triad and much of the heart of the active site, as well as parity between conserved 5' and 3' end sequences.[10]
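The two transesterification steps can be sketched as string operations in Python. This is a deliberately simplified toy model under stated assumptions: the class and function names are hypothetical, sequences are plain strings, and chemistry, metal ions and structure are ignored. It only tracks which pieces end up joined.

```python
from dataclasses import dataclass


@dataclass
class SplicingProducts:
    ligated_exons: str  # 5' exon joined directly to the 3' exon
    lariat_loop: str    # intron from its 5' end through the branch adenosine
    lariat_tail: str    # intron nucleotides 3' of the branch adenosine


def splice(exon5: str, intron: str, exon3: str, branch_index: int) -> SplicingProducts:
    """Toy model of branching followed by exon ligation."""
    if intron[branch_index] != "A":
        raise ValueError("branch point must be an adenosine")

    # Step 1: the 2'-OH of the branch adenosine attacks the 5' splice site.
    # The 5' exon is released with a free 3'-OH; the intron 5' end is now
    # linked to the branch adenosine (lariat intermediate, still carrying exon3).
    free_exon5 = exon5
    intermediate = intron + exon3

    # Step 2: the 3'-OH of the free 5' exon attacks the 3' splice site,
    # joining the exons and releasing the intron lariat.
    ligated = free_exon5 + intermediate[len(intron):]
    loop = intron[: branch_index + 1]
    tail = intron[branch_index + 1:]
    return SplicingProducts(ligated, loop, tail)


# Made-up toy sequences; the branch adenosine sits at index 10 of the intron.
products = splice("AAUU", "GUGCGCCCCCAGGGGAC", "GGCC", branch_index=10)
print(products)
```

Splitting the intron into a loop and a tail mirrors the lariat shape: the loop is closed by the branch linkage, and the tail is the short stretch between the branch point and the 3' splice site.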
Many of these IEPs, including LtrA, share a reverse transcriptase (RT) domain and a "Domain X". Maturase K (MatK), found in plant chloroplasts, is a protein somewhat similar to those intron-encoded proteins. It is required for in vivo splicing of group II introns, and can be found in chloroplastic introns or in the nuclear genome. Its RT domain is degenerate.[11]
Group II IEPs share a related conserved domain, known as either "Domain X" in organelles or "GIIM" in bacteria, that is not found in other retroelements.[12] [13] Domain X is essential for splicing in yeast mitochondria.[14] This domain may be responsible for recognizing and binding to intron RNA[13] or DNA.[15]