Intron Explained

An intron is any nucleotide sequence within a gene that is not expressed or operative in the final RNA product. The word intron is derived from the term intragenic region, i.e., a region inside a gene.[1] The term intron refers to both the DNA sequence within a gene and the corresponding RNA sequence in RNA transcripts.[2] The non-intron sequences that are joined together by RNA splicing to form the mature RNA are called exons.[3]

Introns are found in the genes of most eukaryotes and many eukaryotic viruses and they can be located in both protein-coding genes and genes that function as RNA (noncoding genes). There are four main types of introns: tRNA introns, group I introns, group II introns, and spliceosomal introns (see below). Introns are rare in Bacteria and Archaea (prokaryotes).

Discovery and etymology

Introns were first discovered in protein-coding genes of adenovirus,[4] [5] and were subsequently identified in genes encoding transfer RNA and ribosomal RNA. Introns are now known to occur within a wide variety of genes in organisms from all of the biological kingdoms, including bacteria,[6] and in viruses.

The fact that genes were split or interrupted by introns was discovered independently in 1977 by Phillip Allen Sharp and Richard J. Roberts, for which they shared the Nobel Prize in Physiology or Medicine in 1993,[7] though the prize excluded Susan Berget and Louise Chow, the researchers in their labs who carried out the experiments that led to the discovery.[8] [9] The term intron was introduced by American biochemist Walter Gilbert:[1]

"The notion of the cistron [i.e., gene] ... must be replaced by that of a transcription unit containing regions which will be lost from the mature messenger – which I suggest we call introns (for intragenic regions) – alternating with regions which will be expressed – exons." (Gilbert 1978)

The term intron also refers to intracistron, i.e., an additional piece of DNA that arises within a cistron.[10]

Although introns are sometimes called intervening sequences,[11] the term "intervening sequence" can refer to any of several families of internal nucleic acid sequences that are not present in the final gene product, including inteins, untranslated regions (UTR), and nucleotides removed by RNA editing, in addition to introns.

Distribution

The frequency of introns varies widely among the genomes of different organisms. For example, introns are extremely common within the nuclear genome of jawed vertebrates (e.g. humans, mice, and pufferfish (fugu)), where protein-coding genes almost always contain multiple introns, while introns are rare within the nuclear genes of some eukaryotic microorganisms,[12] for example baker's/brewer's yeast (Saccharomyces cerevisiae). In contrast, the mitochondrial genomes of vertebrates are entirely devoid of introns, while those of eukaryotic microorganisms may contain many introns.[13]

A particularly extreme case is the Drosophila dhc7 gene, which contains a ≥3.6 megabase (Mb) intron that takes roughly three days to transcribe.[14] [15] At the other extreme, a 2015 study suggests that the shortest known metazoan intron is 30 base pairs (bp) long and belongs to the human MST1L gene.[16] The shortest known introns belong to the heterotrich ciliates, such as Stentor coeruleus, in which most (> 95%) introns are 15 or 16 bp long.[17]
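As a rough check on the dhc7 figures, the average elongation rate they imply can be worked out directly. The sketch below assumes exactly 3.6 Mb (the text gives this only as a lower bound), exactly three days, and uninterrupted transcription; the function name is illustrative.

```python
def implied_elongation_rate(intron_bp: float, days: float) -> float:
    """Average nucleotides transcribed per second, assuming no pausing."""
    seconds = days * 24 * 60 * 60
    return intron_bp / seconds


# Assumed inputs: 3.6 Mb intron, three days of continuous transcription.
rate = implied_elongation_rate(3.6e6, 3.0)
print(f"{rate:.1f} nt/s, about {rate * 60:.0f} nt/min")
```

Under these assumptions the polymerase would need to average a little under 14 nucleotides per second across the intron.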

Classification

Splicing of all intron-containing RNA molecules is superficially similar. However, different types of introns have been identified through the examination of intron structure by DNA sequence analysis, together with genetic and biochemical analysis of RNA splicing reactions. At least four distinct classes of introns have been identified: spliceosomal introns, tRNA introns, group I introns, and group II introns.

Group III introns are proposed to be a fifth family, but little is known about the biochemical apparatus that mediates their splicing. They appear to be related to group II introns, and possibly to spliceosomal introns.[18]

Spliceosomal introns

Nuclear pre-mRNA introns (spliceosomal introns) are characterized by specific intron sequences located at the boundaries between introns and exons.[19] These sequences are recognized by spliceosomal RNA molecules when the splicing reactions are initiated.[20] In addition, they contain a branch point, a particular nucleotide sequence near the 3' end of the intron that becomes covalently linked to the 5' end of the intron during the splicing process, generating a branched intron. Apart from these three short conserved elements, nuclear pre-mRNA intron sequences are highly variable. Nuclear pre-mRNA introns are often much longer than their surrounding exons.

tRNA introns

Transfer RNA introns that depend upon proteins for removal occur at a specific location within the anticodon loop of unspliced tRNA precursors, and are removed by a tRNA splicing endonuclease. The exons are then linked together by a second protein, the tRNA splicing ligase.[21] Note that self-splicing introns are also sometimes found within tRNA genes.[22]

Group I and group II introns

Group I and group II introns are found in genes encoding proteins (messenger RNA), transfer RNA and ribosomal RNA in a very wide range of living organisms.[23] [24] Following transcription into RNA, group I and group II introns make extensive internal interactions that allow them to fold into a specific, complex three-dimensional architecture. These complex architectures allow some group I and group II introns to be self-splicing, that is, the intron-containing RNA molecule can rearrange its own covalent structure so as to precisely remove the intron and link the exons together in the correct order. In some cases, particular intron-binding proteins are involved in splicing, assisting the intron in folding into the three-dimensional structure that is necessary for self-splicing activity. Group I and group II introns are distinguished by different sets of internal conserved sequences and folded structures, and by the fact that splicing of RNA molecules containing group II introns generates branched introns (like spliceosomal introns), while group I introns use a non-encoded guanosine nucleotide (typically GTP) to initiate splicing, adding it on to the 5'-end of the excised intron.

On the accuracy of splicing

The spliceosome is a very complex structure containing up to one hundred proteins and five different RNAs. The substrate of the reaction is a long RNA molecule and the transesterification reactions catalyzed by the spliceosome require the bringing together of sites that may be thousands of nucleotides apart.[25] [26] All biochemical reactions are associated with known error rates and the more complicated the reaction the higher the error rate. Therefore, it is not surprising that the splicing reaction catalyzed by the spliceosome has a significant error rate even though there are spliceosome accessory factors that suppress the accidental cleavage of cryptic splice sites.[27]

Under ideal circumstances, the splicing reaction is likely to be 99.999% accurate (error rate of 10⁻⁵) and the correct exons will be joined and the correct intron will be deleted.[28] However, these ideal conditions require very close matches to the best splice site sequences and the absence of any competing cryptic splice site sequences within the introns and those conditions are rarely met in large eukaryotic genes that may cover more than 40 kilobase pairs. Recent studies have shown that the actual error rate can be considerably higher than 10⁻⁵ and may be as high as 2% or 3% errors (error rate of 2 or 3 × 10⁻²) per gene.[29] [30] [31] Additional studies suggest that the error rate is no less than 0.1% per intron.[32] [33] This relatively high level of splicing errors explains why most splice variants are rapidly degraded by nonsense-mediated decay.[34] [35]

The presence of sloppy binding sites within genes causes splicing errors, and it may seem strange that these sites have not been eliminated by natural selection. The argument for their persistence is similar to the argument for junk DNA.[36]

"Although mutations which create or disrupt binding sites may be slightly deleterious, the large number of possible such mutations makes it inevitable that some will reach fixation in a population. This is particularly relevant in species, such as humans, with relatively small long-term effective population sizes. It is plausible, then, that the human genome carries a substantial load of suboptimal sequences which cause the generation of aberrant transcript isoforms. In this study, we present direct evidence that this is indeed the case."

While the catalytic reaction may be accurate enough for effective processing most of the time, the overall error rate may be partly limited by the fidelity of transcription because transcription errors will introduce mutations that create cryptic splice sites. In addition, the transcription error rate of 10⁻⁵ to 10⁻⁶ is high enough that one in every 25,000 transcribed exons will have an incorporation error in one of the splice sites leading to a skipped intron or a skipped exon. Almost all multi-exon genes will produce incorrectly spliced transcripts but the frequency of this background noise will depend on the size of the genes, the number of introns, and the quality of the splice site sequences.
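The dependence of splicing noise on intron number can be illustrated with a short calculation. The sketch below is not taken from the cited studies: it assumes that every intron is mis-spliced independently and at the same rate, and the function and constant names are hypothetical. The second part shows one way a figure of about one in 25,000 could arise, under the further assumption that four splice-site nucleotides (GT and AG) are essential and each is miscopied at the upper transcription error rate.

```python
def fraction_with_error(per_intron_error: float, n_introns: int) -> float:
    """Chance that at least one of n_introns splicing events goes wrong,
    assuming every intron fails independently at the same rate."""
    return 1.0 - (1.0 - per_intron_error) ** n_introns


# Per-intron rates quoted in the text: the ideal 1e-5 and the 0.1% lower bound.
for rate in (1e-5, 1e-3):
    for n in (1, 8, 50):
        print(f"rate={rate:g} introns={n:2d} -> {fraction_with_error(rate, n):.4%}")

# Assumption (not from the text): four essential splice-site nucleotides,
# each miscopied at the upper transcription error rate of 1e-5.
ESSENTIAL_SPLICE_SITE_NT = 4
TRANSCRIPTION_ERROR_RATE = 1e-5
events_per_error = 1.0 / (ESSENTIAL_SPLICE_SITE_NT * TRANSCRIPTION_ERROR_RATE)
print(f"about one splice-site transcription error per {events_per_error:,.0f}")
```

Even at the 0.1% lower bound, a transcript with eight introns would carry at least one splicing error a little under 1% of the time under this simple model, and the fraction grows with every additional intron.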

In some cases, splice variants will be produced by mutations in the gene (DNA). These can be single-nucleotide polymorphisms (SNPs) that create a cryptic splice site or mutate a functional site. They can also be somatic cell mutations that affect splicing in a particular tissue or a cell line.[37] [38] [39] When the mutant allele is in a heterozygous state this will result in production of two abundant splice variants: one functional and one non-functional. In the homozygous state the mutant alleles may cause a genetic disease such as the hemophilia found in descendants of Queen Victoria, where a mutation in one of the introns in a blood clotting factor gene creates a cryptic 3' splice site resulting in aberrant splicing.[40] A significant fraction of human deaths by disease may be caused by mutations that interfere with normal splicing, mostly by creating cryptic splice sites.[41]

Incorrectly spliced transcripts can easily be detected and their sequences entered into the online databases. They are usually described as "alternatively spliced" transcripts, which can be confusing because the term does not distinguish between real, biologically relevant, alternative splicing and processing noise due to splicing errors. One of the central issues in the field of alternative splicing is working out the differences between these two possibilities. Many scientists have argued that the null hypothesis should be splicing noise, putting the burden of proof on those who claim biologically relevant alternative splicing. According to those scientists, the claim of function must be accompanied by convincing evidence that multiple functional products are produced from the same gene.[42] [43]

Biological functions and evolution

While introns do not encode protein products, they are integral to gene expression regulation. Some introns themselves encode functional RNAs through further processing after splicing to generate noncoding RNA molecules.[44] Alternative splicing is widely used to generate multiple proteins from a single gene. Furthermore, some introns play essential roles in a wide range of gene expression regulatory functions such as nonsense-mediated decay[45] and mRNA export.[46]

After the initial discovery of introns in protein-coding genes of the eukaryotic nucleus, there was significant debate as to whether introns in modern-day organisms were inherited from a common ancient ancestor (termed the introns-early hypothesis), or whether they appeared in genes rather recently in the evolutionary process (termed the introns-late hypothesis). Another theory is that the spliceosome and the intron-exon structure of genes are relics of the RNA world (the introns-first hypothesis).[47] There is still considerable debate about which of these hypotheses is most nearly correct, but the popular consensus at the moment is that following the formation of the first eukaryotic cell, group II introns from the bacterial endosymbiont invaded the host genome. In the beginning these self-splicing introns excised themselves from the mRNA precursor but over time some of them lost that ability and their excision had to be aided in trans by other group II introns. Eventually a number of specific trans-acting introns evolved and these became the precursors to the snRNAs of the spliceosome. The efficiency of splicing was improved by association with stabilizing proteins to form the primitive spliceosome.[48] [49] [50] [51]

Early studies of genomic DNA sequences from a wide range of organisms showed that the intron-exon structure of homologous genes in different organisms can vary widely.[52] More recent studies of entire eukaryotic genomes have now shown that the lengths and density (introns/gene) of introns vary considerably between related species. For example, while the human genome contains an average of 8.4 introns/gene (139,418 in the genome), the unicellular fungus Encephalitozoon cuniculi contains only 0.0075 introns/gene (15 introns in the genome).[53] Since eukaryotes arose from a common ancestor (common descent), there must have been extensive gain or loss of introns during evolutionary time.[54] [55] This process is thought to be subject to selection, with a tendency towards intron gain in larger species due to their smaller population sizes, and the converse in smaller (particularly unicellular) species.[56] Biological factors also influence which genes in a genome lose or accumulate introns.[57] [58] [59]
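The intron densities quoted above can be cross-checked against the intron totals, since dividing one by the other gives the number of genes that must have been counted. The sketch below is only that arithmetic; the function name is illustrative, and the implied gene counts reflect whatever gene set the cited study used rather than any current annotation.

```python
def implied_gene_count(total_introns: float, introns_per_gene: float) -> float:
    """Number of genes implied by a genome-wide intron total and a mean density."""
    return total_introns / introns_per_gene


human_genes = implied_gene_count(139_418, 8.4)
e_cuniculi_genes = implied_gene_count(15, 0.0075)
density_ratio = 8.4 / 0.0075

print(f"human: about {human_genes:,.0f} genes counted")
print(f"E. cuniculi: about {e_cuniculi_genes:,.0f} genes counted")
print(f"density ratio: about {density_ratio:,.0f}-fold")
```

On these figures the human intron density is roughly a thousand times that of E. cuniculi.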

Alternative splicing of exons within a gene after intron excision acts to introduce greater variability of protein sequences translated from a single gene, allowing multiple related proteins to be generated from a single gene and a single precursor mRNA transcript. The control of alternative RNA splicing is performed by a complex network of signaling molecules that respond to a wide range of intracellular and extracellular signals.

Introns contain several short sequences that are important for efficient splicing, such as acceptor and donor sites at either end of the intron as well as a branch point site, which are required for proper splicing by the spliceosome. Some introns are known to enhance the expression of the gene that they are contained in by a process known as intron-mediated enhancement (IME).

Actively transcribed regions of DNA frequently form R-loops that are vulnerable to DNA damage. In highly expressed yeast genes, introns inhibit R-loop formation and the occurrence of DNA damage.[60] Genome-wide analysis in both yeast and humans revealed that intron-containing genes have decreased R-loop levels and decreased DNA damage compared to intronless genes of similar expression. Insertion of an intron within an R-loop prone gene can also suppress R-loop formation and recombination. Bonnet et al. (2017) speculated that the function of introns in maintaining genetic stability may explain their evolutionary maintenance at certain locations, particularly in highly expressed genes.

Starvation adaptation

The physical presence of introns promotes cellular resistance to starvation via intron-enhanced repression of ribosomal protein genes of nutrient-sensing pathways.[61]

As mobile genetic elements

Introns may be lost or gained over evolutionary time, as shown by many comparative studies of orthologous genes. Subsequent analyses have identified thousands of examples of intron loss and gain events, and it has been proposed that the emergence of eukaryotes, or the initial stages of eukaryotic evolution, involved an intron invasion.[62] Two definitive mechanisms of intron loss have been identified: reverse transcriptase-mediated intron loss (RTMIL) and genomic deletions.[63] The definitive mechanisms of intron gain, however, remain elusive and controversial. At least seven mechanisms of intron gain have been reported thus far: intron transposition, transposon insertion, tandem genomic duplication, intron transfer, intron gain during double-strand break repair (DSBR), insertion of a group II intron, and intronization. In theory it should be easiest to deduce the origin of recently gained introns due to the lack of host-induced mutations, yet even many recently gained introns could not be traced to any of the aforementioned mechanisms. These findings thus raise the question of whether the proposed mechanisms fail to describe the origin of many novel introns because they are not accurate mechanisms of intron gain, or whether other, yet to be discovered, processes generate novel introns.[64]

In intron transposition, the most commonly purported intron gain mechanism, a spliced intron is thought to reverse splice into either its own mRNA or another mRNA at a previously intron-less position. This intron-containing mRNA is then reverse transcribed and the resulting intron-containing cDNA may then cause intron gain via complete or partial recombination with its original genomic locus.

Transposon insertions have been shown to generate thousands of new introns across diverse eukaryotic species.[65] Transposon insertions sometimes result in the duplication of the target sequence on each side of the transposon. Such an insertion could intronize the transposon without disrupting the coding sequence when the transposon inserts into the sequence AGGT, or when the transposon itself encodes the splice sites. Where intron-generating transposons do not create target site duplications, the elements include both splice sites, GT (5') and AG (3'), and are thereby spliced precisely without affecting the protein-coding sequence.[66] It is not yet understood why these elements are spliced, whether by chance, or by some preferential action by the transposon.

Because consensus donor and acceptor splice sites both closely resemble AGGT, the tandem genomic duplication of an exonic segment harboring an AGGT sequence generates two potential splice sites. When recognized by the spliceosome, the sequence between the original and duplicated AGGT will be spliced, resulting in the creation of an intron without alteration of the coding sequence of the gene. Double-stranded break repair via non-homologous end joining was recently identified as a source of intron gain when researchers identified short direct repeats flanking 43% of gained introns in Daphnia.[64] To assess statistical significance, however, this figure must be compared with the number of conserved introns flanked by repeats in other organisms. For group II intron insertion, the retrohoming of a group II intron into a nuclear gene was proposed to cause recent spliceosomal intron gain.
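The tandem-duplication mechanism described in the paragraph above can be illustrated with a toy string model. The sequence, coordinates, and function names below are invented for illustration. The sketch treats splicing as simple removal of the span from the GT of the first AGGT to the AG of the second, and ignores everything else a real spliceosome requires (branch point, minimum intron length, surrounding context).

```python
SITE = "AGGT"  # motif that, per the text, resembles both donor and acceptor consensus


def tandem_duplicate(seq: str, start: int, end: int) -> str:
    """Return seq with the segment seq[start:end] repeated once, in tandem."""
    return seq[:end] + seq[start:end] + seq[end:]


def splice_between_sites(seq: str):
    """Splice from the GT of the first AGGT to the AG of the second AGGT.

    Returns (spliced_sequence, excised_intron).
    """
    first = seq.find(SITE)
    second = seq.find(SITE, first + 1)
    if first == -1 or second == -1:
        raise ValueError("need two AGGT motifs")
    donor = first + 2      # intron begins at the GT of the first motif
    acceptor = second + 2  # intron ends just after the AG of the second motif
    return seq[:donor] + seq[acceptor:], seq[donor:acceptor]


exon = "ATGCCAAGGTTCGATAA"                  # toy exonic sequence with one AGGT
duplicated = tandem_duplicate(exon, 5, 12)  # duplicates the segment "AAGGTTC"
spliced, intron = splice_between_sites(duplicated)

print(duplicated)       # gene after the tandem duplication
print(intron)           # excised span, beginning with GT and ending with AG
print(spliced == exon)  # True when the original coding sequence is restored
```

In this model the excised span has the same length as the duplicated segment, so removing it returns exactly the original sequence, which is why the coding sequence is left unaltered.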

Intron transfer has been hypothesized to result in intron gain when a paralog or pseudogene gains an intron and then transfers this intron via recombination to an intron-absent location in its sister paralog. Intronization is the process by which mutations create novel introns from formerly exonic sequence. Thus, unlike other proposed mechanisms of intron gain, this mechanism does not require the insertion or generation of DNA to create a novel intron.[64]

The only hypothesized mechanism of recent intron gain lacking any direct evidence is group II intron insertion, which, when demonstrated in vivo, abolishes gene expression.[67] Group II introns are therefore presumed to be the ancestors of spliceosomal introns, having acted as site-specific retroelements, but are no longer thought to be responsible for intron gain.[68] [69] Tandem genomic duplication is the only proposed mechanism with supporting in vivo experimental evidence: a short intragenic tandem duplication can insert a novel intron into a protein-coding gene, leaving the corresponding peptide sequence unchanged.[70] This mechanism also has extensive indirect evidence lending support to the idea that tandem genomic duplication is a prevalent mechanism for intron gain. The testing of other proposed mechanisms in vivo, particularly intron gain during DSBR, intron transfer, and intronization, is possible, although these mechanisms must be demonstrated in vivo to solidify them as actual mechanisms of intron gain. Further genomic analyses, especially when executed at the population level, may then quantify the relative contribution of each mechanism, possibly identifying species-specific biases that may shed light on varied rates of intron gain amongst different species.[64]

Notes and References

  1. Gilbert W . Why genes in pieces? . Nature . 271 . 5645 . 501 . February 1978 . 622185 . 10.1038/271501a0 . 4216649 . Walter Gilbert . 1978Natur.271..501G . free .
  2. Kinniburgh AJ, Mertz JE, Ross J . The precursor of mouse beta-globin messenger RNA contains two intervening RNA sequences . Cell . 14 . 3 . 681–693 . July 1978 . 688388 . 10.1016/0092-8674(78)90251-9 . 21897383 .
  3. Book: Lewin B . Genes . 159–179, 386 . 1987 . Wiley . 0-471-83278-2 . 3rd . New York . 14069165.
  4. Chow LT, Gelinas RE, Broker TR, Roberts RJ . An amazing sequence arrangement at the 5' ends of adenovirus 2 messenger RNA . Cell . 12 . 1 . 1–8 . September 1977 . 902310 . 10.1016/0092-8674(77)90180-5 . 2099968 .
  5. Berget SM, Moore C, Sharp PA . Spliced segments at the 5' terminus of adenovirus 2 late mRNA . Proceedings of the National Academy of Sciences of the United States of America . 74 . 8 . 3171–3175 . August 1977 . 269380 . 431482 . 10.1073/pnas.74.8.3171 . 1977PNAS...74.3171B . free .
  6. Belfort M, Pedersen-Lane J, West D, Ehrenman K, Maley G, Chu F, Maley F . Processing of the intron-containing thymidylate synthase (td) gene of phage T4 is at the RNA level . Cell . 41 . 2 . 375–382 . June 1985 . 3986907 . 10.1016/s0092-8674(85)80010-6 . 27127017 . Marlene Belfort .
  7. Web site: The Nobel Prize in Physiology or Medicine 1993.
  8. Abir-Am . Pnina Geraldine . September 2020 . The Women Who Discovered RNA Splicing . . 108 . 5 . 298–305 . January 12, 2024.
  9. News: Flint . Anthony . November 8, 1993 . Nobel Prize in medicine brews resentment, envy . . January 12, 2024 . Newspapers.com.
  10. Tonegawa S, Maxam AM, Tizard R, Bernard O, Gilbert W . Sequence of a mouse germ-line gene for a variable region of an immunoglobulin light chain . Proceedings of the National Academy of Sciences of the United States of America . 75 . 3 . 1485–1489 . March 1978 . 418414 . 411497 . 10.1073/pnas.75.3.1485 . free . 1978PNAS...75.1485T .
  11. Tilghman SM, Tiemeier DC, Seidman JG, Peterlin BM, Sullivan M, Maizel JV, Leder P . Intervening sequence of DNA identified in the structural portion of a mouse beta-globin gene . Proceedings of the National Academy of Sciences of the United States of America . 75 . 2 . 725–729 . February 1978 . 273235 . 411329 . 10.1073/pnas.75.2.725 . free . 1978PNAS...75..725T .
  12. Stajich JE, Dietrich FS, Roy SW . Comparative genomic analysis of fungal genomes reveals intron-rich ancestors . Genome Biology . 8 . 10 . R223 . 2007 . 17949488 . 2246297 . 10.1186/gb-2007-8-10-r223 . free .
  13. Taanman JW . The mitochondrial genome: structure, transcription, translation and replication . Biochimica et Biophysica Acta (BBA) - Bioenergetics . 1410 . 2 . 103–123 . February 1999 . 10076021 . 10.1016/s0005-2728(98)00161-3 . 19229072 .
  14. Tollervey D, Caceres JF . RNA processing marches on . Cell . 103 . 5 . 703–709 . November 2000 . 11114327 . 10.1016/S0092-8674(00)00174-4 . free .
  15. Reugels AM, Kurek R, Lammermann U, Bünemann H . Mega-introns in the dynein gene DhDhc7(Y) on the heterochromatic Y chromosome give rise to the giant threads loops in primary spermatocytes of Drosophila hydei . Genetics . 154 . 2 . 759–769 . February 2000 . 10655227 . 1460963 . 10.1093/genetics/154.2.759 .
  16. Piovesan A, Caracausi M, Ricci M, Strippoli P, Vitale L, Pelleri MC . Identification of minimal eukaryotic introns through GeneBase, a user-friendly tool for parsing the NCBI Gene databank . DNA Research . 22 . 6 . 495–503 . December 2015 . 26581719 . 4675715 . 10.1093/dnares/dsv028 .
  17. Slabodnick MM, Ruby JG, Reiff SB, Swart EC, Gosai S, Prabakaran S, Witkowska E, Larue GE, Fisher S, Freeman RM, Gunawardena J, Chu W, Stover NA, Gregory BD, Nowacki M, Derisi J, Roy SW, Marshall WF, Sood P . The Macronuclear Genome of Stentor coeruleus Reveals Tiny Introns in a Giant Cell . Current Biology . 27 . 4 . 569–575 . February 2017 . 28190732 . 10.1016/j.cub.2016.12.057 . 5659724 .
  18. Copertino DW, Hallick RB . Group II and group III introns of twintrons: potential relationships with nuclear pre-mRNA introns . Trends in Biochemical Sciences . 18 . 12 . 467–471 . December 1993 . 8108859 . 10.1016/0968-0004(93)90008-b .
  19. Padgett RA, Grabowski PJ, Konarska MM, Seiler S, Sharp PA . Splicing of messenger RNA precursors . Annual Review of Biochemistry . 55 . 1119–1150 . 1986 . 2943217 . 10.1146/annurev.bi.55.070186.005351 .
  20. Guthrie C, Patterson B . Spliceosomal snRNAs . Annual Review of Genetics . 22 . 387–419 . 1988 . 2977088 . 10.1146/annurev.ge.22.120188.002131 .
  21. Greer CL, Peebles CL, Gegenheimer P, Abelson J . Mechanism of action of a yeast RNA ligase in tRNA splicing . Cell . 32 . 2 . 537–546 . February 1983 . 6297798 . 10.1016/0092-8674(83)90473-7 . 44978152 .
  22. Reinhold-Hurek B, Shub DA . Self-splicing introns in tRNA genes of widely divergent bacteria . Nature . 357 . 6374 . 173–176 . May 1992 . 1579169 . 10.1038/357173a0 . 4370160 . 1992Natur.357..173R .
  23. Cech TR . Self-splicing of group I introns . Annual Review of Biochemistry . 59 . 543–568 . 1990 . 2197983 . 10.1146/annurev.bi.59.070190.002551 .
  24. Michel F, Ferat JL . Structure and activities of group II introns . Annual Review of Biochemistry . 64 . 435–461 . 1995 . 7574489 . 10.1146/annurev.bi.64.070195.002251 .
  25. Wan R, Bai R, Zhan X, Shi Y . 2020 . How is precursor messenger RNA spliced by the spliceosome? . Annual Review of Biochemistry . 89 . 333–358 . 10.1146/annurev-biochem-013118-111024 . 31815536 . 209167227 .
  26. Wilkinson ME, Charenton C, Nagai K . 2020 . RNA splicing by the spliceosome . Annual Review of Biochemistry . 89 . 359–388 . 10.1146/annurev-biochem-091719-064225. 31794245 . 208626110 .
  27. Sales-Lee J, Perry DS, Bowser BA, Diedrich JK, Rao B, Beusch I, Yates III JR, Roy SW, Madhani HD . 2021 . Coupling of spliceosome complexity to intron diversity . Current Biology . 31 . 22 . 4898–4910 e4894 . 10.1016/j.cub.2021.09.004. 34555349 . 8967684 . 237603074 .
  28. Hsu SN, Hertel KJ . 2009 . Spliceosomes walk the line: splicing errors and their impact on cellular function . RNA Biology . 6 . 5 . 526–530 . 10.4161/rna.6.5.9860. 19829058 . 3912188 . 22592978 .
  29. Melamud E, Moult J . 2009 . Stochastic noise in splicing machinery . Nucleic Acids Research . gkp471 . 14 . 4873–4886 . 10.1093/nar/gkp471. 19546110 . 2724286 .
  30. Fox-Walsh KL, Hertel KJ . 2009 . Splice-site pairing is an intrinsically high fidelity process . Proceedings of the National Academy of Sciences . 106 . 6 . 1766–1771 . 10.1073/pnas.0813128106. 19179398 . 2644112 . 2009PNAS..106.1766F . free .
  31. Stepankiw N, Raghavan M, Fogarty EA, Grimson A, Pleiss JA . 2015 . Widespread alternative and aberrant splicing revealed by lariat sequencing . Nucleic Acids Research . 43 . 17 . 8488–8501 . 10.1093/nar/gkv763 . 26261211 . 4787815 .
  32. Pickrell JK, Pai AA, Gilad Y, Pritchard JK . 2010 . Noisy splicing drives mRNA isoform diversity in human cells . PLOS Genet . 6 . 12 . e1001236 . 10.1371/journal.pgen.1001236. 21151575 . 3000347 . free .
  33. Skandalis A . 2016 . Estimation of the minimum mRNA splicing error rate in vertebrates . Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis . 784 . 1713 . 34–38 . 10.1098/rstb.2015.0474. 27994117 . 5182408 .
  34. Zhang Z, Xin D, Wang P, Zhou L, Hu L, Kong X, Hurst LD . 2009 . Noisy splicing, more than expression regulation, explains why some exons are subject to nonsense-mediated mRNA decay . BMC Biology . 7 . 23 . 10.1186/1741-7007-7-23. 19442261 . 2697156 . free .
  35. Bitton DA, Atkinson SR, Rallis C, Smith GC, Ellis DA, Chen YY, Malecki M, Codlin S, Lemay JF, Cotobal C . 2015 . Widespread exon skipping triggers degradation by nuclear RNA surveillance in fission yeast . Genome Research . 25 . 6 . 884–896 . 10.1101/gr.185371.114. 25883323 . 4448684 .
  36. Saudemont B, Popa A, Parmley JL, Rocher V, Blugeon C, Necsulea A, Meyer E, Duret L . 2017 . The fitness cost of mis-splicing is the main determinant of alternative splicing patterns . Genome Biology . 18 . 1 . 208 . 10.1186/s13059-017-1344-6. 29084568 . 5663052 . free .
  37. Scotti MM, Swanson MS . 2016 . RNA mis-splicing in disease . Nature Reviews Genetics . 17 . 1 . 19–32 . 10.1038/nrg.2015.3. 26593421 . 5993438 .
  38. Shirley B, Mucaki E, Rogan P . 2019 . Pan-cancer repository of validated natural and cryptic mRNA splicing mutations . F1000Research . 7 . 1908 . 10.12688/f1000research.17204.3. 31275557 . 6544075 . 202702147 . free .
  39. Mucaki EJ, Shirley BC, Rogan PK . 2020 . Expression changes confirm genomic variants predicted to result in allele-specific, alternative mRNA splicing . Frontiers in Genetics . 11 . 109 . 10.3389/fgene.2020.00109. 32211018 . 7066660 . free .
  40. Rogaev EI, Grigorenko AP, Faskhutdinova G, Kittler EL, Moliaka YK . 2009 . Genotype analysis identifies the cause of the "royal disease" . Science . 326 . 5954 . 817 . 10.1126/science.1180660. 19815722 . 2009Sci...326..817R . 206522975 . free .
  41. Lynch M . 2010 . Rate, molecular spectrum, and consequences of human mutation . Proceedings of the National Academy of Sciences . 107 . 3 . 961–968 . 10.1073/pnas.0912629107. 20080596 . 2824313 . 2010PNAS..107..961L . free .
  42. Mudge JM, Harrow J . 2016 . The state of play in higher eukaryote gene annotation . Nature Reviews Genetics . 17 . 12 . 758–772 . 10.1038/nrg.2016.119. 27773922 . 5876476 .
  43. Bhuiyan SA, Ly S, Phan M, Huntington B, Hogan E, Liu CC, Liu J, Pavlidis P . 2018 . Systematic evaluation of isoform function in literature reports of alternative splicing . BMC Genomics . 19 . 1 . 637 . 10.1186/s12864-018-5013-2. 30153812 . 6114036 . free .
  44. Rearick D, Prakash A, McSweeny A, Shepard SS, Fedorova L, Fedorov A . Critical association of ncRNA with introns . Nucleic Acids Research . 39 . 6 . 2357–2366 . March 2011 . 21071396 . 3064772 . 10.1093/nar/gkq1080 .
  45. Bicknell AA, Cenik C, Chua HN, Roth FP, Moore MJ . Introns in UTRs: why we should stop ignoring them . BioEssays . 34 . 12 . 1025–1034 . December 2012 . 23108796 . 10.1002/bies.201200073 . 5808466 . free .
  46. Cenik C, Chua HN, Zhang H, Tarnawsky SP, Akef A, Derti A, Tasan M, Moore MJ, Palazzo AF, Roth FP . Genome analysis reveals interplay between 5'UTR introns and nuclear mRNA export for secretory and mitochondrial genes . PLOS Genetics . 7 . 4 . e1001366 . April 2011 . 21533221 . 3077370 . 10.1371/journal.pgen.1001366 . Snyder M . free .
  47. Penny D, Hoeppner MP, Poole AM, Jeffares DC . An overview of the introns-first theory . Journal of Molecular Evolution . 69 . 5 . 527–540 . November 2009 . 19777149 . 10.1007/s00239-009-9279-5 . 22386774 . 2009JMolE..69..527P .
  48. Cavalier-Smith T . Intron phylogeny: a new hypothesis . 1991 . Trends in Genetics . 7 . 5 . 145–148 . 10.1016/0168-9525(91)90377-3 . 2068786 .
  49. Doolittle WF . The origins of introns . 1991 . Current Biology . 1 . 3 . 145–146 . 10.1016/0960-9822(91)90214-h . 15336149 . 35790897 .
  50. Sharp PA . 1991 . "Five easy pieces." (role of RNA catalysis in cellular processes) . Science . 254 . 5032 . 663–664 . 10.1126/science.1948046 . 1948046 . 508870 .
  51. Irimia M, Roy SW . 2014 . Origin of spliceosomal introns and alternative splicing . Cold Spring Harbor Perspectives in Biology . 6 . 6 . a016071 . 10.1101/cshperspect.a016071 . 24890509 . 4031966 .
  52. Rodríguez-Trelles F, Tarrío R, Ayala FJ . Origins and evolution of spliceosomal introns . Annual Review of Genetics . 40 . 47–76 . 2006 . 17094737 . 10.1146/annurev.genet.40.110405.090625 .
  53. Mourier T, Jeffares DC . Eukaryotic intron loss . Science . 300 . 5624 . 1393 . May 2003 . 12775832 . 10.1126/science.1080559 . 7235937 .
  54. Roy SW, Gilbert W . The evolution of spliceosomal introns: patterns, puzzles and progress . Nature Reviews Genetics . 7 . 3 . 211–221 . March 2006 . 16485020 . 10.1038/nrg1807 . 33672491 .
  55. de Souza SJ . The emergence of a synthetic theory of intron evolution . Genetica . 118 . 2–3 . 117–121 . July 2003 . 12868602 . 10.1023/A:1024193323397 . 7539892 .
  56. Lynch M . Intron evolution as a population-genetic process . Proceedings of the National Academy of Sciences of the United States of America . 99 . 9 . 6118–6123 . April 2002 . 11983904 . 122912 . 10.1073/pnas.092595699 . free . 2002PNAS...99.6118L .
  57. Jeffares DC, Mourier T, Penny D . The biology of intron gain and loss . Trends in Genetics . 22 . 1 . 16–22 . January 2006 . 16290250 . 10.1016/j.tig.2005.10.006 .
  58. Jeffares DC, Penkett CJ, Bähler J . Rapidly regulated genes are intron poor . Trends in Genetics . 24 . 8 . 375–378 . August 2008 . 18586348 . 10.1016/j.tig.2008.05.006 .
  59. Castillo-Davis CI, Mekhedov SL, Hartl DL, Koonin EV, Kondrashov FA . Selection for short introns in highly expressed genes . Nature Genetics . 31 . 4 . 415–418 . August 2002 . 12134150 . 10.1038/ng940 . 9057609 .
  60. Bonnet A, Grosso AR, Elkaoutari A, Coleno E, Presle A, Sridhara SC, Janbon G, Géli V, de Almeida SF, Palancade B . Introns Protect Eukaryotic Genomes from Transcription-Associated Genetic Instability . Molecular Cell . 67 . 4 . 608–621.e6 . August 2017 . 28757210 . 10.1016/j.molcel.2017.07.002 . free .
  61. Parenteau J, Maignon L, Berthoumieux M, Catala M, Gagnon V, Abou Elela S . Introns are mediators of cell response to starvation . Nature . 565 . 7741 . 612–617 . January 2019 . 30651641 . 10.1038/s41586-018-0859-7 . 58014466 . 2019Natur.565..612P .
  62. Rogozin IB, Carmel L, Csuros M, Koonin EV . Origin and evolution of spliceosomal introns . Biology Direct . 7 . 11 . April 2012 . 22507701 . 3488318 . 10.1186/1745-6150-7-11 . free .
  63. Derr LK, Strathern JN . A role for reverse transcripts in gene conversion . Nature . 361 . 6408 . 170–173 . January 1993 . 8380627 . 10.1038/361170a0 . 4364102 . 1993Natur.361..170D .
  64. Yenerall P, Zhou L . Identifying the mechanisms of intron gain: progress and trends . Biology Direct . 7 . 29 . September 2012 . 22963364 . 3443670 . 10.1186/1745-6150-7-29 . free .
  65. Gozashti L, Roy S, Thornlow B, Kramer A, Ares M, Corbett-Detig R . Transposable elements drive intron gain in diverse eukaryotes . PNAS . 119 . 48 . November 2022 . 48 . 36417430 . 9860276 . 10.1073/pnas.2209766119 . free .
  66. Gozashti L, Roy S, Thornlow B, Kramer A, Ares M, Corbett-Detig R . Transposable elements drive intron gain in diverse eukaryotes . PNAS . 119 . 48 . November 2022 . 48 . 36417430 . 9860276 . 10.1073/pnas.2209766119 . free .
  67. Chalamcharla VR, Curcio MJ, Belfort M . Nuclear expression of a group II intron is consistent with spliceosomal intron ancestry . Genes & Development . 24 . 8 . 827–836 . April 2010 . 20351053 . 2854396 . 10.1101/gad.1905010 .
  68. Cech TR . The generality of self-splicing RNA: relationship to nuclear mRNA splicing . Cell . 44 . 2 . 207–210 . January 1986 . 2417724 . 10.1016/0092-8674(86)90751-8 . 11652546 .
  69. Dickson L, Huang HR, Liu L, Matsuura M, Lambowitz AM, Perlman PS . Retrotransposition of a yeast group II intron occurs by reverse splicing directly into ectopic DNA sites . Proceedings of the National Academy of Sciences of the United States of America . 98 . 23 . 13207–13212 . November 2001 . 11687644 . 60849 . 10.1073/pnas.231494498 . free . 2001PNAS...9813207D .
  70. Hellsten U, Aspden JL, Rio DC, Rokhsar DS . A segmental genomic duplication generates a functional intron . Nature Communications . 2 . 454 . August 2011 . 21878908 . 3265369 . 10.1038/ncomms1461 . 2011NatCo...2..454H .