18.6 Introns
An intron is any nucleotide sequence within a gene that is removed by RNA splicing during maturation of the final RNA product. The term intron refers to both the DNA sequence within a gene and the corresponding sequence in the primary RNA transcript. Sequences that are joined together in the final mature RNA after RNA splicing are exons.
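The relationship between primary transcript, introns, and exons can be illustrated with a minimal sketch. It assumes a toy RNA sequence and intron coordinates invented for illustration (the function name `splice` and the 0-based, half-open coordinate convention are also assumptions). It models only the outcome of splicing, not the biochemical mechanism.

```python
def splice(transcript: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns from a primary transcript and join the exons.

    Introns are given as 0-based, half-open (start, end) intervals,
    a convention assumed here for illustration.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        # Reject overlapping, empty, or out-of-range intervals.
        if start < position or start >= end or end > len(transcript):
            raise ValueError(f"invalid intron interval: ({start}, {end})")
        exons.append(transcript[position:start])  # exon before this intron
        position = end                            # skip over the intron
    exons.append(transcript[position:])           # final exon
    return "".join(exons)


# Hypothetical toy transcript: exon 1 + intron + exon 2.
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "UUUUAA"
primary = exon1 + intron + exon2
mature = splice(primary, [(len(exon1), len(exon1) + len(intron))])
print(mature)
```

The mature sequence is simply the exons joined in their original order, which is the defining property of exons given above.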
Introns are found in the genes of most organisms and many viruses and can be located in a wide range of genes, including those that generate proteins, ribosomal RNA (rRNA), and transfer RNA (tRNA). When proteins are generated from intron-containing genes, RNA splicing takes place as part of the RNA processing pathway that follows transcription and precedes translation.
The word intron is derived from the term intragenic region, meaning a region inside a gene; introns are sometimes also called intervening sequences [https://en.wikipedia.org/wiki/Intron]. The term ‘intervening sequence’, though, is broader: in addition to introns, it can refer to any of several families of internal nucleic acid sequences that are not present in the final gene product, including inteins (‘protein introns’, which are segments of a protein able to excise themselves and join the remaining portions [the exteins] with a peptide bond in a process termed protein splicing), untranslated regions (UTRs), and nucleotides removed by RNA editing.
At least four distinct classes of introns have been identified:
- introns in nuclear protein-coding genes that are removed by spliceosomes (see Section 5.4) (called spliceosomal introns);
- introns in nuclear and archaeal transfer RNA genes that are removed by proteins (tRNA introns);
- self-splicing group I introns that are removed by RNA catalysis;
- self-splicing group II introns that are removed by RNA catalysis;
- a possible fifth class, called group III introns, which may be related to spliceosomal introns, although too little is known about how their splicing takes place.
Eukaryotic protein-coding genes are interrupted by spliceosomal introns, which are removed from transcripts before protein translation. The first fungal genomes characterised had low intron densities: the yeasts Schizosaccharomyces pombe (average 0.9 introns per gene) and Saccharomyces cerevisiae (even fewer at an average of 0.05 introns per gene). However, among filamentous ascomycete fungi, Neurospora crassa and Aspergillus nidulans have much higher intron densities (2-3 per gene), and average intron densities in basidiomycete and zygomycete fungi have proved to be among the highest known among eukaryotes (4-6 per gene).
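As a rough illustration of how an average intron density of this kind might be calculated, the sketch below assumes a hypothetical table of exon counts per gene (the gene names and counts are invented, not real annotation data) and treats each gene as a single transcript, so that the number of introns is the number of exons minus one.

```python
def intron_density(exon_counts: dict[str, int]) -> float:
    """Average number of introns per gene.

    Assumes one transcript model per gene, so a gene with n exons
    has n - 1 introns.
    """
    if not exon_counts:
        raise ValueError("no genes supplied")
    if any(n < 1 for n in exon_counts.values()):
        raise ValueError("every gene needs at least one exon")
    total_introns = sum(n - 1 for n in exon_counts.values())
    return total_introns / len(exon_counts)


# Hypothetical toy annotation: gene name -> number of exons.
toy_genes = {"geneA": 1, "geneB": 3, "geneC": 4, "geneD": 2}
density = intron_density(toy_genes)
print(f"{density:.2f} introns per gene")
```

Real genome annotations complicate this picture (alternative transcripts, incomplete gene models), so published densities depend on the annotation used.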
Some aspects of intron structure are taxon-specific; for example, introns of Fusarium circinatum, as well as F. verticillioides, F. oxysporum, and F. graminearum, are characterised by some unique species-specific features.
Several fungal species share many intron positions with distantly related species. Many intron positions are also shared between plants and animals, whereas fungi have lost introns.
Both the fungal ancestor and the fungus-animal ancestor (of the Opisthokont lineage; see Section 2.6) were very intron-rich, with intron densities matching or exceeding the highest known average densities in modern species of fungi and approaching the highest known across eukaryotes. Fungal evolution has been dominated by intron loss, with nearly complete intron loss along some fungal lineages.
Avoiding extremes, the average picture is of moderate intron densities in the common ancestors followed by a tripling of intron number in vertebrates and plants, massive intron loss in yeasts like Schizosaccharomyces pombe and Saccharomyces cerevisiae, and variable intron loss in other fungi (Irimia & Roy, 2014; Phasha et al., 2017).
Updated July, 2019