AUTHOR=Qu Wen, Cingolani Pablo, Zeeberg Barry R., Ruden Douglas M.
TITLE=A Bioinformatics-Based Alternative mRNA Splicing Code that May Explain Some Disease Mutations Is Conserved in Animals
JOURNAL=Frontiers in Genetics
VOLUME=Volume 8 - 2017
YEAR=2017
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2017.00038
DOI=10.3389/fgene.2017.00038
ISSN=1664-8021
ABSTRACT=Deep sequencing of cDNAs made from spliced mRNAs indicates that most coding genes in many animals and plants have pre-mRNA transcripts that are alternatively spliced. In pre-mRNAs, in addition to invariant exons that are present in almost all mature mRNA products, there are at least 6 alternative types of exons: exons from alternative promoters, exons with alternative polyA sites, mutually exclusive exons, skipped exons, and exons with alternative 5’ or 3’ splice sites. These 7 major types of exons can combine in alternatively spliced mature mRNAs in as many as 36 different pair-wise combinations. Our bioinformatics-based hypothesis is that there is an “alternative-splicing code” in introns and flanking exon sequences, analogous to the genetic code, that directs alternative splicing of many of the 36 types of introns that are defined by their flanking exons. Consistent with this hypothesis, we identified 42 different consensus sequences that are each present in at least 100 human introns and ranked them from most common to least common. We repeated this analysis across a total of 50 plant and animal species. In humans, 37 of the 42 top consensus sequences are significantly enriched or depleted in at least one of the 36 types of introns.
We further supported our hypothesis in two ways. First, 96 human disease mutations that affect RNA splicing and change alternative splicing from one class to another can be partially explained by a mutation altering a consensus sequence from that of one type of intron to that of another type of intron. Second, some of the alternative splicing consensus sequences, and presumably their small-RNA or protein targets, are evolutionarily conserved from plants to animals. We also noticed that genes with multiple introns tend to share the same splicing codes. Our work sheds new light on a possible mechanism for generating the tremendous diversity in protein structure by alternative splicing of pre-mRNAs.
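The abstract reports that consensus sequences are "significantly enriched or depleted" in particular intron types but does not name the statistical test used. As a minimal sketch of how such a claim could be checked, the following assumes a one-sided hypergeometric test. The function names (`hypergeom_pmf`, `enrichment_pvalues`) and all counts are hypothetical illustrations, not the authors' method or data.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): k motif-carrying introns among n drawn from N total,
    of which K carry the motif (sampling without replacement)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def enrichment_pvalues(k, N, K, n):
    """Return (p_enriched, p_depleted), the one-sided tail probabilities
    of seeing at least k, or at most k, motif-carrying introns."""
    lo = max(0, n - (N - K))  # smallest possible overlap
    hi = min(K, n)            # largest possible overlap
    p_enriched = sum(hypergeom_pmf(i, N, K, n) for i in range(k, hi + 1))
    p_depleted = sum(hypergeom_pmf(i, N, K, n) for i in range(lo, k + 1))
    return p_enriched, p_depleted


# Hypothetical toy counts (NOT from the paper):
# 1000 introns in total, 100 of them contain a given consensus sequence;
# one intron type has 50 introns, 15 of which contain the sequence.
# Under no association, about 5 of the 50 would be expected to carry it.
p_enr, p_dep = enrichment_pvalues(k=15, N=1000, K=100, n=50)
print(f"p(enriched) = {p_enr:.3g}, p(depleted) = {p_dep:.3g}")
```

With 42 consensus sequences and 36 intron types, a real analysis would run many such tests and would presumably need a multiple-testing correction; that step is omitted here.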