Single-Cell Transcriptomics Reveals Splicing Features of Adult Neural Stem Cells in the Subventricular Zone

The central nervous system has enormously complex cellular diversity with hundreds of distinct cell types, yet alternative splicing features in single cells of important cell types at neurogenic regions are not well understood. By employing in silico analysis, we systematically identified 3,611 alternative splicing events from 1,908 genes in 28 single-cell transcriptomic data of adult mouse ependymal and subependymal regions, and found that single-cell RNA-seq has the advantage in uncovering rare splicing isoforms compared to bulk RNA-seq at the population level. We uncovered that the simultaneous presence of multiple isoforms from the same gene in a single cell is prevalent, and quiescent stem cells, activated stem cells, and neuroblast cells exhibit high heterogeneity of splicing variants. Furthermore, we also demonstrated the existence of novel bicistronic transcripts in quiescent stem cells.


INTRODUCTION
Alternative splicing (AS) events of precursor mRNAs are widespread in eukaryotic organisms and enable cells to generate vast protein diversity from a limited number of genes (Gilbert, 1978;Pan et al., 2008;Barbosa-Morais et al., 2012). AS events detected in nervous system tissues are especially prevalent and more highly conserved than those detected in other tissues, suggesting that they more often provide essential functions (Jelen et al., 2007;Merkin et al., 2012;Raj and Blencowe, 2015). It is now established that AS events play diversified regulation roles such as neural cell differentiation, neuronal migration, synapse formation, and brain development in the central nervous system (CNS) (Yano et al., 2010;Grabowski, 2011;Zheng et al., 2012;Shin and Kim, 2013;Mauger et al., 2015). Disruptions in the splicing machinery can result in neurodegeneration (Polymenidou et al., 2011). These findings emphasize the importance of characterizing AS events in CNS tissues and understanding its spatiotemporal and dynamic regulations.
Previous population-based studies revealed characteristics and spatiotemporal regulation of AS in the CNS (Jelen et al., 2007;Yano et al., 2010;Grabowski, 2011;Zheng et al., 2012;Shin and Kim, 2013;Mauger et al., 2015;Tilgner et al., 2015;Suzuki et al., 2017;Jackson et al., 2018;Weyn-Vanhentenryck et al., 2018). The CNS has enormously complex cellular diversity with hundreds of distinct cell types, while population-based AS analyses of bulk tissues with heterogeneous cellular composition reveal the average of all of the different cells in a particular tissue and does not truly reflect functions of any particular or rare but important cell types in these tissues. To circumvent such a problem, single-cell-based alternative splicing analyses become imperative. By single-cell RNA sequencing, researchers identified cell-type-specific mRNA isoforms in the mouse primary motor cortex and cerebellum (Gupta et al., 2018;Booeshaghi et al., 2021) and mRNA isoform diversity of oligodendrocyte and vascular and leptomeningeal cells in the mouse brain (Karlsson and Linnarsson, 2017), and found splicing dynamics during the differentiation of human iPSCs to motor neurons or neuron progenitor cells (Song et al., 2017). At present, few research efforts have been dedicated to AS studies in a single cell at neurogenic regions.
The subventricular zone (SVZ) of the adult mouse forebrain, one of the known neurogenic regions, harbors neural stem cells (NSCs) that migrate to the olfactory bulb along the rostral migratory stream (RMS) and give rise to olfactory bulb interneurons throughout life (Altman, 1969;Doetsch et al., 1999). This region contains multiple distinct cell types related to adult NSC activities (Doetsch et al., 1999). The lack of the single-cell-level insight on isoform expression of ependymal and subependymal cells hampered the understanding of the biological role of alternative splicing in these regions. In this study, we analyzed single-cell transcriptomic data from adult mouse ependymal and subependymal cells, identified 3,611 AS events from 1,908 genes, and systematically characterized their detailed AS features. Moreover, we uncovered that genes expressing multiple splice isoforms simultaneously in a single cell are prevalent. Finally, we demonstrated the existence of new bicistronic transcripts in single quiescent stem cells.

Landscape of AS Event Profiles in Single Cells of the Mouse SVZ Region
To identify AS events of individual cell types in the SVZ region, we performed an isoform identification pipeline to analyze transcriptomic data derived from 28 single-cell samples which we have obtained previously (Luo et al., 2015) from adult mouse CD133 positive and negative ependymal/subependymal cells. These single cells comprised quiescent neural stem cells, activated neural stem cells, neuronal lineage, oligodendrocytes, and other cells in the SVZ region (Luo et al., 2015). An average depth of 15-20 million reads per cell was obtained, and when an FPKM (fragments per kilobase of transcript per million mapped reads) cutoff at 0.1 was chosen, all samples reached saturation in mRNA detection with about 8 million reads used (Luo et al., 2015), suggesting sequencing depths of single-cell transcriptomics data were sufficient for AS analysis.
By performing the AS analysis approach described in a previously published study (Ryan et al., 2012), we identified 3,611 AS events from 1908 genes in the 28 single-cell samples. All the 3,611 AS events identified belong to different categories ( Figure 1A) of local splicing patterns： alternative 3′ acceptor sites (AA), alternative 5' donor sites (AD), alternative promoter (AP), alternative terminator (AT), exon skipping (ES), mutually exclusive exons (ME), and retention intron (RI). The distribution of AS events among different single cells is different in which alternative terminator events are the most frequent, followed by exon skipping. Furthermore, the quantity of AS events is inconsistent among single cells. For example, there are 947 AS events in CD133-negative cell S5, while only 141 AS events in neuroblast cell S28 ( Figure 1B; Supplementary Table S1).
To verify that these isoforms were not artifacts of the amplification process required for constructing single-cell libraries, we isolated RNA from bulk tissue of the adult subventricular zone of three-week-old mice and then performed RT-PCR with primers designed for randomly selected 10 AS events and Sanger sequencing for confirming these 10 AS events. Our results showed that all these detected AS events can be verified by Sanger sequencing ( Figure 1C; Supplementary Figure S1), suggesting our single-cell RNAseq is effective for alternative splicing analysis.

Discovery of Low-Abundance Splicing Isoforms in Cells From the Subventricular Zone by Single-Cell RNA-Seq
Population-based AS analyses of bulk tissues with heterogeneous cellular composition reveal the average of AS events from all of the different cells in that particular tissue. In order to know the differences of AS events in pooledcell and single-cell levels, we analyzed AS events in 4 pooledcell samples with mixed ten single-cell libraries in each sample ( Figure 2A) and compared the variance of different types and the quantity of AS events between 28 single-cell samples and 4 pooled-cell samples. We found that there were no significant differences in the proportion of AS types detected between the single and pooled cells ( Figure 2B). However, more AS events were detected in 28 single-cell samples (3,611) than in 4 tencell pooled samples (3,379 in 40 cells). Notably, differentially expressed alternative splicing (DEAS) analysis showed that most of DEAS events detected only in single-cell samples are low-abundance and novel splicing isoforms ( Figure 2C), suggesting single-cell AS analyses could find low-abundance AS events present in rare cell types and could not find in population-based AS analyses.

Detection of Highly Heterogenetic Splice Variants in Quiescent Stem Cells, Activated Stem Cells, and Neuroblast Cells
The ependymal/subependymal region contains the previously described four cell types related to adult NSC activities: ependymal E cells, subependymal B cells, transitamplifying C cells, and neuroblast A cells ( Figure 3A). It has been reported that CD133 + ependymal quiescent NSCs in ependymal E cells can be mitotically activated (B cells) and give rise to neuroblast A cells in the rostral migratory stream (RMS) and interneurons in the olfactory bulb (Doetsch, 2003;Coskun et al., 2008;Luo et al., 2015). It is reasonable to consider that differentially expressed AS events (DEAS) among quiescent NSCs, activated NSCs, and neuroblast cells may be relevant to the activation and differentiation of quiescent NSCs. To identify the splice variants during the activation and differentiation of quiescent NSCs, we analyzed single-cell RNA-seq data of 8 CD133 + ependymal quiescent NSCs, 2 activated NSCs, and 11 neuroblast cells from the SVZ region (Luo et al., 2015). Results showed that the proportion of AS types was consistent among quiescent NSCs, activated NSCs, and neuroblast cells ( Figure 3B). However, of the 1,256 AS events detected in quiescent NSCs, 712 AS events were in activated NSCs, and 977 AS events were in neuroblast cells. Only a few AS events were cell-type-specific, and most of them were highly heterogeneous ( Figure 3C; Supplementary Figure S2). For example, the SPP2 gene is expressed only in activated NSCs and showed the presence of activated NSC-specific alternative promoters (Supplementary Figure S2A). However, the Esam gene was only expressed in CD133 + ependymal quiescent NSCs, but it exhibited different isoforms in these cells. In addition, the details of the Esam isoform shared in the RefSeq database showed that it is a novel isoform with exon skipping (Supplementary Figure S2B). The Ddah2 gene showed three different isoforms in different individual neuroblast cells, respectively (Supplementary Figure S2C). We also found novel isoforms in Gkn3 and Sdhc, which existed only in one of the quiescent NSCs, respectively (Supplementary Figure S2D). These findings suggest that AS event analyses based on single-cell RNA-seq revealed a high heterogeneity of gene expression and isoform usage in quiescent stem cells, activated stem cells, and neuroblast cells ( Figure 3C).

Simultaneous Presence of Multiple Isoforms From the Same Gene in a Single Cell Is Prevalent
A previous study showed that for genes having multiple splice isoforms at the population level, individual cells predominantly expressed one particular isoform (Shalek et al., 2013). However, when Karlsson and Linnarsson sequenced six cDNAs stored in an earlier single-cell experiment, they found isoform diversity of single-cell mRNAs in oligodendrocytes (Karlsson and Linnarsson, 2017). To investigate the diversity of single-cell mRNA isoforms in the SVZ, we analyzed AS events and the corresponding parent genes of four samples, which were randomly selected from four cell types, including quiescent NSCs, activated NSCs, neuroblast cells, and oligodendrocyte cells, respectively. In the quiescent NSC sample 1 (S1), we identified 148 AS events from 91 genes, and nearly half of these genes expressed more than one isoform. In the activated NSC S9, we identified 438 AS events from 279 genes, and more than half of these genes expressed multiple isoforms. In neuroblast cell S18, we identified 275 AS events from 167 genes, and nearly two-thirds of these genes expressed more than one isoform. In oligodendrocyte cell S4, we identified 950 AS events from 545 genes, and more than two-thirds of the genes expressed more than one isoform ( Figure 4A). Moreover, most of the parent genes occurred in only one type of AS event, whereas certain parent genes contained more than one type of AS event ( Figure 4B). These statistical findings suggest that some genes can simultaneously express more than one particular isoform in a single cell. For example, Pmm1 and Creb3l4 genes can express multiple isoforms in an individual cell, some of which are novel splice isoforms, including the alternative terminator ( Figure 4C), alternative 5′ donor sites, and alternative 3' acceptor sites ( Figure 4D). Coding potential analysis showed that the novel isoforms of the two genes possess protein-coding potential and can be expected to give rise to distinct protein isoforms similar to the original one.
To verify that simultaneous presence of multiple isoforms from the same gene in a single cell is the true biological property of these cells, rather than technical noise associated with the single-cell contamination, we analyzed our single-cell transcriptomics data of two-cell-stage mouse developing embryos, where the two sister cells are easily isolated and avoid cases of more than one cell, and we acquired consistent results as mentioned previously (Supplementary Figure S3). Together, our results reveal that the simultaneous presence of multiple isoforms from the same gene in a single cell is prevalent, and single-cell isoform diversity is a widespread event, which affects the protein-coding repertoire.

Discovery of Bicistronic Transcripts in Quiescent Stem Cells
We detected new transcriptional fusions of adjacent genes in single cells of SVZ, including a fusion of the Hyal1 and Nat6 mRNA (Supplementary Figure S1A), a fusion of the Ramp2 and Vps25 mRNA (Supplementary Figure S1C), a fusion of the Nme1 and Nme2 mRNA (Supplementary Figure S1E), and a fusion of the Gimap1 and Gimap5 mRNA (Supplementary Figure S1F). A fused transcript was formed by transcription of two consecutive genes, in which partial 5′ sequence of upstream mRNA and partial 3' sequence of downstream mRNA were fused into a single transcript (for most cases, there is a strong tendency to lose the last and first exons of the upstream and downstream mRNA, respectively). By RT-PCR, cloning, and Sanger sequencing, we verified the presence of the first two fused transcripts (Supplementary Figures S1B, D). Among the fused mRNAs mentioned previously, Nme1-Nme2 and Gimap1-Gimap5 are homologs of human fused mRNAs (Akiva et al., 2006;Parra et al., 2006), suggesting that fused transcripts are evolutionarily conserved.
Interestingly, in the analysis of tandem gene fusion events, we detected novel bicistronic transcripts, where two adjacent genes (Ech1 and Lgals4) in the same orientation were transcribed into a single RNA sequence that retains the ORF of the two original genes ( Figure 5). There were two different types of bicistronic Ech1-Lgals4 transcripts in two different single quiescent stem cells, respectively. In one bicistronic transcript, the ORFs of Ech1 and Lgals4 were linked directly by truncated 3′UTR of Ech1 and truncated 5′UTR of Lgals4 ( Figure 5A). In the other bicistronic transcript, the ORFs of Ech1 and Lgals4 were linked by 77 nucleotides originating from the intergenic regions of Ech1 and Lgals4 ( Figure 5B). To experimentally test the novel bicistronic transcripts, we isolated total RNA from the mouse SVZ and used it as a template for RNA reverse transcription reaction. By RT-PCR, cloning, and Sanger sequencing, we verified the presence of the two novel bicistronic transcripts ( Figures 5C,D).
The fused transcript can lead to a new, fused protein that has the properties of two original proteins (Thomson et al., 2000;Pradet-Balade et al., 2002). Unlike fused transcripts, bicistronic transcripts retain the ORFs of the two original genes and can be translated into two proteins (Gray et al., 1999;Schaefer and Brösamle, 2009). To determine the expression levels of Ech1 and Lgals4 at the translation level, we cloned two different types of bicistronic Ech1-Lgals4 transcripts (with or without 77 nucleotides originated from the intergenic region of Ech1 and Lgals4) into pCMV-N-HA, respectively. We transfected the two recombinant pCMV-N-HAs into 293T cells, respectively, and use the empty pCMV-N-HA as control. We then examined the expression levels of Ech1 and Lgals4 in control cells and bicistronic transcript overexpressed 293T cells. We found that Ech1 and Lgals4 protein levels were higher only in the cells that overexpressed the bicistronic transcript with 77 nucleotides than in the control group ( Figure 5E). These results suggest that the sequences between the ORF of Ech1 and Lgals4 in this bicistronic transcript can drive the expression of Lgals4.

DISCUSSION
Alternative splicing of pre-messenger RNA is an essential mechanism for generating multiple transcripts and increasing the proteome complexity. Although AS events have been analyzed in CNS tissue and purified cell types in the brain (Zhang et al., 2014;Karlsson and Linnarsson, 2017;Song et al., 2017;Gupta et al., 2018;Weyn-Vanhentenryck et al., 2018) by bulk or single-cell analyses, the AS features in the adult mouse forebrain neurogenic zones remained unclear. In this study, we identified 3,611 AS events from 1,908 genes in the subventricular zone by systematically analyzing 28 single-cell transcriptomics data, which provided an initial picture of the AS patterns in ependymal quiescent, activated neural stem cells, and other cells in close proximity.
To date, several studies demonstrated that single-cell RNA-seq has provided valuable insights into cell-type-specific AS events (Li et al., 2015;Zhang et al., 2016;Karlsson and Linnarsson, 2017;Gupta et al., 2018;Booeshaghi et al., 2021). However, our singlecell AS analysis showed that most AS events were not cell-typespecific except for a few cases and revealed tremendous AS event varieties in ependymal quiescent neural stem cells, activated neural stem cells, and neuroblast cells (Figure 3). This splicing heterogeneity is largely due to cellular diversity, which may have resulted from programmed specialization during the activation of quiescent neural stem cells and differentiation of activated neural stem cells, or through random processes that occur in cells. Although most novel isoforms possess the protein-coding potential (Figure 3) with original open reading frames or new open reading frames, little is known about the significance of such variations and how many of the novel isoforms have biological functions or provide material for evolution.
In the present study, we found that single-cell isoform diversity is a widespread event. Of the 1,082 genes encoding 1,812 splicing isoforms in four selected single-cell samples for isoform number analysis and isoform diversity analysis (Figure 4), 639 (59%) of them expressed multiple isoforms simultaneously in a single cell. Those isoforms retain proteincoding potential, or code-truncated proteins, and/or non-coding RNAs. Thus, in a single cell, some genes can be expected to give rise to multiple distinct protein isoforms, greatly expanding the coding repertoire, or regulating gene expression through alternative stop codons, and non-sense-mediated decay (Lareau et al., 2007) or non-coding RNAs, suggesting that single-cell isoform diversity has a profound consequence on the fate of RNA molecules.
We have detected bicistronic transcripts in quiescent neural stem cells by single-cell RNA-seq and provided experimental evidence that these bicistronic transcripts do exist and are not mere technical artifacts. Although polycistronic operons are present in prokaryotes, bicistronic transcripts are rare in eukaryotes. Bicistronic transcripts have occasionally been reported in eukaryotes, and some evidence shows that two independent proteins can be translated from bicistronic transcripts in humans, mice, and zebrafish (Gray et al., 1999;Schaefer and Brösamle, 2009). However, the function of this transcript, as well as the mechanism involved in its generation and the actual impact of this phenomenon on the eukaryotic genomes, remains to be further elucidated.

Dataset Used in this Study
Single-cell and pooled-cell transcriptomic dataset (GSE61288) from Gene Expression Omnibus (GEO) was used for alternative splicing analysis. This dataset was prepared previously (Luo et al., 2015) from 12 adult mouse ependymal regions which include ependymal/subependymal quiescent NSCs, activated NSCs, and other cells in close proximity. This dataset was obtained using single-cell RNA sequencing technology (Tang et al., 2009(Tang et al., , 2010 and can be used for alternative splicing analyses.

Detection of Isoforms
Alternative splicing events of individual cell types in ependymal/ subependymal regions were analyzed by SpliceSeq (Ryan et al., 2012). Then, the splicing isoforms were visualized using Integrated Genomic Viewer (IGV) and followed by consulting literature, comparing with NCBI Ref-seq, Ensembl 37.61, and UCSC Known Gene 4, and then, AS events from the single cell of adult mouse ependymal/subependymal regions were identified.

Characterization of Alternative Splicing Events by PCR and Sanger Sequencing
To verify novel alternative splicing events in mouse different neuroanatomical regions and different tissues, three-week-old mice were anesthetized and euthanized in accordance with institutional guidelines of the Guide for Care and Use of Laboratory Animals (at Tongji University), and the brains and other tissues (including heart, liver, spleen, lung, kidney, stomach, large intestine, and small intestine) were isolated, and then be quickly put into cold PBS. The subventricular zone, hippocampus, olfactory bulbs, cortex, midbrain, and cerebellum were dissected from the brains under a dissection microscope. The total RNAs of different neuroanatomical regions of the brain and different tissues were extracted using RNAiso Plus (Takara Cat. # 9109). After the removal of contaminating DNA using RNase-free DNase (Ambion), 4 μg of total RNA was reverse-transcribed using RT Master Mix and according to manufacturer recommendations (Takara Cat. # RR036A).

Coding Potential Analysis of Novel Alternative Splicing Isoforms
The coding potential of novel alternative splicing isoforms was analyzed by using a coding potential calculator (CPC) (Kang et al., 2017), which is an SVM-based classifier comprehensively scoring the characteristics of a transcript including the presence and integrity of the predicted ORF, conservation of a single frame.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. These data can be found here: Gene Expression Omnibus (GEO) (GSE61288).

AUTHOR CONTRIBUTIONS
YL and SL contributed to the conceptualization. YW, CL, YL, SL, XG, and CL carried out experiments or analyses, and contributed to the procurement of critical reagents and protocols. CX contributed to figure preparation. ZH contributed to splicing analysis. YL, SL, YW, and CL contributed to the writing of the manuscript. All authors contributed to the article and approved the submitted version.