Six Exonic Variants in the SLC5A2 Gene Cause Exon Skipping in a Minigene Assay

Background: Familial renal glucosuria is a rare renal tubular disorder caused by SLC5A2 gene variants. Most of these are exonic variants that have been classified as missense variants. However, there is growing evidence that some of these variants can be detrimental by affecting pre-mRNA splicing. We therefore hypothesized that a proportion of SLC5A2 exonic variants cause disease by interfering with the normal splicing of the pre-mRNA. Methods: We used bioinformatics programs to analyze 77 previously described presumed SLC5A2 missense variants, identified candidate variants that may alter pre-mRNA splicing, and tested them with minigene assays. Results: Six of the seven candidate variants induced exon skipping. Variants c.216C > A, c.294C > A, c.886G > C, c.932A > G and c.962A > G may disrupt splicing enhancer motifs and generate splicing silencer sequences, resulting in the skipping of exon 3 (c.216C > A and c.294C > A) or exon 8 (c.886G > C, c.932A > G and c.962A > G). Variants c.305C > T and c.1129G > A probably disturb splice sites, leading to increased exon 4 inclusion and exon 9 skipping, respectively. Conclusion: To our knowledge, this is the first report of SLC5A2 exonic variants that alter pre-mRNA splicing. Our research reinforces the importance of assessing the consequences of putative point variants at the mRNA level. Additionally, we propose that minigene functional analysis may be valuable for evaluating the impact of SLC5A2 exonic variants on pre-mRNA splicing when patients' RNA samples are unavailable.


INTRODUCTION
Familial renal glucosuria (FRG) is a rare renal tubular disease characterized by persistent glucosuria without aberrant glucose metabolism or any other symptoms of tubular malfunction (Calado et al., 2008; Aires et al., 2015; Wang et al., 2019). The vast majority of FRG cases are associated with pathogenic variants in SLC5A2 (OMIM 182381) (Calado et al., 2008; Sada et al., 2019; Wang et al., 2019). The full-length SLC5A2 gene spans 7.7 kb on chromosome 16p11.2, contains 14 exons, and encodes the 672-amino-acid low-affinity sodium/glucose co-transporter 2 (SGLT2) (Wells et al., 1993). SGLT2 is mainly expressed in the brush border of renal proximal tubules, couples Na+ and glucose transport at a ratio of 1:1 (Yu et al., 2020), and reabsorbs most of the filtered renal glucose (van den Heuvel et al., 2002).
Among the 83 different SLC5A2 variants described in the Human Gene Mutation Database (HGMD, accessed April 2019), 64 (77%) are missense/non-sense variants. The remaining variants are splicing variants (5, 6%), small deletions (7, 8%), small insertions (3, 4%), small indels (1, 1%) and gross deletions (3, 4%). Most mutation analyses are performed at the genomic level, and the impact of a variant on the encoded mRNA and protein is only predicted from the DNA sequence (López-Bigas et al., 2005). Only in a few cases have the effects of variants been experimentally confirmed at both the DNA and RNA levels.
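The percentages quoted above can be re-derived from the per-class counts. The short sketch below does only that arithmetic; it assumes the missense/non-sense count is 64, the numerator given in the "64/83" fraction, and the dictionary keys are illustrative labels rather than official HGMD category names.

```python
# Counts per variant class as quoted in the text (HGMD, 83 variants in total).
# Assumption: the missense/non-sense count is 64, taken from the "64/83" fraction.
counts = {
    "missense/non-sense": 64,
    "splicing": 5,
    "small deletions": 7,
    "small insertions": 3,
    "small indels": 1,
    "gross deletions": 3,
}

total = sum(counts.values())

# Percentage of all variants in each class, rounded to the nearest integer.
percentages = {name: round(100 * n / total) for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name}: {n}/{total} = {percentages[name]}%")
```

With these counts the classes sum to 83 and the rounded shares match the figures in the text.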
Generally, exonic point variants are classified as missense, synonymous, silent, or non-sense variants. However, certain point variants cause abnormal precursor-mRNA (pre-mRNA) splicing, a key step in gene expression, and this has been associated with the pathogenesis of various disorders (Takeuchi et al., 2015; Gonzalez-Paredes et al., 2016; Shao et al., 2018; Han et al., 2019; Wang et al., 2020). Pre-mRNA splicing can be changed by point variants that disrupt canonical splice sites (5′ donor site, 3′ acceptor site and branch site) or the polypyrimidine tract (Gonzalez-Paredes et al., 2014), or that create or inactivate sequences that regulate splicing, such as exonic splicing enhancers/silencers (ESEs/ESSs) or intronic splicing enhancers/silencers (ISEs/ISSs) (Cartegni et al., 2002; Gonzalez-Paredes et al., 2014; Takeuchi et al., 2015). Furthermore, the substitution of some nucleotides in intronic and exonic regions can create or activate cryptic splice sites, which may alter the final configuration of the mRNA (Gaildrat et al., 2010).
The ideal experimental method to identify splicing alterations is to analyze RNA from patients. However, in many cases this type of sample is not available, or it has been obtained in ways that cannot ensure its stability. Alternatively, minigene analysis has become an approach for initially assessing whether a particular variant affects pre-mRNA splicing. In a previous study, we used this method to assess the consequences of presumed missense SLC12A1 variants on splicing and confirmed that one missense variant actually caused abnormal splicing.
Since the effects of a large number of described SLC5A2 exonic variants on pre-mRNA have not been studied, we hypothesized that some SLC5A2 variants reported as missense or synonymous can change the splicing process by modifying splice sites or splicing regulatory sequences present in the pre-mRNA. Using bioinformatics tools and minigene assays, this study may provide new insights into the functional consequences of previously described SLC5A2 exonic point variants on pre-mRNA splicing.

MATERIALS AND METHODS
The nomenclature of variants followed the guidelines of the Human Genome Variation Society. Nucleotide numbering was based on the SLC5A2 cDNA sequence (GenBank accession number NM_003041.4), with c.1 representing the first position of the translation initiation codon.

In silico Prediction and Screening Criteria
All SLC5A2 missense variants were selected from the Human Gene Mutation Database (March 2019) and the literature (Dhayat et al., 2016; Yu et al., 2016a,b; Gong et al., 2017; Wang et al., 2017), including variants identified in our patients (Zhao et al., 2016; Wang et al., 2019). These variants were analyzed with online bioinformatics software to determine their possible effects on pre-mRNA processing. To analyze the potential effect of a variant on the consensus 5′ donor or 3′ acceptor site and to predict the generation and/or activation of novel sites, in silico analysis with BDGP was performed. Human Splicing Finder version 3.1 (HSF) was used to investigate the possible impact of putative missense alterations on splicing regulatory sequences, such as ESEs and ESSs.
In this study, we selected SLC5A2 variants for experimental analysis according to the following criteria: (1) location close to the 5′ or 3′ end of an exon; (2) a predicted effect of the variant on exonic splicing regulatory elements (disruption of ESEs or creation of new ESSs).
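The two screening criteria can be expressed as a simple filter. The sketch below is illustrative only: the thresholds (within two bases of an exon end; more than five disrupted ESEs plus gained ESSs) are those reported in the Results, while the `Variant` record, its field names, and the three example entries are hypothetical and do not correspond to real HSF output.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    dist_to_exon_end: int  # 1 = first or last base of the exon (assumed convention)
    ese_broken: int        # number of ESE motifs predicted to be disrupted
    ess_created: int       # number of ESS motifs predicted to be created


def is_candidate(v: Variant, max_dist: int = 2, motif_threshold: int = 5) -> bool:
    """Return True if the variant meets either screening criterion."""
    near_splice_site = v.dist_to_exon_end <= max_dist
    alters_regulatory_motifs = (v.ese_broken + v.ess_created) > motif_threshold
    return near_splice_site or alters_regulatory_motifs


# Hypothetical records, for illustration only.
variants = [
    Variant("var_A", dist_to_exon_end=1, ese_broken=0, ess_created=0),
    Variant("var_B", dist_to_exon_end=40, ese_broken=4, ess_created=3),
    Variant("var_C", dist_to_exon_end=25, ese_broken=1, ess_created=1),
]

candidates = [v.name for v in variants if is_candidate(v)]
print(candidates)
```

Here `var_A` passes on position alone, `var_B` passes on motif changes, and `var_C` meets neither criterion.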

Amplification of SLC5A2 Genomic Fragments
Genomic DNA was extracted from peripheral blood leukocytes of healthy controls with the GenElute blood genomic DNA extraction kit (Sigma, NA2010) according to the manufacturer's instructions. For the in vitro splicing assay, the target exons, together with approximately 50-200 nucleotides of flanking shortened intronic sequence, were amplified using specific oligonucleotide primers carrying XhoI and NheI restriction sites (XhoI: CCGC^CTCGAG; NheI: CTAG^CTAGC). Both edges of the shortened introns were designed with the HSF program so as to avoid the activation of cryptic splice sites. The primers were designed with the web-based tool Primer-BLAST and are listed in Supplementary Table 1. DNA extraction from the healthy control was performed with the complete understanding and written consent of the subject and was approved by the Ethics Committee of the Affiliated Qingdao Municipal Hospital of Qingdao University prior to participation in the study.

Minigene Constructions and Site-Directed Mutagenesis
PCR fragments were purified with a Gel Extraction kit (Cwbio, China). Purified products and the pSPL3 exon trapping vector were separately digested with the restriction enzymes XhoI and NheI (XhoI: CCGC^TCGAG; NheI: CTAG^CTAGC). Ligation reactions were performed using 0.2 U of T4 DNA ligase (Takara, Japan) with overnight incubation at 16 °C. The vector with the cloned insert was then transformed into DH5α competent E. coli cells, which were grown in Luria-Bertani broth and spread evenly on IPTG/X-gal (Invitrogen, United States) coated ampicillin-Luria-Bertani agar plates for 16 h at 37 °C (Zhang et al., 2018). Plasmids were extracted from the selected single colonies using the PurePlasmid Mini Kit (Cwbio, China). Minigenes were then sequenced using forward and reverse primers. Chromas 2.31 and Vector NTI Advance 10 were used for sequence analysis and alignment.
Variants of interest were introduced into SLC5A2 exons with the QuikChange II Site-Directed Mutagenesis Kit (Stratagene, La Jolla, CA, United States) following the manufacturer's recommendations. Mutagenesis primers were designed using PrimerX (Supplementary Table 2). Primer extension and PCR amplification were performed as follows: initial denaturation at 95 °C for 30 s, followed by 33 cycles of denaturation at 95 °C for 30 s, annealing at 62-53 °C for 30 s and elongation at 72 °C for 7 min, and a final extension at 72 °C for 5 min. To confirm the presence of the target variants, all products were verified by direct sequencing.

Minigene Splicing Assay
Human embryonic kidney 293T (HEK293T) and HeLa cells were cultured in DMEM with high glucose (4.5 g/L) supplemented with 10% fetal bovine serum and incubated at 37 °C in a 5% CO2 incubator. One day before transfection, cells were seeded on 24-well plates to grow to 70-80% confluence in antibiotic-free medium. Each group of minigenes (empty pSPL3 control, wild-type and mutant) was transfected into HEK293T and HeLa cells with Lipofectamine 2000 (Invitrogen, United States) following the manufacturer's instructions. Forty-eight hours after transfection, total RNA was extracted with TRIzol reagent (Invitrogen, United States). First-strand cDNA synthesis was carried out by random-primed reverse transcription using the PrimeScript 1st Strand cDNA Synthesis kit (Takara, Japan) (Shao et al., 2018). The resulting cDNA was amplified by PCR using vector-specific primers: SD6 (forward primer: 5′-TCTGAGTCACCTGGACAACC-3′) and SA2 (reverse primer: 5′-ATCTCAGTGGTATTTGTGAGC-3′). The PCR was performed in a 50 µL volume containing 2 µL of cDNA, 10 µL of 5 × PrimeSTAR Buffer (TaKaRa, Japan), 1 µM of each primer, 0.8 µM dNTPs and 0.5 µL of PrimeSTAR HS DNA Polymerase (TaKaRa, Japan) in a 9700 thermal cycler (Applied Biosystems, United States). Thermal conditions were an initial 30 s at 98 °C, followed by 29 cycles of 98 °C for 30 s, 58 °C for 30 s and elongation at 72 °C for 90 s, and a final elongation step at 72 °C for 10 min. PCR products were resolved by electrophoresis on a 1.5% agarose gel and each band signal was quantified with Quantity One software (Bio-Rad, United States). All transcripts were analyzed by DNA sequencing. The online Basic Local Alignment Search Tool was used to compare DNA sequences with the reference SLC5A2 sequence (GenBank accession number NM_003041.4).
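As a reading aid, the RT-PCR thermal profile described above can be written down as a small data structure. This is only a sketch of the stated temperatures and hold times; the variable and function names are illustrative, and ramp times between steps are ignored, so the figure printed is the programmed hold time rather than the true instrument run time.

```python
# RT-PCR thermal profile from the text: (step label, temperature in °C, seconds).
INITIAL = [("initial denaturation", 98, 30)]
CYCLE = [
    ("denaturation", 98, 30),
    ("annealing", 58, 30),
    ("elongation", 72, 90),
]
FINAL = [("final elongation", 72, 10 * 60)]
N_CYCLES = 29


def hold_seconds(steps):
    """Sum the hold times of a list of (label, temperature, seconds) steps."""
    return sum(seconds for _, _, seconds in steps)


def total_seconds(initial, cycle, n_cycles, final):
    """Total programmed hold time, ignoring ramping between temperatures."""
    return hold_seconds(initial) + n_cycles * hold_seconds(cycle) + hold_seconds(final)


runtime_s = total_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"programmed hold time: {runtime_s} s ({runtime_s / 60:.0f} min)")
```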
The percentage of abnormal splicing was calculated densitometrically as the percentage of exclusion (%) = (lower band/[lower band + upper band]) × 100. Error bars represent SEM (n = 3). * P < 0.05, unpaired Student's t-test.
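The quantification described above can be sketched in a few lines. The example computes the percentage of exclusion from lower-band and upper-band intensities, the SEM across three replicates, and the unpaired Student's t statistic (pooled variance). The band intensities are hypothetical numbers chosen for illustration, not measurements from this study, and the critical value 2.776 is the standard two-sided 5% threshold for 4 degrees of freedom (n = 3 per group).

```python
import math
from statistics import mean, stdev


def percent_exclusion(lower_band: float, upper_band: float) -> float:
    """Percentage of exclusion = lower / (lower + upper) x 100."""
    total = lower_band + upper_band
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * lower_band / total


def sem(values) -> float:
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / math.sqrt(len(values))


def student_t(a, b) -> float:
    """Unpaired Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))


# Hypothetical (lower, upper) band intensities for three replicates each.
wt = [percent_exclusion(lo, up) for lo, up in [(14, 86), (13, 87), (15, 85)]]
mut = [percent_exclusion(lo, up) for lo, up in [(66, 34), (64, 36), (68, 32)]]

T_CRIT_DF4 = 2.776  # two-sided critical t value for alpha = 0.05, df = 4
t_stat = student_t(mut, wt)

print(f"WT:  {mean(wt):.1f} ± {sem(wt):.1f} %")
print(f"Mut: {mean(mut):.1f} ± {sem(mut):.1f} %")
print(f"t = {t_stat:.2f}, significant at P < 0.05: {abs(t_stat) > T_CRIT_DF4}")
```

Comparing the statistic with a tabulated critical value keeps the sketch free of external dependencies; a statistics package would normally be used to obtain the exact P value.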

RESULTS
A total of 77 missense variants compiled in the SLC5A2 database were analyzed with the bioinformatics software. We eliminated two of these variants [c.1891G > A p.(Glu631Lys) and c.1961A > G p.(Asn654Ser)] because they were located in the last exon and therefore could not be analyzed with the minigene approach. We then selected variants located within two bases of the 5′ or 3′ end of an exon, or predicted by HSF to affect splicing regulatory elements (total number of disrupted ESEs and gained ESSs greater than 5). Finally, 9 missense variants [c.
Bioinformatic analysis with BDGP demonstrated that this variant marginally reduces the score of the WT 5′ splice site from 0.99 to 0.94 (Table 1) , located in exon 8 were predicted by HSF to alter ESEs (two and three, respectively) and create four ESSs. To verify whether these variants affected mRNA splicing, we also carried out minigene splicing experiments in vitro. Two fragments were detected in the RT-PCR products of both the WT and the mutant minigenes. Direct sequencing of all products showed that the larger amplicons were the exon-included transcripts and the smaller amplicons were the exon-excluded transcripts (Figures 2B,C). Analysis of cDNA prepared from HEK293T and HeLa cells revealed that the proportion of the exon 4-skipping transcript for c.305C > T was significantly decreased compared with the control plasmid (74.5 versus 31.8% in HEK293T), whereas there was a significant increase in exon 8 skipping for c.886G > C, c.932A > G and c.962A > G (13.9 versus 66.0%; 13.9 versus 47.7% and 13.9 versus 54.5%, respectively) (Figure 2D). These data strongly suggest that these exonic variants disturbed normal splicing in vitro.

DISCUSSION
Pre-mRNA splicing is a key process in eukaryotic gene expression in which introns are removed and exons are ligated successively. This process is carried out by a ribonucleoprotein complex called the spliceosome, which interacts with specific RNA sequences at exon-intron boundaries to precisely and efficiently control intron removal and exon inclusion and to produce correct mature mRNAs (Baralle and Baralle, 2018). Misrecognition of exon-intron boundaries or failure to remove introns generates aberrant mRNAs that either encode faulty proteins or are degraded. In mature mRNAs, exon inclusion depends on intrinsic regulatory sequences. Variants within these cis-motifs can disrupt the splicing process and induce disease phenotypes in humans. The substantial contribution of splicing variants underscores the need to characterize variants at the mRNA level, because exonic variants away from the canonical GT-AG splice sites could easily be misclassified as missense variants if only the DNA is examined. In fact, it has been estimated that approximately 25-50% of exonic mutations cause disease by affecting normal pre-mRNA splicing (López-Bigas et al., 2005; Brierley and Steensma, 2016).
The purpose of this research was to evaluate the effect of SLC5A2 exonic variants associated with FRG on the splicing process using minigene systems and bioinformatics tools. We assumed that some SLC5A2 variants initially described as missense alterations could also affect pre-mRNA splicing. As far as we know, no such research has been reported for the SLC5A2 gene. Therefore, we constructed pSPL3 minigene reporter vectors to determine whether an exonic variant affects splicing efficiency. The minigene, which comprises a conventional expression system with two cassette exons (SD6 and SA2), is used to analyze the resultant mRNA transcripts. It mainly generates two transcripts: one is composed of exon SD6, the inserted exon and exon SA2 (upper band), and the other is composed only of exons SD6 and SA2 (lower band) (Figure 2A). After the minigenes carrying the targeted variants were transfected into HEK293T and HeLa cells, total RNA was extracted and reverse-transcribed into cDNA. As a result, all missense variants studied changed normal splicing, and 6 of them caused exon skipping. However, it should be kept in mind that, although the minigene strategy is an efficient tool for detecting splicing defects, it has methodological limitations and cannot detect all splicing patterns.
Variants c.216C > A and c.294C > A were previously identified as missense variants p.(Phe72Leu) and p.(Phe98Leu), respectively (Santer, 2003; Yu et al., 2011). The variant p.(Phe72Leu) affects a highly conserved amino acid residue in transmembrane helix (TMH) 2 of SGLT2, while p.(Phe98Leu) is located in the extracellular loop between TMH 2 and TMH 3. HSF predicted that both variants influence ESE and ESS motifs. The results of our minigene assays indicated that both variants produced the same transcript lacking the entire exon 3. As has been shown in other cases, many juxtaposed regulatory sequences, including ESEs and ESSs, regulate exon usage in a combinatorial manner. They promote or inhibit the recognition of surrounding splice sites by recruiting diverse protein factors (Shao et al., 2018). In this study, we suspect that these exonic base substitutions destroy a variety of ESEs and generate multiple ESSs, causing a significant reduction in the ratio of ESEs to ESSs. Consequently, the overall strength with which the adjacent splice sites are recognized and used is markedly decreased. In addition, exon 3 has a weak 5′ splice donor site (score 0.00, assessed by BDGP, Table 1 and Figure 1). In the context of this weak splice site, the exon-intron boundary of exon 3 may not be correctly recognized without the assistance of ESEs. Joining exons 2 and 4 would delete 35 amino acids from the SGLT2 protein without altering the open reading frame. Therefore, these mutant proteins would lack part of TMH 2 and part of the extracellular loop. Of note, Yu et al. reported that an SLC5A2 alternative transcript lacking exon 3, identified in human cells, diminishes expression of SGLT2 in the apical membrane of the proximal tubules of the kidney (Yu et al., 2011). Consequently, we consider that variants c.216C > A and c.294C > A cause disease through aberrant splicing.
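Whether skipping an exon preserves the reading frame depends on whether the exon length is a multiple of three. The sketch below illustrates that rule in simplified form (it ignores changes to codons that span the new exon junction). The exon 3 length of 105 nt is an assumption inferred from the statement that 35 amino acids are lost in frame (35 × 3), not a value taken from the reference sequence, and the 100-nt exon is purely hypothetical.

```python
def skipping_consequence(exon_length_nt: int) -> str:
    """Classify the coding consequence of skipping an internal exon."""
    if exon_length_nt <= 0:
        raise ValueError("exon length must be positive")
    codons, remainder = divmod(exon_length_nt, 3)
    if remainder == 0:
        # Length divisible by three: downstream reading frame is preserved.
        return f"in-frame deletion of {codons} amino acids"
    # Otherwise the downstream reading frame is shifted.
    return f"frameshift (length leaves a remainder of {remainder} nt)"


# Assumed length for exon 3: 35 amino acids x 3 nt, inferred from the text.
print("exon 3 (assumed 105 nt):", skipping_consequence(105))
# Hypothetical exon whose length is not a multiple of three.
print("hypothetical 100-nt exon:", skipping_consequence(100))
```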
Variant c.1129G > A was identified by us and categorized as the missense variant p.(Gly377Ser); it affects the last nucleotide of exon 9. Such substitutions often have an adverse effect on the recognition of canonical splice sites by the splicing machinery (Gonzalez-Paredes et al., 2016). Bioinformatics analyses indicated that the authentic donor splice site of intron 9 may be affected. Furthermore, we demonstrated that this exonic variant disturbed normal splicing in vitro, causing exon 9 skipping. The ligation of exons 8 and 10 would result in the loss of 36 amino acids and the production of a truncated protein. As a result, this mutated SGLT2 protein would lack part of the extracellular loop, which may reduce or abolish the transport activity of SGLT2.
Variant c.305C > T p.(Ala102Val) disturbed normal splicing in the minigene assay, resulting in an increase of the exon 4-included transcript, which may be due to enhanced recognition of the 3′ splice site of intron 3. Predicting the impact of this splicing modification is difficult; however, in other genes, a small number of variant-induced increases in exon inclusion have also been described, some of which may cause significant clinical symptoms (Vezain et al., 2010; Soukarieh et al., 2016; Perdomo-Ramirez et al., 2019). The minigene assays showed that variants c.886G > C p.(Val296Leu), c.932A > G p.(Lys311Arg) and c.962A > G p.(Lys321Arg) altered normal splicing by increasing exon 8 exclusion by approximately 52, 34, and 41 percentage points compared with WT, respectively. We hypothesize that variant c.886G > C causes exon 8 skipping because it abolishes the acceptor splice site of intron 7. In addition, we speculate that the disruption of functional ESEs and/or the generation of functional ESSs may explain the aberrant transcripts lacking exon 8 produced by variants c.932A > G and c.962A > G. The complete skipping of exon 8 results in a 45 amino acid deletion (residues 296-340) with a subsequent frameshift from codon 341 and premature termination at position 371 in exon 9. Therefore, these variants probably have a double destructive effect: the deletion of exon 8 would result in a truncated SGLT2 protein lacking the COOH-terminal domain from TMH 8 to TMH 14, while the remaining mRNA would still encode the amino acid alteration.
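The approximate increases in exon 8 exclusion quoted in this paragraph follow directly from the exclusion percentages reported in the Results (13.9% for WT versus 66.0, 47.7 and 54.5% for the three mutants). A minimal sketch of that subtraction is shown below; the dictionary layout and names are illustrative.

```python
# Exon 8 exclusion percentages as reported in the Results.
WT_EXON8_EXCLUSION = 13.9
MUTANT_EXON8_EXCLUSION = {
    "c.886G>C": 66.0,
    "c.932A>G": 47.7,
    "c.962A>G": 54.5,
}

# Increase over WT in percentage points, rounded to the nearest integer.
increase_pp = {
    variant: round(pct - WT_EXON8_EXCLUSION)
    for variant, pct in MUTANT_EXON8_EXCLUSION.items()
}

for variant, delta in increase_pp.items():
    print(f"{variant}: +{delta} percentage points versus WT")
```

These are absolute differences in percentage points, not relative fold changes.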
In this study, seven of the nine SLC5A2 variants were tested via minigene assays. From the results, HSF 3.1 seems to be suitable for predicting the effect of SLC5A2 exonic mutations on pre-mRNA splicing. Although several studies have shown almost 100% concordance between the results obtained from the analysis of patients' RNA and those from cells transfected with minigenes (Steffensen et al., 2014; van der Klift et al., 2015; Nakanishi et al., 2017; Perdomo-Ramirez et al., 2019), the best approach to determine whether a nucleotide substitution or allelic variant affects splicing is to assay splicing of the endogenous RNA from the relevant tissue of affected individuals. In addition, a limitation of this study is that we did not introduce the predicted single amino acid alterations or exon exclusions into SLC5A2 cDNA and test SGLT2 glucose transport activity and cell surface expression. Further investigation is needed to determine the functional activities of these mutant SGLT2 proteins.
In conclusion, using bioinformatics tools and minigenes, we found that 7 previously presumed missense SLC5A2 variants altered pre-mRNA splicing. Variants c.216C > A, c.294C > A, c.886G > C, c.932A > G and c.962A > G may disrupt splicing enhancer motifs and generate splicing silencer sequences, resulting in skipping of exon 3 (c.216C > A and c.294C > A) or exon 8 (c.886G > C, c.932A > G and c.962A > G). Variants c.305C > T and c.1129G > A probably affect a 3′ acceptor and a 5′ donor splice site, respectively, leading to increased exon 4 inclusion and to exon 9 skipping. To our knowledge, this is the first report of SLC5A2 exonic variants affecting pre-mRNA splicing, and we propose that these variants should be categorized as splicing variants. Furthermore, these findings emphasize the need to evaluate the impact of missense variants on mRNA in FRG; when patients' RNA samples are unavailable, a minigene assay may be a valuable tool for assessing the impact of SLC5A2 exonic variants on pre-mRNA splicing.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Affiliated Qingdao Municipal Hospital of Qingdao University. The participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
SW conceived and designed the experiments and wrote the manuscript. YW and JW performed the experiments. ZL, RZ, XS, YH, IB, and WG were involved in the data analysis. LS revised the manuscript. All authors have read and approved the final manuscript.

FUNDING
This study was funded by the National Natural Science Foundation of China (No. 81873594).

ACKNOWLEDGMENTS
We thank all subjects for their participation.