ALG1-CDG Caused by Non-functional Alternative Splicing Involving a Novel Pathogenic Complex Allele

This study reports on a Mexican mestizo patient with a multi-systemic syndrome including neurological involvement and a type I serum transferrin profile. Clinical exome sequencing revealed complex alleles in ALG1, the gene encoding the chitobiosyldiphosphodolichol beta-mannosyltransferase that participates in the formation of dolichol-pyrophosphate-GlcNAc2Man5, a lipid-linked glycan intermediate in N-glycan synthesis. The identified complex alleles were NM_019109.5(ALG1): c.[208 + 16_208 + 19dup; 208 + 25G > T] and NM_019109.5(ALG1): c.[208 + 16_208 + 19dup; 1312C > T]. Both alleles carried the benign variant c.208 + 16_208 + 19dup. One allele also carried a known ALG1 pathogenic variant (c.1312C > T), while the other carried a new uncharacterized variant (c.208 + 25G > T) causing non-functional alternative splicing that, in conjunction with the benign variant, defines the pathogenic protein effect (p.N70S_S71ins9). The presence in the patient’s serum of the pathognomonic N-linked mannose-deprived tetrasaccharide marker for ALG1-CDG (Neu5Acα2,6Galβ1,4-GlcNAcβ1,4GlcNAc) further supported this diagnosis. This is the first report of an ALG1-CDG patient from Latin America.


INTRODUCTION
Approximately 140 inborn errors of metabolism have been classified as congenital disorders of glycosylation (CDG), a rapidly expanding group of diseases caused by defects in the synthesis and attachment of glycans to glycoproteins and glycolipids (Ondruskova et al., 2021). ALG1-CDG is a subtype with severe multiorgan involvement (OMIM 608540) caused by pathogenic variants in ALG1. This gene encodes a transmembrane chitobiosyldiphosphodolichol beta-mannosyltransferase that participates in the first steps of N-glycan biosynthesis, which occur on the cytosolic side of the endoplasmic reticulum and lead to the synthesis of dolichol-pyrophosphate-GlcNAc2Man5, an intermediate of the lipid-linked precursor oligosaccharide that is subsequently transferred to nascent glycoproteins for protein N-glycosylation. ALG1 extends GlcNAc2-PP-dolichol by adding the first mannose in β1,4 linkage, using GDP-mannose as the donor substrate, to generate Man1GlcNAc2-PP-dolichol (Couto et al., 1984). ALG1 is part of a multi-mannosyltransferase complex, comprising ALG1, ALG2, and ALG11, that carries out a five-reaction process (Gao et al., 2004; O'Reilly et al., 2006; Aebi, 2013).

PATIENT REPORT
The patient was a 15-year-old male of Mexican ancestry with intellectual disability, seizures, hypotonic quadriplegia, and disproportionate microcephaly (height in normal centile). He was born in Tabasco State (in the southern part of Mexico). No consanguinity or endogamy was reported. The patient's parents and an older sister were healthy, but another sister had died at 15 years of age with similar manifestations. The mother reported a history of a first-trimester spontaneous abortion. There was no other relevant family history (Figure 1). The patient was the product of a naturally conceived, 38-week singleton pregnancy to a 33-year-old mother and a 37-year-old father; the pregnancy was uneventful, with two normal prenatal ultrasounds. Delivery was via emergency cesarean section for prolonged rupture of membranes. At birth, weight was 3,100 g [46th centile; −0.1 standard deviation scores (SDS)] and length was 50 cm (50th centile; 0.5 SDS); additional birth parameters and APGAR score were unknown. He did not require respiratory support or supplementary interventions and left the hospital 2 days after delivery. He was hospitalized for 30 days at 9 months of age due to seizures. During this hospitalization he was first noted to have some developmental delay; the mother reported the loss of some abilities after the onset of seizures.

RESULTS
Based on the clinical phenotype, electrospray ionization mass spectrometry (ESI-MS) analysis of serum transferrin (Tf) was performed and a type I profile was established (mono-oligo/di-oligo = 0.90) (Figure 3A). To determine the genetic basis of the disease, a skin biopsy was performed to obtain a fibroblast culture. The genomic DNA (gDNA) obtained from the patient's fibroblasts was sent for clinical exome sequencing (CES), and three ALG1 variants were found in the form of two complex alleles. The first complex allele was composed of the benign variant NM_019109.4(ALG1): c.208 + 16_208 + 19dup in cis with the known pathogenic variant NM_019109.5(ALG1):c.1312C > T (p.R438W), defined as NM_019109.5(ALG1): c.[208 + 16_208 + 19dup; 1312C > T]. The second complex allele was composed of the same benign variant NM_019109.4(ALG1): c.208 + 16_208 + 19dup in cis with a previously undescribed variant, c.208 + 25G > T. Sequencing of the parents' gDNA showed that the c.[208 + 16_208 + 19dup; 1312C > T] complex allele was inherited from the mother and the c.[208 + 16_208 + 19dup; 208 + 25G > T] complex allele from the father (Figure 3B). The Sanger sequencing of the mother's ALG1 shows overlapping traces that result from heterozygosity for the benign c.208 + 16_208 + 19dup variant, in contrast to the father, who is homozygous.
The c.208 + 25G > T variant was considered potentially pathogenic because it could induce a new donor splice site (GG to GT) and cause non-functional alternative splicing. Using the Human Splicing Finder prediction (HSFPro, Genomnis), it was found that this change significantly alters splicing with the following values [WT-Mut%variation]. To further investigate the presence of alternative splicing, the patient's complementary DNA (cDNA) was synthesized and ALG1 was amplified by polymerase chain reaction (PCR). ALG1 has 13 exons and codes for a 464-amino-acid protein. Amplification of ALG1 with primers ALG1s and ALG1as encompasses the coding sequence plus short stretches of the 5′- and 3′-UTR, for a predicted amplicon of 1517 bp. PCR product analysis in an agarose gel showed a slightly thicker band in the patient compared to the healthy control (Figure 3C). Because this suggested potential splicing isoforms in the patient, subcloning and screening were performed. Three types of isoforms were identified: (1) an isoform with the c.1312C > T variant and constitutive splicing (c.1312C > T; p.R438W); (2) an isoform with the same variant (shifted to position 1201) and exon 10 skipping, c.[del_962-1072;1201C > T], translated as p.[K322_G358del; R401W]; and (3) an isoform without the c.1312C > T variant that showed partial intron 1 retention (+ 27 bp) without a frameshift and that, when translated, theoretically results in the substitution N70S and the insertion of nine amino acids, EWPRVCLGD (p.N70S_S71ins9) (Figures 4A-C).
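The coordinates reported for the three isoforms can be cross-checked with simple arithmetic. The following Python sketch is only an illustration: it uses the numbers given in the text (464 amino acids, 1517 bp amplicon, deletion of c.962-1072, + 27 bp retention), while the function name, the variable names, and the derived amplicon and protein sizes are assumptions introduced here and were not measured in this study.

```python
# Values taken from the text; all names below are illustrative.
PROTEIN_LENGTH_AA = 464   # wild-type ALG1 protein length
AMPLICON_BP = 1517        # predicted cDNA amplicon with ALG1s/ALG1as
RETAINED_INTRON_BP = 27   # partial intron 1 retention in the third isoform


def in_frame_deletion(start, end):
    """Return (nucleotides, codons) removed by deleting c.start..c.end.

    Raises ValueError if the deletion would shift the reading frame.
    """
    deleted_nt = end - start + 1
    if deleted_nt % 3 != 0:
        raise ValueError("deletion is not a multiple of 3 (frameshift)")
    return deleted_nt, deleted_nt // 3


# Isoform 2: exon 10 skipping, c.del_962-1072
deleted_nt, deleted_aa = in_frame_deletion(962, 1072)
shifted_variant_pos = 1312 - deleted_nt                 # new position of c.1312C>T
shifted_residue = 438 - deleted_aa                      # new position of R438W
skipped_protein_aa = PROTEIN_LENGTH_AA - deleted_aa
skipped_amplicon_bp = AMPLICON_BP - deleted_nt

# Isoform 3: partial intron 1 retention (+27 bp, in frame)
inserted_aa = RETAINED_INTRON_BP // 3
retention_amplicon_bp = AMPLICON_BP + RETAINED_INTRON_BP
retention_protein_aa = PROTEIN_LENGTH_AA + inserted_aa

print(deleted_nt, deleted_aa, shifted_variant_pos, shifted_residue)
print(skipped_protein_aa, skipped_amplicon_bp)
print(inserted_aa, retention_amplicon_bp, retention_protein_aa)
```

Under these assumptions, the 111-nt deletion corresponds to 37 codons, which agrees with the shift of the variant to position 1201 and of the residue to R401W. A 27 bp difference on a 1517 bp amplicon would be hard to resolve in a standard agarose gel, which fits the slightly thicker band that was observed.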

DISCUSSION
This report presents a patient with a severe multisystemic clinical phenotype with neurological involvement and an ESI-MS Tf type I profile consistent with CDG. CES revealed two complex alleles: NM_019109.5(ALG1): c.[208 + 16_208 + 19dup; 1312C > T], inherited from the mother, and NM_019109.5(ALG1): c.[208 + 16_208 + 19dup; 208 + 25G > T], inherited from the father. We considered that the latter could cause non-functional alternative splicing, as predicted by HSFPro (Genomnis). PCR-based analysis of the patient's ALG1 amplicon further supported this hypothesis (Figure 3C).
To determine the effects of the complex alleles, the patient's ALG1 PCR amplicon was subcloned and screened, identifying three types of isoforms (Figure 4A), two derived from the c.[208 + 16_208 + 19dup; 1312C > T] complex allele and the third isoform with partial intron 1 retention (+ 27 bp) that did not present the c.1312C > T variant and that we conclude is derived from the c.[208 + 16_208 + 19dup; 208 + 25G > T] complex allele.
Regarding the isoforms derived from the c.[208 + 16_208 + 19dup; 1312C > T] complex allele, inherited from the mother, one presented constitutive splicing (p.R438W) and the other an unexpected alternative splicing (exon 10 skipping) that translates into a 427 aa protein lacking 37 amino acids, p.[K322_G358del;R401W]. Because no variants were found in the intron/exon junctions of exon 10, we consider that the c.1312C > T variant located in exon 13 could be responsible for altering distant splicing events, although further analysis is required to reach a definitive conclusion. The c.1312C > T pathogenic variant included in this complex allele has been reported in six patients as compound heterozygotes with the pathogenic variants c.866A > G, c.450C > A, c.1145T > C, and c.1236A > G, and has been demonstrated to have reduced function (Dupré et al., 2010; Rohlfing et al., 2014; Ng et al., 2016; Zhang et al., 2016). No deleterious effects due to the benign intronic variant were observed. For the third isoform, without the c.1312C > T variant and with partial intron 1 retention (+ 27 bp), we conclude that it results from the paternally inherited complex allele c.[208 + 16_208 + 19dup; 208 + 25G > T], in which the c.208 + 25G > T variant induces a new donor splice site that causes the partial intron 1 retention. This retention does not cause a frameshift but results in an amino acid substitution (N70S) and the insertion of nine amino acids (EWPRVCLGD) (p.N70S_S71ins9) (Figure 4C). It is important to note that the duplication of the TCTG bases in the c.208 + 16_208 + 19dup variant (chr16:5122072) shifts the c.208 + 25G > T position to + 29 bp from the donor splice site of exon 1, which determines the length of the intron retention and the preservation of the ORF (Figure 4C).
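The dependence of the reading frame on the benign duplication can be illustrated with a short calculation. This Python sketch assumes that the substituted base becomes the T of the new GT donor dinucleotide (consistent with the GG to GT change described in the Results), so that the retained segment ends at the base immediately before the new G. The function names are hypothetical and are used only for illustration.

```python
DUP_LENGTH = 4  # TCTG duplicated by c.208+16_208+19dup, upstream of +25


def retained_intron_bases(variant_offset, upstream_dup=0):
    """Intronic bases kept in the transcript when the variant creates a GT donor.

    variant_offset: intronic position of the G>T change (e.g. 25 for c.208+25).
    upstream_dup:   bases inserted between the exon and the variant.
    Assumes the mutated base is the T of the new GT dinucleotide.
    """
    t_offset = variant_offset + upstream_dup  # position of the new T
    g_offset = t_offset - 1                   # position of the G before it
    return g_offset - 1                       # bases +1 .. (G - 1) are retained


def keeps_reading_frame(n_bases):
    """An insertion preserves the ORF only if it is a multiple of 3."""
    return n_bases % 3 == 0


with_dup = retained_intron_bases(25, DUP_LENGTH)
without_dup = retained_intron_bases(25)

print(with_dup, keeps_reading_frame(with_dup))
print(without_dup, keeps_reading_frame(without_dup))
```

Under this assumption, the duplication places the new T at + 29 and yields a 27-base in-frame retention (nine codons), whereas without the duplication the retention would be 23 bases and would shift the reading frame.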
We consider that the resulting p.N70S_S71ins9 has reduced function based on the different physicochemical properties at position 70 [amide (N) to hydroxylic (S)] as well as the insertion of nine amino acids, including a proline (P) residue. Also, a previously reported p.S71F variant has been shown to have reduced function.
The presence of the ALG1-CDG pathognomonic N-linked mannose-deprived tetrasaccharide detected by ESI-QTOF when analyzing serum total N-glycans further supports the conclusion that the resulting protein variant p.N70S_S71ins9 does not have a normal function as it is not able to compensate for the decreased function of the p.R438W pathogenic variant. The fact that the N-tetrasaccharide was not detected by ESI-MS of serum Tf (Figure 3A) could be explained by differential sensitivity related to types of equipment and techniques, as well as a possible increased sensitivity to detect the N-tetrasaccharide when analyzing total N-glycans versus Tf glycans alone.
It is noteworthy that the c.208 + 16_208 + 19dup variant is considered benign, but that the additional presence of the c.208 + 25G > T variant results in a pathogenic complex allele, c.[208 + 16_208 + 19dup; 208 + 25G > T]. According to the gnomAD database (Lek et al., 2016), the variant c.208 + 16_208 + 19dup is very frequent in all ethnic groups (gnomAD exomes, version 2.1.1, global frequency f = 0.558, as of July 2021). The c.208 + 25G > T variant is absent from the genomes and exomes in the gnomAD database (as of July 2021). Interestingly, in the absence of c.208 + 16_208 + 19dup, the c.208 + 25G > T variant would cause a frameshift and a premature stop codon in exon 2, which would also be pathogenic and probably more severe.
Given these results, the ALG1-CDG diagnosis was clinically, biochemically, and genetically established. In most disease-related genes, variants affecting splicing are not fully characterized because variant screening is restricted to gDNA. In our experience with ATP6V0A2-CDG and, more recently, PMM2-CDG, amplification of cDNA transcripts is an invaluable tool to demonstrate non-functional alternative splicing, identify the consequence for the protein, and establish the pathogenicity mechanism (Bahena-Bahena et al., 2014; González-Domínguez et al., 2020). Of the six reported disease-causing mutations in HGMD that potentially affect splicing of ALG1, only one has been experimentally confirmed (Table 1). The c.208 + 25G > T variant reported in this work would be the second variant confirmed to cause non-functional alternative splicing.
This case also highlights the importance of increasing awareness and availability of technological platforms for the biochemical and genetic diagnosis of rare diseases. This patient and his family had to wait 15 years to obtain a diagnosis, and a sibling died at the same age without one. Such a delay is not acceptable for ALG1-CDG patients, who have an estimated premature death rate of 44%, of which 65% occur at < 12 months of age. This lengthy diagnostic odyssey is unfortunately too frequent in Latin America and in most underdeveloped countries and, in the era of exome sequencing, can no longer be considered acceptable.
It is necessary that academic and family organizations around the world press for a shift in global public health policy to guarantee that all patients affected by rare diseases have access to an interdisciplinary approach as well as free exome sequencing (Manickam et al., 2021).

FIGURE 5 | Total ion chromatograms of N-tetrasaccharide (NeuAc1Gal1GlcNAc2) and Gal1 or Man1GlcNAc2 in serum N-glycan profiles from a normal control and the ALG1-CDG patient. Overlay of total ion chromatograms of N-linked mannose-deprived tetrasaccharide (A) and Man1/Gal1GlcNAc2 (B) from a normal control plasma (blue) and the ALG1-CDG patient (red). Gal1GlcNAc2 is absent in total plasma proteins from normal controls, which instead have traces of Man1GlcNAc2 present. Markedly increased tetrasaccharide was detected in the ALG1-CDG patient, as shown with a black arrow.

CONCLUSION
The complex allele c.[208 + 16_208 + 19dup; 208 + 25G > T] (p.N70S_S71ins9) is pathogenic because it causes non-functional alternative splicing of ALG1. Variants should be studied concerning their potential disruption of splicing, particularly if they affect canonical splicing sites. Increased awareness of rare diseases, including CDGs, as well as the availability of technological platforms for genetic diagnosis, must be an international standard in public health policy.

Informed Consent
Informed consent was obtained from both parents to perform a skin biopsy, fibroblast cultures, and all required research to obtain a molecular diagnosis and to publish other data on the patient and parents. All procedures followed were in accordance with national and institutional ethical standards on human experimentation and with the Helsinki Declaration of 1975, as revised in 2000.

Electrospray Ionization Mass Spectrometry (ESI-MS) Analysis of Serum Transferrin (Tf)
On-column immunoaffinity ESI-MS analysis of serum Tf isoforms was performed using an API-5000 triple quadrupole mass spectrometer (Applied Biosystems/MDS Sciex, Foster City, CA, United States).

Cell Culture
From the patient's skin biopsy, a primary culture of fibroblasts was obtained in AmnioMAX™ C-100

Predictions of the Pathogenicity of the Variants
The HSFpro software (Genomnis; Desmet et al., 2009) was used to predict the effect of variants on splicing. All predictions were made with the DYSF transcript ENST00000268261.

Polymerase Chain Reaction (PCR) and Sanger Sequencing
Total mRNA from the patient's fibroblasts was obtained using TRIzol reagent (Life Technologies) and cDNA was synthesized using M-MLV Reverse Transcriptase (Life Technologies). The cDNA-based PCR product corresponding to the coding sequence of ALG1 was obtained using forward primer ALG1s 5′-TGACTGCTGCGGGCCAG-3′ and reverse primer ALG1as 5′-CACTGGGAGGTGCTGCTCG-3′. In the case of the patient, the amplicon was isolated in a low-melting-point agarose gel, purified, subcloned, and screened for alternative splicing. The inheritance of variants was determined by analyzing the patient's and parents' gDNA using forward primer ALG1s and reverse primer ALG1gas 5′-CTAAAGGAGCACTTCCGCC-3′ for the c.208 + 25G > T pathogenic variant, and forward primer ALG1-E13s 5′-CAGGCAATGAGGTAAGCTCTG-3′ and reverse primer ALG1-E13as 5′-CAATTCTTTTACCAGGCAGTACC-3′ for the c.1312C > T pathogenic variant. Sequencing was performed with an ABI Prism 3130xl autoanalyzer (Applied Biosystems, Foster City, CA, United States), and results were visualized using SnapGene Viewer 2.2.2 (GSL Biotech LLC, Chicago, IL, United States).
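For readers who wish to inspect the primers listed above, the following Python sketch computes their length, GC fraction, reverse complement, and a rough melting temperature. The primer sequences are copied from the text. The helper functions are illustrative and were not part of the study's workflow, and the Wallace rule (2 °C per A/T, 4 °C per G/C) is a coarse rule of thumb intended for short oligonucleotides, used here only as an assumption for a quick comparison.

```python
# Primer sequences (5'->3') as listed in the Methods.
PRIMERS = {
    "ALG1s": "TGACTGCTGCGGGCCAG",
    "ALG1as": "CACTGGGAGGTGCTGCTCG",
    "ALG1gas": "CTAAAGGAGCACTTCCGCC",
    "ALG1-E13s": "CAGGCAATGAGGTAAGCTCTG",
    "ALG1-E13as": "CAATTCTTTTACCAGGCAGTACC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule (approximation)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2),
          wallace_tm(seq), reverse_complement(seq))
```

The values printed are estimates only; annealing conditions actually used in the study are not stated in this section.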

DATA AVAILABILITY STATEMENT
The novel pathogenic variant observed in this study has been deposited in the ClinVar database with the accession SCV001761655.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of the Sociedad Latinoamericana de Glicobiología. Written informed consent to participate in this study was provided by the participants' parents.