NOTCH3 Variants and Genotype-Phenotype Features in Chinese CADASIL Patients

Cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL) is a cerebral small vessel disease caused by mutations in the NOTCH3 gene. Archetypal disease-causing mutations are cysteine-affecting variants within the 34 epidermal growth factor-like repeat (EGFr) region of the Notch3 extracellular subunit. Cysteine-sparing variants and variants outside the EGFr coding region associated with the CADASIL phenotype have been reported. However, the link between untypical variants and CADASIL is unclear. In this study, we investigated the spectrum of NOTCH3 variants in a cohort of 38 probands from unrelated families diagnosed with CADASIL. All coding exons of the NOTCH3 gene were analyzed, and clinical data were retrospectively studied. We identified 23 different NOTCH3 variants, including 14 cysteine-affecting pathogenic variants, five cysteine-sparing pathogenic variants, two reported cysteine-sparing variants of unknown significance (VUS), and two novel VUS outside the EGFr region. In the retrospective analysis of clinical data, we found that patients carrying cysteine-sparing pathogenic variants showed later symptom onset (51.36 ± 7.06 vs. 44.96 ± 8.82, p = 0.023) and milder temporal lobe involvement (1.50 ± 1.74 vs. 3.11 ± 2.32, p = 0.027) than patients carrying cysteine-affecting pathogenic variants. Our findings suggest that untypical variants comprise a significant part of NOTCH3 variants and may be associated with a distinctive phenotype.


INTRODUCTION
Cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL, OMIM No. 125310) is a cerebral small vessel disease caused by mutations in the NOTCH3 gene (Joutel et al., 1996). Clinical manifestations include migraine, recurrent cerebrovascular events, psychiatric disturbance, and cognitive impairment that eventually lead to dementia and disability. Magnetic resonance imaging (MRI) findings are characterized by multiple lacunes and extensive white matter hyperintensity (WMH), especially in the anterior temporal lobe and external capsule (Chabriat et al., 2009). Deposition of the Notch3 extracellular domain or granular osmiophilic material (GOM) in small arteries is a pathological hallmark of CADASIL (Joutel et al., 2000). However, because of the prominent heterogeneity of clinical manifestations, genetic testing remains the gold standard for diagnosis.
The NOTCH3 gene contains 33 exons encoding the Notch3 protein, a single-pass transmembrane receptor of 2,321 amino acids. Notch3 contains a large extracellular domain (ECD) that consists of 34 epidermal growth factor-like repeats (EGFr), a transmembrane domain (TMD), and an intracellular domain (ICD) (Wang et al., 2008). To date, more than 300 NOTCH3 mutations have been reported, and most of the pathogenic mutations reside in exons 2-24 coding for EGFr 1-34 in the ECD of the Notch3 protein. Most mutations are missense mutations that change the number of cysteines in EGFr, leading to misfolding of the receptor and aggregation of the extracellular domain (Rutten et al., 2014).
Thus far, the vast majority of pathogenic mutations have been found within the EGFr region of the Notch3 extracellular domain, encoded by exons 2-24 (Rutten et al., 2014). Some other studies show that mutations after exon 24 may be associated not with the CADASIL phenotype but with other diseases, such as infantile myofibromatosis (Martignetti et al., 2013) and lateral meningocele syndrome (Gripp et al., 2015). Therefore, most previous studies in CADASIL patients analyzed the NOTCH3 gene only by sequencing exons 2-24. However, several studies reported mutations outside the EGFr coding region (exons 25-33) as possible causes of CADASIL (Jung et al., 1995; Joutel et al., 1996; Fouillade et al., 2008; Bersano et al., 2012; Hung et al., 2018). On the other hand, NOTCH3 mutations are not detected in all biopsy-confirmed CADASIL patients by sequencing the EGFr coding region (Peters et al., 2005), suggesting that variants outside the EGFr coding region require further investigation.
In this study, we investigated the spectrum of NOTCH3 variants in Chinese CADASIL patients by sequencing all coding exons of NOTCH3. We found that untypical variants comprise a significant part of NOTCH3 variants and may be associated with a distinctive phenotype.

Patients
Fifty-four probands from unrelated families with a clinical diagnosis of CADASIL were recruited by the National Clinical Research Center for Geriatric Disorders (Xiangya) between 2014 and 2020. All patients were of Han Chinese ethnicity and came from the central-south region of mainland China. The diagnosis of CADASIL was based on the following criteria: (1) positive family history of stroke, dementia, or migraine; (2) symmetrical MRI abnormalities suggestive of vascular WMH; and (3) presence of more than one typical symptom: recurrent stroke especially

Collecting Clinical Information
Data on background (gender, age at onset, age at assessment, family history, and vascular risk factors), clinical symptoms (migraine, ischemic/hemorrhagic stroke, cognitive impairment, and psychiatric symptoms), skin biopsy results, and MRI findings were collected from probands and family members. The severity of WMH was scored using the modified Scheltens scale (Scheltens et al., 1993). Scores were assigned as follows: 0 = absent; 1 = up to five lesions < 3 mm in diameter; 2 = six or more lesions < 3 mm in diameter; 3 = up to five lesions 4-10 mm in diameter; 4 = six or more lesions 4-10 mm in diameter; 5 = one or more lesions > 10 mm in diameter; and 6 = confluent hyperintensities. Cerebral microbleeds (CMBs) were counted on susceptibility-weighted imaging (SWI) sequences. CMB burden was categorized as follows: 0 = absent; 1 = up to five CMBs; and 2 = six or more CMBs. WMH rating and CMB categorization were performed by one experienced vascular neurologist who was blinded to all clinical data. Age at onset of migraine was not used as age at onset of CADASIL because migraine occurred markedly earlier than other symptoms. Risk factors included hypertension, smoking, diabetes, and hyperlipidemia.
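The two rating rules described above can be written as a short sketch. This is an illustration only, not the rating procedure used in the study, which was performed visually by a neurologist. The function names are hypothetical, and two points are assumptions not stated in the text: lesions of exactly 3 mm are treated as small and lesions above 3 mm up to 10 mm as the "4-10 mm" class, and when lesions of several sizes coexist the highest applicable score is returned.

```python
from typing import Sequence


def scheltens_lesion_score(diameters_mm: Sequence[float], confluent: bool = False) -> int:
    """Return a 0-6 score for one region from lesion diameters (mm).

    Hypothetical helper: the size boundaries at 3 mm and the
    "highest applicable score" rule are assumptions of this sketch.
    """
    if confluent:
        return 6  # confluent hyperintensities
    if not diameters_mm:
        return 0  # no lesions
    if any(d > 10 for d in diameters_mm):
        return 5  # one or more lesions > 10 mm
    # No lesion exceeds 10 mm here, so d > 3 selects the medium class.
    medium = sum(1 for d in diameters_mm if d > 3)
    if medium >= 6:
        return 4  # six or more medium lesions
    if medium >= 1:
        return 3  # up to five medium lesions
    # Only small lesions remain at this point.
    return 2 if len(diameters_mm) >= 6 else 1


def cmb_category(count: int) -> int:
    """Map a cerebral microbleed count to the 0/1/2 burden category."""
    if count < 0:
        raise ValueError("CMB count cannot be negative")
    if count == 0:
        return 0
    return 1 if count <= 5 else 2
```

For example, `cmb_category(3)` returns 1 and `scheltens_lesion_score([2.0, 5.0])` returns 3, because one lesion falls in the medium class.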

Genetic Analysis
Genomic DNA from patients and controls was extracted using the QIAamp DNA Blood Midi Kit (Qiagen, Hilden, Germany) according to the manufacturer's standard procedure. Polymerase chain reaction (PCR) was performed with primers (covering intron-exon boundaries) specific for the 33 exons of the NOTCH3 gene. Following purification of PCR products, sequencing was performed using the automated sequencer ABI 3730 (Applied Biosystems, Foster City, CA, United States). Sequencing in the opposite direction was performed whenever an abnormal sequence was found. For patients with variants outside the EGFr region, additional genetic analyses were performed by targeted next-generation sequencing to exclude mutations in several other genes known to cause cerebral small-vessel disease or vascular dementia (HTRA1, TREX1, COL4A1, COL4A2, GLA, APP, and ITM2B) (unpublished data). In addition, 400 age- and sex-matched Chinese controls were screened for variants outside the EGFr region in the NOTCH3 gene to determine the allele frequency in the Chinese population.

Bioinformatic Analyses of Untypical Variants
Four functional prediction software programs (SIFT, PolyPhen-2, MutationTaster, and GERP++) were used to predict the effect of untypical variants on protein structure, function, and sequence conservation. SIFT returns scores ranging from 0 to 1 that represent the normalized probability that a particular amino acid substitution will be tolerated. SIFT predicts that substitutions with scores less than 0.05 are damaging (Ng and Henikoff, 2001). In PolyPhen-2, a variant is classified as "probably damaging" if it has a probabilistic score greater than 0.85, and "possibly damaging" if it has a score greater than 0.15; the remaining variants are classified as benign (Adzhubei et al., 2010). The MutationTaster probability value, ranging from 0 to 1, represents the probability of the prediction, in which a value close to 1 indicates a high security of the prediction (Schwarz et al., 2014). In GERP++, a variant site is classified as "conserved" if it has a score greater than 2.00 (Davydov et al., 2010). We used the Genome Aggregation Database (gnomAD) to establish the minor allele frequency (MAF) of novel variants.
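The cut-offs quoted above can be summarized in a small sketch. It only encodes the thresholds stated in this section (SIFT < 0.05, PolyPhen-2 > 0.85 and > 0.15, GERP++ > 2.00). The function names are hypothetical, the input scores in the usage note are made up for illustration, and MutationTaster is left out because no cut-off is given in the text.

```python
def sift_label(score: float) -> str:
    """SIFT: scores below 0.05 are predicted to be damaging."""
    return "damaging" if score < 0.05 else "tolerated"


def polyphen2_label(score: float) -> str:
    """PolyPhen-2: > 0.85 probably damaging, > 0.15 possibly damaging, else benign."""
    if score > 0.85:
        return "probably damaging"
    if score > 0.15:
        return "possibly damaging"
    return "benign"


def gerp_is_conserved(score: float) -> bool:
    """GERP++: a score above 2.00 marks the site as conserved."""
    return score > 2.00


def summarize_predictions(sift: float, polyphen2: float, gerp: float) -> dict:
    """Collect the three labels for one variant (hypothetical helper)."""
    return {
        "SIFT": sift_label(sift),
        "PolyPhen-2": polyphen2_label(polyphen2),
        "GERP++": "conserved" if gerp_is_conserved(gerp) else "not conserved",
    }
```

With made-up scores, `summarize_predictions(0.01, 0.92, 4.5)` labels the variant as damaging, probably damaging, and conserved. Scores that sit exactly on a cut-off fall into the milder class here because the text specifies strict inequalities.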

Statistical Analyses
Statistical analyses were performed using SPSS version 20 software (SPSS Inc., Chicago, IL). We compared clinical features between patient groups using the Mantel-Haenszel chi-square test or Fisher's exact test for categorical variables and the unpaired t-test for continuous variables. Values of p < 0.05 were considered statistically significant.
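As an illustration of the unpaired t-test on group summaries (mean, standard deviation, and group size), the following minimal sketch computes the test statistic and degrees of freedom. It assumes the equal-variance (pooled) form of the test, which the text does not specify, and the function name is hypothetical. The analyses in this study were run in SPSS, and this sketch does not reproduce its output. The p-value would be obtained from a t distribution with the returned degrees of freedom.

```python
import math
from typing import Tuple


def pooled_t_statistic(mean1: float, sd1: float, n1: int,
                       mean2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for an equal-variance unpaired t-test."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two group means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Toy numbers only; these are not values from the study.
t, df = pooled_t_statistic(10.0, 2.0, 5, 8.0, 2.0, 5)
```

Group sizes are required inputs, so the means and standard deviations reported in the Results cannot be re-tested this way without the corresponding numbers of patients.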
For patients carrying variants outside the EGFr region, targeted next-generation sequencing identified no potentially pathogenic mutations in several other genes known to be common causes of cerebral small-vessel disease or vascular dementia (HTRA1, TREX1, COL4A1, COL4A2, GLA, APP, and ITM2B). None of the three variants outside the EGFr region was detected in the 400 age- and sex-matched Chinese controls.

Family Information and Clinical Data of Patients Carrying Untypical Variants
Affected family members of the 10 probands carrying 9 untypical variants (cysteine-sparing variants and variants outside the EGFr region) are summarized in Table 3. Family pedigrees and family co-segregation analysis are shown in Figure 2. Detailed family information is shown in Supplementary Table 1. Representative neuroimaging findings of probands carrying variants outside the EGFr region are presented in Figure 3, showing multiple subcortical lacunar infarcts, extensive WMH with involvement of the external capsule, and multiple cerebral microbleeds. None of the probands carrying a variant outside the EGFr region showed WMH involving the anterior temporal lobe. GOM was detected in skin biopsies from patients carrying R133S and R607H (Figure 4).

Pathogenicity Assessment and Classification of NOTCH3 Variants
According to the consensus statement on NOTCH3 variants causing CADASIL (Mancuso et al., 2020), all 14 cysteine-affecting NOTCH3 variants in the EGFr region were considered to be pathogenic variants because they fulfill the archetype of CADASIL-related mutations and had been reported in previous studies. In silico analysis, allele frequency, and pathological information for the nine untypical variants are shown in Table 4. R75P, R133S, R607H, G1347R, and R1761H were considered to be pathogenic variants based on positive GOM deposits in skin biopsies in the present (R133S and R607H) or previous studies

Genotype-Phenotype Correlation
Clinical characteristics of patients carrying cysteine-affecting pathogenic variants and cysteine-sparing pathogenic variants are shown in Table 5. Sex, prevalence of stroke risk factors, and clinical symptoms did not differ significantly between the two groups. Patients carrying cysteine-sparing pathogenic variants showed later symptom onset than patients carrying cysteine-affecting pathogenic variants, although the difference was not statistically significant (51.60 ± 11.84 vs. 44.96 ± 8.82, p = 0.150). The difference was significant when an additional nine genetically confirmed family members carrying cysteine-sparing pathogenic variants were included (51.36 ± 7.06 vs. 44.96 ± 8.82, p = 0.023). Although the prevalence of anterior temporal lobe involvement was not significantly different between the two groups, the severity of temporal lobe involvement rated by the modified Scheltens scale was lower in patients carrying cysteine-sparing pathogenic variants than in those with cysteine-affecting pathogenic variants (1.50 ± 1.74 vs. 3.11 ± 2.32, p = 0.027). The patient enrollment and study workflow are shown in Figure 5. Clinical characteristics of patients carrying pathogenic variants and VUS are shown in Table 6. The onset age of patients carrying VUS was later than that of patients carrying pathogenic variants (55.29 ± 5.35 vs. 45.97 ± 9.44, p = 0.016).

DISCUSSION
Previous studies suggested that both ethnicity and founder effects contribute to the differences in the NOTCH3 spectrum among CADASIL patients from different populations. In most Caucasian populations, NOTCH3 mutations were frequently found in exons 2-6 (60-80%), particularly exon 4 (50-75%) (Joutel et al., 1997; Markus et al., 2002; Peters et al., 2005). Italian populations showed a different spectrum, with mutations in exon 4 representing only 20.6% of the total (Bianchi et al., 2015). In Asian populations, mutations were more frequently found in exon 11 (40-85%) than in exon 4 (20-40%) in patients from Korea (Kim et al., 2014), Chinese mainland (Chen et al., 2017), and Chinese Taiwan (Liao et al., 2015). R544C in exon 11 is the most common NOTCH3 mutation in the Asian population. However, mutations in exon 11 are rare in Japan. In a retrospective study including 70 Japanese CADASIL patients with NOTCH3 mutations, R544C was not detected (Ueda et al., 2015). In the present study, variants in exon 11 (32.43%) and exon 4 (27.03%) together account for more than half (59.46%) of the total variants, suggesting that exon 11 and exon 4 are "hot regions" in our cohort. The remaining variants were distributed over the other NOTCH3 exons. The "hot regions" of NOTCH3 variants in our patients were similar to those in previous studies in Korea (Kim et al., 2014), Chinese mainland (Chen et al., 2017), and Chinese Taiwan (Liao et al., 2015). The difference is that we found a higher proportion of cysteine-sparing variants and variants outside the EGFr coding region.
The pathogenicity of cysteine-sparing mutations in the EGFr region was controversial at first (Rutten et al., 2014), but accumulating evidence supports the view that cysteine-sparing mutations may be disease causing. Several cysteine-sparing mutations fulfill the characteristics required to be considered potentially pathogenic (typical clinical features, diffuse WMH, thorough sequencing of EGFr coding exons, GOM deposition, and MAF < 0.001) (Muino et al., 2017). GOM deposits were observed in patients carrying cysteine-sparing mutations (Santa et al., 2003; Scheid et al., 2008; Brass et al., 2009; Wang et al., 2011; Wollenweber et al., 2015). In vitro studies showed that the aggregation behavior of cysteine-sparing mutants was similar to that of cysteine-affecting mutants (Wollenweber et al., 2015; Huang et al., 2020). We identified six cysteine-sparing variants in the EGFr region, all of which have been reported in previous studies. Among them, R75P (Ueda et al., 2015) and G1347R (Nakamura et al., 2015) have been reported to be associated with positive GOM deposition. In the present study, we detected GOM in probands carrying R133S and R607H, supporting R133S and R607H as pathogenic variants in CADASIL. V237M has been reported in a Japanese family characterized by late-onset migraine, gait disturbance, and dementia with MRI changes suggestive of cerebral small vessel disease (Uchino et al., 2002). R1100H has been reported in a Chinese woman with stroke, memory impairment, migraine, and WMH involving the temporal pole and external capsule (Qin et al., 2019). Unfortunately, none of the patients carrying V237M or R1100H in previous studies or the present study underwent a skin biopsy; the pathogenicity of V237M and R1100H is therefore not sufficiently established and needs further clarification.
A variant outside the EGFr region in CADASIL was first reported in 1996, when NOTCH3 was identified as the causative gene of CADASIL (Joutel et al., 1996). A heterozygous missense variant (c.5554 G > A) in exon 30 was identified in a Swiss family affected by recurrent stroke-like episodes, migraine, subcortical dementia, and positive GOM deposition. To date, six reports have described variants outside the EGFr region as possible causes of CADASIL (summarized in Supplementary Table 2; Jung et al., 1995; Joutel et al., 1996; Fouillade et al., 2008; Bersano et al., 2012; Hung et al., 2018; Park et al., 2020). We identified three variants outside the EGFr coding region. First, all three patients had a positive family history of migraine/stroke/dementia, suggesting an underlying genetic cause. Second, we excluded the concurrence of other mutations in NOTCH3 and in common causative genes related to cerebral small-vessel disease or vascular dementia. Moreover, these variants were not found in 400 age- and sex-matched Chinese controls and were rare in the general population, with MAF < 0.001. In silico analysis predicted that these variants affect conserved amino acid sequences and impair protein function. For these reasons, we consider that these three novel variants outside the EGFr region might be causative of the CADASIL phenotype.
It is hard to clarify the pathogenicity of variants in the Notch3 intracellular domain, given the current consensus that accumulation of the Notch3 extracellular domain, not the intracellular domain, is the pathogenic mechanism of CADASIL. However, variants outside the EGFr region associated with positive GOM have been reported in three patients carrying L1518M (Park et al., 2020), A1851T (Joutel et al., 1996), and R1761H, respectively. The mechanism by which intracellular mutants lead to deposition of the extracellular domain is unclear. In addition, mechanisms leading to vasculopathy other than accumulation of the Notch3 extracellular domain cannot be ruled out. Some studies have shown that certain CADASIL-like symptoms may be related to dysregulated Notch3 signaling (Baron-Menguy et al., 2017; Coupland et al., 2018). Altered Notch3 signaling has been demonstrated for variants outside the EGFr region, for example, variant L1515P (Fouillade et al., 2008), suggesting that variants outside the EGFr region might lead to arteriopathy by dysregulating Notch3 signaling. The three variants outside the EGFr region detected in our study were located in the intracellular domain of Notch3, which is responsible for downstream Notch signaling. Although the effects of these variants on protein function are not known, several software programs predicted them to be deleterious. It is reasonable to postulate that these variants might alter the Notch signaling pathway and thereby initiate CADASIL pathogenesis. More functional studies are needed to clarify the role of variants outside the EGFr region in Notch signaling and the pathogenesis of CADASIL.
Clinical characteristics of patients carrying cysteine-sparing variants have rarely been described in cohorts. R75P, a recurrent variant in Asian CADASIL patients, was reported to be associated with less temporal pole involvement (Ueda et al., 2015). A recent study in Korean CADASIL patients reported similar results: patients with cysteine-sparing NOTCH3 variants showed less involvement of the anterior temporal lobes than those with cysteine-affecting NOTCH3 variants (Kim et al., 2020). We found that patients carrying cysteine-sparing pathogenic variants showed later symptom onset and milder temporal lobe involvement than patients carrying cysteine-affecting pathogenic variants, suggesting that cysteine-sparing variants may be related to a phenotype different from typical CADASIL.
We also observed that none of the patients carrying variants outside the EGFr region showed severe WMH in the anterior temporal lobe. Similarly, R544C, a recurrent mutation in Asian CADASIL patients that localizes not within an EGFr but between EGFr 13 and 14, has been reported to be associated with later disease onset and less involvement of the anterior temporal lobe (Liao et al., 2015). We speculate that variants outside the EGFr region may lead to a milder CADASIL phenotype. As little is known about NOTCH3 variants outside the EGFr region, we advocate complete screening of the NOTCH3 locus to reveal the entire genetic spectrum, which is not difficult to achieve with high-throughput sequencing.
Some shortcomings need consideration. First, only two patients carrying untypical variants agreed to a biopsy. Second, large fragment insertion/deletion mutations of NOTCH3 were not excluded. Third, because of the limited sample size, the genotype-phenotype correlation should be interpreted carefully and needs to be confirmed in larger cohorts.

CONCLUSION
In conclusion, we describe a distinct spectrum of NOTCH3 variants in Chinese CADASIL patients and identify a potential genotype-phenotype correlation. Our findings broaden the spectrum of NOTCH3 variants in CADASIL and draw attention to a potential role of NOTCH3 variants outside the EGFr region. Complete screening of the 33 NOTCH3 coding exons is advocated to better understand the CADASIL phenotype and genotype spectrum, and more functional investigation is needed to ascertain the role of new NOTCH3 variants in Notch3 signaling and CADASIL pathogenesis.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://doi.org/10.6084/m9.figshare.13585919.v2.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Ethics Review Boards of Xiangya Hospital, Central South University. The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.