A novel SETD2 variant causing global developmental delay without overgrowth in a 3-year-old Chinese boy

Background: Luscan-Lumish syndrome is characterized by macrocephaly, postnatal overgrowth, intellectual disability (ID) and developmental delay (DD), and is caused by heterozygous SETD2 (SET domain containing 2) mutations. The incidence of Luscan-Lumish syndrome is unclear. This study reports a novel pathogenic SETD2 variant causing atypical Luscan-Lumish syndrome and reviews all published SETD2 mutations and their corresponding symptoms, to give a comprehensive view of the phenotypes and genotypes of SETD2 mutations. Methods: Peripheral blood samples of the proband and his parents were collected for next-generation sequencing, including whole-exome sequencing (WES), copy number variation (CNV) detection and mitochondrial DNA sequencing. The identified variant was verified by Sanger sequencing. Conservation analysis and structural analysis were performed to investigate the effect of the mutation. Public databases such as PubMed, ClinVar and the Human Gene Mutation Database (HGMD) were used to collect all cases with SETD2 mutations. Results: A novel pathogenic SETD2 variant (c.5835_c.5836insAGAA, p.A1946Rfs*2) was identified in a 3-year-old Chinese boy who had speech and motor delay without overgrowth. Conservation analysis and structural analysis showed that the novel pathogenic variant would lose the conserved domains in the C-terminal region and result in loss of function of the SETD2 protein. Frameshift and nonsense mutations account for 68.5% of the 51 SETD2 point mutations in total, suggesting that Luscan-Lumish syndrome is likely due to loss of function of SETD2. However, we found no association between the genotype and phenotype of SETD2 mutations. Conclusion: Our findings expand the genotype-phenotype knowledge of SETD2-associated neurological disorder and provide new evidence for further genetic counselling.

The incidence of Luscan-Lumish syndrome (or SETD2-related neurological disorder) is unclear. In 2014, Luscan et al. first reported two heterozygous mutations in the SETD2 gene in two patients with "Sotos-like" syndrome (Luscan et al., 2014). To date, fifty-one SETD2 germline mutations and about thirty cases with clinical symptoms have been reported (Lumish et al., 2015; Lelieveld et al., 2016; Tlemsani et al., 2016; van Rij et al., 2018; Marzin et al., 2019; Rabin et al., 2020; Chen et al., 2021). Most of these patients presented with overgrowth and speech delay. Here, we report a novel pathogenic SETD2 variant causing global developmental delay without overgrowth in a 3-year-old Chinese boy and review previously reported patients. Our findings expand the phenotype and genotype spectrum of SETD2 mutations.

Subjects and samples
The proband was a 3-year-old Chinese boy, born to unrelated parents, who was hospitalized for global developmental delay in October 2021. The proband was assessed using a neuropsychological development checklist and underwent auxiliary examinations including magnetic resonance imaging (MRI), electroencephalogram (EEG), G-banded karyotyping and fragile X syndrome testing. Peripheral blood samples of the proband and his parents were then collected for next-generation sequencing, including whole-exome sequencing (WES), copy number variation (CNV) detection and mitochondrial DNA sequencing. Written informed consent was obtained from the patient's parents. This project was approved by the Ethics Committee of Bethune International Peace Hospital (Approval No. 20180023 and Approval No. 2022-KY-26).

Next-generation sequencing
Trio-WES, trio-CNV detection and proband mitochondrial DNA sequencing were performed at the Chigene Translational Medicine Research Center (Beijing, China).

WES
Genomic DNA of the proband and his parents was extracted from EDTA-treated peripheral blood using the Blood Genome Column Medium Extraction Kit (Kangweishiji, China) according to the manufacturer's instructions. Libraries were constructed with the xGen Exome Research Panel v2.0 (IDT, United States), which contains 429,826 probes and targets a 39 Mb protein-coding region. Sequencing was performed on an Illumina NovaSeq 6000 sequencer (Illumina, United States), covering more than 99% of the target sequences. After filtering and alignment, variant calling was conducted using the Genome Analysis Toolkit software. Variant annotation and pathogenicity prediction were carried out with a series of databases and software tools, including 1000 Genomes, the Single Nucleotide Polymorphism database (dbSNP), the Exome Sequencing Project (ESP), the Exome Aggregation Consortium (ExAC), the Genome Aggregation Database (gnomAD), PROVEAN, Sorting Intolerant From Tolerant (SIFT), PolyPhen-2, MutationTaster and M-CAP. Pathogenicity was classified according to the American College of Medical Genetics and Genomics (ACMG) guidelines (Richards et al., 2015), with the Online Mendelian Inheritance in Man (OMIM), Human Gene Mutation Database (HGMD) and ClinVar databases used as references for the pathogenicity of every variant. Sanger sequencing was performed to validate point mutations using a 3500DX Genetic Analyzer (Applied Biosystems, United States). The PCR primers were as follows: forward primer, TTCTTCTAGTTTTGTGCCGTTGCT; reverse primer, TGAGAATACATCGCGTGCTCATAC.

CNV detection
Genomic DNA of the proband and his parents was extracted as above. After DNA fragmentation, genomic DNA was amplified by ligation-mediated PCR (LM-PCR) for 4-6 rounds and then sequenced on the DNBSEQ-T7 sequencing system (MGI, China). CNVs of 100 kb and above in length were detected using software packages developed in-house by Chigene. CNV databases such as DECIPHER, ClinVar, OMIM and ClinGen were used as references to annotate the pathogenicity classification of each screened CNV. The biological harm and related phenotypes of CNVs were assessed from the annotated information and frequency databases according to the ACMG practice guidelines (2019 diagnostic guidelines) (Riggs et al., 2020).

Mitochondrial DNA sequencing
Mitochondrial DNA of the proband was extracted using the mitochondrial DNA extraction kit (Baiaolaibo, China). Full-length mitochondrial DNA was amplified by PCR and then sheared into fragments of about 200 bp using a Covaris sonicator (KU, United States). The ligated DNA products were amplified by 4-6 rounds of LM-PCR and sequenced on the DNBSEQ-T7 sequencing system (MGI, China). The variants were mapped to reference mutations to find matches in the MITOMAP human mitochondrial genome database (https://www.mitomap.org/). The pathogenicity of every variant was assessed using MitoTIP.

Conservation analysis of SETD2 protein
The human protein sequence affected by the novel variant was submitted to protein BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi) to run a homology search. Homologous protein sequences from other species (Pan troglodytes, Mus musculus, Rattus norvegicus, and Monodelphis domestica) were retrieved. Evolutionary conservation was analyzed using MEGA7.

Protein visualization and structural analysis
The effect of the novel mutation on the structure of the SETD2 protein was investigated using the crystal structure of human SETD2 protein. Three-dimensional protein structures were downloaded from the PDB database (https://www.rcsb.org/) and the AlphaFold Database (https://alphafold.ebi.ac.uk/). Modeling and analysis of the mutant SETD2 protein were performed using PyMOL.

Case presentation
The proband is a Han Chinese boy and the only child of his parents, with an unremarkable family history. His mother was 29 years old and had no diabetes, hypertension, fever, or medication use during pregnancy. His father was 30 years old, and the parents were unrelated. The proband was born at a gestational age of 29+3 weeks because of placenta previa. He was diagnosed with neonatal respiratory distress; his birth weight was 1500 g (50th-90th percentile), and his length and head circumference were normal. The newborn was hospitalized for more than 50 days and was discharged in good condition in all respects. However, the boy slept relatively little, usually from 11 p.m. to 7 a.m. The proband was fed with formula without difficulty, and complementary food was not introduced until 10 months of age. His gastrointestinal function was poor and he was allergic to many foods, including apple, banana, wheat, fish, tomato, watermelon and peach. The boy presented with hematochezia once at 3 months of age and suffered recurrent otitis and pneumonia from the age of 7 months. This has improved as he has grown, and he had otitis only once in the last year. He also had anemia and calcium deficiency at 7 months of age, both of which resolved soon after symptomatic treatment.
The boy showed early developmental delay. He started to roll over at the age of 8 months and demonstrated severe motor and speech developmental delay in the first year of life. At 19 months of age, the proband was referred for clinical genetic evaluation at the Bethune International Peace Hospital. The neuropsychological development checklist showed that the proband had global developmental delay with normal muscle tone in all four limbs and normal growth (body weight 10.8 kg, 25th percentile; height 84 cm, 25th-50th percentile; head circumference 47.9 cm, 50th percentile). The neurodevelopmental assessment showed: gross motor, 12 points; fine motor, 13.5 points; adaptive ability, 15 points; language, 11.5 points; social behavior, 15 points; intellectual age, 13.4 months; developmental quotient, 68.7 (low intelligence).
After a year of rehabilitation training, the neurodevelopmental assessment at 29 months of age showed: gross motor, 21 points; fine motor, 18 points; adaptive ability, 22.5 points; language, 15 points; social behavior, 15 points; intellectual age, 18.3 months; developmental quotient, 63.1 (low intelligence). The latest developmental assessment was conducted on 6 January 2023, when the boy was 39.1 months old, and showed: gross motor, 24 points; fine motor, 30 points; adaptive ability, 27 points; language, 18 points; social behavior, 25.5 points; intellectual age, 24.9 months; developmental quotient, 63.7 (low intelligence). At the time of manuscript submission, at 3 years and 4 months of age, the proband could speak only simple words and could not jump. He had not achieved voluntary control of urination and defecation until recently.
Cranial MRI at 19 months of age demonstrated no malformation but revealed a slightly longer T2 signal at the inner edge of the right cerebellar dentate nucleus and high T2 signal in the bilateral mastoid and paranasal sinuses. The most recent cranial MRI and brain ultrasound (at 3 years of age) showed no abnormalities, but the EEG suggested brain maturation behind his actual age. The boy had no facial malformations or epilepsy.
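The developmental quotients reported above are consistent with the common ratio definition, developmental (intellectual) age divided by chronological age, multiplied by 100. The checklist's actual scoring formula is not stated here, so the following is a minimal sketch under that assumption; the function name is illustrative.

```python
def developmental_quotient(developmental_age_months: float,
                           chronological_age_months: float) -> float:
    """Ratio DQ (assumed definition): developmental age / chronological age x 100."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * developmental_age_months / chronological_age_months


# (intellectual age, chronological age) in months, as reported for the
# second and third assessments.
assessments = [(18.3, 29.0), (24.9, 39.1)]
for dev_age, chrono_age in assessments:
    print(round(developmental_quotient(dev_age, chrono_age), 1))
```

Under the same assumption, the first assessment (intellectual age 13.4 months, developmental quotient 68.7) corresponds to a chronological age of about 19.5 months.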

Genetic analysis
G-banded karyotyping and fragile X syndrome testing showed no abnormalities. High-throughput sequencing was performed to detect CNVs, monogenic variants and mitochondrial variants in the proband. By trio-WES, a novel pathogenic SETD2 variant was identified in the proband. The boy carried a de novo (PS2) frameshift mutation, c.5835_c.5836insAGAA (p.A1946Rfs*2, NM_014159.6) (Figure 1A), which has been neither published nor reported in public databases (PM2). SETD2 is evolutionarily conserved across species and contains three domains in the C-terminal region (Figures 1B-D). The mutation c.5835_c.5836insAGAA would lead to premature termination of the SETD2 protein and loss of the SHI (SETD2-hnRNP interaction) domain (2164-2213 aa), the WW domain (also known as the WWP repeating motif, 2391-2422 aa) and the SRI (Set2-Rpb1 interaction) domain (2469-2548 aa) present in the wild-type SETD2 protein, resulting in loss of function of the SETD2 protein (PVS1). According to the ACMG guidelines, we classified the variant as pathogenic (PVS1 + PS2 + PM2). No suspicious variants were found by trio-CNV detection or mitochondrial DNA sequencing.
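The evidence combination above can be checked against the combining rules for the "Pathogenic" category in Richards et al. (2015). The following is a minimal illustration, not the pipeline used in this study: the function name and the strength map are assumptions, only the three criteria cited in the text are mapped, and the "Likely pathogenic" rules are omitted.

```python
from collections import Counter

# Evidence strength of the criteria named in the text (Richards et al., 2015).
# Only these three codes are mapped; any other code raises KeyError.
STRENGTH = {"PVS1": "very_strong", "PS2": "strong", "PM2": "moderate"}


def is_pathogenic(criteria):
    """Apply the ACMG 'Pathogenic' combining rules to a list of criteria codes."""
    n = Counter(STRENGTH[c] for c in criteria)
    vs, s, m, p = n["very_strong"], n["strong"], n["moderate"], n["supporting"]
    # (i) one very strong criterion plus further supporting evidence
    rule_i = vs >= 1 and (s >= 1 or m >= 2 or (m >= 1 and p >= 1) or p >= 2)
    # (ii) two or more strong criteria
    rule_ii = s >= 2
    # (iii) one strong criterion plus moderate/supporting evidence
    rule_iii = s == 1 and (m >= 3 or (m >= 2 and p >= 2) or (m >= 1 and p >= 4))
    return rule_i or rule_ii or rule_iii


print(is_pathogenic(["PVS1", "PS2", "PM2"]))
```

Here PVS1 together with one strong criterion (PS2) already satisfies rule (i); PM2 adds moderate evidence on top.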

Protein visualization and structural analysis
The predicted three-dimensional structure of SETD2 was downloaded from the AlphaFold Database (AF-Q9BYW2-F1) and visualized in PyMOL using the cartoon model and the surface electrostatic charge model (Figures 2A, B). The SHI domain, which interacts with hnRNP L, lies downstream of residue 1946 (Figures 2C, D). The mode of SETD2-hnRNP L interaction reveals a conserved design by which splicing regulators interact with one another. The SETD2 p.A1946Rfs*2 mutation, which results in loss of the C-terminal region including the SHI domain, may influence the alternative splicing of a number of pre-mRNAs.
Xu et al. demonstrated that SETD2 is required for proper cortical arealization and the formation of cortico-thalamo-cortical circuits (Xu et al., 2021), suggesting the importance of SETD2 in neurological development. SETD2 conditional knockout mice exhibit defects in social interaction, motor learning, and spatial memory, reminiscent of patients with the Sotos-like syndrome bearing SETD2 mutations. According to our review of SETD2 mutations, most patients show speech and language developmental delay, motor developmental delay, a variable degree of intellectual disability and behavioral problems, which accords well with the phenotype of SETD2 knockout mice. Another feature of Luscan-Lumish syndrome is overgrowth. Height and weight may normalize in adulthood, but macrocephaly is usually present at all ages (Pappas et al., 1993). Since only the latest records of growth parameters were collected, the rate of tall stature or obesity is clearly lower than that of macrocephaly. Advanced bone age could be regarded as a reflection of bone overgrowth.
Frameshift and nonsense mutations account for 68.5% (35/51) of SETD2 point mutations, suggesting that Luscan-Lumish syndrome is likely due to loss of function of SETD2. However, we found no association between the genotype and phenotype of SETD2 mutations, which suggests the hypothesis that all SETD2 point mutations lead to decreased methyltransferase activity and result in H3K36me3 loss. For example, the novel mutation we identified, c.5835_c.5836insAGAA p.(A1946Rfs*2), would generate a truncated SETD2 protein lacking the conserved domains in the C-terminal region. Mutated transcripts harboring premature termination codons (PTCs) are also probably degraded by nonsense-mediated mRNA decay (NMD). Duns et al. (2010) found that the expression of SETD2 transcripts was upregulated after treatment with inhibitors of NMD in the renal cell carcinoma cell lines RCC-ER and RCC-AB, which harbor hemizygous mutations introducing PTCs in SETD2. This finding suggests that SETD2 transcripts carrying truncating mutations can be degraded by NMD.
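The domain loss described above follows directly from the residue coordinates. As a small sketch, assuming the domain boundaries quoted in the Genetic analysis section and treating residue 1946 as the first altered position (the names below are illustrative), the affected domains can be listed as:

```python
# Domain boundaries (amino acid residues) as quoted in the text.
DOMAINS = {"SHI": (2164, 2213), "WW": (2391, 2422), "SRI": (2469, 2548)}
FIRST_ALTERED_RESIDUE = 1946  # from p.A1946Rfs*2


def lost_domains(first_altered, domains):
    """Return the names of domains that begin at or after the first altered residue."""
    return [name for name, (start, _end) in domains.items() if start >= first_altered]


print(lost_domains(FIRST_ALTERED_RESIDUE, DOMAINS))
```

All three C-terminal domains start well downstream of residue 1946, so none of them would be retained in the truncated product.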
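To illustrate why this insertion is a frameshift, the sketch below maps the coding-sequence position to a codon number and checks whether the inserted length is a multiple of three. It assumes standard 1-based coding coordinates with codon 1 covering c.1 to c.3; the helper names are illustrative and no SETD2 reference sequence is used.

```python
def codon_index(cdna_pos: int) -> int:
    """1-based codon number containing a 1-based coding-sequence position."""
    return (cdna_pos + 2) // 3


def shifts_frame(inserted: str) -> bool:
    """An insertion shifts the reading frame unless its length is a multiple of 3."""
    return len(inserted) % 3 != 0


INSERTED = "AGAA"                       # inserted bases, from c.5835_c.5836insAGAA
print(codon_index(5836))                # codon immediately after the insertion point
print((5836 - 1) % 3 == 0)              # True if c.5836 is the first base of its codon
print(shifts_frame(INSERTED))
print(INSERTED[:3])                     # first triplet read in the new frame
```

Under these assumptions the insertion falls immediately before codon 1946, and the first new triplet, AGA, encodes arginine in the standard genetic code, which is consistent with the A1946R change in the protein-level name.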
In this article, the patient demonstrated motor and speech developmental delay early in life. He has cognitive impairment and will probably develop moderate intellectual disability in the future, as most patients with Luscan-Lumish syndrome do. He has no overgrowth, which is a major feature of Luscan-Lumish syndrome. He has no facial or cranial deformities, and neither epilepsy nor multisystem malformations. He has experienced recurrent otitis and pneumonia, which have been reported in some affected children with Luscan-Lumish syndrome (Luscan et al., 2014; van Rij et al., 2018). As only four patients with SETD2 mutations experienced recurrent otitis, and the age of onset was younger than 5 years, there is no solid evidence that Luscan-Lumish syndrome is associated with immune deficiency or that another disease is co-occurring in the patient we report. This case expands the mutation spectrum and phenotype spectrum of SETD2 mutations. Moreover, patients with the heterozygous missense mutation SETD2 c.5218C>T p.(R1740W) exhibit profound intellectual disability, microcephaly and congenital anomalies affecting several organ systems (Rabin et al., 2020), which are significantly different from the symptoms of  (Rabin et al., 2020), which are diagnosed as intellectual developmental disorder 70 (MRD70, MIM#620157). The intellectual disability of MRD70 patients is more serious than that of patients with Luscan-Lumish syndrome, but not as severe as that of RAPAS patients (Rabin et al., 2020). These findings indicate that codon 1740 appears to be critically important for normal SETD2 function and that mutations in codon 1740 may induce abnormal nervous system development via an alternative mechanism.
In addition to point mutations of nuclear genes, CNVs and mtDNA variants also account for a proportion of neurodevelopmental disorders (Lam et al., 2019; Ullah et al., 2021; Silva et al., 2022; Sun et al., 2022). D'Gama et al. reported a case (AN00090) harboring a germline missense mutation in SETD2 predicted to be deleterious (responsible for ASD) together with a 15q deletion consistent with her diagnosis of Angelman syndrome (D'Gama et al., 2015). CNV detection and mtDNA sequencing were therefore applied in the genetic diagnosis of the case in this article. No suspicious variants were found by CNV detection or mtDNA sequencing, which further supports the pathogenicity of SETD2 c.5835_c.5836insAGAA (p.A1946Rfs*2) in the proband.
In conclusion, we have identified a novel pathogenic SETD2 variant in a 3-year-old Chinese boy with global developmental delay and without overgrowth. Our findings expand the genotype-phenotype knowledge of SETD2-associated neurological disorder and provide new evidence for further genetic counselling.

Ethics statement
The studies involving human participants were reviewed and approved by the Ethics Committee of Bethune International Peace Hospital. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin. Written informed consent was obtained from the individual(s), and minor(s)' legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.

Author contributions
YW and FL conceptualized and designed this study. FL and RW collected the patient samples and performed developmental assessment. BJ performed sequencing analysis. YW performed genetic analysis, literature review and wrote the manuscript. FL revised the manuscript. All authors contributed to the article and approved the submitted version.