A Tiered Genetic Screening Strategy for the Molecular Diagnosis of Intellectual Disability in Chinese Patients

Objective: Intellectual disability (ID) is one of the most common developmental disabilities. To identify the genetic etiology of ID in Chongqing, we conducted a multistage study in Chinese Han patients. Methods: We collected the clinical and etiological data of 1,665 ID patients, including 1,604 from the disabled children evaluation center and 61 from the pediatric rehabilitation unit. Routine genetic screening results were obtained, including karyotype and candidate gene analysis. Then 105 idiopathic cases with syndromic and severe ID/developmental delay (DD) were selected and tested sequentially by chromosomal microarray (CMA) and whole exome sequencing (WES). The pathogenicity of the copy number variations (CNVs) and single nucleotide variations (SNVs) was evaluated according to American College of Medical Genetics (ACMG) guidelines. Results: A molecular diagnosis was made by routine genetic screening in 216 patients, including 196 with chromosomal syndromes. Among the 105 idiopathic patients, CMA identified 49 patients with pathogenic/likely pathogenic CNVs and 21 patients with variants of uncertain significance (VUS). Twenty-six pathogenic CNVs underlying well-known syndromic cases, such as Williams-Beuren syndrome, were confirmed by multiplex ligation-dependent probe amplification (MLPA). Nine novel mutations were identified by WES in thirty-five CNV-negative ID cases. Conclusions: The study illustrates the distribution of genetic aberrations in a large ID cohort in Chongqing. Compared with conventional or single methods, the tiered high-throughput diagnostic strategy greatly improved the diagnostic yield and extended the variation spectrum for idiopathic syndromic ID cases.


INTRODUCTION
Intellectual disability (ID) or developmental delay (DD) is one of the most common reasons for visiting pediatric rehabilitation or genetic counseling clinics. The incidence is estimated at over 1% worldwide, and the condition seriously endangers the physical and mental health of children (Moeschler and Shevell, 2014). ID is characterized by intellectual and adaptive deficits, such as poor understanding of language, low learning ability, and limited social and practical activities. Apart from a few straightforward cases, ID is often accompanied by other clinical symptoms or systemic malformations, which seriously affect quality of life in this population. Genetic abnormalities are one of the most important causes of ID (Ropers, 2008; Moeschler and Shevell, 2014; Vissers et al., 2016). History investigation, clinical phenotype analysis, and conventional auxiliary laboratory techniques, including karyotype analysis and candidate gene screening, are used to confirm the diagnosis of ID. However, a large number of patients still lack an etiologic diagnosis, which places a heavy burden of medical costs and social stress on the families. Chongqing is in the southwest of China, with distinctive regional, ethnic, and economic characteristics. ID is one of the most important etiological components of birth defects in this district. It is therefore important to describe the epidemiological characteristics and genetic etiology of ID in this district.
The genetic etiology of ID is not well described. Traditional karyotype analysis, metabolic analysis, and candidate gene screening can only explain about 30% of genetic ID cases in clinical practice (Ropers, 2008). Apart from chromosomal abnormalities and well-known gene mutations, previous studies of ID patients have confirmed a causative role only for small copy number variations (CNVs) (Merikangas et al., 2009; Miller et al., 2010; Mefford et al., 2012). Owing to its higher resolution than karyotyping, chromosomal microarray (CMA) has facilitated the discovery of novel rare CNVs across the genome. CMA testing has therefore been recommended as a first-tier cytogenetic diagnostic test for patients with ID or multiple congenital anomalies (Miller et al., 2010). However, the interpretation of CNVs is very challenging (Merikangas et al., 2009). The detection of balanced rearrangements and single nucleotide variations (SNVs) is also beyond the capability of CMA analysis. In recent years, next generation sequencing (NGS) has enabled the identification of multiple genetic variations that play important roles in the pathogenesis of ID (Harripaul et al., 2017; Bruel et al., 2020). The application of high-throughput technologies, including CMA and whole exome sequencing (WES), has thus become an effective strategy for genetic analysis in ID (Fell and Nagy, 2021).
In this study, we recruited 1,604 ID patients from the disabled children evaluation center. After review of the routine genetic screening results, 216 cases obtained a genetic diagnosis. For the remaining 44 undiagnosed syndromic ID cases and a further 61 idiopathic severe ID/DD cases from the pediatric rehabilitation unit, we applied CMA and WES sequentially. The results suggest that this might be a general strategy for the molecular screening of ID. For the idiopathic ID cases, clinical application of high-throughput techniques greatly improved the diagnostic yield. Several novel mutations or genomic regions of clinical significance were also identified, which enriched the genotype-phenotype correlations in ID patients and provided clues for the exploration of neurodevelopmental genes.

MATERIALS AND METHODS
Patients
This study was performed with the approval of the Ethics Committee of Army Medical University, Chongqing, China. Written informed consent was obtained from the legal guardian of each patient in the study. The subjects were treated in accordance with the tenets of the Declaration of Helsinki. A total of 1,604 patients undergoing a diagnostic evaluation of ID were collected from the disabled children evaluation center between January 2013 and December 2016. The inclusion criteria comprised the following characteristics: 1) learning disability, 2) language barriers, 3) autistic features or suitability barriers, 4) possible other developmental delays, such as growth or motor delays, and 5) possible congenital multiple malformations. The other 61 participants were identified and enrolled in the pediatric rehabilitation unit at the Xinqiao Hospital of Army Medical University from June 2013 to December 2020. The medical records of all the patients were reviewed retrospectively. Tabulated data for each patient included: 1) demographic information: age, gender, family history, history of birth, growth and development history, and systemic disorders; 2) details of the ID features: presence of malformations and intelligence quotient (IQ) score; 3) other neuropsychiatric phenotypes (i.e., epilepsy, attention deficit hyperactivity disorder, autism spectrum disorders, and schizophrenia).

Routine Genetic Screening
Karyotyping
Cells were incubated at 37°C in MEM (Gibco/Life Technologies, USA) containing phytohemagglutinin for 72 h, then colcemid (0.2 μg/ml) was added for a further 40 min of incubation. The dividing cells were processed in 0.075 M of KCl and fixed in 3:1 methanol-acetic acid. Giemsa banding was used to produce a visible 550-band resolution karyotype on the slides. The chromosomes were analyzed and reported by experienced cytogeneticists manually, according to the recommendations of the International System for Human Cytogenomic Nomenclature.

Candidate Gene Analysis
Triplet primed PCR (TP-PCR) was performed to screen for mutations in the FMR1 gene according to Chen's protocol (Chen et al., 2010). The concentrations of 11 amino acids, 31 acylcarnitines, and one ketone (succinylacetone) were measured by tandem mass spectrometry. Individuals with clearly aberrant initial screening results were referred for confirmatory tests, including biochemical and genetic analysis.

DNA Extraction and Sanger Sequencing
DNA was extracted from peripheral blood leukocytes of the patients by using the Wizard Genomic DNA Purification Kit (Promega, US). When possible, parental DNA was collected. The quantity and quality of DNA were determined by using a NanoDrop 1000 (Thermo Fisher, US). Variants identified by exome sequencing were confirmed by Sanger DNA sequencing. The software Primer3 was used to design the primers. PCR conditions and the primer pairs are available upon request. DNA sequences were analyzed using the Vector NTI 11.0 software package. The DNA mutation numbering system is based on the cDNA sequence, with +1 corresponding to the A of the ATG translation initiation codon in the reference sequence.

CMA Platform and MLPA Assay
The probands were screened via CMA to detect genome-wide CNVs. The CMA assay was conducted by the KingMed Diagnostics Corporation (Guangzhou, China), by using the Affymetrix CytoScan HD array (Thermo Fisher Scientific, US), according to the manufacturer's instructions. Commercial reference DNA (male and female) provided by Thermo Fisher Scientific was used for the analysis. Genotype and CNV identification and an assessment of genotyping integrity were conducted by using Affymetrix Chromosome Analysis Suite software version 3.1 (Thermo Fisher Scientific, US).
The genomic regions are described according to the GRCh37/hg19 reference sequence (University of California Santa Cruz). The significance of each CNV was determined by comparison to public databases, such as the Database of Genomic Variants (DGV, http://dgv.tcag.ca/dgv/app/home) and the DECIPHER database (http://decipher.sanger.ac.uk/). When available, blood samples were obtained from the patient's parents and the same analysis was done to investigate the inheritance of CNVs. Microdeletions and microduplications were evaluated according to American College of Medical Genetics (ACMG) guidelines (Riggs et al., 2020).
For confirmation of some pathogenic CNVs, we selected commercial MLPA probes targeting regions associated with 23 well-known microdeletion or microduplication syndromes (including Williams-Beuren syndrome, Prader-Willi/Angelman syndrome, DiGeorge syndrome, Xq28 duplication syndrome, and Rett syndrome). According to the manufacturer's instructions, MLPA was performed on the proband's genome by using the SALSA MS-MLPA kit P245-B1 (MRC-Holland, Netherlands).
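Comparison of a patient CNV with database entries is often expressed as an interval overlap. The following is a minimal sketch of that idea, not the procedure used in this study: the 50% reciprocal-overlap threshold, the `Interval` type, and the function names are illustrative assumptions, and any coordinates passed in are hypothetical.

```python
from typing import Iterable, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # inclusive, GRCh37/hg19 coordinate
    end: int    # inclusive


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    len_a = a.end - a.start + 1
    len_b = b.end - b.start + 1
    return min(shared / len_a, shared / len_b)


def matches_catalogue(cnv: Interval, catalogue: Iterable[Interval],
                      threshold: float = 0.5) -> bool:
    """True if any catalogue entry reaches the (assumed) overlap threshold."""
    return any(reciprocal_overlap(cnv, entry) >= threshold for entry in catalogue)
```

A CNV that matches many entries in a catalogue of healthy individuals would argue against pathogenicity, whereas a match in a patient database would support it; the final classification still relies on the ACMG scoring described above.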

Whole Exome Sequencing
The genomic DNA of the proband was fragmented to generate 200-300 bp insert fragments. The paired-end libraries were prepared following the Illumina library preparation protocol. The exome was captured using the SureSelect Human All Exons Plus kit (Agilent, Santa Clara, US). Paired-end sequencing was carried out on an Illumina HiSeq 3000 sequencer (Illumina, San Diego, US).
Raw image files were processed by the Illumina Pipeline for base calling using default parameters, yielding primary data in fastq format. The data were filtered to generate "clean reads" by removing adapters and low quality reads. The clean reads were analyzed by using a customized pipeline that applied published algorithms in a sequential manner. Sequencing reads were mapped to the reference human genome version hg19 (http://genome.ucsc.edu/). Variant analysis was performed using SOAPsnp software and Samtools for SNPs and indels, respectively. All SNPs were compared against dbSNP, HapMap, the 1000 Genomes Project dataset (http://www.1000genomes.org/), and a local database developed by BGI (Shenzhen, China).
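Database comparison of this kind is typically used to set aside common polymorphisms before candidate variants are reviewed. The sketch below illustrates the idea only; it is not the customized pipeline used in the study. The record layout (`id`, `pop_af`), the database labels, and the 1% allele-frequency cutoff are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical record layout: {"id": str, "pop_af": {database_name: frequency or None}}
Variant = Dict[str, object]


def is_rare(variant: Variant, max_af: float = 0.01) -> bool:
    """Keep a variant if every known population frequency is below max_af.

    Variants absent from all databases (no frequencies) are treated as rare.
    """
    pop_af: Dict[str, Optional[float]] = variant.get("pop_af", {})  # type: ignore[assignment]
    known = [af for af in pop_af.values() if af is not None]
    return all(af < max_af for af in known)


def filter_candidates(variants: List[Variant], max_af: float = 0.01) -> List[Variant]:
    """Return only the variants that pass the rarity filter."""
    return [v for v in variants if is_rare(v, max_af)]


# Hypothetical input, for illustration only.
example = [
    {"id": "var_a", "pop_af": {"1000G": 0.20, "local": 0.18}},
    {"id": "var_b", "pop_af": {"1000G": 0.0005, "local": None}},
    {"id": "var_c", "pop_af": {}},
]
rare_ids = [v["id"] for v in filter_candidates(example)]
```

In practice the surviving variants would still need segregation analysis and ACMG classification before being reported.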

RESULTS
Clinical Information of the Cohort
In total, 1,665 unrelated ID patients were included; the male/female ratio was 2.86:1. IQ scoring was carried out by using standard test scales for children (WISC III and WISC IV) (Baron, 2005). Severity was also assessed with reference to the criteria of the Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5). IQ scores of the patients revealed mild (23.59%), moderate (43.15%), and severe (33.26%) ID. Some of the patients also exhibited general or facial dysmorphism (39.68%), speech delay (46.74%), psychomotor retardation (39.16%), social dysfunction (23.41%), or growth retardation (14.62%). Among these patients, three had a family history of ID, while the others had no obvious family history.
Of the 1,604 ID patients from the disabled children evaluation center, 1,002 cases (62.47%) were found to have known etiological risk factors, such as birth injury or infection. The routine genetic screening results from the other 602 cases revealed that 216 cases (35.88%) obtained a genetic diagnosis. Of these, 196 patients had a numerical or structural chromosome abnormality. Down syndrome (27.57%, 166/602) and Turner syndrome (2.99%, 18/602) were the most common diseases. Twenty patients were diagnosed with monogenic disorders, mainly Fragile X syndrome (1.66%, 10/602) and inherited metabolic diseases. According to inclusion criteria 1-5, 44 syndromic severe ID cases were selected from the remaining undiagnosed cases. The other 61 idiopathic severe ID/DD cases were recruited from the pediatric rehabilitation unit. Blood samples of the 105 cases were obtained and examined according to a multi-step genetic diagnostic procedure, consisting of CMA followed by WES (Figure 1).
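The percentages in this paragraph follow directly from the reported counts. As a quick check, a short sketch that recomputes them is given here; the variable names and dictionary labels are ours, while every count is taken from the text.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


center_total = 1604        # patients from the disabled children evaluation center
with_risk_factors = 1002   # cases with known etiological risk factors
screened = center_total - with_risk_factors  # cases with routine screening results

report = {
    "risk factors / center cohort": pct(with_risk_factors, center_total),
    "genetic diagnosis / screened": pct(216, screened),
    "Down syndrome / screened": pct(166, screened),
    "Turner syndrome / screened": pct(18, screened),
    "Fragile X syndrome / screened": pct(10, screened),
}

for label, value in report.items():
    print(f"{label}: {value}%")
```

The denominator for the syndrome-specific rates is the 602 screened cases, not the full cohort of 1,665.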

Detection Rate of Genetic Defects
Among the 105 idiopathic ID patients, pathogenic/likely pathogenic CNVs for ID were identified in 46.67% of patients (49/105).

Classification and Characteristics of CNVs
As shown in Table 1, in 14 patients we identified rarely reported likely pathogenic CNVs, which were not associated with any of the known syndromes. They included regions that did not completely overlap with those of known genomic imbalances. None of the 17 likely pathogenic (LP) CNVs was found in healthy individuals or in DGV (http://dgv.tcag.ca/dgv/app/home), whereas all have been identified in more than one ID patient before (Mégarbané et al., 2000; Rauch et al., 2003; Roggenbuck et al., 2004; Hellani et al., 2010; Dimitrov et al., 2011; Melis et al., 2012; Rush et al., 2013; Castillo et al., 2014; Balasubramanian et al., 2016; Leffler et al., 2016; Zhou et al., 2016; Akcakaya et al., 2017; Bonati et al., 2019; Holder-Espinasse et al., 2019; Allach El Khattabi et al., 2020). Parental DNA was available in 21 cases to determine the inheritance pattern. We identified 20 de novo aberrations. The 1.57 Mb duplication identified in case #24 was inherited from his healthy father. The other CNVs were of unknown inheritance because parental DNA samples were not available.
In addition, a total of 17 structural CNVs, identified in 16 patients, were classified as VUS according to the ACMG guideline. Most of the VUS (12/22, 54.55%) were smaller than 500 kb, except five uniparental disomies (UPDs) over 5 Mb. The origins of the VUS in these 16 cases have not been assessed. In detail, there were 14 deletions (58.82%), three duplications (11.77%), and five UPDs (29.41%) (Table 2). In summary, the chromosomal distribution of all the identified CNVs is shown in Figure 2; CNVs were most common on chromosomes 7, 15, 16, X, and 5.
To better analyze the clinical significance of the VUS, the genomic regions were analyzed with online tools and databases. Nine VUS were found to be larger than the pathogenic or likely pathogenic cases reported in DECIPHER. All the VUS contained at least one protein-coding gene. The tissue expression pattern, cellular localization, biological process, and mouse model data were analyzed comprehensively for each gene. Then 18 genes were proposed to support the pathogenicity of these CNVs. As the genes did not fully explain the phenotypic abnormalities in our patients, more experimental evidence is needed in vitro or in vivo. The clinical features of ID cases with pathogenic/likely pathogenic CNVs or VUS are summarized in Supplementary Table S1. Overall, the mean age of the patients was 3.5 years. In addition to severe intellectual disability, at least one symptom of neurodevelopmental disorders was detected in the patients. Speech delay and psychiatric disturbances were most common (83.67 and 65.31%, respectively), followed by seizures (36.73%) and autism spectrum disorder (24.49%). Among other features, congenital dysmorphisms (48.98%) and motor developmental delay (30.61%) were the most common. Most of the pathogenic CNVs (61.4%) were 1-10 Mb in size.

Mutations Identified by NGS
The 35 patients with benign CNVs were next tested by WES. According to the international guidelines of ACMG (Richards et al., 2015), 11 SNVs were classified as pathogenic/likely pathogenic and 5 SNVs as VUS in 12 ID cases (Table 3). Of these, 12 SNVs were inherited from the parents and 2 were de novo. Several variant types were observed, including nine missense variations, four deletions, two nonsense variations, and one splicing variation. There were eight compound heterozygous variations, three heterozygous, and five hemizygous. Both dominant and recessive inheritance were found, with recessive inheritance in the majority. None of these mutations was found in the normal controls. In total, nine novel mutations and seven reported mutations were identified in 12 ID cases by using WES (Figure 3).
The mean age of these WES-positive patients was 2.85 years. The most common clinical phenotype was global developmental delay (GDD), along with some other manifestations, such as facial dysmorphism, dystonia, and dyskinesia. Most of the cases are known inherited metabolic syndromes, such as Hunter syndrome, glycogen storage disease, glycine encephalopathy, and Lesch-Nyhan syndrome. However, for the VUS identified in five cases (#69, #12, #17, #19, and #86), the actual pathogenic mechanism remains to be confirmed by further experiments in vitro or in vivo.

DISCUSSION
ID is one of the most common non-structural birth defects in this ethnically diverse region, where economic and social development is particularly unbalanced. According to the incidence rate, there are over 1 million ID patients in Chongqing, most of whom lack a clear genetic diagnosis. Traditional karyotype analysis, metabolic analysis, and candidate gene screening can only explain a small proportion of the causes of idiopathic ID (Ropers, 2008). As the incidence of chromosomal disease has fallen significantly, the proportion of patients with submicroscopic rearrangements has increased (Sagoo et al., 2009; Saldarriaga et al., 2015). Therefore, it is valuable to use high-throughput methods to analyze the genetic etiology of ID probands (Wright et al., 2018).
Frontiers in Genetics | www.frontiersin.org September 2021 | Volume 12 | Article 669217
Based on a diagnostic evaluation of the phenotype, 1,665 ID patients were included. Sixty-one cases were from the department of pediatrics at Xinqiao Hospital and 1,604 cases were from the disabled children evaluation center. After phenotypic and medical history analysis, 1,002 cases (62.47%, 1,002/1,604) were found to have known etiological risk factors. Initially, 216 cases were clearly diagnosed by routine genetic analysis. Despite its high diagnostic yield and clinical impact on pediatric care, CMA testing is not yet widely used for clinical diagnostic purposes in children with DD/ID in some districts, including Chongqing. With limited research funds, we strictly enrolled only 105 patients for high-throughput testing. After multi-step screening in the 105 undiagnosed ID cases, 56 (53.33%, 56/105) patients were genetically diagnosed by high-throughput techniques. In total, 272 ID cases received a genetic diagnosis, including 196 (72.06%) with chromosomal aberrations, which was the largest proportion. Forty-nine patients (49/272, 18.01%) were identified to carry pathogenic CNVs, and 27 (9.93%) patients were affected by monogenic disorders. The results corresponded to the generally held diagnostic yield of CNVs and SNVs (Stobbe et al., 2014; Bass and Skuse, 2018). The unusually high proportion of chromosomal disorders may be due to selection bias: these patients were easily recognized and enrolled with a clear molecular diagnosis by conventional karyotyping. For the idiopathic ID cases, clinical application of high-throughput techniques greatly improved the diagnostic yield, which might be a general strategy for the molecular screening of ID. To date, this is the first report about the etiological distribution of the genetic defects of ID in Chongqing. The results improve the subsequent genetic counseling and are meaningful for formulating birth defect prevention and control strategies.
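The overall tally can be reconciled from the counts reported for each tier. The sketch below shows one reading of that arithmetic, under the assumption that the 56 high-throughput diagnoses consist of the 49 CMA-positive patients plus the 7 WES-positive cases, and that the 27 monogenic diagnoses combine the 20 routine ones with those 7 cases. The variable names are ours.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Counts taken from the text; the grouping into tiers is our reading of it.
routine = {"chromosomal": 196, "monogenic": 20}   # routine genetic screening
high_throughput = {"cnv": 49, "monogenic": 7}     # CMA-positive and WES-positive

diagnosed_routine = sum(routine.values())
diagnosed_high_throughput = sum(high_throughput.values())
total_diagnosed = diagnosed_routine + diagnosed_high_throughput
monogenic_total = routine["monogenic"] + high_throughput["monogenic"]

summary = {
    "high-throughput yield in 105 cases": pct(diagnosed_high_throughput, 105),
    "chromosomal share": pct(routine["chromosomal"], total_diagnosed),
    "CNV share": pct(high_throughput["cnv"], total_diagnosed),
    "monogenic share": pct(monogenic_total, total_diagnosed),
}
```

Under this reading the three categories sum to the 272 diagnosed cases, which is consistent with the shares quoted in the paragraph above.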
Considering the incidence of syndromes, Trisomy 21 was the most common type, followed by Turner syndrome, Fragile X syndrome (FXS), and chromosomal microduplications and microdeletions. The results corresponded to the most common genetic causes of ID. The disabled children evaluation center regularly recruited children with Down syndrome for medical guidance and genetic counseling, which might have led to an unusually high diagnosis rate of Down syndrome. In fact, the incidence of Down syndrome has declined significantly in recent years with the introduction of prenatal screening and diagnosis techniques. Typical facial features and global developmental delay were highly suggestive of the diagnosis of FXS. If there were such specific clues to direct diagnosis, MLPA was the best choice to make a diagnosis. Conventional karyotyping has been widely used to identify the causes of ID because of its convenience, cost-effectiveness, and ability to detect balanced rearrangements and mosaicism. So, MLPA and karyotyping should be optional for ID patients with characteristic facial deformities in the genetic screening department (Miller et al., 2010). In recent years, quantitative fluorescent polymerase chain reaction (QF-PCR) has been widely used in the diagnosis of chromosomal aneuploidy due to its accuracy and high efficiency. The genetic screening results in our cohort also confirmed that QF-PCR/MLPA was an alternative solution. However, complex rearrangements might be missed by these methods. In such cases, CMA is sensitive enough to identify the pathogenic CNVs. Therefore, CMA should be recommended as a first-tier clinical diagnostic test for ID patients. WES still has incomparable advantages and cost performance in the identification of SNVs. Briefly, for the appropriate selection of a genetic diagnostic method for ID patients, detailed clinical data and precise clinical diagnosis are extremely important. According to the study, for syndromic ID patients, a multi-step genetic diagnostic procedure is an economical and powerful way to identify the genetic defects.
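The multi-step procedure can be summarized as a simple ordering of tests in which a case moves to the next tier only while it remains undiagnosed. The following sketch is a schematic of that ordering, not a clinical decision tool; the `Tier` enum and `next_tier` function are hypothetical names introduced for illustration, and real triage would also weigh the phenotype-specific options (such as MLPA or QF-PCR) discussed above.

```python
from enum import Enum
from typing import Iterable, Optional


class Tier(Enum):
    ROUTINE = "karyotyping and candidate gene analysis"
    CMA = "chromosomal microarray"
    WES = "whole exome sequencing"


# Order of testing assumed from the sequential strategy described in the text.
ORDER = [Tier.ROUTINE, Tier.CMA, Tier.WES]


def next_tier(completed: Iterable[Tier], diagnosed: bool) -> Optional[Tier]:
    """Return the next test to consider, or None if the case is diagnosed
    or every tier in this strategy has already been completed."""
    if diagnosed:
        return None
    done = set(completed)
    for tier in ORDER:
        if tier not in done:
            return tier
    return None
```

For example, `next_tier([Tier.ROUTINE, Tier.CMA], diagnosed=False)` returns `Tier.WES`, which mirrors how CNV-negative cases were passed on to exome sequencing.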
The application of high-throughput techniques has introduced a major advance in the genetic diagnosis of idiopathic ID and associated congenital abnormalities (Mefford et al., 2012; Wiszniewski et al., 2018). However, data interpretation and pathogenicity evaluation remain challenging, and the analysis of data is time-consuming and labor-intensive. ACMG updated the technical standards for the interpretation of SNVs and CNVs in 2015 and 2020, respectively (Richards et al., 2015; Riggs et al., 2020). Compared to the 2011 version (Kearney et al., 2011), the point-based scoring metric for CNVs pays more attention to the genomic content, the inheritance pattern, and the correlation with clinical findings. With extensive application of NGS-based techniques in clinical laboratories, richer variation databases across different populations should lead to more consistent interpretations (Smajlagić et al., 2021; Yuan et al., 2021).
In this study, it was easy to obtain a diagnosis for the recurrent pathogenic CNVs. These regions occur at genomic rearrangement hotspots on chromosomes 7, 15, 16, X, and 5. This is consistent with the results in Chinese cohorts of pediatric patients with developmental conditions (Yuan et al., 2021). For the likely pathogenic CNVs identified in 14 patients, the pathogenicity evaluation was based on existing reported cases and their absence in the normal population. The parental studies confirmed that most LP CNVs were de novo, except in case #24. Although this is debatable, case #24 could be explained by incomplete penetrance. Among all 74 CNVs, more deletions than duplications were identified. This might reflect a bias due to the small sample size. Several patients had large structural abnormalities, which were directly tested by CMA without prior chromosome analysis. The female protective effect for ID was also observed in this cohort. Our study also revealed a group of unique non-recurrent CNVs across the human genome, many of which still warrant further analysis.
For a better clinical interpretation of the VUS, the genomic regions were also analyzed with bioinformatic tools and databases. Eighteen genes were proposed to support the pathogenicity of these CNVs. According to the literature and databases, some genes have been linked to known neurological syndromes, such as MYCN, MED13L, and TCF4. Other genes, such as DLGAP2 and RBFOX1, were reported to be candidate genes for neurodevelopmental disorders (Chien et al., 2013; Zhao, 2013). Accurate interpretation of the variations still depends on evidence from functional studies of the candidate genes, by using iPSCs, genetically modified cell lines, or mouse models (Zhao and Bhattacharyya, 2018; Fell and Nagy, 2021).
It is estimated that rare SNVs account for approximately 10-20% of ID cases (Vissers et al., 2016; Harripaul et al., 2017). Here, in 35 CNV-negative ID patients, 11 pathogenic/likely pathogenic SNVs were identified in 7 cases by WES. Most of the SNVs were inherited from the parents and correlated with the GDD phenotype. For the six VUS identified in five cases, more evidence is needed to determine their pathogenicity. Interestingly, several rare damaging SNVs were also identified in known syndromic ID genes, such as RNASEH2B and EIF2B3. The inheritance pattern of the phenotypes was autosomal recessive, so ACMG interpretation was not suitable for the assessment of these single allelic SNVs. A de novo splicing variant in PIP5K1B was also identified, which was reported to be associated with autism (Marshall et al., 2008). However, the actual pathogenic mechanism remains to be further elucidated in vitro or in vivo. Consistent with previous CMA results, no exon level CNVs were called in the 35 WES cases.
Based on the literature, the underlying genetics of ID/DD seems to be extraordinarily complex. A large proportion of patients lacks a specific diagnosis. After the multi-step screening strategy, nearly half of the 105 patients remained etiologically unexplained. Owing to technical limitations, variations located in non-coding regions or low-ratio somatic mosaicism could have been missed in the study. The problem might partly be solved by comprehensively introducing whole genome sequencing (WGS), third-generation sequencing (TGS), or an iterative patient-specific approach in the clinic (Lindstrand et al., 2019; Cope et al., 2020). WGS not only robustly captures SNVs and CNVs, but also detects structural variations, short tandem repeats (STRs), runs of homozygosity (ROH), and genomic rearrangements. In certain conditions, WGS could be used as a single test instead of performing CMA followed by WES. In addition, a polygenic basis of ID or imperceptible environmental factors might also explain the missing heritability.
Certain limitations of the study should be mentioned. The size and composition of the cohort were perhaps the most important. Since our study was based on previous genetic screening, 1,604 ID patients were recruited from the disabled children evaluation center, including 1,002 patients with known etiological risk factors and only 602 patients with genetic screening results. Then 44 patients with negative screening results and 61 idiopathic patients from the pediatric rehabilitation center were strictly selected and tested by CMA and WES sequentially. The 105 cases came from a variety of sources and the sample size is very limited, so some of the findings may be biased. Moreover, most of our results were not further validated by other techniques. We also did not demonstrate a direct relationship between these variations and the ID phenotype through functional studies. Therefore, a larger sample size, rigorous inclusion criteria, a well-defined multistage screening protocol, and more comprehensive genetic diagnostic approaches are required to obtain a comprehensive overview of the genetic etiology of ID in Chongqing.
In summary, our study explored the genetic etiology of a large ID cohort by using an efficient sequential high-throughput diagnostic strategy. For the strictly selected idiopathic ID cases, CMA and WES might be effective diagnostic tools to greatly improve the diagnostic yield in the clinic. These data further extend the variation spectrum of ID in this district and provide clues for the exploration of neurodevelopmental genes.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Army Medical University. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin. Written informed consent was obtained from the individual(s), and minor(s)' legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
LD, DZ, ZW, and XG did the experiments and analysed the data. DZ, ZW, MM, and LL collected the samples and the clinical data. HG prepared the manuscript. HG, YZ, and YB designed and supervised the study.

FUNDING
This work was supported by the National Natural Science Foundation of China (No. 81570217), the Natural Science Foundation Project of Chongqing (cstc2018jcyjAX0641, cstc2017jcyjAX0478), and the Project of Chongqing Health and Family Planning Commission (No. 2017-111).

ACKNOWLEDGMENTS
Some or all data, models, or code generated or used during the study are available from the corresponding author by request. All data should be uploaded to online databases in the coming few months. We would like to thank the family members for participating in this study.