Rare Copy Number Variations and Predictors in Children With Intellectual Disability and Epilepsy

Introduction: The concurrence of intellectual disability/global developmental delay and epilepsy (ID/GDD-EP) is very common in the pediatric population. The etiologies for both conditions are complex and largely unknown. The predictors of significant copy number variations (CNVs) are known for the cases with ID/GDD, but unknown for those with exclusive ID/GDD-EP. Importantly, the known predictors are largely from the same ethnic group; hence, they lack replication. Purpose: We aimed to determine and investigate the diagnostic yield of CNV tests, new causative CNVs, and the independent predictors of significant CNVs in Chinese children with unexplained ID/GDD-EP. Materials and methods: A total of 100 pediatric patients with unexplained ID/GDD-EP and 1,000 healthy controls were recruited. The American College of Medical Genetics guideline was used to classify the CNVs. Additionally, clinical information was collected and compared between those with significant and non-significant CNVs. Results: Twenty-eight percent of the patients had significant CNVs, 16% had variants of unknown significance, and 56% had non-significant CNVs. In total, 31 CNVs were identified in 28% (28/100) of cases: 25 pathogenic and 6 likely pathogenic. Eighteen known syndromes were diagnosed in 17 cases. Thirteen rare CNVs (8 novel and 5 reported in literature) were identified, of which three spanned dosage-sensitive genes: 19q13.2 deletion (ATP1A3), Xp11.4-p11.3 deletion (CASK), and 6q25.3-q25.3 deletion (ARID1B). By comparing clinical features in patients with significant CNVs against those with non-significant CNVs, a statistically significant association was found between the presence of significant CNVs and speech and language delay for those aged above 2 years and for those with facial malformations, microcephaly, congenital heart disease, fair skin, eye malformations, and mega cisterna magna. 
Multivariate logistic regression analysis identified two independent predictors of significant CNVs: eye malformations and facial malformations. Conclusion: Our study supports the use of CNV tests in pediatric patients with unexplained ID/GDD-EP, as the diagnostic yield is high and informs genetic counseling. It adds 13 rare CNVs (8 novel) that may account for both conditions. Moreover, congenital eye and facial malformations are clinical markers that can help clinicians identify which patients are likely to benefit from CNV testing, thus helping patients avoid unnecessary and expensive tests.


INTRODUCTION
Intellectual disability (ID) is a complex neurodevelopmental disorder characterized by a low intelligence quotient (IQ <70) and limitations in adaptive functioning, normally diagnosed before the age of 18 years (1). Global developmental delay (GDD) is defined as significant delay in two or more developmental domains. Significant delay is determined by performance that is two or more standard deviations below the mean on objective, norm-referenced, and age-appropriate testing (2). The term GDD is used for children who are <5 years of age, as there is some disagreement on how to objectively measure IQ and cognition in a consistent, reliable, and valid fashion in these patients (3). Epilepsy (EP) is a brain disease characterized by at least two unprovoked seizures occurring >24 h apart (International League Against Epilepsy [ILAE]) (4). Epilepsy is very common in children with intellectual disability/global developmental delay (ID/GDD), with an estimated prevalence of about 22.2% (5,6). Patients with both ID/GDD and epilepsy (ID/GDD-EP) have a mortality rate 3.3 times higher than those with ID/GDD alone (7).
Genetic etiologies, prenatal illnesses, postnatal illnesses, environmental factors, and metabolic diseases can account for ID/GDD-EP. For a substantial proportion of people with both conditions, the underlying etiology has yet to be elucidated. Camfield et al. found that ∼37% of their patients had an unknown etiology (8). Patients without an obvious cause can be classified as having unexplained ID/GDD-EP. Obvious causes include conditions such as metabolic diseases, infectious diseases, immunological conditions, autoimmune diseases, brain injuries, Down syndrome, and tuberous sclerosis complex.
Genetic factors play a major role in etiology, especially in pediatric patients, who are highly heterogeneous (8). Copy number variations (CNVs) are important mechanisms of genomic diversity and evolutionary change in humans. Over the past decade, genome-wide identification of CNVs has become efficient, allowing the detection of causative submicroscopic chromosomal aberrations, especially in neuropsychiatric disorders, including autism spectrum disorder, ID/GDD, EP, and schizophrenia (9). Borlot et al. found a high prevalence of pathogenic CNVs in their cohort of adults with pediatric-onset epilepsy and intellectual disability (10). However, only a few studies have examined the role of CNV tests in pediatric patients with ID/GDD-EP, especially in China.
Copy number variations can be detected by multiple methods, including array-based comparative genomic hybridization (aCGH), single nucleotide polymorphism (SNP) arrays, and next-generation sequencing (CNV-seq) (11). Although CNV tests are useful in identifying the etiologies of neuropsychiatric disorders, they are expensive. The diagnostic yield ranges from 15 to 20%; hence, not all cases have pathogenic CNVs (11). Therefore, it is important to have clinical markers/predictors that could guide clinicians in ordering this test. Few studies have focused on identifying the predictors of significant (pathogenic or likely pathogenic) CNVs in patients with ID/GDD, and most of them are from the same ethnic group and, hence, lack replication (12-15). Moreover, no study has focused on identifying the predictors of significant CNVs among patients with exclusively unexplained ID/GDD-EP.
In this study, we sought to determine and investigate the diagnostic yield of CNV tests, new causative CNVs, and the independent predictors of significant CNVs in Chinese children with unexplained ID/GDD-EP. This study identifies new CNVs, which can be accountable for both ID/GDD and EP, and provides some clinical markers to help clinicians understand which patients can benefit from the CNV testing and which will not, thus helping patients avoid unnecessary and expensive tests, especially for resource-limited countries.

Ethical Clearance
This study was reviewed and approved by the Institutional Ethics Committee of Xiangya Hospital, Central South University, in compliance with the Declaration of Helsinki (World Medical Association, 1964) on ethical principles for medical research involving human subjects and its subsequent revisions (2013). Written informed consent was obtained from the subjects.

Human Subjects
A total of 100 pediatric patients, diagnosed by senior neurologists with unexplained ID/GDD-EP at the Children's Intellectual Disability Medical Institution, Department of Pediatric Neurology, Xiangya Hospital, were recruited retrospectively from the hospital database. In total, 1,000 in-house healthy controls from the Chinese population were recruited. The inclusion criteria for this cross-sectional study were: (1) aged 14 years or below, (2) diagnosed with unexplained ID with IQ < 70 or with unexplained GDD with DQ < 85, (3) diagnosed with unexplained EP that began after ID/GDD, and (4) underwent a CNV test. The exclusion criteria were: (1) diagnosed with only unexplained ID/GDD without EP, (2) diagnosed with only unexplained EP without ID/GDD, (3) diagnosed with ID/GDD that began after EP, and (4) diagnosed with unexplained ID/GDD-EP but did not undergo a CNV test.

Diagnostic Protocol
ID was assessed according to the DSM-5 diagnostic criteria for intellectual disabilities (Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, American Psychiatric Association, 2013). Observations, clinical interviews, and standardized age-related rating scales were used to assess adaptive functioning. However, the diagnosis was often initially formulated based on clinical judgment rather than on formal standardized assessments, especially for young patients (16). The standardized age-related rating scales used were: Gesell Developmental Schedules for patients younger than 2-4 years, Wechsler Preschool and Primary Scale of Intelligence-Fourth Edition (WPPSI-IV) for patients between 4 and 6 years, and Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV) for patients who were 6 years old or above. Clinical judgment was used to grade the severity of GDD for patients aged below 2 years. Patients with deficits in adaptive and intellectual functioning with an onset during the developmental period were classified into four categories: mild ID when IQ values ranged from 55 to 70, moderate ID when IQ values ranged from 40 to 55, severe ID when IQ values ranged from 25 to 40, and profound ID when IQ values were less than 25. Patients with DQ ranging from 65 to 84 were considered to have mild GDD, those with DQ ranging from 45 to 64 were considered to have moderate GDD, and those with DQ of 44 or less were considered to have severe GDD. Epilepsy was diagnosed according to International League Against Epilepsy (ILAE) guidelines. Clinical, neurophysiological, and imaging data were considered in defining phenotypes. Phenotypes were classified into known electro-clinical syndromes according to the ILAE classification, where possible, or into "unclassified epilepsy" when they did not fit the criteria for any syndrome.
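The severity bands above can be expressed as a small lookup. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, and because the IQ bands in the text share their boundary values (55, 40, and 25), the sketch assumes that a boundary score is assigned to the milder band and that IQ of 70 or above and DQ of 85 or above fall outside the ID/GDD range, in line with the inclusion criteria.

```python
def classify_id(iq: float) -> str:
    """Map an IQ score to the ID severity bands described in the text.

    Assumption: shared boundary values (55, 40, 25) go to the milder band,
    and IQ >= 70 is treated as outside the ID range (inclusion was IQ < 70).
    """
    if iq >= 70:
        return "no ID"
    if iq >= 55:
        return "mild"
    if iq >= 40:
        return "moderate"
    if iq >= 25:
        return "severe"
    return "profound"


def classify_gdd(dq: float) -> str:
    """Map a developmental quotient (DQ) to the GDD severity bands in the text.

    Assumption: DQ >= 85 is treated as outside the GDD range (inclusion was DQ < 85).
    """
    if dq >= 85:
        return "no GDD"
    if dq >= 65:
        return "mild"
    if dq >= 45:
        return "moderate"
    return "severe"
```

For example, under these assumptions `classify_id(38)` returns `"severe"` and `classify_gdd(70)` returns `"mild"`.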

Copy Number Variation Tests
Whole-genome CNVs were detected using the CNV_01 Affymetrix aCGH+SNP Microarray for 8 patients, the Illumina HumanCytoSNP-12 BeadChip for 6 patients, and low-depth whole-genome sequencing for the remaining patients. The 1,000 healthy controls were genotyped using the Illumina Human610-Quad BeadChip (17). The CNV_01 Affymetrix aCGH+SNP Microarray incorporates 200,000 of the best SNPs for CNV testing (https://www.thermofisher.com). The Illumina HumanCytoSNP-12 BeadChip is a whole-genome scanning panel that incorporates 300,000 of the best SNPs for CNV testing and has dense coverage of around 250 genomic regions commonly screened in cytogenetics laboratories (https://www.illumina.com). Low-depth whole-genome sequencing is a cost-effective approach that detects low-frequency and rare variation in complex trait-association studies (http://www.biorxiv.org). Chromosome coordinates refer to genome build hg19.

Copy Number Variation Analysis and Classification
The pathogenicity of CNVs was predicted based on their size and gene content according to the American College of Medical Genetics (ACMG) guidelines for interpretation of postnatal CNVs (18). All CNVs detected in patients were first compared with the DGV database (Database of Genomic Variants, http://dgv.tcag.ca/gb2/gbrowse/dgv) and with the 1,000 healthy controls (17). The candidate CNVs that did not overlap with those in the DGV and healthy controls were then compared with DECIPHER (Database of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources, https://decipher.sanger.ac.uk) and ClinVar (https://www.ncbi.nlm.nih.gov/clinvar). The genes located in each chromosomal region of interest and their functions were studied for a potential role in ID/GDD, EP, or both, using all available evidence in databases such as Online Mendelian Inheritance in Man (OMIM), ClinGen, GeneReviews, and PubMed. For each relevant OMIM gene that could play a role in the patient's phenotype, the residual variation intolerance score (RVIS) and the percentile among the most intolerant genes (MIG) were applied to guide the interpretation of potential clinical significance (19). The RVIS is based on allele frequency as represented in whole-exome sequence data from the National Heart, Lung, and Blood Institute Exome Sequencing Project 6500 datasets. A gene with a positive score has more common functional variation, whereas a gene with a negative score has less variation and is referred to as intolerant. The more intolerant a gene is, the more likely it is to be associated with disease when mutated or present at abnormal dosage. Additionally, the Human Protein Atlas was used to study tissue expression of the genes (https://www.proteinatlas.org/). Parents were analyzed, where available, to determine whether the CNV had arisen de novo or was inherited. Furthermore, parental phenotypes were taken into account to help determine whether the CNV was significant.
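The first filtering step described above (discarding patient CNVs that overlap CNVs seen in the DGV or in healthy controls) can be sketched as an interval comparison. This is an illustrative sketch only: the `CNV` record, the function names, and the `max_overlap` parameter are hypothetical, the coordinates in the usage note are invented, and the text does not state the exact overlap criterion that was applied. The default here assumes strict non-overlap with any reference CNV of the same type.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int  # hg19 coordinate, inclusive
    end: int    # hg19 coordinate, inclusive
    kind: str   # "del" or "dup"


def overlap_fraction(query: CNV, ref: CNV) -> float:
    """Fraction of the query interval covered by ref.

    Returns 0.0 when the two CNVs lie on different chromosomes or are of
    different types (assumption: a deletion is only matched to deletions).
    """
    if query.chrom != ref.chrom or query.kind != ref.kind:
        return 0.0
    shared = min(query.end, ref.end) - max(query.start, ref.start) + 1
    return max(shared, 0) / (query.end - query.start + 1)


def candidate_cnvs(patient: List[CNV], references: List[CNV],
                   max_overlap: float = 0.0) -> List[CNV]:
    """Keep patient CNVs whose overlap with every reference CNV is <= max_overlap.

    Each reference CNV is checked on its own; coverage by several
    references is not combined in this sketch.
    """
    return [
        cnv for cnv in patient
        if all(overlap_fraction(cnv, ref) <= max_overlap for ref in references)
    ]
```

With invented coordinates, a patient deletion at chr1:100-200 would be dropped if a control carries a deletion at chr1:150-300, whereas a patient duplication would be retained if the only reference CNV at that locus is a deletion. The retained candidates would then be taken forward for comparison with DECIPHER and ClinVar as described.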

Validation of Suspected Pathogenic Copy Number Variations
Parents of patients with suspected pathogenic CNVs were tested for validation whenever they were available and gave consent. Fluorescence in situ hybridization, quantitative polymerase chain reaction, or CNV tests were used for validation and to identify the mode of CNV inheritance.

Statistical Analysis
For the purpose of analysis, we divided patients into two major groups. The first group consisted of patients with significant CNVs (pathogenic and likely pathogenic). The second group consisted of patients with non-significant CNVs (likely benign, benign, or no CNV detected). Carriers of variants of unknown significance (VOUS) were excluded from the analysis, since their pathogenic role was unknown. The clinical data of the two groups were compared and analyzed using IBM SPSS Statistics 22.0 software (IBM, Armonk, NY, USA). Missing values were excluded from the analysis. The association between each clinical feature and the CNV test result was first assessed using a Chi-squared test (2-sided), or Fisher's exact test (2-sided) when one or more of the cells had an expected value of <5. Furthermore, to identify the independent predictors of significant CNVs, all variables that were significant in the univariate analysis were entered into a multivariate logistic regression model with backward stepwise selection of variables, using the presence of significant CNVs as the dichotomous dependent variable. Odds ratios, standard errors, 95% confidence intervals (CI), and positive and negative predictive values were calculated. Results with P ≤ 0.05 were considered statistically significant.
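The univariate rule described above (Chi-squared test unless any expected cell count is <5, in which case Fisher's exact test) can be sketched for a 2x2 table using only the Python standard library. This is a sketch under stated assumptions rather than a reproduction of the SPSS analysis: the function names are hypothetical, the Chi-squared statistic is the Pearson statistic without continuity correction, and the two-sided Fisher P-value sums the probabilities of all tables that are no more probable than the observed one. The multivariate logistic regression step is not covered here.

```python
from math import comb, erfc, sqrt


def expected_counts(table):
    """Expected cell counts of a 2x2 table under independence."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return [[r * k / n for k in cols] for r in rows]


def chi2_p(table):
    """Pearson Chi-squared P-value for a 2x2 table (1 df, no continuity correction)."""
    exp = expected_counts(table)
    stat = sum((table[i][j] - exp[i][j]) ** 2 / exp[i][j]
               for i in range(2) for j in range(2))
    # Survival function of the Chi-squared distribution with 1 degree of freedom.
    return erfc(sqrt(stat / 2))


def fisher_p(table):
    """Two-sided Fisher's exact P-value for a 2x2 table."""
    (a, b), (c, d) = table
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell given the margins.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def association_p(table):
    """Pick the test by the expected-count rule and return (test name, P-value)."""
    use_fisher = any(e < 5 for row in expected_counts(table) for e in row)
    if use_fisher:
        return "fisher", fisher_p(table)
    return "chi2", chi2_p(table)
```

Each table is laid out as `[[feature present, feature absent] for significant CNVs, [feature present, feature absent] for non-significant CNVs]`. With invented counts such as `[[3, 1], [1, 3]]`, every expected count is 2, so the sketch falls back to Fisher's exact test.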

Copy Number Variation Test Results
Twenty-four percent (24/100 cases) had at least one pathogenic CNV each, 4% (4/100 cases) had at least one likely pathogenic CNV, 16% (16/100 cases) had VOUS, 49% (49/100 cases) had benign CNVs, and 7% (7/100 cases) had no CNV detected. Thirty-one CNVs (25 pathogenic and 6 likely pathogenic) were identified in 28% (28/100 cases) as possible causes for unexplained ID/GDD-EP. One case had 3 CNVs (all were likely pathogenic), one case had 2 pathogenic CNVs, 23 cases had one pathogenic CNV each, and 3 cases had one likely pathogenic CNV each. The size of their CNVs ranged from 0.11 Mb to 21 Mb. Twenty of all detected pathogenic or likely pathogenic CNVs were deletions and 13 were duplications. Eight patients had de novo CNVs, while others had an unknown mode of inheritance since their parents were not tested (Supplementary Table 1).
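The per-case breakdown above can be tallied to confirm that it agrees with the reported totals of 28 cases and 31 CNVs (25 pathogenic and 6 likely pathogenic). The snippet below is a small arithmetic check using only the counts stated in the text; the variable and function names are illustrative.

```python
# (number of cases, CNVs per case, classification) as reported in the text
case_groups = [
    (1, 3, "likely pathogenic"),
    (1, 2, "pathogenic"),
    (23, 1, "pathogenic"),
    (3, 1, "likely pathogenic"),
]


def tally(groups):
    """Return the total number of cases and the number of CNVs per class."""
    cases = sum(n_cases for n_cases, _, _ in groups)
    per_class = {}
    for n_cases, cnvs_per_case, label in groups:
        per_class[label] = per_class.get(label, 0) + n_cases * cnvs_per_case
    return cases, per_class


cases, per_class = tally(case_groups)
total_cnvs = sum(per_class.values())
# cases == 28, per_class == {"likely pathogenic": 6, "pathogenic": 25}, total_cnvs == 31
```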

Comparison of CNVs and Clinical Findings (N = 84)
By comparing seizure characteristics, clinical information, EEG findings, brain imaging findings, and clinical features in patients with significant CNVs against those with non-significant CNVs, a statistically significant association was found between the presence of significant CNVs and speech and language delay for those aged above 2 years (P = 0.025) and for those with facial malformations (P < 0.001), microcephaly (P = 0.014), congenital heart disease (P = 0.005), fair skin (P = 0.034), eye malformations (P = 0.037), and mega cisterna magna (P = 0.034) (Table 5). Eye malformations (P = 0.032) and facial malformations (P = 0.027) were identified as independent predictors of significant CNVs by multivariate logistic regression analysis (Table 6). The positive predictive value of the adjusted model was 65%, whereas the negative predictive value was 80% (Table 7).

DISCUSSION
This study focuses on pediatric patients with unexplained ID/GDD-EP, and its major aims were to identify the diagnostic yield of CNV tests, new causative CNVs, and the independent predictors of significant CNVs. Thirty-one CNVs (25 pathogenic and 6 likely pathogenic) were identified in 28% (28/100) of cases, and 17 cases had 18 known syndromes. One case had two CNVs corresponding to two different syndromes: 3q29 microduplication syndrome and 1q43q44 microdeletion syndrome. Three rare non-recurrent CNVs encompassing dosage-sensitive genes were identified: deletions at 19q13.2 (ATP1A3), Xp11.4-p11.3 (CASK), and 6q25.3-q25.3 (ARID1B). The relative burden conveyed by each clinical feature accompanying unexplained ID/GDD-EP was evaluated by both univariate and multivariate analysis in 84 patients (28 patients with significant CNVs and 56 patients with non-significant CNVs). On examining the association between significant CNVs and the clinical findings of patients with unexplained ID/GDD-EP, we found that speech and language delay for those aged above 2 years (P = 0.025), facial malformations (P < 0.001), microcephaly (P = 0.014), congenital heart disease (P = 0.005), fair skin (P = 0.034), eye malformations (P = 0.037), and mega cisterna magna (P = 0.034) were associated with positive CNV tests. However, only eye malformations (P = 0.032) and facial malformations (P = 0.027) were independent predictors according to multivariate logistic regression. Their positive and negative predictive values were 65 and 80%, respectively. The role of CNVs has been well-studied in children with EP (20-22) and ID/GDD (23,24); however, only a few studies have focused on patients with ID/GDD-EP. The diagnostic yield of CNV tests in patients with unexplained EP ranges from 5 to 15% (21,25,26), while for individuals with unexplained ID/GDD, ASD, or multiple congenital anomalies, it is estimated to range from 15 to 20% (11).
This study of pediatric patients with unexplained ID/GDD-EP revealed that a high proportion of our patients had clinically relevant CNVs. Twenty-eight probands (28%) had 25 pathogenic and 6 likely pathogenic CNVs. Fry et al. and Mullen et al. studied pediatric patients with genetic generalized epilepsy with ID and found prevalences of pathogenic and likely pathogenic CNVs of 8.8 and 28%, respectively (27,28). Our finding is similar to that of Mullen et al. (28). Borlot et al. investigated adult patients with pediatric-onset ID and EP and found the prevalence of pathogenic and likely pathogenic CNVs to be 16.1% (10). Their study was based in Canada, where there are multiple ethnicities. It involved 143 adults, most of whom had mild ID associated with focal or febrile seizures that started 1 year after birth. Our study involved 100 Chinese children, most of whom had severe ID associated with focal seizures that started within the first year of life. Our study revealed a higher diagnostic yield than that of Borlot et al. This may be explained by the fact that patients with severe ID/GDD and EP tend to die earlier than those with mild ID/GDD and EP. We speculate that the study performed by Borlot et al. might have missed some cases with severe ID/GDD due to early deaths. However, the higher diagnostic yield in our study could also be due to ascertainment bias. Overall, our findings support the use of CNV testing in pediatric patients with unexplained ID/GDD-EP, as the diagnostic yield is high and informs genetic counseling.
Eighteen known syndromes were diagnosed in 17 cases with pathogenic CNVs: Angelman syndrome in two cases, Prader-Willi syndrome in two cases, Lubs X-linked mental retardation syndrome in three cases, 16p11.2-p12.2 microdeletion syndrome in three cases, 1p36 microdeletion syndrome in two cases, 2q33.1 deletion syndrome/Glass syndrome in one case, 17p13.1 microdeletion syndrome in one case, 1q21.1 recurrent microdeletion syndrome in one case, 3q29 microduplication syndrome together with 1q43q44 microdeletion syndrome in one case, and Wolf-Hirschhorn syndrome in one case. Borlot et al. found that 16 of 23 adult patients with ID/GDD-EP who carried pathogenic or likely pathogenic CNVs had known syndromes (10). Therefore, our result, along with that of Borlot et al., reinforces the supposition that a large number of patients with ID/GDD-EP might have CNVs that correspond to known microduplication or microdeletion syndromes. One case (case number 20) had two CNVs corresponding to two different syndromes: 3q29 microduplication syndrome and 1q43q44 microdeletion syndrome (29). The 3q29 microduplication syndrome is characterized by mild-to-moderate ID/GDD, microcephaly, and minor dysmorphic features (30). The 1q43q44 microdeletion syndrome is characterized by ID/GDD, EP, microcephaly, anomalies of the corpus callosum, and facial dysmorphism (29). Our case presented with severe ID/GDD, EP, microcephaly, facial dysmorphism (hypertelorism, long and smooth philtrum, thin vermilion borders, and micrognathia), recurrent respiratory tract infections, and stereotypic movements. A de novo deletion at 1q44q44 spanning HNRNPU and ZBTB18 has been described in cases with ID, EP, microcephaly, and hypogenesis of the corpus callosum (29,31). This is similar to our patient's phenotype, except that she had no corpus callosum dysplasia. Our patient's CNV spans HNRNPU only. The absence of ZBTB18 from our patient's CNV could explain the lack of corpus callosum dysplasia.
A de novo duplication at 3q29-q29 (880 Kb) spans CEP19, PCYT1A, RNF168, TCTEX1D2, and TFRC. However, none of these genes is known to be associated with ID/GDD, EP, or both. Therefore, 1q43q44 microdeletion syndrome is likely to explain our patient's phenotype, with a few additional clinical features.
Three rare pathogenic CNVs of interest that span dosage-sensitive genes were identified. The first is a de novo deletion at 19q13.2 (333 Kb); this interval contains the ATP1A3 gene. Mutations in the ATP1A3 gene have been reported to be associated with alternating hemiplegia, as well as both ID/GDD and EP, in a cohort of 34 patients (32). This coincides with our first patient's phenotype, except that he had hypotonia without hemiplegia; however, he will continue to be monitored. The second CNV is a deletion at Xp11.4-p11.3 (2.14 Mb); this interval contains the CASK gene. Mutations in this gene commonly affect females, who present with severe ID, microcephaly, and variable degrees of pontocerebellar hypoplasia (33). Additionally, some may have epileptic spasms, sensorineural hearing loss, eye anomalies, overall poor growth, and dysmorphic features, including broad nasal bridge and tip, large ears, long philtrum, micrognathia, and hypertelorism (34). Our case number 4 presented with microcephaly, facial dysmorphism, visual and hearing impairment, severe ID, late-onset epileptic spasms, and abnormal signals near the periventricular region. However, our case had no pontocerebellar hypoplasia. The third CNV is a deletion of unknown mode of inheritance at 6q25.3-q25.3 (240 Kb) spanning the ARID1B gene. This gene is associated with Coffin-Siris syndrome. Our case number 6 presented with moderate GDD, EP, facial dysmorphism (low frontal hairline, lateral sparse eyebrows, alternating ptosis, bulbous nasal tip, long and smooth philtrum, prominent upper lip, high palate, and mild retrognathia), abnormal fifth (pinky) finger, atrial septal defect, hearing loss, and enlarged cisterna magna, and therefore fits the diagnosis of Coffin-Siris syndrome. Consequently, we conclude that these three CNVs are implicated in ID/GDD-EP.
In this study, it was also found that congenital eye and facial malformations can independently predict the presence of significant CNVs among pediatric patients with unexplained ID/GDD-EP (P = 0.032 and P = 0.027, respectively). Their positive and negative predictive values were 65% and 80%, respectively. That is, 65% of the patients with these clinical features might have significant CNVs, whereas 80% of the patients lacking these clinical features might have non-significant CNVs. To our knowledge, this is the first study to investigate the predictive values of these clinical features; hence, the findings remain inconclusive, and studies with larger sample sizes are needed. Preiksaitiene et al. studied a large cohort of patients with ID/GDD and found a significant association between eye malformations and significant CNVs; however, eye malformations were not independent predictors in their study (35). Our study marks the first time that congenital eye malformations are reported to be an independent predictor of the presence of significant CNVs. Caramaschi et al. studied patients with ID/GDD and found that facial malformation can be an independent predictor of the presence of significant CNVs (14). Taken together, these studies and ours suggest that the presence of facial malformations in patients with either ID/GDD or ID/GDD-EP may indicate pathogenic or likely pathogenic CNVs in about 65% of cases.
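The interpretation of the predictive values given above follows directly from their definitions on a 2x2 table of marker status against CNV result. The sketch below shows the calculation. The function name is hypothetical, and the counts in the usage note are invented purely so that the output matches the reported 65% and 80%; they are not the study's data, which are reported in Table 7.

```python
def predictive_values(tp: int, fp: int, fn: int, tn: int):
    """Return (PPV, NPV) from a 2x2 table.

    tp: marker present, significant CNV
    fp: marker present, non-significant CNV
    fn: marker absent, significant CNV
    tn: marker absent, non-significant CNV
    """
    ppv = tp / (tp + fp)  # share of marker-positive patients with a significant CNV
    npv = tn / (tn + fn)  # share of marker-negative patients with a non-significant CNV
    return ppv, npv
```

As an illustration with invented counts, `predictive_values(tp=13, fp=7, fn=10, tn=40)` gives a PPV of 13/20 and an NPV of 40/50. Both values depend on how common significant CNVs are in the tested population, so they would not necessarily carry over to a cohort with a different yield.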
In addition, speech and language delay for those aged above 2 years (P = 0.025), microcephaly (P = 0.014), congenital heart disease (P = 0.005), fair skin (P = 0.034), and mega cisterna magna (P = 0.034) showed significant associations; however, they were not independent predictors in our cohort. This could be attributed to the small sample size. Nevertheless, they are promising indicators that warrant further studies with larger sample sizes. Shoukier et al. also found a significant association between significant CNVs and microcephaly in patients with ID/GDD (15). A positive association between significant CNVs and congenital heart disease in patients with ID/GDD has been reported (13,36,37). A chromosomal aberration can contribute to the etiology of speech and language delay (38,39). Fair skin is commonly seen in patients with Prader-Willi syndrome and Angelman syndrome (40,41), and it was also noticed in our patients with these diagnoses. Isolated mega cisterna magna can lead to impaired memory and speech (42). It has been reported to be associated with psychiatric conditions such as mania, autism, and catatonic schizophrenia (43), and with mild syndromic ID (44). This association points to its role in neurodevelopmental disorders. This is the first study to show a significant association of pathogenic or likely pathogenic CNVs with speech delay and mega cisterna magna. This could be due to our inclusion criteria, which were restricted to patients with unexplained ID/GDD-EP, whereas other studies involved patients with ID/GDD with or without EP. Lastly, we could not find any association between significant CNVs and seizure semiology or EEG findings.
Despite the aforementioned strengths, our study has some limitations. First, it involved a small sample size and was prone to information bias because it was a retrospective study. Second, we identified many CNVs whose pathogenic mechanisms could not be explained, since no gene function analysis was performed. Therefore, those CNVs require further functional evaluation and comparison with similar cases. Moreover, we identified two clinical markers for pediatric patients with ID/GDD-EP, but these findings need to be replicated in future studies involving larger sample sizes. Lastly, different CNV detection methods were used, and these differ in sensitivity.

CONCLUSION
Twenty-eight percent (28/100) of our cases had 25 pathogenic and 6 likely pathogenic CNVs, including 18 syndromes. However, the high rate (28%) of significant CNVs in our study could be due to ascertainment bias. We identified 13 rare CNVs (8 novel and 5 reported in the literature), three of which span dosage-sensitive genes: 19q13.2 deletion (ATP1A3), Xp11.4-p11.3 deletion (CASK), and 6q25.3-q25.3 deletion (ARID1B). Furthermore, we found two independent clinical markers with a 65% positive predictive value: congenital eye and facial malformations (P = 0.032 and P = 0.027, respectively). These clinical markers can help clinicians identify which patients are likely to benefit from CNV testing, thus helping patients avoid unnecessary and expensive tests, especially in resource-limited countries.

AUTHOR CONTRIBUTIONS
MK and JX share the primary authorship for this manuscript. MK and JX participated in the entire process of the project, analyzed all the data, and wrote the manuscript. LW and LY analyzed patients' data. FH and CC prepared tables. NP, HD, WZ, and AA analyzed cytogenetic results according to ACMG guidelines. FY and JP are the corresponding authors, who initiated the study and supervised each and every step of this study. All of the authors have reviewed the manuscript and agreed to be accountable for all aspects of work ensuring integrity and accuracy.