Whole-Exome Sequencing Identifies Damaging de novo Variants in Anencephalic Cases

Background Anencephaly is a lethal neural tube defect (NTD). Although variants in several genes have been implicated in the development of anencephaly, a more complete picture of variants in the genome, especially de novo variants (DNVs), remains unresolved. We aim to identify DNVs that play an important role in the development of anencephaly by performing whole-exome DNA sequencing (WES) of proband–parent trios. Results A total of 13 DNVs were identified in 8 anencephaly trios by WES, including two loss of function (LoF) variants detected in pLI > 0.9 genes (SPHKAP, c.2629_2633del, and NCOR1, p.Y1907X). Damaging DNVs were identified in 61.5% (8/13) of the anencephalic cases. Independent validation was conducted in an additional 502 NTD cases. Gene inactivation using targeted morpholino antisense oligomers and rescue assays were conducted in zebrafish, and transfection expression in HEK293T cells. Four DNVs in four cases were identified and predicted to alter protein function, including p.R328Q in WD repeat domain phosphoinositide-interacting 1 (WIPI1). Three variants, p.G313R, p.T418M, and p.L406P, in the WIPI1 gene were identified from the independent replication cohort consisting of 502 cases. Functional analysis suggested that the wipi1 p.L406P and p.R328Q variants most likely displayed loss-of-function effects during embryonic development. Conclusion De novo damaging variants are the main culprit for majority of anencephalic cases. Missense variants in WIPI1 may play a role in the genetic etiology of anencephaly, and LoF variants in SPHKAP and NCOR1 may also contribute to anencephaly. These findings add to our existing understanding of the genetic mechanisms of NTD formation.


INTRODUCTION
Neural tube defects (NTDs) are severe congenital malformations of the central nervous system (including anencephaly, spina bifida, and encephalocele) that are caused by a partial or incomplete closure of the neural tube during embryogenesis (Wallingford et al., 2013). Infants with anencephaly are mostly stillborn or die shortly after birth, while infants with spina bifida and encephalocele may survive but suffer from physical and developmental disabilities with varying degrees of severity. NTDs in humans are considered to have a multifactorial etiology, with contributions from both genetic and environmental factors (Kibar et al., 2007a;Haddow, 2011;Wallingford et al., 2013;Kang et al., 2018). Despite a long history of etiological studies in humans (Blom et al., 2006;Wallingford et al., 2013;Murdoch et al., 2014;Wilde et al., 2014), the causative mechanism underlying the development of NTDs remains largely unknown.
Etiologically, NTDs are understood to have a significant genetic component with an estimated heritability of 60% (Bassuk and Kibar, 2009). While over ∼300 causative genes have been linked to mouse NTDs Juriloff, 2007, 2010;Juriloff and Harris, 2012;Wilde et al., 2014), identification of NTD genes in humans is difficult, and the predisposing genetic factors for human NTDs are still unclear. To characterize these genetic underpinnings, efforts have been made toward identifying common, rare, or de novo DNA variants that contribute to the occurrence of NTDs. Very few common DNA variations, such as variations in folate-related genes including methylenetetrahydrofolate reductase (MTHFR), have been identified using genetic association studies; and these common variations might at most confer a modest risk and account for only a very small portion of disease heritability. The other "undiscovered" heritability may be attributed to rare variants that are uncommon in the general population but probably producing larger adverse genetic effects on NTDs than common variants (Manolio et al., 2009). As described by Chen et al. (2018), the genetic contribution in the form of multiple singleton loss of function (LoF) variants can come from any number of KEGG ontogeny groups spread over the genome.
Rare variant discovery has previously proceeded on sequencing of candidate genes identified using a priori data from animal models or based on functional relevance. According to research clues provided by animal models, the strategy of targeting candidate genes has successfully identified a number of causative variations in genes primarily from the planar cell polarity (PCP) signaling pathway that controls the process by which cells become polarized within the plane of an epithelium in numerous tissues in both Drosophila and in vertebrates (Simons and Mlodzik, 2008;Gray et al., 2011). For example, rare variants in the core PCP genes CELSR1, FZD6, PRICKLE1, DVL2, VANGL1, and VANGL2 (Kibar et al., 2007b;Lei et al., 2010;Bosoi et al., 2011;Seo et al., 2011;Allache et al., 2012;De Marco et al., 2012Juriloff and Harris, 2012) have been established as human NTD risk factors. However, candidate gene studies in NTDs are slow and inherently biased, and have faced limited success in identifying more causative genes predisposing to human NTDs, further demonstrating the need for novel alternative approaches.
With rapid development of high-throughput-sequencing technology and the concurrent reduction in sequencing costs, whole-exome sequencing (WES) now is becoming a powerful approach to investigating new candidate genes and new variants associated with human disease (Riviere et al., 2012;Nava et al., 2014;Schwarze et al., 2018). In particular, the application of WES in parent-offspring trios or multiplex families has been successful in identifying candidate pathogenic de novo variants (DNVs) in patients with neurodevelopmental disorders (Yates et al., 2017), such as intellectual disability (Rauch et al., 2012), autism spectrum disorder (Neale et al., 2012), and schizophrenia (Girard et al., 2011). To date, only one Canadian research group has published WES data from 43 trios affected with NTDs (35 myelomeningocele and 8 anencephaly), and another eight families with both forms of open and closed NTDs (including seven myelomeningocele, four spina bifida occulta, one anencephaly, and so on), demonstrating an important role that LoF DNVs play in the development of NTDs (Lemay et al., 2015(Lemay et al., , 2019. In the present study, we focus on the detection of DNVs from cases with the most severe subtype of NTDs, anencephaly. We conducted WES in 13 parent-offspring trios of a child with anencephaly to call variants in each sample of a trio independently, and then identify putative DNVs by comparing offspring against parental genotypes with Mendelian inconsistency. Thus, we identified DNVs expected to affect neurodevelopment and contribute to the occurrence of anencephaly. We subsequently sequenced the gene that carries the DNVs in a larger independent NTD cohort. Knock down experiments using targeted morpholino (MO) antisense oligomers and rescue assays in zebrafish as well as transfection expression studies in HEK293T cells were performed to explore the functional relevance of candidate variants.

Study Subjects
Subjects were recruited from five rural counties (Taigu, Pingding, Xiyang, Shouyang, and Zezhou) in Shanxi Province of northern China from 2011 to 2014. Cases with a confirmed prenatal diagnosis of an NTD were ascertained through a population-based birth defect surveillance program (Liu et al., 2016). Skin tissues taken from infants were collected at delivery or at the time of termination of NTD affected pregnancies by experienced pathologists, and kept frozen at −80 • C until used for various analyses. Blood samples from parents were collected at same time and stored at −80 • C until needed. The study protocol was approved by the institutional review board of Peking University. Written informed consent was obtained from all mothers before the study.

Whole-Exome Sequencing and Variant Prediction
We performed WES on 13 trios comprised of a child with a cranial NTD (2 anencephaly only; 11 anencephaly and spina bifida) and unaffected parents to identify DNVs in their exomes. Briefly, exomes were captured from genomic DNA using Agilent SureSelect Human All Exon Capture V4 kit (Agilent Technologies, Mississauga, ON, Canada) and then sequenced on an Illumina HiSeq 2000 sequencer using 101-bp paired-end reads. Supplementary Materials detail the methods used for the whole-exome data analysis. The summary of sequencing QA/QC is listed in Supplementary Table S1. We identified DNVs by excluding all variants found in the proband that were transmitted by one or both of the parents. We manually checked the sequencing context for all coding variants via Integrative Genomics Viewer (IGV) (Thorvaldsdottir et al., 2013). The sensitivity and specificity of our pipeline was tested with the genome in a bottle 1 (Zook et al., 2016).
PolyPhen 2 , SIFT 3 , and PROVEAN prediction 4 algorithms were employed to predict the impact of exome coding variants on protein function. Clustal-Omega 1.2.1 software was used to estimate the conservation of amino acids that were changed by a variant. Residues were considered to be highly conserved if there was no variation in amino acid properties observed across the compared six orthologous proteins, including five mammalian orthologous proteins plus zebrafish.

WIPI1 Sanger DNA Resequencing and Case-Control Burden Test
Potential DNVs identified by exome sequencing were validated by Sanger sequencing in the child and both parents. Primer5.0 was used to design the specific primers used for the PCR amplification and direct sequencing of the surrounding region of each variant. Sequencing reactions were performed on G-50-purified PCR products with the BigDye Terminator kit. Sequencing products were run on an ABI Prism 3730 DNA Analyzer (Applied Biosystems). Sequences were analyzed using Mutation Surveyor software v3.97 (SoftGenetics) and were verified by manual inspection. To perform WIPI1 case-control burden test, we extracted East Asian rare (MAF < 0.001) predicted-to-be-damaging variants from gnomAD database 5 . Fisher's exact test was used for case-control burden test.

Parental Testing
Parental testing was carried out with the AmpFl STR SGM plus kit (Applied Biosystems) to exclude false paternity, and DNA inversion to determine whether a variant had occurred de novo.

Multiplex PCR Amplification and Next-Generation Sequencing
Multiplex PCR amplification and next-generation sequencing were performed to screen for DNA variants along the entire coding regions and intron-exon boundaries of the targeted WIPI1 gene. Detailed methods of PCR amplification and sequencing are provided in the Supplementary Materials.

Zebrafish Morpholino and mRNA Injections
Wild-type Tübingen zebrafish strains were raised under standard conditions at 28.5 • C. Injections were performed in the fertilized egg at its one cell stage. MO injection was performed with a splice wipi1-MO (5 -CTGTGTTGTGTTTACCTGTCTGTTG-3 ). MOs were designed by Gene Tools, LLC (Philomath, OR, United States). To produce wipi1 mRNA, the full coding wipi1 transcript (NM_017983) was PCR amplified and cloned into a pCMV6-Entry vector (Origene Technologies, MD, United States). The cDNA was transfected into Escherichia coli and the plasmid containing the wipi1 cDNA was extracted and enzyme digested. Finally, the capped and polyadenylated wipi1 mRNA was produced by in vitro transcription using mMESSAGE mMACHINETM T7 Transcription Kit (Ambion, Austin, TX, United States). The synthesized mRNA was diluted in phenol red to different concentrations for micro-injection. All injected embryos were observed at 24 and 48 h postfertilization (hpf) and images of each phenotype were captured by using the CCD image system under bright field. To knockdown the translation of the zebrafish wipi1, different doses of MO reagent including 1, 2, 4, and 6 ng were injected. For the overexpression experiment, 100, 200, and 300 pg human WIPI1 mRNA was injected into different embryos. In the rescue study, 100, 200, and 300 pg of human WIPI1 WT, or mutant plasmids mRNA, were co-injected with 4 ng of the wipi1-MO. All embryos were independently assessed by two individuals to avoid bias and divided into four grades according to the shortening and bending of the body axis. The distribution of malformation of each group was compared by Pearson's χ 2 test and Fisher's exact test.

Cell Culture and Transfection
HEK293T cells were purchased from the Cell Bank of Chinese Academy of Sciences (Shanghai, China) and cultured according to the manufacturer's protocols. HEK293T cells were plated into six-well plates and transfected with dsRed-tagged WIPI1 wild-type, R328Q, G313R, T418M, or L406P variant using Lipofectamine 2000 Transfection Reagent (Invitrogen) according to the manufacturer's manual. Cells were plated at a density of 2.0 × 10 6 cells per 100-mm Petri dish for real-time PCR, protein blotting, and subcellular localization analyses. Cells were incubated for 24 h at 37 • C under an atmosphere containing 5% CO 2 prior to use.

Real-Time PCR
RNA was isolated from cells using Trizol (Invitrogen). DNase I digestion (DNA-free, Ambion) was used to remove genomic DNA; after that RNA was reversetranscribed using random hexamers (Superscript VILO cDNA synthesis kit). The abundance of WIPI1 mRNA was analyzed using real-time PCR reagent kit (PrimeScript TM RT reagent Kit with gDNA Eraser and TB Green TM Premix Ex Taq TM II, TAKARA) on a Real-Time PCR system (Biorad, CFX96). Each sample was analyzed in triplicate. Relative quantification of each gene expression level was normalized according to the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) expression.

Western Blot
Protein was extracted on ice from cells using radioimmunoprecipitation assay (RIPA) buffer. Protein concentrations were determined using the Bradford assay. Western blotting was performed by conventional methods, with 50 µg of protein run per sample on NuPAGE 4-12% Bis-Tris gel (Life Technologies, Carlsbad, CA, United States), and transferred to a polyvinylidene difluoride (PVDF) membrane (XCell II Blot Module, Invitrogen). Primary antibodies were WIPI1 antibody (1:1000, Cell Signaling Technology, MA, United States) and Rabbit anti-GAPDH antibody (1:1000, Cell Signaling Technology, MA, United States). After incubation with secondary antibody anti-rabbit IgG (1:10,000, Cell Signaling Technology, MA, United States), blots were developed using ECL Prime (Solarbio, Beijing, China). ImageJ was used to detect densitometry data. Results were normalized with the GAPDH loading control. Independent experiments were carried out three times for each sample.

Subcellular Localization
HEK293T cells were plated into six-well plates and transfected with dsRed-tagged WIPI1 wild-type, R328Q, G313R, T418M, or L406P variant. Forty-eight hours later, the cells were washed three times with phosphate-buffered saline (PBS) and incubated 5 min with Hochest 3342 (1 µg/ml) (Invitrogen) and CellMask TM Plasma Membrane Stains (Life Technologies, Carlsbad, CA, United States) (2 µg/ml) solution. The cells were subsequently washed in PBS three times and fixed in 4% paraformaldehyde (PFA in PBS) for 10 min at room temperature, followed by three more PBS washes. Cells were examined and photographed by a fluorescence microscopy (Olympus, BX51). The assay was repeated independently in three times.

Statistical Analysis
Differences in proportions of zebrafish embryos with convergent extension defects between groups were examined with Pearson's χ 2 test or Fisher's exact test. Independent samples t-test was used to identify the difference in expression of mRNA or protein between groups. Statistical analyses were conducted using SPSS 20.0. A two-tailed P-value of <0.05 was considered statistical significance.

De novo Variants Identified in NTD-Affected Trios
A total of 13 coding DNVs in eight cases were identified. Seven of them were selected for validation by Sanger sequencing and all of them were confirmed as true positive (Table 1 and Figure 1). All bam file scripts were attached in Supplementary Figure S1. Two LoF variants were identified in SPHKAP (c.2629_2633del) and NCOR1 (p.Y1907X) gene, respectively. One case carried three DNVs. Among the 13 DNVs, 9 of them were non-synonymous SNVs which were all predicted to be damaging by either SIFT or PolyPhen2. For all the eight non-synonymous SNVs carriers, at  least one predicted to be damaging non-synonymous DNV was identified. Some cases (e.g., B06, B08, and B13) were found to carry more than one damaging DNVs. Clinical phenotypes of the NTDs in the 13 trios are shown in Supplementary Table S2.

Variants of WIPI1 Gene in a NTD Cohort
As we previously found that WIPI1 knockdown in zebrafish causes convergent extension defect, and WD repeat protein (encoded by WIPI1 gene) plays an important role in the autophagy pathway which is considered to be essential for neural tube closure (Fimia et al., 2007;Xu et al., 2013), we further determined whether variants of WIPI1 gene increase NTD risk in a large cohort. Clinical phenotype of NTDs in the cohort of 502 NTD cases was shown in Supplementary Table S3. We used next-generation sequencing to detect variants in the WIPI1 gene in all NTD cases. We found another three variants in the WIPI1 gene in four cases in the NTD cohort. We summarized the variants within WIPI1 coding sequence identified in NTDaffected trios and in NTD cohort in Table 2. T418M variant was identified in two cases, one with both anencephaly and spina bifida, and one with only spina bifida. An additional two cases with anencephaly carried G313R and L406P variants, respectively. Amino acid conservation analysis showed all the variants located at highly conserved nucleotides (Supplementary Figure S2). Compared with East Asia WIPI1 variants in gnomAD database, there is significant (OR: 3.07, 95% CI: 1.07-8.84, Fisher's exact test P-value: 0.053) enrichment of WIPI1 rare damaging missense variants in anencephaly. In addition, the DNV has been reported in the Clinvar 6 .

wipi1 Variants Affect Convergent Extension in Zebrafish
To examine the potential pathogenic effect of these wipi1 variants in vivo, we investigated their functions on convergent extension in zebrafish model which represents a well-established model for genetic variation studies on NTDs (Roszko et al., 2009;Reynolds et al., 2010). Knockdown or overexpression of NTD-related genes in zebrafish leads to a defective convergent extension manifested mainly by shortened body axis (Roszko et al., 2009;Reynolds et al., 2010). Embryos with convergent extension defects had shorter anterior/posterior axes, as well as crooked or bent tails which are commonly considered to be indications of an NTD (Figure 2A). We clustered these injected embryos into four categories according to the severity of their morphology: grade 1, WT like; grade 2, mild defect, up to one-forth shortened axis compared with WT embryos; grade 3, moderate, up to 1/2 shortened axis; grade 4, severe, the body axis does not extend out of the range of the yolk ball (Figure 2A). We examined the distributions of these four categories among different treatment groups. 6 https://www.ncbi.nlm.nih.gov/clinvar/ Either the knockdown or overexpression of WT wipi1 caused convergent extension defects with a marked shortening of the anterior-posterior axis (Figures 2B,C). As shown in Figure 2B, the distribution of the phenotype was significantly different in embryos injected with different MO doses (1, 2, 4, and 6 ng) of wipi1 antisense MO at 24 and 48 hpf (24 h: χ 2 = 932.8, P < 0.001; 48 h: χ 2 = 606.7, P < 0.001). The dose-response relationship was observed (P for trend < 0.05) and the rate of abnormal morphology increased dramatically in embryos injected with 4 ng of wipi1-MO. We chose 4 ng as the optimal injection dose for all subsequent tests.
Embryos injected with different concentrations of wild-type mRNA also presented with mild to severe malformations at 24 and 48 hpf ( Figure 2C). Compared with the control group, the rate of abnormal morphology increased in embryos injected with 200-300 pg mRNA (P < 0.05), but not in embryos injected with 100 pg mRNA at 24 and 48 hpf. The dose-response relationship was observed in the embryos at 48 hpf (P for trend < 0.05).
To further clarify the function of wipi1, we carried out rescue experiment. We first tested different concentrations (100-300 pg) of wild-type WIPI1 mRNA co-injected with 4 ng wipi1-MO ( Figure 3A). Compared with the MO group, the rate of abnormal morphology decreased in embryos injected with 100-300 pg oF WIPI1 wild-type mRNA (P < 0.05). Compared with control group, embryos injected with 100 pg of WIPI1 wild-type mRNA had a similar response frequency of abnormal embryos (P > 0.05), suggesting a good rescue effect.
We co-injected 4 ng wipi1-MO with each wipi1 variant mRNA to clarify the effect of these wipi1 variants ( Figure 3B). All the wipi1 mutants could not reverse the phenotype of the wipi1 knock-down and only manifested mild effects, whereas WT-MO dramatically ameliorated the phenotype at 24 and 48 hpf (P < 0.05). The malformation response frequency of the Wipi1 L406P variant mRNA-MO co-injected group was significantly higher than other co-injected groups (P < 0.05), and showed no difference from the MO group (P = 0.130), indicating that this treatment was the least effective in rescuing the normal phenotype among all the mutants at 48 hpf (P < 0.05).

WIPI1 Variants Affect Expression
To investigate the effect of human WIPI1 mutants on the expression of mRNA and protein, we transfected with dsRedtagged WIPI1 wild-type, R328Q, G313R, T418M, or L406P  Frontiers in Neuroscience | www.frontiersin.org variant into HEK293T cells. No significant difference in transfection efficiency was detected between wild-type and the mutant constructs (Supplementary Figure S3). WIPI1 mRNA expression level was not influenced by WIPI1 R328Q, G313R, T418M, or L406P variant ( Figure 4A). However, WIPI1 L406P variant completely affected while the WIPI1 R328Q variant partially affected WIPI1 protein expression, and the WIPI1 G313R or T418M variants had no effects on WIPI1 protein expression (Figures 4B,C).

Subcellular Localization
We examined the WIPI1 subcellular localization of the four NTD case-specific missense variants with the wild-type localization serving as a positive control. All of the four variants localized to the cell's cytoplasm and maintained the same subcellular localization pattern as did the wild-type (Supplementary Figure S4).

DISCUSSION
In this study, we performed WES on 13 parent-offspring trios where the child was diagnosed with anencephaly, resulting in the identification of a total of 13 coding DNVs in 8 cases. The identified non-synonymous SNVs were all predicted to be damaging DNVs by either SIFT or PolyPhen2. This The data were derived from three independent experiments and expressed as mean ± SD. Data for WIPI1 mRNA and expression were normalized by GAPDH for each sample. Error bars represent standard error/deviation of the mean. The asterisk indicates a statistically significant difference between wild-type and L406P mutation ( * * P < 0.01; * P < 0.05).
indicates that WES on trios is a robust strategy to identify potential causative variants for anencephaly. Two LoF DNVs were identified in pLI > 0.9 genes, SPHKAP and NCOR1. SPHKAP plays role in cAMP signal pathway which may interact with NTD-related G-protein-coupled receptors such as GPR161 (Matteson et al., 2008;Mukhopadhyay et al., 2013;Kim et al., 2019) and CELSR1 (Robinson et al., 2012;Lei et al., 2014). NCOR1 is a nuclear receptor corepressor which was previously associated with Rett Syndrome (Lyst et al., 2013) and Endometrial Hyperplasia (Kashima et al., 2009). NCOR1 plays role in a large corepressor complex which is associated with the retinoid acid receptor (RAR) in the absence of ligand. It is well known that overdose retinoid acid can cause NTDs . Therefore, NCOR1 variants could increase NTD risk through RAR pathway. We subsequently identified three variants in the WIPI1 gene using exome sequencing strategies in an independent population consisting of 502 NTD cases. We performed functional studies by combining the evidence obtained using knockdown/rescue assays in a zebrafish model and transfection expression studies in HEK293T cells. Our findings suggest that functional variants in WIPI1 gene might contribute to the etiology of human anencephaly.
Although DNVs have been shown to contribute to a range of rare and common birth defect phenotypes, including congenital heart defects and autism (Sanders et al., 2012;Veltman and Brunner, 2012;Zaidi et al., 2013), few studies have been published utilizing either WES or WGS in patients with spina bifida or anencephaly to identify the recurrent DNVs that contribute to their etiology. The only WES study involving NTD cases was conducted by Lemay et al. (2015Lemay et al. ( , 2019 in Canada. As for the anencephaly subtype, only eight trios were recruited. This study identified one LoF DNV in SHROOM3 gene, which was assumed to be associated with the development of anencephaly (Lemay et al., 2015(Lemay et al., , 2019. In the present study, we conducted a WES analysis in 13 parent-offspring trios involving a child with anencephaly. In one anencephaly case without visible cardiac defects, two nucleotide variants led to a single amino acid change were found in CNOT3, which was previously associated with cardiac defects in fly or mouse models, and with embryonical lethality in Cnot3-null mice (Neely et al., 2010). This is the first study reporting WES for NTDs in Chinese Han populations, which provides an unbiased description of the genetic variation allowing us to explore for the first time the full allele frequency spectrum of anencephaly cases.
We further examined variants of the WIPI1 gene in a large NTD cohort, considering that WD repeat protein (encoded by WIPI1 gene) has been previously linked to the autophagy pathway, which has a protective function against the onset of neurodegeneration and may well be essential for normal neural tube closure (Fimia et al., 2007;Xu et al., 2013). Resequencing 502 Chinese NTDs cases, we identified three candidate diseaserelated WIPI1 variants (p.G313R, p.T418M, and p.L406P). All of the variants are heterozygous and two were private. No any additional known variants in PCP genes related to NTDs were found in cases carried these three WIPI1 variants (Wang et al., 2018). Unfortunately, due to the absence of corresponding parental samples, we could not confirm their inheritance pattern. It is interesting to note that all the three variants were from anencephaly cases, except p.T418M which was identified in a spina bifida case, although the number of anencephalic fetuses we examined was significantly less than that of spina bifida cases. On the other hand, because p.T418M was not shown to be a functional variant, this variant may not be causal for spina bifida or encephalocele. This suggests that cases with anencephaly may be more likely to carry WIPI1 variants than cases with either spina bifida or encephalocele. Future studies of gene knock-out animal models should help to determine whether the WIPI1 variants are specific to cranial NTDs, which could have important implications for the etiology of NTDs.
Functional validation of these mutants represents an essential step toward dissecting the etiological complexity of the human anencephaly phenotype, and investigating the underlying its pathogenic mechanisms. According to the alignment in species and amino acid conservation analysis, the four variants observed in our study were highly conserved residues. Subcellular localization showed that the four variants didn't affect the subcellular distribution of the WIPI1 protein. To investigate the effect of these variants on gene expression, we transfected with dsRed-tagged WIPI1 wild-type and variants into HEK293T cells. Western blotting revealed that the p.L406P variant was harmful to the stability of the WIPI1 protein, while p.R328Q was only moderately harmful.
We further investigated potential functional properties of the WIPI1 variants in zebrafish model which is a wellestablished model for genetic variation studies on NTDs (Roszko et al., 2009;Reynolds et al., 2010). Convergent extension in zebrafish is a polarized cellular rearrangement that leads to the narrowing of the mediolateral axis and lengthening of the anteroposterior axis for gastrulation and neural tube formation. Our observations of convergent extension in zebrafish rescue assays showed a failure to rescue with the transcripts of human WIPI1 p.L406P and p.R328Q variants. These data indicate that the human RNA containing the p.L406P and p.R328Q variants fails to function as well as wild-type RNA, suggesting that the two variants impair the protein function during embryogenesis and might be hypomorphs. A raised concern is that the p.L406P and p.R328Q variants were rarely observed in the Exome Aggregation Consortium (ExAC) database. We previously found that genetic variants might interact in a digenic fashion to generate the visible NTD phenotypes (Wang et al., 2018). There might be other potential variations in the WIPI1 gene or other genes that contribute to compound heterozygous genotypes that lead to NTD phenotypes. To further dissect the roles of these variations and the WIPI1 gene during embryogenic development, additional functional studies using mouse conditional knockouts are needed.

CONCLUSION
In conclusion, de novo damaging variants are the main culprit for majority of anencephalic cases. Missense variants in WIPI1 may play a role in the genetic etiology of anencephaly, and LoF variants in SPHKAP and NCOR1 may also contribute to anencephaly. These findings expand our collective knowledge of the genetic mechanisms contributing to NTDs, and strongly implicate de novo damaging variants in the etiology of anencephaly. Our study also highlights the importance of WES or further whole-genome scale variant screens in a large number of patients to discover new genes and new variants that cause human congenital defects.

DATA AVAILABILITY STATEMENT
The datasets for this manuscript are not publicly available because of ethical and legal reasons. Requests to access the datasets should be directed to AR at renag@bjmu.edu.cn.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Review Board of Peking University. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
AR, RF, and YL conceptualized the study. LW supervised the implementation, helped data analysis, and drafted the manuscript. TT, YS, and BZ conducted the zebrafish study. NL performed the bioinformatic data analysis. YL helped the study design and data interpretation. PZ performed the pipeline sensitivity and specificity test. XC provided critical comments on the earlier version of the manuscript. TT participated in the functional study. LW, AR, LJ, and ZL participated in subject enrollment. AR and RF critically revised the manuscript.

FUNDING
This work was supported by a National Institutes of Health (NIH) grant (ES021006) and a grant from Peking University (BMU20140446). RF and YL were supported by the NIH grants HD081216, HD067244, and HD083809.