A Genetics-First Approach Revealed Monogenic Disorders in Patients With ARM and VACTERL Anomalies

Background: The VATER/VACTERL association (VACTERL) is defined as the non-random occurrence of the following congenital anomalies: Vertebral, Anal, Cardiac, Tracheal-Esophageal, Renal, and Limb anomalies. As no unequivocal candidate gene has been identified yet, patients are diagnosed phenotypically. The aims of this study were to identify patients with monogenic disorders using a genetics-first approach, and to study whether variants in candidate genes are involved in the etiology of VACTERL or the individual features of VACTERL: Anorectal malformation (ARM) or esophageal atresia with or without tracheo-esophageal fistula (EA/TEF). Methods: Using molecular inversion probes, a candidate gene panel of 56 genes was sequenced in three patient groups: VACTERL (n = 211), ARM (n = 204), and EA/TEF (n = 95). Loss-of-function (LoF) and additional likely pathogenic missense variants were prioritized and validated using Sanger sequencing. Validated variants were tested for segregation and patients were clinically re-evaluated. Results: In 7 out of the 510 patients (1.4%), pathogenic or likely pathogenic variants were identified in SALL1, SALL4, and MID1, genes that are associated with Townes-Brocks, Duane-radial-ray, and Opitz-G/BBB syndrome. These syndromes always include ARM or EA/TEF, in combination with at least two other VACTERL features. We did not identify LoF variants in the remaining candidate genes. Conclusions: None of the other candidate genes were identified as novel unequivocal disease genes for VACTERL. However, a genetics-first approach allowed refinement of the clinical diagnosis in seven patients, in whom an alternative molecular-based diagnosis was found with important implications for the counseling of the families.


INTRODUCTION
The VATER/VACTERL association (VACTERL) (OMIM %192350) is a very serious condition that is defined as the non-random combination of the following congenital anomalies: Vertebral, Anal, Cardiac, Tracheal-Esophageal, Renal, and Limb anomalies (1). The prevalence of VACTERL in Europe is 0.4 in 10,000, including live births, fetal deaths (miscarriages or stillbirths from 20 weeks of gestation), and termination of pregnancy following prenatal diagnosis of a fetal anomaly (2). Any combination of three features of VACTERL qualifies for a clinical diagnosis, resulting in large phenotypic heterogeneity among VACTERL patients (3)(4)(5)(6)(7)(8)(9). VACTERL usually occurs sporadically, but in some patients, familial occurrence of component features of VACTERL has been observed, indicating that genetic factors may play a role (10). However, other than variants identified in a few VACTERL patients, no candidate gene has been identified that explains the etiology in a substantial fraction of VACTERL patients yet (11)(12)(13)(14)(15)(16)(17). For example, biallelic mutations were recently identified in the 3-hydroxyanthranilic acid 3,4-dioxygenase (HAAO) gene and kynureninase (KYNU) gene in four unrelated patients with congenital vertebral, cardiac, and renal anomalies (17). Consequently, a VACTERL diagnosis is mainly based on the clinical phenotype.
Several monogenic disorders have phenotypic overlap with VACTERL, and patients should only get this diagnosis when the possibility of overlapping monogenic disorders is excluded (18). As monogenic disorders with phenotypic overlap are sometimes hard to distinguish based on clinical phenotype alone, a genetics-first approach may aid physicians in making more accurate molecular/genetic diagnoses. However, detailed genetic testing, e.g., whole exome sequencing (WES) or whole genome sequencing (WGS), is rarely performed in patients that exhibit congenital anomalies from the VATER/VACTERL malformation spectrum (1,19).
In this study, we investigated whether a genetics-first approach for 26 genes would identify patients with monogenic disorders that had not previously been diagnosed as such. In addition, we aimed to study the role of 30 candidate genes in the etiology of VACTERL or one of the component features of VACTERL: ARM or EA/TEF.

MATERIALS AND METHODS
Study Population
We included three groups of patients diagnosed with: (1) VACTERL; (2) ARM with or without additional congenital anomalies; and (3) EA/TEF with or without additional congenital anomalies. Patients with ARM or EA/TEF with additional anomalies that do not fulfill the criteria for a diagnosis of VACTERL were included to study whether previously identified candidate genes for VACTERL are specific to VACTERL, or are also present in patients with the individual features ARM or EA/TEF. Patients were excluded when a genetically defined syndrome was diagnosed or highly suspected at the time of recruitment. The patients included had not been subjected to extensive genetic testing previously and were not necessarily seen by a clinical geneticist.
The patients were collected from different research groups: 84 VACTERL patients were derived from the German Network for Congenital Uro-REctal malformations (CURE-Net); 56 VACTERL patients from the Aetiologic research into Genetic and Occupational/environmental Risk factors for Anomalies in children (AGORA) data- and biobank (20)  The recently introduced VACTERL limits were used to classify our VACTERL cases into three mutually exclusive subtypes: STRICT-VACTERL, VACTERL-LIKE, and VACTERL-PLUS (9). The STRICT-VACTERL subtype contains cases with ≥3 major VACTERL features; the VACTERL-LIKE subtype contains cases with <3 major VACTERL features, but with additional minor VACTERL features adding up to ≥3 major and minor VACTERL features combined; and the VACTERL-PLUS subtype contains cases that fulfill either the STRICT-VACTERL or the VACTERL-LIKE subtype criteria, but have additional major congenital anomalies outside the VACTERL spectrum (9).
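The subtype rules above can be written as a small decision function. The following Python sketch is only an illustration of the stated criteria: the `Case` record, its field names, and the `classify_vacterl` function are hypothetical, and the assignment of individual anomalies to the major or minor category (defined in reference 9) is assumed to have been done beforehand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    """Hypothetical per-patient summary; field names are illustrative only."""
    n_major: int                  # number of major VACTERL features
    n_minor: int                  # number of minor VACTERL features
    major_anomaly_outside: bool   # major congenital anomaly outside the VACTERL spectrum


def classify_vacterl(case: Case) -> Optional[str]:
    """Return the mutually exclusive subtype, or None if the criteria are not met."""
    strict = case.n_major >= 3
    like = case.n_major < 3 and (case.n_major + case.n_minor) >= 3
    if not (strict or like):
        return None  # fewer than three features in total: no VACTERL subtype
    if case.major_anomaly_outside:
        # PLUS takes precedence so that the three subtypes stay mutually exclusive
        return "VACTERL-PLUS"
    return "STRICT-VACTERL" if strict else "VACTERL-LIKE"


if __name__ == "__main__":
    print(classify_vacterl(Case(n_major=3, n_minor=0, major_anomaly_outside=False)))
    print(classify_vacterl(Case(n_major=2, n_minor=1, major_anomaly_outside=False)))
    print(classify_vacterl(Case(n_major=4, n_minor=0, major_anomaly_outside=True)))
```

Checking the additional-anomaly flag before the STRICT/LIKE distinction is one way to keep the categories mutually exclusive, as required by the definition.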

Molecular Inversion Probe (MIP)-Based Sequencing
MIPs are 70-nucleotide (nt) single-stranded DNA molecules, consisting of a 30-nt common linker sequence and a locus-specific extension arm and ligation arm of 20 nt each that are complementary to the target DNA. A total of 3000 MIPs were designed to cover the coding regions of the 56 candidate genes. The DNA samples of all patients were analyzed with targeted sequencing using these MIPs, according to previously reported protocols (39)(40)(41)(42). MIP capture and the subsequent polymerase chain reaction were performed as described before (39), with minor modifications. The MIP-captured next generation sequencing (NGS) libraries of 419 patients (CURE-Net and Rotterdam) were sequenced on the NextSeq 500 sequencer (Illumina), while 96 patients (AGORA, Cincinnati, and Milan) were sequenced on the HiSeq 2000 sequencer (Illumina). NGS reads were mapped to UCSC human reference genome assembly hg19 and the average coverage of the entire MIP panel was >150-fold per patient. Variants were called and annotated as described previously (41,42).

Variant Filtering, Prioritization, and Validation
Variant filtering and prioritization were performed to identify rare variants predicted to be damaging to protein function. Positions with <10x coverage were marked as missing due to insufficient coverage. All variants were filtered and prioritized based on the filters presented in Table 2.
We focused primarily on loss-of-function (LoF) variants, including stop-gains, canonical splice-sites, and frameshifts. We only searched for additional missense variants in genes in which we identified a LoF variant. Subsequently, missense variants were prioritized only when the Combined Annotation Dependent Depletion (CADD) score was >20 and the population frequency was <0.1% in the ExAC and dbSNP databases. Prioritized variants were validated using Sanger sequencing. If available, we included parental DNA in the validation step to determine the segregation of the variant in the family.
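The prioritization logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, and the `prioritize` function are hypothetical, the population frequency is assumed to be a single value summarizing ExAC and dbSNP, and the additional filters of Table 2 are not reproduced here.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the text
MIN_COVERAGE = 10        # positions with <10x coverage are treated as missing
CADD_MIN = 20.0          # missense variants need a CADD score >20
MAX_POP_FREQ = 0.001     # population frequency must be <0.1%
LOF_TYPES = {"stop_gain", "canonical_splice_site", "frameshift"}


@dataclass
class Variant:
    """Hypothetical annotated variant; field names are illustrative only."""
    gene: str
    consequence: str     # e.g. "frameshift", "missense"
    coverage: int        # read depth at the variant position
    cadd: float          # CADD score
    pop_freq: float      # assumed: highest frequency across ExAC and dbSNP


def prioritize(variants: List[Variant]) -> List[Variant]:
    """Return LoF variants plus rare, high-CADD missense variants in LoF genes."""
    covered = [v for v in variants if v.coverage >= MIN_COVERAGE]
    lof = [v for v in covered if v.consequence in LOF_TYPES]
    lof_genes = {v.gene for v in lof}
    # Missense variants are only considered in genes that already carry a LoF variant
    missense = [
        v for v in covered
        if v.consequence == "missense"
        and v.gene in lof_genes
        and v.cadd > CADD_MIN
        and v.pop_freq < MAX_POP_FREQ
    ]
    return lof + missense


if __name__ == "__main__":
    calls = [
        Variant("SALL1", "frameshift", 180, 35.0, 0.0),
        Variant("SALL1", "missense", 150, 24.0, 0.0002),
        Variant("SALL1", "missense", 150, 12.0, 0.0002),   # CADD too low
        Variant("PCSK5", "missense", 200, 28.0, 0.0001),   # no LoF in this gene
        Variant("MID1", "stop_gain", 6, 40.0, 0.0),        # insufficient coverage
    ]
    for v in prioritize(calls):
        print(v.gene, v.consequence)
```

Prioritized variants would still require validation by Sanger sequencing and segregation analysis, which this sketch does not model.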

Genotype-Phenotype Correlation and Reverse Phenotyping
When we validated a variant in a gene associated with a monogenic disorder, the patient was carefully re-evaluated in light of the possible new diagnosis based on the genetic information. If the variant was inherited from one of the parents, the parent's phenotype was also re-evaluated to determine whether the segregation of the variant fitted the phenotypes in the family.

RESULTS
Patient Characteristics
In total, 510 patients were analyzed, of whom 211 were diagnosed with VACTERL, 204 with ARM, and 95 with EA/TEF (Table 3). In the VACTERL cohort, 93 patients were categorized as STRICT-VACTERL patients, 52 as VACTERL-LIKE patients, and 66 as VACTERL-PLUS patients. Additional congenital anomalies were present in 119 (58%) patients in the ARM cohort and in 43 (45%) patients in the EA/TEF cohort. The presence of additional congenital anomalies (VACTERL-PLUS, ARM-PLUS, and EA/TEF-PLUS) is considered a first clue for pediatricians to consider an alternative clinical diagnosis.

Prioritized Variants
This study revealed five LoF variants in SAL-Like Transcription Factors 1 and 4 (SALL1, SALL4) and Midline 1 (MID1), genes associated with monogenic disorders with overlapping features (Table 4). In addition, two missense variants with a CADD score >20 and population frequency <0.1% were identified in SALL4 and MID1. These seven variants are considered pathogenic (LoF variants) or likely pathogenic (missense variants), suggesting an alternative diagnosis in these patients based on genetic evidence. Two of these variants were de novo in sporadic cases, four variants were inherited from a mildly affected parent, and one variant had unknown inheritance.
A frameshift variant in SALL1 was identified in an ARM-PLUS patient with a perineal fistula, bilateral auricular appendages, and duplication of the right thumb phalanx (ID = 1). The variant was de novo and the parents of the patient were healthy. As this family was lost to follow-up, clinical re-evaluation was not possible. Based on the "classic" combination of phenotypic features together with the genetic evidence, however, this patient can be diagnosed retrospectively with Townes-Brocks syndrome.
A maternally-inherited frameshift variant in SALL1 was identified in an ARM-PLUS patient with a perineal fistula, atrial septal defect, agenesis of the septum pellucidum, right single transverse palmar crease, flat feet, and postnatal cerebral hemorrhage leading to hydrocephalus (ID = 2). The mother had narrow flat feet, second toes longer than first toes, and chronic kidney failure (IgA nephropathy). During re-evaluation, the sister of this patient was found to have bilateral hearing loss due to adenoid hypertrophy and an accessory finger without bone on the right ulnar side. However, she did not carry the frameshift variant in SALL1. Based on this variant and the phenotype, the patient and his mother were diagnosed retrospectively with Townes-Brocks syndrome (43). Given the highly variable phenotype of Townes-Brocks syndrome, it is not surprising that this patient was not identified based on the phenotype alone.
In a third patient, with isolated ARM with a perineal fistula (ID = 3), a stop-gain variant was identified in SALL1. Unfortunately, this family was lost to follow-up and no parental DNA was available. At the time of recruitment, both the father and sister were noted to have anal atresia. Based on the pathogenic variant in SALL1 and the fact that ARM is a major clinical feature of Townes-Brocks syndrome (43), the patient was diagnosed with this syndrome. The positive family history in the father and sister makes autosomal dominant inheritance likely, although this could not be confirmed.

SALL4
Variants in the SALL4 gene are associated with Duane-radial ray syndrome (OMIM #607323), an autosomal dominant disorder characterized by upper limb, ocular, and renal anomalies. Less common features include sensorineural deafness and gastrointestinal anomalies, such as ARM.
A maternally-inherited stop-gain variant was identified in SALL4 in a patient with isolated ARM with a recto-vestibular fistula (ID = 4). The mother has a left-sided short thumb with clinodactyly of thumb. During clinical re-evaluation, we found that the patient has a younger brother with ARM, meconium plug syndrome, Meckel's diverticulum, ear malformation with hearing loss, accessory thumb, and ventricular septal defect, who was clinically diagnosed with Duane-radial ray syndrome. Subsequently, the brother was found to carry the same SALL4 variant. As all three family members share a pathogenic variant in SALL4 and present with features of the Duane-radial ray syndrome (44), this diagnosis seems appropriate. However, it was not suspected for the index patient before evaluating the sequencing data.
A maternally-inherited missense variant was identified in SALL4 in an ARM-PLUS patient with a vestibular fistula, a double left thumb, and an anomaly of the filum terminale (ID = 5). Besides having bilateral clinodactyly of the fifth finger, the mother of this patient was healthy. The maternal grandmother had clinodactyly as well, but could not be tested. Based on the clinical phenotype and the presence of a missense variant in SALL4 that is highly likely to be deleterious, a diagnosis of Duane-radial ray syndrome seems appropriate for this patient. The mild phenotype of the mother can be explained by reduced penetrance, as only 13% of patients show the complete triad originally described for Duane-radial ray syndrome (Duane anomaly, radial ray malformation, and sensorineural hearing loss) (44).

MID1
Variants in the MID1 gene are associated with the X-linked recessive Opitz G/BBB syndrome (OMIM #300000), in which ARM, cardiac anomalies, TEF, hypospadias, hypertelorism and syndactyly are described.
A frameshift variant in MID1 was identified in an ARM-PLUS patient with a perineal fistula, tricuspid valve insufficiency, and hypospadias (ID = 6). The variant was de novo and the parents of the patient were healthy at the time of recruitment. As the family was lost to follow-up, clinical re-evaluation was not possible. It is unknown whether the patient had hypertelorism, a feature present in virtually all affected individuals (45). Based on the mutation identified in MID1, however, a diagnosis of Opitz G/BBB syndrome seems most likely for this patient.
A maternally-inherited missense variant in MID1 was identified in a male VACTERL-PLUS patient with a single umbilical artery, split vertebrae (Th5), ARM, pulmonary vein and artery stenosis, ventricular septal defect, renal dysplasia and left cystic kidney, malrotation, and cryptorchidism (ID = 7). At the time of re-evaluation of the patient's phenotype, mild hypertelorism and a cowlick were noted. Besides having clubfoot and syndactyly of the toes, his mother was healthy and had no hypertelorism, which is consistent with X-linked recessive inheritance. Given the phenotype, the likely damaging genetic variant, and the X-linked recessive segregation in this family, a diagnosis of Opitz G/BBB syndrome seems most likely.

Variants in the Remaining Candidate Genes
LoF variants were not identified in any of the remaining candidate genes. Therefore, we did not evaluate the presence of missense variants with a CADD score >20 and population frequency <0.1% in these genes.

DISCUSSION
In this study, the coding regions of 56 candidate genes were sequenced in 510 patients with VACTERL, ARM, or EA/TEF. In the majority of patients, no monogenic disorders were identified that have phenotypic overlap with VACTERL. As we did not find LoF variants in the remaining candidate genes either, we were not able to identify a disease gene for VACTERL based on this gene panel. In seven patients, however, our genetics-first approach provided evidence for a monogenic disorder, allowing the patients to be diagnosed retrospectively with Townes-Brocks, Duane-radial ray, or Opitz G/BBB syndrome.
A strength of this study is that we evaluated a large set of well-characterized VACTERL, ARM, and EA/TEF patients who were clinically diagnosed by pediatric surgeons and/or clinical geneticists. Detailed genetic testing had not been performed in these patients, and they were not suspected of having an alternative disorder. Therefore, it is interesting that we identified several patients affected by an undiagnosed monogenic disorder in this patient population.
The patients in whom a monogenic disorder was identified presented either with a milder phenotype or with an atypical presentation of the specific disorders, possibly due to reduced penetrance (46). Therefore, it is not surprising that these patients were not diagnosed correctly before. Our findings suggest that a genetics-first approach is beneficial in making more accurate diagnoses. Although we identified only seven patients with a monogenic disorder, this information is extremely important for counseling of the families regarding the index patient and the risks in future pregnancies.
At present, VACTERL is considered a diagnosis "per exclusionem," as no genes have been found to confirm the presence of VACTERL based on molecular evidence. This implies that the diagnosis can only be made when monogenic disorders with multiple features in common with VACTERL are excluded based on the absence of key features and genetic evidence for these monogenic disorders. We hypothesized that a genetics-first approach would improve the identification of patients with monogenic disorders that are hard to distinguish from VACTERL based on the clinical phenotype alone. As we aimed to identify genes that would directly lead to a diagnosis, variant interpretation was rather stringent, with a strong focus on LoF variants. As a result, we were indeed able to identify an alternative diagnosis based on genetic evidence in 7 out of 510 patients (1.4%). In 5 out of these 7 patients, additional congenital anomalies were present, supporting our hypothesis that the presence of additional congenital anomalies is a first clue for pediatricians to consider an alternative diagnosis.
This study was not population-based, however, so the frequency of monogenic disorders in all patients with features of the VATER/VACTERL spectrum may be different. In addition, it is worth mentioning that most of the patients in whom we identified an alternative diagnosis came from the ARM cohort (6/204 = 2.9%). As the clinical phenotype of ARM is quite variable, this may be the best cohort to use in a genetics-first approach.
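The yields quoted above follow directly from the cohort sizes. The short Python sketch below recomputes them; the names `COHORTS`, `yield_percent`, and `summarize` are illustrative, and the per-cohort counts for VACTERL (1) and EA/TEF (0) are our reading of the patient descriptions (ID = 7 is the only VACTERL patient and no EA/TEF patient is described), not figures reported separately in the text.

```python
# (patients with an alternative diagnosis, cohort size)
COHORTS = {
    "VACTERL": (1, 211),   # ID = 7; count inferred from the patient descriptions
    "ARM": (6, 204),       # IDs 1-6; stated in the text as 6/204
    "EA/TEF": (0, 95),     # no diagnosed patient described; inferred
}


def yield_percent(diagnosed: int, total: int) -> float:
    """Diagnostic yield as a percentage, rounded to one decimal place."""
    return round(100 * diagnosed / total, 1)


def summarize(cohorts: dict) -> dict:
    """Yield per cohort plus the overall yield across all cohorts."""
    rows = {name: yield_percent(d, n) for name, (d, n) in cohorts.items()}
    total_diagnosed = sum(d for d, _ in cohorts.values())
    total_patients = sum(n for _, n in cohorts.values())
    rows["All"] = yield_percent(total_diagnosed, total_patients)
    return rows


if __name__ == "__main__":
    for cohort, pct in summarize(COHORTS).items():
        print(f"{cohort}: {pct}%")
```

With these inputs, the ARM cohort and the overall cohort give 2.9% and 1.4%, matching the figures reported in the text.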
We did not identify any LoF variants in the remaining candidate genes, but our gene panel covered only a small proportion of the human genome. The use of genome-wide approaches, such as WES or WGS, may be more fruitful in identifying new genes or genetic regions that are associated with VACTERL or its individual features ARM or EA/TEF. The beneficial effects of a genotype-first approach have been demonstrated for other sporadic heterogeneous disorders, such as intellectual disability, autism, and schizophrenia, as well (47,48). Trio analyses may highlight the role of de novo variation, as most of the VACTERL cases occur sporadically.
Ten of the candidate genes from our gene panel were previously reported in the literature and were used to screen groups of VACTERL patients before. Based on a promising knock-out animal model and variants identified in VACTERL patients, the Proprotein Convertase Subtilisin/Kexin Type 5 gene (PCSK5) was considered a very interesting candidate (27,37). However, we did not identify LoF variants in PCSK5, which is in line with a previous study among 39 VACTERL patients (15). Variants in other candidate genes, including Sonic Hedgehog (SHH), Pancreas Associated Transcription Factor 1a (PTF1A), and LIM Domain Containing Preferred Translocation Partner in Lipoma (LPP) were not found either, in agreement with the literature (12,14,35). In most studies, including this one, only the coding regions of the candidate genes were sequenced. It is possible, however, that non-coding regions of these genes play a role in the etiology of complex congenital anomalies, such as VACTERL. As non-coding regions often contain regulatory elements involved in gene expression, they might be of interest to study. In addition, epigenetic factors, such as DNA methylation, may also play an important role.
In conclusion, we did not find an unequivocal disease gene for VACTERL, ARM, or EA/TEF. The failure to identify mutations in one or a few genes in a large percentage of patients confirms that the VACTERL association remains a heterogeneous group of rare disorders that is difficult to tackle. However, using a genetics-first approach, we did identify seven patients with monogenic disorders that had not been clinically recognized as such. Although the proportion of patients that received an alternative diagnosis through genetic testing was low, we would still like to highlight the importance of genetic testing in patients with multiple congenital anomalies to prevent misdiagnosis, as this is extremely important for counseling of the families. This study exemplifies that a phenotypically rather broad cohort benefits from a genotype-first approach to define the molecular diagnosis in a subset of cases.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article can be made available by the authors upon individual request, without undue reservation.

ETHICS STATEMENT
Written informed consent was obtained from all participants and/or their parents. The Ethics Committees of the University of Bonn and the University of Heidelberg approved the CURE-Net study protocol, the Regional Committee on Research Involving Human Subjects Arnhem-Nijmegen approved the AGORA study protocol, the Erasmus University Medical Centre's local ethics board approved the Erasmus study protocol, and the Institutional Review Board at the Cincinnati Children's Hospital Medical Center approved the collection of genetic samples of the Cincinnati patients. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
RP carried out the final analyses, interpreted the data and drafted the initial manuscript. GD was involved in the acquisition of data by performing clinical re-evaluations of some of the patients and interpreted the data. EB, AK, HR, and CM conceptualized and designed the study. NR, IR, HB, and AH conceptualized and designed the study, interpreted the data, and had general supervision of the project. RA conceptualized and designed the study and carried out the initial analyses. NK carried out the initial experiments and analyses. MS carried out validation experiments. SD was involved in conceptualization of the study. ES, SM, NS, CS, AB, DT, GB, AM, MFB, MDB, ML, AP, and IB were involved in the acquisition of data by including research participants. All authors revised the manuscript for important intellectual content, approved the final manuscript as submitted, and agreed to be accountable for all aspects of the work.

FUNDING
RP was supported by a personal research grant from the Radboud university medical center, Nijmegen, the Netherlands. GD was supported by BONFOR grant O-120.0001. EB was supported by a grant (SSWO S13-9) of the Friends of Sophia Foundation. HR was supported by grant 2014_A14 from the Else Kröner-Fresenius Foundation, and by the grants RE 1723/1-1, RE 1723/1-3, and RE 1723/2-1 from the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG). AH and HB were supported by the Solve-RD project. The Solve-RD project received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 779257.