Reanalysis of Genomic Sequencing Results in a Clinical Laboratory: Advantages and Limitations

Genetic diagnosis of patients with neurodevelopmental disorders is standard clinical practice. Given the continuous accumulation of data on disease-causing variants, reanalysis of previously generated sequencing data is important, and periodic reanalysis of variants of uncertain significance has become mandatory in clinical laboratories. To confirm the utility of reanalyzing targeted gene panel data in clinical laboratories, we re-evaluated the data of two groups of patients who had undergone targeted gene panel testing for neurodevelopmental disorders (n = 116) and epileptic encephalopathy (n = 384). The reanalysis was based on a reannotation process reflecting updated databases. Six (5.2%) and seven (1.8%) new pathogenic or likely pathogenic variants were identified in these two groups, respectively, attributable to updated guidelines and reports of de novo occurrence in unrelated patients. Although modest, the increase in diagnostic yield was confirmed. We suggest that reanalysis of genetic variants, mainly drawing on changes in databases and updated interpretations, should be implemented as a routine practice in clinical laboratories.


INTRODUCTION
Adoption of massively parallel sequencing has revolutionized the molecular genetic diagnosis of patients with genetically heterogeneous neurodevelopmental disorders. Advances in technology and the development of cost-effective methods have made multigene panel testing with hundreds of relevant genes, as well as whole-exome sequencing, feasible in clinical laboratories. An average diagnostic yield of up to 40% is reported, depending on the patient group and intensity of analysis (1)(2)(3)(4)(5).
Given the continuous accumulation of data on gene-disease and variant-disease relationships, periodic reanalysis of previously reported patient results has been considered (6). Several studies have reported that reanalysis using improved bioinformatic tools and updated databases, or expanded knowledge of genotype-phenotype correlations, is beneficial for the diagnosis of previously unsolved cases (6)(7)(8)(9)(10)(11)(12)(13)(14). Furthermore, periodic reanalysis of variants of uncertain significance has now become mandatory in clinical laboratories (15).
The increased diagnostic yield of reanalysis is mostly attributed to gene-disease relationships newly established after the initial exome sequencing (16). However, in clinical laboratories that perform most genetic testing using specific gene panels rather than whole-exome or whole-genome sequencing, the scope for identifying new pathogenic or likely pathogenic variants from newly discovered gene-disease relationships is limited. It is also difficult to change bioinformatic tools in clinical laboratories; reanalysis therefore inevitably relies mainly on updated variant databases or updated guidelines. While next-generation sequencing (NGS) guidelines recommend reanalysis of previously negative NGS data (17), performing reanalysis is difficult in clinical laboratories owing to limited resources. Moreover, the utility of the reanalysis process has been questioned because it is labor-intensive.
Herein, we re-evaluated the established results of targeted gene panel testing in patients with neurodevelopmental disorders, including those with delayed development, intellectual disability, and epileptic encephalopathy, to confirm the applicability of reanalysis of targeted gene panel data in clinical laboratories.

MATERIALS AND METHODS
Data Collection
Gene panel sequencing data were collected from patients who had undergone gene panel testing for neurodevelopmental disorders (n = 116) and epileptic encephalopathy (n = 384) between January 2017 and July 2018 and in whom no pathogenic or likely pathogenic variants had been identified. In our clinical laboratory, gene panel testing for epileptic encephalopathy was performed when patients showed specific epilepsy syndromes probably related to developmental and epileptic encephalopathy. Gene panel testing for neurodevelopmental disorders was performed when patients presented with seizures that did not constitute a specific epilepsy syndrome or with severe developmental delay. The xGen Inherited Diseases Panel (Integrated DNA Technologies, Coralville, IA, USA), comprising 4,503 genes, was used for neurodevelopmental disorders, and a customized gene panel comprising 173 candidate genes (Supplementary Table 1) was used for epileptic encephalopathy. This study was approved by the institutional review board of Severance Hospital, and the requirement for informed consent was waived.

Data Analysis and Interpretation
We used a comprehensive custom bioinformatic pipeline that supported a wide range of variant types, from single-nucleotide variants to copy-number variants. The pipeline was as previously described (18,19) and was used without modification. For reanalysis, we only re-annotated the variant call format (VCF) files of patients with no pathogenic or likely pathogenic variants (Figure 1). All annotation processes were performed automatically. The variants from VCF files were first annotated using ANNOVAR and Variant Effect Predictor to determine their effects on genes, transcripts, protein sequences, and regulatory regions. The variants were then annotated using ClinVar, Online Mendelian Inheritance in Man (OMIM), the Human Gene Mutation Database (HGMD), computational (in silico) predictive programs (MutationTaster, SIFT, PolyPhen-2, and PROVEAN), the Single Nucleotide Polymorphism Database (dbSNP), the 1000 Genomes Project, the Exome Aggregation Consortium, the Genome Aggregation Database, and the Korean Reference Genome Database. These databases were updated four times per year in our laboratory. Following automated annotation, benign or likely benign variants were filtered out by tallying scores based on the population frequency of the variant, in silico predictions, and the literature reported in the databases. We manually checked the automatically re-annotated data and performed parental testing when possible. Reanalysis was performed from May to July 2019. The pathogenicity of variants was classified according to the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) guidelines (17). We used the ClinGen recommendation for de novo criteria issued by the Sequence Variant Interpretation (SVI) Working Group (https://clinicalgenome.org/working-groups/sequence-variant-interpretation/).
This recommendation for de novo criteria suggests a point-based system to determine the strength of de novo evidence (ACMG/AMP criteria codes PS2 and PM6) based on confirmed vs. assumed status, phenotypic consistency, and number of de novo observations. We also used the recommendations by the SVI working group for interpretation of the PVS1 criterion for exon duplication (20).
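The point-based logic described above can be illustrated with a minimal sketch. The point values and thresholds below are assumptions recalled from the commonly cited summary of the SVI recommendation and should be checked against the current ClinGen document; the names `POINTS`, `THRESHOLDS`, and `DeNovoObservation` are hypothetical and are not part of the pipeline used in this study.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# ASSUMED point values per de novo observation (phenotypic consistency x
# confirmed/assumed parentage). Verify against the current SVI document.
POINTS = {
    "highly_specific": {"confirmed": 2.0, "assumed": 1.0},
    "consistent_not_specific": {"confirmed": 1.0, "assumed": 0.5},
    "consistent_high_heterogeneity": {"confirmed": 0.5, "assumed": 0.25},
    "not_consistent": {"confirmed": 0.0, "assumed": 0.0},
}

# ASSUMED minimum total points for each PS2/PM6 evidence strength,
# ordered from strongest to weakest.
THRESHOLDS = [
    (4.0, "very_strong"),
    (2.0, "strong"),
    (1.0, "moderate"),
    (0.5, "supporting"),
]


@dataclass(frozen=True)
class DeNovoObservation:
    phenotype: str  # one of the keys of POINTS
    status: str     # "confirmed" (parentage verified) or "assumed"


def total_points(observations: Iterable[DeNovoObservation]) -> float:
    """Sum the points over all de novo observations of one variant."""
    return sum(POINTS[o.phenotype][o.status] for o in observations)


def evidence_strength(points: float) -> Optional[str]:
    """Map a point total to an evidence strength, or None if below 0.5."""
    for minimum, label in THRESHOLDS:
        if points >= minimum:
            return label
    return None


# Hypothetical example: two published assumed de novo reports plus one
# confirmed de novo finding, all with a consistent but non-specific phenotype.
reports = [
    DeNovoObservation("consistent_not_specific", "assumed"),
    DeNovoObservation("consistent_not_specific", "assumed"),
    DeNovoObservation("consistent_not_specific", "confirmed"),
]
print(total_points(reports), evidence_strength(total_points(reports)))
```

Under these assumed values, each additional report from an unrelated patient raises the total, which is how new literature alone can move a variant to a higher evidence level.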

RESULTS
Overall Increase in Diagnostic Yield
In total, 66 of 116 (56.9%) patients tested using the neurodevelopmental disorder panel and 231 of 384 (60.2%) patients tested using the gene panel for epilepsy were male. Six pathogenic or likely pathogenic variants were identified in the patients subjected to gene panel testing for neurodevelopmental disorders, accounting for an increase of 5.2% (6/116) in the diagnostic yield. Of the 384 patients previously subjected to gene panel testing for epilepsy, seven new pathogenic or likely pathogenic variants were identified during reanalysis; this accounted for an increase of 1.8% (7/384) in the diagnostic yield. At initial testing, parental testing had not been conducted for these 13 patients because consent had been omitted or refused. The changed classification of the variants during reanalysis prompted parental testing, which we could perform for three patients (P1, P8, and P13). Except for P11, the parents of the other 12 patients had shown no symptoms similar to those of their children. The upgraded variants are described in Table 1. The variants of uncertain significance reported in the initial report for the 13 cases are presented in Supplementary Table 2.
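The yield figures above follow directly from the counts reported in the text, as the short sketch below shows. The helper name `yield_increase` is hypothetical, and the combined figure across both panels is a simple derived value that is not reported in the study itself.

```python
def yield_increase(new_diagnoses: int, cohort_size: int) -> float:
    """Return the added diagnostic yield as a percentage of the cohort."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return 100.0 * new_diagnoses / cohort_size


# Counts taken from the text: (new P/LP variants, patients reanalyzed).
cohorts = {
    "neurodevelopmental disorder panel": (6, 116),
    "epileptic encephalopathy panel": (7, 384),
}

for name, (new, size) in cohorts.items():
    print(f"{name}: {new}/{size} = {yield_increase(new, size):.1f}%")

# Derived here for illustration only; not a figure stated in the paper.
total_new = sum(new for new, _ in cohorts.values())
total_size = sum(size for _, size in cohorts.values())
print(f"combined: {total_new}/{total_size} = {yield_increase(total_new, total_size):.1f}%")
```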

Updated Guidelines, New Variant-Disease Associations, and Phenotyping
The SVI Working Group suggested in 2018 that reports of a variant occurring de novo in unrelated patients might allow de novo status to be assumed in patients with unaffected parents. The variants from P1-P7, P9-P10, and P12-P13, whose parents were unaffected, had been reported as de novo in unrelated patients in the new literature. Therefore, they were assigned various levels of pathogenic evidence according to the sum of points (Table 1). We contacted the parents to explain the results of reanalysis and received consent for parental testing from the parents of P1 and P13. The variants from P1 and P13 were confirmed to be de novo. In addition, the variants from P3 and P13 were assigned the PS3 code, owing to new literature demonstrating decreased protein function. The P3 variant in ITPR1, a gene encoding a calcium channel that modulates intracellular calcium signaling, was shown to decrease calcium ion release in the endoplasmic reticulum (21). The P13 variant in KCNC1, which encodes a highly conserved subunit of a potassium ion channel, was demonstrated to decrease the current amplitude (22). In the case of ALDH7A1 (P8), which is associated with the recessive disease pyridoxine-dependent epilepsy, two variants were identified near the intronic junction; these were not canonical splice variants and were reported as variants of uncertain significance owing to the lack of evidence. The variant c.192+3A>T (NM_001182.4) was classified as likely pathogenic in ClinVar in October 2017, after our initial report, and was assigned the PP5 code. After additional phenotyping that identified decreased seizures with pyridoxine administration, the variant was assigned the PP4 code and deemed likely pathogenic. Given the reinterpretation of c.192+3A>T as likely pathogenic, the variant c.1093+5G>T was expected to qualify for the PM3 code if it lay in trans with c.192+3A>T. The patient's parents were informed of the reanalysis results and of the need for parental testing. 
The variant c.1093+5G>T in ALDH7A1 was subsequently confirmed to be in trans with c.192+3A>T by parental testing. Exon duplication, unlike exon deletion, was not described in detail in the ACMG/AMP guidelines published in 2015; thus, the duplication of exons 3 and 4 in GRIN2A in P11 was reported as a variant of uncertain significance (Supplementary Figure 1). However, in 2018, new guidelines for PVS1 were released (20), and the code PVS1_Strong was assigned to the exon duplication in GRIN2A because the reading frame was thought to be disrupted and nonsense-mediated decay was predicted.
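The reading-frame argument used for the exon duplication can be sketched as a simple arithmetic check. This is a simplification under stated assumptions: the duplicated exons are taken to be entirely coding and the duplication to lie in tandem within the gene. The exon lengths in the example are hypothetical placeholders, not the actual GRIN2A exon sizes, and the function name is illustrative only.

```python
from typing import Sequence


def duplication_disrupts_frame(duplicated_exon_lengths: Sequence[int]) -> bool:
    """Return True if duplicating these coding exons shifts the reading frame.

    Assumes a tandem, in-gene duplication of whole coding exons: the frame
    is preserved only when the number of inserted bases is divisible by 3.
    """
    if not duplicated_exon_lengths:
        raise ValueError("at least one exon length is required")
    if any(length <= 0 for length in duplicated_exon_lengths):
        raise ValueError("exon lengths must be positive")
    inserted_bases = sum(duplicated_exon_lengths)
    return inserted_bases % 3 != 0


# Hypothetical exon lengths in base pairs (NOT the real GRIN2A values).
print(duplication_disrupts_frame([115, 207]))  # 322 bp inserted, frame shifted
print(duplication_disrupts_frame([120, 90]))   # 210 bp inserted, frame kept
```

A frame-shifting result is only one input to the PVS1 assessment; whether nonsense-mediated decay is predicted and whether the duplication is in tandem still have to be evaluated separately.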

Changed Genetic Inheritance or Updated Gene-Disease Associations
Because P6 was a female patient and ALG13 was then known to follow an X-linked recessive mode of inheritance, a missense variant in ALG13 was underestimated in our initial report. However, the mode of inheritance listed for ALG13 subsequently changed from X-linked recessive to X-linked dominant. Therefore, the likelihood of disease association of the variant c.320A>G (NM_001099922.2) in ALG13 increased in P6. It was deemed likely pathogenic, as it had been reported as de novo in other patients. At the time of the initial diagnostic test, CACNA1E, which encodes a subunit of a calcium channel, was considered a candidate gene for epileptic encephalopathy; however, the OMIM database did not clearly link this gene to a disease. Therefore, we conservatively interpreted the CACNA1E missense variant in P7 as a variant of uncertain significance. CACNA1E was linked to epileptic encephalopathy in OMIM in 2019 (CACNA1E, OMIM# 601013); thus, the CACNA1E variant c.1054G>A (NM_001205293.1) in P7 gained significance and was interpreted as likely pathogenic on the basis of de novo reports from unrelated patients.
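Reclassifications of the kind described in the two subsections above originate from changes in database annotations between the initial report and the reannotation. A minimal sketch of how such changes could be flagged automatically is given below. The data structure, field names, and snapshot values are hypothetical illustrations loosely mirroring the text (for example, the absence of a ClinVar entry at initial testing is assumed); they do not represent the laboratory's actual pipeline or database exports.

```python
from typing import Dict, List, Optional, Tuple

# variant identifier -> {annotation field -> value}
Snapshot = Dict[str, Dict[str, str]]
Change = Tuple[str, str, Optional[str], Optional[str]]


def changed_annotations(old: Snapshot, new: Snapshot, fields: List[str]) -> List[Change]:
    """List (variant, field, old value, new value) for every tracked field
    whose annotation differs between two annotation snapshots."""
    changes: List[Change] = []
    for variant in sorted(new):
        before = old.get(variant, {})
        after = new[variant]
        for field in fields:
            if before.get(field) != after.get(field):
                changes.append((variant, field, before.get(field), after.get(field)))
    return changes


# Hypothetical snapshots at initial testing and at reanalysis.
initial = {
    "ALDH7A1:c.192+3A>T": {"inheritance": "recessive"},
    "ALG13:c.320A>G": {"inheritance": "X-linked recessive"},
}
updated = {
    "ALDH7A1:c.192+3A>T": {"clinvar": "Likely pathogenic", "inheritance": "recessive"},
    "ALG13:c.320A>G": {"inheritance": "X-linked dominant"},
}

for change in changed_annotations(initial, updated, ["clinvar", "inheritance"]):
    print(change)
```

Only the variants returned by such a comparison would then need manual review, which is the labor-saving aspect of starting from re-annotated VCF files.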

DISCUSSION
Neurodevelopmental disorders, including epileptic encephalopathy, affect more than 3% of children worldwide (23). The genetic diagnosis of these diseases is gaining importance, and NGS with massively parallel sequencing has become standard clinical practice. Approximately 250 novel gene-disease and 9,200 novel variant-disease associations are reported every year (6). Therefore, the significance of reanalyzing negative results obtained from previous rounds of NGS cannot be overemphasized.
Several reports have described the reanalysis of whole-exome or clinical-exome sequencing data, with yields varying from approximately 10% to 30% achieved by applying improved bioinformatic pipelines or searching for newly discovered disease-associated genes (6)(7)(8)(9)(10)(11)(12)(13)(14). To achieve the maximum diagnostic yield, realignment with upgraded tools, inspection and clarification of patients' symptoms and signs, data sharing and collaboration with other institutes, and reannotation of variants using updated databases should be performed periodically. However, reanalysis in clinical laboratories is limited because it is time-consuming and labor-intensive. Furthermore, frequent changes in analytical pipelines may hinder routine work, and patients can only be contacted when they visit centers for appointments. In addition, detailed phenotyping is often difficult, given the limited consultation time available.
The phenotypic consistency in P1 and P4-P13 with epilepsy was considered "phenotype consistent with gene but not highly specific," and the phenotypic consistency in P2 and P3 without epilepsy was considered "phenotype consistent with gene but not highly specific and high genetic heterogeneity."
Frontiers in Neurology | www.frontiersin.org
In the present study, we reanalyzed the previous data using updated guidelines, new variant-disease associations with some phenotyping, new genetic inheritance, and updated gene-disease associations, all of which contributed to the increase in the diagnostic yield. We reanalyzed the results of two gene panels comprising 173 and 4,503 genes under limited conditions. For efficient bioinformatic analysis, we started from VCF files and used automated annotation programs (Figure 1). This process allowed us to reduce labor and time because it removed the need to search the databases manually. We identified six new pathogenic or likely pathogenic variants (5.2%) in 116 patients who underwent neurodevelopmental disorder panel testing and seven new pathogenic or likely pathogenic variants (1.8%) in 384 patients who underwent epileptic encephalopathy panel testing. The increase in the diagnostic yield was relatively low, possibly because variants upgraded through newly established gene-disease associations were scarce. It may also be attributed to the fact that we performed panel sequencing, not exome sequencing, without changing the previously used bioinformatic tools and without in-depth phenotyping. Although the falling cost of large-scale genome sequencing may increase the use of whole-exome sequencing as a routine practice, many clinical laboratories still employ gene panels with a limited number of target genes (24). The identification of 13 new pathogenic or likely pathogenic variants from targeted gene panel results demonstrates the benefit of this approach, considering the limited time and resources available in clinical laboratories. We learned several lessons from the reanalysis of targeted panel sequencing data at the clinical laboratory level. First, it is important to keep track of the relevant guidelines for the correct interpretation of variants. 
The ACMG/AMP guidelines published in 2015 provide information on variant interpretation necessary during the initial testing but may lack some explanation. The updated recommendation for de novo criteria highlighted that de novo reports from unrelated patients could help interpret the variants. In addition, the updated guidelines for the PVS1 code provided us with the evidence for the interpretation of exon duplication.
Second, even during reanalysis, identifying variants related to the cause of an underlying disease may be useful for treating patients. In the case of P8, ALDH7A1 variants associated with pyridoxine-dependent epilepsy were thought to be related to the patient's seizures, and the administration of pyridoxine improved seizures in this patient. In the case of P4, clonazepam, which was found to be effective in patients with the same GLRA1 variant related to hyperekplexia (25), could be administered to our patient.
Third, it is advantageous to include as many relevant genes as possible when designing a gene panel. During reanalysis, the yield of our gene panel for neurodevelopmental disorders, which contained more genes, was higher than that of the gene panel for epileptic encephalopathy. ZDHHC9, ITPR1, and GLRA1 showed meaningful variants in the gene panel for neurodevelopmental disorders during reanalysis but were not included in the gene panel for epileptic encephalopathy.
Our study has a few limitations. The reanalysis was performed on two different gene panels, not on exome sequencing data, owing to the nature of the clinical laboratory. Hence, it was difficult to reveal variants upgraded through the discovery of new gene-disease associations. In addition, only unsolved cases were subject to reanalysis. Future studies should also be conducted on variants previously classified as pathogenic or likely pathogenic.
In conclusion, we reanalyzed the data obtained using a small and a large gene panel, starting with VCF files. The main factors behind the reclassifications were updated guidelines and de novo reports from other patients. Although relatively low, the increase in the diagnostic yield was meaningful. This approach may encourage the implementation of data reanalysis as a routine process in clinical laboratories.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in the Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Severance Hospital, Institutional Review Board. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements. Written informed consent was not obtained from the minor(s)' legal guardian/next of kin for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
DW and SK analyzed the data and wrote the paper. BK and S-TL verified the analytical method and aided in interpreting the results. JC and H-CK designed the study and supervised the findings of the work. All authors discussed the results and contributed to the final manuscript.

ACKNOWLEDGMENTS
We would like to thank Editage (www.editage.co.kr) for English language editing.