Allelic Dropout Is a Common Phenomenon That Reduces the Diagnostic Yield of PCR-Based Sequencing of Targeted Gene Panels

Primary cardiomyopathies (CMPs) are monogenic but multi-allelic disorders with dozens of genes involved in pathogenesis. The implementation of next-generation sequencing (NGS) approaches has resulted in more time- and cost-efficient DNA diagnostics of cardiomyopathies. However, the diagnostic yield of genetic testing for each subtype of CMP fails to exceed 60%. The aim of this study was to demonstrate that allelic dropout (ADO) is a common phenomenon that reduces the diagnostic yield in primary cardiomyopathy genetic testing based on targeted gene panels assayed on the Ion Torrent platform. We performed mutational screening with three custom targeted gene panels based on sets of oligoprimers designed automatically using AmpliSeq Designer® containing 1049 primer pairs for 37 genes with a total length of 153 kb. DNA samples from 232 patients were screened with at least one of these targeted gene panels. We detected six ADO events in both IonTorrent PGM (three cases) and capillary Sanger sequencing (three cases) data, identifying ADO-causing variants in all cases. All ADO events occurred due to common or rare single nucleotide variants (SNVs) in the oligoprimer binding sites and were detected because of the presence of “marker” SNVs in the target DNA fragment. We ultimately identified that PCR-based NGS involves a risk of ADO that necessitates the use of Sanger sequencing to validate NGS results. We assume that oligoprimer design without ADO data affects the amplification efficiency of up to 0.77% of amplicons.

Primary cardiomyopathies (CMPs) are monogenic but multi-allelic disorders with dozens of genes involved in pathogenesis. The implementation of next-generation sequencing (NGS) approaches has resulted in more time-and cost-efficient DNA diagnostics of cardiomyopathies. However, the diagnostic yield of genetic testing for each subtype of CMP fails to exceed 60%. The aim of this study was to demonstrate that allelic dropout (ADO) is a common phenomenon that reduces the diagnostic yield in primary cardiomyopathy genetic testing based on targeted gene panels assayed on the Ion Torrent platform. We performed mutational screening with three custom targeted gene panels based on sets of oligoprimers designed automatically using AmpliSeq Designer ® containing 1049 primer pairs for 37 genes with a total length of 153 kb. DNA samples from 232 patients were screened with at least one of these targeted gene panels. We detected six ADO events in both IonTorrent PGM (three cases) and capillary Sanger sequencing (three cases) data, identifying ADO-causing variants in all cases. All ADO events occurred due to common or rare single nucleotide variants (SNVs) in the oligoprimer binding sites and were detected because of the presence of "marker" SNVs in the target DNA fragment. We ultimately identified that PCR-based NGS involves a risk of ADO that necessitates the use of Sanger sequencing to validate NGS results. We assume that oligoprimer design without ADO data affects the amplification efficiency of up to 0.77% of amplicons.

INTRODUCTION
In recent years, the study of the genetic causes of monogenic diseases has evolved from a basic science research area into widely accepted clinical testing protocols with substantial impacts on diagnostics and clinical decision-making (Ackerman et al., 2011). Primary cardiomyopathies (CMP) are monogenic but multi-allelic disorders with dozens of genes involved in pathogenesis (Hershberger et al., 2018). The prevalence of clinically expressed and hypertrophic cardiomyopathy (HCM) gene carriers has been greatly underestimated and could be as high as 1:200 (Semsarian et al., 2015).
The implementation of NGS approaches has resulted in more time-and cost-efficient DNA diagnostics of cardiomyopathies. However, the diagnostic yield of genetic testing for each subtype of CMP fails to exceed 60% (Hershberger et al., 2018). Negative results obtained by genetic testing do not rule out the presence of genetic disease because our knowledge about the molecular pathogenesis of disease is still evolving. Moreover, the technical limitations of all known techniques of DNA/RNA analysis and variant interpretation contribute to incomplete results. Alternative sequencing approaches such as capillary Sanger sequencing confirm the genetic variants found by NGS methods to increase the reliability of the DNA test results (Baudhuin et al., 2015).
Allelic dropout (ADO) is a common phenomenon that reduces the efficiency of PCR-based targeted sequencing. It was first described in 1991 as a "partial amplification failure, " causing a potential source of misdiagnosis for both dominant and recessive diseases (Navidi and Arnheim, 1991). The practical importance of the ADO phenomenon was originally shown in 1997 by Lissens and Sermon in a case of preimplantation genetic diagnosis of cystic fibrosis wherein the heterozygous F508 mutation in the CFTR gene was not detected in 25% of mutant blastomeres (Lissens and Sermon, 1997). The ADO phenomenon involves selective allele amplification during the polymerase chain reaction (PCR) thermocycling process. The presence of single nucleotide variants (SNVs) in the forward and/or reverse oligoprimer binding sites may lead to the complete or partial lack of amplification of the single allele, while the second one may "drop" out during the PCR process. In such cases, SNVs causing ADO are usually located closer to the 3 ′ end of the oligoprimer binding site (Martins et al., 2011).
Bi-directional capillary Sanger sequencing and highthroughput semiconductor sequencing approaches are routinely used for cross-validation of genetic findings (Baudhuin et al., 2015;Di Resta and Ferrari, 2018). Both approaches are PCRbased, share similar limitations, and may be negatively impacted by ADO. However, the incidence of ADO events in these PCR-based diagnostic assays remains unknown.
The aim of this study was to demonstrate that ADO is a common phenomenon influencing the diagnostic yield of targeted gene panel testing of primary CMPs on the Ion Torrent platform with follow-up verification by Sanger sequencing.

MATERIALS AND METHODS
We performed genetic testing on DNA samples from 232 patients diagnosed with inherited cardiomyopathies in clinical centres. This study was performed in accordance with the 1964 Helsinki declaration, its later amendments and local ethics committee. Written informed consent was obtained from all individual participants included in the study. DNA samples were extracted from venous blood using Quick-DNA Miniprep Plus Kit (Zymo Research Corp., Irvine, CA, USA) according to the manufacturer's instructions.
Mutational screening was performed using three custom targeted gene panels with two sets of oligoprimers designed automatically using Ion AmpliSeq Designer ® (Thermo Fisher Scientific, Waltham, MA, USA) containing 1,049 primer pairs for 37 genes, with a total length of 153 kb. More detailed characteristics of each target genes panels are presented in Supplementary Table 1. Manufacturer grouped primers for each panel in 2 pools. Libraries preparation was performed using Ion AmpliSeq TM Library Kit 2.0 according to the manufacturer's instructions (Thermo Fisher Scientific). Sequencing was performed on Ion 314 TM and Ion 316 TM chips using highthroughput semiconductor sequencing on an Ion PGM TM System according to the manufacturer's instructions (Thermo Fisher Scientific). Average reads per amplicon were 192, mean coverage with at least 20 reads−94.7%, mean coverage with at least 100 reads−79.51%. Data from the Ion PGM TM System were processed with CoverageAnalysis and VariantCaller plugins available within licensed Torrent Suite Software 5.6.0 and Ion Reporter Software (Thermo Fisher Scientific). NGS sequencing reads were visualized using the Integrative Genomic Viewer (IGV) tool (Robinson et al., 2011) using hg19 as a reference genome. All DNA samples were screened with at least one of the targeted gene panels mentioned.
Rare genetic variants detected by NGS were verified via bi-directional capillary Sanger sequencing on an ABI 3730XL DNA Analyzer according to the manufacturer's instructions (Thermo Fisher Scientific). Alternative pairs of oligoprimers flanking the coding and adjacent intronic regions of the 37 genes were designed for PCR using open-source PerlPrimer (Marshall, 2004). The PCR protocol and annealing temperature of the primers were determined experimentally. The results of direct Sanger sequencing were visualized using Chromas 2 software (Technelysium Pty Ltd, South Brisbane, Australia).
All archival direct Sanger sequencing chromatograms were involved in the study to track the ADO phenomenon. Genetic variants found by NGS were visually compared with Sanger sequencing chromatograms, noting the possible loss of heterozygosity or underrepresentation of alternative alleles. To reveal the cause of ADO, forward-and reverse-primer binding sites were analysed using the Genome Aggregation Database (gnomAD) (Karczewski et al., 2020). In order to exclude only one allele amplification, all amplicons with noted or suspected ADO cases were re-sequenced with alternative non-overlapping oligoprimer pairs.
All genetic variants newly detected in this study were registered in public database ClinVar (https://www.ncbi.nlm. nih.gov/clinvar/). List of variants with accession numbers is summarized in Supplementary Table 2.

RESULTS
We performed mutational screening on 232 DNA samples from patients diagnosed with different types of inherited CMPs. The DNA samples were screened with at least one of the three targeted gene panels.
We found that three ADO cases occurred during sequencing on the IonTorrent platform and three occurred during capillary Sanger sequencing. In the targeted gene panels, ADO led to underrepresentation/loss of marker variants in NGS reads ( Table 1). The Sanger sequencing chromatograms revealed a dropout of the allele due to the loss of heterozygosity of the already detected ("marker") SNV (Table 1). Control capillary resequencing using additional alternative oligoprimers confirmed the true allelic status.
We identified the cause of ADO in all six cases ( Table 1). In the targeted genes, ADO was caused by rare or unique genetic variants in the oligoprimer binding sites; in capillary Sanger sequencing, all ADO cases occurred due to common SNVs.
We found that the ADO phenomenon may lead not only to the complete loss of an allele but also to the underrepresentation of the "marker" variant in NGS reads. For example, the heterozygous rare missense variant c.641G>A (p.R214Q) in the SCN1B gene was detected in only 1 of 2 overlapping amplicons by target IonTorrent sequencing and was represented in 5% of all reads ( Figure 1A). This caused the loss of two missense variants, -c.744C>A (p.S248R) and c.749G>C (p.R250T)-and only reads with the wild-type allele were displayed. The presence of all three linked heterozygous missense variants-c.641G>A (p.R214Q), c.744C>A (p.S248R), and c.749G>C (p.R250T)-in the SCN1B gene in the DNA sample was confirmed by control Sanger sequencing ( Figure 1B).
To reproduce this case of ADO in a single (i.e., nonmultiplexed) PCR, we performed a single control PCR with two oligoprimers designed by AmpliSeq and flanking genomic region chr19:35524839-35525003 (the corresponding target region of the SCN1B gene). This amplicon was sequenced separately by capillary Sanger sequencing and the loss of the allele containing the c.641G>A (p.R214Q) variant was reproduced ( Figure 1C).
Cross-validation of DNA diagnostic results using an alternative sequencing approach (capillary Sanger sequencing was performed first as a basic method) allowed us to identify a case of allelic dropout in the SCN5A gene in an Iranian family with Brugada syndrome (Figure 2A). Heterozygous missense variant c.4516C>T (p.P1506S) in exon 26 of the SCN5A gene was found by capillary Sanger sequencing in DNA samples of the proband ( Figure 2B) and proband's brother. However, further cascade familial screening revealed this missense variant in the proband's nephew in the hemizygous state ( Figure 2C). Theoretically, the hemizygous state of this c.4516C>T variant in the II-4 family member may involve alternative explanations such as consanguinity in the family or de novo deletion of the SCN5A gene in the maternal allele, but it did not fit the clinical phenotype and family history. Dropout of the wild-type allele was the most reliable explanation.
Manual analysis of oligoprimer binding sites using gnomAD revealed the presence of a common genetic variant c.4542+89C>T [total minor allele frequency (MAF) 0.098] in the 3 ′ -end of the R-primer which caused ADO. This heterozygous SNV was detected in DNA samples of patient II-4 and his mother by PCR-RFLP analysis using an additional pair of oligoprimers flanking this region. Absence of genetic variant c.4516C>T (p.P1506S) in the mother's DNA sample was confirmed by capillary Sanger sequencing with two independent oligoprimers and PCR-RFLP analysis. Control re-sequencing on the IonTorrent platform showed that family  member II-4 is a carrier of heterozygous c.4516C>T (p.P1506S) ( Figure 2D).
In one case, we detected ADO when comparing results from two consecutive targeted gene panel sequencing assays with overlapping gene spectra and different oligoprimers encompassing the target regions of the LDB3 gene. A variant of unknown significance-c.1051A>G (p.T351A)-in the LDB3 gene in sample ARVD19 was detected only by panel I ("Genes encoding desmosomal and associated proteins") (Supplementary Figure 1A) but not in panel II ("Genes encoding sarcomeric and associated proteins") (Supplementary Figure 1B). This SNV was also confirmed by control capillary sequencing (Supplementary Figure 1C).
We also found that ADO may occur not only due to localization of SNVs in the 3 ′ -end of oligoprimer binding sites, but also close to the 5 ′ -end. Deep intronic variant c.2300-195A>G in the PKP2 gene was located close to the 5 ′ -end and led to ADO in nine samples studied ( Table 1).
We found that ADO is a non-consistent process even in the same DNA sample. For example, three consecutive PCR capillary sequencing runs with sample ARVD16 yielded one positive result (heterozygous variant c.2091A>G was detected) and two negative results (complete loss of the c.2091A>G variant). This ADO event was caused by the intronic variant c.1904-49T>A located at the 3 ′ -end of the primer-binding site.
All amplicons with identified ADO events were carefully resequenced using alternative oligoprimer pairs and selective allele amplification was confirmed in all cases ( Table 1).

DISCUSSION
Currently, ADO is a known limitation of PCR-based molecular diagnostic approaches. Caused by different mechanisms, the single allele amplifies exclusively or pre-dominantly, leading to overrepresentation of homozygosity (Wang et al., 2012).
Following the automatization of all molecular diagnostic procedures-from primer design to variant detection, calling, and interpretation-that has increased the amount of samples tested simultaneously, the awareness of ADO events should also be increased. The Clinical Laboratory Standards Institute (CLSI) Guidelines recommend that assay development and quality control should include measures aimed at both detecting allelic dropout and minimizing its occurrence (CLSI, 2012).
In this study, six ADO events were identified in both PCR-based sequencing platforms. The causes of these events were also revealed in all cases. On the IonTorrent platform, ADO events were caused by rare or unique genetic variants in the oligoprimer binding sites. On the capillary sequencing platform, ADO events were caused by common SNVs in the oligoprimer binding sites ( Table 1). As a result of such selective amplification, we observed partial hemizygosity or underrepresentation of the heterozygous genetic variants in the NGS results. Some of these underrepresented variants may be filtered automatically during NGS data processing. Cross-validation of the genetic findings revealed by one sequencing platform with an alternative approach is a powerful method to decrease the rate of false-positive results in genetic testing. However, there is no universally-accepted method to decrease-let alone effectively detect-partial hemizygosity due to allelic dropout. It seems that resequencing of the region of interest by two independent oligoprimer pairs remains the "gold standard" of DNA diagnostics.
There are 2 types of causes of the ADO phenomenon described in literature (Wang et al., 2012): (1) "Sample-specific" causes due to the quality of the DNA sample or the low DNA concentration. Such ADO cases are found in forensic diagnostics, where only fragmented or degraded DNA is available, as well as in preimplantation diagnostics, where genotyping is performed on DNA extracted from one blastomere; (2) "Locus-specific" causes due to the characteristics of the locus under investigation. The presence of single nucleotide polymorphisms (SNPs) in oligoprimer binding sites of forward and/or reverse primers disturbs the specificity of the complementary interaction between the oligonucleotide and the target DNA sequences, leading to the lack of oligoprimer hybridization, and elongation of the amplicon.
All ADO events revealed in this study involved locus-specific causes due to the characteristics of individual loci in normal concentrations of DNA.
In cases of locus-specific allelic dropout, the causal SNV in the oligoprimer binding site is usually located close to the 3 ′ -end of the oligoprimer. This was initially reported in a study by Martins et al. (2011). The authors used the rs2247836 variant (MAF = 0.403 in the European population and 0.323 in the African population) in the intron 4 of the PAH gene to evaluate the probability of ADO depending on SNV location in oligoprimer binding sites. Four alternative variants of forward oligoprimers were designed containing the SNV in the 3rd, 5th, and 7th positions of the 3 ′ -end of the oligoprimer. Sanger sequencing was performed for patients carrying the heterozygous genetic variant rs2247836 and mutation p.Arg158Gln in exon 5 of the PAH gene. Loss of heterozygosity was detected for all positions of the ADO-causing SNV. The authors recommended careful consideration during primer design of the rare/common SNVs in areas within 7 nucleotides of the 3 ′ -end of oligoprimers. We found that the presence of SNVs close to the 5 ′ -end of oligoprimers may also cause ADO events. Convincing data suggest that any polymorphic position within the oligoprimer sequence potentially reduces the accuracy of DNA diagnostics (Martins et al., 2011).
Data from 30,769 reported genotypes for eight mutations involved in four diseases show that, on average, allele dropout/drop-in potentially leading to misdiagnosis occurred in 0.44% of genotype results (Blais et al., 2015). We re-analysed the oligoprimer binding sites containing SNVs within overlapping amplicons and found additional amplicons that may be missing due to ADO events. The presence of SNVs in these fragments was exhibited in the NGS reads as underrepresented variants and/or was revealed in isolated reads ( Table 2).
We hypothesize that the risk of ADO would increase with the number of target genes and overall panel size because it would increase number of oligoprimer pairs to cover. Potential causes of ADO (SNVs) were found in 4 of 521 amplicons in panel 1 (0.77%). There are increasing numbers of studies that discuss the importance of ADO in DNA diagnostic procedures (Tester et al., 2006;Coulet et al., 2010;Medlock et al., 2012;Rossetti et al., 2012;Lam and Mak, 2013;Shmukler et al., 2013;Rhees et al., 2014;Blais et al., 2015;Proost, 2016). This phenomenon was detected in oncogenetics (BRCA1/2 testing), inborn metabolic disease genotyping (FAH testing), hematology research etc (Lam and Mak, 2013;Shmukler et al., 2013;Jeong et al., 2019). We surmise that the actual number of ADO events remains unknown and may significantly exceed the events actually detected. The risk of the negative impacts of possible polymorphic sites in automatically generated primer sequences on sequencing results remains high. It depends on the number of overlapping amplicons. It seems that underrepresentation of genetic variants in NGS reads is not dependent on the read depth. Jeong et al. demonstrated a reproducible ADO phenomenon during 3 consecutive resequencing by Ion S5 with read depth from 1985 to 8608 (Jeong et al., 2019).
Allelic dropout can lead to underdetection of the rare variants within missing amplicons and increase false negative results rate in diagnostic setting, and can cause mistaken assignment of heterozygous genotypes as homozygotes with underestimation of the observed heterozygosity in population studies. The simple strategy to restore allelic drop out could be a repeated genotyping with non-overlapping pairs of oligoprimers. But in daily practice replicate genotyping is costly. Increasing the tiling density of amplicons would be also helpful but it requires more oligos for design and synthesis, and also noticeably increases the assay price. Another way could be to improve automatic primer design tools with continuously updating dbSNP and gnomAD data to omit inclusion of the SNPs into the primer sequences. Analysis the secondary structure of primer and template sequences would also be important for the design algorithm (Lam and Mak, 2013). Regular updates on SNV distribution and prevalence in the human genome and improvement in primer design algorithms would greatly improve the diagnostic yield of molecular genetic testing.
In conclusion, PCR-based sequencing technologies such as next-generation sequencing and Sanger sequencing are widely used in clinical practice. Despite their high throughput and constantly improving efficiency, limitations remain for the application of these technologies.
All PCR-based methods involve the risk that ADO will decrease the diagnostic yield of genetic testing because of undetectable, potentially pathogenic variants. We demonstrate that ADO is a common phenomenon in both NGS and Sanger sequencing results.
Theoretically, ADO may affect up to 0.77% of amplicons. It seems that the actual rate of ADO may be even higher and is dependent on the number of oligoprimer pairs. Specific software that incorporates updates on the distribution of SNVs to avoid ADO resulting from automatic oligoprimer design would substantially increase the accuracy of molecular research.
Oligoprimer sequences are available upon request.

DATA AVAILABILITY STATEMENT
The datasets for this article are not publicly available due to concerns regarding participant/patient anonymity. Requests to access the datasets should be directed to the corresponding author. The ClinVar reference numbers for the variants discussed in the article are present in the article text and/or Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the local Ethics Committee of Petrovsky National Research Center of Surgery (Moscow, Russia). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
AS, AB, and SS performed wet genetic investigation. AS performed data analysis and drafting the manuscript. EZ-management of the project, editing, and final approval of the manuscript. All authors read, discussed, and approved the manuscript as submitted.

FUNDING
This work was supported by Russian Science Foundation Research grant № 16-15-10421.