Reality check: testing the in silico predictions of false negative results due to mutations in SARS-CoV-2 PCR assays using templates with mismatches in vitro

Otwell, Taylor; Knight, Brittany; Coryell, Michael; Stone, Jennifer; Davis, Phillip; Necciai, Bryan; Carlson, Paul; Sozhamannan, Shanmuga

doi:10.3389/fcimb.2025.1524025

ORIGINAL RESEARCH article

Front. Cell. Infect. Microbiol., 31 October 2025

Sec. Clinical and Diagnostic Microbiology and Immunology

Volume 15 - 2025 | https://doi.org/10.3389/fcimb.2025.1524025

Reality check: testing the in silico predictions of false negative results due to mutations in SARS-CoV-2 PCR assays using templates with mismatches in vitro

Taylor Otwell¹

Brittany Knight^1*

Michael Coryell²

Jennifer Stone^1,3

Phillip Davis¹

Bryan Necciai⁴

Paul Carlson²

Shanmuga Sozhamannan^4,5*

¹Integrated Health Surveillance and Diagnostics, MRIGlobal, Kansas City, MO, United States
²Laboratory of Mucosal Pathogens & Cellular Immunology, Division of Bacterial, Parasitic and Allergenic Products, Office of Vaccines Research and Review, Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD, United States
³National Strategic Research Institute, Nebraska Medical Center, Omaha, NE, United States
⁴Defense Biological Product Assurance Office (DBPAO), Joint Program Executive Office for Chemical, Biological, Radiological and Nuclear Defense (JPEO-CBRND), Enabling Biotechnologies, Frederick, MD, United States
⁵Joint Research and Development, LLC, Stafford, VA, United States

Molecular diagnostic assays are critical tools to test, diagnose and treat infectious and other diseases. For example, PCR test results have been extremely valuable during the COVID-19 pandemic, not only to provide appropriate health care for infected and symptomatic individuals as needed, but also for implementing public health measures such as test, trace and isolate infected and asymptomatic individuals to prevent further transmission of the virus. Sustained transmission and unhindered proliferation of the pathogen across the population during a continuous, ongoing pandemic such as COVID-19, resulted in many variants with mutations. These mutations may lead to signature erosion, a phenomenon wherein diagnostic tests developed using the genomic sequence of an earlier version of the pathogen, may fail and cause a false negative (FN) result in a sample containing a new variant. We and others have developed applications such as PSET (PCR Signature Erosion Tool) to monitor the performance of diagnostic tests in silico using pathogen genomic sequences. Here, we present and discuss the data on wet lab testing of the in silico predictions to assess assay performance with mismatches in assay signatures. We found that the majority of the assays performed without drastic reduction in assay performance even with mismatches in primer and probe regions as measured by PCR efficiencies and C_t value shifts. We identified critical residues and positions and types of changes that may impact assay performance. Despite the extensive accumulation of mutations in SARS-CoV-2 variants over the course of the various waves of the pandemic, most PCR assays proved to be extremely robust and continued to perform well even with drastic changes and signature erosion.

Introduction/background

Real time PCR assays are the bedrock of pathogen detection in various clinical, veterinary, and environmental sample types. In these tests, presence of a unique part of a pathogen genome is examined by polymerase chain reaction based nucleic acid amplification. Usually, the target amplicon, a very tiny portion of the genome (e.g., ~0.0018% of Bacillus anthracis or ~0.33% of SARS-CoV-2 genome for a 100 bp amplicon), is used as a proxy for the presence of the pathogen. The underlying success of the PCR test relies on specific and efficient binding of the primers and probes to the complementary target nucleic acid sequences present in the sample and subsequent amplification. Several parameters contribute to the efficiency of the PCR which include composition of bases (GC content), the interaction kinetics of the primer and probe to the target sequences reflective of the percent identity between the two components and reaction conditions (ionic strength and other reagents) and cycling parameters. Designing successful PCR assays may be impacted by the many inherent features of the pathogen genome sequences and their diversity and hence availability of contiguous stretches of unique sequences of sufficient length for designing a PCR assay (Stadhouders et al., 2010). In addition, appropriate target selection during assay design is also critical to achieve high specificity of an assay, i.e., the assay should only detect the agent or pathogen of interest and not any near neighbor or any other DNA that may have some identity to the intended amplicon target. For the most part, a well-designed assay detects all strains of a given target pathogen and excludes everything else. The assays are designed based on available sequences of the target agent at a given time and hence may only be effective against the sequence diversity known at that time.

Democratization of next-generation sequencing has enabled the generation of thousands of whole genome sequences of a given pathogen directly from clinical specimens, as evidenced during the SARS-CoV-2 pandemic. As of September 03, 2024, around 16,934,260 sequences have been generated and shared from many countries around the globe via GISAID, the global data science initiative (Khare et al., 2021). The sensitivity, speed, scalability, and reduced costs of modern-day sequencers make whole genome sequencing an attractive diagnostic tool—even rivaling PCR—though there are still a number of challenges that currently prevent whole genome sequencing from completely eclipsing PCR as the predominant diagnostic tool for SARS-CoV-2 and other infectious diseases (John et al., 2021). This genomic revolution has created an opportunity as well as challenges for molecular assay design, development, testing, validation, and continuous performance evaluation and monitoring. With the constant evolution of the given target pathogen over time, there is a possibility that the assay target may show signature erosion and the assays may fail, resulting in false negative results. Hence, real time periodic monitoring of assays in silico against newer sequences can potentially reveal such failures in advance. This will enable redesigning of assays to address the changing genomic profile even before a variant becomes dominant and causes overwhelming false negative results in clinical and environmental sample testing. SARS-CoV-2 exemplifies such a scenario and in silico monitoring has revealed such failures in diagnostic assays [e.g., S Gene Target Failure (SGTF) from alpha variant] over time during the pandemic (Davies et al., 2021). Mismatch in assay signatures can also be beneficial for discriminating wild-type and variant strains, as evidenced by SGTF. Many studies have reported in silico monitoring as a means of assessing assay failures (Sozhamannan et al., 2015; Khan and Cheung, 2020; Miranda and Weber, 2021; Mentes et al., 2022; Rana et al., 2022).

However useful in silico monitoring of assays may be, these approaches may not accurately predict assay failures in analytical or clinical sample testing. The assay failure and hence false negative results in wet lab testing may result from not only the type and number of mismatches in signature sequences or position of the mismatches from the priming site but also other parameters such as ionic conditions of the PCR reaction and matrix effects. Many in silico predictions do not account for these factors. Understanding the impact of these factors may aid in tweaking the in silico approaches to better predict assay failures and may potentially circumvent the need for wet lab testing.

The basic physical parameters of nucleotide mismatches and ionic conditions and their impact on PCR have been well defined (Eun, 1996). In general, a 1% base mismatch reduces the melting temperature (T_m) by 1.0–1.4 °C (Bonner et al., 1973). For an oligonucleotide, single base pair mismatches can affect the T_m by as much as 10 °C (Wallace et al., 1979). When the T_m is decreased by ∼15 °C due to mismatches, the annealing rate of the DNA is reduced by a factor of two (Chang et al., 1974). Mismatching hybrids are more stable at high salt than at low salt concentrations, approximately 66% being the minimum match. The stringency of hybridization can be adjusted by several factors such as temperature, ionic strength, and chaotropic agents (Nozari et al., 1986).

There is extensive documentation on wet lab testing the impact of mismatches between primers/probes and templates. Different experimental approaches can be taken to assess PCR assay performance. Priming probabilities, for example, were found to be a good measure of analytical specificity (Boyle et al., 2009; Wright et al., 2014). Another study quantitatively investigated the effects of primer-template mismatches within the 3-end primer region on real-time PCR using the 5-nuclease assay (Stadhouders et al., 2010). The results showed that single mismatches instigate a broad variety of effects, ranging from minor (<1.5 cycle threshold, e.g., A–C, C–A, T–G, G–T) to severe impact (>7.0 cycle threshold, e.g., A–A, G–A, A–G, C–C) on PCR amplification.

A systematic approach to assess the effects of mismatches and their positions from the 3’end was conducted (Lefever et al., 2013). Single mismatches located >5 bp from the 3 end have a moderate effect on qPCR amplification and can be tolerated and complete blocking of the PCR reaction was observed for 4 mismatches. A number of additional studies have drawn similar conclusions (Christopherson et al., 1997; Süss et al., 2009; Wu et al., 2009; Persson et al., 2019; Howson et al., 2020; So et al., 2020; Cao et al., 2022; Wu et al., 2023). These studies in general point to various parameters to be critical in impacting assay efficiencies because of diversities in experimental set up and other factors and outputs measured. A few recent studies have focused on SARS-CoV-2 as examples and the impact of mutations on PCR efficiencies (Vogels et al., 2020; Storey et al., 2021; Chen et al., 2022).

Overall, the forgoing studies are either focused on introduction of different types and numbers of single nucleotide polymorphisms (SNPs) in predetermined tester-chosen locations, or the assays are restricted to one target gene or organism and the data are extrapolated to other assays and scenarios. Thus, the data were derived from a limited set of template mismatches or a single assay target. Also, these studies could not arrive at a single parameter that can predict false negative results. During the COVID-19 pandemic, a large number (276) of molecular diagnostic assays were developed and issued Emergency Use Authorization (EUA) (United States Food and Drug Administration). These assays are generally spread across the entire genome, and there were many variants with mutations falling within these assay signatures.

The availability of a large set of diagnostic assays and millions of SARS-CoV-2 whole genome sequences provided the opportunity to test in vitro a number of naturally occurring permutations of different assays and validate the in silico predictions. Here we have conducted extensive testing of 16 assays with over 200 synthetic templates spanning the SARS-CoV-2 genome. We assessed the impact of mismatches in primer and probe binding sites on PCR performance by capturing various metrics such as change in melting temperature (ΔT_m), amplification efficiency, C_t values obtained at various template concentrations, and y-intercept. The use of machine learning models trained on these data to predict the impact of template mismatches on PCR assay performance has been reported elsewhere (Knight et al., 2025). Here we present and discuss the companion wet lab testing data for the 16 assays with various mutant permutations of the wild-type template.

Materials and methods

Assay selection

Over the course of the COVID-19 pandemic, periodically, we tracked the performance of 43 SARS-CoV-2 PCR assays using an in silico analysis tool called PSET (PCR signature erosion tool) against SARS-CoV-2 sequences from the GISAID database¹. This tool primarily used percent identity between the query sequence (assay signature sequences comprising of primer, probe and amplicon sequence) and the subject sequences from GISAID. If the mismatch percent in either one of the primers or the probe was >10%, then those subject sequences were considered to have potential for causing false negative results in wet lab testing. The 10% mismatch threshold was chosen somewhat arbitrarily and considering some literature data (Lefever et al., 2013) with the logic that primers of 20 nt length can tolerate up to 2 mismatches without significant reduction in assay performance.

Of the 43 assays that were tracked in real time using GISAID sequences, we selected 16 assays for evaluating the accuracy of in silico predicted false negative results using synthetic templates with specific mutations in either primer, probe or a combination. These 16 assays are: BVP 501Y, BVP 501Y-omi, C3 ORF3a, C4 ORF8, CDC_N2, Chan-S, China_N, France_nCoV_IP2, HKU-ORF1b-nsp14, Japan_N, Japan_N2, ncov_n_gene, Noblis.40, Yale 69/70 del, Young-ORF1ab, Young-S. Assay designs (primer, probe, and amplicon sequences), sources/references, and genomic locations for all 43 assays are provided in Supplementary File 1. Location of the 16 selected assays on the SARS-CoV-2 genome (NC_045512) is displayed in Figure 1.

Figure 1

Genome map of NC_045512 with PCR assay locations marked in red, such as

Figure 1. Location of the 16 selected assays on the SARS-CoV-2 genome (NC_045512). Genes are represented by blue bars with annotations, and assay locations by red bars.

In silico analysis of assay inclusivity and uniqueness of false negative (FN) results

In order to evaluate the inclusivity of each assay, all SARS-CoV-2 genomes designated as “complete” (N = 1793877) were downloaded from NCBI Virus database² (Hatcher et al., 2017) on March 29, 2023, along with a corresponding metadata table containing the Pangolin (Rambaut et al., 2020) lineage assignments. Prediction of amplicon products in silico, as well as calculation of inclusivity statistics were performed by the method previously described (Stanhope et al., 2022).

In order to evaluate the number of unique template sequences contributing to False Negatives (FNs) we aligned all of the genomes downloaded for analysis (N = 1793877) to the RefSeq SARS-CoV-2 genome (NC_045512.2) using minimap2 (Li, 2018) with the “asm5” preset corresponding to 5% sequence divergence. For each primer pair we extracted the coordinates from the predicted amplicon table to produce a bed file with the genomic intervals for each assay on the reference genome. Using the BAM file from the minimap2 alignment and the bed intervals for each assay we used bam2msa³ to excise all the unique sequence alignments to the region of the reference genome corresponding to each assay. A diagram to illustrate the in silico analysis workflow described above is shown in Figure 2.

Figure 2

Flowchart illustrating an in silico PCR process. It begins with Minimap2 alignment of amplicons to a SARS-CoV-2 reference, followed by extraction of alignments using bam2msa. The alignments are used to calculate statistics like TP, FP, and Unique FN. There is also a step for extracting coordinates for each assay aligned on the reference. Arrows indicate the flow between steps.

Figure 2. Simulated PCR output is used to generate the expected amplicons used in the alignment step. Coordinates of each assay’s predicted amplicon from the reference genome are used to get intervals for extracting aligned amplicons from the BAM file. These alignments are converted to MSA objects to calculate.

True Positives were calculated from the total number of accession numbers with results in amplicon data table produced by simulate_PCR (Gardner and Slezak, 2014) after the detection filtering criteria of no less than 90% alignment identity across the length of the primers and probes. False Negatives were the set of accession missing after subtracting the True Positive from the total set. While the input number of genomes was 1,793,877, many genomes could not be aligned across various sections of the reference genome for reasons including assembly errors in the query record and large numbers of ambiguous nucleotide characters such as “N”. This resulted in several instances where the number of True Positive detections based on the estimated amplicons from simulate_PCR was larger than the number of assemblies that could be aligned to a region on the reference that corresponds to a primer set. We therefore reported alignment depth in the whole genome alignment at each location. The number of False Negatives in the alignment indicates many false negative sequences were successfully aligned to the reference genome, despite not passing the filtering criteria described above for detection. Unique False Negatives were the count of unique sequences in the Multiple Sequence Alignment (MSA) that came from genomes in the False Negative set for each assay. To calculate the contribution of “N” characters in the input sequences to the number of unique sequences we filtered out any sequences in the MSA that contained an “N” character and then performed the same calculation for Unique False Negatives above for the remaining sequences. For each assay, the number and percentage of aligned sequences identified as False Negatives was calculated for both the full dataset, as well as the dataset after filtering out aligned sequences with “N”s, acknowledging that filtering of these sequences could result in skewing the lineage-level False Negative predictions.

In silico prediction of melting temperatures

For evaluating the determinants of PCR performance, features were engineered to incorporate information about the impact of mismatches on melting temperature of each oligonucleotide, specifically, with candidate template sequences. To produce these features, primer and probe sequences were aligned with the corresponding experimental template sequence. Using the MeltingTemp module from the Biopython library (Cock et al., 2009) melting temperatures of each oligo as it aligns with the experimental templates was calculated with the GC content melting temperature formula: $(T m = 81.5 + 0.41 (% G C) - 500 / N + 16.6 \times \log [N a +] / (1.0 + 0.7 \times [N a +]))$ , where Na+ is the millimolar concentration of sodium ions, GC is the fractional GC content of the alignment and N is the length of the oligonucleotide (Wetmur, 1991). The “Annealing Temp Change” features were calculated by subtracting the estimated oligonucleotide melting temperature for each probe and template alignment under consideration from the ideal estimated melting temperature, where the oligonucleotides are exact matches for the template.

PCR design and experiments

Template sequences selected for wet lab testing are provided alongside a summary of the test results in Supplementary File 2. Representative false negative (FN) templates and positive control templates were ordered as synthetic DNA oligos (gBlock fragments) from IDT (Coralville, Iowa), and included 20 bps of flanking template sequence on each end of the amplicon sequence. The FN templates were tested at four levels (50, 500, 5000, and 50,000 copies per reaction) in triplicate reactions alongside no template controls (NTCs) and positive controls (PCs) (wild-type with no mismatches). A universal set of reagents and parameters was used for testing all assays, which included TaqPath 1-Step RT-qPCR Master Mix, CG (Thermo Fisher Cat. No. A15299). Primers and probes (IDT, Coralville, Iowa; PrimeTime™ 5’ 6-FAM™/ZEN™/3’ IB^®FQ) were included at final concentrations in the reaction of 900 nM and 250 nM, respectively. These concentrations were selected based on the highest concentrations recommended by the manufacturer for this master mix, because higher concentrations enhance the likelihood of primer and probe binding, meaning higher concentrations are often more permissive of mismatches. The final reaction volume was 20 µL, with 5 µL of template added to 15 µL of master mix. The thermal cycling protocol used was as follows: reverse transcription at 50 °C for 15 minutes, initial denaturation at 95 °C for 2 minutes, and 50 cycles of 95 °C for 3 seconds and 55 °C for 30 seconds, with data collection at the end of each cycle. This protocol corresponds to the manufacturer’s recommendations for this master mix, with two modifications: 1) The number of cycles was increased from 40 to 50 to allow generation of C_t values for templates with suboptimal amplification efficiency due to mismatches or deletions, and 2) The annealing and extension temperature was reduced from 60 °C to 55 °C to be more permissive of mismatches and reflective of annealing/extension temperature recommended for many of the published assays evaluated.

PCR was performed on a Bio-Rad CFX96 real-time PCR instrument followed by analysis using a universal threshold in order to assess mismatched template performance to the wild-type positive control. In addition to qualitative results (detection or no detection), quantitative performance metrics captured included average C_t values at each level tested, amplification efficiency, linear regression coefficient (R²), and y-intercept (the theoretical C_t value that would be obtained for a single copy of template, which can be used as rough indicator of an assay’s anticipated analytical sensitivity).

Results

In silico prediction of false negative results

The overall false negative results predicted for most assays was ≤5%, with three exceptions: China_N, Yale 69/70 del, and Young_S (Figure 2). These three assays had overall false negative rates of 69.7%, 54.1%, and 54.2% respectively. Analysis of false negatives for these assays on a per-lineage basis revealed that the overall high false negative results are contributed to primarily by some lineages which produced very high predicted false negatives (99-100%), while other lineages showed false negatives percentages of<1%. Similarly, some assays with low overall predicted false negative results produced moderately high false negatives for some specific lineages. For example, the Chan_S assay had an overall predicted false negative of just 5%, but for some lineages of interest (e.g., B.1.258) were >30% (Figure 3).

Figure 3

Chart showing percentage of database sequences with predicted false-negative results across various assays for variants of SARS-CoV-2. Rows represent different assays. Columns list variants of SARS-CoV-2. False negative percentages are represented by red bars. Total number of representative sequences for each lineage are displayed below the lineage name. Predicted false negative results for the China_N are noticeably high across all lineages.

Figure 3. Percentage of database sequences predicted to cause false negative results for each assay, with percentages represented by red data bars. Total number of sequences present in the database for each lineage are displayed below the lineage name.

Because of the heterogeneity of predicted false negative percentages across different lineages, we hypothesized that the majority of predicted false negative results might result from non-unique mutations prevalent in specific lineages. This was investigated by aligning database sequences to the assay regions and determining the percentage of unique false negative sequences aligned (after excluding sequences with ambiguous “N” nucleotides). This analysis revealed that for all assays, only a very small proportion (<2%) of the aligned false negative sequences were unique, regardless of whether the assay had low or high overall percentage of false negative sequences in the alignment (Table 1). This may also indicate overrepresentation of specific lineage sequences in the database.

Table 1

Table 1. Results from evaluating the number of unique false negatives for each assay.

Assay performance with FN templates containing mismatches in primer binding sites

To evaluate the impact of mismatches in primer binding sites identified during in silico analyses, we assessed a total of 97 templates that included templates predicted to cause false negatives and the corresponding positive controls (no primer or probe mismatches) for 11 different SARS-CoV-2 assays (Figure 4). Detailed information and performance metrics tables for all assays are provided in Supplementary File 2.

Figure 4

A heat map displaying primer-template specificity and efficiency across multiple assay amplicon sequences. The left panel displays sequences with color-coded primers, while the right panel includes a heatmap indicating copy efficiency and correlation. Mutations in primer binding regions are annotated. A heatmap key is provided at the bottom right.

Figure 4. Alignment of all templates with mutations in primer binding sites, alongside a heat map of in silico predicted changes in primer T_m and in vitro PCR testing metrics (average C_t values at 50, 500, 5,000, and 50,000 copies per reaction; y-intercept; and amplification efficiency). For template sequences, green = primer and probe binding sites; blue = transition mutations; red = transversion mutations; grey = deletions. For ease of visualization, in the assay map, the forward primer is depicted closer to the heat map on the right side.

During in silico analyses, the China_N assay generated the highest overall false negative percentages. This is predominantly due to 3-nucleotide mismatches present at the very beginning (5’ end) of the forward primer (GGG → AAC) common in some lineages (e.g., P.1, P.2, B.1.1.7, and various omicron lineages) that resulted in a<90% match. However, PCR testing with a template containing these 3-nucleotide mismatches (FN4754) produced amplification curves (Figure 5) comparable to those produced by the positive control (PC) template containing no mismatches. The absence of a negative impact (contrary to in silico predictions based on % match), is likely due to; 1) The 3-nucleotide mismatch is located at the 5’ end of the primer and had negligible impact on amplification, and 2) Despite a decrease in predicted T_m of ~7 °C, the predicted T_m with the mismatches (56 °C) is still above the annealing and extension temperature (55 °C) used in this study.

Figure 5

Amplification plots comparing positive control versus a known variant for the China_N assay show that the 3 mismatches at the 5' end of the forward primer do not affect assay performance.

Figure 5. PCR Amplification Plots for the China_N PC template (left) and FN4754 template (right) containing a common 3-nucleotide mismatch in the forward primer region.

Along with the FN4754 template containing the common 3-nucleotide mismatches, 54 additional unique FN templates of the China_N assay were also tested. Summary PCR metrics (amplification efficiency, y-intercept, and average C_t values obtained at each test level) for each of these templates are shown as heat maps (Figure 4), along with the decrease in in silico predicted T_m caused by mismatches in the primer binding sites. Only 9 of the 55 predicted China_N FN templates produced false negative results at the lowest template level tested (50 copies per reaction), though many produced delayed amplification, as evidenced by later C_t values and y-intercepts.

Another one of the three assays that generated high overall false negative percentages during in silico analyses was the Young-S assay. This was primarily due to a high number of sequences containing a 6-nucleotide deletion that is prevalent in many lineages. We tested a total of nine Young-S FN templates, including an FN template (FN5685) with the prevalent 6-nucleotide deletion. Summary PCR metrics (amplification efficiency, y-intercept, and average C_t values obtained at each test level) for each of these templates are shown (Figure 4), along with the decrease in in silico predicted T_m caused by mismatches in the primer binding sites. Four of the nine FN templates tested—including FN5685—produced false negative results at the lowest level tested (50 copies per reaction), and all produced delayed amplification (right C_t shift) relative to the PC. Interestingly, some templates with similarly sized and even larger deletions than FN5685 produced better PCR results. For example, FN5974 contained a larger 9-nucleotide deletion overlapping the predominant 6-nucleotide deletion, but produced earlier amplification than FN5685 and produced no false negative results at the lowest level. This is because the 9-nucleotide deletion in FN5974 in fact results in just three mismatches in the primer-binding region due to similarities in the upstream flanking region, making the 9-nucleotide deletion less impactful in PCR performance than 3-and 6-nucleotide deletions in the same region. This is also evident in the predicted Δ T_m values, with the larger FN5974 deletion causing a Δ T_m of 10 °C, while the smaller FN5685 deletion caused a Δ T_m of 21 °C.

We tested six C4 ORF8 FN templates, each containing up to four mismatches in the reverse primer (Figure 4). None of the tested FN templates produced notably delayed amplification, with maximum C_t shifts of around 1–2 C_t values. We also tested one C3 ORF3a FN template, which contained three mismatches in the forward primer. In contrast to the minimal impact observed for C4 ORF8 templates with up to four mismatches in a primer binding site, the C3 ORF3a FN template with three mismatches caused significant amplification delays, resulting in failed detection below 5,000 copies per reaction and large right C_t shifts (>15 C_t values) at higher test levels. The extreme negative impact is likely attributed to the position of the three mismatches, which are all located within the last four nucleotides of 3’ end of the forward primer.

Additionally, we tested one France_nCoV_IP2 FN template, which contained a 3-nucleotide deletion near the 3’ end of the reverse primer and caused false negative results at all levels tested. We also tested one HKU-Orf1b-nsp14 FN template that contained a 3-nucleotide deletion near the middle of the reverse primer and two Japan_N FN templates with two mismatches each in the reverse primer region, one Young-Orf1ab FN template with three mismatches in the reverse primer region, and one Noblis.40 FN template with three mismatches in the forward primer region. All of these FN templates produced significantly delayed amplification (C_t shifts of 3–8 C_t values), but only the Noblis.40 FN template produced false negative results, and only at the 50 and 500 copies/reaction levels.

Assay performance with FN templates containing mismatches in probe binding sites

To evaluate the impact of mismatches in probe binding region, we tested a total of 56 templates, that comprised of templates predicted to cause false negatives during in silico analyses and the corresponding positive controls (no primer or probe mismatches) for 6 different SARS-CoV-2 assays (Figure 6).

Figure 6

A heat map displaying probe-template specificity and efficiency across multiple assay amplicon sequences. The left panel displays sequences with color-coded probes, while the right panel includes a heatmap indicating copy efficiency and correlation. Mutations in probe binding regions are annotated. A heatmap key is provided at the bottom right.

Figure 6. Alignment of all templates with mutations in probe binding sites, alongside a heat map of in silico predicted changes in primer T_m and in vitro PCR testing metrics (average C_t values at 50, 500, 5,000, and 50,000 copies per reaction; y-intercept; and amplification efficiency). For template sequences, green = primer and probe binding sites; blue = transition mutations; red = transversion mutations; grey = deletions. For ease of visualization, in the assay map, the forward primer is depicted closer to the heat map on the right side.

We tested ten FN templates derived from one assay, C4 ORF8, with deletions in the probe region (Figure 6). This assay demonstrated a surprisingly high tolerance to deletions in the probe binding region. Assay performance was only moderately affected for templates with a deletion of ≤6 nucleotides (of a 26-nucleotide probe binding site), with C_t shifts of ≤5 C_ts and no failed detection (false negatives), even at the lowest template concentration (50 copies/reaction). For the FN template with a 7-nucleotide deletion (FN5859), the C_t shift was more pronounced and the average C_t value at 50 copies/reaction was >40. For the two FN templates with 8-nucleotide deletions (FN6972 and FN5973), no positive results were produced at any of the template levels tested.

Additionally, we tested one ncov_n_gene FN template, which contained a 3-nucleotide deletion in the probe region (Figure 6). This template did not produce false negative results at any levels tested, although it did cause C_t shifts of approximately 3 C_t values at each level tested. We also tested one Young-S FN template that contained a 3-nucleotide deletion in the probe region and three mismatches; this template produced false negative results at all levels tested.

Comparison of the impact of mismatches between primer and probe binding regions

We compared the overall impact of mismatches in primer versus probe binding regions on amplification efficiency and found that mismatches in primer binding regions produced a much broader spectrum of amplification efficiencies, whereas mismatches in probe binding regions had a narrower effect (Figure 7). Probe mismatches were more likely to result in a weaker fluorescent signal rather than reduced amplification efficiency, producing a more binary qualitative result of either detection or no detection depending on the impact. These data collectively suggest that primer and probe binding region mismatches should be considered individually when evaluating the potential for assay failure and false negative results.

Figure 7

Histogram with overlaid density plots showing the distribution of efficiency for two types: Primer (blue) and Probe (orange), Primer and Probe (gray). Efficiency ranges from 0.4 to 1.0 on the x-axis, with count on the y-axis.

Figure 7. A graphical representation comparing the effects of mismatches in primer and probe binding regions on amplification efficiency. The gray portion of the bar graph represents assays containing both primer and probe mismatches. Counts represent the number of templates that fell within each efficiency value grouping.

Variant panel assay that produced unexpected false negative results not predicted by the in silico pipeline

The N501Y mutation is present in many lineages, and we previously developed a triplex panel that included this mutation for detecting variants (Stanhope et al., 2022). This assay is specific for the mutant allele and PCR amplification is seen only with the mutant template and not with the wild-type template. In silico analyses of the presence of N501Y in omicron and its sub lineages indeed showed the presence of the mutation in these lineages (BA1, BA1.1, BA2, BA3, BA4, and BA5). However, this assay produced false negative results in retrospective testing with Omicron clinical specimens. Alignment of the amplicon region for reference sequences from wild-type (A), B.1.1.7, and Omicron BA.1 strains showed that the Omicron variant had mutations in both primer regions: two mutations in the forward primer and one in the reverse primer (Figure 8). However, the in silico inclusivity analysis pipeline did not identify this as a potential false negative, as two mismatches in a 20-nucleotide primer is below the ≤10% mismatch threshold that triggers classification as a false negative.

Figure 8

DNA sequence alignment image comparing wild-type (A_NC_045512) to two variants: B.1.1.7_OK244698 and BA.1_ON466927. Green sections indicate matching sequences, while blue and red highlight mismatches in specific nucleotides. The image shows forward and reverse sequence mismatches, with labels indicating the number of mismatches for each variant.

Figure 8. Alignment of the BVP 501Y amplicon region for A (wild type), B.1.1.7, and BA.1 reference sequences. Green = primer binding site; blue = probe binding site; red = mutation or mismatch.

To investigate whether the false negative result was due to mismatches in the forward or reverse primer binding site, or a combination of both, three synthetic templates were tested: one with all three mutations present in Omicron primer regions (“omi_FN”), one with only the forward primer mismatches (“omi_ForMms”), and one with only the reverse primer mismatch (“omi_RevMms”), as previously shown in Figure 3. The resulting PCR metrics indicated that the two mismatches in the forward primer are responsible for the delayed amplification and resulting false negative results (see Table 2). The reverse primer mismatch has little effect, despite being located near the 3’ end.

Table 2

Table 2. Average C_t values obtained with the original 501Y assay and the 501Y-Omicron assay, demonstrating significant adverse impacts of two forward primer mismatches and minimal impacts of the one reverse primer mismatch.

Testing of Alternative 501Y Primers

We designed new primers to specifically detect the N501Y mutation in Omicron strains. When aligned to the B.1.1.7 sequence, the new assay (“501Y-omi”) contained two mismatches in the forward primer region and one mismatch in the reverse primer region (Figure 4). This assay was tested with the same set of four templates previously tested with the original 501Y assay (Table 2). The new assay produced good results with the perfectly matched Omicron template and the template containing just one mismatch in the reverse primer but produced a notable C_t shift (approximately 4–6 C_t values) when used with the B.1.1.7-derived PC for the original assay and the other template containing two mismatches in the forward primer region. Similar to results obtained for the original BVP 501Y assay, results for the 501Y-omi assay confirmed that having just two mismatches in the forward primer region was significantly detrimental to assay sensitivity, despite being below the ≤10% mismatch threshold used to identify potential false negatives in silico, and that the mismatch in the reverse primer had negligible impact, despite being just one nucleotide from the 3’ end.

As an additional option for detection of the N501Y mutation, we designed primers containing mixed bases at the three nucleotide locations in the original primer set that were mismatched with the Omicron sequence. Mixed-base primers allowed comparable detection of both Omicron and the original variants (represented by B.1.1.7), whereas the Omicron-based primers (501Y-omi) allowed strong detection of Omicron but delayed (though not eliminated) detection of B.1.1.7 (Figure 9). Based on these results, we concluded that mixed-base primers could be used for broader detection of the 501Y mutation in both Omicron and non-Omicron variants.

Figure 9

Four amplification graphs comparing 501Y mixed based primers with Omicron-based primers at various concentrations ranging from 50 copies to 50,000 copies per reaction. The use of mixed-base primers more broadly detect the 501Y mutation indicated by amplification of B.1.1.7 variant compared to amplification with Omicron-specific primers. The key indicates circles for Omicron gRNA and crosses for B.1.1.7 gRNA.

Figure 9. PCR amplification plots for the 501Y mixed base primers and Omicron-based primers (501Y-omi) when tested with gRNA from Omicron and B.1.1.7 strains. Mixed base primers produced comparable results for both variant strains (top left), whereas Omicron-based primers produced strong detection of Omicron gRNA, but significantly delayed amplification of B.1.1.7 gRNA that resulted in failed detection at the lowest test level (top right). Likewise, both primer sets produced comparable results for Omicron gRNA (bottom left), whereas only the mixed base primers produced strong detection of B.1.1.1 gRNA (bottom right).

Assays with known failures in clinical testing

We tested three assays known to produce negative results with some variants in clinical testing: the Yale 69/70 del assay (which is known to produces negative results for variants containing the S:60/90 deletion), the CDC_N2 assay, and the Japan_N2 assay.

Yale 69/70 del (SGTF) assay

In December 2020, the emergence and rapid spread of the Alpha variant was witnessed by the entire global community. Among the many mutations in Alpha variant, a deletion of 6 base pairs in the spike gene resulting in the deletion of two amino acids at positions 69–70 was responsible for some commercial testing kits—e.g., the Thermo Fisher TaqPath COVID-19 assay—producing false negative results in samples containing SARS-CoV-2 Alpha variant. Extensive genome sequencing data confirmed that the widespread S gene target failure (SGTF) phenomenon was primarily due to the new Alpha variant (Public Health England, 2020). The primer and probe sequences of the commercial TaqPath assay are not published; however, the Grubaugh lab published an assay that recapitulates the assay, designated here as the Yale 69/70 del assay (Vogels et al., 2021). Of note, the 6-nucleotide deletion that produces negative results with the Yale 69/70 del assay is the same deletion responsible for the many false negative results predicted for the Young-S assay, but the location of the deletion relative to the primer and probe binding sites is different. With the Yale 69/70 del assay, the deletion is located in the probe region, whereas with the Young-S assay, the deletion is located in the forward primer region (Figure 10). As previously discussed, this deletion caused delayed amplification for the Young-S assay and produced false negative results at the lowest level tested (50 copies per reaction).

Figure 10

Diagram of variant B.1.1.7 with a 6-nucleotide deletion in the S gene that causes delayed amplification for the Young-S assay (green arrows) and false-negative results for the Yale_69-70del assay (blue arrows). The deletion is within the forward primer binding site for the Young-S assay, and in the probe binding site for the Yale_69-70del assay.

Figure 10. Alignment of the S:Δ69/70 genomic region showing location of the Yale 69/70 del assay sequences (green) and Young-S assay sequences (blue) relative to the 6-nucleotide deletion (represented by a break in the reference line).

We tested the Yale 69/70 del assay with five templates representing sequences from five different variant lineages: A (wild type), B.1.1.7, BA.1, BA1.1, and BA.5. All lineages with the common 6-nucleotide deletion in the probe binding site produced negative results at all levels tested, as previously shown (Figure 6).

Using white, fluorescence-focusing PCR plates, we were able to detect dim (low RFU) amplification curves for the B.1.1.7 template (Figure 11) with Ct values similar to the wild-type template (Figure 11), but this would most likely result in false negative results in routine testing of clinical specimens. When plotted together with the positive wild-type template to provide additional context, the B.1.1.7 template amplification is indiscernible from the baseline (Figure 11) due to inefficient binding of the probe.

Figure 11

Amplification plots comparing Yale_69-70del assay performance for wild-type template and B.1.1.7 template, where the deletion in the probe-binding region leads to sharply reduced relative fluorescence units (RFU) compared to wild-type.

Figure 11. PCR amplification plots for the Yale 69/70 del assay showing results for just the Wild-type template or the B.1.1.7 (with the deletion) template or plotted together for both templates for additional context given the significantly reduced RFU (see the y-axis scale) for the B.1.1.7 template. Red = 50,000 copies/reaction; orange = 5,000 copies/reaction; grey = 500 copies/reaction; green = 50 copies/reaction.

CDC_N2 and Japan_N2 assays

We tested two additional assays that were producing false negative results in clinical sample testing (Fox-Lewis et al., 2021; Rajib et al., 2022): the CDC_N2 assay and the Japan_N2 assay with mutated probe binding sites. Mutant template tested and the corresponding qPCR results are shown above (Figure 6). In contrast to results reported with clinical samples, none of these templates produced false negative results even at the lowest level tested (50 copies/reaction) or exhibited a severe decrease in assay performance (C_t difference > 3) using our test method. Differences in master mix reagents, primer and probe concentrations, cycling protocols, instrumentation, and other factors may account for the difference in reported impact of these mutations in clinical specimens versus the lack of impact observed in our analytical testing using synthetic templates.

RNA templates

The majority of the foregoing work was performed using synthetic double stranded DNA templates. In order to assess the true impact of mismatches in an assay format where a reverse transcription step is included, we also synthesized and tested RNA templates. A subset of China_N and Young_S templates that demonstrated significantly delayed amplification (y-intercept > 46 or not determined due to amplification failures) during initial testing with DNA templates were synthesized and tested as RNA templates (RNA Ultramers from IDT, Coralville, Iowa) alongside corresponding PC RNA templates. In cases where the primary issue identified by in silico analysis was related to a primer mismatch or deletion (as opposed to an issue with the probe), RNA templates typically produced better results than the corresponding DNA templates, as evidenced by lower C_t values, lower y-intercepts, and/or lower limit of detection (Figure 12). For the one template (FN5760) with a probe issue rather than a primer issue, no difference in performance was observed between the DNA and RNA templates; both template types produced false negative results at all levels tested.

Figure 12

A heat map comparing performance of RNA templates to DNA templates that demonstrated significantly delayed amplification for the Young-S and China_N assays. A key on the right provides a color scale for tm_delta, cycles to threshold, and efficiency. RNA templates typically produced better results than the corresponding DNA templates.

Figure 12. Comparison of results for DNA and RNA templates in a heat map showing in silico predicted changes in primer and probe T_m and in vitro PCR testing metrics (average C_t values at 50, 500, 5,000, and 50,000 copies per reaction; y-intercept; and amplification efficiency).

These results underscore the importance of considering template type (DNA or RNA) in assessing the potential impact of mismatches, especially in primer binding regions. The finding that mismatches in primer binding regions have less of an adverse impact on assay efficiency when the template is RNA is likely because RNA templates are initially reverse transcribed prior to amplification during qPCR. Reverse transcription is performed at a lower temperature that is more permissive of binding to templates with mismatches. For single-stranded RNA templates, whether the mismatch is located on the forward or reverse primer may also affect the impact of mismatches, as only one primer is involved in reverse transcription. Furthermore, if reverse transcription successfully occurs from the region with adverse mismatches, the mismatches will be “corrected” by incorporation of the primer sequence in the cDNA produced by reverse transcription.

Discussion

In silico analysis is a widely used and a critical tool to evaluate inclusivity and exclusivity of PCR assays. It is an especially valuable tool for PCR assays designed to detect or diagnose widespread and rapidly evolving pathogens of global concern, such as the SARS-CoV-2 virus. In an earlier study, we reported in silico prediction of assay performance and impact of mutations (signature erosion) in Ebolavirus molecular assays (Sozhamannan et al., 2015). A number of recent studies have performed in silico analyses of SARS-CoV-2 diagnostic assays using genome sequences (Khan and Cheung, 2020; Mentes et al., 2022; Negrón et al., 2022; Rana et al., 2022). These studies have taken only the alignment-based mismatches between primers and probe sequences and not any other PCR parameters into consideration for determining the impact. The accepted rule is that mismatches are detrimental because they alter the primer/probe-template hybridization temperatures and interaction kinetics. An in silico approach based on thermodynamic evaluation of the impact of DNA mismatches in PCR-type SARS-CoV-2 primers and probes has been done in one study (Miranda and Weber, 2021). These authors performed a more quantitative assessment of the mismatches on assay performance by considering mismatched hybridization temperature within a range of 5 °C to the fully matched reference temperature. Based on their analyses they recommended to consider mismatch hybridization for the design of primers whenever possible, especially to avoid undesired cross-reactivity. Our study also indicates that Δ T_m is a reliable factor that impacts assay performance among various metrics of real time PCR (e.g., efficiency, y-intercept, and C_t shift).

Routine assessment of performance in vitro or in clinical samples is not feasible in an ongoing outbreak due to delays in real time sample acquisition. In these scenarios, synthetic templates may help, but an improved algorithm may be even more helpful to quickly evaluate assay failures in a changing pathogen genomic profile. In order to achieve that goal, empirical data based on an array of assays, mismatches and varying positions with respect to priming site (3’) and different mismatch types (SNPs and indels) are needed to train a model and incorporate useful features. In this study, we conducted an extensive study on a variety of templates to gather such data using a universal set of test conditions (same master mix, primer and probe concentrations, cycling protocol, and instrumentation) rather than the unique optimized conditions developed and validated for each of the individual assays. It is important to note that we chose this study design with the sole intent of gathering broadly applicable feedback on in silico test parameters in a manner that excludes (to the greatest extent possible) the variables inherent in differing or optimized wet lab test conditions. This study was not intended to evaluate clinical test performance of the individual assays evaluated and does not account for variables like assay-specific test conditions, clinical matrix effects, the differences between genomic and synthetic templates, and the use of multiplexed assays to enhance robustness against genetic drift and mutations. While we believe this study design approach is ideal for assay-agnostic in silico tool development, we also recognize the limitations of this approach regarding extrapolation to clinical assay performance, and we believe assay-specific test conditions should be incorporated whenever possible into both in silico monitoring and wet lab performance testing of established assays.

Over 16 million SARS-CoV-2 sequences have been generated spanning the entire gamut of variants that emerged during the pandemic. We looked at the unique variation permutations across the assay signature sequences and found that unique FNs were extremely low (<0.1 -1.5%) across the 16 assays examined in this study. Among these FNs, we evaluated the laboratory performance of over 200 templates predicted in silico (or reported in the literature) to cause false negative results. We found that in some cases, in silico analysis accurately predicted failures, but in many instances, templates that were predicted to produce false negative results instead produced PCR results comparable to templates without any mutations, or caused only minor right shifts in C_t values, indicating that the assays are more robust to genetic drift than anticipated (or at least can be more robust than predicted, depending on the specific wet lab conditions used). Conversely, we also identified cases in which negative, or significantly delayed amplification was observed for templates not identified by in silico analysis as potentially problematic templates. In addition, assays that failed in clinical sample testing performed well without any reduction in efficiency in analytical testing with synthetic templates.

This work underscores the importance of performing laboratory testing to confirm or refute in silico predictions. It also highlights the need for a better understanding of the factors contributing to whether a given set of mutations will or will not significantly impact assay performance, which in turn could lead to the development of more accurate in silico analysis pipelines. To that end, the results obtained in this study suggest that some of the potentially important features to consider include, but are not limited to: 1) Whether the mutations are located in primer or probe binding sites, as that directly impacts the mechanism of potential failure (failed or delayed amplification caused by primer mismatches, versus weak fluorescent signal caused by inefficient probe hybridization), 2) Location of the mutations (i.e., distance from the 5’ vs 3’ end of primers), 3) Type of template (DNA or RNA), 4) Impact of the mutations on melting temperature and hybridization kinetics, and 5) Specifics of the test method being used (e.g., master mix components, primer and probe concentrations, thermal cycling protocol, and PCR instruments used). Based on the data, we have generated a random forest model to assess the impact of mutations on assay performance (Knight et al., 2025). In future work, we will test the model predictions with new designs of an existing poorly performing assay to assess the performance improvement.

Understanding the impact of mismatches on PCR performance may aid in tweaking the in silico approaches of building a robust model to better predict assay failures, and may circumvent the need for expansive and expensive wet lab testing. Increased understanding of the impact of mutations in primer and probe binding regions as well as in other parts of the amplicon can not only improve the prediction models used to assess mutation impact during assay design, but also during real world testing of a designed assay in clinical samples containing a variant of the pathogen with specific mutations in the assay signatures. Our future studies will focus on incorporation of data from testing templates with systematically selected or introduced mutations to capture a more comprehensive dataset for inclusion in our in silico prediction model, and on more direct validation of our model predictions in analytical and clinical testing scenarios of new assay designs based on the model.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

TO: Data curation, Investigation, Methodology, Validation, Visualization, Writing – review & editing. BK: Data curation, Investigation, Methodology, Validation, Visualization, Writing – review & editing. MC: Data curation, Investigation, Methodology, Validation, Writing – review & editing. JS: Conceptualization, Methodology, Formal analysis, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization, Supervision, Project administration. PD: Formal analysis, Software, Validation, Visualization, Writing – review & editing. BN: Funding acquisition, Project administration, Writing – review & editing. PC: Conceptualization, Funding acquisition, Resources, Supervision, Writing – review & editing. SS: Conceptualization, Formal analysis, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare financial support was received for the research and/or publication of this article. This study was funded by the Coronavirus Aid, Relief, and Economic Security (CARES) Act and conducted by the Department of Defense’s Joint Program Executive Office for Chemical, Biological, Radiological and Nuclear Defense (JPEO-CBRND) in collaboration with the Defense Health Agency under the Contract (FA807518D0017) with MRIGlobal. Work performed by FDA was funded by the FDA Medical Countermeasures Initiative Regulatory Science Program.

Acknowledgments

We gratefully acknowledge all data contributors, i.e., the Authors and their Originating laboratories responsible for obtaining the specimens, and their Submitting laboratories for generating the genetic sequence and metadata and sharing via the GISAID Initiative, on which this research is based.

Conflict of interest

Author SS was employed by the company Joint Research and Development, LLC.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Author disclaimers

The content of this publication are the authors’ opinions and do not necessarily reflect the views or policies of the U.S. Department of Defense, the U.S. Department of the Army, or the institutions and companies affiliated with the authors, nor does the mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. The author (s) reports no potential conflicts of interest. Some authors of this work are military service members or federal/contracted employees of the United States government. This work was prepared as part of their official duties. Title 17 U.S.C. 105 provides that copyright protection under this title is not available for any work of the United States Government. Title 17 U.S.C. 101 defines a U.S. Government work as work prepared by a military service member or employee of the U.S. Government as part of that person’s official duties.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcimb.2025.1524025/full#supplementary-material

Footnotes

References

Bonner, T. I., Brenner, D. J., Neufeld, B. R., and Britten, R. J. (1973). Reduction in the rate of DNA reassociation by sequence divergence. J. Mol. Biol. 81, 123–135. doi: 10.1016/0022-2836(73)90184-8

PubMed Abstract | Crossref Full Text | Google Scholar

Boyle, B., Dallaire, N., and MacKay, J. (2009). Evaluation of the impact of single nucleotide polymorphisms and primer mismatches on quantitative PCR. BMC Biotechnol. 9, 75. doi: 10.1186/1472-6750-9-75

PubMed Abstract | Crossref Full Text | Google Scholar

Cao, L., Xu, T., Liu, X., Ji, Y., Huang, S., Peng, H., et al. (2022). The impact of accumulated mutations in SARS-coV-2 variants on the qPCR detection efficiency. Front. Cell Infect. Microbiol. 12, 823306. doi: 10.3389/fcimb.2022.823306

PubMed Abstract | Crossref Full Text | Google Scholar

Chang, C. T., Hain, T. C., Hutton, J. R., and Wetmur, J. G. (1974). Effects of microscopic and macroscopic viscosity on the rate of renaturation of DNA. Biopolymers 13, 1847–1858. doi: 10.1002/bip.1974.360130915

PubMed Abstract | Crossref Full Text | Google Scholar

Chen, Y., Han, Y., Yang, J., Ma, Y., Li, J., and Zhang, R. (2022). Impact of SARS-coV-2 variants on the analytical sensitivity of rRT-PCR assays. J. Clin. Microbiol. 60, e0237421. doi: 10.1128/jcm.02374-21

PubMed Abstract | Crossref Full Text | Google Scholar

Christopherson, C., Sninsky, J., and Kwok, S. (1997). The effects of internal primer-template mismatches on RT-PCR: HIV-1 model studies. Nucleic Acids Res. 25, 654–658. doi: 10.1093/nar/25.3.654

PubMed Abstract | Crossref Full Text | Google Scholar

Cock, P. J., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., et al. (2009). Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423. doi: 10.1093/bioinformatics/btp163

PubMed Abstract | Crossref Full Text | Google Scholar

Davies, N. G., Abbott, S., Barnard, R. C., Jarvis, C. I., Kucharski, A. J., Munday, J. D., et al. (2021). Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England. Science 372, eabg3055. doi: 10.1126/science.abg3055

PubMed Abstract | Crossref Full Text | Google Scholar

Eun, H.-M. (1996). Enzymology primer for recombinant DNA technology (San Diego: Academic Press), 702.

Google Scholar

Fox-Lewis, S., Fox-Lewis, A., Harrower, J., Chen, R., Wang, J., De Ligt, J., et al. (2021). Lack of N2-gene amplification on the Cepheid Xpert Xpress SARS-CoV-2 assay and potential novel causative mutations: A case series from Auckland, New Zealand. IDCases. 25. Netherlands: © 2021 The Authors (Amsterdam, Netherlands: Elsevier).

Google Scholar

Gardner, S. N. and Slezak, T. (2014). Simulate_PCR for amplicon prediction and annotation from multiplex, degenerate primers and probes. BMC Bioinf. 15, 237. doi: 10.1186/1471-2105-15-237

PubMed Abstract | Crossref Full Text | Google Scholar

Hatcher, E. L., Zhdanov, S. A., Bao, Y., Blinkova, O., Nawrocki, E. P., Ostapchuck, Y., et al. (2017). Virus Variation Resource - improved response to emergent viral outbreaks. Nucleic Acids Res. 45, D482–Dd90. doi: 10.1093/nar/gkw1065

PubMed Abstract | Crossref Full Text | Google Scholar

Howson, E. L. A., Orton, R. J., Mioulet, V., Lembo, T., King, D. P., and Fowler, V. L. (2020). GoPrime: development of an in silico framework to predict the performance of real-time PCR primers and probes using foot-and-mouth disease virus as a model. Pathogens 9 (4), 303. doi: 10.3390/pathogens9040303

PubMed Abstract | Crossref Full Text | Google Scholar

John, G., Sahajpal, N. S., Mondal, A. K., Ananth, S., Williams, C., Chaubey, A., et al. (2021). Next-generation sequencing (NGS) in COVID-19: A tool for SARS-coV-2 diagnosis, monitoring new strains and phylodynamic modeling in molecular epidemiology. Curr. Issues Mol. Biol. 43, 845–867. doi: 10.3390/cimb43020061

PubMed Abstract | Crossref Full Text | Google Scholar

Khan, K. A. and Cheung, P. (2020). Presence of mismatches between diagnostic PCR assays and coronavirus SARS-CoV-2 genome. R. Soc. Open Sci. 7, 200636. doi: 10.1098/rsos.200636

PubMed Abstract | Crossref Full Text | Google Scholar

Khare, S., Gurry, C., Freitas, L., Schultz, M. B., Bach, G., Diallo, A., et al. (2021). GISAID's role in pandemic response. China CDC. Wkly. 3, 1049–1051. doi: 10.46234/ccdcw2021.255

PubMed Abstract | Crossref Full Text | Google Scholar

Knight, B., Otwell, T., Coryell, M., Stone, J., Davis, P., Necciai, B., et al. (2025). Using Machine learning models to predict the impact of template mismatches on polymerase chain reaction (PCR) assay performance. Sci. Rep. 15, 16184. doi: 10.1038/s41598-025-98444-8

PubMed Abstract | Crossref Full Text | Google Scholar

Lefever, S., Pattyn, F., Hellemans, J., and Vandesompele, J. (2013). Single-nucleotide polymorphisms and other mismatches reduce performance of quantitative PCR assays. Clin. Chem. 59, 1470–1480. doi: 10.1373/clinchem.2013.203653

PubMed Abstract | Crossref Full Text | Google Scholar

Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100. doi: 10.1093/bioinformatics/bty191

PubMed Abstract | Crossref Full Text | Google Scholar

Mentes, A., Papp, K., Visontai, D., Stéger, J., Csabai, I., Medgyes-Horváth, A., et al. (2022). Identification of mutations in SARS-CoV-2 PCR primer regions. Sci. Rep. 12, 18651. doi: 10.1038/s41598-022-21953-3

PubMed Abstract | Crossref Full Text | Google Scholar

Miranda, P. and Weber, G. (2021). Thermodynamic evaluation of the impact of DNA mismatches in PCR-type SARS-CoV-2 primers and probes. Mol. Cell Probes 56, 101707. doi: 10.1016/j.mcp.2021.101707

PubMed Abstract | Crossref Full Text | Google Scholar

Negrón, D. A., Kang, J., Mitchell, S., Holland, M. Y., Wist, S., Voss, J., et al. (2022). Impact of SARS-coV-2 mutations on PCR assay sequence alignment. Front. Public Health 10, 889973. doi: 10.3389/fpubh.2022.889973

PubMed Abstract | Crossref Full Text | Google Scholar

Nozari, G., Rahbar, S., and Wallace, R. B. (1986). Discrimination among the transcripts of the allelic human beta-globin genes beta A, beta S and beta C using oligodeoxynucleotide hybridization probes. Gene 43, 23–28. doi: 10.1016/0378-1119(86)90004-1

PubMed Abstract | Crossref Full Text | Google Scholar

Persson, S., Karlsson, M., Borsch-Reniers, H., Ellström, P., Eriksson, R., and Simonsson, M. (2019). Missing the match might not cost you the game: primer-template mismatches studied in different hepatitis A virus variants. Food Environ. Virol. 11, 297–308. doi: 10.1007/s12560-019-09387-z

PubMed Abstract | Crossref Full Text | Google Scholar

Public Health England. (2020). Investigation of novel SARS-CoV-2 variant 202012/01: technical briefing 1. Available online at: https://www.gov.uk/government/publications/investigation-of-novel-sars-cov-2-variant-variant-of-concern-202012012020 (Accessed January 30, 2021).

Google Scholar

Rajib, S. A., Ogi, Y., Hossain, M. B., Ikeda, T., Tanaka, E., Kawaguchi, T., et al. (2022). A SARS-CoV-2 Delta variant containing mutation in the probe binding region used for RT-qPCR test in Japan exhibited atypical PCR amplification and might induce false negative result. J. Infect. Chemother. 28, 669–677. doi: 10.1016/j.jiac.2022.01.019

PubMed Abstract | Crossref Full Text | Google Scholar

Rambaut, A., Holmes, E. C., O'Toole, Á, Hill, V., McCrone, J. T., Ruis, C., et al. (2020). A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology. Nat. Microbiol. 5, 1403–1407. doi: 10.1038/s41564-020-0770-5

PubMed Abstract | Crossref Full Text | Google Scholar

Rana, D. R., Pokhrel, N., and Dulal, S. (2022). Rational primer and probe construction in PCR-based assays for the efficient diagnosis of drifting variants of SARS-coV-2. Adv. Virol. 2022, 2965666. doi: 10.1155/2022/2965666

PubMed Abstract | Crossref Full Text | Google Scholar

So, K. Y. K., Fong, J. J., Lam, I. P. Y., and Dudgeon, D. (2020). Pitfalls during in silico prediction of primer specificity for eDNA surveillance. Ecosphere 11, e03193. doi: 10.1002/ecs2.3193

Crossref Full Text | Google Scholar

Sozhamannan, S., Holland, M. Y., Hall, A. T., Negrón, D. A., Ivancich, M., Koehler, J. W., et al. (2015). Evaluation of signature erosion in ebola virus due to genomic drift and its impact on the performance of diagnostic assays. Viruses 7, 3130–3154. doi: 10.3390/v7062763

PubMed Abstract | Crossref Full Text | Google Scholar

Stadhouders, R., Pas, S. D., Anber, J., Voermans, J., Mes, T. H., and Schutten, M. (2010). The effect of primer-template mismatches on the detection and quantification of nucleic acids using the 5' nuclease assay. J. Mol. Diagn. 12, 109–117. doi: 10.2353/jmoldx.2010.090035

PubMed Abstract | Crossref Full Text | Google Scholar

Stanhope, B. J., Peterson, B., Knight, B., Decadiz, R. N., Pan, R., Davis, P., et al. (2022). Development, testing and validation of a SARS-CoV-2 multiplex panel for detection of the five major variants of concern on a portable PCR platform. Front. Public Health 10, 1042647. doi: 10.3389/fpubh.2022.1042647

PubMed Abstract | Crossref Full Text | Google Scholar

Storey, N., Brown, J. R., Pereira, R. P. A., O'Sullivan, D. M., Huggett, J. F., Williams, R., et al. (2021). Single base mutations in the nucleocapsid gene of SARS-CoV-2 affects amplification efficiency of sequence variants and may lead to assay failure. J. Clin. Virol. Plus. 1, 100037. doi: 10.1016/j.jcvp.2021.100037

PubMed Abstract | Crossref Full Text | Google Scholar

Süss, B., Flekna, G., Wagner, M., and Hein, I. (2009). Studying the effect of single mismatches in primer and probe binding regions on amplification curves and quantification in real-time PCR. J. Microbiol. Methods 76, 316–319. doi: 10.1016/j.mimet.2008.12.003

PubMed Abstract | Crossref Full Text | Google Scholar

United States Food and Drug Administration In Vitro Diagnostics EUAs - Molecular Diagnostic Tests for SARS-CoV-2. Available online at: https://www.fda.gov/medical-devices/covid-19-emergency-use-authorizations-medical-devices/in-vitro-diagnostics-euas-molecular-diagnostic-tests-sars-cov-2 (Accessed April 17, 2024).

Google Scholar

Vogels, C. B. F., Breban, M. I., Ott, I. M., Alpert, T., Petrone, M. E., Watkins, A. E., et al. (2021). Multiplex qPCR discriminates variants of concern to enhance global surveillance of SARS-CoV-2. PloS Biol. 19, e3001236. doi: 10.1371/journal.pbio.3001236

PubMed Abstract | Crossref Full Text | Google Scholar

Vogels, C. B. F., Brito, A. F., Wyllie, A. L., Fauver, J. R., Ott, I. M., Kalinich, C. C., et al. (2020). Analytical sensitivity and efficiency comparisons of SARS-CoV-2 RT-qPCR primer-probe sets. Nat. Microbiol. 5, 1299–1305. doi: 10.1038/s41564-020-0761-6

PubMed Abstract | Crossref Full Text | Google Scholar

Wallace, R. B., Shaffer, J., Murphy, R. F., Bonner, J., Hirose, T., and Itakura, K. (1979). Hybridization of synthetic oligodeoxyribonucleotides to phi chi 174 DNA: the effect of single base pair mismatch. Nucleic Acids Res. 6, 3543–3557. doi: 10.1093/nar/6.11.3543

PubMed Abstract | Crossref Full Text | Google Scholar

Wetmur, J. G. (1991). DNA probes: applications of the principles of nucleic acid hybridization. Crit. Rev. Biochem. Mol. Biol. 26, 227–259. doi: 10.3109/10409239109114069

PubMed Abstract | Crossref Full Text | Google Scholar

Wright, E. S., Yilmaz, L. S., Ram, S., Gasser, J. M., Harrington, G. W., and Noguera, D. R. (2014). Exploiting extension bias in polymerase chain reaction to improve primer specificity in ensembles of nearly identical DNA templates. Environ. Microbiol. 16, 1354–1365. doi: 10.1111/1462-2920.12259

PubMed Abstract | Crossref Full Text | Google Scholar

Wu, J. H., Hong, P. Y., and Liu, W. T. (2009). Quantitative effects of position and type of single mismatch on single base primer extension. J. Microbiol. Methods 77, 267–275. doi: 10.1016/j.mimet.2009.03.001

PubMed Abstract | Crossref Full Text | Google Scholar

Wu, F., Oghuan, J., Gitter, A., Mena, K. D., and Brown, E. L. (2023). Wide mismatches in the sequences of primers and probes for monkeypox virus diagnostic assays. J. Med. Virol. 95, e28395. doi: 10.1002/jmv.28395

PubMed Abstract | Crossref Full Text | Google Scholar

Keywords: signature erosion, real time PCR, qPCR, SARS-CoV-2, PCR efficiency, in silico prediction, false negative (FN), wet lab testing

Citation: Otwell T, Knight B, Coryell M, Stone J, Davis P, Necciai B, Carlson P and Sozhamannan S (2025) Reality check: testing the in silico predictions of false negative results due to mutations in SARS-CoV-2 PCR assays using templates with mismatches in vitro. Front. Cell. Infect. Microbiol. 15:1524025. doi: 10.3389/fcimb.2025.1524025

Received: 26 November 2024; Accepted: 07 October 2025;
Published: 31 October 2025.

Edited by:

Jadranka Milosevic, Captis Diagnostics Inc, United States

Reviewed by:

Michelle M Tabb, DiaSorin Molecular LLC, United States
Brahmchetna Bedi, Emory University, United States
David W. Ussery, Oklahoma State University, United States
Nathan A Tanner, New England Biolabs, United States

Copyright © 2025 Otwell, Knight, Coryell, Stone, Davis, Necciai, Carlson and Sozhamannan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Brittany Knight, YmtuaWdodEBtcmlnbG9iYWwub3Jn; Shanmuga Sozhamannan, c2hhbm11Z2Euc296aGFtYW5uYW4uY3RyQGFybXkubWls

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.