- 1 South Carolina Department of Public Health, Columbia, SC, United States
- 2 South Carolina First Steps, Columbia, SC, United States
Whole genome sequencing (WGS) is the gold standard for identifying emerging variants during epidemics but is resource intensive. Traditionally, a low RT-qPCR cycle threshold (Ct) is used to select samples with presumed high viral loads, but whether better alternatives exist is unclear. This study introduces and evaluates a Ct-independent method, the SCQC-Plus (SCQC+) approach, which combines enhanced library preparation with agarose gel-based quality control to select samples for sequencing. From June 2022 through December 2024, over 1,800 SARS-CoV-2 positive clinical samples were sequenced at the state public health laboratory and studied in two phases, one retrospective and one prospective. In the first phase, when all PCR positive samples received by the laboratory were sequenced, we simulated the impact of two Ct-restriction thresholds (Ct < 28 and Ct < 30) by excluding samples at or above each threshold from the data. In the prospective phase, we tested the effect of three selection strategies on sequencing efficiency: Seq-All, Ct < 30, and SCQC+. Lastly, we compared the variants captured by a centralized state public health laboratory with those captured by commercial and clinical laboratories in the state. Results from the retrospective study suggested that a Ct restriction of 30 was cost effective but missed variants in circulation. Prospectively, we found that the SCQC+ approach had cost effectiveness comparable to that of the Ct-restricted approach. Notably, the SCQC+ approach halved the fail rate for samples with Ct over 30, resulting in the sequencing of two variants not found among samples with Ct under 30. Finally, commercial and clinical laboratories in the state detected unique variants that were not detected in the sampling of the state public health laboratory. This observation suggests that maintaining such partnerships is important to public health because it enables a timely and comprehensive variant surveillance program.
The goal of the sequencing program can impact the cost effectiveness of different approaches for sample selection. When the goal is the early detection of emerging or rare variants of concern prior to wide dispersal into the population, we propose a combination of the SCQC+ approach internally and partnership with in-state commercial and clinical laboratories, externally, as important requirements for achieving that goal.
1 Introduction
Real time variant surveillance by whole genome sequencing (WGS) has increasingly become an essential tool for the public health response to epidemics such as Zika, Ebola, SARS-CoV-2, and mpox (1). SARS-CoV-2 variants with mutations impacting clinical severity, transmissibility, diagnostics, therapeutics, or vaccine efficacy were identified and designated as variants of interest (VOI) or variants of concern (VOC) and prioritized for molecular surveillance (2). WGS can enable data-driven decision making to limit and control morbidity and mortality in a population during an epidemic.
While WGS is the gold standard for identification of emerging variants, it is resource intensive, making the sequencing of most SARS-CoV-2 positive samples impractical, even for high income countries. As such, unbiased sampling of the population of interest and sequencing of samples as efficiently as possible are required to detect VOCs/VOIs or monitor their prevalence in the community. Traditionally, a low PCR cycle threshold (Ct), typically less than 30, is used to select samples with presumed high viral loads and culturable virus (Ct-Restricted Approach) and reflex them for sequencing (3–6), but the impact this selection policy may have on the detection of rare variants remains unclear. This study introduces a Ct-independent method, the SCQC+ approach, which combines enhanced library preparation and agarose gel-based quality control with the goal of improving cost effectiveness while maintaining comprehensive variant detection.
Here we present a case report evaluating the cost-effectiveness of different approaches to selecting samples for sequencing and explore the contribution afforded by commercial labs to variant surveillance provided by a state public health laboratory.
2 Context (setting and population)
State public health agencies are uniquely positioned at the front lines of epidemic response. A state public health system can be centralized, decentralized, or hybrid. South Carolina is a state with a population of 5.4 million and represents one of 14 states or territories in the U.S. with a centralized public health system (7). As such, South Carolina’s local health departments, scattered throughout the state (Figure 1A), belong to the same organization, the South Carolina Department of Public Health (DPH), and enable the statewide collection and shipping of surveillance samples to the South Carolina public health laboratory (SCPHL) (Figure 1B), a bureau within DPH. Moreover, sequencing results for SARS-CoV-2 positive cases from other laboratories are also reported to DPH.
Figure 1. South Carolina has a centralized public health system that enables sampling of clinical cases throughout the state. (A) As a state with a centralized public health system, all local health department offices throughout the state, indicated by gray icons, are joined together in one surveillance network, with sequencing of samples carried out centrally at the SCPHL, as indicated by the blue building icon. (B) An overview of the sequencing effort of the state public health laboratory, in red, versus the total number of sequences from other sources in the state, in blue. The shaded area indicates the period of study for this publication. (C) An overview of the study design and its two phases aimed at determining a cost-effective approach for real time variant surveillance during a public health emergency.
To detect existing and emerging variants of SARS-CoV-2, SCPHL conducts random sampling of PCR positive samples throughout the state. To sequence samples, sample RNA was first converted to cDNA using the New England Biolabs LunaScript RT SuperMix Kit. Enrichment and amplification of the cDNA library were done using ARTIC primers and New England Biolabs Q5 Hot Start High-Fidelity 2X Master Mix. For library preparation, Illumina’s DNA Prep Kit was used to perform tagmentation and post-PCR clean-up for the selection of optimal DNA fragments. The DNA Prep Kit uses Bead-Linked Transposomes that bind to a limited volume of cDNA, selecting for size-specific DNA fragments. The result is a pooled sample of ideal-sized DNA fragments. Libraries were sequenced on Illumina MiniSeq or MiSeq platforms, producing FASTQ sequence files. Analysis was performed using Illumina’s BaseSpace app, DRAGEN COVID Lineage, which used the Wuhan-Hu-1 reference to analyze pathogen data and create consensus FASTA files.
For analysis of sequencing efforts throughout the state, SARS-CoV-2 variant surveillance data were obtained from the DPH surveillance database covering the period from January 1, 2021, to January 1, 2025. Data originated from South Carolina’s multi-laboratory surveillance network, including SCPHL, commercial diagnostic laboratories (Quest Diagnostics, LabCorp, Aegis Sciences Corporation, Mako Medical Laboratories), academic medical centers (Medical University of South Carolina), and other participating healthcare facilities. To ensure quality assurance and statistical independence, duplicate entries were identified and removed using case identification numbers. Data validation included verification of specimen collection dates, facility identifiers, and variant nomenclature consistency. The final analytical dataset comprised 43,150 unique, successfully sequenced SARS-CoV-2 cases with metadata. Statewide analyses were implemented in Python 3.x using established scientific computing libraries: pandas (≥1.5) for data manipulation, matplotlib (≥3.6) and seaborn (≥0.11) for visualization, and NumPy (≥1.21) for numerical operations.
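The deduplication step described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the field names (`case_id`, `lab`, `lineage`), the keep-first rule, and the example records are all assumptions made for the example. It uses only the standard library so that it runs anywhere; in pandas, the corresponding call would be `DataFrame.drop_duplicates(subset="case_id")`.

```python
def drop_duplicate_cases(records, key="case_id"):
    """Keep the first record seen for each case identifier."""
    seen = set()
    unique = []
    for rec in records:
        if rec[key] in seen:
            continue  # a record with this case ID was already kept
        seen.add(rec[key])
        unique.append(rec)
    return unique


# Invented example records; the field names are hypothetical.
records = [
    {"case_id": "A1", "lab": "SCPHL", "lineage": "BA.5"},
    {"case_id": "A2", "lab": "Labcorp", "lineage": "BA.2"},
    {"case_id": "A1", "lab": "SCPHL", "lineage": "BA.5"},  # duplicate of A1
]
deduped = drop_duplicate_cases(records)
print(f"{len(records)} records in, {len(deduped)} unique cases out")
```

Keeping the first occurrence is only one possible rule; a laboratory might instead prefer the record with the most complete metadata.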
3 Key programmatic elements
To proceed, we designed a two-phase study looking at our past sequencing data retrospectively and evaluating our SCQC+ approach prospectively (Figure 1C). In the retrospective phase, we analyzed sequencing data from SCPHL collected from June through August 2022, a period when all samples were sequenced regardless of Ct. Plotting the Ct of all samples sequenced during this period against a sequence quality metric suggested that samples with Ct over 30 had reduced coverage (Figure 2A). Samples that successfully completed the sequencing process had a mean Ct of 22.4 and 22.6 for the two SARS-CoV-2 PCR gene targets, Orf1 and N gene, compared to 31.1 and 31.0 for samples that failed sequencing (Figure 2B). Student's t-test confirmed that the differences between the Ct values of successful and failed samples were statistically significant.
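For readers unfamiliar with the comparison above, the following sketch computes a two-sample Student's t statistic with pooled variance. It assumes the equal-variance form of the test, which the text does not specify, and the Ct values are invented for illustration and are not study data. A p-value would additionally require the t distribution, for example via `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t-test."""
    na, nb = len(a), len(b)
    dof = na + nb - 2
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / dof
    std_err = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, dof


# Invented Ct values for illustration only.
ct_success = [20.0, 22.0, 24.0]
ct_failed = [30.0, 31.0, 32.0]
t_stat, dof = pooled_t_statistic(ct_success, ct_failed)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

A negative t statistic here means the successful group has the lower mean Ct, which is the direction reported in the study.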
Figure 2. Simulated data suggest that Ct restriction is cost effective but does not capture all variants in circulation. (A) Chart showing the Ct values of 811 samples vs. a measure of genome coverage, % of non-N bases with coverage ≥ 10. (B) Chart showing the Ct values for the two PCR gene targets, Orf1 and N gene, of SARS-CoV-2 positive samples. Successful samples are shown in blue and failed samples in red. Chart was generated in GraphPad Prism and statistical significance was determined by Student's t-test. (C) Chart showing sequencing fail rate for different sample sub-populations based on Ct. Referenced data were based on 948 samples collected and sequenced from June through August 2022. Chart was generated in GraphPad Prism and statistical significance was determined by Student's t-test.
By excluding data from samples above a desired threshold, we could simulate the impact of using Ct restriction to select samples for sequencing. We found that sequencing only samples with Ct under 30 or Ct under 28 would have drastically reduced our sequencing fail rate from 13.8% to 3.2% and 2.6%, respectively (Figure 2C). Looking at the impact of these approaches on the number of unique variants detected, we determined that a Ct-under-30 approach would have identified 96% of the variants in our dataset, whereas a Ct-under-28 approach would have identified only 80% (Table 1).
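The simulation described above amounts to filtering an already-sequenced dataset by Ct and recomputing the fail rate and the set of variants captured. The sketch below shows that logic under stated assumptions: each sample carries a single Ct value (the study reports two gene targets), a failed sample is represented by `lineage=None`, and the six records are invented rather than taken from the study.

```python
def simulate_ct_restriction(samples, ct_cutoff=None):
    """Return (fail_rate, variants) for samples with Ct below ct_cutoff.

    ct_cutoff=None keeps every sample, i.e. the sequence-everything scenario.
    """
    kept = [s for s in samples if ct_cutoff is None or s["ct"] < ct_cutoff]
    if not kept:
        return 0.0, set()
    failed = sum(1 for s in kept if s["lineage"] is None)
    variants = {s["lineage"] for s in kept if s["lineage"] is not None}
    return failed / len(kept), variants


# Invented records; lineage=None marks a sample that failed sequencing.
samples = [
    {"ct": 18.2, "lineage": "BA.5"},
    {"ct": 22.0, "lineage": "BA.2"},
    {"ct": 27.5, "lineage": "BA.5"},
    {"ct": 29.1, "lineage": "BA.4"},
    {"ct": 31.4, "lineage": None},
    {"ct": 33.0, "lineage": "BA.2.12.1"},
]

all_rate, all_variants = simulate_ct_restriction(samples)
summary = {}
for cutoff in (30, 28):
    rate, variants = simulate_ct_restriction(samples, cutoff)
    summary[cutoff] = (rate, len(variants) / len(all_variants))
    print(f"Ct < {cutoff}: fail rate {rate:.1%}, "
          f"variants captured {summary[cutoff][1]:.0%}")
```

In this toy dataset, tightening the cutoff removes the only failure but also drops lineages that appear only at higher Ct, which is the trade-off the retrospective analysis quantifies.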
Next, we considered the cost effectiveness of both Ct-restriction approaches compared to sequencing all samples. We determined that sequencing based on a Ct-under-30 selection criterion was more cost effective than either a “Seq All” or a Ct-under-28 approach. We also determined that although 19 distinct variants were detected among samples with high Ct (Ct > 30), only one, found in a lone sample, was not also found among samples with Ct < 30. Detecting that rare variant would have required sequencing 241 high Ct samples at a 44% fail rate, at a cost of over $30,000 (Table 1).
Altogether, the retrospective study suggested that a Ct-under-30 approach to selecting samples for sequencing was more cost effective than sequencing all collected samples while enabling the capture of most, but not all, variants within our surveillance samples.
The prospective phase of our study was prompted by the observation that successful samples tended to have significantly higher cDNA concentrations early in the library prep process (Figure 3A). Additionally, running the cDNA on an Agilent TapeStation instrument, an automated agarose electrophoresis machine, showed that failed samples tended to lack the expected target cDNA band (Figure 3B).
Figure 3. The South Carolina Quality Control-plus method compared favorably with Ct restriction, decreasing the fail rate while maximizing capture of novel variants. (A) Chart showing the concentration of samples after the cDNA generation step of library preparation, grouped by whether sequencing was ultimately successful or failed. Chart was generated in GraphPad Prism and statistical significance was determined by Student's t-test. (B) A representative Agilent TapeStation result suggesting that successful samples should all have a clear band indicating the presence of the target cDNA. (C) A schematic showing the changes we implemented to the standard Illumina library prep method, which we refer to as SCQC+. (D) Chart showing sequencing fail rate based on different sequencing approaches. The Seq All fail rate is reproduced from the earlier figure for comparison.
By plotting the cDNA concentration data, we observed that most of our failed samples had a cDNA concentration below 15 ng/μL even though they passed the manufacturer’s QC threshold of 3.3 ng/μL. This indicated that the typical QC with the Qubit instrument was insufficient to exclude samples that would ultimately fail during sequencing. Thus, we wanted to evaluate whether we could optimize our sequencing outcomes by reflexing samples with cDNA below 15 ng/μL to additional QC on the Agilent TapeStation. Excluding samples without the target band from further processing could make sequencing more cost effective. We also used the Agilent TapeStation to measure the fragment sizes of the libraries before sequencing in order to optimize the loading of the sequencers. These two changes to the standard protocol are what we refer to as the South Carolina Quality Control plus method, hereafter SCQC+ (Figure 3C).
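The reflex logic described above can be written as a small decision function. This is a sketch of the rule as stated in the text, not the laboratory's actual software: the function name and return labels are hypothetical, and the assumption that samples below the manufacturer's 3.3 ng/μL threshold are excluded outright is ours.

```python
CDNA_MANUFACTURER_MIN = 3.3   # ng/uL, manufacturer's QC threshold (from the text)
CDNA_REFLEX_THRESHOLD = 15.0  # ng/uL, reflex threshold chosen at SCPHL (from the text)


def scqc_plus_decision(cdna_ng_per_ul, has_target_band=None):
    """Return 'proceed', 'reflex_to_tapestation', or 'exclude' for one sample.

    has_target_band is None until the TapeStation result is known.
    """
    if cdna_ng_per_ul < CDNA_MANUFACTURER_MIN:
        return "exclude"  # assumption: fails the standard QC step
    if cdna_ng_per_ul >= CDNA_REFLEX_THRESHOLD:
        return "proceed"  # high cDNA yield, no extra QC needed
    if has_target_band is None:
        return "reflex_to_tapestation"  # low yield, band not yet checked
    return "proceed" if has_target_band else "exclude"


for conc, band in [(20.0, None), (8.0, None), (8.0, True), (8.0, False)]:
    print(conc, band, "->", scqc_plus_decision(conc, band))
```

Keeping the two thresholds as named constants makes it straightforward for another laboratory to substitute values derived from its own data.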
We determined that sample selection using the SCQC+ approach led to higher quality sequencing runs, as measured by the % Runs Passing Filter (%RPF) metric, compared to the Ct-under-30 approach (Table 1). The SCQC+ approach also resulted in a low fail rate of 2.1% compared to the 4.6% fail rate of the Ct-under-30 approach (Figure 3D). Exploring whether this method helped in the successful sequencing of samples with high Ct, we found that over 25 samples with a Ct over 30 were successfully sequenced (not shown), with a fail rate of 22.9% compared to the 44% fail rate of the unmodified sequencing process (Table 1). Among these high Ct samples, two unique variants not found in specimens with Ct < 30 were detected using SCQC+ at a cost of only $3,319 (Table 1). In sum, these data suggested that the SCQC+ approach compared favorably with the Ct-based approach, enabling more comprehensive variant surveillance while maintaining low fail rates.
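To put the two cost figures from the text side by side, the sketch below divides each total by the number of rare variants it yielded. The helper name is hypothetical, and because the retrospective figure is reported only as "over $30,000", the first result is a lower bound. The two phases also sampled different periods, so this is an illustration of the arithmetic rather than a controlled comparison.

```python
def cost_per_unique_variant(total_cost_usd, unique_variants):
    """Cost in USD per variant found only among high Ct samples."""
    if unique_variants <= 0:
        raise ValueError("at least one unique variant is required")
    return total_cost_usd / unique_variants


# Figures quoted in the text: over $30,000 for one variant (unmodified process),
# $3,319 for two variants (SCQC+).
unmodified = cost_per_unique_variant(30_000, 1)  # lower bound
scqc_plus = cost_per_unique_variant(3_319, 2)
print(f"Unmodified process: at least ${unmodified:,.2f} per rare variant")
print(f"SCQC+:              ${scqc_plus:,.2f} per rare variant")
```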
Next, we analyzed statewide variant surveillance data where many commercial and clinical laboratories contributed to the sequencing of SARS-CoV-2 positive cases throughout the state (Figure 4A). We were able to determine that the sampling of the various laboratories routinely identified unique variants not found by others in the same month (Figure 4B). While most variants were eventually found by different laboratories, Labcorp and SCPHL identified the most unique variants (Figure 4C). This suggested that even in a centralized public health agency, the support of commercial and clinical labs would be required for a comprehensive detection of rare emerging variants.
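The month-by-month comparison above reduces to a set operation: a lineage is unique to a laboratory in a given month if no other laboratory reported it that month. The following sketch shows one way to compute this with the standard library. The record layout and the example rows are assumptions for illustration, not the study's data or code.

```python
from collections import defaultdict


def unique_lineages_by_lab(records):
    """Map (month, lab) to the lineages reported that month by that lab only.

    records is an iterable of (month, lab, lineage) tuples.
    """
    labs_per_lineage = defaultdict(set)  # (month, lineage) -> reporting labs
    for month, lab, lineage in records:
        labs_per_lineage[(month, lineage)].add(lab)

    unique = defaultdict(set)
    for (month, lineage), labs in labs_per_lineage.items():
        if len(labs) == 1:  # reported by exactly one laboratory that month
            (lab,) = labs
            unique[(month, lab)].add(lineage)
    return dict(unique)


# Invented example rows.
records = [
    ("2023-01", "SCPHL", "XBB.1.5"),
    ("2023-01", "Labcorp", "XBB.1.5"),  # shared discovery
    ("2023-01", "Labcorp", "CH.1.1"),
    ("2023-01", "SCPHL", "BQ.1"),
    ("2023-02", "SCPHL", "CH.1.1"),
]
result = unique_lineages_by_lab(records)
for (month, lab), lineages in sorted(result.items()):
    print(month, lab, sorted(lineages))
```

Note that a lineage unique to one laboratory in one month (CH.1.1 in the toy data) can be found by another laboratory later, which matches the observation that most variants were eventually detected by more than one laboratory.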
Figure 4. Commercial and clinical lab support throughout the state contributed to a comprehensive variant surveillance network in South Carolina. (A) Chart showing the proportion of SARS-CoV-2 sequences that were provided by different sequencing laboratories during the study period. (B) Chart showing the number of variant lineages each month that were uniquely detected by different sequencing laboratories or found by more than one facility, labeled “Shared discoveries.” (C) Chart summarizing data from the entire study period for unique variants detected exclusively by each sequencing laboratory. (D) A graphical summary of the findings of this case, suggesting that the SCQC+ approach is a more cost-effective way to detect rare variants in high Ct samples.
4 Discussion
According to the World Health Organization (WHO), a variant surveillance program has two priority objectives: monitoring the relative prevalence of variants across time and geographic areas, and detecting variants circulating at low levels (8). Meeting these objectives enables decision makers to ensure that proposed interventions are adequate and effective against the variants circulating in their communities. Despite the falling prices of genomic sequencing technologies, the cost of WGS remains prohibitive, and cost-effective methods for delivering on these goals of variant surveillance remain unclear.
Alternatives to WGS, such as multiplex PCR and wastewater surveillance (WWS), are available but can only deliver on one of the priority objectives for variant surveillance. While multiplex PCR is cheaper than WGS, its utility is limited to monitoring the relative prevalence of variants of interest already defined by WGS. The accuracy of the test is also affected by various factors, including the spontaneous emergence of characteristic mutations from one lineage in an independent lineage, which may lead to an overestimation of the prevalence of certain variants of concern (9). Lastly, multiplex PCR cannot be used to detect new and emerging variants. In contrast to this limited variant detection with PCR, WWS enables the detection of a broad array of variants circulating in a population in a timely way (10). However, the significance of WWS-detected mutations remains unclear until confirmed in local clinical samples (11). As such, while these technologies provide complementary data, WGS remains the gold standard against which they are compared and confirmed.
Ct value has been used as a proxy for the selection of high-quality samples for sequencing, for culturing live viruses, and for the risk of viral transmission in households (3–6, 12). We reproduced the observation of Lu et al. (3) that samples with Ct over 30 had reduced genome coverage. Our data also suggest that a Ct-under-30 approach, but not a Ct-under-28 approach, is cost effective for selecting samples when the objective is to monitor the prevalence of non-rare variants; successful samples had an average Ct of about 22 and failed samples an average Ct of about 31.
However, another study demonstrated that 34% of secondary cases occurred in homes where the primary case had a Ct > 30 (5). Moreover, the Ct of a sample is affected by a variety of factors, such as time from symptom onset, quality of the collection method, and storage conditions, and may not indicate reduced pathogenicity of the infecting virus strain (4, 13). Thus, in the search for the timely detection of VOCs/VOIs, all samples should be considered. Indeed, SCPHL data identified three instances of novel variants that appeared only in high Ct samples. However, the direct sequencing of high Ct samples led to inordinate fail rates and reagent waste. Through the SCQC+ method, we were able to cut the fail rate for sequencing high Ct samples roughly in half while keeping the overall fail rate low.
The detection of low frequency variants requires large diagnostic testing and sequencing volumes (8, 14). In South Carolina, we show that commercial and clinical labs were major contributors of sequence data aside from the public health agency. Population-wide sampling by different laboratories largely resulted in detection of shared variants but also produced unique variants, at least temporarily on a month-to-month basis, suggesting detection of emerging low-frequency variants. When the entire study period is considered, unique variants not identified by other laboratories were found in Labcorp and in SCPHL sequencing, suggesting the need to maintain formalized partnerships between state, commercial, and clinical laboratories for comprehensive surveillance of emerging variants before the next epidemic. In conclusion, our case study suggests cost-effective approaches to molecular surveillance that depend on the primary objectives of the surveillance program, and it introduces a Ct-independent method for optimizing the sequencing of high Ct samples (Figure 4D).
5 Study limitations and constraints
This case study has several limitations and constraints. For instance, in testing the SCQC+ method there were samples that passed the Agilent TapeStation QC but eventually failed sequencing, suggesting the need for additional QC prior to sequencing. Additionally, the results presented came from the experience of one state public health agency and may need to be tailored before being applied elsewhere. For example, at SCPHL we selected a threshold of 15 ng/μL for reflexing samples for additional QC; another laboratory seeking to implement the SCQC+ method may want to select a different threshold based on its own laboratory data. Finally, we used only one disease model and one sequencing method to test the cost effectiveness of SCQC+, so results may differ with other disease models or sequencing methods. However, since more than 50% of state public health agencies use the same ARTIC primers and 80% use Illumina sequencers, we believe that our study is highly relevant to many public health laboratories (15).
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
Ethical approval was not required for the study involving humans in accordance with the local legislation and institutional requirements. The human samples used in this study were acquired from local health departments providing clinical care to their communities. Written informed consent to participate in this study was not required from the participants or the participants' legal guardians/next of kin in accordance with the national legislation and the institutional requirements.
Author contributions
RC: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – review & editing. GrG: Investigation, Methodology, Project administration, Resources, Supervision, Writing – review & editing. JF: Formal analysis, Methodology, Resources, Writing – review & editing. GaG: Formal analysis, Methodology, Resources, Writing – review & editing. AK: Conceptualization, Writing – review & editing. AS: Formal analysis, Visualization, Writing – review & editing. JS: Formal analysis, Writing – review & editing. AD: Funding acquisition, Supervision, Writing – review & editing. KB: Supervision, Writing – review & editing. CW: Project administration, Supervision, Writing – review & editing. OA: Funding acquisition, Supervision, Writing – review & editing. JM: Funding acquisition, Supervision, Writing – review & editing. CA: Conceptualization, Project administration, Supervision, Visualization, Writing – original draft, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by the Centers for Disease Control & Prevention under grant 19NU50CK000542EDEXC.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The authors declare that no Gen AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Author disclaimer
The content is solely the responsibility of the authors and does not necessarily represent the official views of the Centers for Disease Control & Prevention.
References
1. Struelens, MJ, Ludden, C, Werner, G, Sintchenko, V, Jokelainen, P, and Ip, M. Real-time genomic surveillance for enhanced control of infectious diseases and antimicrobial resistance. Front Sci. (2024) 2:1298248. doi: 10.3389/fsci.2024.1298248
2. World Health Organization. (2023). Updated working definitions and primary actions for SARS-CoV-2 variants. Geneva: World Health Organization.
3. Lu, J, Du Plessis, L, Liu, Z, Hill, V, Kang, M, Lin, H, et al. Genomic epidemiology of SARS-CoV-2 in Guangdong province, China. Cell. (2020) 181:997–1003.e9. doi: 10.1016/j.cell.2020.04.023
4. Singanayagam, A, Patel, M, Charlett, A, Bernal, JL, Saliba, V, Ellis, J, et al. Duration of infectiousness and correlation with RT-PCR cycle threshold values in cases of COVID-19, England, January to May 2020. Euro Surveill. (2020) 25:2001483. doi: 10.2807/1560-7917.ES.2020.25.32.2001483
5. Lyngse, FP, Mølbak, K, Franck, KT, Nielsen, C, Skov, RL, Voldstedlund, M, et al. Association between SARS-CoV-2 transmissibility, viral load, and age in households. MedRxiv. (2021). [preprint]. doi: 10.1101/2021.02.28.21252608
6. Bordoy, AE, Saludes, V, Panisello Yagüe, D, Clarà, G, Soler, L, Paris de León, A, et al. Monitoring SARS-CoV-2 variant transitions using differences in diagnostic cycle threshold values of target genes. Sci Rep. (2022) 12:21818. doi: 10.1038/s41598-022-25719-9
7. Meit, M, Sellers, K, Kronstadt, J, Lawhorn, N, Brown, A, Liss-Levinson, R, et al. Governance typology: a consensus classification of state-local health department relationships. J Public Health Manag Pract. (2012) 18:520–8. doi: 10.1097/PHH.0b013e31825ce90b
8. World Health Organization. (2021). Guidance for surveillance of SARS-CoV-2 variants: interim guidance, 9 August 2021. Geneva: World Health Organization.
9. Gomes, L, Jeewandara, C, Jayadas, TP, Dissanayake, O, Harvie, M, Guruge, D, et al. Surveillance of SARS-CoV-2 variants of concern by identification of single nucleotide polymorphisms in the spike protein by a multiplex real-time PCR. J Virol Methods. (2022) 300:114374. doi: 10.1016/j.jviromet.2021.114374
10. Timme, RE, Woods, J, Jones, JL, Calci, KR, Rodriguez, R, Barnes, C, et al. SARS-CoV-2 wastewater variant surveillance: pandemic response leveraging FDA’s GenomeTrakr network. MSystems. (2024) 9:e01415-23. doi: 10.1128/msystems.01415-23
11. Swift, CL, Isanovic, M, Velez, KEC, and Norman, RS. Community-level SARS-CoV-2 sequence diversity revealed by wastewater sampling. Sci Total Environ. (2021) 801:149691. doi: 10.1016/j.scitotenv.2021.149691
12. de França Cirilo, MV, Pour, SZ, de Fatima Benedetti, V, Farias, JP, Fogaça, MMC, da Conceição Simões, R, et al. Co-circulation of chikungunya virus, zika virus, and serotype 1 of dengue virus in Western Bahia, Brazil. Front Microbiol. (2023) 14:1240860. doi: 10.3389/fmicb.2023.1240860
13. Rabaan, AA, Tirupathi, R, Sule, AA, Aldali, J, Mutair, AA, Alhumaid, S, et al. Viral dynamics and real-time RT-PCR Ct values correlation with disease severity in COVID-19. Diagnostics. (2021) 11:1091. doi: 10.3390/diagnostics11061091
14. Han, AX, Toporowski, A, Sacks, JA, Perkins, MD, Briand, S, Van Kerkhove, M, et al. SARS-CoV-2 diagnostic testing rates determine the sensitivity of genomic surveillance programs. Nat Genet. (2023) 55:26–33. doi: 10.1038/s41588-022-01267-w
Keywords: whole genome sequencing, South Carolina, variant surveillance, public health laboratory, genomic epidemiology, cycle threshold
Citation: Cox R, Goodwin G, Freeman J, Godfrey G, Kapingidza A, Smith A, Scott J, Diedhiou A, Buru K, Weaver CJ, Adair O, Meredith J and Aroh C (2025) Optimizing resources in genomic surveillance: South Carolina’s QC-plus approach. Front. Public Health. 13:1694911. doi: 10.3389/fpubh.2025.1694911
Edited by:
David Sue, Centers for Disease Control and Prevention (CDC), United States
Reviewed by:
Hayley Danielle Yaglom, Translational Genomics Research Institute, United States
Eun-Jin Kim, Korea Disease Control and Prevention Agency, Republic of Korea
Copyright © 2025 Cox, Goodwin, Freeman, Godfrey, Kapingidza, Smith, Scott, Diedhiou, Buru, Weaver, Adair, Meredith and Aroh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Chukwuemika Aroh, earoh@scfirststeps.org
†These authors have contributed equally to this work
Gregory Goodwin1†