Leveraging of SARS-CoV-2 PCR Cycle Thresholds Values to Forecast COVID-19 Trends

Introduction: We assessed the usefulness of SARS-CoV-2 RT-PCR cycle threshold (Ct) value trends produced by the LHUB-ULB (a consolidated microbiology laboratory located in Brussels, Belgium) for monitoring the epidemic's dynamics at local and national levels and for improving forecasting models. Methods: SARS-CoV-2 RT-PCR Ct values produced from April 1, 2020, to May 15, 2021, were compared with national COVID-19 confirmed case notifications according to their geographical and time distribution. These Ct values were evaluated against both a phase diagram predicting the number of COVID-19 patients requiring intensive care and an age-structured model estimating COVID-19 prevalence in Belgium. Results: Of 155,811 RT-PCR tests performed, 12,799 were positive and 7,910 Ct values were available for analysis. The 14-day median Ct values were negatively correlated with the 14-day mean daily positive tests with a lag of 17 days. In addition, the 14-day mean daily positive tests in LHUB-ULB were strongly correlated with the 14-day mean confirmed cases in the Brussels-Capital Region and in Belgium, with coinciding start, peak, and end of the different waves of the epidemic. Ct values decreased concurrently with the forecasted phase shifts of the diagram. Similarly, the evolution of 14-day median Ct values was negatively correlated with daily estimated prevalence for all age classes. Conclusion: We provide preliminary evidence that trends of Ct values can help to both follow and predict the epidemic's trajectory at local and national levels, underlining that consolidated microbiology laboratories can act as epidemic sensors as they gather data that are representative of the geographical area they serve.


INTRODUCTION
The coronavirus disease 2019 (COVID-19) pandemic dramatically highlighted the central position of diagnostic testing, not only for the clinical management of infected individuals but also for surveillance purposes (1). The use of clinical microbiology laboratory (CML) data to survey the presence of specific microorganisms in a given population represents one of the most established public health surveillance tools for infectious diseases. In a previous study, we showed that influenza trends in Belgium may be estimated using laboratory data provided by a CML serving the wider Brussels-Capital Region area (2). Since the start of the COVID-19 pandemic, several authors have demonstrated that CMLs could represent the first step toward a global set of sensor networks for infectious disease surveillance, where each CML can be seen as a real-time sensor in its area within an interconnected, complex network (1,3,4). In this perspective, CMLs have become a cornerstone in the fight against SARS-CoV-2 infections due to their ability to process large numbers of samples across large geographic areas while using highly specialised diagnostic tests (1,5).
By reporting to Sciensano, the Belgian national public health research institute, the number of new positives among the tests conducted each day, CMLs share the data needed to estimate the effective reproduction number (R_t) (6,7). However, the data represent the growth rate of positive tests and not the incidence of infection, which requires adjustments to account for changes in testing capacity, delay between infection and test report date, and conversion from prevalence to incidence. We previously showed that SARS-CoV-2 RT-PCR cycle threshold (Ct) values differ between populations, with lower Ct values (and thus higher viral loads) for outpatients, who are likely to be recently infected, and higher Ct values for inpatients (8). In a recent article, Hay et al. used SARS-CoV-2 RT-PCR Ct values in a model to forecast the epidemic's trajectory (9). At the time of writing, RT-PCR assays are not standardised, and the Ct values obtained using various PCR methods on various instruments in various laboratories using various sampling methods cannot be easily aggregated by surveillance systems. Sciensano recently encouraged laboratories to report their results using a semi-quantitative approach where a viral load below 10^3 RNA copies/mL is considered as "weak positive" (10). Sciensano's primary goal was to approach the actual infectiousness of patients with persistent positive RT-PCR. Therefore, the semi-quantitative dimension of positive test results is not used by surveillance systems yet.
Besides the difficulty of making use of all the data provided by CMLs in real time, public health authorities also face the challenge of making decisions, as the constantly evolving situation requires permanent adaptation (11). In this perspective, various predictive models have been developed to support policy makers (12-15). To improve and facilitate the decision-making process, Hens et al. developed a phase portrait to monitor the epidemic, allowing a real-time assessment of whether intervention measures are needed to keep hospital capacity under control (16). Nevertheless, such decision-support tools are often designed at the national level instead of the hospital level where, during the pandemic, hospital managers needed support to forecast the cancellation and reintroduction of a series of medical activities, such as the surgical care program, or the number of COVID-related ICU beds (17). Thanks to the huge amount of data they collect on a daily basis, CMLs could also help the hospital structures they serve to anticipate the evolution of the epidemic and forecast their hospitalisation and medical activities.
The objectives of this study were: (1) to verify the accuracy of using SARS-CoV-2 PCR Ct value trends from a single CML to monitor the dynamics of the epidemic; (2) to determine the added value of using these data as additional advance information for scenario analysis, in relation to a phase diagram and an age-structured compartmental model, both developed to follow the path of the Belgian COVID-19 epidemic (14,15).

METHODS
The "Laboratoire Hospitalier Universitaire de Bruxelles - Universitair Laboratorium Brussel" (LHUB-ULB) is a merged clinical laboratory serving five university hospitals located in the Brussels-Capital Region in Belgium (8). All the SARS-CoV-2 PCR results produced between April 1, 2020, and May 15, 2021, by the LHUB-ULB were extracted anonymously from its laboratory information system. The data collected were patients' postal code, age, qualitative PCR results, Ct values, instruments on which PCR were performed, and sampling dates. National Belgian data were extracted from the "total number of tests by date" and the "confirmed cases by date, province, age and sex" public datasets available on the Sciensano website on May 27, 2021. These datasets contain the total number of tests, the number of positive tests per day, and the confirmed number of cases per day and province. All Ct values were considered at their time of sampling regardless of the number of days since symptom onset and without deduplication. To analyse trends and minimise day-to-day and holiday-related fluctuations, we computed mean daily positive tests and cases, and median and mean Ct values from May 1, 2020 to May 15, 2021, using a backward sliding window of 14 days (hereafter referred to as "14-day mean positive tests/cases" and "14-day median/mean Ct values").
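As an illustration of the smoothing described above, a backward 14-day window can be sketched in Python as follows. This is a minimal sketch and not the code used for the study: the function names and the dictionary layout are hypothetical, days without a record are assumed to count as zero positive tests, and the Ct median is assumed to pool every individual Ct value sampled within the window.

```python
from datetime import date, timedelta
from statistics import mean, median


def rolling_mean_positives(positives_by_day, day, width=14):
    """Mean daily positive tests over the `width` days ending on `day` (inclusive).

    Assumption: a day missing from the dictionary counts as 0 positive tests.
    """
    days = [day - timedelta(days=k) for k in range(width)]
    return mean(positives_by_day.get(d, 0) for d in days)


def rolling_median_ct(ct_by_day, day, width=14):
    """Median of all Ct values sampled in the `width` days ending on `day`.

    Assumption: individual Ct values are pooled across the window.
    Returns None when the window contains no Ct value.
    """
    pooled = []
    for k in range(width):
        pooled.extend(ct_by_day.get(day - timedelta(days=k), []))
    return median(pooled) if pooled else None


# Synthetic example: 14 consecutive days, one Ct value per day.
start = date(2020, 5, 1)
positives = {start + timedelta(days=i): i for i in range(14)}
cts = {start + timedelta(days=i): [20.0 + i] for i in range(14)}
last_day = start + timedelta(days=13)

print(rolling_mean_positives(positives, last_day))  # 6.5
print(rolling_median_ct(cts, last_day))             # 26.5
```

Because the window looks backward only, each smoothed value uses data available on that day, which is what a real-time surveillance use would require.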
To follow the trends of Ct value variation during the study period, only the SARS-CoV-2 PCR results on nasopharyngeal swabs (NPS) obtained using the m2000 RealTime SARS-CoV-2 assay (Abbott Molecular, USA) were considered, this assay being the only one used by our laboratory during the entire period of interest. As detection of both targeted genes (RdRp and N) was performed using the same fluorophore, the Ct values of this assay were observed up to 32 cycles and were not comparable with Ct values of other RT-PCR assays. Ct values were plotted against a standard calibration curve provided by the Belgian NRC to obtain the semi-quantitative results recommended by Sciensano (10). Accordingly, results with a Ct > 22.3 were considered as "weak positive" (viral load < 10^3 RNA copies/mL). Correlations between 14-day median/mean Ct values and daily mean positive tests were calculated using Spearman's rank correlation coefficient (r_S). This correlation was computed with shifts of 0 to 30 days in the median and mean Ct values, to determine the shift with the highest r_S between the daily mean number of positive tests and Ct values. To test their validity as a source for COVID-19 surveillance, LHUB-ULB's data were also compared with all COVID-19 confirmed case notifications according to geographical coverage and time distribution.
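The lag search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of the study: the function names are hypothetical, the Ct value on day t is assumed to be paired with the positive tests on day t + lag, both series are assumed to be daily and of equal length, and the "highest" correlation is taken as the largest absolute r_S. Spearman's coefficient is computed here as the Pearson correlation of average ranks.

```python
import random


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's r_S as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def best_lag(ct, positives, max_lag=30):
    """Return (lag, r_S) with the largest |r_S|, pairing ct[t] with positives[t + lag]."""
    if len(ct) != len(positives):
        raise ValueError("series must have the same length")
    results = {}
    for lag in range(max_lag + 1):
        results[lag] = spearman(ct[: len(ct) - lag], positives[lag:])
    lag = max(results, key=lambda k: abs(results[k]))
    return lag, results[lag]


# Synthetic check: positives mirror the Ct series with a 5-day delay.
rng = random.Random(0)
ct_series = [rng.uniform(12, 30) for _ in range(90)]
positive_series = [0.0] * 5 + [-c for c in ct_series[:-5]]
lag, r_s = best_lag(ct_series, positive_series)
print(lag, round(r_s, 3))
```

On real data the two inputs would be the 14-day median (or mean) Ct values and the 14-day mean positive tests.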
We used phase diagrams depicting the evolution of COVID-19 hospitalisations in Belgium to compare these trends with the evolution of Ct values measured over time (16). These diagrams were developed to predict the number of COVID-19 patients requiring intensive care by considering the 7-day mean new hospitalisations and the daily ratio of the past 14-day new hospitalisations. For each combination, the total number of hospitalisations is projected for a horizon of 14 days, from which the number of patients requiring intensive care is predicted based on the distribution of the time spent in an intensive care unit (ICU). The hospital contingency plan in Belgium consists of five different phases (phases 0, 1A and 1B, 2A and 2B), incrementing COVID-19 related ICU bed capacities (18). Within this scheme, the total number of patients in ICU moves from 2001 to 2821, yielding a gradual decrease in non-COVID-19 ICU capacity (16). The hospital and future COVID-related ICU load is thus depicted from green to red. The green region can be considered a "safe zone" in which the number of new hospitalisations is limited, with a decrease (growth < 1) or a limited increase (growth > 1), and associated with a limited number of COVID-19 patients in ICU (first part of Phase 0). The yellow region is a region of increased vigilance (second part of Phase 0). The orange (Phases 1A and 1B) and red (Phases 2A and 2B) regions are "high impact" and "no-go" zones, in which non-COVID-19 care decreases substantially and additional capacity for COVID-19 needs to be provided.
The evolution of 14-day median Ct values by age class was compared with the daily estimated Belgian COVID-19 prevalence for these age classes, obtained from a deterministic continuous age-structured compartmental model (extended SEIR-type) integrating social contact data and calibrated on hospitalisation and death incidence data as well as serological studies (15). The prevalence was estimated for the following age classes in years: 0-24, 25-44, 45-64, 65-74, and 75+ as the proportion of the sum of the infected compartments (exposed, asymptomatic, presymptomatic, symptomatic, and hospitalised individuals) compared to the total size of the age class, with a 90% confidence interval estimated by Bayesian analysis. This method aims to provide a reliable comparison with the spreading of COVID-19 in Belgium among age classes, since the number of RT-PCR positive tests is known to be biased over time due to testing policy changes (19).
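Written out, the prevalence definition above corresponds to the expression below. The symbols are illustrative shorthand introduced here and are not taken from the cited model (15): for age class a at time t, E is the exposed compartment, I with superscripts asym, presym, and sym the asymptomatic, presymptomatic, and symptomatic compartments, H the hospitalised compartment, and N_a the total size of the age class.

```latex
\[
\mathrm{prev}_a(t) =
\frac{E_a(t) + I^{\mathrm{asym}}_a(t) + I^{\mathrm{presym}}_a(t) + I^{\mathrm{sym}}_a(t) + H_a(t)}{N_a}
\]
```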

RESULTS
Ct Values vs. Epidemic Trends
From April 1, 2020, to May 15, 2021, a total of 155,049 SARS-CoV-2 RT-PCR tests were performed in the LHUB-ULB and resulted in 12,771 positive results, of which 7,906 Ct values were analysed. A peak of LHUB-ULB 14-day mean daily positive tests was reached during the Belgian second wave on October 28, 2020 (n = 153.6, Figure 1). Beforehand, a lower peak was reached during the summer on August 22, 2020 (n = 24.4). In both cases, these peaks were preceded by a drastic decrease in the 14-day median Ct values reaching local minima, respectively, 16 days before (13.12 on October 12, 2020) and 12 days before (12.76 on August 10, 2020). Ct values were negatively correlated with the number of LHUB-ULB positive tests, with a maximum reached for the correlation between the 14-day median Ct values with a lag of 17 days and the 14-day mean positive tests (r_S = −0.836), as well as between the 14-day mean Ct values with a lag of 19 days and the 14-day mean positive tests (r_S = −0.834).

CML Data vs. Local and National Data
During the same period, a total of 1,381,393 tests were performed across the Brussels Region and 13,219,135 tests across the whole of Belgian territory, of which, respectively, 142,562 and 1,131,719 tests were positive. Overall, LHUB-ULB performed, respectively, 8.96% (12,771/142,562) and 1.13% (12,771/1,131,719) of all positive tests reported in the Brussels Region and at the national level. Figure 2 shows the geographical distribution by postal code of the confirmed COVID-19 cases notified by the LHUB-ULB to Sciensano and the LHUB-ULB's representativeness in the COVID-19 notification (proportion of cases identified by the national surveillance network thanks to the reporting of LHUB-ULB positive test results). Besides the Brussels Region, which concentrated most of the tests produced by the LHUB-ULB, its service area extended to several municipalities in the Walloon and Flemish Regions, accounting in some of them for about 5% of all notifications. Overall, the number of positive tests produced by the LHUB-ULB showed a high correlation with the regional and national trends of the incidence of COVID-19 notifications, with coinciding start, peak, and end of the different waves of the epidemic. The 14-day average number of positive tests in LHUB-ULB was strongly correlated with the 14-day average number of positive tests in the Brussels Region (r_S = 0.843) and in Belgium (r_S = 0.810), but also with the 14-day average confirmed cases in the Brussels Region (r_S = 0.832) and in the whole country (r_S = 0.804) (Figure 3).

Ct Values Trends vs. Epidemic's Dynamics
In Figure 4, the 14-day median estimates of daily Ct values are plotted in a white to blue colour scale on the phase diagram introduced above, showing how Ct values decrease when the situation worsens (and vice versa) in trends making clockwise movements. Figure 4A shows the downward trend of the end of the first epidemic wave during which the growth in new hospitalisations progressively decreased to reach below 0%, a moment at which the number of new hospitalisations started to decline: the Ct values, low at the peak, increase when the number of new hospitalisations starts to decline. In Figure 4B, an upward trend was observed, leading to a small summer wave. As soon as both growth and hospitalisations passed from the green "safe zone" to the yellow region of "increased vigilance," the Ct values started to decrease, concurrently crossing the threshold value of 22.3. The opposite effect was observed when the points fell in the green region. The second wave is visualised in Figure 4C, with a clear decrease and increase of the Ct values. Finally, Figure 4D corresponds to the third wave, with again the same pattern observed in the evolution of Ct values.

Ct Values Trends vs. Modelled Prevalence by Age Group
In Figure 5, the medians of daily Ct values for different age groups are compared to the daily estimated prevalence of those age groups. The overall behaviour of Ct values was similar for all age classes and was negatively correlated with the estimated prevalence.

DISCUSSION
Estimating the likely number of infected patients during epidemics, but also the dynamics of spreading in the population, is crucial to carry out adequate testing and infection control measures. As large and accurate data providers, CMLs can adequately support hospital capacity planning by providing valuable real-time information about the incidence trends of the pandemic. This was already established by a previous study on influenza (2), but seems to be even more relevant in the context of a more severe disease like COVID-19, where hospital capacities are crucially challenged. Indeed, LHUB-ULB processed on its own 8.96% of the positive SARS-CoV-2 tests reported in the Brussels-Capital Region and was shown here to be representative not only of the region but, to a certain extent, of the whole country due to its central geographic position in Belgium. A step further would be to capitalise on the ability of these CMLs to rapidly detect and communicate abnormal events such as a sudden increase or the emergence of variants of concern without the delay resulting from sending samples to central sequencing platforms. Thanks to the expertise gained in such data integration, UK scientists were able to rapidly share an early assessment of the variant Alpha's (lineage B.1.1.7) genomic characteristics and associated clinical outcomes (20).
In addition, provided there is adequate standardisation under appropriate management and regulatory structures, "virtual" CML consolidation can also adequately support ongoing COVID-19 surveillance by connecting some or all of the produced data to national public health surveillance systems. In the frame of the COVID-19 pandemic, Sciensano started to monitor on a daily basis the epidemiological situation of SARS-CoV-2 in the country through multiple surveillance systems including the "healthdata.be" platform aggregating all information from all CMLs located in Belgium (21-23). The added value of such a combined structure was already demonstrated for monitoring viral infections by the Infection Response Through Virus Genomics-ICONIC consortium in London (24).
Beyond the variation of infectiousness over time, our results suggest that following the trend of SARS-CoV-2 RT-PCR Ct values could predict the epidemic trends. Recently infected patients are known to have a higher viral load, and thus higher infectiousness (25). A decrease in Ct values, linked to an increase in recently infected people, is likely to favour spreading and goes hand in hand with an increase in the total number of cases. On the contrary, inpatients are known to have higher Ct values because of a longer evolution since the onset of the disease. Thus, the overall increase of Ct values can be observed before the decrease of hospitalisations at the decreasing phase of an epidemic wave. By gathering enough comparable data using semi-quantitative results, Ct value-based surveillance systems could approach in real time the average level of viral load in the population, and hence the current spreading of the virus before the increase of cases becomes apparent, while avoiding the recurrent problem of normalisation. Predicting the shape and the size of the epidemic curve is not straightforward, and many parameters may influence it, such as seasonality, infection control measures, and population immunity level, to cite a few. The evolution of 14-day median Ct values was also tested against the daily estimated prevalence by age class, and Ct values were similarly negatively correlated for all age classes, even though we observed a shift of approximately half a month for the 75+ class, which might be due to intergenerational transmission. However, an incipient divergence was observed in May 2021, with Ct values increasing for the oldest classes while remaining low for the youngest one. This was related to a period of resumption of activities in Belgium such as the reopening of schools, while older age classes were progressively becoming protected through the vaccination campaign. The prevalence projections from the compartmental model followed the same trend. Hence, a divergence of Ct values between age classes could be a good indicator of a divergence in transmission in these age classes.
Following the trend of the Ct values might have helped decision makers, as demonstrated by the integration of the Ct values in a phase diagram predicting the number of COVID-19 patients requiring intensive care at a national level. For instance, in March 2021, after a long period of stagnation in the epidemic, the Belgian government decided to reopen close-contact professions and increased the number of people authorised to gather outside, at a time when Ct values were decreasing. This reopening was reversed a few weeks later due to the increase of cases, underlining that the decision was untimely. During summer 2020, the evolution of Ct values accurately followed the dynamics of the epidemic, with an increase accompanying each decrease of the pressure on hospitals. However, the delay between the initial diagnosis and the admission, together with the length of stay of COVID-19 inpatients, makes it harder to anticipate the trends in hospitalisation between October 2020 and March 2021, when the epidemic had no real break between peaks and the tension in hospital beds remained stable. Only future evolution after a real epidemic reflux could confirm the added value of following the Ct values to anticipate the phase shifts.
At the hospital level, being able to foresee epidemic dynamics could allow a greater ability to anticipate measures such as pre-admission screenings, isolation, postponement of non-urgent interventions, triage, and upscaling of human resources. In our study, each epidemic wave was preceded by a drastic decrease of Ct values, the median crossing back the threshold of Ct = 22.3 (i.e., the proportion of "weak positive" tests went below 50%), which provides a potential easy-to-evaluate parameter at the local level. This threshold value of 22.3 was clearly crossed back concurrently with the passage of the number of new hospitalisations vs. the new hospitalisations' daily ratio from the green "safe zone" to the yellow region of "increased vigilance" in the phase diagram. Even if the setting described here should likely be adjusted before being transposed to other laboratories to take account of the specificity of their own patients (ratios inpatients/outpatients and symptomatic/asymptomatic), repeating this exercise with their own data could allow them to set up their own alarm threshold. Likewise, local and national surveillance systems should track the difference in the proportion of strong vs. weak positive results to model the dynamics of the epidemic and thus to provide guidance for prevention measures, as suggested by Hay et al. (9).
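A laboratory wishing to repeat this exercise could monitor the threshold crossing along the following lines. This is a minimal sketch with hypothetical function names, not a validated alarm rule: the 22.3 cut-off is specific to the assay and calibration used in this study, and the input is assumed to be a chronologically ordered series of 14-day median Ct values.

```python
WEAK_POSITIVE_CT = 22.3  # cut-off reported in this study for one assay; other assays need their own


def weak_positive_share(ct_values, threshold=WEAK_POSITIVE_CT):
    """Fraction of positive results with Ct above the threshold ("weak positive")."""
    return sum(ct > threshold for ct in ct_values) / len(ct_values)


def downward_crossings(median_series, threshold=WEAK_POSITIVE_CT):
    """Indices where the median Ct moves from above the threshold to at or below it."""
    return [
        i
        for i in range(1, len(median_series))
        if median_series[i - 1] > threshold and median_series[i] <= threshold
    ]


# Synthetic series of 14-day median Ct values (not study data).
medians = [24.0, 23.0, 22.0, 21.0, 23.5, 22.3]
print(downward_crossings(medians))                    # [2, 5]
print(weak_positive_share([20.0, 25.0, 30.0, 21.0]))  # 0.5
```

A median at or below the cut-off means that at most half of the positive results in the window are "weak positive", which is the equivalence stated above.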
A potential weakness of our data is that only a limited part of the LHUB-ULB activity relies on ambulatory patients at the general practitioner level. Being able to reach this "non-hospitalised" population would likely increase the sensitivity of a surveillance system to weak signals when the epidemic begins in the community before affecting hospitals. However, the fact that the overall behaviour of Ct values was similar for all age classes and was negatively correlated with the estimated prevalence in the compartmental model indicates that our data capture the whole Belgian population to a sufficient extent. One could also argue that the correlation between Ct value and actual viral load depends on many factors, such as sampling method, targeted genes, primers and probes, and possible mutations in targeted genes (26). Due to the absence of standardisation between SARS-CoV-2 RT-PCR assays, we only analysed Ct values obtained using one RT-PCR assay performed on nasopharyngeal swabs throughout the studied period. We believe that their number is sufficient to neutralise the effect of measurement bias. Furthermore, it has been discussed that some variants could exhibit a higher average viral load (20,27,28), which could directly impact observed trends in the overall evolution of Ct values. Nevertheless, this potentially higher viral load is likely to favour infectiousness and should not introduce a bias regarding epidemic surveillance.
In conclusion, this study established a correlation between the trends in SARS-CoV-2 RT-PCR Ct values and the trends in the number of positive COVID-19 tests about 17 days later. Following the dynamics of the average viral load could add a dimension to the surveillance of respiratory infectious diseases. Moreover, it underlines that the considerable amount of data collected daily by CMLs can play a key role at both the local level and beyond, depending on the geographical area they serve. By gathering comparable laboratory data approaching the average viral load of respiratory viruses in the population, surveillance systems might be able to better follow epidemic dynamics, establish forecast models, capture weak signals, and thus anticipate uncontrolled spreading.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: https://github.com/sdellicour/Ct_measures_LHUB.

ETHICS STATEMENT
Data from all sources were collected retrospectively and anonymously before analysis from a routine surveillance perspective without any additional intervention. Therefore, ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
NY, SD, VD, LN, MH, and OV designed the study. NY and SD managed the database. NY, SD, DV, VD, and NF did the statistical analyses. NY, VD, MW, and MH validated the laboratory analyses on clinical samples. NY, SD, VD, NF, CF, MH, and OV accessed and verified the data. NY, SD, NF, MH, and OV wrote the paper. All authors had the opportunity to discuss the results and comment on the manuscript, had full access to all the data in the study, and had final responsibility for the decision to submit for publication.