Advancing Risk Analysis of COVID-19 Clinical Predictors: The Case of Fasting Blood Glucose

Department of Medical Laboratory Sciences, Faculty of Allied Health Sciences, Health Sciences Center, Kuwait University, Kuwait City, Kuwait; Department of Genetics and Bioinformatics, Dasman Diabetes Institute (DDI), Kuwait City, Kuwait; Department of Mathematics, Faculty of Sciences, Kuwait University, Kuwait City, Kuwait; Department of Environmental Health, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, United States

HIGHLIGHTS
- Many of the clinical predictors of COVID-19 are naturally continuous.
- Such continuity may imply that a complex predictor-risk relationship is at play.
- Risk analyses that allow continuous predictors to take a restriction-free shape can provide a better understanding of the clinical course of the disease. This refined approach can help generate hypotheses characterizing the mechanisms of disease progression.
To understand or predict the effects of serum glucose on COVID-19 outcomes such as hospitalization, intensive care unit (ICU) admission, or death, one could use conventional regression techniques with glucose as the independent variable and one of these outcomes as the dependent variable. But how should the glucose variable be included in such models? One option is to apply clinical threshold values. For example, in the context of diabetes diagnosis, we can use the threshold values of hemoglobin A1c (A1C), namely A1C <5.7%, A1C between 5.7 and 6.4%, and A1C ≥6.5%, to characterize patients as normal, prediabetic, or diabetic, respectively (1). Alternatively, we can use two categories instead of three: diabetics vs. non-diabetics. These threshold or categorical approaches, albeit commonly useful for identifying high-risk groups, have underlying limitations. First, they assume complete homogeneity within each group, so patients with A1C values of 6.5 and 10% are considered clinically identical. Conversely, they assume that patients with A1C of 5.69% are entirely different from those with A1C of 5.70%. Second, they force the dose-response relationship into a step or staircase function, which is rarely a realistic description of real-life patient risks (2,3).
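The two limitations of thresholding can be made concrete with a minimal sketch. The function name `a1c_category` is hypothetical, and treating values between 6.4 and 6.5% as prediabetic is an assumption made here only to close the gap between the stated ranges.

```python
def a1c_category(a1c_percent):
    """Map an A1C value (%) to the three diagnostic categories in the text.

    Assumption: values in the gap between 6.4% and 6.5% are treated
    as prediabetic.
    """
    if a1c_percent < 5.7:
        return "normal"
    if a1c_percent < 6.5:
        return "prediabetic"
    return "diabetic"


# Within-group homogeneity: very different values collapse to one label.
same_group = a1c_category(6.5) == a1c_category(10.0)

# Boundary discontinuity: nearly identical values get different labels.
split_at_boundary = a1c_category(5.69) != a1c_category(5.70)

print(same_group, split_at_boundary)
```

Any regression that uses these labels as its predictor inherits both properties, whatever the outcome variable.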
To capture the natural trends of a continuous exposure variable, one may benefit from allowing the dose-response relationship to take whatever shape the data describe, rather than forcing it to fit idealized relationships such as linear (straight line) and categorical (staircase) functions (Figure 1A). Risk analyses based on such natural relationships are made practical by modern computational algorithms. Take penalized splines as an example. These are smooth non-parametric functions that, unlike forced steps and lines, allow substantial flexibility in estimating the dose-response curve. What governs these splines is the balance between goodness-of-fit and a penalty on excessive wiggliness. In other words, the relationship is smoothed without idealized assumptions about its shape, while the penalty guards against under- or over-fitting the data (4).
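As a rough illustration of the idea, the sketch below fits a penalized spline for a binary outcome using NumPy only. It is not the analysis behind the cited studies: the truncated-linear basis, the ridge penalty on the knot coefficients, the knot placement, the penalty values, and all function names are assumptions chosen for brevity, and the data are synthetic.

```python
import numpy as np


def spline_basis(x, knots):
    """Design matrix: intercept, linear term, and one hinge max(x - k, 0) per knot."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    hinges = np.maximum(x[:, None] - knots[None, :], 0.0)
    return np.column_stack([np.ones_like(x), x, hinges])


def fit_penalized_logistic(x, y, knots, lam, n_iter=50):
    """Penalized logistic regression via iteratively reweighted least squares.

    Only the hinge coefficients are penalized, so a large `lam` shrinks the
    curve toward a straight line on the log-odds scale.
    """
    X = spline_basis(x, knots)
    y = np.asarray(y, dtype=float)
    penalty = np.diag([0.0, 0.0] + [1.0] * len(knots))
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)    # clipped for numerical safety
        p = 1.0 / (1.0 + np.exp(-eta))
        w = np.clip(p * (1.0 - p), 1e-6, None)  # IRLS weights
        z = eta + (y - p) / w                   # working response
        beta = np.linalg.solve(X.T @ (w[:, None] * X) + lam * penalty,
                               X.T @ (w * z))
    return beta


def predict_risk(x, knots, beta):
    """Predicted probability of the outcome at each value of x."""
    eta = spline_basis(x, knots) @ beta
    return 1.0 / (1.0 + np.exp(-eta))


# Synthetic data only: a made-up non-linear risk curve over FBG (mmol/L).
rng = np.random.default_rng(0)
fbg = rng.uniform(4.0, 12.0, size=500)
true_logit = -1.0 + 1.5 * np.tanh(fbg - 6.0)
outcome = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

knots = np.quantile(fbg, np.linspace(0.1, 0.9, 8))
beta_flexible = fit_penalized_logistic(fbg, outcome, knots, lam=1.0)
beta_stiff = fit_penalized_logistic(fbg, outcome, knots, lam=1e8)

grid = np.linspace(4.0, 12.0, 9)
print(np.round(predict_risk(grid, knots, beta_flexible), 2))
print(np.round(predict_risk(grid, knots, beta_stiff), 2))
```

With the small penalty the fitted curve can bend where the data bend, whereas the very large penalty effectively recovers the ordinary linear-logit model. In practice, an established implementation in a statistical package would be preferable to hand-written code.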
To illustrate, let us take fasting blood glucose (FBG) as an indicator for in-hospital complications among COVID-19 patients. Creating three diagnostic categories of FBG [<6.1 mmol/L (reference), 6.1-6.9 mmol/L, and ≥7.0 mmol/L] demonstrated that the odds ratios (OR) of developing 28-day in-hospital complications for the higher categories were 3.99 and 2.61, respectively (5). However, it remains unclear how much risk is associated with increasing FBG within the range of each group, and whether the patients within each group have a sufficiently homogeneous risk. Applying splines to the glucose variable suggested that even small changes in FBG within normal ranges can significantly increase the risk of severe illness (6). This should not be overlooked clinically, and it warrants recommendations for strict monitoring of FBG upon admission.
Unexpectedly, this "unconventional" type of risk analysis has also brought to light an important and uncharted scientific question: why and how can glucose influence the outcome of COVID-19 even within normal ranges? At this stage we can only speculate as to what the answer might be, but we hope to inspire further research into this subject. In this spirit, we argue that there are potentially two independent mechanisms through which glucose can influence COVID-19 outcomes. First, at high levels of glucose (in the diabetic range), a state of low-grade chronic inflammation disturbs homeostatic glucose regulation and insulin sensitivity. This could in turn disrupt the normal immune response by weakening T-cell function and add to the risk of hyperinflammation and cytokine storm syndrome, which is associated with worse COVID-19 outcomes (7). Second, increases in FBG, even within normal ranges, could affect COVID-19 outcomes by enhancing aerobic glycolysis in SARS-CoV-2-infected monocytes, which in turn facilitates viral replication and infection, resulting in more severe outcomes (8).
The inflammation and glycolysis mechanisms are likely to be affected by different levels of FBG, with the latter being sensitive to lower levels (Figure 1B).
The novelty presented here is the application of well-known tools that are seldom applied in COVID-19 epidemiology because, in many cases, researchers opt for conventional and clinically straightforward approaches such as linear, dichotomous, and categorical modeling. While this was acceptable for some time because of the computational complexity of smoothing methods, these methods can now be easily implemented with modern computers and statistical software. We argue that they ought to be used.
Although utilizing smoothing functions sounds reasonable in our case, we must always exercise caution with smaller sample sizes. In addition, relying on cross-validation to determine penalty terms for penalized splines is computationally intensive. For example, leave-one-out cross-validation leaves one observation out at a time, fits the model on the remaining training data, tests it on the held-out data point, and repeats this for every observation. An alternative approach to specifying penalized splines is Restricted Maximum Likelihood, which is a likelihood-based approach. Furthermore, interpretation of coefficients is not straightforward: improving the fit of the dose-response relationship comes at the expense of easy interpretation.
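The cost of leave-one-out cross-validation can be seen in a minimal sketch, again on synthetic data and with NumPy only. The spline basis, the candidate penalty grid, the use of log loss as the held-out score, and all names are assumptions made for illustration; the point is simply that every candidate penalty requires one model fit per observation.

```python
import numpy as np


def basis(x, knots):
    """Intercept, linear term, and one hinge max(x - k, 0) per knot."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x,
                            np.maximum(x[:, None] - knots[None, :], 0.0)])


def fit(X, y, lam, n_iter=25):
    """Penalized logistic fit by IRLS; only hinge coefficients are penalized."""
    penalty = np.diag([0.0, 0.0] + [1.0] * (X.shape[1] - 2))
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-eta))
        w = np.clip(p * (1.0 - p), 1e-6, None)
        z = eta + (y - p) / w
        beta = np.linalg.solve(X.T @ (w[:, None] * X) + lam * penalty,
                               X.T @ (w * z))
    return beta


def loocv_log_loss(X, y, lam):
    """Mean held-out log loss: one refit per observation."""
    n = len(y)
    losses = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                 # leave observation i out
        beta = fit(X[keep], y[keep], lam)        # fit on the remaining data
        eta_i = np.clip(X[i] @ beta, -30.0, 30.0)
        p_i = np.clip(1.0 / (1.0 + np.exp(-eta_i)), 1e-12, 1.0 - 1e-12)
        losses[i] = -(y[i] * np.log(p_i) + (1.0 - y[i]) * np.log(1.0 - p_i))
    return losses.mean()


# Synthetic data only: a made-up risk curve over FBG (mmol/L).
rng = np.random.default_rng(1)
fbg = rng.uniform(4.0, 12.0, size=120)
prob = 1.0 / (1.0 + np.exp(-(-1.0 + 1.5 * np.tanh(fbg - 6.0))))
y = rng.binomial(1, prob).astype(float)

knots = np.quantile(fbg, [0.2, 0.4, 0.6, 0.8])
X = basis(fbg, knots)

lam_grid = [0.1, 1.0, 10.0, 100.0]               # hypothetical candidates
scores = {lam: loocv_log_loss(X, y, lam) for lam in lam_grid}
best_lam = min(scores, key=scores.get)
n_fits = len(lam_grid) * len(y)                  # total number of model fits

print(best_lam, n_fits)
```

Here 4 candidate penalties and 120 observations already require 480 fits, and the count grows with both the sample size and the size of the grid, which is one reason a likelihood-based criterion can be attractive.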
What we are advocating for in this opinion piece is simply attention to the nature of the dose-response relationship, which is usually obscured by simplifying assumptions such as forcing a straight line or a staircase shape. In fact, with the pandemic hitting us harder, we need to leverage all the tools in the toolbox to better understand the complex pathophysiology of clinical predictors (like FBG) during infection.
The bottom line is that non-linearities, steep slopes, plateaus, or any other shape should always be considered for continuous variables such as serum glucose or A1C, and perhaps also age, body mass index, and so on. In the age of big data, electronic health records, and artificial intelligence, the conventional practices may be too archaic. Once we correctly characterize these complex relationships, we can better capture the clinical course of the disease.