Establishment and validation of a clinical model for predicting diabetic ketosis in patients with type 2 diabetes mellitus

Background: Diabetic ketosis (DK) is one of the leading causes of hospitalization among patients with diabetes. Failure to recognize DK symptoms may lead to complications, such as diabetic ketoacidosis, severe neurological morbidity, and death.
Purpose: This study aimed to develop and validate a model to predict DK in patients with type 2 diabetes mellitus (T2DM) based on both clinical and biochemical characteristics.
Methods: A cross-sectional study was conducted by evaluating the records of 3,126 patients with T2DM, with or without DK, at The Affiliated Hospital of Qingdao University from January 2015 to May 2022. The patients were divided randomly into the model development (70%) or validation (30%) cohorts. A risk prediction model was constructed using a stepwise logistic regression analysis to assess the risk of DK in the model development cohort. This model was then validated using a second cohort of patients.
Results: The stepwise logistic regression analysis showed that the independent risk factors for DK in patients with T2DM were the 2-h postprandial C-peptide (2hCP) level, age, free fatty acids (FFA), and HbA1c. Based on these factors, we constructed a risk prediction model. The final risk prediction model was L = 0.472a - 0.202b - 0.078c + 0.005d - 4.299, where a = HbA1c level, b = 2hCP, c = age, and d = FFA. The area under the curve (AUC) was 0.917 (95% confidence interval [CI], 0.899–0.934; p<0.001). The discriminatory ability of the model was equivalent in the validation cohort (AUC, 0.922; 95% CI, 0.898–0.946; p<0.001).
Conclusion: This study identified independent risk factors for DK in patients with T2DM and constructed a prediction model based on these factors. The present findings provide an easy-to-use, easily interpretable, and accessible clinical tool for predicting DK in patients with T2DM.


Introduction
Diabetes mellitus (DM) and its related complications are regarded as a major global health threat. The International Diabetes Federation predicts that the global diabetes prevalence in individuals aged 20-79 years will rise to 12.2% (783.2 million) by 2045 (1). Type 2 DM (T2DM) accounts for over 90% of DM cases (2). Diabetic ketosis (DK) is an acute complication of T2DM, but is preventable (3). However, despite the development of new therapeutic drugs and improvements in diabetic care, DK remains one of the leading causes of hospitalization among patients with diabetes (4, 5). Failure to recognize the symptoms of DK can lead to complications, including diabetic ketoacidosis (DKA), severe neurological morbidity, and death (6,7). A recent study analyzed the National Inpatient Sample database for all DKA admissions in the USA between 2003 and 2014 and found a significant increase (56%) in the inflation-adjusted hospital charges for DKA admissions during that timeframe (4). Early diagnosis and effective management of DK are essential for delaying disease progression, reducing its economic impact, and lowering the risk of complications (8).
DK can develop in patients with diabetes who experience an increase in the level of ketone bodies in the blood, which is a consequence of both the increased ketone production in the liver and reduced urinary clearance of ketones (9). There is growing evidence that a ketogenic diet is therapeutically beneficial for neurologic diseases (10,11). However, when the concentration of ketones becomes too high, they interfere with normal cellular function. Consequently, hyperketonemia with or without acidosis is an acute severe complication of poorly controlled or newly diagnosed DM. Thus, ketones play an important pathophysiological role in the development of both diabetes and diabetic complications (12).
While the importance of early detection and prevention of DK is known, there have been no population-based studies aiming to develop a clinical model for predicting DK in patients with T2DM. A predictive model for a DK diagnosis, based on clinical information, could improve the effectiveness of T2DM management if the risk for DK is known. Thus, we designed a cross-sectional study to assess patients with T2DM in our hospital, with the aim of developing a DK prediction model for patients with T2DM.

Methods
This retrospective cross-sectional study was approved by the Ethics Committee of The Affiliated Hospital of Qingdao University (Approval No. QYFYWZLL26666). This study was conducted and reported in accordance with the guidelines of the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) initiative and the Declaration of Helsinki.

Study design and patient selection
Data were collected from hospitalized patients with T2DM at The Affiliated Hospital of Qingdao University (Shandong, China) from January 1, 2015, to May 1, 2022. The selected patients (3,061 in total) met the American Diabetes Association (ADA) 2014 criteria for T2DM. Of these patients, 309 had ketosis and 2,752 did not. The diagnosis of ketosis was based on moderate to high levels of urine ketones, in accordance with the ADA 2009 guideline (13). A computer system was used to randomly allocate 70% of the patients to a model development cohort and the remainder to a validation cohort.
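The random allocation described above can be illustrated with a minimal sketch. The text does not describe the software or procedure that was actually used, so the function name `split_cohort`, the seed, and the integer patient IDs are all hypothetical. The reported cohort sizes (2,156 and 905) are close to, but not exactly, a 70/30 split of 3,061, which suggests the actual procedure differed in detail from this fixed-fraction shuffle.

```python
import random

def split_cohort(patient_ids, dev_fraction=0.7, seed=2022):
    """Randomly split patient IDs into development and validation cohorts.

    Hypothetical helper: the seed and the fixed-fraction shuffle are
    assumptions, not the procedure reported in the study.
    """
    rng = random.Random(seed)      # fixed seed makes the split reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_dev = round(len(shuffled) * dev_fraction)
    return shuffled[:n_dev], shuffled[n_dev:]

# Placeholder IDs standing in for the 3,061 included patients.
dev, val = split_cohort(range(3061))
print(len(dev), len(val))
```

Every patient lands in exactly one cohort, and fixing the seed lets the same split be regenerated later.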

Outcomes and variables
The primary outcome of the study was the incidence of DK, which was defined as a severe, acute complication of DM. The patients' medical files provided demographic and anthropometric parameters, including age, sex, body mass index (BMI), DM duration, and waist circumference (WC). The following clinical and laboratory values were also available: systolic and diastolic blood pressures and the levels of low-density lipoprotein cholesterol (LDL-C), total cholesterol (TC), triglycerides (TG), free fatty acids (FFA), glycosylated hemoglobin (HbA1c), alanine aminotransferase (ALT), aspartate aminotransferase (AST), serum creatinine (sCr), serum uric acid (sUA), fasting plasma glucose (FPG), fasting plasma insulin, 120-min postprandial insulin (Ins120), 2-h postprandial C-peptide (2hCP), and C-peptide (CP).

Statistical methods
The baseline characteristics of the development and validation cohorts were analyzed. Normally distributed data are presented as the mean and standard deviation; nonnormally distributed data are presented as the median and interquartile range. Comparisons between the DK and non-DK (NDK) groups were performed using parametric or nonparametric tests (i.e., chi-square, Student's t-test, and Mann-Whitney U test), as appropriate.
The most relevant patient variables from the univariable analysis were used in the final multivariable logistic regression model. The significant variables from the first analysis, as well as those that could be of clinical interest, were included in the multivariate logistic regression analysis. Variables with collinearity or that were a linear combination of other variables were eliminated.
The regression coefficients were used to construct a DK prediction model, in which the dependent variable was the presence or absence of DK. The calibration was assessed using the Hosmer-Lemeshow test, and model discrimination was assessed by analyzing the area under the curve (AUC) of the receiver operating characteristic (ROC) curve for each imputed dataset. The final model was regarded as having optimal clinical significance. Internal validation was used to evaluate the discrimination of the model (14). All statistical analyses were performed using SPSS Version 20.0 (IBM Corp., Armonk, NY, USA).
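As an illustration of the discrimination measure used here, the AUC can be computed as the probability that a randomly chosen patient with DK receives a higher score than a randomly chosen patient without DK. The sketch below is a plain pairwise implementation on made-up labels and scores; it is not the SPSS procedure used in the study, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = DK, 0 = no DK; scores are model outputs.
labels = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.5, 0.1, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of a few thousand; a rank-based formula gives the same value more efficiently.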

Results

Study population
This study included 3,061 hospitalized patients who were treated at The Affiliated Hospital of Qingdao University between January 1, 2015, and May 1, 2022 (Figure 1). A computer system randomly assigned 2,156 patients to the development cohort and the remainder (n = 905) to the validation cohort; the DK prevalence was 10.1% in each cohort. Table 1 shows the baseline characteristics for the DK and NDK datasets of patients with T2DM.

Prediction model
The strongest predictors of DK incidence in patients with T2DM were age, DM duration, and CP, 2hCP, HbA1c, FPG, LDL-C, TC, FFA, TG, ALT, and AST levels (Table 2).

FIGURE 1
Flowchart of the inclusion and exclusion of participants.

Model performance
The Hosmer-Lemeshow test, which is a goodness-of-fit test for logistic regression, showed no significant lack of fit for all imputed datasets (p = 0.833), indicating good calibration. The calibration curve for the predicted and observed DK is shown in Figure 3.
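The reported p-value can be checked against the reported Hosmer-Lemeshow chi-square of 4.255. The sketch below assumes the conventional setup of 10 risk groups (deciles of predicted risk), which gives 8 degrees of freedom; the degrees of freedom are not stated in the text, so this is an assumption. For an even number of degrees of freedom the chi-square upper-tail probability has a closed form, so no statistics library is needed. The function name `chi2_sf_even_df` is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k           # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total

# Assumed df = 10 groups - 2 = 8.
p_value = chi2_sf_even_df(4.255, df=8)
print(round(p_value, 3))  # 0.833
```

Under this assumption the statistic and the p-value agree, and a p-value well above 0.05 means the test found no evidence of miscalibration.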

Internal validation
Using the validation cohort dataset, an ROC curve for the DK risk prediction model was generated and is shown in Figure 4. The AUC was 0.922 (95% CI: 0.898-0.946) in this cohort. Given that the aim of developing the model was the early detection of patients with T2DM who are at high risk for DK, a value of -2.779 was selected as the optimal cutoff risk score, which had a sensitivity of 0.901 and a specificity of 0.767.
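To show how the risk score and cutoff would be applied to an individual patient, the following sketch evaluates the published equation L = 0.472a - 0.202b - 0.078c + 0.005d - 4.299 (a = HbA1c, b = 2hCP, c = age, d = FFA) and compares it with the cutoff of -2.779. Several points are assumptions rather than statements from the text: the measurement units of each variable are not given, the patient values are invented, scores at or above the cutoff are treated as high risk, and L is interpreted as the log-odds of DK so that it can be converted to a probability.

```python
import math

# Coefficients and cutoff as reported in the article.
INTERCEPT = -4.299
COEF_HBA1C = 0.472
COEF_2HCP = -0.202
COEF_AGE = -0.078
COEF_FFA = 0.005
CUTOFF = -2.779

def risk_score(hba1c, cp2h, age, ffa):
    """Linear risk score L; input units are assumed, not stated in the text."""
    return (COEF_HBA1C * hba1c + COEF_2HCP * cp2h
            + COEF_AGE * age + COEF_FFA * ffa + INTERCEPT)

def predicted_probability(score):
    """Logistic transform, assuming L is the log-odds of DK."""
    return 1.0 / (1.0 + math.exp(-score))

def is_high_risk(score, cutoff=CUTOFF):
    """Assumes scores at or above the cutoff are flagged as high risk."""
    return score >= cutoff

# Two invented patients (values are illustrative only).
younger = risk_score(hba1c=11.0, cp2h=1.5, age=40, ffa=600)
older = risk_score(hba1c=7.0, cp2h=5.0, age=65, ffa=300)
for score in (younger, older):
    print(round(score, 3), is_high_risk(score))
```

Under the log-odds assumption, the cutoff of -2.779 corresponds to a predicted probability of roughly 6%, which is below the 10.1% prevalence in the cohorts and is consistent with a threshold that favors sensitivity (0.901) over specificity (0.767).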

Discussion
This study developed the first predictive model for DK in patients with T2DM. The final predictive model included four variables: age and the levels of 2hCP, FFA, and HbA1c. At present, there are several models for predicting T2DM complications such as nephropathy and diabetic retinopathy (15,16). To the best of our knowledge, this is the first study to suggest a model for predicting the diagnosis of T2DM with ketosis (T2DK).
Early studies identified ketosis as a complication of type 1 DM (17). However, evidence shows that ketosis occurs more frequently in patients with T2DM. Pancreatic islet cell dysfunction is thought to cause ketosis; individuals who are more susceptible to ketosis tend to have poorer pancreatic islet function (18). Changes in CP levels are pathophysiologically [...] (19). The interpretation of a CP test can assist in predicting ketosis and inform recommendations for patient treatment options. In addition, studies have confirmed a predominance of postprandial glycemia in the overall glycemic control of patients with well-controlled T2DM who are managed using oral hypoglycemic agents or basal insulin (20). This may explain why 2hCP was one of the variables identified for inclusion in our final model. In recent years, HbA1c levels have received increasing attention in the diagnosis of diabetes. Hallberg et al. (21) showed that HbA1c levels are an important indicator in the management of diabetes. The HbA1c level, rather than the fasting glucose level, is a more stable long-term indicator of blood glucose levels over a period of 2-3 months. It also correlates well with the risk of long-term diabetes complications such as acute illness and infection. The HbA1c level is now considered to be the best method for predicting glycemia-associated risks for DM complications (22). Ketosis in T2DM is often caused by poor control of blood glucose levels; based on this, we incorporated the HbA1c level into our model. For decades, FFA has been recognized as an important risk factor for insulin resistance, defective insulin secretion, glucose intolerance, and T2DM (23,24). In our statistical analysis, T2DK patients had increased FFA levels and decreased insulin secretion, which is consistent with previous studies (25, 26).

The relationship with the DK risk in the univariate logistic regression model is expressed as the odds ratio (OR) and its 95% confidence interval (CI).
The relationship with the DK risk in the multivariate logistic regression model is expressed as the odds ratio (OR) and its 95% confidence interval (CI).

FIGURE 3
The performance of the DK risk prediction model for patients with T2DM in the validation cohort. Receiver operating characteristic curve for the DK risk prediction model. The AUC and its 95% CI were 0.922 (0.898-0.946).

Several studies have reported higher blood ketone levels in younger patients with T2DM, a finding that is consistent with the results of our study (27,28). Furthermore, there is accumulating evidence showing that relatively young patients with T2DM have a more aggressive disease phenotype, which often leads to premature development of complications. This can have adverse effects on a patient's quality of life and unfavorable effects on long-term outcomes (29,30). In this study, we found that patients in the DK group were younger, on average, than those in the NDK group (p <0.001); this is consistent with previously reported results. We decided to include age in our model, instead of the duration for which patients had diabetes, in consideration of the high rate of undiagnosed diabetes.
Ketosis in patients with T2DM is often overlooked because its symptoms can be atypical (31). However, we found that age and the levels of 2hCP and HbA1c were closely associated with and predictive of ketosis in patients with T2DM. An advantage of our predictive model is that the development and validation datasets are from the same database, which avoids bias and increases predictive efficiency. For the three indicators included in the model, none of the patients had missing values.
In addition, we found that, compared with NDK patients, patients with DK were younger, more often male, and had poorer pancreatic islet function; this is consistent with the results of other studies (28). Previous studies have shown that the ketogenic function is impaired in patients with T2DM, which is related to high serum insulin levels (32). High insulin levels directly inhibit the ketogenic function of the liver; they also inhibit the secretion of growth hormone and glucagon, both of which may contribute to the inhibition of ketone production. Consistent with this information, our logistic regression analysis revealed low insulin as a risk factor for ketosis in patients with T2DM.
Our study had inherent limitations due to the cross-sectional design. It was not possible to identify the cause of ketosis in patients with T2DM. In addition, all patients in this study were enrolled at one center, which may limit the generalizability of the findings. Finally, some potential risk factors for T2DK may have been overlooked or absent altogether in the patient population. Taken together, a multi-center cohort study to identify the risk factors of T2DK is necessary. It would be helpful to verify the findings of this study using a cohort of patients who have long-term follow-up data with repeated measurements of ketone bodies; more detailed data collection would also have improved the quality of this study.

Calibration plots show the relationship between the predicted probabilities based on the prediction model and the actual values in the development cohort. The x-axis represents deciles of predicted risk, and the y-axis shows the predicted and actual prevalence of DK. The H-L chi-square, which measures calibration, was 4.255 (P = 0.833).

Conclusions
The rapidly increasing prevalence of T2DM has resulted in its identification as an international health concern. T2DM has a major effect on global morbidity and premature mortality, and it is an increasing economic burden due to chronic complications (33). This study found that age and the levels of 2hCP, FFA, and HbA1c were significantly associated with T2DK. Moreover, this study developed a predictive model with good discrimination and calibration that shows promise for application in clinical practice.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
The studies involving human participants were reviewed and approved by the research ethics committee of the Affiliated Hospital of Qingdao University (Approval No. QYFYWZLL26666).