AUTHOR=Wang Ting-Qiang, Zhuo Ying, Lv Chun-E, Shi Jing, Yao Ling-Hui, Zhang Shi-Yan, Shi Jinbao
TITLE=Prediction of bacteremia using routine hematological and metabolic parameters based on logistic regression and random forest models
JOURNAL=Frontiers in Cellular and Infection Microbiology
VOLUME=Volume 15 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/cellular-and-infection-microbiology/articles/10.3389/fcimb.2025.1605485
DOI=10.3389/fcimb.2025.1605485
ISSN=2235-2988
ABSTRACT=Background: This study aimed to evaluate the predictive utility of routine hematological, inflammatory, and metabolic markers for bacteremia and to compare the classification performance of logistic regression and random forest models. Methods: A retrospective study was conducted on 287 inpatients who underwent blood culture testing at Fuding Hospital, Fujian University of Traditional Chinese Medicine, between March and August 2024. Patients were divided into bacteremia (n = 137) and non-bacteremia (n = 150) groups based on blood culture results. Hematological indices, inflammatory markers (e.g., C-reactive protein (CRP), procalcitonin (PCT)), metabolic indices (e.g., glucose, cholesterol), and nutritional markers (e.g., albumin) were analyzed. Univariate and multivariate binary logistic regression analyses were used to identify independent risk factors. Logistic regression and random forest models were developed using 33 features with a 70:30 train-test split and evaluated using receiver operating characteristic (ROC) curves, confusion matrices, and standard classification metrics. Results: Hemoglobin, cholesterol, and albumin levels were significantly lower in the bacteremia group, while platelet count, CRP, PCT, glucose, and triglycerides were significantly elevated (all p < 0.05).
Logistic regression identified platelet count (odds ratio (OR) = 1.003, 95% confidence interval (CI): 1.001–1.006), PCT (OR = 1.032, 95% CI: 1.004–1.060), triglycerides (OR = 1.740, 95% CI: 1.052–2.879), and low cholesterol (OR = 0.523, 95% CI: 0.383–0.714) as independent risk factors. The area under the ROC curve (AUC) was 0.75 for the random forest model and 0.74 for logistic regression, with recall rates of 0.69 and 0.60, respectively. Conclusion: Routine laboratory markers integrated into machine learning models demonstrated potential for early bacteremia prediction. Random forest showed higher sensitivity than logistic regression, suggesting its potential utility as a clinical screening tool.
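
The reported odds ratios are easier to read once rescaled to a larger change in each marker. The sketch below is illustrative only and is not taken from the study. It assumes each OR is expressed per one unit of the measured variable (the abstract does not state the units), and the increments chosen (100 units for platelet count, 10 units for PCT) are hypothetical. The helper name `scale_odds_ratio` is likewise an assumption introduced for this example. It relies on the standard logistic regression identity OR = exp(beta), so a change of `delta` units multiplies the odds by OR raised to the power `delta`.

```python
import math

# Per-unit odds ratios as reported in the abstract.
reported_or = {
    "platelet_count": 1.003,
    "pct": 1.032,
    "triglycerides": 1.740,
    "cholesterol": 0.523,
}

# Hypothetical increments chosen for illustration; units are assumed,
# since the abstract does not state them.
increments = {
    "platelet_count": 100.0,
    "pct": 10.0,
    "triglycerides": 1.0,
    "cholesterol": 1.0,
}


def scale_odds_ratio(or_per_unit: float, delta: float) -> float:
    """Return the odds ratio for a change of `delta` units.

    In logistic regression OR = exp(beta), so a change of `delta` units
    multiplies the odds by exp(beta * delta), which equals OR ** delta.
    """
    beta = math.log(or_per_unit)
    return math.exp(beta * delta)


for name, or_unit in reported_or.items():
    delta = increments[name]
    scaled = scale_odds_ratio(or_unit, delta)
    print(f"{name}: OR per {delta:g} unit(s) = {scaled:.3f}")
```

Under these assumptions, a per-unit OR close to 1 (such as 1.003 for platelet count) can still correspond to a noticeable change in odds over a clinically plausible range, while an OR below 1 (cholesterol) means that higher values are associated with lower odds of bacteremia.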