AUTHOR=Xiao Yingjun, Chen Xiling, Ou Xiping, Dong Zheqing, Zhang Xiaoyan, Liang Wei, Nan Xiaojing, Xu Chan, Lai Xiaobo, Xu Peng, Fang Kui
TITLE=Machine learning-based high-specificity diagnostic model for Talaromyces marneffei infection in febrile patients using routine clinical laboratory data
JOURNAL=Frontiers in Microbiology
VOLUME=Volume 16 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2025.1654918
DOI=10.3389/fmicb.2025.1654918
ISSN=1664-302X
ABSTRACT=Objective: This study developed and validated a machine learning (ML)-based predictive model that uses routine clinical laboratory data to screen febrile patients for Talaromyces marneffei infection. It also aimed to provide reference information for feature selection in the subsequent establishment of a more precise early warning model. Methods: This retrospective study enrolled febrile patients who visited Zhejiang Provincial People’s Hospital and the Third Affiliated Hospital of Zhejiang Chinese Medical University from January 2021 to April 2025. Patient data, including sex, age, and laboratory test results, were collected. The most informative features were extracted from the dataset using sparse partial least squares discriminant analysis. Six classic machine learning algorithms were trained and evaluated with 1,000 bootstrap resamplings to develop the optimal predictive model. Finally, the model was validated on an independent clinical validation dataset. Results: The training dataset comprised 485 febrile patients (141 with T. marneffei infection). The clinical validation dataset comprised 1,953 febrile patients (13 with T. marneffei infection). The random forest model demonstrated the highest performance in classifying T. marneffei-infected patients, with an area under the receiver operating characteristic curve of 0.987 in out-of-bag validation and 0.989 in clinical validation. The model also exhibited good specificity (0.999) for T. marneffei infection and good sensitivity (0.845) in predicting bacteraemia in clinical validation. Conclusion: A random forest model can effectively utilize routine clinical laboratory data to predict T. marneffei infection and bacteraemia in febrile patients, offering a promising early screening tool for individuals at high risk for T. marneffei infection.
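
The abstract mentions bootstrap resampling with out-of-bag validation but gives no implementation details, so the following is only a minimal sketch of how such an evaluation loop might look. Everything in it is an assumption for illustration: the data are synthetic (only the class sizes, 141 and 344, mirror the training set described above), the single-feature centroid scorer is a hypothetical stand-in for the random forest, and all function names are invented here. It does not reproduce the authors' pipeline or their reported results.

```python
import random
import statistics


def bootstrap_split(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag) index lists."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return in_bag, out_of_bag


def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # undefined when one class is absent
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(preds, labels):
    """Return (sensitivity, specificity); assumes both classes occur in labels."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def fit_centroid_scorer(xs, ys):
    """Hypothetical stand-in model: signed distance from the midpoint of class means."""
    mean_pos = statistics.fmean(x for x, y in zip(xs, ys) if y == 1)
    mean_neg = statistics.fmean(x for x, y in zip(xs, ys) if y == 0)
    mid = (mean_pos + mean_neg) / 2
    sign = 1.0 if mean_pos >= mean_neg else -1.0
    return lambda x: sign * (x - mid)


def oob_auc(xs, ys, n_boot=1000, seed=0):
    """Fit on each bootstrap sample and score the out-of-bag cases; return the AUCs."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_boot):
        in_bag, oob = bootstrap_split(len(xs), rng)
        train_y = [ys[i] for i in in_bag]
        if len(set(train_y)) < 2:
            continue  # cannot fit the scorer without both classes
        score = fit_centroid_scorer([xs[i] for i in in_bag], train_y)
        value = auc([score(xs[i]) for i in oob], [ys[i] for i in oob])
        if value is not None:
            aucs.append(value)
    return aucs


def make_synthetic(n_pos, n_neg, seed=0):
    """Synthetic one-feature data: positives shifted by 1.5 standard deviations."""
    rng = random.Random(seed)
    xs = [rng.gauss(1.5, 1.0) for _ in range(n_pos)]
    xs += [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
    ys = [1] * n_pos + [0] * n_neg
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_synthetic(141, 344)  # class sizes only; values are synthetic
    aucs = oob_auc(xs, ys, n_boot=200)
    print(f"mean out-of-bag AUC over {len(aucs)} resamples: {statistics.fmean(aucs):.3f}")
```

Scoring only the cases left out of each resample keeps every estimate on data the fitted model has not seen, which is the usual motivation for out-of-bag validation. In practice the stand-in scorer would be replaced by a library implementation of the chosen algorithm.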