ORIGINAL RESEARCH article
Front. Pediatr.
Sec. Pediatric Infectious Diseases
Volume 13 - 2025 | doi: 10.3389/fped.2025.1608812
Development of a Host-Signature-Based Machine Learning Model to Diagnose Bacterial and Viral Infections in Febrile Children
Provisionally accepted
- 1 Institute of Laboratory Medicine of School of Medical Technology, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan, China
- 2 Shenzhen Yantian District Center for Disease Control and Prevention, Shenzhen, China
Background: Early aetiological diagnosis is critical for the management of febrile children with infectious illness, as it strongly influences the choice of appropriate medication and can affect the child’s risk of complications and outcome. New diagnostic strategies based on host genes have recently been developed and have achieved high accuracy and clinical practicality. In this study, through integrative bioinformatics analysis, we aimed to construct artificial neural network (ANN, multilayer perceptron) and random forest (RF) models based on host gene signatures to diagnose bacterial or viral (B/V) infection in febrile children.

Results: Transcriptome data from the whole blood of children were collected from a public database. Of these, 384 febrile young children (definite bacterial: n=135, definite viral: n=249) were used to construct the RF model. For the generalized RF model, 1042 patients with infections of various aetiologies were included, such as Staphylococcus aureus, pathogenic Escherichia coli, Salmonella, Shigella, adenovirus, HHV6, enterovirus, rhinovirus, human rotavirus, human norovirus, and influenza A pneumonia. Differentially expressed gene (DEG) analysis and weighted gene co-expression network analysis (WGCNA) identified 57 candidate genes shared between the 117 DEGs and the 264 module member genes. Subsequently, L1 regularization algorithms and variable importance analysis (multilayer perceptron) were used to reduce and rank the predictive features, and LCN2 (100.0%), IFI27 (84.4%), SLPI (63.2%), IFIT2 (44.6%) and PI3 (44.5%) were identified as the top predictors. Using the transformed value RefValue (i) of these five genes, the RF model achieved an AUC of 0.9917 in training and 0.9517 in testing for diagnosing B/V infection in children. The ANN model achieved an AUC of 0.9540 in testing.
Furthermore, a generalized RF model involving 1042 patients was developed to predict different aetiological types of samples, achieving an AUC of 0.9421 in training and 0.8968 in testing.

Conclusions: A five-gene host signature (IFIT2, SLPI, IFI27, LCN2, and PI3) was identified and used to construct an RF model that distinguishes B/V infection in febrile children with 85.3% accuracy, 95.1% sensitivity, and 80.0% specificity, and an ANN model that achieves 92.4% accuracy, 86.8% sensitivity, and 95% specificity.
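
The abstract reports AUC, accuracy, sensitivity, and specificity. As a reading aid, the sketch below shows how these four quantities are commonly computed for a binary classifier. It is not the authors' code: the labels and scores are invented toy values, the choice of "bacterial" as the positive class (label 1) and the 0.5 decision threshold are assumptions made for illustration, and the function names are hypothetical.

```python
"""Toy illustration of the diagnostic metrics quoted in the abstract."""
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (tp, fn, tn, fp); label 1 is the positive class, 0 the negative class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (true-positive rate) and specificity (true-negative rate)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = bacterial (assumed positive class), 0 = viral.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 on the predicted probability.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = diagnostic_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)

print(metrics)
print(round(auc, 4))
```

AUC is computed from the ranking of scores and does not depend on the threshold, whereas accuracy, sensitivity, and specificity all change when the threshold moves. This is why a model can have a high AUC and still show quite different sensitivity and specificity at a given operating point.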
Keywords: machine learning, bacteria, virus, febrile children, host gene signatures, diagnosis
Received: 09 Apr 2025; Accepted: 23 Jul 2025.
Copyright: © 2025 Bai, Gong, Cui, Zhang, Hong, Gao, Lin, Chen, Li, Huang, Zheng, Xu and Xiao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Jun-Fa Xu, Institute of Laboratory Medicine of School of Medical Technology, The First Dongguan Affiliated Hospital, Guangdong Medical University, Dongguan, China
Na Xiao, Shenzhen Yantian District Center for Disease Control and Prevention, Shenzhen, China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.