AUTHOR=Mc Cord-De Iaco Kimberly A., Gesualdo Francesco, Pandolfi Elisabetta, Croci Ileana, Tozzi Alberto Eugenio

TITLE=Machine learning clinical decision support systems for surveillance: a case study on pertussis and RSV in children

JOURNAL=Frontiers in Pediatrics

VOLUME=Volume 11 - 2023

YEAR=2023

URL=https://www.frontiersin.org/journals/pediatrics/articles/10.3389/fped.2023.1112074

DOI=10.3389/fped.2023.1112074

ISSN=2296-2360

ABSTRACT=We tested the accuracy of a machine learning (ML) algorithm based on signs and symptoms for the diagnosis of respiratory syncytial virus (RSV) infection or pertussis in the first year of life, to support clinical decisions and provide timely data for public health surveillance.
We used data from a retrospective case series of children in the first year of life investigated for acute respiratory infections in the emergency room from 2015 to 2020. We collected PCR laboratory test results confirming pertussis or RSV infection, clinical symptoms, and routine laboratory tests, and used these data to develop the algorithm. We used LightGBM to develop two sets of models for predicting pertussis and RSV: for each pathogen, one model was trained on the combination of clinical symptoms and routine laboratory tests, and one on symptoms only. All analyses were performed in Python 3.7.4, using the SHAP (Shapley values) visualization package to visualize predictors. Model performance was assessed through confusion matrices.
The models were developed on a dataset of 599 children. For pertussis, recall was 0.72 for the model combining clinical symptoms and routine laboratory tests and 0.74 for the model using clinical symptoms only. For RSV, recall was 0.68 with clinical symptoms and laboratory tests and 0.71 with clinical symptoms only. The F1 score was 0.72 for both pertussis models and, for RSV, 0.69 and 0.75.
ML models can support the diagnosis and surveillance of infectious diseases such as pertussis or RSV infection in children based on common symptoms and laboratory tests. ML clinical decision support systems (ML-CDSS) may in the future be developed within large networks to create accurate tools for clinical support and public health surveillance.
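
The abstract reports recall and F1 scores derived from confusion matrices. As a minimal sketch of how these two metrics follow from confusion-matrix counts, the Python snippet below may help. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
def recall_precision_f1(tp, fp, fn):
    """Return (recall, precision, F1) from confusion-matrix counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, precision, f1


# Hypothetical counts for illustration only (NOT the study's results).
tp, fp, fn = 30, 10, 10
recall, precision, f1 = recall_precision_f1(tp, fp, fn)
print(f"recall={recall:.2f} precision={precision:.2f} F1={f1:.2f}")
```

Recall is the share of confirmed cases that the model flags, which is why it is a natural headline metric for surveillance; F1 balances recall against precision, so it drops when a model gains recall only by raising more false alarms.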