AUTHOR=Chaganti Shikha, Singh Vivek, Gent Alasdair Edward, Kamaleswaran Rishikesan, Kamen Ali TITLE=Evaluating the impact of common clinical confounders on performance of deep-learning-based sepsis risk assessment JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 8 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1452471 DOI=10.3389/frai.2025.1452471 ISSN=2624-8212 ABSTRACT=Introduction: Early identification of sepsis in the emergency department using machine learning remains a challenging problem, primarily due to the lack of a gold standard for sepsis diagnosis, the heterogeneity in clinical presentations, and the impact of confounding conditions. Methods: In this work, we present a deep-learning-based predictive model designed to enable early detection of patients at risk of developing sepsis, using data from the first 24 h of admission. The model is based on routinely performed blood tests, including CBC (Complete Blood Count), CMP (Comprehensive Metabolic Panel), and lipid panels, together with vital signs, age, and sex. To address the challenge of label uncertainty as a part of the training process, we explore two different definitions, namely, Sepsis-3 and Adult Sepsis Event. We analyze the advantages and limitations of each in the context of patient clinical parameters and comorbidities. We specifically examine how the quality of the ground truth label influences the performance of the deep learning system and evaluate the effect of a consensus-based approach that incorporates both definitions. 
We also evaluate the model's performance across sub-cohorts, including patients with confounding comorbidities (such as chronic kidney disease, liver disease, and coagulation disorders) and those with infections confirmed by billing codes. Results: Our results show that the consensus-based model identifies at-risk patients in the first 24 h with 83.7% sensitivity, 80% specificity, 36% PPV, 97% NPV, and an AUC of 0.9. Our cohort-wise analysis revealed a high PPV (77%) in infection-confirmed subgroups and a drop in specificity across cohorts with confounding comorbidities (47-70%). Discussion: This work highlights the limitations of retrospective sepsis definitions and underscores the need for tailored approaches in automated sepsis detection, particularly when dealing with patients with confounding comorbidities.
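
The gap between the reported PPV (36%) and NPV (97%) follows from how predictive values depend on the prevalence of the positive label. The sketch below is illustrative only and is not taken from the paper: it applies Bayes' rule to the rounded figures quoted in the abstract, and the function names `ppv_npv` and `implied_prevalence` are hypothetical. The abstract does not state the cohort prevalence, so the value computed here is an inference from rounded numbers and should be read as approximate.

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    All arguments are fractions in [0, 1]. Each term below is the
    expected share of the whole cohort falling in that confusion-matrix cell.
    """
    tp = sensitivity * prevalence                  # true positives
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false positives
    tn = specificity * (1.0 - prevalence)          # true negatives
    fn = (1.0 - sensitivity) * prevalence          # false negatives
    return tp / (tp + fp), tn / (tn + fn)


def implied_prevalence(sensitivity, specificity, ppv):
    """Solve the PPV formula for prevalence (assumes the inputs are consistent)."""
    fpr = 1.0 - specificity
    return ppv * fpr / (sensitivity * (1.0 - ppv) + ppv * fpr)


# Rounded values quoted in the abstract; the prevalence is NOT reported there.
sens, spec, reported_ppv = 0.837, 0.80, 0.36

prev = implied_prevalence(sens, spec, reported_ppv)
ppv, npv = ppv_npv(sens, spec, prev)
print(f"implied prevalence: {prev:.3f}")
print(f"PPV: {ppv:.3f}  NPV: {npv:.3f}")
```

Under these assumptions the implied prevalence is roughly 12%, and the NPV recomputed from it is about 0.97, which agrees with the reported figure. The same relationship suggests why PPV can be much higher in a subgroup where the positive label is more common, such as the infection-confirmed cohort, although the abstract does not give the subgroup sensitivity or specificity needed to check this directly.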