ORIGINAL RESEARCH article

Front. Big Data

Sec. Data Mining and Management

A Reinforcement Learning-Guided Interpretable Method for Postoperative Sepsis Prediction with Hilbert-Schmidt Independence Criterion

  • 1. Chinese Academy of Sciences Chongqing Institute of Green and Intelligent Technology, Chongqing, China

  • 2. Fuling District Linshi Community Health Service Center, Chongqing, China

  • 3. Centre for Medical Big Data and Artificial Intelligence, The First Affiliated Hospital (Southwest Hospital) of Army Medical University (Third Military Medical University), Chongqing, China

  • 4. Chongqing University Fuling Hospital, Chongqing, China

The final, formatted version of the article will be published soon.

Abstract

Sepsis is a major cause of postoperative morbidity and mortality. Early risk stratification from perioperative electronic health records (EHR) is a representative large-scale, high-dimensional data processing problem that requires models to be accurate, efficient, and clinically interpretable. However, many existing sepsis prediction methods operate as black boxes and rely on extensive temporal monitoring streams, which increases feature dimensionality and computation while limiting transparency. We propose a reinforcement learning–guided, interpretable feature engineering framework for postoperative sepsis prediction that targets scalable learning on heterogeneous perioperative data. Within an Actor–Critic formulation, feature selection is treated as an action: an Actor network produces a stochastic feature mask over preoperative static variables and intraoperative statistical summaries, while a Critic network performs downstream prediction using a self-attention–based classifier. To benchmark and stabilize learning, we introduce an auxiliary baseline model that incorporates intraoperative temporal signals extracted by a temporal convolutional network (TCN) and is regularized using the Hilbert–Schmidt Independence Criterion (HSIC) to encourage non-redundant representations between the statistical and temporal feature views. The Actor is optimized to achieve predictive performance comparable to the baseline while using a reduced feature set, improving computational efficiency and supporting instance-level interpretability. Experiments on a real-world surgical cohort from Southwest Hospital (2014–2018) demonstrate that the proposed framework attains performance comparable to or better than competitive machine learning baselines while selecting fewer input features. On this dataset, our method achieved perfect scores of 1.00 for F1-score, Sensitivity, and Specificity. Finally, integrated gradients are used for post hoc explanation, providing clinically meaningful attribution and enhancing trust in large-scale healthcare deployment.
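
As an illustration of the HSIC regularizer mentioned in the abstract, the following minimal sketch computes the standard biased empirical HSIC estimate, tr(KHLH)/(n-1)^2, between two feature views. The abstract does not specify the kernel, bandwidth, or estimator the authors used, so the Gaussian kernel, the `sigma` value, the function names, and the synthetic data are all assumptions made here for illustration. It is written in plain Python to stay dependency-free, not for speed.

```python
import math
import random


def rbf_kernel(rows, sigma=1.0):
    """Gaussian (RBF) kernel matrix for a list of equal-length vectors."""
    n = len(rows)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return gram


def center(gram):
    """Double-center a symmetric kernel matrix: H K H, H = I - (1/n) 11^T."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]
    total_mean = sum(row_mean) / n
    # The matrix is symmetric, so column means equal row means.
    return [
        [gram[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(x_rows, y_rows, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x_rows)
    kc = center(rbf_kernel(x_rows, sigma))
    lc = center(rbf_kernel(y_rows, sigma))
    # trace(Kc @ Lc) equals the elementwise sum because both are symmetric.
    total = sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n))
    return total / (n - 1) ** 2


# Synthetic stand-ins for the two views (hypothetical data, not the cohort).
random.seed(0)
stat_view = [[random.gauss(0.0, 1.0)] for _ in range(100)]
# A redundant "temporal" view: a noisy copy of the statistical view.
redundant_view = [[v[0] + 0.1 * random.gauss(0.0, 1.0)] for v in stat_view]
# A non-redundant view: drawn independently of the statistical view.
independent_view = [[random.gauss(0.0, 1.0)] for _ in range(100)]

hsic_redundant = hsic(stat_view, redundant_view)
hsic_independent = hsic(stat_view, independent_view)
print("HSIC (redundant views):  ", hsic_redundant)
print("HSIC (independent views):", hsic_independent)
```

A larger value indicates stronger statistical dependence between the two views, so adding a term of this kind to a training loss penalizes redundant representations. How the authors weight or combine it with the prediction loss is not stated in the abstract.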

Keywords

Feature engineering, Hilbert–Schmidt independence criterion, Large-scale medical data, Reinforcement learning, Self-attention, Sepsis, Temporal convolutional networks

Received

14 February 2026

Accepted

09 March 2026

Copyright

© 2026 Zhong, Chen, Sun, Wang, Liu and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Zhenbei Liu; Yu-wen Chen

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
