Interpretable machine learning model for predicting covert hepatic encephalopathy in patients with cirrhosis: a multicenter study

Liu, Yilong; Ding, Kai; Qiu, Yifan; Wang, Peiqin; Wang, Ruoyao; Zeng, Xin; Yin, Chuan

doi:10.3389/fmed.2025.1686005

ORIGINAL RESEARCH article

Front. Med., 25 November 2025

Sec. Precision Medicine

Volume 12 - 2025 | https://doi.org/10.3389/fmed.2025.1686005

Interpretable machine learning model for predicting covert hepatic encephalopathy in patients with cirrhosis: a multicenter study

Yilong Liu¹^†

Kai Ding¹^†

Yifan Qiu²^†

Peiqin Wang¹^†

Ruoyao Wang²

Xin Zeng³^*

Chuan Yin¹^*

¹Department of Gastroenterology, Changzheng Hospital, Naval Medical University, Shanghai, China
²College of Basic Medical Sciences, Naval Medical University, Shanghai, China
³Department of Gastroenterology, Shanghai East Hospital, Tongji University School of Medicine, Shanghai, China

Background and aim: Covert hepatic encephalopathy (CHE) is a neurocognitive complication affecting 40.9–50.4% of patients with cirrhosis. It often remains undiagnosed owing to its subclinical nature and the limitations of existing diagnostic tools, which are constrained by subjectivity, variable sensitivity, and limited accessibility. This study aims to develop and validate interpretable machine learning (ML) models for predicting CHE in patients with cirrhosis using multidimensional clinical and lifestyle data.

Methods: This retrospective study included 503 patients with liver cirrhosis from 16 medical centers in China. CHE was diagnosed using the psychometric hepatic encephalopathy score and EncephalApp Stroop tests. Recursive feature elimination and Pearson’s correlation analysis were used for feature selection. Eight ML models were implemented to predict CHE. Performance was assessed via AUC, sensitivity, specificity, and decision curve analysis. The SHapley Additive exPlanations (SHAP) values are interpreted by the optimal model.

Results: The light gradient boosting machine (LightGBM) model achieved the highest area under the receiver operating characteristic (ROC) curve (AUC) of 0.810 in the training set and 0.710 in the validation set. Decision curve analysis showed that LightGBM had better diagnostic performance than random forest (RF) and eXtreme gradient boosting (XGBoost). The SHAP analysis identified key predictors of CHE, including lower Mini-Mental State Examination (MMSE) scores, older age, hypoalbuminemia, lack of prior computer usage, and higher blood urea nitrogen levels.

Conclusion: This study presents a novel ML-based approach for predicting CHE in cirrhotic patients, with LightGBM offering the best balance of performance and interpretability. The identified clinical and demographic predictors could facilitate early CHE detection and personalized management, ultimately improving outcomes for this high-risk population.

Introduction

Covert hepatic encephalopathy (CHE), the subclinical precursor to overt hepatic encephalopathy (OHE), is a neurocognitive complication affecting 40.9–50.4% of patients with cirrhosis (1, 2). Characterized by subtle impairments in attention, visuospatial abilities, and psychomotor speed, CHE significantly compromises the quality of life and predicts hospitalization risks and mortality (3–6). In consequence, CHE often remains undiagnosed in routine clinical practice, which prevents the initiation of medical therapy. Early identification is critical for initiating interventions, such as lactulose or rifaximin, to mitigate progression (3, 7).

However, current diagnostic paradigms for CHE face substantial challenges. Owing to limited time, cost, and resource availability, only a few patients are routinely tested (8). Current diagnostic tools for CHE—including the psychometric hepatic encephalopathy score (PHES) and neuropsychological tests (NPTs)—are limited by subjectivity, variable sensitivity (PHES: 45–71%), and limited accessibility in routine clinical practice (9). Furthermore, these methods detect deficits only after neurological dysfunction is established, delaying therapeutic opportunities.

Blood biomarkers facilitating the diagnosis of covert hepatic encephalopathy (CHE) in patients with cirrhosis are lacking. While ammonia levels, inflammatory biomarkers (IL-6), and serum glial fibrillary acidic protein (sGFAP) have been investigated for minimal hepatic encephalopathy (MHE) prediction, their utility remains constrained by inconsistent thresholds and multifactorial pathophysiology (10–12). The heterogeneity of cirrhosis etiologies (e.g., viral, alcoholic, and metabolic dysfunction-associated steatotic liver disease) further complicates biomarker generalizability. Therefore, easy-to-use and reliable testing strategies are urgently required.

Machine learning (ML) offers transformative potential to decode complex patterns in multidimensional clinical data (13). By integrating neuropsychological parameters, serum biomarkers, and electronic health record trajectories, ML algorithms could help overcome the “silent” nature of CHE through predictive modeling. In recent years, some scholars have already utilized machine learning to predict or assist in the diagnosis of HE. Yang et al. demonstrated a weighted random forest (WRF) model achieving an AUC of 0.816–0.831 in predicting patients with liver cirrhosis complicated by HE (14). A recent study has demonstrated an ML model achieving an AUC of 0.825 in predicting post-transjugular intrahepatic portosystemic shunt (TIPS) overt hepatic encephalopathy (OHE) in patients with acute variceal bleeding (AVB), and the key predictors identified were Child–Pugh score, age, and portal vein thrombosis (15). However, at present, most of the studies on ML for predicting or diagnosing minimal hepatic encephalopathy focus on the use of imaging modules, while there is a lack of relevant research on predicting minimal hepatic encephalopathy using other clinical data (16, 17).

Therefore, this study aims to develop and validate interpretable ML models for CHE prediction in patients with cirrhosis, comparing their performance against conventional diagnostic tools, while identifying key predictive features across heterogeneous patient subgroups.

Methods

Study population

This is a retrospective study, with patient data derived from a multi-center cross-sectional study that enrolled 503 patients with liver cirrhosis at 16 medical centers in China between January 2021 and March 2022 (Supplementary Figure S1). Among the 503 patients with cirrhosis, 266 patients (52.88%) had covert hepatic encephalopathy (CHE) and 237 patients (47.12%) did not have CHE. To diagnose CHE, the standardized PHES, including five subtests, the app-dependent concise Color and Word Stroop tests, and the EncephalApp Stroop tests, were used (18, 19). PHES and the Stroop tests were performed according to the methods illustrated in previous studies (18, 19). The five subtests of PHES, number connection test A (NCT-A), number connection test B (NCT-B), line tracing test (LTT), serial dotting test (SDT), and digit symbol test (DST), were carried out by a trained investigator at each center. A total value of all subtests < 4 indicated a PHES positive result. The EncephalApp Stroop tests were administered with the same model of iPad in each center. The cutoff for the EncephalApp Stroop test was > 187 s for on time + off time (18, 19). When both the PHES and EncephalApp Stroop tests resulted positively, CHE was diagnosed. The study was performed in accordance with the Declaration of Helsinki (as revised in 2013). The study protocol was approved by the Institutional Ethics Committee of the Shanghai Changzheng Hospital (2020SL022). The protocol was explained to each patient, and informed consent forms were obtained from all individuals.

Data collection and processing

The clinical and laboratory information of patients was retrieved from the medical records of participating hospitals (2). Features with over 25% missing values were excluded from the following analyses to minimize the bias resulting from missing data. Missing values were imputed using mean imputation, where the mean value of each respective feature was substituted for missing data points (20).

Dataset preparation and feature selection

Prior to model development, continuous variables were standardized using z-score normalization based on the mean and standard deviation (SD) calculated from the training set. Categorical variables were binarized (1 indicating event presence and 0 representing absence), with gender specifically encoded as 1 for male and 0 for female. The complete dataset was partitioned into a training set (70%) for predictive model development and a test set (30%) for performance validation. The test set consisted of 84 patients with liver cirrhosis recruited from Changzheng Hospital from January 2024 to May 2025. To mitigate overfitting, 5-fold cross-validation was systematically implemented during model development. The recursive feature elimination (RFE) algorithm was used to select features from the data of the cohort. Pearson’s correlation coefficient was used to assess collinearity between variables.

Model development and comparison

The features selected above were used to develop prediction models. Eight ML models, namely, adaptive boosting (AdaBoost), artificial neutral network (ANN), decision tree (DT), extra tree (ET), gradient boosting machine (GBM), light gradient boosting machine (LightGBM), random forest (RF), and eXtreme gradient boosting (XGBoost) were trained and established in the training cohort to predict CHE in liver cirrhosis.

Several commonly used evaluation indices, such as the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and F1 score, were used to evaluate the reliability of these models. The diagnostic performance of the model was evaluated using decision curve analysis (DCA).

Model interpretation

The SHapley Additive exPlanations (SHAP) method was used to analyze the importance of features in the model because of the limited interpretability in the ML algorithm (21). SHAP was used as a scoring metric for feature contributions by determining the difference between the predicted values with and without each feature for all combinations. The greater the influence a particular value of a sample has on the composition of the model, the farther that point deviates from 0 on the x-axis. Using SHAP values and a summary plot, it is thus possible to determine which features have a significant effect on prediction and whether this contribution is positive or negative.

Statistical analysis

Depending on the data distribution, the statistical significance of the difference in continuous variables was tested using Student’s t-test or Mann–Whitney U-test for quantitative variables and the chi-square test or Fisher’s test for qualitative variables. All statistical tests were two-sided, with p-values <0.05 indicating statistical significance. Statistical analyses and model development were performed using R software (version 4.05) and Python (version 3.8).

Results

Features selected from models

To identify for the optimal subset to procure the most favorable combination of features, we used recursive feature elimination (RFE) coupled with 5-fold cross-validation. RFE enhances the performance of predictive models by eliminating overfitting and improving the generalizability of the model. According to a specific feature ranking standard, RFE starts from a complete set and then eliminates the least relevant feature one by one to select the most important features. Finally, 35 features achieved the highest cross-validation score, including age, education, sex, body mass index (BMI), history of OHE, etiology of liver disease, the course of liver cirrhosis, comorbidities of cardiovascular diseases, hypertension, cerebral apoplexy or diabetes, history of drinking, previous usage of smartphone, previous usage of computer, previous usage of tablet computer, medication history (L-ornithine aspartate, lactulose, probiotic formulations, non-selective β blockers, antiviral drugs, diuretic, rifaximin, metronidazole, and other antibiotics), alanine aminotransferase (ALT), aspartate aminotransferase(AST), alkaline phosphatase (AKP), g-glutamyltransferase (GGT), albumin (ALB), total bilirubin (TBil), creatinine (Cr), blood urea nitrogen (BUN), prolonged prothrombin time (PT), international normalized ratio (INR), and a series of health-related scores, namely, Mini-Mental State Examination (MMSE) score, Chronic Liver Disease Questionnaire (CLDQ) score, Child–Pugh score, and Model for end-stage liver disease (MELD) score (Figure 1).

Figure 1

Figure 1. RFE coupled with 5-fold cross-validation to procure the most favorable combination of features.

Pearson’s correlation of variables

We evaluated the correlations of variables using Pearson’s correlation and visualized the relationships among them through a heatmap (Figure 2). From the heatmap, we observed that there was collinearity between ALT and AST, AKP and GGT, and INR and prolonged PT. After discussion among gastroenterology experts, a decision was made to remove the three features of ALT, GGT, and INR.

Figure 2

Correlation heatmap showing relationships between various medical and demographic variables. The colors range from blue to red, indicating correlation coefficients from negative to positive. The diagonal shows perfect correlation of 1.0. The intensity of the colors signifies the strength and direction of the correlations.

Figure 2. Heatmap shows the relationships among the variables. Each cell represents the correlation coefficient between two variables, ranging from −1 (perfect negative correlation, indicated by dark blue) to +1 (perfect positive correlation, indicated by dark red). A color bar on the right provides a reference scale.

Patient characteristics

A total of 503 patients were allocated to separate training and validation sets at a ratio of 7:3. The demographic and clinical characteristics of the training and validation sets are listed in Table 1.

Table 1

Table 1. Demographic and clinical characteristics of the training and validation sets.

Model development and validation

Eight ML models—AdaBoost, ANN, DT, ET, GBM, LightGBM, RF, and XGBoost—were constructed based on 32 features. As shown in Table 2, among these models, LightGBM, RF, and XGBoost exhibited higher AUC values (0.810, 0.797, and 0.801) compared to the others. These three models were further evaluated, with the results presented in Table 3. Supplementary Figure S2 shows the confusion matrices of the validation set. Additionally, an extensive decision curve analysis (DCA) demonstrated that LightGBM exhibited better diagnostic performance than RF and XGBoost in the test set (Figure 3). Given that LightGBM has the highest AUC value in the training set and performs well in the test DCA, we chose LightGBM as the final model for this study. In the independent test set, the LightGBM model achieved an outstanding AUC of 0.855 (0.852, 0.857), further confirming its robust generalization capability (Supplementary Table S1).

Table 2

Table 2. Performance of the eight ML models in the training set.

Table 3

Table 3. Performance of the three ML models in the validation set.

Figure 3

Decision curve analysis graph for a test set, showing net benefit versus threshold probability. Lines represent Random Forest, LightGBM, and Xgboost models, along with

Figure 3. DCA analysis of three machine learning algorithms (LightGBM, RF, and XGBoost) in the validation set.

Model interpretation

To enhance the clinical utility of the model, we used the SHAP method to identify the features contributing to the prediction of CHE in patients with cirrhosis, as illustrated in Figure 4. The bar plot was generated by ranking features according to their mean absolute SHAP values in descending order, which reflects the relative contribution of each feature to the overall model. A higher absolute SHAP value indicates greater feature importance and a stronger influence on the model’s output.

Figure 4

Bar chart and scatter plot compare SHAP values for various features, including MMSE score and age. Panel A shows feature importance ranked by SHAP values. Panel B displays SHAP value distribution with color indicating feature value intensity.

Figure 4. SHAP analyses of the LightGBM model for predicting CHE in cirrhotic patients. (A) Mean absolute SHAP value bar plot. This plot ranks the top 20 features by their mean absolute SHAP value, which represents their overall importance in the model’s predictions. A larger bar indicates a greater average impact on the model output. (B) SHAP summary plot (beeswarm plot). This plot shows the distribution of the impact each feature has on the model output for every patient in the dataset. Each dot represents a single patient.

As shown in Figure 4A, the top five clinically significant features were identified as MMSE score, age, ALB, age, previous computer usage, and BUN. Furthermore, we present the distribution of SHAP values for the top 20 clinical features, where each dot represents a feature (Figure 4B). The position of the dot indicates the SHAP value, quantifying the feature’s contribution to the model output. A positive SHAP value suggests a positive influence on the output, whereas a negative value indicates an inhibitory effect. Color intensity reflects feature magnitude—red denotes high values, while blue represents low values, with darker shades signifying a stronger impact on the target outcome.

It demonstrates that low MMSE scores, older age, low albumin levels, high blood urea nitrogen, and lack of prior computer usage were associated with an increased likelihood of CHE development (Figure 4B). This interpretability analysis enhances the model’s clinical applicability by identifying key predictive factors and their directional influence on CHE risk.

Discussion

The clinical implications of CHE in patients with cirrhosis are profound, as it is associated with cognitive impairment, reduced quality of life, and an increased risk of OHE and mortality (1, 3). Early identification of CHE is crucial for timely intervention, yet its diagnosis remains challenging due to the lack of overt symptoms and the reliance on specialized neuropsychological tests such as the Psychometric Hepatic Encephalopathy Score (PHES) and EncephalApp Stroop tests (18, 22). In this study, we developed and validated multiple ML models to predict CHE in cirrhotic patients, leveraging a multicenter cohort of 503 cirrhotic patients.

To our knowledge, this is the first multi-center study to systematically evaluate eight ML models for predicting CHE in patients with cirrhosis using multidimensional clinical and lifestyle data. Our findings demonstrate that the LightGBM model outperformed other ML algorithms, achieving an AUC of 0.810 (0.779, 0.841) in the training set and 0.710 (0.708, 0.713) in the validation set, highlighting its potential as a reliable predictive tool for CHE. Notably, our model does not achieve comparable accuracy to advanced biomarker-driven approaches (e.g., MRI-based models), while relying solely on low-cost, less time-consuming parameters—a critical advantage in resource-limited settings (17, 23).

The SHAP-based interpretability framework revealed key predictors of CHE, including lower MMSE scores, older age, hypoalbuminemia, lack of prior computer usage, and higher levels of blood urea nitrogen. The prominence of MMSE scores underscores the need for early cognitive screening in cirrhosis management, while the association between hypoalbuminemia and CHE risk corroborates its role in hepatic synthetic dysfunction and neurotoxin accumulation (24). Previous studies have shown that older age is an independent biomarker associated with CHE (2). Older age may lead to increased defects in some areas of the central nervous system, affecting cognitive function and the development of CHE (25). This is also consistent with the results of a recent meta-analysis (1). Blood urea nitrogen can reflect renal function, and high levels of urea nitrogen are usually associated with poor protein catabolism, dehydration, and gastrointestinal bleeding. Some studies have shown that high urea nitrogen is an independent biomarker related to the severity and prognosis of HE (26, 27). An intriguing finding from our SHAP analysis was the importance of “previous computer usage” as a protective factor. We hypothesize that this variable acts as a practical proxy for cognitive reserve—the brain’s resilience to pathology. Engaging with computers is a complex cognitive activity that may help build neural networks that are more resistant to the neurotoxic insults of cirrhosis. While this factor likely correlates with socioeconomic status and education level—the latter of which was included in our model and provided independent information—its retention as a key predictor suggests that it captures a unique dimension of a patient’s cognitive lifestyle. This finding aligns with studies linking technology use to cognitive function in older adults (28). Although direct socioeconomic data were not available, this finding highlights the potential value of incorporating simple assessments of life engagement into risk stratification. Future prospective studies should aim to collect more detailed socioeconomic and lifestyle data to disentangle these complex relationships.

The clinical relevance of our model lies in its potential to streamline CHE diagnosis by integrating readily available clinical and laboratory variables, thereby reducing reliance on time-consuming neuropsychological tests. For instance, hypoalbuminemia, a key predictor in our model, is a well-established marker of liver synthetic dysfunction and is associated with blood–brain barrier disruption, facilitating neurotoxin accumulation (29, 30). Similarly, the MMSE score can, to some extent, reflect the interplay between global cognitive function and CHE (24).

Our study addressed critical gaps in CHE prediction research. First, the integration of lifestyle factors (e.g., digital device usage) with traditional clinical variables (e.g., MELD score) provides a holistic risk profile, capturing both biological and psychosocial determinants of CHE—a paradigm shift from prior biomarker-centric models. Second, the rigorous feature selection pipeline (RFE + expert-guided collinearity reduction) optimized model parsimony. For instance, excluding redundant variables (e.g., ALT and INR) improved generalizability without sacrificing predictive power.

Despite its strengths, our study has limitations. First, the retrospective design may introduce selection bias, and the reliance on imputation for missing data (though limited to variables with <25% missingness) could affect model generalizability. Second, while our retrospective design mitigated recall bias, the cohort was restricted to Chinese patients, necessitating external validation in global populations to confirm generalizability. Third, while the LightGBM model demonstrated good discrimination, its moderate AUC (0.710 in the validation set) suggests room for improvement, potentially through the inclusion of additional biomarkers such as gut microbiota profiles, inflammatory markers, or imaging parameters. Besides, the operational definition of “previous computer usage” requires standardization across diverse socioeconomic contexts. Future prospective studies should also explore the integration of dynamic variables (e.g., longitudinal cognitive assessments) to enhance predictive accuracy.

Despite these limitations, our LightGBM model offers a pragmatic tool for CHE risk stratification in routine practice. By prioritizing easily accessible variables (e.g., albumin and MMSE scores), it could be seamlessly embedded into the Hospital Information System (HIS) to guide targeted monitoring. For clinicians, the SHAP dashboard provides actionable insights, transforming opaque algorithms into transparent decision aids (31). Furthermore, we could develop an interactive SHAP visualization tool to translate model outputs into clinician-friendly risk assessments, bridging the “black-box” gap in ML applications. Such tools could be integrated into the HIS to trigger real-time alerts for high-risk patients.

In conclusion, our study presents a novel ML-based approach to CHE prediction, with the LightGBM model offering the best balance of performance and interpretability. By identifying key clinical and demographic predictors, this tool could facilitate early CHE detection and personalized management in cirrhotic patients. Future efforts should focus on external validation and the development of user-friendly applications to translate this model into clinical practice, ultimately improving outcomes for this high-risk population.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving humans were approved by The Institutional Ethics Committee of the Shanghai Changzheng Hospital. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

YL: Data curation, Methodology, Writing – original draft, Conceptualization. KD: Writing – original draft, Methodology, Data curation. YQ: Writing – original draft, Data curation, Methodology. PW: Methodology, Writing – original draft, Data curation. RW: Data curation, Methodology, Writing – original draft. XZ: Supervision, Conceptualization, Writing – review & editing. CY: Funding acquisition, Conceptualization, Writing – review & editing, Supervision.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This study was supported by the National Natural Science Foundation of China (Grant No. 81900514), the Clinical Research Project of Shanghai Municipal Health Commission (20244Y0160), and the Shanghai Natural Science Fund (Grant No. 22ZR1477400).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2025.1686005/full#supplementary-material

References

1. Lv, XH, Lu, Q, Deng, K, Yang, J-L, and Yang, L. Prevalence and characteristics of covert/minimal hepatic encephalopathy in patients with liver cirrhosis: a systematic review and meta-analysis. Am J Gastroenterol. (2024) 119:690–9. doi: 10.14309/ajg.0000000000002563

PubMed Abstract | Crossref Full Text | Google Scholar

2. Zeng, X, Yin, C, Sun, CY, Lu, CH, Zhao, SS, Gao, XH, et al. Prevalence and risk factors of covert hepatic encephalopathy in cirrhotic patients: a multicenter study in China. J Dig Dis. (2023) 24:122–32. doi: 10.1111/1751-2980.13171

PubMed Abstract | Crossref Full Text | Google Scholar

3. Patidar, KR, and Bajaj, JS. Covert and overt hepatic encephalopathy: diagnosis and management. Clin Gastroenterol Hepatol. (2015) 13:2048–61. doi: 10.1016/j.cgh.2015.06.039

PubMed Abstract | Crossref Full Text | Google Scholar

4. Patidar, KR, Thacker, LR, Wade, JB, Sterling, RK, Sanyal, AJ, Siddiqui, MS, et al. Covert hepatic encephalopathy is independently associated with poor survival and increased risk of hospitalization. Am J Gastroenterol. (2014) 109:1757–63. doi: 10.1038/ajg.2014.264

PubMed Abstract | Crossref Full Text | Google Scholar

5. Labenz, C, Baron, JS, Toenges, G, Schattenberg, JM, Nagel, M, Sprinzl, MF, et al. Prospective evaluation of the impact of covert hepatic encephalopathy on quality of life and sleep in cirrhotic patients. Aliment Pharmacol Ther. (2018) 48:313–21. doi: 10.1111/apt.14824

PubMed Abstract | Crossref Full Text | Google Scholar

6. Gairing, SJ, Mangini, C, Zarantonello, L, Gioia, S, Nielsen, EJ, Danneberg, S, et al. Minimal hepatic encephalopathy is associated with a higher risk of overt hepatic encephalopathy and poorer survival. J Intern Med. (2024) 295:331–45. doi: 10.1111/joim.13747

Crossref Full Text | Google Scholar

7. Tan, W, Wang, J, Shi, PM, Feng, LM, Shi, J, Ning, BF, et al. Effects of low-dose and high-dose Rifaximin in the treatment of covert hepatic encephalopathy. J Clin Transl Hepatol. (2022) 10:1099–106. doi: 10.14218/JCTH.2021.00457

PubMed Abstract | Crossref Full Text | Google Scholar

8. Tapper, EB. Predicting overt hepatic encephalopathy for the population with cirrhosis. Hepatology. (2019) 70:403–9. doi: 10.1002/hep.30533

PubMed Abstract | Crossref Full Text | Google Scholar

9. Duarte-Rojo, A, Allampati, S, Thacker, LR, Flud, CR, Patidar, KR, White, MB, et al. Diagnosis of covert hepatic encephalopathy: a multi-center study testing the utility of single versus combined testing. Metab Brain Dis. (2019) 34:289–95. doi: 10.1007/s11011-018-0350-z

PubMed Abstract | Crossref Full Text | Google Scholar

10. Gairing, SJ, Anders, J, Kaps, L, Nagel, M, Michel, M, Kremer, WM, et al. Evaluation of IL-6 for stepwise diagnosis of minimal hepatic encephalopathy in patients with liver cirrhosis. Hepatol Commun. (2022) 6:1113–22. doi: 10.1002/hep4.1883

PubMed Abstract | Crossref Full Text | Google Scholar

11. Tranah, TH, Ballester, MP, Carbonell-Asins, JA, Ampuero, J, Alexandrino, G, Caracostea, A, et al. Plasma ammonia levels predict hospitalisation with liver-related complications and mortality in clinically stable outpatients with cirrhosis. J Hepatol. (2022) 77:1554–63. doi: 10.1016/j.jhep.2022.07.014

PubMed Abstract | Crossref Full Text | Google Scholar

12. Gairing, SJ, Danneberg, S, Kaps, L, Nagel, M, Schleicher, EM, Quack, C, et al. Elevated serum levels of glial fibrillary acidic protein are associated with covert hepatic encephalopathy in patients with cirrhosis. JHEP Rep. (2023) 5:100671. doi: 10.1016/j.jhepr.2023.100671

PubMed Abstract | Crossref Full Text | Google Scholar

13. Greener, JG, Kandathil, SM, Moffat, L, and Jones, DT. A guide to machine learning for biologists. Nat Rev Mol Cell Biol. (2022) 23:40–55. doi: 10.1038/s41580-021-00407-0

PubMed Abstract | Crossref Full Text | Google Scholar

14. Yang, H, Li, X, Cao, H, Cui, Y, Luo, Y, Liu, J, et al. Using machine learning methods to predict hepatic encephalopathy in cirrhotic patients with unbalanced data. Comput Methods Prog Biomed. (2021) 211:106420. doi: 10.1016/j.cmpb.2021.106420

PubMed Abstract | Crossref Full Text | Google Scholar

15. Liu, DJ, Jia, L-X, Zeng, F-X, Zeng, W-X, Qin, G-G, Peng, Q-F, et al. Machine learning prediction of hepatic encephalopathy for long-term survival after transjugular intrahepatic portosystemic shunt in acute variceal bleeding. World J Gastroenterol. (2025) 31:100401. doi: 10.3748/wjg.v31.i4.100401

PubMed Abstract | Crossref Full Text | Google Scholar

16. Chen, QF, Chen, HJ, Liu, J, Sun, T, and Shen, QT. Machine learning classification of cirrhotic patients with and without minimal hepatic encephalopathy based on regional homogeneity of intrinsic brain activity. PLoS One. (2016) 11:e0151263. doi: 10.1371/journal.pone.0151263

PubMed Abstract | Crossref Full Text | Google Scholar

17. Chen, QF, Zou, T-X, Yang, Z-T, and Chen, H-J. Identification of patients with and without minimal hepatic encephalopathy based on gray matter volumetry using a support vector machine learning algorithm. Sci Rep. (2020) 10:2490. doi: 10.1038/s41598-020-59433-1

PubMed Abstract | Crossref Full Text | Google Scholar

18. Zeng, X, Li, XX, Shi, PM, Zhang, YY, Song, Y, Liu, Q, et al. Utility of the EncephalApp Stroop test for covert hepatic encephalopathy screening in Chinese cirrhotic patients. J Gastroenterol Hepatol. (2019) 34:1843–50. doi: 10.1111/jgh.14656

PubMed Abstract | Crossref Full Text | Google Scholar

19. Zeng, X, Zhang, LY, Liu, Q, Lu, CH, Wei, J, Shi, ZW, et al. Combined scores from the EncephalApp Stroop test, number connection test B, and serial dotting test accurately identify patients with covert hepatic encephalopathy. Clin Gastroenterol Hepatol. (2020) 18:e7:1618–25. doi: 10.1016/j.cgh.2019.11.018

PubMed Abstract | Crossref Full Text | Google Scholar

20. Berkelmans, GF, Read, SH, Gudbjörnsdottir, S, Wild, SH, Franzen, S, Van Der Graaf, Y, et al. Population median imputation was noninferior to complex approaches for imputing missing values in cardiovascular prediction models in clinical practice. J Clin Epidemiol. (2022) 145:70–80. doi: 10.1016/j.jclinepi.2022.01.011

Crossref Full Text | Google Scholar

21. Xue, B, Li, D, Lu, C, King, CR, Wildes, T, Avidan, MS, et al. Use of machine learning to develop and evaluate models using preoperative and intraoperative data to identify risks of postoperative complications. JAMA Netw Open. (2021) 4:e212240. doi: 10.1001/jamanetworkopen.2021.2240

PubMed Abstract | Crossref Full Text | Google Scholar

22. Montagnese, S. Covert hepatic encephalopathy: agreement and predictive validity of different indices. World J Gastroenterol. (2014) 20:15756–62. doi: 10.3748/wjg.v20.i42.15756

PubMed Abstract | Crossref Full Text | Google Scholar

23. Zhang, G, Li, Y, Zhang, X, Huang, L, Cheng, Y, and Shen, W. Identifying mild hepatic encephalopathy based on multi-layer modular algorithm and machine learning. Front Neurosci. (2020) 14:627062. doi: 10.3389/fnins.2020.627062

PubMed Abstract | Crossref Full Text | Google Scholar

24. Corrias, M, Turco, M, Rui, MD, Gatta, A, Angeli, P, Merkel, C, et al. Covert hepatic encephalopathy: does the mini-mental state examination help? J Clin Exp Hepatol. (2014) 4:89–93. doi: 10.1016/j.jceh.2013.12.005

PubMed Abstract | Crossref Full Text | Google Scholar

25. Gur, RC, Gunning-Dixon, FM, Turetsky, BI, Bilker, WB, and Gur, RE. Brain region and sex differences in age association with brain volume: a quantitative MRI study of healthy young adults. Am J Geriatr Psychiatry. (2002) 10:72–80. doi: 10.1097/00019442-200201000-00009

Crossref Full Text | Google Scholar

26. Duah, A, Agyei-Nkansah, A, Osei-Poku, F, Duah, F, Ampofo-Boobi, D, and Peprah, B. The prevalence, predictors, and in-hospital mortality of hepatic encephalopathy in patients with liver cirrhosis admitted at St. Dominic Hospital in Akwatia, Ghana. Can J Gastroenterol Hepatol. (2020) 2020:8816522. doi: 10.1155/2020/8816522

PubMed Abstract | Crossref Full Text | Google Scholar

27. Yoneyama, K, Nebashi, Y, Kiuchi, Y, Shibata, M, and Mitamura, K. Prognostic index of cirrhotic patients with hepatic encephalopathy with and without hepatocellular carcinoma. Dig Dis Sci. (2004) 49:1174–80. doi: 10.1023/B:DDAS.0000037808.44897.8a

PubMed Abstract | Crossref Full Text | Google Scholar

28. Wu, YH, Lewis, M, and Rigaud, AS. Cognitive function and digital device use in older adults attending a memory clinic. Gerontol Geriatr Med. (2019) 5:2333721419844886. doi: 10.1177/2333721419844886

PubMed Abstract | Crossref Full Text | Google Scholar

29. Bernardi, M, Angeli, P, Claria, J, Moreau, R, Gines, P, Jalan, R, et al. Albumin in decompensated cirrhosis: new concepts and perspectives. Gut. (2020) 69:1127–38. doi: 10.1136/gutjnl-2019-318843

PubMed Abstract | Crossref Full Text | Google Scholar

30. Hillmer, L, Erhardt, EB, Caprihan, A, Adair, JC, Knoefel, JE, Prestopnik, J, et al. Blood-brain barrier disruption measured by albumin index correlates with inflammatory fluid biomarkers. J Cereb Blood Flow Metab. (2023) 43:712–21. doi: 10.1177/0271678X221146127

PubMed Abstract | Crossref Full Text | Google Scholar

31. Qi, X, Wang, S, Fang, C, Jia, J, Lin, L, and Yuan, T. Machine learning and SHAP value interpretation for predicting comorbidity of cardiovascular disease and cancer with dietary antioxidants. Redox Biol. (2025) 79:103470. doi: 10.1016/j.redox.2024.103470

PubMed Abstract | Crossref Full Text | Google Scholar

Keywords: covert hepatic encephalopathy, machine learning, SHapley Additive exPlanations, cirrhosis, LightGBM

Citation: Liu Y, Ding K, Qiu Y, Wang P, Wang R, Zeng X and Yin C (2025) Interpretable machine learning model for predicting covert hepatic encephalopathy in patients with cirrhosis: a multicenter study. Front. Med. 12:1686005. doi: 10.3389/fmed.2025.1686005

Received: 17 August 2025; Accepted: 03 November 2025;
Published: 25 November 2025.

Edited by:

Udhaya Kumar, Baylor College of Medicine, United States

Reviewed by:

Hossam El-Din Moustafa, Mansoura University, Egypt
Neda Fatima, Manav Rachna International Institute of Research and Studies (MRIIRS), India
Veeraswamy Ammisetty, Koneru Lakshmaiah Education Foundation, India

Copyright © 2025 Liu, Ding, Qiu, Wang, Wang, Zeng and Yin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Chuan Yin, aWxzZTEyMjVAMTYzLmNvbQ==; Xin ZengemVuZ3hpbm1kMTk3OEAxNjMuY29t

^†These authors have contributed equally to this work and share first authorship

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.