ORIGINAL RESEARCH article

Front. Public Health

Sec. Public Mental Health

Volume 13 - 2025 | doi: 10.3389/fpubh.2025.1606316

This article is part of the Research Topic: Mental Health Dynamics for Vulnerable Populations in the Digital Era: Opportunities and Challenges

A Risk Prediction System for Depression in Middle-Aged and Elderly Grounded in Machine Learning and Visualization Technology: A Cohort Study

Provisionally accepted
Jinsong Du1,2,3, Xinru Tao1, Le Zhu1, Wenhao Qi2, Xiaoqiang Min3,4, Hongyan Deng1, Shujie Wei5, Xiaoyan Zhang6, Xiao Chang2*
  • 1Zaozhuang University, Zaozhuang, China
  • 2Hangzhou Normal University, Hangzhou, Zhejiang Province, China
  • 3Shandong Coal Health School, Zaozhuang, China
  • 4Shandong Healthcare Group Xinwen Central Hospital, Taian, China
  • 5Zaozhuang Municipal Hospital, Zaozhuang, China
  • 6Shandong Healthcare Group Zaozhuang Central Hospital, Zaozhuang, China

The final, formatted version of the article will be published soon.

Middle-aged and elderly individuals are highly susceptible to depression, and early identification and intervention can substantially reduce its prevalence. This study proposes a visual risk prediction system for depressive symptoms and depression in middle-aged and elderly adults, built on machine learning and visualization technologies. Using cohort data on 8,839 middle-aged and elderly participants from the China Health and Retirement Longitudinal Study (CHARLS), the study constructed predictive models with eight machine learning algorithms, including LightGBM, XGBoost, and AdaBoost. The XGBoost model performed best, achieving an average ROC-AUC of 0.69, and was therefore selected as the predictive model for depressive symptoms and depression risk in this population. To improve the interpretability of the XGBoost model, SHAP was used to visualize the prediction results, and the model was deployed on a web platform to form the risk prediction system. The system outputs the probability that a user will develop depressive symptoms or depression within five years and explains the prediction, making the results easier for users to understand and use. Built on China's national longitudinal cohort, the platform integrates machine learning analytics with interactive visualization, and its web deployment enhances its clinical translational value. By supporting early detection of depression and evidence-based interventions for middle-aged and elderly populations, it offers a new approach to health management with the potential to improve quality of life.
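As a reading aid for the model-selection step, the sketch below shows one way an "average ROC-AUC" can be computed and used to choose among candidate models. It is a minimal illustration, not the authors' pipeline. The abstract does not say how the average was taken, so averaging over validation folds is an assumption here. The names `roc_auc`, `select_best_model`, `model_a` and `model_b`, and all scores, are hypothetical and are not CHARLS data.

```python
from statistics import mean


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as half a win)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def select_best_model(fold_results):
    """Return (name, mean AUC) for the model with the highest mean AUC.

    fold_results maps a model name to a list of (labels, scores) pairs,
    one pair per validation fold (assumed setup, see the note above).
    """
    averaged = {
        name: mean(roc_auc(labels, scores) for labels, scores in folds)
        for name, folds in fold_results.items()
    }
    best = max(averaged, key=averaged.get)
    return best, averaged[best]


# Made-up predictions for two hypothetical models over two folds.
toy_results = {
    "model_a": [
        ([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]),
        ([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.1]),
    ],
    "model_b": [
        ([1, 0, 1, 0], [0.3, 0.2, 0.4, 0.6]),
        ([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.1]),
    ],
}

best_name, best_auc = select_best_model(toy_results)
print(best_name, best_auc)  # prints: model_a 0.875
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. On that scale, the reported average of 0.69 means the selected model ranks a randomly chosen person who develops depressive symptoms above a randomly chosen person who does not in roughly 69% of such pairs.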

Keywords: Depression, machine learning, CHARLS, risk prediction, visualization

Received: 05 Apr 2025; Accepted: 15 May 2025.

Copyright: © 2025 Du, Tao, Zhu, Qi, Min, Deng, Wei, Zhang and Chang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Xiao Chang, Hangzhou Normal University, Hangzhou, Zhejiang Province, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.