ORIGINAL RESEARCH article

Front. Big Data

Sec. Medicine and Public Health

Volume 8 - 2025 | doi: 10.3389/fdata.2025.1574683

This article is part of the Research Topic: Machine Learning and Cutting-Edge Tools for Prediction and Treatment Strategies of Dementia and Associated Diseases

Optimizing Public Health Management with Predictive Analytics: Leveraging the Power of Random Forest

Provisionally accepted
Hongman Wang1, Yifan Song2*, Hua Bi3
  • 1Southeast University, Nanjing, Jiangsu Province, China
  • 2Macao Polytechnic University, Macau, Macao, SAR China
  • 3General Hospital of the Central Theater of the People's Liberation Army, Wuhan, Hubei Province, China

The final, formatted version of the article will be published soon.

Abstract: Community health outcomes significantly affect the well-being and quality of life of older populations. Traditional analytical methods often struggle to predict health risks accurately at the community level because they cannot capture complex, non-linear relationships among health determinants. This study employs a Random Forest Algorithm (RFA) to address this limitation and improve the predictive modelling of community health outcomes. By combining ensemble learning with multi-factor analysis, it aims to identify and quantify the relative contribution of key health indicators to risk assessment. Data were collected from diverse health sources and then systematically preprocessed by resolving missing values, normalizing variables, and encoding categorical features. Multiple decision trees were trained on bootstrap samples of the health data, ensuring variability in model learning; each tree was grown to full depth, and the trees' predictions were aggregated to improve accuracy. Out-of-bag (OOB) error estimation was applied to refine the model and to provide unbiased performance assessments, supporting robust generalization to unseen data. The proposed model analyses key health indicators and ranks feature importance to determine the most influential predictors of health risk. Results indicate that RFA achieves an accuracy of 92%, outperforming conventional prediction methods in precision and recall. These findings underscore the efficacy of Random Forest in identifying critical health risk factors, paving the way for more targeted, data-driven public health management strategies and interventions tailored to older adults.
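The training procedure summarized in the abstract (bootstrap sampling, trees grown to full depth, aggregated predictions, and OOB error estimation) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic four-feature dataset, the labelling rule, the function names, the square-root feature-subset size, and the choice of 25 trees are all assumptions made purely for illustration, using only the Python standard library.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, n_try, rng):
    """Grow a tree to full depth: split until a node is pure or unsplittable."""
    if len(set(labels)) == 1:
        return labels[0]                      # pure node becomes a leaf
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):   # random feature subset
        # Dropping the largest value guarantees both children are non-empty.
        for t in sorted(set(r[f] for r in rows))[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:                          # sampled features are constant
        return Counter(labels).most_common(1)[0][0]
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in li], [labels[i] for i in li], n_try, rng),
            build_tree([rows[i] for i in ri], [labels[i] for i in ri], n_try, rng))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are 4-tuples."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees on bootstrap samples and estimate the OOB error."""
    rng = random.Random(seed)
    n, n_feat = len(rows), len(rows[0])
    n_try = max(1, int(n_feat ** 0.5))        # assumed feature-subset size
    trees, oob_votes = [], [Counter() for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]     # sample with replacement
        tree = build_tree([rows[i] for i in idx], [labels[i] for i in idx], n_try, rng)
        trees.append(tree)
        for i in set(range(n)) - set(idx):    # rows this tree never saw
            oob_votes[i][predict_tree(tree, rows[i])] += 1
    scored = [(v.most_common(1)[0][0], y) for v, y in zip(oob_votes, labels) if v]
    oob_error = sum(p != y for p, y in scored) / len(scored) if scored else None
    return trees, oob_error

def predict_forest(trees, row):
    """Aggregate the trees' predictions by majority vote."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]

# Hypothetical data: 200 records with 4 numeric indicators in [0, 1);
# the binary "risk" label depends only on the first two indicators.
data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(4)] for _ in range(200)]
labels = [int(r[0] + r[1] > 1.0) for r in rows]

trees, oob_error = fit_forest(rows, labels)
print(f"OOB error estimate: {oob_error:.3f}")
print("Predicted class:", predict_forest(trees, [0.95, 0.95, 0.5, 0.5]))
```

Because each bootstrap sample omits roughly a third of the records, every record can be scored by the trees that never saw it, which is what lets the OOB error act as a built-in validation estimate without a separate hold-out set. In practice a library implementation such as scikit-learn's `RandomForestClassifier` would normally be preferred over a hand-written forest.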

Keywords: predictive analytics, random forest, public health, health risk factors, machine learning, community-based health management, older adults

Received: 11 Feb 2025; Accepted: 23 Jun 2025.

Copyright: © 2025 Wang, Song and Bi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Yifan Song, Macao Polytechnic University, Macau, Macao, SAR China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.