ORIGINAL RESEARCH article

Front. Public Health

Sec. Public Health Education and Promotion

This article is part of the Research Topic "Active Commuting: A Strategy for Improving Student Health in Educational Settings"

Random Forest-based Identification and Ranking of Predictive Factors for Physical Activity in Chinese College Students

Provisionally accepted
  • Nantong University, Nantong, China

The final, formatted version of the article will be published soon.

【Objective】To identify the key predictors of physical activity (PA) levels among Chinese university students, and to analyse the predictive roles and relative importance of different variables using the Random Forest (RF) algorithm.
【Methods】A cross-sectional study was conducted using stratified cluster sampling, covering 17 provinces of China and yielding 10,182 valid questionnaires. PA levels were assessed with the Physical Activity Rating Scale-3 (PARS-3), and participants were divided into attainment and non-attainment groups. The independent variables span the individual and interpersonal organisational levels of the socio-ecological model (SEM) and comprise 39 variables in total, including demographic characteristics, psycho-behavioural factors, and social support, measured with several standardised scales. Feature importance analysis was performed with the RF algorithm, and the model parameters were optimised by grid search with 5-fold cross-validation to identify the most important predictors of PA.
【Results】The RF model achieved an accuracy of 0.704 and an AUC of 0.762. Feature importance analysis ranked the following as the top nine predictors: exercise adherence (exercise behaviour), sex, exercise adherence (effort investment), mastery of sports skills, exercise motivation (ability), alcohol consumption level, exercise adherence (emotional experience), exercise motivation (social), and exercise motivation (fun). Specifically, all sub-dimensions of exercise adherence (exercise behaviour) positively predicted PA (SHAP values > 0); males were more likely than females to be in the attainment group (OR > 1, p < 0.001); mastery of sports skills was positively correlated with PA levels; and, within alcohol consumption level, 'occasional drinking' was negatively correlated with the attainment rate (p < 0.001).
【Conclusion】Exercise adherence, sex, mastery of sports skills, and alcohol consumption level are significant factors predicting PA levels among Chinese university students. Recommendations for promoting PA include enhancing the "emotional value" and social attributes of exercise, addressing female students' willingness to participate, and improving physical capabilities through skills training to effectively elevate activity levels.
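The modelling workflow summarised in the Methods (an RF classifier tuned by grid search with 5-fold cross-validation, evaluated by accuracy and AUC, then used to rank predictors) can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the survey responses. The hold-out split, the parameter grid, the `var_XX` feature names, and the use of impurity-based importances are all illustrative assumptions and are not taken from the article, which also reports SHAP values that are not computed here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the questionnaire data (assumption): 39 predictors
# and a binary outcome (attainment vs. non-attainment).
X, y = make_classification(
    n_samples=1000, n_features=39, n_informative=9, random_state=0
)
feature_names = [f"var_{i:02d}" for i in range(X.shape[1])]  # hypothetical names

# Hold out a test set (assumption; the abstract does not state the split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Grid search with 5-fold cross-validation; the grid itself is illustrative.
param_grid = {"n_estimators": [50, 100], "max_depth": [5, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="roc_auc",
)
search.fit(X_train, y_train)
best_rf = search.best_estimator_

# Evaluate the tuned model on the held-out data.
accuracy = accuracy_score(y_test, best_rf.predict(X_test))
auc = roc_auc_score(y_test, best_rf.predict_proba(X_test)[:, 1])

# Rank predictors by impurity-based importance and keep the top nine.
importances = best_rf.feature_importances_
order = np.argsort(importances)[::-1]
top_nine = [(feature_names[i], float(importances[i])) for i in order[:9]]

print(f"accuracy={accuracy:.3f}, AUC={auc:.3f}")
for name, score in top_nine:
    print(f"{name}: {score:.4f}")
```

Impurity-based importances give only a ranking. The direction of each effect, which the Results report via SHAP values and odds ratios, would need a separate analysis.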

Keywords: university students, machine learning, random forest, physical activity, socio-ecological model

Received: 11 Sep 2025; Accepted: 07 Nov 2025.

Copyright: © 2025 Zhang, Lou, Liu and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Jun Liu, zdy9774649@qq.com
Bo Li, wangqiulibo@163.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.