ORIGINAL RESEARCH article
Front. Public Health
Sec. Environmental Health and Exposome
Volume 13 - 2025 | doi: 10.3389/fpubh.2025.1672479
Decoding the Association Between Health Level and Human Settlements Environment: A Machine Learning-Driven Provincial Analysis in China
Provisionally accepted
City University of Hong Kong, Hong Kong, Hong Kong, SAR China
Background: Rapid urbanization in China has significantly reshaped the human settlement environment (HSE), bringing both opportunities and challenges for public health. Existing studies have explored environment-health relationships, but most are confined to micro-level contexts, focus on a single environmental dimension, or assess specific diseases, and therefore lack a comprehensive, macro-level perspective.
Objective: This study assesses the associations between population health level and multidimensional HSE features at the provincial level in China, and uncovers the nonlinear relationships and interaction effects underlying those associations.
Methods: Using panel data from 31 Chinese provinces spanning 2012 to 2022, a composite Health Level Index (HLI) was constructed from four core health indicators using the Entropy-TOPSIS method. Nineteen HSE indicators covering five dimensions (ecological environment, living environment, infrastructure, public services, and sustainable environment) were selected as explanatory variables. The XGBoost machine learning algorithm was used to model the relationship between HSE and HLI. SHAP (Shapley additive explanations) values and Partial Dependence Plots (PDPs) were used to interpret feature importance, nonlinear relationships, threshold values, and interaction effects.
Results: XGBoost outperformed all benchmark models, indicating strong predictive capacity. SHAP analysis identified six features as the most influential: number of medical institution beds (NMIB), urbanization rate (UR), mobile phone penetration rate (MPPR), road area per capita (RAPC), population density (PD), and urban gas penetration rate (UGPR). Nonlinear relationships and threshold effects were observed between these key features and population health level. PDPs further revealed that optimal health levels are typically associated with high UR, high MPPR, high RAPC, and moderate NMIB, underscoring the importance of structural synergy over isolated infrastructure expansion.
Conclusion: This study provides robust evidence that the relationship between HSE and health is nonlinear, multidimensional, and highly interactive. Effective urban health governance requires coordinated development of urbanization, digital infrastructure, and public services, along with rational allocation of healthcare resources. The findings offer actionable insights for health-oriented urban planning and policy formulation in rapidly urbanizing regions.
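The abstract names Entropy-TOPSIS as the method used to build the composite HLI but does not give its details. The following is a minimal sketch of one common formulation, not the authors' implementation. It assumes min-max normalization, entropy-based indicator weights, and Euclidean distances to the ideal and anti-ideal points. The function name `entropy_topsis`, the indicator directions, and all numbers in `example_rows` are hypothetical and chosen purely for illustration.

```python
import math


def entropy_topsis(rows, benefit):
    """Return (weights, scores) for a table of observations.

    rows    : list of observations, each a list of indicator values
    benefit : list of bools, True where a larger value is better
    (Sketch under stated assumptions; the article's exact variant is not given.)
    """
    n, m = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two observations")

    # Step 1: min-max normalise each indicator to [0, 1],
    # reversing cost-type indicators so that 1 is always "best".
    norm = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col = [r[j] for r in rows]
        lo, hi = min(col), max(col)
        span = hi - lo
        for i in range(n):
            if span == 0:
                norm[i][j] = 1.0  # constant indicator carries no information
            elif benefit[j]:
                norm[i][j] = (rows[i][j] - lo) / span
            else:
                norm[i][j] = (hi - rows[i][j]) / span

    # Step 2: entropy weights. Indicators whose values are spread more
    # unevenly across observations (lower entropy) get larger weights.
    k = 1.0 / math.log(n)
    divergence = []
    for j in range(m):
        total = sum(norm[i][j] for i in range(n))
        entropy = 0.0
        for i in range(n):
            p = norm[i][j] / total
            if p > 0:
                entropy -= k * p * math.log(p)
        divergence.append(1.0 - entropy)
    d_sum = sum(divergence)
    if d_sum == 0:
        weights = [1.0 / m] * m  # fall back to equal weights
    else:
        weights = [d / d_sum for d in divergence]

    # Step 3: TOPSIS closeness to the ideal solution on the weighted matrix.
    weighted = [[norm[i][j] * weights[j] for j in range(m)] for i in range(n)]
    best = [max(weighted[i][j] for i in range(n)) for j in range(m)]
    worst = [min(weighted[i][j] for i in range(n)) for j in range(m)]
    scores = []
    for i in range(n):
        d_pos = math.sqrt(sum((weighted[i][j] - best[j]) ** 2 for j in range(m)))
        d_neg = math.sqrt(sum((weighted[i][j] - worst[j]) ** 2 for j in range(m)))
        denom = d_pos + d_neg
        scores.append(d_neg / denom if denom > 0 else 0.0)
    return weights, scores


# Hypothetical data: three observations, four indicators.
# The first indicator is treated as benefit-type, the rest as cost-type.
example_rows = [
    [78.0, 4.0, 10.0, 3.0],
    [76.0, 6.0, 14.0, 5.0],
    [74.0, 9.0, 20.0, 8.0],
]
example_benefit = [True, False, False, False]

weights, scores = entropy_topsis(example_rows, example_benefit)
for label, score in zip("ABC", scores):
    print(label, round(score, 3))
```

In this sketch a score closer to 1 means the observation lies nearer the ideal point on every weighted indicator. With panel data, normalizing over all province-year rows together (rather than year by year) keeps scores comparable across time, though the abstract does not state which choice the study made.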
Keywords: health level, human settlement environment, machine learning, XGBoost, Shapley additive explanations
Received: 24 Jul 2025; Accepted: 21 Aug 2025.
Copyright: © 2025 Zhu and Peng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Haidong Zhu, City University of Hong Kong, Hong Kong, Hong Kong, SAR China
Xiaoqing Peng, City University of Hong Kong, Hong Kong, Hong Kong, SAR China
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.