AUTHOR=Zhu Haidong, Peng Xiaoqing TITLE=Decoding the association between health level and human settlements environment: a machine learning-driven provincial analysis in China JOURNAL=Frontiers in Public Health VOLUME=Volume 13 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2025.1672479 DOI=10.3389/fpubh.2025.1672479 ISSN=2296-2565 ABSTRACT=Background: Rapid urbanization in China has significantly reshaped the human settlement environment (HSE), bringing both opportunities and challenges for public health. While existing studies have explored environment-health relationships, most are confined to micro-level contexts, focus on single environmental dimensions, or assess specific diseases, and thus lack a comprehensive, macro-level understanding. Objective: This study aims to assess the associations between population health level and multidimensional HSE features at the provincial level in China, and to uncover the nonlinear relationships and interaction effects underlying those associations. Methods: Using panel data from 31 Chinese provinces spanning 2012 to 2022, a composite Health Level Index (HLI) was constructed from four core health indicators using the Entropy-TOPSIS method. Nineteen HSE indicators covering five dimensions (ecological environment, living environment, infrastructure, public services, and sustainable environment) were selected as explanatory variables. The study employed the XGBoost machine learning algorithm to model the relationship between HSE and HLI. SHAP values and Partial Dependence Plots (PDPs) were used to interpret feature importance, nonlinear relationships, threshold values, and interaction effects. Results: XGBoost outperformed all benchmark models, confirming its strong predictive capacity.
SHAP analysis identified six key features as the most influential factors: number of medical institution beds (NMIB), urbanization rate (UR), mobile phone penetration rate (MPPR), road area per capita (RAPC), population density (PD), and urban gas penetration rate (UGPR). Nonlinear relationships and threshold effects were observed between these key features and population health level. PDPs further revealed that optimal health levels are typically associated with high UR, high MPPR, high RAPC, and moderate NMIB, underscoring the importance of structural synergy over isolated infrastructure expansion. Conclusion: This study provides robust evidence that the relationship between HSE and health is nonlinear, multidimensional, and highly interactive. Effective urban health governance requires coordinated development of urbanization, digital infrastructure, and public services, along with rational healthcare resource allocation. The findings offer actionable insights for health-oriented urban planning and policy formulation in rapidly urbanizing regions.
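The Entropy-TOPSIS step named in the Methods can be illustrated with a minimal sketch. This is a common textbook formulation and not necessarily the authors' exact procedure: the min-max normalization, the function name `entropy_topsis`, and the two toy indicators (a benefit-type and a cost-type column for three made-up regions) are all assumptions for illustration, since the abstract does not name the four health indicators or the normalization used.

```python
import math


def entropy_topsis(rows, benefit):
    """Score alternatives with entropy weights followed by TOPSIS.

    rows    -- list of alternatives, each a list of indicator values
    benefit -- list of bools; True where larger values are better
    Returns (weights, scores); scores are closeness values in [0, 1].
    """
    n, m = len(rows), len(rows[0])

    # Step 1: min-max normalise each indicator, flipping cost-type columns
    # so that 1.0 is always the most favourable value (assumed scheme).
    norm = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col = [row[j] for row in rows]
        lo, hi = min(col), max(col)
        span = hi - lo
        for i in range(n):
            if span == 0:
                norm[i][j] = 1.0  # constant column carries no information
            elif benefit[j]:
                norm[i][j] = (rows[i][j] - lo) / span
            else:
                norm[i][j] = (hi - rows[i][j]) / span

    # Step 2: entropy weights. Lower entropy means the indicator separates
    # the alternatives more, so it receives a larger weight.
    k = 1.0 / math.log(n)
    divergence = []
    for j in range(m):
        total = sum(norm[i][j] for i in range(n))
        entropy = 0.0
        for i in range(n):
            p = norm[i][j] / total
            if p > 0:
                entropy -= k * p * math.log(p)
        divergence.append(max(0.0, 1.0 - entropy))
    dsum = sum(divergence)
    weights = [d / dsum for d in divergence] if dsum > 0 else [1.0 / m] * m

    # Step 3: TOPSIS. Measure Euclidean distance to the ideal-best and
    # ideal-worst points of the weighted matrix, then compute closeness.
    v = [[weights[j] * norm[i][j] for j in range(m)] for i in range(n)]
    best = [max(v[i][j] for i in range(n)) for j in range(m)]
    worst = [min(v[i][j] for i in range(n)) for j in range(m)]
    scores = []
    for i in range(n):
        d_plus = math.sqrt(sum((v[i][j] - best[j]) ** 2 for j in range(m)))
        d_minus = math.sqrt(sum((v[i][j] - worst[j]) ** 2 for j in range(m)))
        denom = d_plus + d_minus
        scores.append(d_minus / denom if denom > 0 else 0.0)
    return weights, scores


# Toy data (hypothetical): column 0 is benefit-type, column 1 is cost-type.
toy_rows = [[80.0, 5.0], [70.0, 10.0], [60.0, 20.0]]
weights, scores = entropy_topsis(toy_rows, benefit=[True, False])
```

In this toy input the first region is best on both indicators and the third is worst on both, so their closeness scores sit at the two ends of the [0, 1] range, with the second region in between. A composite index built this way could then serve as the target variable for a regression model such as the one described in the Methods.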