ORIGINAL RESEARCH article

Front. Water

Sec. Water and Artificial Intelligence

Application and Comparison of Multiple Machine Learning Models in Flood Susceptibility Assessment in the Beijing-Tianjin-Hebei Region of China

Provisionally accepted
Haijun Li, Jiubo Dong*, Yaowen Zhang, Hongtao Liu, Yixin Pang, Shuiqing Zhou, Bingbin Du
  • Institute of Disaster Prevention, Sanhe, China

The final, formatted version of the article will be published soon.

The confluence of extreme precipitation and rapid urbanization has markedly increased flood risk across the Beijing-Tianjin-Hebei (BTH) region. A thorough flood susceptibility assessment is therefore essential for safeguarding the region and supporting its sustainable development. Based on historical flood disaster records, 15 flood-related influencing factors, including elevation, average annual rainfall, and the Normalized Difference Vegetation Index (NDVI), were selected as the initial variable set. A flood susceptibility evaluation framework was established through multicollinearity analysis and feature selection based on the Information Gain Ratio (IGR). Support Vector Machine (SVM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Multilayer Perceptron (MLP) models were employed to conduct the susceptibility assessment. The predictive performance and susceptibility zoning outcomes of the models were systematically compared using the Area Under the Receiver Operating Characteristic Curve (AUC) and a set of statistical evaluation metrics, including accuracy, Kappa coefficient, and sensitivity. The findings demonstrate that: (1) elevation, distance from rivers, average 24-hour maximum rainfall, and slope are the primary controlling factors for flood occurrence in the BTH region; (2) very high and high susceptibility zones are concentrated primarily in topographic transition zones, critical nodes of the river system, and key flood storage and detention areas. The high and relatively high susceptibility zones identified by the four models show strong spatial consistency with the actual distribution of flood disasters, and the models exhibit minimal overfitting; (3) the AUC validation results of the four models rank as follows: XGBoost (0.938) > RF (0.920) > MLP (0.867) > SVM (0.854). Among these models, XGBoost produced the smallest proportion of high-susceptibility zones, demonstrating a superior ability to accurately identify areas with the highest potential flood risk. This study provides a scientific foundation for flood risk management in the BTH region and holds significant practical value for improving regional flood control strategies and spatial planning.
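
The feature-screening and evaluation steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the abstract does not describe. The function names, the toy flood labels, the binned factors, and the 0.5 classification threshold are all hypothetical, and continuous factors such as elevation are assumed to have been discretized before the Information Gain Ratio is computed.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain_ratio(feature_bins, labels):
    """IGR = information gain / split information for one discretized factor."""
    n = len(labels)
    groups = {}
    for b, y in zip(feature_bins, labels):
        groups.setdefault(b, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(feature_bins)
    return gain / split_info if split_info > 0 else 0.0


def auc(y_true, scores):
    """AUC as the share of (flood, non-flood) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and Cohen's kappa for binary labels (1 = flood)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "sensitivity": sensitivity, "kappa": kappa}


# Hypothetical toy data: 1 = flood point, 0 = non-flood point.
flood = [1, 1, 1, 1, 0, 0, 0, 0]
elevation_bins = ["low", "low", "low", "low", "high", "high", "high", "high"]
noise_bins = ["a", "b", "a", "b", "a", "b", "a", "b"]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]  # hypothetical model output
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("IGR elevation:", information_gain_ratio(elevation_bins, flood))
print("IGR noise:", information_gain_ratio(noise_bins, flood))
print("AUC:", auc(flood, scores))
print(confusion_metrics(flood, predicted))
```

In this toy setup the binned elevation factor separates the flood labels perfectly, while the alternating noise factor carries no information, which is the contrast IGR-based screening is meant to expose. In practice, library routines such as those in scikit-learn would normally replace the hand-written metric functions.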

Keywords: Beijing-Tianjin-Hebei region, flood disaster, machine learning algorithms, risk prevention and control, susceptibility evaluation

Received: 14 Dec 2025; Accepted: 19 Jan 2026.

Copyright: © 2026 Li, Dong, Zhang, Liu, Pang, Zhou and Du. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Jiubo Dong

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.