ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Medicine and Public Health

This article is part of the Research Topic: Advancing Healthcare AI: Evaluating Accuracy and Future Directions

Antenatal prediction of small for gestational age at birth based on four birthweight standards using machine learning algorithms

Provisionally accepted
  • 1University of Oxford, Oxford, United Kingdom
  • 2Wenzhou Medical University, Wenzhou, China

The final, formatted version of the article will be published soon.

Background: Accurate antenatal prediction of small for gestational age (SGA) at birth is essential to improve the development and delivery of preventative and therapeutic interventions. This study aimed to assess the performance of machine learning (ML) models in predicting SGA at birth among Chinese pregnancies, with SGA classified according to the Chinese birthweight standard and three international birthweight standards.

Methods: We collected multimodal, longitudinal antenatal surveillance data on 350,135 singleton pregnancies in Wenzhou City, China, between Jan 1, 2014 and Dec 31, 2016. For three pregnancy intervals, we developed ML prediction models for newborns classified as SGA using the China, Intergrowth-21st, Fetal Medicine Foundation (FMF), and Gestation-related Optimal Weight (GROW) standards. We applied lasso regression for feature selection, and CatBoost, XGBoost, LightBoost, artificial neural networks, Random Forest, a stacked ensemble model, and logistic regression for predictive modelling in training data sets, with validation in testing data sets.

Results: Among 22,603 singleton pregnancies with complete data, the rate of SGA using the China standard was 6.1%, compared to 4.3%, 6.0%, and 9.7% for the Intergrowth-21st, GROW, and FMF standards, respectively. Late pregnancy models (<37 weeks) predicted SGA better than middle (<26 weeks) and early pregnancy (<18 weeks) models. With the China standard, the logistic regression model in late pregnancy performed best, with an area under the receiver operating characteristic curve (ROC-AUC) of 0.74. Logistic regression also performed better than ML algorithms with the Intergrowth-21st and GROW standards at each pregnancy interval, although differences were small. The Random Forest model with the FMF standard achieved superior performance at each pregnancy interval, reaching a ROC-AUC of 0.79 in late pregnancy. Notably, the middle pregnancy Random Forest model with the FMF standard already attained a ROC-AUC of 0.72 at 26 weeks' gestation. Symphysis-fundal height, maternal abdominal circumference, maternal age, maternal height and weight, and parity were consistently identified as key predictors of SGA across the different standards.

Conclusions: Machine learning models and traditional logistic regression demonstrated comparable predictive performance for SGA identification. These findings hold promise for guiding risk-stratified prenatal care and optimizing resource allocation in clinical settings.
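
For readers less familiar with the metric reported in the abstract, a ROC-AUC can be read as the probability that a model assigns a higher predicted risk to a randomly chosen SGA newborn than to a randomly chosen non-SGA newborn (0.5 is chance, 1.0 is perfect ranking). The following is a minimal Python sketch of that calculation using the rank-sum identity. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not the authors' code or data.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) identity.

    labels: sequence of 0/1 values (1 = SGA in this illustration)
    scores: predicted risks, where higher means more likely SGA
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks by ascending score; tied scores share the average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # U counts the (positive, negative) pairs in which the positive is ranked higher,
    # with ties counted as half a pair.
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical example: 3 SGA and 5 non-SGA newborns with made-up risk scores.
labels = [0, 0, 1, 0, 1, 1, 0, 0]
scores = [0.05, 0.10, 0.35, 0.40, 0.80, 0.20, 0.15, 0.30]
auc = roc_auc(labels, scores)
print(round(auc, 2))  # 12 of the 15 SGA/non-SGA pairs are ranked correctly -> 0.8
```

Under this reading, the late pregnancy ROC-AUC of 0.79 reported for the Random Forest model with the FMF standard would correspond to correctly ranking roughly 79 out of 100 randomly drawn SGA/non-SGA pairs.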

Keywords: artificial intelligence, Birthweight standards, Feature Selection, machine learning, Prediction models, Small-for-gestational-age

Received: 05 Aug 2025; Accepted: 17 Dec 2025.

Copyright: © 2025 Qiuyan, Lin, Zhou, Yang and Hemelaar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Joris Hemelaar

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.