ORIGINAL RESEARCH article
Front. Med.
Sec. Obstetrics and Gynecology
An Individualized Risk Prediction Tool for Ectopic Pregnancy within the First 10 Weeks of Gestation Based on Machine Learning Algorithms
Provisionally accepted
- 1Nanjing Women and Children’s HealthCare Hospital, Nanjing, China
- 2China Pharmaceutical University School of Basic Medicine and Clinical Pharmacy, Nanjing, China
- 3Nanjing First Hospital, Nanjing, China
Background: Ectopic pregnancy (EP) is the main cause of maternal death in early pregnancy, and delayed diagnosis may lead to severe consequences. Patients with pregnancy of unknown location (PUL) have a significantly higher incidence of EP and associated risks than the general population. This study therefore aims to construct an early prediction model that identifies EP risk among patients with PUL and helps guide clinical intervention.
Methods: We retrospectively recruited 1896 patients with PUL within 10 weeks of gestation. Feature selection was performed using the least absolute shrinkage and selection operator (LASSO). Logistic Regression (LR), Extreme Gradient Boosting (XGB), Random Forest (RFC), Support Vector Machine (SVM), and CatBoost were used to construct the early risk prediction model of EP. Model performance was evaluated by the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and the F1 score. SHapley Additive exPlanations (SHAP) were used to rank feature importance and interpret model output.
Results: Among the PUL patients included in this study, 66 (4.08%) were diagnosed with EP. Key predictors selected for model construction included vaginal bleeding, progesterone, homogeneous adnexal mass, gravidity, hCG levels, history of cesarean section, abdominal tenderness, and history of pelvic surgery. Among the five models, the CatBoost algorithm demonstrated the best performance, achieving an AUROC of 0.930 (95% CI: 0.876–0.984) and an AUPRC of 0.685 (95% CI: 0.464–0.845). A user-friendly web-based platform was developed for EP risk assessment based on this model. According to SHAP analysis, the three most important clinical predictors were vaginal bleeding, progesterone levels, and the presence of a homogeneous adnexal mass.
Conclusion: This study employed the CatBoost algorithm to develop an individualized risk prediction model by integrating multiple features from the initial visit. This model enhances the detection rate of EP in patients with PUL during early pregnancy. Additionally, we created a web-based tool, offering potential for future clinical applications.
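For readers less familiar with the three evaluation metrics named in the Methods (AUROC, AUPRC, and F1 score), the following is a minimal sketch of how each can be computed from predicted risk scores. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight-patient toy data are all assumptions made for illustration, and AUPRC is approximated here by average precision without special handling of tied scores.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case scores higher than a
    random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve: the mean of the
    precision values at the rank of each positive case."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 score after converting scores to labels at a fixed threshold
    (0.5 is an illustrative assumption, not the study's cut-off)."""
    tp = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 1 = EP, 0 = not EP; scores are made-up risk estimates.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05, 0.70, 0.90]

print(f"AUROC: {auroc(y_true, scores):.3f}")
print(f"AUPRC (average precision): {average_precision(y_true, scores):.3f}")
print(f"F1 at 0.5: {f1_at_threshold(y_true, scores):.3f}")
```

One reason to report AUPRC alongside AUROC is class imbalance: a classifier with no skill has an expected AUPRC roughly equal to the prevalence of the positive class, so with EP in about 4% of patients, as reported above, the baseline is about 0.04 rather than 0.5.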
Keywords: first trimester, pregnancy of unknown location, ectopic pregnancy, machine learning, prediction model
Received: 16 Oct 2025; Accepted: 25 Nov 2025.
Copyright: © 2025 Du, Chen, Lu, Hu, Chen, Huang, Ji, Zou, Zhou and Ruan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Jianjun Zou
Zhou Zhou
Hongjie Ruan
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
