AUTHOR=Rueangket Ploywarong, Rittiluechai Kristsanamon, Prayote Akara TITLE=Predictive analytical model for ectopic pregnancy diagnosis: Statistics vs. machine learning JOURNAL=Frontiers in Medicine VOLUME=Volume 9 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2022.976829 DOI=10.3389/fmed.2022.976829 ISSN=2296-858X ABSTRACT=Objective: Ectopic pregnancy (EP) is well known for its critical outcomes. Early detection could make the difference between life and death in pregnancy. Our aim was to enable prompt diagnosis before rupture occurs. Thus, predictive analytical models using both conventional statistical and machine learning (ML) methods were studied. Materials and methods: A retrospective cohort study was conducted on 407 pregnancies of unknown location (PULs); 377 PULs were used for internal validation and 30 PULs for external validation, randomized with a nested cross-validation technique. A set of 22 study features based on clinical factors, serum markers and ultrasound findings from electronic medical records was analyzed with neural networks (NNs), a decision tree (DT), support vector machines (SVMs) and statistical logistic regression (LR). Diagnostic performance was compared using the area under the receiver operating characteristic curve (ROC-AUC), along with sensitivity and specificity for decisional use. Results: In model performance on internal validation for predicting EP, LR ranked first, with a mean ROC-AUC ± SD of 0.879 ± 0.010. On testing data (external validation), NNs ranked first, followed closely by LR, SVMs and DT, with mean ROC-AUC ± SD of 0.898 ± 0.027, 0.896 ± 0.034, 0.882 ± 0.029 and 0.856 ± 0.033, respectively. For clinical use, mean sensitivity ± SD was 90.20% ± 3.49% for LR, 89.79% ± 3.66% for SVMs, 89.22% ± 4.53% for DT and 86.92% ± 3.24% for NNs.
Mean specificity ± SD was highest for NNs, followed by SVMs, LR and DT, at 82.02% ± 8.34%, 80.37% ± 5.15%, 79.65% ± 6.01% and 78.97% ± 4.07%, respectively. Conclusion: Both statistical and ML models achieved satisfactory predictions for EP. In model learning, the highest-ranked model was LR, suggesting that EP prediction might possess a linear or causal data pattern. However, on new testing data, NNs outperformed the statistical model. This highlights the potential of ML for solving complicated problems with varied patterns while overcoming generalization error. Key Words: ectopic pregnancy, pregnancy of unknown location, machine learning, neural networks, decision tree, support vector machines