SYSTEMATIC REVIEW article

Front. Bioeng. Biotechnol., 19 January 2022

Sec. Cell and Gene Therapy

Volume 9 - 2021 | https://doi.org/10.3389/fbioe.2021.780389

Using Machine Learning to Predict Complications in Pregnancy: A Systematic Review

  • 1. Metabolic Diseases Research Laboratory (MDRL), Interdisciplinary Center for Research in Territorial Health of the Aconcagua Valley (CIISTe Aconcagua), Center for Biomedical Research (CIB), Universidad de Valparaíso, Valparaiso, Chile

  • 2. PhD Program Doctorado en Ciencias e Ingeniería para La Salud, Faculty of Medicine, Universidad de Valparaíso, Valparaiso, Chile

  • 3. School of Biomedical Engineering, Faculty of Engineering, Universidad de Valparaíso, Valparaiso, Chile

  • 4. Centro de Investigación y Desarrollo en INGeniería en Salud – CINGS, Universidad de Valparaíso, Valparaiso, Chile

  • 5. Instituto Milenio Intelligent Healthcare Engineering, Valparaíso, Chile

  • 6. Cellular and Molecular Physiology Laboratory (CMPL), Division of Obstetrics and Gynaecology, School of Medicine, Faculty of Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile

  • 7. Department of Physiology, Faculty of Pharmacy, Universidad de Sevilla, Seville, Spain

  • 8. University of Queensland Centre for Clinical Research (UQCCR), Faculty of Medicine and Biomedical Sciences, University of Queensland, Herston, QLD, Australia

  • 9. Department of Pathology and Medical Biology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands

  • 10. Medical School (Faculty of Medicine), São Paulo State University (UNESP), São Paulo, Brazil

  • 11. Tecnologico de Monterrey, Eutra, The Institute for Obesity Research, School of Medicine and Health Sciences, Monterrey, Mexico

  • 12. School of Medicine, Campus San Felipe, Faculty of Medicine, Universidad de Valparaíso, San Felipe, Chile

Abstract

Introduction: Artificial intelligence is widely used in the medical field, and machine learning is increasingly used in health care, prediction, and diagnosis, and as a method of determining priority. Machine learning methods have featured in several tools in the fields of obstetrics and childcare. The present review aims to summarize the machine learning techniques used to predict perinatal complications.

Objective: To identify the applicability and performance of machine learning methods used to identify pregnancy complications.

Methods: A total of 98 articles were obtained by combining the keywords “machine learning,” “deep learning,” and “artificial intelligence” with terms related to perinatal complications (“complications in pregnancy,” “pregnancy complications”) in three scientific databases: PubMed, Scopus, and Web of Science. The articles were managed on the Mendeley platform and classified using the PRISMA method.

Results: A total of 31 articles were selected after applying the inclusion and exclusion criteria. The features used to predict perinatal complications were primarily electronic medical records (48%), medical images (29%), and biological markers (19%), while 4% were based on other types of features, such as sensors and fetal heart rate. The main perinatal complications considered in the application of machine learning thus far are pre-eclampsia and prematurity. Across the 31 studies, a total of 16 complications were predicted. The main performance metric used is the area under the curve (AUC). The machine learning methods with the best results were the prediction of prematurity from medical images using the support vector machine technique, with an accuracy of 95.7%, and the prediction of neonatal mortality with the XGBoost technique, with 99.7% accuracy.

Conclusion: It is important to continue promoting this area of research and to develop machine learning solutions with multicenter clinical applicability in order to reduce perinatal complications. This systematic review contributes significantly to the specialized literature on artificial intelligence and women’s health.

Introduction

While most pregnancies and births are uneventful, all pregnancies are at risk. About 15% of all pregnant women will develop a life-threatening complication that requires specialized care, and some will require major obstetric intervention to survive (WHO, 2019). According to the World Health Organization (WHO), around 800 women die every day around the world from preventable causes related to the inherent risks of pregnancy. About 295,000 women died during and following pregnancy and childbirth in 2017. The vast majority of these deaths (94%) occurred in low-resource settings, and most could have been prevented (WHO, 2019).

Several maternal factors influence the appearance of perinatal complications. It is recognized that the first trimester of pregnancy is the best stage to predict and prevent perinatal complications. For example, it is known that increasing obesity in women of childbearing age leads to an increased risk of perinatal complications such as gestational diabetes, large for gestational age (LGA), fetal macrosomia, and hypertensive syndromes in pregnancy (Denison et al., 2010; Mariona, 2016; Edwards and Wright, 2020). On the other hand, developed countries tend to see birth rates decrease over the years and maternal age at pregnancy increase, which predisposes women to adverse pregnancy outcomes (Laopaiboon et al., 2014).

Artificial intelligence (AI) technologies have been developed to analyze a wide range of health data, including patient data from multi-omic approaches; clinical, behavioral, environmental, and drug data; and data included in the biomedical literature (Hinton, 2018). AI can help professionals make decisions, reduce medical errors, and improve accuracy in the interpretation of various diagnoses, thereby reducing the workload to which they are exposed (Makary and Daniel, 2016). Machine learning (ML) is a subfield of computer science and a branch of AI. ML techniques provide the ability to infer meaningful connections between data items from various data sets that would otherwise be difficult to correlate (Darcy et al., 2016; Obermeyer and Emanuel, 2016). Due to the large quantity and complex nature of medical information, ML is recognized as a promising method for supporting diagnosis or predicting clinical outcomes (Bottaci et al., 1997; Frizzell et al., 2017).

There are different types of data used for health learning models, including electronic medical records, medical images, biochemical parameters, and biological markers (Ahmed et al., 2020). The type of data that is used depends on what one tries to diagnose through ML.

Most of these decision support systems remain complex black boxes: their internal logic is hidden from the clinical team, who cannot fully understand the rationale behind their predictions. Interpretability is important before any health-care team can increase its reliance on ML systems (Carvalho et al., 2019). Therefore, the research community has focused on developing both interpretable models and explanatory methods in recent years.

In general, ML models are validated using a train–test split or a cross-validation scheme. Models are usually first fitted to a training data set (Sohil et al., 2021), a set of examples used to fit the model parameters. Model fitting may include both variable selection and parameter estimation (Ripley, 1996). The test data set is used to provide an unbiased evaluation of a final model fitted on the training data set (Brownlee, 2017). Cross-validation is a statistical method for evaluating and comparing learning algorithms by dividing the data into k folds; in each round, the data are separated into two segments: one used to learn or train a model and one used to validate the model. In typical cross-validation, the training and validation sets are crossed over in successive rounds so that each data point has a chance to be validated (Refaeilzadeh et al., 2009). Deciding the sizes and strategies for partitioning data sets into training, test, and validation sets depends mainly on the problem and the available data. The performance metrics of an ML model quantify how well its predictions agree with the actual diagnosis. Some of the commonly used metrics are accuracy (number of correctly classified assessments over the total number of assessments), precision, sensitivity and specificity, predictive values, likelihood ratios, and the area under the ROC curve (Šimundić, 2009). These metrics must be taken into account when evaluating the success of an ML system in predicting a medical diagnosis. It is relevant to note that the area under the curve (AUC) is one of the main performance metrics used in prediction systems; however, metrics such as precision are recommended to complement the results.
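As an illustration of the metrics named above, the following minimal Python sketch computes accuracy, sensitivity, specificity, and precision from binary predictions, and estimates the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function names, the 0.5 decision threshold, and the toy data are assumptions made for this example; they are not taken from any of the reviewed studies.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives (1 = complication, 0 = none)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Threshold-dependent metrics derived from the confusion matrix."""
    c = confusion_counts(y_true, y_pred)
    return {
        "accuracy": _ratio(c["tp"] + c["tn"], sum(c.values())),
        "sensitivity": _ratio(c["tp"], c["tp"] + c["fn"]),
        "specificity": _ratio(c["tn"], c["tn"] + c["fp"]),
        "precision": _ratio(c["tp"], c["tp"] + c["fp"]),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration: 3 complications among 10 pregnancies.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(classification_metrics(y_true, y_pred))
print("AUC:", round(roc_auc(y_true, scores), 3))
```

The AUC is computed from the scores and does not depend on the threshold, whereas the other four metrics change with it, which is one reason to report them together.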

Recent studies have described how AI has been involved in areas like gynecology and obstetrics (Iftikhar et al., 2020; Cecula, 2021); however, the effect of all ML techniques on the prediction of perinatal complications has not been reviewed. Thus, we decided to carry out this review to present and synthesize different ML-based models, highlighting the main input characteristics used for training, output results, performance metrics in prediction, and contribution to decision-making related to perinatal complications associated with non-congenital risk factors in pregnant women.

Methods

This systematic review was carried out following the guidelines for systematic reviews and meta-analysis (PRISMA) (Urrútia and Bonfill, 2010) (Supplementary Table S1).

Information Sources and Search Strategy

Full and original articles related to ML techniques on complications during pregnancy published in English from 2015 to 2020 were searched on the PubMed, Web of Science, and Scopus databases. Search terms were chosen and searches performed in an iterative process, initially using word headings associated with ML, such as “machine learning,” “deep learning,” and “artificial intelligence,” and headings related to perinatal complications, such as “complications in pregnancy” and “pregnancy complications,” and excluding articles related to postpartum and congenital complications. For PubMed, the MeSH terms were used to include associated synonyms in the search, and for Scopus and Web of Science, the terms of interest mentioned before were combined with Boolean operators (Table 1). The search yielded 98 articles, of which 20 were excluded as duplicates.

TABLE 1

Database | Search expression | Year of publication
PubMed | [“Machine learning” (Mesh)] AND “Pregnancy Complications” (Mesh) NOT (“postpartum”) | 2015–2020
Web of Science, Scopus | (“Machine learning” OR “Deep learning”) AND (“complications in pregnancy” OR “pregnancy complications” OR “perinatal complications”) NOT (“postpartum”) | 2015–2020

Search expressions used in the systematic review.

Eligibility Criteria

The inclusion criteria for the articles searched were 1) original articles in English, 2) access to full text, 3) studies based on humans, 4) studies using machine learning methods to predict complications in pregnancy, and 5) complications during pregnancy and at term in the mother and the newborn. The exclusion criteria applied were 1) systematic reviews, meta-analyses, and bibliographic reviews; 2) articles that included postpartum complications; 3) maternal congenital disease that increases the risk of perinatal complications; and 4) fetal congenital diseases. Articles were added manually according to the aforementioned criteria.

Article Screening

All articles found were uploaded to the Mendeley desktop platform, where they were saved in a dedicated folder for the present systematic review. After eliminating the duplicate articles, a total of 78 articles remained. Then 16 articles were excluded by title, 18 were excluded by criteria, and 19 were excluded after reading. Finally, 31 articles for the review were selected. The selected articles were classified by the ML model used, type of features used, outputs, and performance metrics, in order to estimate which methods are the most accurate in the context of predicting perinatal complications.

Risk of Bias

The 31 articles were subjected to the CASP checklist, which contains 11 questions to help evaluate a clinical prediction rule (CASP, 2017). Study quality was scored according to the CASP critical score: if the criterion was met entirely = 2 points; criterion partially met = 1 point; and criterion not applicable/not met/not mentioned = 0. Finally, study quality was ranked: a total score of 22 = high quality; 16–21 = moderate quality; and ≤15 = low quality.
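The scoring rule described above can be written as a short Python sketch. The answer labels (`"met"`, `"partial"`, `"not_met"`) and the function names are hypothetical conventions chosen for this example; only the point values and the quality thresholds come from the text.

```python
from typing import Sequence

# Points per checklist answer, as stated in the text.
# The label strings themselves are illustrative assumptions.
POINTS = {"met": 2, "partial": 1, "not_met": 0}
N_QUESTIONS = 11  # the CASP clinical prediction rule checklist has 11 questions


def casp_total(answers: Sequence[str]) -> int:
    """Sum the points for one study's 11 checklist answers (maximum 22)."""
    if len(answers) != N_QUESTIONS:
        raise ValueError(f"expected {N_QUESTIONS} answers, got {len(answers)}")
    return sum(POINTS[answer] for answer in answers)


def casp_quality(total: int) -> str:
    """Map a total score to the quality rank used in this review."""
    if not 0 <= total <= 2 * N_QUESTIONS:
        raise ValueError("total score must be between 0 and 22")
    if total == 22:
        return "high"
    if total >= 16:
        return "moderate"
    return "low"


# Hypothetical study: 8 criteria fully met, 3 partially met.
example_answers = ["met"] * 8 + ["partial"] * 3
example_total = casp_total(example_answers)
print(example_total, casp_quality(example_total))
```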

Data Synthesis and Visualization

To optimize the visualization of the results obtained in the systematic review, several tables were made according to the terms addressed in the search, showing complications that the models seek to predict, input characteristics for the training of the ML model, the type of ML used, and its validation and performance metrics.

Results

Study Characteristics

To apply the PRISMA method, the articles were classified according to the criteria mentioned before: title, abstract, and full article. A total of 84 articles were found, of which 52 were excluded because they did not meet the search criteria of interest. Of these, 16 were eliminated by title, 18 after reading the abstract, and 19 after reading the entire article, leaving 31 articles to analyze (Figure 1 and Supplementary Table S2). The studies analyzed were mainly cohort studies (87.2%) with a retrospective design (96.8%). The populations studied were primarily from Asia and Europe (32.3% each), followed by North and South America (22.5% and 6.5%, respectively). The largest share of studies was published in 2019 (35.5%) (Table 2). The features mainly used to predict perinatal complications are electronic medical records (48%), followed by medical images (29%) and biological markers (19%); the remaining 4% are based on other types of features, in this case sensors (Moreira et al., 2016a) and fetal heart rate (Zhao et al., 2019). Two studies combine two types of features, electronic medical records and medical images (Nair, 2018; Lipschuetz et al., 2020).

FIGURE 1

TABLE 2

Type of study | Temporality | Geographic location of the study group | Year of publication
Cohort (87.2%) | Retrospective (96.8%) | Asia (32.3%) | 2015 (3.2%)
Case-control (6.4%) | Prospective (3.2%) | Europe (32.3%) | 2016 (9.6%)
Exploratory (3.2%) | | North America (22.5%) | 2017 (12.9%)
Cross-sectional (3.2%) | | South America (6.5%) | 2018 (19.4%)
 | | Africa (3.2%) | 2019 (35.5%)
 | | Oceania (3.2%) | 2020 (19.4%)

Main characteristics of selected articles.

According to the CASP checklist, one article achieved the maximum score and was classified as high quality (Gao et al., 2019). The rest of the articles were classified as moderate quality and none as low quality according to the evaluation criteria (average total score = 18–19). It is essential to mention that the “non-compliance” items were either not mentioned in the study or not applicable to it. The item asking whether the sample was randomized does not apply to 15 articles, since they analyzed retrospective electronic health records or images. Regarding the use of a comparison group, the item does not apply to 12 reports because of their retrospective data and the data management used for the prediction model (Supplementary Figure S1).

Features Studied

The choice of informative, discriminatory, and independent features is crucial to achieving effective algorithms for pattern recognition, classification, and regression. Thus, the four types of features analyzed in the articles were electronic medical records (EMRs) (Table 3), medical images (recordings, echotomography, ultrasound, magnetic resonance, etc.) (Table 4), biological markers (Table 5), and others (sensors and fetal heart rate) (Table 6).

TABLE 3

Electronic medical records
RefTime of data collectionNumber of recordsOutcomeValidation techniqueML methodsPerformance metrics
AUCSen. (%)Spec. (%)Acc. (%)
Lipschuetz et al. (2020)During pregnancy with term delivery9,888TOLAC failure risk10-fold cross-validation and deletion of a portion of the dataGradient increasing machines0.793
-HighRF0.756
-MediumRF0.782
-LowAdaBoost set0.784
Hamilton et al. (2020)<22 gw100Severe neonatal mortality v/s no severe neonatal mortality10 replicates of 10-fold cross-validation and on the one standard error ruleDecision tree0.85379.780.975.6
SVM0.85179.179.677.4
Generalized additive model0.85080.681.875.0
Simple neural network0.84878.580.773.3
Artzi et al. (2020)<20 gw588,622High-risk GDM v/s low-risk GDMCross-validation on the training set, and resampling from the validationGradient augmentation machine built with decision tree base learners0.850
Jhee et al. (2019)Early second trimester to 34 gw1,006Pre-eclampsia v/s no pre-eclampsiaTraining (70%) validation set (30%)Logistic regression70.386.2
Decision tree64.887.4
Naive Bayes5089.9
SVM13.789.2
RF67.992.3
Stochastic gradient augmentation method60.397.3
Rittenhouse et al. (2019)During pregnancy (not specified)1,450Premature v/s not prematureak-fold cross-validation (with 10 folds)Binary logistic regression model, RF classification, and generalized additive model0.86898.9
Gestational age predictionk-fold cross-validation (with 10 folds)Combined continuous model of linear regression, RF, regression, and generalized additive models0.87890.2
Kuhle et al. (2018)Pre-pregnancy at 26 gw30,705LGA v/s AGATest (20%) training (80%) and ten-fold cross-validation in the training dataRF0.72879.9
Decision tree0.71879.4
Elastic net0.74880.9
Gradient increasing machines0.74880.5
Logistic regression0.74581.3
Neural network0.74681.2
SGA v/s AGATest (20%) training (80%) and ten-fold cross-validation in the training dataRF0.74590.3
Decision tree0.71380.1
Elastic net0.77191.2
Gradient increasing machines0.76691.1
Logistic regression0.77191.2
RF0.77291.4
Khatibi et al. (2019)During pregnancy, before 37 gw1,547,677Non-premature delivery v/s prematureTraining datasetSet of decision trees, SVM and RF0.6881.0
Malacova et al. (2020)During pregnancy (not specified)952,813Miscarriage v/s born aliveDataset was randomly divided into 10 foldsArtificial neural networks: multilayer perceptron + radial base networks8094.190.9
Pan et al. (2017)During pregnancy (not specified)6,457Adverse delivery v/s non-adverse delivery10-fold cross-validation and repeated the cross-validation process with new folds 9 more times in the test setLogistic regression31.9
Linear discriminant analysis31.7
RF30.1
Naive Bayes29.2
Moreira et al., 2016bDuring pregnancy (not specified)25Hypertensive disorder v/s without hypertensive disorder10-fold cross-validation for decision treesDecision tree J480.74860
5-fold cross-validation methodNaive Bayes0.78252
Gao et al. (2019)During pregnancy (not specified)45,858Severe maternal morbidity v/s no serious maternal morbidityTrain dataset and 10-fold stratified cross-validationLogistic regression0.93776.5
Mailath-Pokorny et al. (2015)Between 22 and 32 gw617Delivery prediction within 48 h of transfer v/s Before 32 gwValidation setMultivariate logistic regression0.850
Shigemi et al. (2019)Data from the first and last prenatal checkup15,263Macrosomia v/s No macrosomiaTraining dataset (90%) and a validation dataset (10%)Logistic regression0.8808855
RF0.9906082
Paydar et al. (2017)Before the first trimester149Live births v/s stillbirthsTest (70%) training (30%)Logistic regression0.83440.599.794.7
Decision tree0.80840.694.799.7
RF0.83641.194.799.7
XGBoost0.84245.394.799.7
Artificial neural networks multilayer perceptron0.84043.594.799.7
Spontaneous preterm birthMultivariate logistic regression0.670
Boland et al. (2017)Each trimester of pregnancy36,898Pregnancies without congenital abnormality v/s pregnancies with congenital abnormalityMethod of data validation is not identifiedRF88.9

Perinatal complications predicted through ML models using electronic medical records.

Ref., references; ML, machine learning; AUC, area under curve; Sen, sensitivity; Spec, specificity; Acc, accuracy; TOLAC, trial of labor after caesarean; RF, random forest; gw, gestational weeks; SVM, support vector machine; GDM, gestational diabetes mellitus; LGA, large for gestational age; AGA, adequate for gestational age; SGA, small for gestational age.

(a) This study also uses biological markers.

TABLE 4

Medical Images
RefTime of data collectionNumber of recordsOutcomeValidation techniqueML methodsPerformance metrics
AUCSen. (%)Spec. (%)Acc. (%)
Sun et al. (2019)After 24 gw155Placental invasion v/s placenta previa simpleTest (83%) Training (17%)Genetic algorithm-based machine learning algorithm implemented in TPOT0.98010088.595.2
Chen et al. (2019)150 EHG in pregnancy (not specified) and 150 EHG in labor (24 h before delivery usually)300Premature v/s born of termTest (67%) training (33%)Stacked sparse autoencoder0.900928890
Extreme learning machine0.840808883
SVM0.850888285
Fergus et al. (2018)>36 gw552Vaginal delivery v/s caesarean sectionTest (80%) training (30%)SVM RF and linear discriminant analysis of features0.9608790
Borowska et al. (2018)From 24 to 28 gw20Deliver after 7 days v/s deliver within 7 days10-fold cross-validationPCA + SVM83.32
RQA + SVM79.3
Veeramani and Muthusamy (2016)During pregnancy (not specified)niDiagnosis of recurrent lung diseases in the newbornTest TrainingRVM100
Multilevel RVM90
Romeo et al. (2019)During pregnancy (not specified)108Delivery with placental accreta spectrum v/s delivery without placental accreta spectrumTest (75%) training (25%) and a 10-fold cross-validationRF93.793.795.6
K-nearest neighbor97.598.798.1
Naive Bayes86.17580.5
Multilayer perceptron92.483.888.6
Sadi-Ahmed et al. (2017)Between the 27th and the 32nd gw30Premature vs. term100 iterations of “holdout” cross-validation for training and test setsSVM0.95298.49395.7
Cömert et al. (2018)During pregnancy (not specified)552Presence of fetal hypoxia v/s absence of fetal hypoxiaTest (90%) training (10%) and 10-fold cross-validationLeast squares support vector machines63.565.965.4
Weber et al. (2018)First prenatal visit∼2,700,000Born preterm v/s born of term in white women v/s colorTest set and 5-fold cross-validationLogistic Regression0.6255662.5

Perinatal complications predicted through ML models using medical images.

Ref., references; ML, machine learning; AUC, area under curve; Sen, sensitivity; Spec, specificity; Acc, accuracy; gw, gestational weeks; TPOT, tree-based pipeline optimization tool; EHG, electrohysterographic; SVM, support vector machine; PCA, principal components analysis; RQA, recurrence quantification analysis; RVM, relevance vector machine.

TABLE 5

Biological Markers
RefTime of data collectionNumbers of recordsOutcomeValidation techniqueML methodsPerformance metrics
AUCSen. (%)Spec. (%)Acc. (%)
Guo et al. (2020)*For GDM <18 gw2,199GDMTraining and validationLogistic regression0.73272.6
For PEPE0.81381.5
<20gwMA0.7667182.380.0
For MA and FGR, 12–28 gwFGR0.77579.5
Liu et al. (2019)>20 gw77PE v/s controlTest and trainingSVM0.9589566.7
Nair (2018)>20 gw38PE v/s controlTest (85%) training (15%)Artificial neural networks multilayer perception0.908
Yoffe et al. (2019)First trimester of gestation43GDM v/s without GDMaTrained and evaluated the datasets via a leave-one-out cross-validationLogistic regression0.740884076
RF0.810944081
AdaBoost0.770946086
Munchel et al. (2020)Between 12 and 37 gw113Severe PE v/s without PEDataset trained with 10-fold stratified cross-validationAdaBoost0.964889289

Perinatal complications predicted through ML models using biological markers.

Ref., references; ML, machine learning; AUC, area under curve; Sen, sensitivity; Spec, specificity; Acc, accuracy; GDM, gestational diabetes mellitus; gw, gestational weeks; PE, pre-eclampsia; MA, macrosomia; FGR, fetal growth restriction; SVM, support vector machine.

(a) This study also uses electronic medical records.

TABLE 6

Other features
Ref. | Time of data collection | Number of records | Outcome | Validation technique | ML methods | AUC | Sen. (%) | Spec. (%) | Acc. (%)
Moreira et al. (2016a) | During pregnancy (not specified) | 25 | Complication in hypertensive disorder v/s without complication in hypertensive disorder (a) | Leave-one-out method of cross-validation | Naive Bayes | 0.687 | 42.3 | 94.4 | 80
Zhao et al. (2019) | Intrapartum | 552 | Presence v/s absence of fetal acidemia (b) | Training set and 10-fold cross-validation | Deep convolutional neural network | 0.978 | 98.2 | 94.9 | 98.4

Perinatal complications predicted through ML models using sensors and fetal heart rate.

Ref., references; ML, machine learning; AUC, area under curve; Sen, sensitivity; Spec, specificity; Acc, accuracy.

(a) Sensors.

(b) Fetal heart rate.

Perinatal Complications to Predict

The predicted complications were divided into 16 main prediction outputs: prematurity, pre-eclampsia, adverse delivery, size for gestational age, gestational diabetes mellitus, neonatal mortality, fetal acidemia, fetal hypoxia, placenta accreta, pulmonary diseases, cesarean section, placental invasion, congenital anomaly, severe maternal morbidity, spontaneous abortion, and trial of labor after cesarean (TOLAC) failure (Figure 2). The main perinatal complications considered in the application of ML are prematurity (7 studies) and pre-eclampsia (6 studies).

FIGURE 2

Validation Methods

Validation methods are strategies that allow the predictive capacity of ML models to be estimated. Fifty-five percent of the studies combine a train–test split with cross-validation, which gives greater reliability in the results, while 41.8% use a single validation method and 3.2% do not use any validation method (neither a train–test split nor cross-validation).
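The two validation schemes counted above can be sketched in plain Python as index-splitting helpers: a hold-out train–test split, followed by k-fold cross-validation inside the training portion. The function names, the 20% test fraction, the choice of five folds, and the record count are assumptions for illustration only; the reviewed studies used a variety of proportions and fold counts (Tables 3–6).

```python
import random
from typing import Iterator, List, Tuple


def train_test_split_indices(
    n: int, test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle record indices 0..n-1 and hold out a fraction as the test set."""
    if n < 2 or not 0.0 < test_fraction < 1.0:
        raise ValueError("need n >= 2 and 0 < test_fraction < 1")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Keep at least one record on each side of the split.
    n_test = min(n - 1, max(1, round(n * test_fraction)))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(
    n: int, k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (fit, validation) index lists; each record is validated exactly once."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i, validation in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


n_records = 25  # hypothetical data set size
train_idx, test_idx = train_test_split_indices(n_records, test_fraction=0.2)

# Cross-validate within the training portion only; the test set stays untouched.
# The yielded values are positions within train_idx, not original record indices.
for fold, (fit_pos, val_pos) in enumerate(k_fold_indices(len(train_idx), k=5)):
    print(f"fold {fold}: fit on {len(fit_pos)} records, validate on {len(val_pos)}")
```

Keeping the test indices out of the cross-validation loop is what allows the final test evaluation to remain an unbiased estimate.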

ML Models and Performance Metrics

In the present review, 67.7% of the articles used AUC and 61.3% used the accuracy metric. Sensitivity was likewise evaluated in only 61.3% of the studies. While all studies assessed results with at least one performance metric, reports of predictive performance were often incomplete: 38.7% of the studies reported at most two performance metrics. None of the studies had a clinical application; they served only to establish accurate prediction systems for diagnosing the different perinatal complications presented.

Twenty-one different ML methods were used to predict these 16 perinatal complications. Placental invasion refers to placental adhesive disorders observed in women with placenta previa or a prior cesarean section; these disorders lead to complications such as perinatal hemorrhage and visceral injuries, so an early diagnosis is necessary for appropriate treatment (Sun et al., 2019). Excellent performance in predicting placental invasion was observed, with an AUC of 0.980 and an accuracy of 95.2%, using the Tree-based Pipeline Optimization Tool (TPOT) (Sun et al., 2019). In the prediction of fetal acidemia, convolutional neural networks achieved an AUC of 0.978 and an accuracy of 98.4% (Zhao et al., 2019). Only one of the six studies attempting to diagnose pre-eclampsia achieved a performance considered good, using the AdaBoost model, with an AUC of 0.964 and an accuracy of 89% (Munchel et al., 2020). The prediction of prematurity has excellent results in two studies: the one that uses SVM achieves an AUC of 0.952 and an accuracy of 95.7% (Sadi-Ahmed et al., 2017), and the one that uses a stacked sparse autoencoder achieves an AUC of 0.900 and an accuracy of 90% (Chen et al., 2019). For the prediction of neonatal mortality from sociodemographic records using XGBoost, an AUC of 0.842 and an accuracy of 99.7% were obtained (Hamilton et al., 2020). Of the two complications addressed in the greatest number of studies, prematurity was predicted with better performance than pre-eclampsia according to the AUC (Table 7).

TABLE 7

Prediction | Input characteristics | ML model | Performance | No. of pregnant women
Placental invasion | Magnetic resonance | TPOT | AUC: 0.980 – Acc: 95.2% | 100–1,000
Fetal acidemia | Maternal sociodemographic characteristics | Neural networks | AUC: 0.978 – Acc: 98.4% | 100–1,000
Pre-eclampsia | Biological marker | AdaBoost | AUC: 0.964 – Acc: 89% | <100
Prematurity | EHG recordings | SVM | AUC: 0.952 – Acc: 95.7% | 100–1,000
Prematurity | EHG recordings | Stacked sparse autoencoder | AUC: 0.900 – Acc: 90% | 100–1,000
Neonatal mortality | Maternal sociodemographic characteristics | XGBoost | AUC: 0.842 – Acc: 99.7% | >10,000

Models with best performance according to AUC and accuracy.

ML, machine learning; TPOT, tree-based pipeline optimization tool; AUC, area under curve; Acc, accuracy; EHG, electrohysterogram; SVM, support vector machine.

We also examined the performance of the methods based on deep learning. Only four studies used deep learning methods, and all of them performed excellently. For the prediction of fetal acidemia, a deep convolutional network was used, with an AUC of 0.978 and an accuracy of 98.4% (Zhao et al., 2019). For the prediction of spontaneous abortion, multilayer perceptron and radial-based networks were used, with an accuracy of 90.9% (Paydar et al., 2017). For the prediction of pre-eclampsia, using biological markers and a multilayer perceptron, an AUC of 0.908 was obtained (Nair, 2018). For the prediction of neonatal mortality from sociodemographic records using a multilayer perceptron, an AUC of 0.84 and an accuracy of 99.7% were obtained (Hamilton et al., 2020) (Table 8).

TABLE 8

Prediction | Input characteristics | Deep learning model | Performance | No. of pregnant women
Fetal acidemia | Maternal and newborn sociodemographic characteristics | Deep convolutional network | AUC: 0.978 – Acc: 98.4% | 100–1,000
Spontaneous abortion | Maternal sociodemographic characteristics | Multilayer perceptron and radial-based networks | Acc: 90.9% | 100–1,000
Pre-eclampsia | Biological markers | Multilayer perceptron | AUC: 0.908 | <100
Neonatal mortality | Maternal sociodemographic characteristics | Multilayer perceptron | AUC: 0.84 – Acc: 99.7% | >100,000

Models and precision based on deep learning.

AUC, area under curve; Acc, accuracy.

Interpretable ML Models

The interpretability of ML models refers to the degree to which a human being can consistently predict the outcome of the model (Kim et al., 2016), a property that is well accepted by clinical teams. In this systematic review, we found that 24% of the studies use interpretable ML models. The ML methods most used in the prediction of perinatal complications were random forest, logistic regression, neural networks, and support vector machine (SVM).

Predictive Variables

Forty-eight percent of the studies explain the main characteristics of pregnant women that could be relevant for predicting certain conditions. Characteristics and antecedents such as gestational diabetes, cardiovascular disease, underlying diseases, and the age of the mother, as well as the presence of chronic arterial hypertension, are considered high-ranking features for the prediction of premature births, and the father’s nationality is very important for differentiating provider-initiated from spontaneous preterm births (Khatibi et al., 2019).

On the other hand, important predictors of the likelihood that a newborn will be small for gestational age (SGA) were smoking, a particular amount of gestational weight gain, and a low-birth-weight newborn. The body mass index (BMI) before pregnancy, gestational weight gain, and a macrosomic newborn in a previous delivery were the strongest predictors of large for gestational age (LGA) newborns (Kuhle et al., 2018). To predict fetal macrosomia, the determining variables were age ≥30, multiparity, 12 kg of total weight gain during pregnancy, abdominal circumference >95 cm (at the last perinatal checkup), and a gestation period over 39 weeks (Shigemi et al., 2019).

To predict pre-eclampsia, the most influential variables were systolic blood pressure, serum levels of urea nitrogen and creatinine, platelet count, serum potassium level, leukocyte count, blood glucose level, serum calcium, and proteinuria levels in the early second trimester (Jhee et al., 2019). Interestingly, high pre-pregnancy BMI and previous preterm births (Pan et al., 2017) were able to predict whether pregnant women would have an adverse pregnancy outcome (preterm birth, low birth weight, neonatal/infant death, or a stay in the neonatal intensive care unit) and indicated the main risk characteristics.

Furthermore, to predict the risk of TOLAC failure, the determining factors in the prediction model were parity, age, vaginal birth with cesarean section in the past, gestational weeks, minimum gestation week in previous deliveries, the weight of the newborn from the previous delivery, dilation, and head position (Lipschuetz et al., 2020). To predict pregnancy complications associated with placental alterations (pre-eclampsia, GDM, fetal growth restriction, macrosomia), maternal age, BMI, newborn weight, and the results of adverse events in previous pregnancies were the most influential characteristics in the study (Guo et al., 2020).

To predict gestational age at delivery (if the newborn will be preterm) variables such as the date of the mother’s last menstruation, birth weight, delivery of twins, maternal height, hypertension during labor and HIV serological status were decisive in the ML model (Rittenhouse et al., 2019). To determine preterm birth, the presence of premature rupture of membranes and/or vaginal bleeding, ultrasound cervical length, gestation week, fetal fibronectin, and serum C-reactive protein were the determining variables (Mailath-Pokorny et al., 2015). In another study, prediction of preterm birth considered the most relevant variables to be maternal age, whether the mother was black, Hispanic, Asian, born in the United States, delivered by herself or assisted by a physician, presence of diabetes mellitus, chronic arterial hypertension, thyroid dysfunction, asthma, previous stillbirth, fetal weight loss, in vitro fertilization, nulliparity, being a smoker during the first trimester, and BMI (Weber et al., 2018).

Stillbirth can potentially be identified prenatally considering the combination of current pregnancy complications, congenital anomalies, maternal characteristics, and medical history (Malacova et al., 2020). Determining factors for the prediction of fetal acidemia were maternal age, gestational age, pH, extracellular fluid deficit, pCO2, base excess, APGAR 1 and 5 min, parity, gestational diabetes, birth weight, child sex, and the type of delivery (Zhao et al., 2019).

In the case of the prediction of severe maternal morbidity, the following characteristics were determining factors: ventilator dependence, intubation, critical care, acute respiratory failure, ventilation, trauma and postoperative pulmonary failure, fluid and electrolyte disorder, systemic inflammatory response syndrome, acidosis, and septicemia (Gao et al., 2019).

Clinical Applicability of ML Systems

None of the studies reported a clinical application; they served only to establish precise prediction systems for diagnosing the perinatal complications presented.

Discussion

Input Variables on Machine Learning

Machine learning plays a vital role and offers solutions with many applications, for example, image detection, data mining, natural language processing, and disease diagnosis (Maity and Das, 2017). This systematic review provides a study of different ML techniques for the diagnosis of different perinatal complications and frames a contribution to women’s health. A total of sixteen perinatal complications predicted by various ML models were detected, among which the most studied were prematurity and pre-eclampsia.

ML can significantly improve health care; however, it is necessary to consider the disadvantages of AI in health. Ethical dilemmas need to be addressed, as does the potential for human bias when creating computer algorithms (Ho et al., 2019). Health-care predictions can vary based on race, genetics, gender, and other characteristics, which could lead to the overestimation or underestimation of patient risk if these characteristics are not considered. When it comes to AI analysis in health care, it will be the physician’s responsibility to ensure that AI algorithms are developed and applied appropriately (Jordan and Mitchell, 2015).

In the present systematic review, the main data collection method was the use of electronic medical records (EMRs). ML techniques can establish patterns from a data set based on EMRs. Pattern recognition from these records supports prediction and decision-making for diagnosis and treatment planning (Johnson et al., 2016). EMR-based ML methods can be combined with other sources of large medical data, such as genomics and medical imaging, which, when used as complementary information, could improve clinical diagnosis and treatment systems through predictive algorithms (Barak-Corren et al., 2017). EMR data usually include demographic data, diagnoses, biochemical markers, vital signs, clinical notes, prescriptions, and procedures, which are generally easy to obtain and reduce transfer errors when handling large amounts of information. Several previous studies have described medical diagnosis prediction tools based on EMRs (McCoy et al., 2015; Osborn et al., 2015; Nguyen et al., 2017; Rajkomar et al., 2018); furthermore, in the present systematic review, 48% of the features used in the diagnosis prediction models for perinatal complications came from EMRs, of which the most used were sociodemographic maternal characteristics. Thus, this tool can predict perinatal complications common in a given population, contributing to the overall improvement of perinatal public health.

Perinatal Complications as Output Variables

Output variables were usually binary (with complication or without complication). However, some studies quantified the risk; for example, the risk of TOLAC was classified as high, medium, or low (Lipschuetz et al., 2020), and among studies of gestational diabetes, one article quantified it as high risk or low risk (Cömert et al., 2018). The most frequently predicted perinatal complications in ML models were prematurity and pre-eclampsia. According to the literature, the high rate of preterm birth is a public health problem, since these newborns suffer substantial morbidity and mortality in the neonatal period, which translates to high medical costs (McCormick et al., 2011). Pre-eclampsia is a pregnancy disorder characterized by the new onset of hypertension after 20 weeks of gestation and organ damage, with endothelial dysfunction as an underlying cause (Roberts, 1998; ACOG (American College of Obstetricians and Gynecologists), 2020; Carrasco-Wong et al., 2021). It is the leading cause of maternal and neonatal mortality and morbidity (Salsoso et al., 2017; Fondjo et al., 2019). Thus, prediction of the risk for developing pre-eclampsia can be performed in the first half of pregnancy.
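To make the two output encodings concrete, the following minimal Python sketch maps a predicted probability either to a binary label or to an ordinal risk category. The function names and the cut-off values (0.2, 0.5, and 0.6) are hypothetical and chosen only for illustration; they are not taken from any of the reviewed studies.

```python
from typing import List


def binarize(probability: float, threshold: float = 0.5) -> int:
    """Binary output: 1 = with complication, 0 = without complication.

    The 0.5 threshold is an illustrative assumption.
    """
    return int(probability >= threshold)


def stratify_risk(probability: float,
                  low_cutoff: float = 0.2,
                  high_cutoff: float = 0.6) -> str:
    """Ordinal output: 'low', 'medium', or 'high' risk.

    Both cut-offs are illustrative assumptions, not published values.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability >= high_cutoff:
        return "high"
    if probability >= low_cutoff:
        return "medium"
    return "low"


# Hypothetical model outputs for three pregnancies.
predicted: List[float] = [0.05, 0.35, 0.80]
binary_labels = [binarize(p) for p in predicted]
risk_labels = [stratify_risk(p) for p in predicted]
```

In practice, such cut-offs would need to be chosen and validated clinically for each complication and population.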

Performance of the Machine Learning Methods

Diagnostic accuracy is the ability of a test to discriminate between the target condition and health. This discriminative potential can be quantified by several performance measures, such as sensitivity and specificity, AUC, and accuracy (Šimundić, 2009). Although all studies assessed their results with at least one performance metric, only 38.7% assessed at least two, and reports of predictive accuracy were often incomplete. Given this observation, it is imperative to report the same performance metrics for the different prediction models so that their accuracy can be compared.
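The performance measures named above can be computed from the same set of predictions, as the following minimal Python sketch shows. The labels and scores are synthetic, the 0.5 decision threshold is an assumption, and the helper names are hypothetical; AUC is computed here with the standard rank-based definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half).

```python
from typing import List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int],
                     y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: P(score of a positive > score of a negative)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic example: 3 cases with the complication, 7 without.
y_true: List[int] = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores: List[float] = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [int(s >= 0.5) for s in scores]  # assumed decision threshold

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)          # true-positive rate
specificity = tn / (tn + fp)          # true-negative rate
accuracy = (tp + tn) / len(y_true)    # overall proportion correct
auc_value = auc(y_true, scores)       # threshold-independent
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas AUC does not, which is one reason a single reported metric gives an incomplete picture.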

In this systematic review, several ML methods were used. One of the better performances was obtained by the Tree-based Pipeline Optimization Tool (TPOT) to predict placental invasion (Sun et al., 2019), which was previously used in the investigation of novel characteristics in data science, providing optimization of the studied parameters (Le et al., 2020). Another excellent performance was observed for the convolutional neural network (CNN) used to predict fetal acidemia (Zhao et al., 2019). The CNN has gained much attention from attempts to harness its power to automatically learn intrinsic patterns from data, which can avoid time-consuming manual feature engineering and capture hidden intrinsic patterns more effectively (Oquab et al., 2014). Moreover, in the health-care field, CNN has been shown to capture more hidden data patterns and learn high-level abstraction in problem-solving (Zhang et al., 2017).

It is essential to mention that it is difficult to reach a consensus on the best method for predicting perinatal complications, since the studies did not share the same input variables, type of records, or number of samples. However, the best performance metrics observed were those of the model predicting prematurity from medical images using the SVM technique, with an accuracy of 95.7%, and of the model predicting neonatal mortality using the XGBoost technique, with an accuracy of 99.7%. SVM has shown simplicity and flexibility in addressing several classification problems and also offers balanced predictive performance even in studies where sample sizes may be limited (Alkhaleefah and Wu, 2018). The XGBoost technique is a very effective and widely used ML method that data scientists use to achieve state-of-the-art results in many ML challenges (Wang et al., 2020).

Interpretability of Machine Learning

Despite the recognition of the value of ML in medical care, impediments persist to its greater acceptance within medical teams (Holzinger et al., 2019). A fundamental impediment relates to the black-box nature, or “opacity,” of many ML algorithms. The term refers to a system in which only the inputs and outputs are observable, while the process that transforms the inputs into the outputs cannot be fully understood (Molnar, 2019). Therefore, new techniques have been developed to facilitate the understanding of the internal functioning of the model, granting interpretability, which seeks to provide transparency to the black box (Freitas, 2014; Doshi-Velez et al., 2017; Lipton, 2018), so that the end-user can understand the model and may even improve the ML system (Freitas, 2014). The improvement in the precision of the prediction will depend on the interpretability of the model to be used. This means that with ML interpretability, clinical staff could know which variables are involved in the prediction of a diagnosis.
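One model-agnostic way to show which variables are involved in a prediction is permutation importance: a feature column is shuffled and the resulting drop in accuracy is recorded. The sketch below is only an illustration of that general idea and is not the method of any reviewed study. The classifier `toy_model`, its thresholds, the feature names, and the data are all hypothetical stand-ins for a trained black-box model and a real data set.

```python
import random
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def toy_model(row: Row) -> int:
    """Hypothetical stand-in for a trained classifier; it ignores 'noise'."""
    return int(row["systolic_bp"] >= 140 or row["platelets"] < 150)


def accuracy(model: Callable[[Row], int],
             rows: Sequence[Row], labels: Sequence[int]) -> float:
    """Proportion of rows for which the model output matches the label."""
    hits = sum(1 for r, y in zip(rows, labels) if model(r) == y)
    return hits / len(rows)


def permutation_importance(model: Callable[[Row], int],
                           rows: Sequence[Row], labels: Sequence[int],
                           feature: str, repeats: int = 20,
                           seed: int = 0) -> float:
    """Mean drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops: List[float] = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        shuffled = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return sum(drops) / repeats


# Synthetic records; labels are generated by the toy model itself,
# so the baseline accuracy is 1.0 by construction.
rows: List[Row] = [
    {"systolic_bp": 118, "platelets": 250, "noise": 0.3},
    {"systolic_bp": 152, "platelets": 210, "noise": 0.9},
    {"systolic_bp": 125, "platelets": 120, "noise": 0.1},
    {"systolic_bp": 160, "platelets": 140, "noise": 0.7},
    {"systolic_bp": 110, "platelets": 300, "noise": 0.5},
    {"systolic_bp": 145, "platelets": 260, "noise": 0.2},
    {"systolic_bp": 130, "platelets": 180, "noise": 0.8},
    {"systolic_bp": 121, "platelets": 230, "noise": 0.4},
]
labels = [toy_model(r) for r in rows]
baseline_accuracy = accuracy(toy_model, rows, labels)
importances = {name: permutation_importance(toy_model, rows, labels, name)
               for name in ("systolic_bp", "platelets", "noise")}
```

A feature the model never uses, such as `noise` here, receives an importance of exactly zero, while features the model relies on can only lower the accuracy when shuffled. A ranking of this kind is what would let clinical staff check the model's drivers against clinical knowledge.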

Regarding the predictive variables, while most of them agreed with current knowledge, ML models also contributed new variables of relevance, which would be interesting to examine in controlled clinical studies (Table 9). For example, pre-eclampsia was found to be predictable based on systolic blood pressure, platelet count, and urinary protein levels as influential variables, with lesser influence from glucose levels, leukocyte count, and serum calcium and potassium levels (Jhee et al., 2019). Other innovative variables of interest found using ML in the prediction of perinatal complications were newborn sex for the prediction of fetal acidemia (Zhao et al., 2019), and father’s nationality and mother’s age for distinguishing provider-initiated from spontaneous preterm delivery (Khatibi et al., 2019). Nevertheless, some prediction models do not specify how their variables were measured, making them impossible to apply in a clinical setting. For example, “weight gain” is mentioned as a predictor for SGA and LGA, but the article does not specify whether it was inadequate or excessive (Kuhle et al., 2018). It is also stated that the underlying disease of the mother influences provider-initiated delivery; however, it is not detailed which underlying disease is considered in this association (Khatibi et al., 2019). Also, some studies describe obvious associations, such as low birth weight being associated with SGA, or fetal macrosomia being associated with LGA (Kuhle et al., 2018). pH was also a predictor of fetal acidemia, which is logical since this condition is associated with pH changes (Zhao et al., 2019). Since the engineering teams behind these investigations emphasize these characteristics in their results without taking this obviousness into account, it is imperative to include clinical experts in women’s health in AI and data science teams.

TABLE 9

| Prediction | Predictive variables | Machine learning model | AUC | Acc |
|---|---|---|---|---|
| Premature birth | Gestational diabetes; cardiovascular disease; underlying diseases; maternal age; chronic arterial hypertension | Set of decision trees, SVM and RF | 0.680 | 81% |
| SGA | Smoking; a particular value of gestational weight gain; low–birth weight newborn | RF | 0.728 | 79.9% |
| | | DT | 0.718 | 79.4% |
| | | Elastic net | 0.748 | 80.9% |
| | | Gradient boosting machines | 0.748 | 80.5% |
| | | Logistic regression | 0.745 | 81.3% |
| | | Neural network | 0.746 | 81.2% |
| LGA | Pre-pregnancy BMI; gestational weight gain; macrosomic newborn in a previous delivery | RF | 0.745 | 90.3% |
| | | DT | 0.713 | 80.1% |
| | | Elastic net | 0.771 | 91.2% |
| | | Gradient boosting machines | 0.766 | 91.1% |
| | | Logistic regression | 0.771 | 91.2% |
| | | Neural network | 0.772 | 91.4% |
| Fetal macrosomia | Age greater than 30 years; multiparity; a 12 kg total weight gain in pregnancy; abdominal circumference > 95 cm (at last perinatal checkup); gestational age > 39 weeks | Logistic regression | 0.888 | ni |
| | | RF | 0.990 | ni |
| Pre-eclampsia | At second trimester: systolic blood pressure; serum levels of urea nitrogen; creatinine in the blood; platelet count; serum potassium level; leukocyte count; blood glucose level; serum calcium and urinary protein levels | Logistic regression | ni | 86.2% |
| | | DT | ni | 87.4% |
| | | Naive Bayes | ni | 89.9% |
| | | SVM | ni | 89.2% |
| | | RF | ni | 92.3% |
| | | Stochastic gradient boosting | ni | 97.3% |
| Adverse delivery (preterm, low birth weight, neonatal/infant death, stay in the neonatal intensive care unit) vs. non-adverse delivery | High pre-pregnancy BMI; previous preterm births | Logistic regression | ni (a) | ni (a) |
| | | Linear discriminant analysis | ni (a) | ni (a) |
| | | Random forest | ni (a) | ni (a) |
| | | Naive Bayes | ni (a) | ni (a) |
| TOLAC failure risk | Parity; age; vaginal birth with cesarean section in the past; gestational week; minimum gestation week in previous deliveries; the weight of the newborn from the previous delivery; dilation and head position | Gradient boosting machines | 0.793 | ni |
| | | RF | 0.756 | ni |
| | | RF | 0.782 | ni |
| | | AdaBoost set | 0.784 | ni |
| Gestational age (if the newborn will be preterm) | Hypertension during labor; HIV serological status | Binary logistic regression model, random forest classification, and generalized additive model | 0.868 | 98.9% |
| Delivery prediction within 48 h of transfer vs. before 32 weeks gestation | Presence of premature rupture of membranes; vaginal bleeding; ultrasound cervical length; gestation week; fetal fibronectin and serum C-reactive protein | Multivariate logistic regression | 0.850 | ni |
| Spontaneous preterm birth | Maternal age; black woman; Hispanic woman; Asian; mother born in the United States; paid delivery by herself or physician; diabetes mellitus; chronic arterial hypertension; thyroid dysfunction; asthma; previous stillbirth; fetal weight loss; in vitro fertilization; nulliparity; pregnant smoker during the first trimester; BMI | Multivariate logistic regression | 0.670 | ni |
| Stillbirth | Current pregnancy complications; congenital anomalies; maternal characteristics; medical history | Logistic regression | 0.834 | 94.7% |
| | | Decision tree | 0.808 | 99.7% |
| | | Random forest | 0.836 | 99.7% |
| | | XGBoost | 0.842 | 99.7% |
| | | Artificial neural networks multilayer perceptron | 0.840 | 99.7% |
| Prediction of complications in pregnancy: pre-eclampsia, GDM, restriction of fetal growth, macrosomia | Maternal age; BMI; newborn weight; results of adverse events in previous pregnancies | Logistic regression | 0.770 | 78.6% |
| Severe maternal morbidity | Ventilator dependence; intubation; critical care; acute respiratory failure; ventilation; trauma and postoperative pulmonary failure; fluid and electrolyte disorder; systemic inflammatory response syndrome; acidosis and septicemia | Logistic regression | 0.937 | ni |
| Fetal acidemia | Maternal age; gestational age; pH; extracellular fluid deficit; pCO2; base excess; APGAR at 1 min and 5 min; parity; gestational diabetes; birth weight; child sex; type of delivery | Deep convolutional neural network | 0.978 | 98.4% |

Main predictive variables for predicting perinatal complications

AUC, area under the curve; Acc., accuracy; SVM, support vector machines; RF, random forest; SGA, small for gestational age; DT, decision tree; LGA, large for gestational age; BMI, body mass index; TOLAC, trial of labor after cesarean; HIV, human immunodeficiency virus; GDM, gestational diabetes mellitus; ni, not informed.

(a) This study does not specify either AUC or accuracy. The only performance metric used is sensitivity; logistic regression: 31.9%, linear discriminant analysis: 31.7%, random forest: 30.1%, naive Bayes: 29.2%.

Only 6.4% of the studies were case–control studies, while the vast majority were cohort studies. This may limit the use of these results in clinical practice (Salazar et al., 2019). Only one study, for predicting neonatal morbidity, was multicenter (Khatibi et al., 2019), representing higher quality evidence. Among the best performing studies, it is noteworthy that most had fewer than 1,000 patients, and only one, based on XGBoost to predict neonatal mortality, had over 10,000 patients. This may be risky since the sample may not be representative of a given geographic group, which is one of the limitations of ML in health (Vayena et al., 2018). Another significant limitation of the present systematic review is that the included studies have different baselines, input variables, and complications (endpoints) assessed in their predictions, making it difficult to compare them.

It is essential to mention that none of the studies reviewed has been applied in a clinical phase; however, the majority mention that, to optimize the results obtained, the models should be used in hospitals or health services that care for pregnant women. Future prospective studies and additional population studies are needed to assess the clinical utility of the models in the real world (Liu et al., 2019; Malacova et al., 2020).

Few systematic reviews have addressed the use of AI in pregnancy. The first one describes how AI has been applied to evaluate maternal health during the entire pregnancy process and has helped to understand the effects of pharmacological treatments during this stage (Davidson and Boland, 2020). The second systematic review concluded that using ML algorithms is better than using multivariable logistic regression for prognostic prediction studies in pregnancy care, focusing mainly on decision-making for the medical team (Sufriyana et al., 2020). Furthermore, the third one, performed exclusively on neonatal mortality, reported that ML models can accurately predict neonatal death (Mangold et al., 2021). Last, the use of modern bioinformatics methods analyzing ML models as non-invasive measures of heart rate variability to monitor newborns and infants was reported (Chiera et al., 2020). Although this body of evidence does not focus on predicting pregnancy complications, it encourages the clinical use of AI to support women’s health during pregnancy.

Conclusion

In conclusion, the main advantage of interpretable ML applications is that the output is not subjective, because it is based on real-world data and results, and that it identifies the most critical variables for clinicians. It is important to continue promoting this field of research in ML in order to obtain solutions with multicenter clinical applicability that reduce perinatal complications. AI has the overall potential to revolutionize women’s health care by providing more accurate diagnosis, easing the workload of physicians, lowering health-care costs, and providing benchmark analysis for tests with substantial interpretation differences between specialists. This systematic review contributes significantly to the specialized literature on AI and women’s health.

Statements

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author contributions

AB provided the principal idea, searched for information, and wrote the manuscript. RS provided the full support of the machine learning approach (search and discussion). SC provided the support on PRISMA technique and for machine learning applied on health. LS provided the support on the discussion on clinical approach on pregnancy complications. FP was the organizer of the manuscript and provided support on the discussion on machine learning, clinical approach, and pregnancy complications.

Funding

Supported by project PUENTE, UVA20993, Universidad de Valparaiso, Chile, the Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT) (grant number 1190316), Chile, and International Sabbaticals (LS) (University Medical Centre Groningen, University of Groningen, The Netherlands) from the Vicerectorate of Academic Affairs, Academic Development Office of the Pontificia Universidad Católica de Chile. The work of RS and SC was partially funded by ANID, Chile–Millennium Science Initiative Program—ICN2021_004. LS is part of The Diamater Study Group, Sao Paulo Research Foundation-FAPESP, São Paulo (grant number FAPESP 2016/01743–5), Brazil. AB holds a fellowship from “Beca de Doctorado FIB—UV 2021” from Universidad de Valparaíso.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fbioe.2021.780389/full#supplementary-material

Supplementary Figure S1

CASP prediction rule score of each article for bias review. The score for every study included in the systematic review. The maximum score is 22.

Supplementary Table S1

Checklist for compliance with the review based on the PRISMA.

Supplementary Table S2

List of selected items.

References

  • 1

    ACOG (American College of Obstetricians and Gynecologists) (2020). Gestational Hypertension and Preeclampsia: ACOG Practice Bulletin, Number 222. Obstet. Gynecol. 135, e237–e260. 10.1097/AOG.0000000000003891

  • 2

    Ahmed Z., Mohamed K., Zeeshan S., Dong X. (2020). Artificial Intelligence with Multi-Functional Machine Learning Platform Development for Better Healthcare and Precision Medicine. Database (Oxford) 2020, baaa010. 10.1093/database/baaa010

  • 3

    Alkhaleefah M., Wu C.-C. (2018). “A Hybrid CNN and RBF-Based SVM Approach for Breast Cancer Classification in Mammograms,” in 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Miyazaki, Japan, 7-10 Oct. 2018, 894–899. 10.1109/SMC.2018.00159

  • 4

    Artzi N. S., Shilo S., Hadar E., Rossman H., Barbash-Hazan S., Ben-Haroush A., et al. (2020). Prediction of Gestational Diabetes Based on Nationwide Electronic Health Records. Nat. Med. 26, 71–76. 10.1038/s41591-019-0724-8

  • 5

    Barak-Corren Y., Castro V. M., Javitt S., Hoffnagle A. G., Dai Y., Perlis R. H., et al. (2017). Predicting Suicidal Behavior from Longitudinal Electronic Health Records. Ajp 174, 154–162. 10.1176/appi.ajp.2016.16010077

  • 6

    Boland M. R., Polubriaginof F., Tatonetti N. P. (2017). Development of A Machine Learning Algorithm to Classify Drugs of Unknown Fetal Effect. Sci. Rep. 7, 12839. 10.1038/s41598-017-12943-x

  • 7

    Borowska M., Brzozowska E., Kuć P., Oczeretko E., Mosdorf R., Laudański P. (2018). Identification of Preterm Birth Based on RQA Analysis of Electrohysterograms. Comp. Methods Programs Biomed. 153, 227–236. 10.1016/j.cmpb.2017.10.018

  • 8

    Bottaci L., Drew P. J., Hartley J. E., Hadfield M. B., Farouk R., Lee P. W., et al. (1997). Artificial Neural Networks Applied to Outcome Prediction for Colorectal Cancer Patients in Separate Institutions. The Lancet 350, 469–472. 10.1016/S0140-6736(96)11196-X

  • 9

    Brownlee J. (2017). What Is the Difference between Test and Validation Datasets. Available at: https://www.machinelearningmastery.com/difference-test-validation-datasets/ (Accessed November 20, 2021).

  • 10

    Carrasco-Wong I., Aguilera-Olguín M., Escalona-Rivano R., Chiarello D. I., Barragán-Zúñiga L. J., Sosa-Macías M., et al. (2021). Syncytiotrophoblast Stress in Early Onset Preeclampsia: The Issues Perpetuating the Syndrome. Placenta 113, 57–66. 10.1016/j.placenta.2021.05.002

  • 11

    Carvalho D. V., Pereira E. M., Cardoso J. S. (2019). Machine Learning Interpretability: A Survey on Methods and Metrics. Electronics 8, 832. 10.3390/electronics8080832

  • 12

    CASP Clinical Prediction Rule Checklist (2017). Critical Appraisal Skills Programme. Oxford, UK: Oxford Centre for Triple Value Healthcare. Available at: https://casp-uk.net/wp-content/uploads/2018/01/CASP-Clinical-Prediction-Rule-Checklist_2018.pdf (Accessed November 12, 2021).

  • 13

    Cecula P. (2021). Artificial Intelligence: The Current State of Affairs for AI in Pregnancy and Labour. J. Gynecol. Obstet. Hum. Reprod. 50, 102048. 10.1016/j.jogoh.2020.102048

  • 14

    Chen L., Hao Y., Hu X. (2019). Detection of Preterm Birth in Electrohysterogram Signals Based on Wavelet Transform and Stacked Sparse Autoencoder. PLoS One 14, e0214712. 10.1371/journal.pone.0214712

  • 15

    Chiera M., Cerritelli F., Casini A., Barsotti N., Boschiero D., Cavigioli F., et al. (2020). Heart Rate Variability in the Perinatal Period: A Critical and Conceptual Review. Front. Neurosci. 14, 561186. 10.3389/fnins.2020.561186

  • 16

    Cömert Z., Kocamaz A. F., Subha V. (2018). Prognostic Model Based on Image-Based Time-Frequency Features and Genetic Algorithm for Fetal Hypoxia Assessment. Comput. Biol. Med. 99, 85–97. 10.1016/j.compbiomed.2018.06.003

  • 17

    Darcy A. M., Louie A. K., Roberts L. W. (2016). Machine Learning and the Profession of Medicine. JAMA 315, 551–552. 10.1001/jama.2015.18421

  • 18

    Davidson L., Boland M. R. (2020). Enabling Pregnant Women and Their Physicians to Make Informed Medication Decisions Using Artificial Intelligence. J. Pharmacokinet. Pharmacodyn. 47, 305–318. 10.1007/s10928-020-09685-1

  • 19

    Denison F. C., Roberts K. A., Barr S. M., Norman J. E. (2010). Obesity, Pregnancy, Inflammation, and Vascular Function. Reproduction 140, 373–385. 10.1530/REP-10-0074

  • 20

    Doshi-Velez F., Kortz M., Budish R., Bavitz C., Gershman S. J., O'Brien D., et al. (2017). Accountability of AI under the Law: The Role of Explanation. SSRN J. arXiv. 10.2139/ssrn.3064761

  • 21

    Edwards P., Wright G. (2020). Obesity in Pregnancy. Obstet. Gynaecol. Reprod. Med. 30, 315–320. 10.1016/j.ogrm.2020.07.003

  • 22

    Fergus P., Selvaraj M., Chalmers C. (2018). Machine Learning Ensemble Modelling to Classify Caesarean Section and Vaginal Delivery Types Using Cardiotocography Traces. Comput. Biol. Med. 93, 7–16. 10.1016/j.compbiomed.2017.12.002

  • 23

    Fondjo L. A., Boamah V. E., Fierti A., Gyesi D., Owiredu E.-W. (2019). Knowledge of Preeclampsia and its Associated Factors Among Pregnant Women: A Possible Link to Reduce Related Adverse Outcomes. BMC Pregnancy Childbirth 19, 456. 10.1186/s12884-019-2623-x

  • 24

    Freitas A. A. (2014). Comprehensible Classification Models. SIGKDD Explor. Newsl. 15, 1–10. 10.1145/2594473.2594475

  • 25

    Frizzell J. D., Liang L., Schulte P. J., Yancy C. W., Heidenreich P. A., Hernandez A. F., et al. (2017). Prediction of 30-Day All-Cause Readmissions in Patients Hospitalized for Heart Failure. JAMA Cardiol. 2, 204–209. 10.1001/jamacardio.2016.3956

  • 26

    Gao C., Osmundson S., Yan X., Edwards D. V., Malin B. A., Chen Y. (2019). Learning to Identify Severe Maternal Morbidity from Electronic Health Records. Stud. Health Technol. Inform. 264, 143–147. 10.3233/SHTI190200

  • 27

    Guo Z., Yang F., Zhang J., Zhang Z., Li K., Tian Q., et al. (2020). Whole-Genome Promoter Profiling of Plasma DNA Exhibits Diagnostic Value for Placenta-Origin Pregnancy Complications. Adv. Sci. 7, 1901819. 10.1002/advs.201901819

  • 28

    Hamilton E. F., Dyachenko A., Ciampi A., Maurel K., Warrick P. A., Garite T. J. (2020). Estimating Risk of Severe Neonatal Morbidity in Preterm Births under 32 Weeks of Gestation. J. Maternal-Fetal Neonatal Med. 33, 73–80. 10.1080/14767058.2018.1487395

  • 29

    Hinton G. (2018). Deep Learning-A Technology with the Potential to Transform Health Care. JAMA 320, 1101–1102. 10.1001/jama.2018.11100

  • 30

    Ho C. W. L., Soon D., Caals K., Kapur J. (2019). Governance of Automated Image Analysis and Artificial Intelligence Analytics in Healthcare. Clin. Radiol. 74, 329–337. 10.1016/j.crad.2019.02.005

  • 31

    Holzinger A., Langs G., Denk H., Zatloukal K., Müller H. (2019). Causability and Explainability of Artificial Intelligence in Medicine. Wires Data Mining Knowl Discov. 9, e1312. 10.1002/widm.1312

  • 32

    Iftikhar P. M., Kuijpers M. V., Khayyat A., Iftikhar A., DeGouvia De Sa M. (2020). Artificial Intelligence: A New Paradigm in Obstetrics and Gynecology Research and Clinical Practice. Cureus 12, e7124. 10.7759/cureus.7124

  • 33

    Jhee J. H., Lee S., Park Y., Lee S. E., Kim Y. A., Kang S.-W., et al. (2019). Prediction Model Development of Late-Onset Preeclampsia Using Machine Learning-Based Methods. PLoS ONE 14, e0221202. 10.1371/journal.pone.0221202

  • 34

    Johnson A. E. W., Ghassemi M. M., Nemati S., Niehaus K. E., Clifton D., Clifford G. D. (2016). Machine Learning and Decision Support in Critical Care. Proc. IEEE 104, 444–466. 10.1109/JPROC.2015.2501978

  • 35

    Jordan M. I., Mitchell T. M. (2015). Machine Learning: Trends, Perspectives, and Prospects. Science 349, 255–260. 10.1126/science.aaa8415

  • 36

    Khatibi T., Kheyrikoochaksarayee N., Sepehri M. M. (2019). Analysis of Big Data for Prediction of Provider-Initiated Preterm Birth and Spontaneous Premature Deliveries and Ranking the Predictive Features. Arch. Gynecol. Obstet. 300, 1565–1582. 10.1007/s00404-019-05325-3

  • 37

    Kim B., Khanna R., Koyejo O. (2016). “Examples Are Not Enough, Learn to Criticize! Criticism for Interpretability,” in Proceedings of the 30th International Conference on Neural Information Processing Systems, 2288–2296.

  • 38

    Kuhle S., Maguire B., Zhang H., Hamilton D., Allen A. C., Joseph K. S., et al. (2018). Comparison of Logistic Regression with Machine Learning Methods for the Prediction of Fetal Growth Abnormalities: a Retrospective Cohort Study. BMC Pregnancy Childbirth 18, 333. 10.1186/s12884-018-1971-2

  • 39

    Laopaiboon M., Lumbiganon P., Intarut N., Mori R., Ganchimeg T., Vogel J., et al. (2014). Advanced Maternal Age and Pregnancy Outcomes: a Multicountry Assessment. Bjog: Int. J. Obstet. Gy 121, 49–56. 10.1111/1471-0528.12659

  • 40

    Le T. T., Fu W., Moore J. H. (2020). Scaling Tree-Based Automated Machine Learning to Biomedical Big Data with a Feature Set Selector. Bioinformatics 36, 250–256. 10.1093/bioinformatics/btz470

  • 41

    Lipschuetz M., Guedalia J., Rottenstreich A., Novoselsky Persky M., Cohen S. M., Kabiri D., et al. (2020). Prediction of Vaginal Birth after Cesarean Deliveries Using Machine Learning. Am. J. Obstet. Gynecol. 222, 613.e1–613.e12. 10.1016/j.ajog.2019.12.267

  • 42

    Lipton Z. C. (2018). The Mythos of Model Interpretability. Commun. ACM 61, 36–43. 10.1145/3233231

  • 43

    Liu K., Fu Q., Liu Y., Wang C. (2019). An Integrative Bioinformatics Analysis of Microarray Data for Identifying Hub Genes as Diagnostic Biomarkers of Preeclampsia. Biosci. Rep. 39, BSR20190187. 10.1042/BSR20190187

  • 44

    Mailath-Pokorny M., Polterauer S., Kohl M., Kueronyai V., Worda K., Heinze G., et al. (2015). Individualized Assessment of Preterm Birth Risk Using Two Modified Prediction Models. Eur. J. Obstet. Gynecol. Reprod. Biol. 186, 42–48. 10.1016/j.ejogrb.2014.12.010

  • 45

    Maity N. G., Das S. (2017). “Machine Learning for Improved Diagnosis and Prognosis in Healthcare,” in 2017 IEEE Aerospace Conference, Big Sky, MT, USA, 4-11 March 2017, 1–9. 10.1109/AERO.2017.7943950

  • 46

    Makary M. A., Daniel M. (2016). Medical Error-The Third Leading Cause of Death in the US. BMJ 353, i2139. 10.1136/bmj.i2139

  • 47

    Malacova E., Tippaya S., Bailey H. D., Chai K., Farrant B. M., Gebremedhin A. T., et al. (2020). Stillbirth Risk Prediction Using Machine Learning for a Large Cohort of Births from Western Australia, 1980-2015. Sci. Rep. 10, 5354. 10.1038/s41598-020-62210-9

  • 48

    Mangold C., Zoretic S., Thallapureddy K., Moreira A., Chorath K., Moreira A. (2021). Machine Learning Models for Predicting Neonatal Mortality: A Systematic Review. Neonatology 118, 394–405. 10.1159/000516891

  • 49

    Mariona F. G. (2016). Perspectives in Obesity and Pregnancy. Womens Health (Lond. Engl.) 12, 523–532. 10.1177/1745505716686101

  • 50

    McCormick M. C., Litt J. S., Smith V. C., Zupancic J. A. F. (2011). Prematurity: An Overview and Public Health Implications. Annu. Rev. Public Health 32, 367–379. 10.1146/annurev-publhealth-090810-182459

  • 51

    McCoy T. H., Castro V. M., Rosenfield H. R., Cagan A., Kohane I. S., Perlis R. H. (2015). A Clinical Perspective on the Relevance of Research Domain Criteria in Electronic Health Records. Ajp 172, 316–320. 10.1176/appi.ajp.2014.14091177

  • 52

    Molnar C. (2019). Interpretable Machine Learning. A Guide for Making Black Box Models Explainable. Konstanz, Germany: Leanpub.

  • 53

    Moreira M. W. L., Rodrigues J. J. P. C., Oliveira A. M. B., Saleem K., Neto A. (2016b). “Performance Evaluation of Predictive Classifiers for Pregnancy Care,” in 2016 IEEE Global Communications Conference (GLOBECOM), Washington, DC, USA, 4-8 Dec. 2016. 10.1109/GLOCOM.2016.7842136

  • 54

    Moreira M. W. L., Rodrigues J. J. P. C., Oliveira A. M. B., Saleem K. (2016a). “Smart Mobile System for Pregnancy Care Using Body Sensors,” in 2016 International Conference on Selected Topics in Mobile & Wireless Networking (MoWNeT), Cairo, Egypt, 11-13 April 2016. 10.1109/MoWNet.2016.7496609

  • 55

    Munchel S., Rohrback S., Randise-Hinchliff C., Kinnings S., Deshmukh S., Alla N., et al. (2020). Circulating Transcripts in Maternal Blood Reflect a Molecular Signature of Early-Onset Preeclampsia. Sci. Transl. Med. 12, eaaz0131. 10.1126/scitranslmed.aaz0131

  • 56

    Nair T. M. (2018). Statistical and Artificial Neural Network-Based Analysis to Understand Complexity and Heterogeneity in Preeclampsia. Comput. Biol. Chem. 75, 222–230. 10.1016/j.compbiolchem.2018.05.011

  • 57

    Nguyen P., Tran T., Wickramasinghe N., Venkatesh S. (2017). Deepr: A Convolutional Net for Medical Records. IEEE J. Biomed. Health Inform. 21, 22–30. 10.1109/JBHI.2016.2633963

  • 58

    Obermeyer Z., Emanuel E. J. (2016). Predicting the Future - Big Data, Machine Learning, and Clinical Medicine. N. Engl. J. Med. 375, 1216–1219. 10.1056/NEJMp1606181

  • 59

    Oquab M., Bottou L., Laptev I., Sivic J. (2014). “Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 10.1109/CVPR.2014.222

  • 60

    Osborn D. P. J., Hardoon S., Omar R. Z., Holt R. I. G., King M., Larsen J., et al. (2015). Cardiovascular Risk Prediction Models for People with Severe Mental Illness. JAMA Psychiatry 72, 143–151. 10.1001/jamapsychiatry.2014.2133

  • 61

    Pan I., Nolan L. B., Brown R. R., Khan R., Van Der Boor P., Harris D. G., et al. (2017). Machine Learning for Social Services: A Study of Prenatal Case Management in Illinois. Am. J. Public Health 107, 938–944. 10.2105/AJPH.2017.303711

  • 62

    Paydar K., Niakan Kalhori S. R., Akbarian M., Sheikhtaheri A. (2017). A Clinical Decision Support System for Prediction of Pregnancy Outcome in Pregnant Women with Systemic Lupus Erythematosus. Int. J. Med. Inform. 97, 239–246. 10.1016/j.ijmedinf.2016.10.018

  • 63

    Rajkomar A., Oren E., Chen K., Dai A. M., Hajaj N., Hardt M., et al. (2018). Scalable and Accurate Deep Learning with Electronic Health Records. Npj Digital Med. 1, 18. 10.1038/s41746-018-0029-1

  • 64

    Refaeilzadeh P., Tang L., Liu H. (2009). “Cross-Validation,” in Encyclopedia of Database Systems. Arizona: Springer Science Business Media, LLC, 532–538. 10.1007/978-0-387-39940-9_565

  • 65

    Ripley B. D. (1996). Pattern Recognition and Neural Networks. Cambridge, UK: Cambridge University Press.

  • 66

    Rittenhouse K. J., Vwalika B., Keil A., Winston J., Stoner M., Price J. T., et al. (2019). Improving Preterm Newborn Identification in Low-Resource Settings with Machine Learning. PLoS One 14, e0198919. 10.1371/journal.pone.0198919

  • 67

    Roberts J. (1998). Endothelial Dysfunction in Preeclampsia. Semin. Reprod. Med. 16, 5–15. 10.1055/s-2007-1016248

  • 68

    Romeo V., Ricciardi C., Cuocolo R., Stanzione A., Verde F., Sarno L., et al. (2019). Machine Learning Analysis of MRI-Derived Texture Features to Predict Placenta Accreta Spectrum in Patients with Placenta Previa. Magn. Reson. Imaging 64, 71–76. 10.1016/j.mri.2019.05.017

  • 69

    Sadi-Ahmed N., Kacha B., Taleb H., Kedir-Talha M. (2017). Relevant Features Selection for Automatic Prediction of Preterm Deliveries from Pregnancy ElectroHysterograhic (EHG) Records. J. Med. Syst. 41, 204. 10.1007/s10916-017-0847-8

  • 70

    Salazar F. P., Manterola C., Quiroz S. G., García M. N., Otzen H. T., Mora V. M., et al. (2019). Estudios de Cohortes. 1a Parte. Descripción, Metodología y Aplicaciones. Rev. Cirugia 71, 482–493. 10.35687/s2452-45492019005431

  • 71

    Salsoso R., Farías M., Gutiérrez J., Pardo F., Chiarello D. I., Toledo F., et al. (2017). Adenosine and Preeclampsia. Mol. Aspects Med. 55, 126–139. 10.1016/j.mam.2016.12.003

  • 72

    Shigemi D., Yamaguchi S., Aso S., Yasunaga H. (2019). Predictive Model for Macrosomia Using Maternal Parameters without Sonography Information. J. Maternal-Fetal Neonatal Med. 32, 3859–3863. 10.1080/14767058.2018.1484090

  • 73

    Šimundić A.-M. (2009). Measures of Diagnostic Accuracy: Basic Definitions. EJIFCC 19, 203–211.

  • 74

    Sohil F., Sohali M. U., Shabbir J. (2021). An Introduction to Statistical Learning with Applications in R. Statistical Theory and Related Fields. New York, NY: Taylor and Francis Group. 10.1080/24754269.2021.1980261

  • 75

    Sufriyana H., Husnayain A., Chen Y.-L., Kuo C.-Y., Singh O., Yeh T.-Y., et al. (2020). Comparison of Multivariable Logistic Regression and Other Machine Learning Algorithms for Prognostic Prediction Studies in Pregnancy Care: Systematic Review and Meta-Analysis. JMIR Med. Inform. 8, e16503. 10.2196/16503

  • 76

    Sun H., Qu H., Chen L., Wang W., Liao Y., Zou L., et al. (2019). Identification of Suspicious Invasive Placentation Based on Clinical MRI Data Using Textural Features and Automated Machine Learning. Eur. Radiol. 29, 6152–6162. 10.1007/s00330-019-06372-9

  • 77

    Urrútia G., Bonfill X. (2010). Declaración PRISMA: una propuesta para mejorar la publicación de revisiones sistemáticas y metaanálisis. Medicina Clínica 135, 507–511. 10.1016/j.medcli.2010.01.015

  • 78

    Vayena E., Blasimme A., Cohen I. G. (2018). Machine Learning in Medicine: Addressing Ethical Challenges. Plos Med. 15, e1002689. 10.1371/journal.pmed.1002689

  • 79

    Veeramani S. K., Muthusamy E. (2016). Detection of Abnormalities in Ultrasound Lung Image Using Multi-Level RVM Classification. J. Maternal-Fetal Neonatal Med. 29, 1–9. 10.3109/14767058.2015.1064888

  • 80

    Wang C., Deng C., Wang S. (2020). Imbalance-XGBoost: Leveraging Weighted and Focal Losses for Binary Label-Imbalanced Classification with XGBoost. Pattern Recognition Lett. 136, 190–197. 10.1016/j.patrec.2020.05.035

  • 81

    Weber A., Darmstadt G. L., Gruber S., Foeller M. E., Carmichael S. L., Stevenson D. K., et al. (2018). Application of Machine-Learning to Predict Early Spontaneous Preterm Birth Among Nulliparous Non-Hispanic Black and White Women. Ann. Epidemiol. 28, 783–789. 10.1016/j.annepidem.2018.08.008

  • 82

    WHO (2019). Trends in Maternal Mortality: 2000 to 2017: Estimates by WHO, UNICEF, UNFPA, World Bank Group and the United Nations Population Division. Geneva: World Health Organization.

  • 83

    Yoffe L., Polsky A., Gilam A., Raff C., Mecacci F., Ognibene A., et al. (2019). Early Diagnosis of Gestational Diabetes Mellitus Using Circulating microRNAs. Eur. Jour. Endocrinol. 181, 565–577. 10.1530/EJE-19-0206

  • 84

    Zhang Q., Zhou D., Zeng X. (2017). HeartID: A Multiresolution Convolutional Neural Network for ECG-Based Biometric Human Identification in Smart Health Applications. IEEE Access 5, 11805–11816. 10.1109/ACCESS.2017.2707460

  • 85

    Zhao Z., Deng Y., Zhang Y., Zhang Y., Zhang X., Shao L. (2019). DeepFHR: Intelligent Prediction of Fetal Acidemia Using Fetal Heart Rate Signals Based on Convolutional Neural Network. BMC Med. Inform. Decis. Mak. 19, 286. 10.1186/s12911-019-1007-5

Summary

Keywords

perinatal complications, machine learning, pregnancy, artificial intelligence, predictive tool, prediction model

Citation

Bertini A, Salas R, Chabert S, Sobrevia L and Pardo F (2022) Using Machine Learning to Predict Complications in Pregnancy: A Systematic Review. Front. Bioeng. Biotechnol. 9:780389. doi: 10.3389/fbioe.2021.780389

Received

21 September 2021

Accepted

10 December 2021

Published

19 January 2022

Volume

9 - 2021

Edited by

Lana McClements, University of Technology Sydney, Australia

Reviewed by

Mugdha V. Joglekar, Western Sydney University, Australia

Anandwardhan Hardikar, Western Sydney University, Australia

Copyright

*Correspondence: Fabián Pardo,

This article was submitted to Preclinical Cell and Gene Therapy, a section of the journal Frontiers in Bioengineering and Biotechnology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
