Prediction model for gestational diabetes mellitus using the XG Boost machine learning algorithm

Hu, Xiaoqi; Hu, Xiaolin; Yu, Ya; Wang, Jia

doi:10.3389/fendo.2023.1105062

ORIGINAL RESEARCH article

Front. Endocrinol., 09 March 2023

Sec. Clinical Diabetes

Volume 14 - 2023 | https://doi.org/10.3389/fendo.2023.1105062

This article is part of the Research Topic: Current and Future Trends in Gestational Diabetes Diagnosis, Care and Neonatal Outcomes

Xiaoqi Hu¹†*

Xiaolin Hu²†

Ya Yu³

Jia Wang⁴

¹Department of Nursing, Yantian District People's Hospital, Shenzhen, Guangdong, China
²School of Basic Medical Sciences, Southern Medical University, Guangzhou, Guangdong, China
³Department of Nursing, Guangzhou First People's Hospital, Guangzhou, Guangdong, China
⁴Department of Nursing, Shenzhen Hospital of Southern Medical University, Shenzhen, Guangdong, China

Objective: To develop an extreme gradient boosting (XG Boost) machine learning (ML) model for predicting gestational diabetes mellitus (GDM) and to compare it with a model built using the traditional logistic regression (LR) method.

Methods: A case–control study was carried out among pregnant women, who were assigned to either the training set (these women were recruited from August 2019 to November 2019) or the testing set (these women were recruited in August 2020). We applied the XG Boost ML model approach to identify the best set of predictors out of a set of 33 variables. The performance of the prediction model was determined by using the area under the receiver operating characteristic (ROC) curve (AUC) to assess discrimination, and the Hosmer–Lemeshow (HL) test and calibration plots to assess calibration. Decision curve analysis (DCA) was introduced to evaluate the clinical use of each of the models.

Results: A total of 735 and 190 pregnant women were included in the training and testing sets, respectively. The XG Boost ML model, which included 20 predictors, resulted in an AUC of 0.946 and yielded a predictive accuracy of 0.875, whereas the model using a traditional LR included four predictors and presented an AUC of 0.752 and yielded a predictive accuracy of 0.786. The HL test and calibration plots show that the two models have good calibration. DCA indicated that treating only those women whom the XG Boost ML model predicts are at risk of GDM confers a net benefit compared with treating all women or treating none.

Conclusions: The established model using XG Boost ML showed better predictive ability than the traditional LR model in terms of discrimination. The calibration performance of both models was good.

Introduction

Gestational diabetes mellitus (GDM) is the most common metabolic complication to occur during pregnancy and is classed as a mild form of diabetes. It is normally diagnosed at 24–28 weeks’ gestation, and is characterized by hyperglycemia (1). The global prevalence of hyperglycemia during pregnancy is approximately 15.8%, and over 80% of cases are due to GDM (2). With the growth of the economy and the transition to a more sedentary lifestyle, the prevalence of GDM in Chinese women continues to increase, and ranges from 14.8% to 24.24% (3–5). Over time, China has loosened its fertility restrictions, most recently with the replacement of the two-child policy with the three-child policy. Thus, this increase in GDM prevalence can be attributed mainly to the rising rates of pregnant women who are of advanced maternal age.

Hyperglycemia brings about both short- and long-term outcomes, resulting in a significant impact on the health of both pregnant women and their offspring. Several studies in mothers have reported that GDM is associated with adverse pregnancy complications, including pre-eclampsia, the need for delivery by cesarean section, as well as type 2 diabetes and cardiovascular disease after delivery (6). GDM can also affect their offspring, being associated with a higher prevalence of macrosomia, shoulder dystocia, birth trauma, stillbirth, and, in later life, obesity and metabolic syndrome (7). According to the Developmental Origins of Health and Disease framework for GDM, exposure to intrauterine hyperglycemia before GDM screening at 24–28 weeks’ gestation is associated with the abnormal growth and development of the fetus (8), which includes smaller fetuses at 24 weeks’ gestation, increased abdominal circumference growth rates (9), and hyperinsulinemia (6). Lifestyle interventions during early pregnancy can reduce the risk of GDM by 18%–62% (10, 11), but are not effective if initiated at a later stage (12). Thus, a delayed diagnosis of GDM in the second or third trimester of pregnancy might leave only a narrow time frame for sufficient intervention. Therefore, it is imperative to establish a prediction model that identifies women at risk of GDM so that early intervention can be provided before the condition is diagnosed at 24–28 weeks’ gestation.

There is accumulating evidence indicating that models based on multiple risk factors can improve predictive abilities (9). Machine learning (ML) algorithms, as an artificial intelligence technology, can model high-dimensional predictors in relatively small datasets with reduced overfitting, and demonstrate a powerful self-learning ability to find complex relationships between predictors (9, 13). As major predictors of GDM, demographic characteristics and clinical features contribute to improving the predictive ability of models when combined with biomarkers (14, 15). Consequently, we aim to present the results of prediction models for GDM based on demographic characteristics, clinical features, and laboratory parameters to make full use of the available variables. In addition, we compare and evaluate the performance of ML and logistic regression (LR) models to show the advantages of each.

Materials and methods

Participants

This case–control study of pregnant women was conducted at the Shenzhen Hospital of the Southern Medical University, Shenzhen, China. Pregnant women were eligible to participate in the study if they met all of the following inclusion criteria: (1) they were aged ≥ 18 years; (2) they had undergone all routine antenatal assessments; (3) they had taken a 75-g oral glucose tolerance test (OGTT) at 24–28 weeks’ gestation; and (4) they were willing to participate in this study and to sign the informed consent form. The exclusion criteria were as follows: (1) pre-existing type 1 or type 2 diabetes; (2) a history of severe diseases, such as hypertension or heart disease; and (3) taking medications affecting insulin and blood glucose levels.

Data collection

Information on participants’ demographic characteristics was collected by using a structured questionnaire. Clinical features and laboratory parameters in the first trimester were collected from the hospital’s electronic medical record system (EMRS).

Diagnosis of GDM

GDM was diagnosed at 24–28 weeks’ gestation when any one of the 75-g OGTT values met or exceeded 5.1 mmol/L at 0 h, 10.0 mmol/L at 1 h, or 8.5 mmol/L at 2 h, in accordance with the recommendations set out by the International Association of Diabetes and Pregnancy Study Groups (IADPSG) Consensus Panel in 2010.
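The diagnostic rule above can be written as a short check. The following is a minimal sketch in Python; the function name `has_gdm` and the dictionary keys are hypothetical and chosen only for illustration, whereas the three thresholds are those quoted in the text.

```python
# 75-g OGTT thresholds in mmol/L, as quoted in the text (IADPSG 2010).
# The key names are illustrative and do not come from the study.
IADPSG_THRESHOLDS = {"fasting": 5.1, "one_hour": 10.0, "two_hour": 8.5}


def has_gdm(fasting, one_hour, two_hour):
    """Return True if any single OGTT value meets or exceeds its threshold."""
    values = {"fasting": fasting, "one_hour": one_hour, "two_hour": two_hour}
    return any(values[key] >= limit for key, limit in IADPSG_THRESHOLDS.items())
```

A single value at or above its threshold is sufficient, so `has_gdm(4.5, 10.2, 7.0)` is positive even though the fasting and 2-h values are below their thresholds.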

Statistical analysis

All analyses were performed using IBM® SPSS® Statistics version 26.0 software (IBM Corporation, Armonk, NY, USA). Continuous variables in the two groups were expressed as means and standard deviations, and normally distributed variables were compared using Student’s t-test. Categorical variables were described as frequencies (percentages), and evaluated by a chi-squared test. Test results with a p-value of less than 0.05 were considered statistically significant. Results from these tests, clinically relevant findings, and previous literature were used to preliminarily screen the set of variables for potentially meaningful predictors of GDM. Multiple imputation was used to deal with missing data, to avoid selection bias. The prediction model using LR was built in R (The R Foundation, Vienna, Austria) using the rms package, and the XG Boost ML model was built using the R packages XG Boost, XG Boost Explainer, and MLR.

Prediction models

In this study, variables with a p-value of < 0.05 in the univariate analysis, together with variables indicated in previous literature and clinically meaningful variables, were included in the stepwise LR analysis. ML can present novel or complex combinations of multidomain variables, and also has features that weigh variable importance and reduce overfitting (16). Therefore, we incorporated all variables of the univariate analysis into the model using XG Boost ML.

The model for GDM was trained on the training set, with the optimal hyperparameters selected using 10-fold cross-validation, and was then validated in the testing set.
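To illustrate how 10-fold cross-validation partitions a training set, the sketch below splits sample indices into 10 folds in plain Python. It is not the authors' R workflow; the function name `k_fold_indices` and the fixed random seed are assumptions made for illustration, and the model fitting and hyperparameter scoring that would run inside each fold are deliberately left out.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, valid
```

With the 735 women of the training set, each fold holds 73 or 74 women for validation, and every woman is used for validation exactly once. A candidate hyperparameter setting would typically be scored by averaging a metric such as the AUC over the 10 validation folds.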

Model evaluation

The discrimination of the models was assessed using the receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC). The calibration plots and the Hosmer–Lemeshow (HL) test were used to evaluate the calibration of each model. Decision curve analysis (DCA) was introduced to evaluate the clinical use of the models.
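As a reminder of what the AUC measures, it equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition; the function name `roc_auc` is hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(y_true, y_score):
    """AUC as the share of case-control pairs ranked correctly (ties = 0.5)."""
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.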

Results

Participant characteristics

In total, 925 pregnant women were included in this study (735 in the training set; 190 in the testing set). A total of 33 candidate variables were collected for each pregnant woman. Table 1 shows the univariate analysis of the demographic characteristics, clinical features, and laboratory parameters of participants with GDM (cases) and participants without GDM (controls) in the training set. Participants with GDM were significantly older and had higher pre-pregnancy body mass index (BMI) and mean arterial pressure (MAP) than participants without GDM. The average time since the last pregnancy was also longer in this group than in the control group. The percentage of women with previous GDM and the number with a family history of diabetes mellitus were also significantly higher in the GDM group, and participants in this group were also markedly younger at menarche than those in the non-GDM group (all p-values were < 0.05). Laboratory parameters, including platelet count, white blood cell count, and the levels of glucose in urine, ketone in urine, alanine aminotransferase, thyroid hormone T₃, fasting plasma glucose, and glycated hemoglobin (HbA₁c), were also higher in women with GDM than in control participants. The demographic characteristics, clinical features, and laboratory parameters of participants in the training and testing sets are compared in Table 2. Good consistency between the training and testing data sets is shown for the majority of the variables.

Table 1. Demographic characteristics, clinical features, and laboratory parameters of participants with GDM and non-GDM control participants in the training set.

Table 2. Demographic characteristics, clinical features, and laboratory parameters of the training and testing sets.

Predictors of models

Four predictors (previous GDM, age, HbA₁c level, and MAP) were used to construct the predictive model using LR (Table 3). Twenty predictors were finally included to build the model using XG Boost ML. Figure 1 shows the relative importance of the 20 variables included in the predictive model for GDM using XG Boost ML.

Table 3. Four predictors included in the model using stepwise LR in the training set.

Figure 1. The relative importance of the 20 variables included in the XG Boost ML model for GDM in the training set. BMI, body mass index; GDM, gestational diabetes mellitus; HbA₁c, glycated hemoglobin; XG Boost ML, extreme gradient boosting (XG) machine learning (ML).

Accuracy of prediction models

For the data from the training set, the AUC of the prediction model for GDM using stepwise LR was 0.752, whereas the AUC of the model using XG Boost ML was 0.946; these are shown in Figures 2 and 3, respectively. The accuracy of the two models for the data from the training set was 0.786 and 0.875, respectively. The specificity of the model using XG Boost ML was higher than that of the model using traditional LR for the data from both the training and testing sets. However, the sensitivity of the model using XG Boost ML was lower than that of the model using traditional LR, as shown in Table 4.
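Accuracy, sensitivity, and specificity all derive from the same confusion matrix, which is why one model can score higher on specificity and lower on sensitivity at the same time. The sketch below shows the standard definitions; the function name `classification_metrics` and the toy labels in the usage note are hypothetical and are not study data.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels (1 = GDM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

For example, with three cases and five controls, of which two cases and four controls are classified correctly, the accuracy is 6/8, the sensitivity 2/3, and the specificity 4/5.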

Figure 2. The AUC of the prediction model for GDM by stepwise LR. AUC, area under the receiver operating characteristic curve; GDM, gestational diabetes mellitus; LR, logistic regression.

Figure 3. The AUC of the prediction model for GDM by XG Boost ML. AUC, area under the receiver operating characteristic curve; GDM, gestational diabetes mellitus; XG Boost ML, extreme gradient boosting (XG) machine learning (ML).

Table 4. Accuracy of the four prediction models.

Calibration of different models

The calibration plots, shown in Figures 4–7, demonstrate the consistency between the predicted values and the real outcomes. The Hosmer–Lemeshow (HL) test p-values were 0.288 and 0.402 for the training and testing sets, respectively, in the model using LR, and 0.831 and 0.556 for the training and testing sets, respectively, in the model using XG Boost ML.
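For readers unfamiliar with the HL test, the sketch below shows one common way to compute it: participants are sorted by predicted risk, split into equal-sized groups, and the observed and expected numbers of events are compared group by group. The use of 10 groups and g − 2 degrees of freedom are conventional choices assumed here, because the text does not state the settings used; the function names are hypothetical, and the p-value helper relies on a closed form that holds only for an even number of degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability; closed form valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Return (HL statistic, p-value) using equal-sized risk groups."""
    pairs = sorted(zip(y_prob, y_true))  # order by predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        if n_g == 0 or expected <= 0 or expected >= n_g:
            raise ValueError("each group needs 0 < expected events < group size")
        stat += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
    return stat, chi2_sf_even_df(stat, groups - 2)
```

A large p-value, such as those reported above, means that the test found no significant difference between observed and predicted event counts; it does not by itself prove that calibration is perfect.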

Figure 4. The calibration plots of the training set by LR. LR, logistic regression.

Figure 5. The calibration plots of the testing set by LR. LR, logistic regression.

Figure 6. The calibration plots of the training set by XG Boost ML. XG Boost ML, extreme gradient boosting (XG) machine learning (ML).

Figure 7. The calibration plots of the testing set by XG Boost ML. XG Boost ML, extreme gradient boosting (XG) machine learning (ML).

Clinical use

The DCA results for the two models are presented in Figures 8 and 9. Compared with treating all women or none of the women, the prediction model using LR provides a net benefit at threshold probabilities of 6%–63% and 87%–90%. The DCA plot for the model using XG Boost ML indicated a positive net benefit at threshold probabilities of between 5% and 92%.
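The net benefit plotted in a decision curve follows a standard formula: at threshold probability p_t, net benefit = TP/n − (FP/n) × p_t/(1 − p_t), where TP and FP are the numbers of true and false positives among n participants. The sketch below implements this formula together with the "treat all" reference strategy; the function names and the toy data in the usage note are hypothetical, and "treat none" has a net benefit of zero by definition.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every participant."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

For the toy outcomes `[1, 1, 0, 0, 0]` with predicted risks `[0.9, 0.6, 0.7, 0.2, 0.1]` and a threshold of 0.5, the model has a net benefit of 0.2, whereas treating all has a net benefit of −0.2, so the model is preferable to both reference strategies at that threshold.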

Figure 8. The DCA of the model using LR. DCA, decision curve analysis; LR, logistic regression.

Figure 9. The DCA of the model using XG Boost ML. DCA, decision curve analysis; XG Boost ML, extreme gradient boosting (XG) machine learning (ML).

Discussion

Early screening and prediction of the likelihood of pregnant women developing GDM are imperative to the prevention and treatment of this condition (17). We compared two models and found that XG Boost ML models had better performance in terms of discrimination and achieved a larger AUC, which was as high as 0.946. Our results are concordant with a previous study showing that ML algorithms can be more accurate than traditional LR methods (18). The HL test shows that the observed probability is largely consistent with the predicted probability, which implies that both models had good calibration.

Existing evidence indicates that, in the absence of overfitting, a prediction model with a greater number of predictors has an improved prediction ability compared with a model with fewer predictors (19). Similarly, in our study, the XG Boost ML model, with 20 predictors, had a higher predictive accuracy than the LR model, with four predictors. Furthermore, linear models, such as LR models, highlight a clear linear contribution of each variable for GDM models, making them available for clinical implementation, whereas XG Boost ML models can integrate multiple factors, weight the importance of factors, assess their complex non-linear relationships by boosting, and clearly demonstrate the relative contribution of each variable to GDM (18).

A recent related study has indicated that hematologic and biochemical parameters measured during routine antenatal examination can be used in ML models to predict GDM (20). However, it has not until now been possible to weigh the relative importance of each variable. In this study, we have shown that it is possible to quantify the likelihood of individual independent risk factors leading to GDM. Another related study (18) developed an ML prediction model based on a large population and weighed the importance of risk factors, but did not explore biomarkers in early pregnancy; by contrast, these were explored in our study.

In the two models, previous GDM was the most established predictor, and LR analysis showed that pregnant women with previous GDM are 7.8 times more likely to develop GDM (OR = 7.822; p < 0.05). Furthermore, other model studies (9, 21) have shown that previous GDM increases the risk of GDM in a current pregnancy 13.7- to 21.1-fold (p < 0.05). One review also found that having GDM in a previous pregnancy is the strongest risk factor for GDM, with reported recurrence rates of up to 84% (22). In addition to previous GDM, age, HbA₁c level, and MAP were considered independent factors for GDM in the LR analysis. Previously, age and HbA₁c level have been strongly associated with an elevated risk of GDM (17, 21). With increasing age, the fertility and organ function of pregnant women are reduced, and insulin sensitivity and pancreatic β-cell function are decreased, which in turn lead to insulin resistance (IR) and an increased risk of hyperglycemia. HbA₁c level, an identified risk factor, can indicate the severity of GDM and reflects the average blood glucose level in the past 2 to 3 months, which is significantly related to the degree of IR (23). A previous study revealed that HbA₁c level is a reliable predictor of GDM (OR = 3.11; p < 0.05) and that HbA₁c levels are elevated in women with GDM, although still within the normal range (24), which is consistent with our results. MAP was calculated as the sum of one-third of the systolic blood pressure (SBP) and two-thirds of the diastolic blood pressure (DBP), both of which are considered to be predictors of GDM (18, 25, 26). MAP can probably predict GDM because IR is involved in the pathogenesis of both gestational hypertension (GH) and GDM, and the level of MAP, which can reflect the severity of GH, may also reflect the risk of GDM to a certain degree (27).
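The MAP calculation described above amounts to a weighted average of the two blood pressure readings. A minimal sketch follows; the function name `mean_arterial_pressure` is hypothetical, and the readings are assumed to be in mmHg.

```python
def mean_arterial_pressure(sbp, dbp):
    """MAP as one-third of systolic plus two-thirds of diastolic pressure."""
    return sbp / 3 + 2 * dbp / 3
```

For a reading of 120/80 mmHg, this gives a MAP of approximately 93.3 mmHg.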

Another 16 predictors, comprising pre-pregnancy BMI and 15 laboratory parameters routinely measured during antenatal assessment, were confirmed as risk factors by XG Boost ML. Pre-pregnancy BMI, despite being considered an established predictor of GDM (28), had the lowest predictive ability, probably because of the low frequency of overweight and obesity in our sample (affecting approximately 11.7% and 14.7% of women in the training and testing sets, respectively). Another explanation is that the relationship between BMI and GDM is complex: women with GDM and a high BMI have IR, whereas women with GDM and a low BMI have defective insulin secretion (29).

Existing studies have identified several laboratory parameters as independent predictors of GDM, such as glycemic markers (e.g., fasting glucose and HbA₁c levels), alanine aminotransferase (ALT) levels, and thyroid function (levels of the thyroid hormones T₃ and T₄) (9, 18, 20); all of these are available clinically in the first trimester of pregnancy. The possible link between these variables and GDM could be explained by the fact that hyperglycemia can change the hemodynamics of the body, and that these variables can reflect the inflammation and immune responses that are highly associated with IR (30). Prior research has identified several potential blood biomarkers, such as platelet count, white blood cell count, and red blood cell count, which were positively correlated with the development of GDM (30). Consistent with a previous study (9), high T₃ and low T₄ levels were identified as being predictors of GDM in our study, supporting the existence of a close relationship between thyroid function and GDM. ALT and aspartate aminotransferase (AST), as markers of hepatocellular damage, were also examined as predictors of GDM in our study. The pathogenesis of GDM is linked with IR, which may in turn be caused by mild ALT and AST elevations (15, 31). In summary, the laboratory parameters support the hypothesis that routine blood examination during pregnancy is conducive to GDM screening.

Limitations

This study has several limitations. First, the sample size was limited. Second, only temporal external validation within a single center was used to assess the generalizability of the models. Lastly, complete data were not available for all laboratory parameters, and multiple ML models were not compared. Variables such as clinical features and laboratory parameters are based on retrospective data from the EMRS that may have inevitable selection biases. Further multicenter prospective studies should be carried out to update and validate the models based on a large, population-based sample. Models constructed from more variables that are available from EMRS are often the most feasible option.

Conclusion

In conclusion, a traditional LR model with four predictors and an XG Boost ML model with 20 predictors were successfully built and used to predict GDM. Compared with traditional LR, the XG Boost ML model can improve the discrimination of a prediction model for GDM and make full use of more predictors. The common laboratory parameters from pregnant women’s antenatal assessments can be used to predict the likelihood of their developing GDM.

Data availability statement

The datasets presented in this article are not readily available because the generated datasets belong to the hospital. Requests to access the datasets should be directed to XH, NzMxNTM4MDQ1QHFxLmNvbQ==.

Ethics statement

This study was approved by the corresponding Hospital Ethics Committee (No.: NYSZYYEC20200032). The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

Author contributions

XH and XiaolH contributed to the conception and design of the study. XH organized the database. XH and YY performed the statistical analysis. XH wrote the first draft of the manuscript. XH, XiaolH, YY, and JW wrote sections of the manuscript. All authors contributed to the article and approved the submitted version.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Hod M, Kapur A, Sacks DA, Hadar E, Agarwal M, Di Renzo GC, et al. The international federation of gynecology and obstetrics (FIGO) initiative on gestational diabetes mellitus: A pragmatic guide for diagnosis, management, and care. Int J Gynaecol Obstetr (2015) 131(Suppl.3):S173–211. doi: 10.1016/S0020-7292(15)30033-3

2. International Diabetes Federation. IDF diabetes atlas ninth edition (2019). https://diabetesatlas.org/atlas/ninth-edition/ [Accessed 2020].

3. Gao C, Sun X, Lu L, Liu F JY. Prevalence of gestational diabetes mellitus in mainland China: A systematic review and meta-analysis. J Diabetes Investig (2019) 10(1):154–62. doi: 10.1111/jdi.12854

4. Wang C, Jin L, Tong M, Zhang J, Yu J, Meng W. Prevalence of gestational diabetes mellitus and its determinants among pregnant women in Beijing. J Matern Fetal Neonatal Med (2020) 35(7):1337–43. doi: 10.1080/14767058.2020.1754395

5. Zhu H, Zhao Z, Xu J, Chen Y, Zhu Q, Zhou L, et al. The prevalence of gestational diabetes mellitus before and after the implementation of the universal two-child policy in China. Front Endocrinol (2022) 13:960877. doi: 10.3389/fendo.2022.960877

6. Moon JH, Jang HC. Gestational diabetes mellitus: Diagnostic approaches and maternal-offspring complications. Diabetes Metab J (2022) 46(1):3–14. doi: 10.4093/dmj.2021.0335

7. Sudasinghe BH, Wijeyaratne CN, Ginige PS. Long and short-term outcomes of gestational diabetes mellitus (GDM) among south Asian women - a community-based study. Diabetes Res Clin Pract (2018) 145:93–101. doi: 10.1016/j.diabres.2018.04.013

8. McKerracher L, Fried R, AW K, Moffat T, Sloboda DM, Galloway T. Synergies between the developmental origins of health and disease framework and multiple branches of evolutionary anthropology. Evolutionary Anthropol: Issues News Rev (2020) 29(5):214–9. doi: 10.1002/evan.21860

9. Wu YT, Zhang CJ, BW Mo, Kawai A, Li C, Chen L, et al. Early prediction of gestational diabetes mellitus in the Chinese population via advanced machine learning. J Clin Endocrinol Metab (2021) 106(3):e1191–205. doi: 10.1210/clinem/dgaa899

10. Guo XY, Shu J, Fu XH, Chen XP, Zhang L, Ji MX, et al. Improving the effectiveness of lifestyle interventions for gestational diabetes prevention: A meta-analysis and meta-regression. BJOG Int J Obstet Gynaecol (2019) 126:311–20. doi: 10.1111/1471-0528.15467

11. Juan J, Yang H. Prevalence, prevention, and lifestyle intervention of gestational diabetes mellitus in China. Int J Environ Res Public Health (2020) 17(24):9517. doi: 10.3390/ijerph17249517

12. Song C, Li J, Leng J, Ma R, Yang X. Lifestyle intervention can reduce the risk of gestational diabetes: a meta-analysis of randomized controlled trials. Obes Rev (2016) 17(10):960–9. doi: 10.1111/obr.12442

13. Colmenarejo G. Machine learning models to predict childhood and adolescent obesity: A review. Nutrients (2020) 12:2466. doi: 10.3390/nu12082466

14. Sweeting AN, Wong J, Appelblom H, Ross GP, Kouru H, Williams PF, et al. A novel early pregnancy risk prediction model for gestational diabetes mellitus. Fetal Diagn Ther (2019) 45(2):76–84. doi: 10.1159/000486853

15. Powe CE. Early pregnancy biochemical predictors of gestational diabetes mellitus. Curr Diabetes Rep (2017) 17(2):12. doi: 10.1007/s11892-017-0834-y

16. Choi RY, Coyner AS, Kalpathy-Cramer J, Chiang MF, Campbell JP. Introduction to machine learning, neural networks, and deep learning. Transl Vis Sci Technol (2020) 9(2):14. doi: 10.1167/tvst.9.2.14

17. Kang M, Zhang H, Zhang J, Huang K, Zhao J, Hu J, et al. A novel nomogram for predicting gestational diabetes mellitus during early pregnancy. Front Endocrinol (2021) 12:779210. doi: 10.3389/fendo.2021.779210

18. Liu H, Li J, Leng J, Wang H, Liu JN, Li WQ, et al. Machine learning risk score for prediction of gestational diabetes in early pregnancy in Tianjin, China. Diabetes/Metabolism Res Rev (2021) 37(5):e3397. doi: 10.1002/dmrr.3397

19. Ding X, Li J, Liang H, Wang ZY, Jiao TT, Zhuang L, et al. Predictive model for acute respiratory distress syndrome events in ICU patients in China using machine learning algorithms: a secondary analysis of a cohort study. J Transl Med (2019) 17(1):326. doi: 10.1186/s12967-019-2075-0

20. Xiong Y, Lin L, Chen Y, Salerno S, Li Y, Zeng XX, et al. Prediction of gestational diabetes mellitus in the first 19 weeks of pregnancy using machine learning techniques. J Matern Fetal Neonatal Med (2020) 2020:1–7. doi: 10.1080/14767058.2020.1786517

21. Zhang Y, Xiao CM, Zhang Y, Chen Q, Zhang XQ, Li CF, et al. Factors associated with gestational diabetes mellitus: A meta-analysis. J Diabetes Res (2021), 6692695. doi: 10.1155/2021/6692695

22. Sweeting A, Wong J, Murphy HR, Ross GP. A clinical update on gestational diabetes mellitus. Endocr Rev (2022), 1–31. doi: 10.1210/endrev/bnac003

23. Wang YY, Liu Y, Li C, Lin J, Liu XM, Sheng JZ, et al. Frequency and risk factors for recurrent gestational diabetes mellitus in primiparous women: A case control study. BMC Endocrine Disord (2019) 19:22. doi: 10.1186/s12902-019-0349-4

24. Lin J, Jin H, Chen L. Associations between insulin resistance and adverse pregnancy outcomes in women with gestational diabetes mellitus: A retrospective study. BMC Pregnancy Childbirth (2021) 21:526. doi: 10.1186/s12884-021-04006-x

25. Birukov A, Glintborg D, Schulze MB, Jensen TK, Kuxhaus O, Andersen LB, et al. Elevated blood pressure in pregnant women with gestational diabetes according to the WHO criteria: importance of overweight. J Hypertens (2022) 40(8):1614–23. doi: 10.1097/HJH.0000000000003196

26. Aburezq M, AlAlban F, Alabdulrazzaq M, Badr H. Risk factors associated with gestational diabetes mellitus: The role of pregnancy-induced hypertension and physical inactivity. Pregnancy Hypertension (2020) 22:64–70. doi: 10.1016/j.preghy.2020.07.010

27. Vieira MC, Begum S, Seed PT, Badran D, Briley AL, Gill C, et al. Gestational diabetes modifies the association between PlGF in early pregnancy and preeclampsia in women with obesity. Pregnancy Hypertension (2018) 13:267–72. doi: 10.1016/j.preghy.2018.07.003

28. Najafi F, Hasani J, Izadi N, Hashemi-Nazari SS, Namvar Z, Mohammadi S, et al. The effect of prepregnancy body mass index on the risk of gestational diabetes mellitus: A systematic review and dose-response meta-analysis. Obes Rev (2018) 20(3):472–86. doi: 10.1111/obr.12803

29. Bhaskaran K, Dos-Santos-Silva I, Leon DA, Douglas IJ, Smeeth L. Association of BMI with overall and cause-specific mortality: a population-based cohort study of 3·6 million adults in the UK. Lancet Diabetes Endocrinol (2018) 6(12):944–53. doi: 10.1016/S2213-8587(18)30288-2

30. Yang HL, Zhu CY, Ma QL, Long Y, Cheng Z. Variations of blood cells in prediction of gestational diabetes mellitus. J Perinat Med (2015) 43(1):89–93. doi: 10.1515/jpm-2014-0007

31. Kim WJ, Chung Y, Park J, Park JY, Han K, Park Y, et al. Influences of pregravid liver enzyme levels on the development of gestational diabetes mellitus. Liver Int (2021) 41(4):743. doi: 10.1111/liv.14759

Keywords: gestational diabetes mellitus, machine learning, prediction model, extreme gradient boosting, logistic regression

Citation: Hu X, Hu X, Yu Y and Wang J (2023) Prediction model for gestational diabetes mellitus using the XG Boost machine learning algorithm. Front. Endocrinol. 14:1105062. doi: 10.3389/fendo.2023.1105062

Received: 22 November 2022; Accepted: 30 January 2023;
Published: 09 March 2023.

Edited by:

Elena Succurro, University of Magna Graecia, Italy

Reviewed by:

Patrizia Vizza, Magna Græcia University, Italy
Yanting Wu, Fudan University, China

Copyright © 2023 Hu, Hu, Yu and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Xiaoqi Hu, NzMxNTM4MDQ1QHFxLmNvbQ==

†These authors have contributed equally to this work
