

Front. Endocrinol., 19 March 2018
Sec. Clinical Diabetes
Volume 9 - 2018

Prediction of Glucose Tolerance without an Oral Glucose Tolerance Test

Rohit Babbar1,2†, Martin Heni3,4,5, Andreas Peter3,4,5, Martin Hrabě de Angelis5, Hans-Ulrich Häring3,4,5, Andreas Fritsche3,4,5, Hubert Preissl4,5,6, Bernhard Schölkopf1 and Róbert Wagner3,4,5*
  • 1Department of Empirical Inference, Max Planck Institute for Intelligent Systems, Tübingen, Germany
  • 2Aalto University, Helsinki, Finland
  • 3Department of Internal Medicine IV, Division of Endocrinology, Diabetology, Nephrology, Vascular Disease and Clinical Chemistry, University Hospital of Tübingen, Tübingen, Germany
  • 4Institute for Diabetes Research and Metabolic Diseases of the Helmholtz Centre Munich at the University of Tübingen (IDM), Tübingen, Germany
  • 5German Center for Diabetes Research (DZD), Neuherberg, Germany
  • 6Institute of Pharmaceutical Sciences, Interfaculty Centre for Pharmacogenomics and Pharma Research, Department of Pharmacy and Biochemistry, University of Tübingen, Tübingen, Germany

Introduction: Impaired glucose tolerance (IGT) is diagnosed by a standardized oral glucose tolerance test (OGTT). However, the OGTT is laborious, and when not performed, glucose tolerance cannot be determined from fasting samples retrospectively. We tested if glucose tolerance status is reasonably predictable from a combination of demographic, anthropometric, and laboratory data assessed at one time point in a fasting state.

Methods: Given a set of 22 variables selected upon clinical feasibility such as sex, age, height, weight, waist circumference, blood pressure, fasting glucose, HbA1c, hemoglobin, mean corpuscular volume, serum potassium, fasting levels of insulin, C-peptide, triglyceride, non-esterified fatty acids (NEFA), proinsulin, prolactin, cholesterol, low-density lipoprotein, HDL, uric acid, liver transaminases, and ferritin, we used supervised machine learning to estimate glucose tolerance status in 2,337 participants of the TUEF study who were recruited before 2012. We tested the performance of 10 different machine learning classifiers on data from 929 participants in the test set who were recruited after 2012. In addition, reproducibility of IGT was analyzed in 78 participants who had 2 repeated OGTTs within 1 year.

Results: The most accurate prediction of IGT was reached with the recursive partitioning method (accuracy = 0.78). For all classifiers, mean accuracy was 0.73 ± 0.04. The most important model variable was fasting glucose in all models. Using mean variable importance across all models, fasting glucose was followed by NEFA, triglycerides, HbA1c, and C-peptide. The accuracy of predicting IGT from a previous OGTT was 0.77.

Conclusion: Machine learning methods yield moderate accuracy in predicting glucose tolerance from a wide set of clinical and laboratory variables. A substitution of OGTT does not currently seem to be feasible. An important constraint could be the limited reproducibility of glucose tolerance status during a subsequent OGTT.


Introduction

Impaired glucose tolerance (IGT) defines an intermediate state of blood glucose regulation that is not yet clearly pathologic. However, it is important to recognize this state because individuals with IGT have a significantly increased incidence of type 2 diabetes (1), and IGT is associated with an elevated cardiovascular disease risk (2–4). Therefore, identification of IGT is important to focus diabetes and cardiovascular disease prevention strategies on those who are at the highest risk.

There is a widely accepted consensus definition of IGT using a cutoff of 7.8 mmol/L for the postchallenge plasma glucose level measured 2 h after the administration of a 75 g glucose load in an oral glucose tolerance test (OGTT) (5). While the test itself is usually well tolerated and not too difficult to perform, it requires considerable attention and time from both the participant and the medical personnel. Unfortunately, there is currently no procedure that can substitute for the OGTT to establish IGT. For the diagnosis of diabetes, the use of glycated hemoglobin (HbA1c) levels has been proposed as an alternative to OGTT. However, it has been shown that using only HbA1c to diagnose diabetes misses more than half of the diabetes cases established by OGTT (6). According to recommendations of the American Diabetes Association, prediabetes can also be diagnosed with an intermediary HbA1c range of 5.7–6.4% (7). Nevertheless, it has recently been demonstrated that in obese individuals, 44% of prediabetes cases captured by OGTT were missed using the HbA1c criterion (8). Furthermore, the HbA1c-based diagnosis of prediabetes precludes the differentiation of impaired fasting glycemia from IGT.

Therefore, we investigated if machine learning could be utilized to differentiate IGT from normal glucose tolerance (NGT) using a battery of potential predictor variables that can be easily obtained in a fasting state. Machine learning can be employed as a computational technique to recognize specific patterns that are characteristic for a class of entities. Specifically, we set out to test supervised machine learning to predict IGT from anthropometric, clinical, and laboratory variables obtained at one time point.

Materials and Methods


Data of the Tuebingen Family Study (TUEF) were retrospectively analyzed. In the TUEF study, participants at an increased risk for type 2 diabetes were recruited if they had a family history of diabetes, obesity, or previously known prediabetes. All participants underwent OGTT with an extensive phenotyping including the laboratory measurements of several glycemic traits. The data were acquired between December 1996 and November 2016. To separate training and test sets, the data were longitudinally split at the date January 1, 2012. The training set comprised OGTTs performed before this date, and the test set comprised OGTTs performed after this date. To measure the intraindividual variability of an OGTT in our study center, we identified 78 participants who underwent a second OGTT within 1 year and had a weight difference <3 kg between the two measurements.
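The date-based split described above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' R code: the record layout, the field names `participant` and `ogtt_date`, and the example dates are hypothetical. Assigning an OGTT performed exactly on January 1, 2012 to the test set is also an assumption, since the text only distinguishes tests before and after that date.

```python
from datetime import date

# Split date taken from the text; everything else below is hypothetical.
SPLIT_DATE = date(2012, 1, 1)

records = [
    {"participant": 1, "ogtt_date": date(1998, 5, 12)},
    {"participant": 2, "ogtt_date": date(2011, 12, 20)},
    {"participant": 3, "ogtt_date": date(2012, 3, 2)},
    {"participant": 4, "ogtt_date": date(2016, 11, 8)},
]

# Earlier OGTTs train the classifiers; later OGTTs are held out for testing.
train = [r for r in records if r["ogtt_date"] < SPLIT_DATE]
test = [r for r in records if r["ogtt_date"] >= SPLIT_DATE]

print(len(train), "training records,", len(test), "test records")
```

A split by calendar date, rather than a random split, keeps the test set strictly later in time than the training set, which mimics applying a finished classifier to newly recruited participants.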

This study was carried out in accordance with the recommendations of the Ethics Committee of the University of Tübingen with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethics Committee of the University of Tübingen.

Model Variables

We preselected routinely phenotyped variables upon scientific credibility and feasibility as model features for machine learning. The list of selected variables with available evidence for an association with IGT or glycemia is shown in Table 1. Mean arterial blood pressure, alanine aminotransferase, family history of diabetes (9), and prolactin (10) were initially selected and then excluded due to a high missing rate (>10%). The feature variables sex, age, height, weight, fasting glucose, and the outcome variable IGT had no missing values in the training set, and all other features had missing value rates <10%. Missing values were imputed using multivariate imputation by chained equations (11). For constructing the classifiers, all variables were normalized to a mean of 0 with a SD of 1 (scaled and centered). No further preprocessing was applied.
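The centering and scaling step can be illustrated as follows. This is a sketch under stated assumptions, not the authors' implementation: the glucose values are invented, and reusing the training-set mean and SD to transform the test set is an assumption about how the normalization was applied, although it is the usual way to avoid leaking test-set information into the model.

```python
from statistics import mean, stdev

def fit_scaler(values):
    """Return the mean and sample SD of a feature in the training set."""
    return mean(values), stdev(values)

def transform(values, center, scale):
    """Center and scale values with previously fitted parameters."""
    return [(v - center) / scale for v in values]

# Hypothetical fasting glucose values in mmol/L.
train_glucose = [4.8, 5.1, 5.4, 5.9, 6.3]
test_glucose = [5.0, 6.0]

center, scale = fit_scaler(train_glucose)
train_z = transform(train_glucose, center, scale)
# Assumption: the test set is transformed with the training parameters.
test_z = transform(test_glucose, center, scale)

print(round(mean(train_z), 6), round(stdev(train_z), 6))
```

After the transformation, the training feature has a mean of 0 and an SD of 1, while the test values are expressed on the same scale without being used to estimate it.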


Table 1. List of anthropometric, clinical, and laboratory variables used as features in the machine learning classifiers.

OGTT and Laboratory Measurements

All participants received a 75-g glucose solution (Accu-Chek Dextro, Roche) at 8:00 a.m. following an overnight fast. Venous blood was obtained through an indwelling venous catheter before and 30, 60, 90, and 120 min after glucose ingestion. Glucose values were measured directly using a bedside glucose analyzer (YSI, Yellow Springs, CO, USA). All other obtained blood samples were put on ice, and the serum was centrifuged within 2 h. Plasma insulin and C-peptide were determined by an immunoassay with the ADVIA Centaur XP Immunoassay System (Siemens Healthineers, Eschborn, Germany).

Serum proinsulin concentrations were measured using a microparticle enzyme immunoassay (IBL, Hamburg, Germany) on a BEP III System (Siemens Healthineers, Eschborn, Germany). Triglycerides (TGs) and total, HDL, and low-density lipoprotein cholesterol levels, as well as alanine aminotransferase, aspartate aminotransferase, and gamma-glutamyl transferase (GGT) activities, were measured using the ADVIA XPT clinical chemical analyzer (Siemens Healthineers, Eschborn, Germany). Plasma concentrations of total non-esterified fatty acid (NEFA) were measured with an enzymatic method (WAKO Chemicals, Neuss, Germany) on the latter instrument. Hematological parameters, including mean corpuscular volume, were determined on the Sysmex XN-10 (Sysmex GmbH, Norderstedt, Germany) or ADVIA 2120 hematological analyzers (Siemens Healthineers, Eschborn, Germany). HbA1c measurements were performed using the Tosoh glycohemoglobin analyzer HLC-723G8 (Tosoh Bioscience, Tokyo, Japan).

Model Computation and Statistics

All computations were run under R version 3.4 (26). Classifiers were computed using the wrapper package Classification and Regression Training (27). Detailed information on the machine learning packages is provided in Table S1 in Supplementary Material. Model optimization in the training set was performed by fivefold cross-validation, repeated three times. The granularity of the tuning parameter grid (“tuneLength”) was set to 5 (default: 3). We used the Synthetic Minority Over-sampling Technique (SMOTE) to compensate for the imbalanced prevalence of IGT and NGT in the training set (28). By doing this, we simulate balanced NGT and IGT prevalence for the classifier, thus precluding it from utilizing prevalence information in the estimation procedure. Intraindividual percentage error was calculated as the difference of two measurements divided by their mean. Intraindividual coefficient of variation was calculated as the ratio of SD and mean. Insulin sensitivity was assessed using the method of Matsuda and DeFronzo (29).
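The two variability measures defined above can be written out for a single pair of measurements. The sketch below is illustrative only: the glucose values are hypothetical, taking the absolute value of the difference is an assumption, and so is the use of the sample SD (denominator n − 1) for the coefficient of variation.

```python
from statistics import mean, stdev

def percentage_error(a, b):
    """Absolute difference of two measurements divided by their mean, in percent."""
    return abs(a - b) / mean([a, b]) * 100.0

def coefficient_of_variation(a, b):
    """Sample SD of two measurements divided by their mean, in percent."""
    return stdev([a, b]) / mean([a, b]) * 100.0

# Hypothetical 2-h glucose values (mmol/L) from two OGTTs of one participant.
first, second = 7.0, 8.0

pe = percentage_error(first, second)
cv = coefficient_of_variation(first, second)
print(f"percentage error = {pe:.1f}%, coefficient of variation = {cv:.1f}%")
```

Under these assumptions, the sample SD of two values equals their absolute difference divided by √2, so the percentage error of a pair is always √2 times its coefficient of variation.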


Results

Machine Learning Classifiers

We tested the performance of 10 machine learning classifiers to predict IGT from 22 biologically reasonable feature variables. Table 2 compares the characteristics of the training and test sets. In the training set, during resampling by repeated cross-validation, the highest model accuracy (the proportion of correct predictions over all predictions) was achieved by the recursive partitioning and regression trees (RPART) classifier at a median of 0.82 (interquartile range, 0.80–0.83). In the independent test set, the same RPART classifier reached the highest accuracy of 0.78. The mean sensitivity across all models was 0.67 ± 0.08, and the mean specificity was 0.75 ± 0.08. The generalized linear model and the penalized multinomial regression classifier had the highest sensitivity (both 0.74), while the highest specificity (0.88) was achieved by the RPART classifier. These measures translate to positive predictive values of around 51 ± 6% and negative predictive values of 86 ± 2%. Model accuracy showed a mean of 0.73 ± 0.04.
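How sensitivity and specificity translate into predictive values depends on prevalence, and the arithmetic can be sketched as follows. The function name is hypothetical. The inputs are the mean sensitivity and specificity reported above together with the 27% IGT prevalence reported for the test set. Because the paper averages predictive values per model, this calculation from mean inputs is only an approximation of the published figures.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Mean sensitivity and specificity across models, IGT prevalence in the test set.
ppv, npv = predictive_values(sensitivity=0.67, specificity=0.75, prevalence=0.27)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With these inputs, the sketch yields a positive predictive value of about 0.50 and a negative predictive value of about 0.86, close to the reported means of 51% and 86%.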


Table 2. Characteristics of the training and test set for the feature variables and the target variable defining the classification.

Simple accuracy measurements could be biased because the predicted categories were unbalanced. IGT has a prevalence of 27%, corresponding to an NGT prevalence of 73% in the test set, such that a uniform “forecast” of NGT would also result in an accuracy of 0.73. This value is also called the no-information rate. To quantify prediction accuracy adjusted for the expected accuracy, Cohen’s kappa (κ) was applied as a suitable measure. Table 3 shows the accuracy, κ statistic, and the p-value of the difference of κ from the no-information rate for all models in the test set. The mean κ over all models in the test set was 0.38 ± 0.04.
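The relation between accuracy, the no-information rate, and κ can be made concrete with a small calculation. The confusion-matrix counts below are hypothetical and chosen only to mirror a 27% IGT prevalence; they are not taken from the study. The function names are likewise illustrative.

```python
def accuracy(tp, fp, fn, tn):
    """Proportion of correct predictions over all predictions."""
    return (tp + tn) / (tp + fp + fn + tn)

def no_information_rate(tp, fp, fn, tn):
    """Accuracy obtained by always predicting the more frequent class."""
    n = tp + fp + fn + tn
    return max(tp + fn, fp + tn) / n

def cohen_kappa(tp, fp, fn, tn):
    """Observed agreement corrected for the agreement expected by chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) / n) * ((tp + fn) / n) + ((fn + tn) / n) * ((fp + tn) / n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical classifier: 270 IGT and 730 NGT cases (IGT is the positive class).
model = dict(tp=150, fp=90, fn=120, tn=640)
# Uniform forecast of NGT for the same 1,000 cases.
always_ngt = dict(tp=0, fp=0, fn=270, tn=730)

kappa_model = cohen_kappa(**model)
kappa_uniform = cohen_kappa(**always_ngt)
print(accuracy(**model), no_information_rate(**model), round(kappa_model, 3))
print(accuracy(**always_ngt), round(kappa_uniform, 3))
```

In this example, the uniform NGT forecast reaches an accuracy equal to the no-information rate of 0.73 but a κ of 0, whereas the hypothetical classifier has an accuracy of 0.79 and a κ of about 0.45.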


Table 3. Model performance showing crude accuracy values (the ratio of correct predictions over all predictions) and κ statistic (accuracy in relation to expected accuracy) for the evaluated machine learning classifiers in the test set.

Furthermore, to show which predictor variables are used by the machine learning classifiers, model-specific variable importance measures were obtained for each model. By using a common scale of 0–100 (with 0 indicating an omitted variable and 100 indicating the variable with the highest importance), we show overall variable importance in Figure 1. The single most important model feature was fasting glucose. It is followed by NEFA, TGs, and HbA1c. Some of the tested machine learning methods shrink the number of model features from the original 22 by completely eliminating weak predictor variables. For example, the best-performing RPART classifier retained only fasting glucose, HbA1c, insulin, C-peptide, and TGs, while the lasso method used fasting glucose, NEFA, C-peptide, HbA1c, age, height, and C-reactive protein (CRP) for the classifier.
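One way to bring model-specific importance measures onto a common 0–100 scale and aggregate them is sketched below. This is an assumption-laden illustration rather than the procedure used for Figure 1: the model names and raw scores are invented, scaling by the per-model maximum is an assumption (so that an omitted variable with a raw score of 0 stays at 0), and summing the scaled scores across models is only one plausible reading of the stacked bars.

```python
def scale_importance(raw_scores):
    """Rescale one model's raw importances so that the top variable scores 100."""
    top = max(raw_scores.values())
    return {name: 100.0 * value / top for name, value in raw_scores.items()}

# Hypothetical raw importances from two models on different native scales.
raw_by_model = {
    "model_a": {"fasting_glucose": 0.8, "nefa": 0.4, "hba1c": 0.2, "height": 0.0},
    "model_b": {"fasting_glucose": 5.0, "nefa": 1.0, "hba1c": 2.5, "height": 0.5},
}

scaled_by_model = {m: scale_importance(raw) for m, raw in raw_by_model.items()}

# Aggregate by summing the scaled scores of each variable across models.
aggregate = {}
for scores in scaled_by_model.values():
    for name, score in scores.items():
        aggregate[name] = aggregate.get(name, 0.0) + score

ranking = sorted(aggregate, key=aggregate.get, reverse=True)
print(ranking)
```

Rescaling before aggregation prevents a model with numerically large raw importances from dominating the overall ranking.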


Figure 1. Aggregated importance score in the machine learning classifiers for each feature variable. Individual importance scores are represented by colors in the stacked bars. The classifiers are described in Table S1 in Supplementary Material.

To compare model performance in insulin-sensitive and insulin-resistant individuals, we tested the classifiers in subsets of the test set split at the median insulin sensitivity index. Higher accuracy levels were obtained in the insulin-sensitive subset; however, κ was generally very low, probably due to the low proportion of individuals with IGT among insulin-sensitive subjects. In contrast, accuracy and κ values were moderate in individuals with low insulin sensitivity.

Intraindividual Variability of OGTT

To compare the predictive performance of machine learning models with the predictive performance of an earlier OGTT for discriminating IGT from NGT in an individual, we investigated 78 participants who received a second OGTT within 1 year of their first OGTT. Only participants who did not undergo lifestyle intervention in between and who did not experience substantial weight change (difference <3 kg) were selected. Postchallenge 120-min glucose in a repeated OGTT, i.e., the underlying variable for discrimination of IGT from NGT, showed large variation. The percentage error of measurement pairs for postchallenge 120-min glucose, calculated as the average of the bias-to-mean ratio of the individual data points of the plot, was 18.3% (±15.2%). For comparison, fasting glucose had a percentage error of only 6.4% (±5.0%). Bland-Altman plots and coefficients of variation of repeated OGTT measurements for each time point are shown in Figures S1 and S2 in Supplementary Material, respectively. The calculated mean coefficients of variation were 13.0% (±11.0%) for postchallenge glucose and 4.6% (±3.5%) for fasting glucose. By using these data, we calculated the predictive accuracy of one OGTT for forecasting IGT in a second OGTT. In the set of 78 OGTT measurement pairs, the agreement (accuracy) between corresponding measurements was 0.77. Given the expected frequency of IGT, the computed κ statistic was 0.46.
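The agreement between two OGTTs can be illustrated by classifying each 120-min glucose value against the 7.8 mmol/L cutoff and counting concordant pairs. The measurement pairs below are hypothetical, and treating values at or above the cutoff as IGT (without an upper bound for diabetes) is a simplifying assumption.

```python
IGT_CUTOFF = 7.8  # mmol/L, 120-min postchallenge glucose

def is_igt(glucose_120):
    """Classify a 120-min glucose value; >= cutoff counts as IGT (assumption)."""
    return glucose_120 >= IGT_CUTOFF

# Hypothetical (first OGTT, second OGTT) 120-min glucose pairs in mmol/L.
pairs = [(6.5, 7.0), (8.2, 8.9), (7.5, 8.1), (9.0, 7.4), (5.9, 6.2), (8.4, 8.0)]

concordant = sum(is_igt(first) == is_igt(second) for first, second in pairs)
agreement = concordant / len(pairs)
print(f"{concordant} of {len(pairs)} pairs agree (accuracy = {agreement:.2f})")
```

Pairs whose values lie close to the cutoff, such as 7.5 and 8.1 mmol/L, change category even with modest variation between tests, which is how measurement variability limits the agreement of repeated OGTTs.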


Discussion

Our work shows that machine learning is capable of predicting the glucose tolerance status from 22 baseline variables obtained at fasting blood acquisition. The best-performing RPART classifier had an unbiased predictive accuracy κ of 0.42, which is a moderate classifier according to the consideration of Landis and Koch (30). From another aspect, given a test population with similarly high IGT prevalence, this classifier’s IGT prediction would be correct in 62% of the cases (positive predictive value), and the NGT predictions would be correct in 83% of the cases (negative predictive value). This might fall short of initial expectations. However, the κ value of the RPART classifier (0.42) approaches the computed intraindividual κ of a repeated OGTT (0.46) that can be interpreted as the upper bound of a feasible prediction of IGT. A wealth of studies has investigated the reproducibility of OGTT results since the 1960s (31–36). It has been shown that in a population with NGT, the 95th percentile of random test–retest differences is 46% for the postchallenge glucose, while this ratio is only 16% for fasting glucose values (36). The coefficient of variation has been estimated around 16–17% for 2-h postchallenge glucose levels (34, 36). In our population, the coefficient of variation was only 12%, which might be due to the stringent laboratory methods employed in our single-center study. Higher variability also seems to apply to other postchallenge analytes such as insulin levels during the OGTT (32). Fasting levels of these analytes are more stable, but still prone to a relatively high intraindividual variance. For example, the intraindividual coefficient of variation for fasting TGs is 25–35% (37, 38). Since these analytes are feature variables in the machine learning classifiers, they introduce further noise to the prediction.

As the aggregate statistics of variable importance show, of the 22 initially selected feature variables, the top-ranked variables were fasting glucose, NEFA, TGs, HbA1c, C-peptide, and CRP. One could speculate that fasting glucose, TGs, and NEFA might reflect insulin resistance, while lower C-peptide levels associate with a dysfunction of insulin secretion. HbA1c directly correlates with glycemia, such that higher postprandial glucose levels that are present with IGT contribute to an elevation of HbA1c. Interestingly, in one of our recent works, NEFA emerged as a very robust proxy for the estimation of insulin resistance (39). The emergence of CRP among the top feature variables underlines the role of subclinical inflammation in the pathogenesis of IGT and prediabetes (40). In the aggregated variable importance score, fasting insulin is unexpectedly only the sixth most important variable, closely following CRP. However, the best-performing RPART classifier retained only fasting glucose, HbA1c, insulin, C-peptide, and TGs. This and the relatively small difference among the aggregate importance scores of these variables suggest that some of the most important feature variables at positions 2–6 of the aggregate variable importance list relate to similar biological aspects and can be used interchangeably. This might be true for TGs and NEFA as well as, to some extent, C-peptide and insulin levels.

The most important limitation of our work is that we cannot generalize the classifier to different study populations and different study settings. In our subset analyses, the classifiers seemed to yield a higher accuracy in the insulin-sensitive subgroup and a lower accuracy in the insulin-resistant subgroup. Kappa was strongly influenced by the different IGT rates in the two subsets and was generally lower than in the original test set. The retraining of classifiers could improve model performance in study populations with substantially different insulin sensitivity distributions. A further limitation of our study is the large set of clinical and laboratory variables, some of which, e.g., NEFA or proinsulin, will not be easily accessible in every laboratory. Also, a precise determination of NEFA levels requires careful preanalytical handling.

Taken together, prediction of IGT from baseline variables with supervised machine learning is a feasible technique. However, in spite of the complex analytes as feature variables, predictive accuracy remains moderate. Therefore, stringently performed OGTT still remains the gold standard for determining IGT. Prediction of IGT with machine learning could be employed to fill in IGT status when OGTT is technically not possible or to retroactively estimate IGT status from stored fasting samples.

Ethics Statement

This study was carried out in accordance with the recommendations of the Ethics Committee of the University of Tübingen with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethics Committee of the University of Tübingen.

Author Contributions

RB and RW analyzed the data and wrote the manuscript. MH, AP, AF, and HP contributed to the interpretation of data and edited the manuscript. MH, H-UH, and BS contributed to the study design and interpretation of data and reviewed the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Acknowledgments

We thank all the research volunteers for their participation. We gratefully acknowledge the excellent technical assistance of Anja Dessecker, Ellen Kollmar, and Andreas Vosseler, all Department of Internal Medicine, Division of Endocrinology, Diabetology, Nephrology, Vascular Disease and Clinical Chemistry, University Hospital, Eberhard Karls University, Tübingen. We acknowledge support by Deutsche Forschungsgemeinschaft and Open Access Publishing Fund of University of Tübingen.


Funding

This study was supported in part by a grant (01GI0925) from the Federal Ministry of Education and Research (BMBF) to the German Center for Diabetes Research (DZD e.V.). This project has received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 115881 (RHAPSODY). This Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA. This work is supported by the Swiss State Secretariat for Education, Research and Innovation (SERI) under contract number 16.0097. The opinions expressed and arguments employed herein do not necessarily reflect the official views of these funding bodies.

Supplementary Material

The Supplementary Material for this article can be found online at


References

1. Knowler WC, Fowler SE, Hamman RF, Christophi CA, Hoffman HJ, Brenneman AT, et al. 10-year follow-up of diabetes incidence and weight loss in the Diabetes Prevention Program Outcomes Study. Lancet (2009) 374:1677–86. doi:10.1016/S0140-6736(09)61457-4


2. Fuller J, Shipley M, Rose G, Jarrett RJ, Keen H. Coronary-heart-disease risk and impaired glucose tolerance. The Whitehall Study. Lancet (1980) 315:1373–6. doi:10.1016/S0140-6736(80)92651-3


3. Tominaga M, Eguchi H, Manaka H, Igarashi K, Kato T, Sekikawa A. Impaired glucose tolerance is a risk factor for cardiovascular disease, but not impaired fasting glucose. The Funagata Diabetes Study. Diabetes Care (1999) 22:920–4. doi:10.2337/diacare.22.6.920


4. Cederberg H, Saukkonen T, Laakso M, Jokelainen J, Härkönen P, Timonen M, et al. Postchallenge glucose, A1C, and fasting glucose as predictors of type 2 diabetes and cardiovascular disease: a 10-year prospective cohort study. Diabetes Care (2010) 33:2077–83. doi:10.2337/dc10-0262


5. Alberti KG, Zimmet PZ. Definition, diagnosis and classification of diabetes mellitus and its complications. Part 1: diagnosis and classification of diabetes mellitus provisional report of a WHO consultation. Diabet Med (1998) 15:539–53. doi:10.1002/(SICI)1096-9136(199807)15:7<539:AID-DIA668>3.0.CO;2-S


6. Peter A, Fritsche A, Stefan N, Heni M, Häring H-U, Schleicher E. Diagnostic value of hemoglobin A1c for type 2 diabetes mellitus in a population at risk. Exp Clin Endocrinol Diabetes (2011) 119:234–7. doi:10.1055/s-0030-1270440


7. American Diabetes Association. 2. Classification and diagnosis of diabetes. Diabetes Care (2015) 38:S8–16. doi:10.2337/dc15-S005


8. Meijnikman AS, De Block CEM, Dirinck E, Verrijken A, Mertens I, Corthouts B, et al. Not performing an OGTT results in significant underdiagnosis of (pre)diabetes in a high risk adult Caucasian population. Int J Obes (2017) 41(11):1615–20. doi:10.1038/ijo.2017.165


9. Wagner R, Thorand B, Osterhoff MA, Müller G, Böhm A, Meisinger C, et al. Family history of diabetes is associated with higher risk for prediabetes: a multicentre analysis from the German Center for Diabetes Research. Diabetologia (2013) 56:2176–80. doi:10.1007/s00125-013-3002-1


10. Wagner R, Heni M, Linder K, Ketterer C, Peter A, Böhm A, et al. Age-dependent association of serum prolactin with glycaemia and insulin sensitivity in humans. Acta Diabetol (2013) 51:71–8. doi:10.1007/s00592-013-0493-7


11. van Buuren S, Groothuis-Oudshoorn K. mice: multivariate imputation by chained equations in R. J Stat Softw (2011) 45:67. doi:10.18637/jss.v045.i03


12. Kautzky-Willer A, Harreiter J, Pacini G. Sex and gender differences in risk, pathophysiology and complications of type 2 diabetes mellitus. Endocr Rev (2016) 37:278–316. doi:10.1210/er.2015-1137


13. Harris MI, Hadden WC, Knowler WC, Bennett PH. Prevalence of diabetes and impaired glucose tolerance and plasma glucose levels in U.S. Population aged 20–74 Yr. Diabetes (1987) 36:523–34. doi:10.2337/diab.36.4.523


14. Geberhiwot T, Haddon A, Labib M. HbA1c predicts the likelihood of having impaired glucose tolerance in high-risk patients with normal fasting plasma glucose. Ann Clin Biochem (2005) 42:193–5. doi:10.1258/0004563053857950


15. Haap M, Fritsche A, Mensing HJ, Häring H-U, Stumvoll M. Association of high serum ferritin concentration with glucose intolerance and insulin resistance in healthy people. Ann Intern Med (2003) 139:869–71. doi:10.7326/0003-4819-139-10-200311180-00029


16. Meisinger C, Stöckl D, Rückert IM, Döring A, Thorand B, Heier M, et al. Serum potassium is associated with prediabetes and newly diagnosed diabetes in hypertensive adults from the general population: the KORA F4-study. Diabetologia (2013) 56:484–91. doi:10.1007/s00125-012-2786-8


17. Laakso M, Zilinskaite J, Hansen T, Boesgaard TW, Vänttinen M, Stancáková A, et al. Insulin sensitivity, insulin release and glucagon-like peptide-1 levels in persons with impaired fasting glucose and/or impaired glucose tolerance in the EUGENE2 study. Diabetologia (2008) 51:502–11. doi:10.1007/s00125-007-0899-2


18. Hanefeld M, Koehler C, Fuecker K, Henkel E, Schaper F, Temelkova-Kurktschiev T. Insulin secretion and insulin sensitivity pattern is different in isolated impaired glucose tolerance and impaired fasting glucose: the risk factor in impaired glucose tolerance for atherosclerosis and diabetes study. Diabetes Care (2003) 26:868–74. doi:10.2337/diacare.26.3.868


19. Fritsche A, Madaus A, Stefan N, Tschritter O, Maerker E, Teigeler A, et al. Relationships among age, proinsulin conversion, and beta-cell function in nondiabetic humans. Diabetes (2002) 51(Suppl 1):S234–9. doi:10.2337/diabetes.51.2007.S234


20. Sinha R, Fisch G, Teague B, Tamborlane WV, Banyas B, Allen K, et al. Prevalence of impaired glucose tolerance among children and adolescents with marked obesity. N Engl J Med (2002) 346:802–10. doi:10.1056/NEJMoa012578


21. Paolisso G, Tataranni PA, Foley JE, Bogardus C, Howard BV, Ravussin E. A high concentration of fasting plasma non-esterified fatty acids is a risk factor for the development of NIDDM. Diabetologia (1995) 38:1213–7. doi:10.1007/BF00422371


22. Pradhan AD, Manson JE, Rifai N, Buring JE, Ridker PM. C-reactive protein, interleukin 6, and risk of developing type 2 diabetes mellitus. JAMA (2001) 286:327–34. doi:10.1001/jama.286.3.327


23. Jimba S, Nakagami T, Takahashi M, Wakamatsu T, Hirota Y, Iwamoto Y, et al. Prevalence of non-alcoholic fatty liver disease and its association with impaired glucose metabolism in Japanese adults. Diabet Med (2005) 22:1141–5. doi:10.1111/j.1464-5491.2005.01582.x


24. Yousefzadeh G, Shokoohi M, Yeganeh M, Najafipour H. Role of gamma-glutamyl transferase (GGT) in diagnosis of impaired glucose tolerance and metabolic syndrome: a prospective cohort research from the Kerman Coronary Artery Disease Risk Study (KERCADRS). Diabetes Metab Syndr (2012) 6:190–4. doi:10.1016/j.dsx.2012.08.013


25. Meisinger C, Döring A, Stöckl D, Thorand B, Kowall B, Rathmann W. Uric acid is more strongly associated with impaired glucose regulation in women than in men from the general population: the KORA F4-Study. PLoS One (2012) 7:e37180. doi:10.1371/journal.pone.0037180


26. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing (2015). Available from:


27. Kuhn M. caret: Classification and Regression Training. Astrophysics Source Code Library (2015). ascl:1505.003. Available from:


28. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res (2002) 16:321–57. doi:10.1613/jair.953


29. Matsuda M, DeFronzo RA. Insulin sensitivity indices obtained from oral glucose tolerance testing: comparison with the euglycemic insulin clamp. Diabetes Care (1999) 22:1462–70. doi:10.2337/diacare.22.9.1462


30. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics (1977) 33:159–74. doi:10.2307/2529310


31. McDonald GW, Fisher GF, Burnham C. Reproducibility of the oral glucose tolerance test. Diabetes (1965) 14:473–80. doi:10.2337/diab.14.8.473


32. Olefsky JM, Reaven GM. Insulin and glucose responses to identical oral glucose tolerance tests performed forty-eight hours apart. Diabetes (1974) 23:449–53. doi:10.2337/diab.23.5.449


33. Ganda OP, Day JL, Soeldner JS, Connon JJ, Gleason RE. Reproducibility and comparative analysis of repeated intravenous and oral glucose tolerance tests. Diabetes (1978) 27:715–25. doi:10.2337/diab.27.7.715


34. Feskens EJM, Bowles CH, Kromhout D. Intra- and interindividual variability of glucose tolerance in an elderly population. J Clin Epidemiol (1991) 44:947–53. doi:10.1016/0895-4356(91)90058-H


35. Ko GT, Chan JC, Woo J, Lau E, Yeung VT, Chow CC, et al. The reproducibility and usefulness of the oral glucose tolerance test in screening for diabetes and other cardiovascular risk factors. Ann Clin Biochem (1998) 35(Pt 1):62–7. doi:10.1177/000456329803500107


36. Mooy JM, Grootenhuis PA, de Vries H, Kostense PJ, Popp-Snijders C, Bouter LM, et al. Intra-individual variation of glucose, specific insulin and proinsulin concentrations measured by two oral glucose tolerance tests in a general Caucasian population: the Hoorn Study. Diabetologia (1996) 39:298–305. doi:10.1007/BF00418345


37. Demacker PNM, Schade RWB, Jansen RTP, Van’t Laar A. Intra-individual variation of serum cholesterol, triglycerides and high density lipoprotein cholesterol in normal humans. Atherosclerosis (1982) 45:259–66. doi:10.1016/0021-9150(82)90227-1


38. Jacobs DR, Barrett-Connor E. Retest reliability of plasma cholesterol and triglyceride. The Lipid Research Clinics Prevalence Study. Am J Epidemiol (1982) 116:878–85. doi:10.1093/oxfordjournals.aje.a113490


39. Wagner R, Fritsche L, Heni M, Fehlert E, Stefan N, Staiger H, et al. A novel insulin sensitivity index particularly suitable to measure insulin sensitivity during gestation. Acta Diabetol (2016) 53:1037–44. doi:10.1007/s00592-016-0930-5


40. Stefan N, Fritsche A, Schick F, Häring H-U. Phenotypes of prediabetes and stratification of cardiometabolic risk. Lancet Diabetes Endocrinol (2016) 4:789–98. doi:10.1016/S2213-8587(16)00082-6


Keywords: clinical study, supervised machine learning, oral glucose tolerance test, prediction, classification, impaired glucose tolerance, test-retest variability, machine learning classification

Citation: Babbar R, Heni M, Peter A, Hrabě de Angelis M, Häring H-U, Fritsche A, Preissl H, Schölkopf B and Wagner R (2018) Prediction of Glucose Tolerance without an Oral Glucose Tolerance Test. Front. Endocrinol. 9:82. doi: 10.3389/fendo.2018.00082

Received: 05 December 2017; Accepted: 22 February 2018;
Published: 19 March 2018

Edited by:

Wei Bao, University of Iowa, United States

Reviewed by:

Rade Vukovic, Mother and Child Health Care Institute of Serbia, Serbia
Jan Gojda, 3rd Faculty of Medicine, Czechia

Copyright: © 2018 Babbar, Heni, Peter, Hrabě de Angelis, Häring, Fritsche, Preissl, Schölkopf and Wagner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Róbert Wagner,

Present address: Rohit Babbar, Aalto University, Helsinki, Finland