AUTHOR=Wang Yikang, Zhang Liying, Niu Miaomiao, Li Ruiying, Tu Runqi, Liu Xiaotian, Hou Jian, Mao Zhenxing, Wang Zhenfei, Wang Chongjian TITLE=Genetic Risk Score Increased Discriminant Efficiency of Predictive Models for Type 2 Diabetes Mellitus Using Machine Learning: Cohort Study JOURNAL=Frontiers in Public Health VOLUME=Volume 9 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2021.606711 DOI=10.3389/fpubh.2021.606711 ISSN=2296-2565 ABSTRACT=Background: Previous studies have constructed prediction models for type 2 diabetes mellitus (T2DM), but machine learning was rarely used and few focused on genetic prediction. This study aimed to establish an effective T2DM prediction tool and to further explore the potential of genetic risk scores (GRS) via various classifiers among rural adults. Methods: In this prospective study, GRS were calculated for a total of 5,712 participants from the Henan Rural Cohort Study. Cox proportional hazards (CPH) regression was used to analyze associations between GRS and T2DM. CPH, artificial neural network (ANN), random forest (RF), and gradient boosting machine (GBM) were used to establish prediction models. The area under the receiver operating characteristic curve (AUC) and net reclassification index (NRI) were used to assess the discrimination ability of the models. Decision curves were plotted to determine the clinical utility of the prediction models. Results: Compared with individuals in the lowest quintile of the GRS, the HR (95% CI) was 2.06 (1.40 to 3.03) for those in the highest quintile of the GRS (P trend < 0.05). Based on the conventional predictors, the AUCs of the prediction models were 0.815, 0.816, 0.843, and 0.851 via CPH, ANN, RF, and GBM, respectively. The changes in AUC with integration of GRS for CPH, ANN, RF, and GBM were 0.001, 0.002, 0.018, and 0.033, respectively.
Reclassification improved significantly for all classifiers when GRS was added (NRI: 41.2% for CPH; 41.0% for ANN; 46.4% for RF; 45.1% for GBM). Decision curve analysis showed a clinical benefit for the models combined with GRS. Conclusion: The prediction model combined with GRS may provide incremental predictive performance beyond conventional factors for T2DM, demonstrating the potential clinical utility of genetic markers for screening vulnerable populations. Clinical Trial Registration: The Henan Rural Cohort Study has been registered at the Chinese Clinical Trial Register (Registration number: ChiCTR-OOC-15006699). http://www.chictr.org.cn/showproj.aspx?proj=11375.
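
The two discrimination metrics named in the abstract, the change in AUC and the NRI, can be illustrated with a minimal sketch. This is not the authors' code. It assumes binary outcomes (ignoring follow-up time) and uses the category-free (continuous) form of the NRI, since the abstract does not state which NRI variant was used. The function names `auc` and `continuous_nri` and the toy risk values are hypothetical and are not study data.

```python
def auc(scores, labels):
    """Rank-based AUC: chance that a random case scores above a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def continuous_nri(old_risk, new_risk, labels):
    """Category-free NRI (an assumed variant, not confirmed by the source).

    Net share of cases whose predicted risk rises under the new model,
    plus net share of non-cases whose predicted risk falls.
    """
    up_case = down_case = up_ctrl = down_ctrl = 0
    n_case = n_ctrl = 0
    for old, new, y in zip(old_risk, new_risk, labels):
        if y == 1:
            n_case += 1
            up_case += new > old
            down_case += new < old
        else:
            n_ctrl += 1
            up_ctrl += new > old
            down_ctrl += new < old
    return (up_case - down_case) / n_case + (down_ctrl - up_ctrl) / n_ctrl


# Toy example (hypothetical values): 1 = developed T2DM, 0 = did not.
labels = [1, 1, 1, 0, 0, 0, 0]
risk_conventional = [0.60, 0.40, 0.30, 0.50, 0.20, 0.35, 0.10]  # without GRS
risk_with_grs = [0.70, 0.50, 0.25, 0.40, 0.20, 0.30, 0.15]      # with GRS

delta_auc = auc(risk_with_grs, labels) - auc(risk_conventional, labels)
nri = continuous_nri(risk_conventional, risk_with_grs, labels)
print(f"AUC change: {delta_auc:.3f}, NRI: {nri:.1%}")
```

The sketch also shows why the two metrics can diverge, as they do in the reported results: the AUC changes only when the ranking of cases against non-cases changes, whereas the continuous NRI counts every individual whose predicted risk moves in the right direction, so a small AUC gain can coexist with a large NRI.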