AUTHOR=Li Mengting , Lu Xiangyu , Yang HengBo , Yuan Rong , Yang Yong , Tong Rongsheng , Wu Xingwei TITLE=Development and assessment of novel machine learning models to predict medication non-adherence risks in type 2 diabetics JOURNAL=Frontiers in Public Health VOLUME=Volume 10 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2022.1000622 DOI=10.3389/fpubh.2022.1000622 ISSN=2296-2565 ABSTRACT=Background Medication adherence is the main determinant of effective management of type 2 diabetes (T2D), yet there is no gold standard method for screening patients at high risk of nonadherence. Developing machine learning models to predict high-risk nonadherence in patients with T2D could optimize management. Methods This cross-sectional study was carried out in patients with T2D at the Sichuan Provincial People’s Hospital from April 2018 to December 2019, who were examined for HbA1c on the day of the survey. Demographic and clinical characteristics were extracted from the questionnaire and electronic medical records. After data preprocessing, the sample was randomly divided into a training dataset and a test dataset at a ratio of 8:2. Four imputation methods, five sampling methods, three feature screening methods, and 18 machine learning algorithms were used to prepare the data and to develop and validate models. Bootstrapping was performed to generate the validation set for external validation and univariate analysis. Models were compared on the basis of predictive performance metrics. Finally, we validated the sample size using the best model. Results The study included 980 patients with T2D, of whom 184 (18.8%) were classified as nonadherent to medication. The model that used Modified Random Forest as the imputation method, Random Under Sampler as the sampling method, Boruta as the feature screening method, and an ensemble algorithm had the best performance. 
The AUC, F1 score, and AUPRC of the best model, among a total of 1080 trained models, were 0.8369, 0.7912, and 0.9574, respectively. Age, current FBG values, current HbA1c values, current RBG values, and BMI were the strongest contributors to the risk of medication nonadherence. Conclusions We found that machine learning methods could be used to predict the risk of nonadherence in patients with T2D. The proposed model performed well in identifying T2D patients with nonadherence and could help improve individualized T2D management.
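
The figures in the abstract can be cross-checked with a short Python sketch. It is illustrative only and is not the authors' code: the method names are placeholder labels (the abstract gives only the number of methods in each group), and `random_undersample` is a hypothetical helper showing the general idea of random undersampling, applied here to the whole cohort rather than to the training split the study would have used.

```python
from itertools import product
import random

# Placeholder labels: the abstract states how many methods were tried,
# not the full list of names, so these identifiers are hypothetical.
imputers = [f"imputer_{i}" for i in range(1, 5)]        # 4 imputation methods
samplers = [f"sampler_{i}" for i in range(1, 6)]        # 5 sampling methods
screeners = [f"screener_{i}" for i in range(1, 4)]      # 3 feature screening methods
algorithms = [f"algorithm_{i}" for i in range(1, 19)]   # 18 learning algorithms

# One model per combination of imputer, sampler, screener, and algorithm.
pipelines = list(product(imputers, samplers, screeners, algorithms))
n_pipelines = len(pipelines)

# Cohort figures reported in the abstract.
n_patients = 980
n_nonadherent = 184
prevalence = n_nonadherent / n_patients


def random_undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset.

    All minority-class rows are kept; majority-class rows are dropped
    at random until both classes have the same size.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept_majority = rng.sample(majority, len(minority))
    return sorted(minority + kept_majority)


# 1 = nonadherent, 0 = adherent; the ordering is arbitrary for this sketch.
labels = [1] * n_nonadherent + [0] * (n_patients - n_nonadherent)
balanced_idx = random_undersample(labels)

print(n_pipelines, round(prevalence * 100, 1), len(balanced_idx))
```

The grid size, 4 × 5 × 3 × 18, matches the 1080 trained models reported, and 184 of 980 rounds to the stated 18.8%. With a minority class this small, undersampling discards most adherent patients, which is one reason the study compares several sampling methods instead of fixing one in advance.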