ORIGINAL RESEARCH article
Front. Artif. Intell.
Sec. Medicine and Public Health
Constructing a risk screen for attention difficulty in U.S. adults using six machine learning methods
Provisionally accepted
Shenzhen Hospital, Peking University, Shenzhen, China
Background: Concentration difficulty is recognized as a hallmark of various neurologic and neuropsychiatric disorders. However, accurate estimation of the epidemiological risk factors for concentration difficulty is still severely limited. Aims: The aim of this research was to develop an interpretable machine-learning (ML) model to predict the risk of concentration difficulty among US adults. Methods: A total of 9971 participants were drawn from the 2015–2016 cycle of the National Health and Nutrition Examination Survey (NHANES). Six ML algorithms were applied: Logistic Regression, ExtraTrees classifier, Bagging, Gradient Boosting, Extreme Gradient Boosting (XGBoost), and Random Forest (RF). Model performance was evaluated by the area under the receiver operating characteristic curve (AUC), accuracy, precision, specificity, decision curve analysis (DCA), and calibration plots. Finally, we built a nomogram based on the best-performing model. Results: A total of 2146 participants aged 20 years and older were involved in this study. Logistic Regression exhibited the best clinical predictive value in the internal and external validation sets, with AUCs of 0.881 and 0.818, respectively. The DCA showed that Logistic Regression had the largest net benefit in the internal cohort, while the RF model had the largest net benefit in the external cohort (threshold: 0.2–0.3). Conclusions: Our results demonstrated that the Logistic Regression model had the best clinical value in predicting concentration difficulty. Our findings may provide insight for the recognition, management, and effective intervention of concentration difficulty.
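Two of the evaluation metrics named in the Methods, AUC and the net benefit underlying a decision curve, can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `net_benefit` are hypothetical, the labels and predicted probabilities are invented toy values rather than NHANES data, and the net-benefit expression is the standard one, TP/n − (FP/n) · pt/(1 − pt), which is assumed here rather than confirmed by the abstract.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_score, threshold):
    """Net benefit at one threshold probability (assumed standard formula):
    true positives per person minus false positives per person, the latter
    weighted by the odds of the threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Toy data, invented for illustration only (not study data).
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.25, 0.15]

print("AUC:", roc_auc(y_true, y_score))
for pt in (0.2, 0.25, 0.3):  # the threshold range quoted in the Results
    print("net benefit at", pt, "=", round(net_benefit(y_true, y_score, pt), 4))
```

A decision curve plots this net benefit across a range of thresholds for each model, alongside the "treat all" and "treat none" strategies, which is how one model can lead in one cohort and another model in a different cohort over the same threshold range.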
Keywords: machine learning, NHANES, concentration difficulty, neuropsychiatric disorders, Logistic regression
Received: 16 Sep 2025; Accepted: 25 Nov 2025.
Copyright: © 2025 Yi, Song, Sun and Guo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Li Yi
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
