AUTHOR=Jiang Liangjun, Yang Zerui, Liu Gang, Xia Zhenhua, Yang Guangyao, Gong Haimei, Wang Jing, Wang Lei
TITLE=A feature optimization study based on a diabetes risk questionnaire
JOURNAL=Frontiers in Public Health
VOLUME=Volume 12 - 2024
YEAR=2024
URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2024.1328353
DOI=10.3389/fpubh.2024.1328353
ISSN=2296-2565
ABSTRACT=The prevalence of diabetes, a common chronic disease, has risen gradually in recent years, placing a substantial burden on both society and individuals. To make diabetes risk prediction questionnaires more effective, optimize the selection of characteristic variables, and raise residents' awareness of diabetes risk, this study uses survey data from the risk factor monitoring system of the Centers for Disease Control and Prevention in the United States. Univariate analysis and careful screening yielded a more refined dataset. The dataset was then preprocessed through data distribution standardization, class balancing with the Synthetic Minority Oversampling Technique (SMOTE) combined with the Round function, and data standardization. Machine learning (ML) techniques were then applied to enumerated feature variables to evaluate the strength of the correlation among diabetes risk factors. The findings rank the characteristic variables that significantly influence diabetes risk. With the features optimized in this way, the risk prediction model became more streamlined and efficient. These results offer guidance for simplifying the characteristic variables used in diabetes risk questionnaires.
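
The balancing step named in the abstract (SMOTE combined with the Round function, followed by standardization) can be illustrated with a minimal sketch. This is not the authors' code: the function names `smote_round` and `standardize`, the toy rows, the neighbour count `k`, and the half-up rounding rule are all assumptions made for illustration. The idea shown is that SMOTE interpolates between minority-class rows, which yields fractional values, and rounding snaps them back to the integer codes a questionnaire item would use.

```python
import math
import random


def smote_round(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by SMOTE-style interpolation,
    then round every feature to the nearest integer (assumed rounding rule)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base row (Euclidean distance).
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        point = [b + gap * (n - b) for b, n in zip(base, neighbour)]
        # Rounding step: half-up rounding back to integer answer codes.
        synthetic.append([int(math.floor(v + 0.5)) for v in point])
    return synthetic


def standardize(rows):
    """Z-score each column (population std); constant columns map to 0."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
        for c, m in zip(cols, means)
    ]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical toy rows: [high blood pressure (0/1), BMI, general health (1-5)].
majority = [[0, 24, 2], [0, 27, 1], [1, 30, 3], [0, 22, 2], [0, 26, 2], [1, 29, 3]]
minority = [[1, 33, 4], [1, 36, 5], [0, 31, 3]]

# Oversample the minority class until both classes have the same size.
new_rows = smote_round(minority, n_new=len(majority) - len(minority))
balanced = majority + minority + new_rows
scaled = standardize(balanced)
```

Because each synthetic row lies between two real minority rows before rounding, its rounded values stay inside the observed range of every feature. A production pipeline would more likely rely on an established SMOTE implementation and round its output afterwards; the pure-Python version here only makes the mechanics visible.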