AUTHOR=Bird Matthew B., Roach Megan H., Nelson Roberts G., Helton Matthew S., Mauntel Timothy C. TITLE=A machine learning framework to classify musculoskeletal injury risk groups in military service members JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 7 - 2024 YEAR=2024 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1420210 DOI=10.3389/frai.2024.1420210 ISSN=2624-8212 ABSTRACT=Background Musculoskeletal injuries (MSKIs) are endemic in military populations. Thus, it is essential to identify and mitigate MSKI risks. Time-to-event machine learning models utilizing self-reported questionnaires or existing data (e.g., electronic health records) may aid in creating efficient risk screening tools. Methods 4,222 U.S. Army Service members completed a self-report MSKI risk screen as part of their unit's standard in-processing. Additionally, participants' MSKI and demographic data were abstracted from electronic health records. Survival machine learning models (Cox proportional hazards regression (COX), COX with splines, conditional inference trees, and random forest) were used to develop a predictive model on the training data (75%; n = 2,963) for MSKI risk over varying time horizons (30, 90, 180, and 365 days) and were evaluated on the testing data (25%; n = 987). The predicted probability of risk (0.00 to 1.00) from the final model was used to stratify Service members into quartiles based on MSKI risk. Results The COX model demonstrated the best performance across the time horizons. Time-dependent area under the curve ranged from 0.73 at 30 days to 0.70 at 180 days, and the index of prediction accuracy (IPA) was 12% better at 180 days than that of the null model (0 variables). Within the COX model, "other" race, more self-reported pain items during the movement screens, female gender, and prior MSKI demonstrated the largest hazard ratios.
When the predicted probability was binned into quartiles, at 180 days the highest-risk bin had an MSKI incidence rate of 2,130.82 ± 171.15 per 1,000 person-years and an incidence rate ratio of 4.74 (95% confidence interval: 3.44, 6.54) compared to the lowest-risk bin. Conclusions Self-reported questionnaires and existing data can be used to create a machine learning algorithm to identify Service members' MSKI risk profiles. Further research should develop more granular Service member-specific MSKI screening tools and create MSKI risk mitigation strategies based on these screenings.
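The quartile stratification and rate comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the 365.25-day year, and the Wald-type confidence interval on the log scale are all assumptions made for illustration; the abstract does not state how the interval was computed.

```python
import math
from statistics import quantiles


def assign_quartile(risks):
    """Label each predicted risk 1 (lowest quartile) to 4 (highest quartile)."""
    q1, q2, q3 = quantiles(risks, n=4)  # three cut points between the quartiles
    return [1 + (r > q1) + (r > q2) + (r > q3) for r in risks]


def incidence_rate_per_1000_py(events, person_days):
    """Events per 1,000 person-years, assuming follow-up is recorded in days."""
    person_years = person_days / 365.25  # assumed day-to-year conversion
    return 1000.0 * events / person_years


def incidence_rate_ratio(events_a, py_a, events_b, py_b, z=1.96):
    """Rate ratio of group A to group B with an assumed Wald interval on the log scale."""
    irr = (events_a / py_a) / (events_b / py_b)
    se_log = math.sqrt(1.0 / events_a + 1.0 / events_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


if __name__ == "__main__":
    # Toy values only; these are not data from the study.
    toy_risks = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
    print(assign_quartile(toy_risks))
    print(incidence_rate_per_1000_py(events=10, person_days=3652.5))
    print(incidence_rate_ratio(events_a=40, py_a=20.0, events_b=10, py_b=20.0))
```

In this sketch the highest-risk quartile would be passed as group A and the lowest-risk quartile as group B, mirroring the comparison reported at 180 days.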