AUTHOR=Zhou Zhaohe, Luo Dan, Yang Bing Xiang, Liu Zhongchun TITLE=Machine Learning-Based Prediction Models for Depression Symptoms Among Chinese Healthcare Workers During the Early COVID-19 Outbreak in 2020: A Cross-Sectional Study JOURNAL=Frontiers in Psychiatry VOLUME=Volume 13 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2022.876995 DOI=10.3389/fpsyt.2022.876995 ISSN=1664-0640 ABSTRACT=Background: COVID-19-related depression symptoms among health care workers have received worldwide recognition. Although many studies have identified risk exposures associated with depression symptoms among health care workers, few have focused on predictive models built with machine learning methods. Governments and organizations are concerned about the need for immediate interventions and alert systems for health care workers who are mentally at risk. This study aims to develop and validate machine learning-based models for predicting depression symptoms using survey data collected during the COVID-19 outbreak in China. Method: Surveys were conducted among 2,574 health care workers in hospitals designated to care for COVID-19 patients. The outcome measure was a score of >=5 on the Patient Health Questionnaire. Descriptive statistics were used to describe the data. Four machine learning approaches (decision tree, logistic regression with least absolute shrinkage and selection operator (LASSO), random forest, and gradient-boosting tree) were developed (75% of data) and validated (25% of data) using cross-validation with 100 repetitions to identify important predictors for depression symptoms. Finally, all models were compared to evaluate their predictive performance and screening utility.
Results: Important risk predictors identified and ranked by the machine learning models were highly consistent: self-perceived health status factors consistently ranked within the top five most important predictors, followed by worry about infection, working on the frontline, a very high level of uncertainty, having received any form of psychological support material, and having COVID-like symptoms. The C-statistics [95% CI] of the machine learning models were as follows: LASSO model, 0.824 [0.792-0.856]; random forest, 0.828 [0.797-0.859]; gradient-boosting tree, 0.829 [0.798-0.861]; and decision tree, 0.785 [0.752-0.819]. The calibration plot indicated that the LASSO model, random forest, and gradient-boosting tree fit the data well. Decision curve analysis showed that all models obtained net benefits for predicting depression symptoms. Conclusions: This study shows that machine learning prediction models are suitable for making predictions about mentally at-risk health care workers in a public health emergency setting. The application of multidimensional machine learning models could support hospitals’ and health care workers’ decision-making on possible psychological interventions and proper mental health management.
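Illustrative note: The two evaluation quantities named in the Results, the C-statistic and the net benefit used in decision curve analysis, can be sketched in a few lines. This is a minimal illustration of the standard definitions, not the authors' code. The function names, the toy labels and scores, and the threshold of 0.5 are all assumptions made for the example and do not come from the study data.

```python
from itertools import product


def c_statistic(y_true, y_score):
    """Concordance: share of (case, non-case) pairs in which the case
    receives the higher predicted score; tied pairs count as 0.5."""
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    non_cases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c, n in product(cases, non_cases)
    )
    return concordant / (len(cases) * len(non_cases))


def net_benefit(y_true, y_score, threshold):
    """Net benefit at threshold probability pt:
    TP/n - FP/n * pt / (1 - pt), flagging anyone with score >= pt."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data (1 = depression symptoms present), not study data.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

print(c_statistic(y_true, y_score))       # 15 of 16 pairs are concordant
print(net_benefit(y_true, y_score, 0.5))  # model-based screening
print(net_benefit(y_true, [1.0] * 8, 0.5))  # "treat all" reference strategy
```

In a decision curve, the model's net benefit is plotted across a range of thresholds and compared with the "treat all" and "treat none" (net benefit 0) strategies; a model is useful at a given threshold when its curve lies above both.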