AUTHOR=Shin Daun, Kim Kyungdo, Lee Seung-Bo, Lee Changwoo, Bae Ye Seul, Cho Won Ik, Kim Min Ji, Hyung Keun Park C., Chie Eui Kyu, Kim Nam Soo, Ahn Yong Min

TITLE=Detection of Depression and Suicide Risk Based on Text From Clinical Interviews Using Machine Learning: Possibility of a New Objective Diagnostic Marker

JOURNAL=Frontiers in Psychiatry

VOLUME=Volume 13 - 2022

YEAR=2022

URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2022.801301

DOI=10.3389/fpsyt.2022.801301

ISSN=1664-0640

ABSTRACT=Background: Depression and suicide are critical social problems worldwide, but objective tools for diagnosing them are lacking. This study therefore aimed to diagnose depression with machine learning and to determine whether individuals at high risk of suicide can be identified from the text of what participants said in a semi-structured interview.
Methods: A total of 83 normal controls and 83 patients with depression were recruited. All participants were recorded during the Mini-International Neuropsychiatric Interview, and, based on the interview's suicide risk assessment items, the 83 patients with depression were classified into a high-suicide-risk group (n = 31) and a low-suicide-risk group (n = 52). The portions of each recording spoken by the participant were extracted and transcribed into text. In addition, all participants were evaluated for depression, anxiety, suicidal ideation, and impulsivity. The chi-square test and Student's t-test were used to compare clinical variables, and a Naive Bayes classifier was used as the machine learning model for the text.
Results: A total of 21,376 words were extracted from all participants. The text-based model for diagnosing depression achieved an area under the curve (AUC) of 0.905, a sensitivity of 0.699, and a specificity of 0.964. A model that distinguished the two groups using the demographic variables that differed significantly between them achieved an AUC of only 0.761, and the DeLong test confirmed that the text-based classification was superior to the demographic model (p = 0.001). For predicting the high-suicide-risk group, the demographics-based AUC was 0.499 and the text-based AUC was 0.632, whereas an ensemble model combining text with demographic variables achieved an AUC of 0.800.
Conclusions: The results support the feasibility of diagnosing depression from interview text. For suicide risk, diagnostic accuracy increased when the text was combined with demographic variables. The words participants speak in an interview therefore show significant potential as an objective diagnostic marker when analyzed with machine learning.
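
The abstract names a Naive Bayes classifier for the interview text and reports sensitivity and specificity, but gives no implementation details. The following is a minimal sketch of that general technique, not the authors' pipeline: it assumes a multinomial model with Laplace smoothing and whitespace tokenization, and the function names, the `alpha` parameter, and the toy sentences are all hypothetical illustrations rather than study data.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on whitespace-tokenized texts.

    alpha is the Laplace smoothing constant (an assumption of this sketch).
    """
    vocab = {w for d in docs for w in d.split()}
    class_sizes = Counter(labels)
    word_counts = {c: Counter() for c in class_sizes}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())

    log_prior = {c: math.log(n / len(docs)) for c, n in class_sizes.items()}
    log_lik = {}
    for c, counts in word_counts.items():
        denom = sum(counts.values()) + alpha * len(vocab)
        log_lik[c] = {w: math.log((counts[w] + alpha) / denom) for w in vocab}
    return {"vocab": vocab, "log_prior": log_prior, "log_lik": log_lik}


def predict(model, text):
    """Return the class with the highest posterior log-score for the text."""
    scores = {}
    for c, prior in model["log_prior"].items():
        score = prior
        for w in text.split():
            if w in model["vocab"]:  # words never seen in training are skipped
                score += model["log_lik"][c][w]
        scores[c] = score
    return max(scores, key=scores.get)


def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy corpus, for illustration only (not data from the study).
toy_docs = [
    "i feel tired and hopeless",
    "i cannot sleep and feel sad",
    "i enjoy work and sleep well",
    "i feel fine and happy",
]
toy_labels = ["depressed", "depressed", "control", "control"]
toy_model = train_naive_bayes(toy_docs, toy_labels)
```

With this toy model, `predict(toy_model, "sad and hopeless")` returns `"depressed"` and `predict(toy_model, "happy and fine")` returns `"control"`. The AUC values in the abstract would additionally require ranking participants by the posterior score instead of taking the top class, which this sketch does not do.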