AUTHOR=Lau Clinton, Zhu Xiaodan, Chan Wai-Yip
TITLE=Automatic depression severity assessment with deep learning using parameter-efficient tuning
JOURNAL=Frontiers in Psychiatry
VOLUME=14
YEAR=2023
URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2023.1160291
DOI=10.3389/fpsyt.2023.1160291
ISSN=1664-0640
ABSTRACT=To assist mental health care providers with the assessment of depression, research to develop a standardized, accessible, and non-invasive technique has garnered considerable attention. Our study focuses on the application of deep learning models for automatic assessment of depression severity based on clinical interview transcriptions. Despite the recent success of deep learning, the lack of large-scale high-quality datasets is a major performance bottleneck for many mental health applications. In this paper, we address this data problem for depression assessment by presenting a novel approach that leverages recent advances in pretrained large language models and parameter-efficient tuning techniques. Our approach is built upon adapting a small set of tunable parameters, known as prefix vectors, to guide a pretrained model towards predicting the Patient Health Questionnaire (PHQ)-8 score of a person. While transfer learning through pretrained large language models can provide a good starting point for downstream learning, the introduction of prefix vectors can further adapt the pretrained models effectively to the depression assessment task by adjusting only a small number of parameters. Through experiments conducted on the widely used Distress Analysis Interview Corpus - Wizard of Oz (DAIC-WOZ) benchmark dataset, we demonstrate that a pretrained language model enhanced with prefix vectors outperforms previously published methods and achieves the best reported performance. Compared to conventionally fine-tuned baseline models, prefix-enhanced models are less prone to overfitting because they use far fewer training parameters (less than 6% as many). Moreover, we combine prefix-tuned embeddings with general-purpose sentence embeddings to further improve the predictive models. Our method achieves a new state-of-the-art result on the test set of DAIC-WOZ and outperforms models that utilize multiple types of data modalities, with a root mean square error of 4.67 and a mean absolute error of 3.80 on the PHQ-8 scale. The improvement is in part due to the fine-grained flexibility of the prefix vector size in adjusting the model's learning capacity.
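
The prefix-tuning setup summarized in the abstract can be prototyped with standard open-source tooling. The following is a minimal sketch, assuming the Hugging Face `transformers` and `peft` libraries; the `bert-base-uncased` backbone, the prefix length of 20 virtual tokens, the single-output regression head, and the dummy transcript snippet are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal sketch: prefix tuning a pretrained encoder to regress PHQ-8 scores.
# The backbone, prefix length, and other details below are illustrative
# assumptions, not the configuration reported in the paper.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PrefixTuningConfig, TaskType, get_peft_model

model_name = "bert-base-uncased"  # assumed backbone for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Single-output head treated as regression (PHQ-8 scores range from 0 to 24).
base_model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=1, problem_type="regression"
)

# Prefix tuning: only the prefix (virtual-token) parameters and the small
# regression head are trained; the pretrained backbone stays frozen.
peft_config = PrefixTuningConfig(
    task_type=TaskType.SEQ_CLS,
    num_virtual_tokens=20,  # prefix length controls the added learning capacity
)
model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()  # a small fraction of the full model

# Toy forward/backward pass on a dummy interview-transcript snippet.
batch = tokenizer(
    ["I have been feeling tired and unmotivated for weeks."],
    return_tensors="pt", truncation=True, padding=True,
)
target = torch.tensor([12.0])  # hypothetical PHQ-8 score for the snippet
pred = model(**batch).logits.squeeze(-1)  # predicted score, shape (batch,)
loss = torch.nn.functional.mse_loss(pred, target)
loss.backward()  # gradients flow only into the prefix and the head
```

In a sketch like this, varying `num_virtual_tokens` provides the fine-grained control over learning capacity that the abstract refers to, while the frozen backbone keeps the number of trainable parameters small relative to full fine-tuning.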