ORIGINAL RESEARCH article

Front. Psychiatry

Sec. Computational Psychiatry

Volume 16 - 2025 | doi: 10.3389/fpsyt.2025.1579543

This article is part of the Research Topic: Advancing Psychiatric Care through Computational Models: Diagnosis, Treatment, and Personalization

Personalized Prediction and Intervention for Adolescent Mental Health: Multimodal Temporal Modeling Using Transformer

Provisionally accepted
Guiyuan Zhang1*, Shuang Li2
  • 1Guangxi Vocational College of Water Resources and Electric Power, Nanning, China
  • 2Institute of Semiconductors, Chinese Academy of Sciences (CAS), Beijing, Beijing Municipality, China

The final, formatted version of the article will be published soon.

Adolescent mental health problems are becoming increasingly serious, making early prediction and personalized intervention important research topics. Existing methods are limited in handling complex emotional fluctuations and in fusing multimodal data. To address these limitations, this paper proposes a novel model, MPHI Trans, which combines multimodal data with temporal modeling techniques to capture dynamic changes in adolescent mental health status. MPHI Trans predicts an individual's emotions and psychological state and uses these predictions to provide tailored intervention recommendations, promoting personalized mental health management. In experiments on the DAIC-WOZ and WESAD datasets, MPHI Trans performed significantly better than advanced models such as BERT, T5, and XLNet. On DAIC-WOZ, it reached an accuracy of 89%, a recall of 84%, a precision of 85%, an F1 score of 84%, and an AUC-ROC of 92%; on WESAD, it reached an accuracy of 88%, a recall of 81%, a precision of 82%, an F1 score of 81%, and an AUC-ROC of 91%. Ablation experiments further verified the critical role of the temporal modeling and multimodal fusion modules: removing them significantly decreased model performance, demonstrating their indispensable role in capturing emotional fluctuations and fusing information.
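As a reading aid, the reported F1 scores can be checked against the reported precision and recall. The sketch below assumes that each F1 score is the harmonic mean of the single precision and recall figures given in the abstract; this holds for a binary or single-class setting but not necessarily for macro-averaged metrics, and the abstract does not state which was used. The function name and the `reported` dictionary are illustrative and do not come from the authors' code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the abstract, expressed as fractions.
reported = {
    "DAIC-WOZ": {"precision": 0.85, "recall": 0.84, "f1": 0.84},
    "WESAD": {"precision": 0.82, "recall": 0.81, "f1": 0.81},
}

for name, m in reported.items():
    implied = f1_from_precision_recall(m["precision"], m["recall"])
    print(f"{name}: implied F1 = {implied:.4f}, reported F1 = {m['f1']:.2f}")
```

Under this assumption, the implied F1 scores agree with the reported values when rounded to two decimal places.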

Keywords: mental health, personalized intervention, multimodal fusion, temporal modeling, emotion recognition, deep learning

Received: 25 Feb 2025; Accepted: 07 May 2025.

Copyright: © 2025 Zhang and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Guiyuan Zhang, Guangxi Vocational College of Water Resources and Electric Power, Nanning, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.