STUDY PROTOCOL article

Front. Psychiatry

Sec. Computational Psychiatry

This article is part of the Research Topic: Machine Learning Algorithms and Software Tools for Early Detection and Prognosis of Schizophrenia

Speech Analytics across the Schizophrenia Spectrum Disorders: Multimodal Natural Language Processing and Machine Learning Modelling in a Chinese-Speaking Population

Provisionally accepted
Jiaqi Liu, Sumiao Zhou, Guangxing Deng, Meng Ji, Xufei Zhu, Xue He, Qijie Kuang*, Shenglin She*
  • Guangzhou Brain Hospital, Guangzhou Medical University, Guangzhou, China

The final, formatted version of the article will be published soon.

Background: Formal thought disorder (FTD) is a core symptom of schizophrenia spectrum disorders (SSDs). Speech is a key dimension through which FTD is expressed, and previous studies have shown that speech features hold potential as diagnostic biomarkers for SSD. However, relevant research remains limited, and such speech features have not yet been applied clinically for SSD diagnosis.

Objective: This research aims to establish a Chinese speech database for multidimensional analysis of speech characteristics, to quantify these high-dimensional linguistic features using natural language processing (NLP), and ultimately to develop objective biomarkers for diagnosing SSD and assessing its severity.

Methods: This will be a single-centre, prospective, observational study. A total of 300 inpatients or outpatients who meet the DSM-5 diagnostic criteria for SSD will be enrolled. Healthy controls with no history of intellectual disability will subsequently be matched. Each participant will undergo a 1-to-2-hour task-guided interview conducted by a psychiatrist, which includes an app-based assessment with the Positive and Negative Syndrome Scale (PANSS), short passage reading, an animal fluency test, a pseudosentence reading task, a symptom severity rating task, an inner-world expression task, and a picture description task. All interviews will be audio-recorded. After the interview, clinical rating scales will be used to assess psychiatric symptom severity, social functioning, and thought-language disorders. During the study, at an interval of 2 weeks.

Discussion: By quantifying these speech characteristics across multiple dimensions and integrating machine learning, this study aims to identify highly discriminative combinations of speech features specific to SSD, thereby providing technical and theoretical support for the precise diagnosis and personalized intervention of SSD. The findings are expected to deepen psychiatrists' understanding of the linguistic pathological mechanisms underlying SSD and to promote the development of diagnostic tools and intervention protocols based on novel biomarkers.
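As an illustration of how one of the interview tasks listed in the Methods could yield quantitative features, the following minimal Python sketch scores an animal fluency transcript. It is not the authors' pipeline: the function name `score_fluency`, the three output measures, the reference word set, and the toy English responses are all hypothetical, and the sketch assumes the recording has already been transcribed and split into one response per list item.

```python
from collections import Counter


def score_fluency(responses, valid_items):
    """Score a category fluency transcript (hypothetical scoring scheme).

    responses   -- list of transcribed responses, one item per entry
    valid_items -- set of lowercase words accepted as category members
    """
    # Normalise case and whitespace, dropping empty entries.
    counts = Counter(r.strip().lower() for r in responses if r.strip())

    # Distinct responses that belong to the category.
    correct = sorted(w for w in counts if w in valid_items)
    # Distinct responses that fall outside the category.
    intrusions = sorted(w for w in counts if w not in valid_items)
    # Extra occurrences of a valid item after its first mention.
    repetitions = sum(n - 1 for w, n in counts.items() if w in valid_items)

    return {
        "unique_correct": len(correct),
        "repetitions": repetitions,
        "intrusions": intrusions,
    }


# Toy, made-up input purely for illustration (not study data).
ANIMALS = {"dog", "cat", "horse", "tiger"}
result = score_fluency(["dog", "cat", "dog", "table", "Horse"], ANIMALS)
print(result)
```

In a real setting the reference word set would need to be a Chinese animal lexicon, and transcription and word segmentation would have to happen before this step. Those components are outside the scope of this sketch.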

Keywords: formal thought disorder, machine learning, natural language processing, schizophrenia spectrum disorders, speech

Received: 15 Oct 2025; Accepted: 08 Dec 2025.

Copyright: © 2025 Liu, Zhou, Deng, Ji, Zhu, He, Kuang and She. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Qijie Kuang
Shenglin She

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.