ORIGINAL RESEARCH article

Front. Psychiatry

Sec. Digital Mental Health

Volume 16 - 2025 | doi: 10.3389/fpsyt.2025.1584719

Local adaptation and validation of a transdiagnostic risk calculator for first episode psychosis using mental health patient records

Provisionally accepted
  • 1Brighton and Sussex Medical School, Brighton, United Kingdom
  • 2King's College London, London, England, United Kingdom
  • 3Akrivia Health Ltd, Oxford, United Kingdom
  • 4Sussex Partnership NHS Foundation Trust, Worthing, England, United Kingdom
  • 5University of Sussex, Brighton, West Sussex, United Kingdom

The final, formatted version of the article will be published soon.

Few at-risk adults are identified by specialised services before they develop a first episode of psychosis. A transdiagnostic risk calculator that predicts psychosis from electronic health record (EHR) data was developed in London, UK, to identify patients at risk, using structured data and 14 natural language processing (NLP)-derived symptom and substance use concepts. We report the adaptation and internal validation of this risk calculator in a Southeast England region.

In a retrospective cohort study using EHR patient notes, we identified individuals accessing mental healthcare in Southeast England (Nov-1992 to Jan-2023) who received a primary diagnosis of a non-psychotic or non-organic mental disorder. We developed new machine-learning NLP algorithms for diagnosis, symptom and substance use concepts by fine-tuning existing open-source transformer models. Baseline and outcome coded diagnoses were supplemented with NLP-derived diagnosis data. Cox regression was used to predict psychosis and prior weights were applied; discrimination (Harrell's C) was assessed.

Following development, nearly all NLP concepts achieved an F1 score above 0.8. In an analysis sample of 63,922 patients with complete data, the risk calculator had acceptable but lower discrimination in Southeast England (Harrell's C 0.71) than in the London benchmark (Harrell's C 0.85).

The risk calculator performed similarly in Southeast England as in other external validation studies, discriminating acceptably. This suggests that the calculator may be adapted successfully for new patient populations, services and geographic areas. Differences in accuracy may be due to different cultures of data capture, different NLP approaches, or differences in the patient cohort.
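For readers unfamiliar with the discrimination metric reported above, the following is a minimal sketch of how Harrell's C can be computed from follow-up times, event indicators and predicted risk scores. It is an illustration only and not the authors' implementation: the function name `harrell_c` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from typing import Sequence


def harrell_c(times: Sequence[float],
              events: Sequence[bool],
              risk_scores: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored data (illustrative).

    A pair (i, j) is comparable when patient i had the event and a strictly
    shorter follow-up time than patient j. The pair is concordant when the
    model gave patient i the higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; C is undefined")
    return concordant / comparable


# Toy data (hypothetical, not from the study): four patients,
# the third of whom is censored at time 6.
times = [2, 4, 6, 8]
events = [True, True, False, True]
risk_scores = [0.9, 0.3, 0.5, 0.1]

c_index = harrell_c(times, events, risk_scores)
print(round(c_index, 2))  # 0.8: four of the five comparable pairs are concordant
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.71 and 0.85.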

Keywords: psychosis, risk assessment, natural language processing, electronic health records, mental health care, at-risk mental state

Received: 27 Feb 2025; Accepted: 24 Jun 2025.

Copyright: © 2025 Ford, Stone, Oliver, Fell, Roque, Robertson, Fusar-Poli and Greenwood. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Elizabeth Ford, Brighton and Sussex Medical School, Brighton, United Kingdom
