ORIGINAL RESEARCH article
Front. Psychiatry
Sec. Digital Mental Health
Volume 16 - 2025 | doi: 10.3389/fpsyt.2025.1584719
Local adaptation and validation of a transdiagnostic risk calculator for first episode psychosis using mental health patient records
Provisionally accepted
- 1Brighton and Sussex Medical School, Brighton, United Kingdom
- 2King's College London, London, England, United Kingdom
- 3Akrivia Health Ltd, Oxford, United Kingdom
- 4Sussex Partnership NHS Foundation Trust, Worthing, England, United Kingdom
- 5University of Sussex, Brighton, West Sussex, United Kingdom
Few at-risk adults are identified by specialised services prior to the development of a first episode of psychosis. A transdiagnostic risk calculator, predicting psychosis using electronic health record (EHR) data, was developed in London, UK to identify patients at risk, using structured data and 14 natural language processing (NLP)-derived symptom and substance use concepts. We report the adaptation and internal validation of this risk calculator in a Southeast England region. In a retrospective cohort study using EHR patient notes, we identified individuals accessing mental healthcare in Southeast England (Nov-1992 to Jan-2023) who received a primary diagnosis of a non-psychotic or non-organic mental disorder. We developed new machine-learning NLP algorithms for diagnosis, symptom and substance use concepts by fine-tuning existing open-source transformer models. Baseline and outcome coded diagnoses were supplemented with NLP-derived diagnosis data. Cox regression was used to predict psychosis and prior weights were applied; discrimination (Harrell's C) was assessed. Nearly all NLP concepts achieved an F1 score above 0.8 following development. In an analysis sample of 63,922 patients with complete data, the risk calculator had acceptable but lower accuracy in Southeast England (Harrell's C 0.71) compared to the London benchmark (Harrell's C 0.85). The risk calculator performed similarly in Southeast England as in other external validation studies, discriminating acceptably, suggesting that this calculator may be adapted successfully for new patient populations, services and geographic areas. Differences in accuracy may be due to different cultures of data capture, different NLP approaches, or differences in the patient cohort.
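For readers less familiar with the discrimination statistic reported above, the following is a minimal sketch of how Harrell's C can be computed from follow-up times, event indicators and predicted risk scores. It is an illustration only and is not the authors' code: the function name `harrell_c` and the toy data are hypothetical, and the sketch makes the simplifying assumption that a pair of patients is comparable only when the patient with the strictly shorter follow-up time experienced the event (pairs with tied times are ignored).

```python
from typing import Sequence


def harrell_c(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Simplified Harrell's C (hypothetical helper, for illustration).

    A pair (i, j) is comparable when patient i had the event and
    times[i] < times[j]. The pair is concordant when patient i also
    has the higher predicted risk; tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot be the earlier event in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data invented for illustration: four patients, one censored.
times = [2, 4, 6, 8]
events = [True, True, False, True]
risk_scores = [0.9, 0.7, 0.8, 0.1]
print(harrell_c(times, events, risk_scores))  # 0.8
```

In this toy example there are five comparable pairs, of which four are ordered correctly by the risk score, giving C = 0.8. A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering, which is the scale on which the reported values of 0.71 and 0.85 should be read.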
Keywords: psychosis, risk assessment, natural language processing, electronic health records, mental health care, at-risk mental state
Received: 27 Feb 2025; Accepted: 24 Jun 2025.
Copyright: © 2025 Ford, Stone, Oliver, Fell, Roque, Robertson, Fusar-Poli and Greenwood. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Elizabeth Ford, Brighton and Sussex Medical School, Brighton, United Kingdom
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.