AUTHOR=Kadra-Scalzo Giouliana, Chaturvedi Jaya, Dale Oliver, Hayes Richard D., Li Lifang, Mahmood Shaza, Monk-Cunliffe Jonathan, Roberts Angus, Moran Paul TITLE=Recovery in personality disorders: the development and preliminary testing of a novel natural language processing model to identify recovery in mental health electronic records JOURNAL=Frontiers in Digital Health VOLUME=Volume 7 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/digital-health/articles/10.3389/fdgth.2025.1544781 DOI=10.3389/fdgth.2025.1544781 ISSN=2673-253X ABSTRACT=Introduction: The concept of recovery is of great importance in mental health as it emphasizes improvements in quality of life and functioning alongside the traditional focus on symptomatic remission. Yet, investigating non-symptomatic recovery in the field of personality disorders has been particularly challenging due to complexities in capturing the occurrence of recovery. Electronic health records (EHRs) provide a robust platform from which episodes of recovery can be detected. However, much of the relevant information may be embedded in free-text clinical notes, requiring the development of appropriate tools to extract these data. Methods: Using data from one of Europe's largest electronic health records databases [the Clinical Records Interactive Search (CRIS)], we developed and evaluated natural language processing (NLP) models for the identification of occupational and activities of daily living (ADL) recovery among individuals diagnosed with personality disorder. Results: The models on ADL performed better (precision: 0.80; 95% CI: 0.73–0.84) than those on occupational recovery (precision: 0.62; 95% CI: 0.52–0.72).
However, the models were less successful at identifying everyone who had recovered, generally missing at least 50% of those who had recovered. Conclusion: It is feasible to develop NLP models for the identification of recovery domains for individuals with a diagnosis of personality disorder. Future research needs to improve the efficiency of pre-processing strategies to handle long clinical documents.
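
The abstract reports precision with a 95% confidence interval and notes that the models missed at least half of those who had recovered, which corresponds to a recall of 0.5 or lower. The sketch below illustrates how such figures can be derived from annotation counts. It is illustrative only: the counts are hypothetical and not taken from the study, and the Wilson score interval is an assumed choice, since the abstract does not state how its intervals were computed.

```python
import math


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts, chosen only for illustration (NOT the study's data):
# 80 correctly flagged recoveries, 20 false alarms, 90 recoveries the model missed.
tp, fp, fn = 80, 20, 90

precision, recall = precision_recall(tp, fp, fn)
low, high = wilson_interval(tp, tp + fp)  # interval around the precision estimate

print(f"precision = {precision:.2f} (95% CI: {low:.2f}-{high:.2f})")
print(f"recall    = {recall:.2f}")
```

With counts like these, a model can be right about most of the cases it flags while still overlooking more than half of the true recoveries, which is the pattern the abstract describes.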