AUTHOR=Dutta Rina , Gkotsis George , Velupillai Sumithra U. , Downs Johnny , Roberts Angus , Stewart Robert , Hotopf Matthew TITLE=Identifying features of risk periods for suicide attempts using document frequency and language use in electronic health records JOURNAL=Frontiers in Psychiatry VOLUME=Volume 14 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2023.1217649 DOI=10.3389/fpsyt.2023.1217649 ISSN=1664-0640 ABSTRACT=Background: Currently, suicide risk assessment is based on a range of extensively reported stable risk factors, but critical to dynamic risk assessment is an understanding of each individual patient’s health trajectory over time. The use of electronic health records (EHRs) and analysis using machine learning has the potential to accelerate progress in developing early warning indicators. Objectives: To (i) investigate whether the rate at which EHR documents are recorded per patient is associated with a suicide attempt; (ii) compare document-level word usage between documents proximal and distal to a suicide attempt; (iii) compare n-gram frequency related to third-person pronoun use proximal and distal to a suicide attempt using machine learning. Findings: n=8,247 patients were identified as having made a hospitalised suicide attempt. Of these, n=3,167 (39.8%) had at least one document available in their EHR prior to their first suicide attempt. n=1,424 (45.0%) of these patients had been ‘monitored’ by mental healthcare services in the past 30 days. From 60 days prior to a first suicide attempt, the monitoring level (documents recorded in the past 30 days) rose rapidly from 35.1% to 45.0%. 
Documents containing words related to prescribed medications / drugs / overdose / poisoning / addiction had the highest odds of being a risk indicator used proximal to a suicide attempt (OR 1.88; precision 0.91, recall 0.93), and documents with words citing a care plan were associated with the lowest risk for a suicide attempt (OR 0.22; precision 1.00, recall 1.00). Function words, word sequence and pronouns were most common in all three representations (uni-, bi- and trigram). Conclusion: EHR documentation frequency and language use can be used to distinguish periods distal from and proximal to a suicide attempt. However, in our study 55.0% of patients with documentation prior to their first suicide attempt did not have a record in the preceding 30 days, meaning that a high number of patients are not seen at their most vulnerable point.