AUTHOR=Chapman Alec B. , Cordasco Kristina , Chassman Stephanie , Panadero Talia , Agans Dylan , Jackson Nicholas , Clair Kimberly , Nelson Richard , Montgomery Ann Elizabeth , Tsai Jack , Finley Erin , Gabrielian Sonya TITLE=Assessing longitudinal housing status using Electronic Health Record data: a comparison of natural language processing, structured data, and patient-reported history JOURNAL=Frontiers in Artificial Intelligence VOLUME=Volume 6 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2023.1187501 DOI=10.3389/frai.2023.1187501 ISSN=2624-8212 ABSTRACT=Measuring long-term housing outcomes is important for evaluating the impacts of services for individuals with homeless experiences. However, assessing housing stability longitudinally requires follow-up and repeated measures of an individual’s housing status, which are challenging and costly to collect. The Veterans Affairs (VA) Electronic Health Record (EHR) provides detailed data for a large population of patients with homeless experiences and contains several indicators of housing instability, including structured data elements (e.g., ICD-10 codes) and free-text clinical narratives. However, the validity of each of these data elements for measuring housing stability over time is not well studied. To fill this gap, we compared VA EHR indicators of housing instability, including information extracted from clinical notes using natural language processing (NLP), with patient-reported housing outcomes in a cohort of homeless-experienced Veterans. Our findings show that NLP can increase sensitivity for identifying unstable housing among homeless-experienced VA patients, and that combining NLP with structured data elements can further improve validity. Evaluation efforts and research studies assessing longitudinal housing outcomes should incorporate multiple sources of documentation to achieve optimal performance.