Identifying clinical phenotypes of frontotemporal dementia in post-9/11 era veterans using natural language processing

Introduction Frontotemporal dementia (FTD) encompasses a clinically and pathologically diverse group of neurodegenerative disorders, yet little work has quantified the unique phenotypic clinical presentations of FTD among post-9/11 era veterans. We aimed to identify phenotypes of FTD using natural language processing (NLP)-aided medical chart reviews of post-9/11 era U.S. military veterans diagnosed with FTD in Veterans Health Administration care. Methods A medical record chart review of clinician/provider notes was conducted using an NLP tool, which extracted features related to cognitive dysfunction. NLP features were further organized into seven Research Domain Criteria Initiative (RDoC) domains, which were clustered to identify distinct phenotypes. Results Veterans with FTD were more likely to have notes that reflected the RDoC domains, with cognitive and positive valence domains showing the greatest difference across groups. Clustering of domains identified three symptom phenotypes, agnostic to the time since an individual's FTD diagnosis, categorized as Low (16.4%), Moderate (69.2%), and High (14.5%) distress. Comparison across distress groups showed significant differences in physical and psychological characteristics, particularly prior history of head injury, insomnia, cardiac issues, anxiety, and alcohol misuse. The clustering result within the FTD group demonstrated a phenotype variant that exhibited a combination of language and behavioral symptoms. This phenotype presented with manifestations indicative of both language-related impairments and behavioral changes, showcasing the coexistence of features from both domains within the same individual. Discussion This study suggests FTD also presents across a continuum of severity and symptom distress, both within and across variants. The intensity of distress evident in clinical notes tends to cluster with more co-occurring conditions.
This examination of phenotypic heterogeneity in clinical notes indicates that sensitivity to FTD diagnosis may be correlated to overall symptom distress, and future work incorporating NLP and phenotyping may help promote strategies for early detection of FTD.


Introduction
Frontotemporal dementia (FTD) is a type of dementia that primarily affects the frontal and temporal lobes of the brain, leading to the progressive deterioration of behavior, personality, language abilities, and executive function (1,2). After Alzheimer's disease (AD), FTD is the second most common cause of early-onset dementia (3). Unlike Alzheimer's, which generally affects older individuals, FTD typically strikes at a younger age, with most cases occurring between 45 and 64 years of age (4). Traumatic brain injury (TBI) and post-traumatic stress disorder (PTSD) are both associated with an increased risk for neurodegenerative disorders, including FTD (5). Post-9/11 veterans represent a unique population with significant exposure to risk factors for FTD, as they are a relatively young population and have a high prevalence of TBI and PTSD due to their military service experiences. The complexity of FTD presentation creates challenges for early detection, diagnosis, and treatment in this population.
FTD can manifest in two distinct clinical presentations: the behavioral variant of FTD (bvFTD) and the language variant (lvFTD). The behavioral variant is often marked by noticeable early-onset behavioral and executive symptoms (2). The language variant is further classified into the semantic and non-fluent presentations of primary progressive aphasia (6). The behavioral and language signs and symptoms of FTD, however, often overlap in complex ways, and each individual symptom exists along a spectrum of severity. This makes FTD diagnosis challenging, and it is often misdiagnosed as a psychiatric disorder or stroke during the early stages (7). Therefore, an examination of the phenotypic heterogeneity of the disease is needed to improve identification. Clarifying the boundaries of FTD's various presentations could also help clinicians discriminate FTD from psychiatric disorders and stroke.
Identifying distinct phenotypes, defined as 'any traits or characteristics that distinguish a specific state', could help to elucidate the heterogeneity of FTD case presentations (8). As FTD is heterogeneous, FTD phenotyping demands particularly rich forms of data capable of discriminating subtle variations in FTD presentation. Natural Language Processing (NLP) presents a promising approach to phenotyping FTD because NLP can extract rich information from clinical notes across a patient's full medical history (9). Mature NLP tools can automate the extraction of valuable information about the symptomology and characteristics of FTD, which may not be evident in traditional structured data (10). In this study, NLP tools were used to identify and characterize FTD-related symptoms and features in patients' clinical notes. These features were then used to compare the histories of FTD cases to matched controls, and to cluster distinct presentations within FTD cases.

Cohort development
We initially identified post-9/11 era veterans who entered VA care between 2001 and 2012 and had three or more years of VA care through the end of 2015, with one of those years being after 2007. Of those veterans, through 2019, n = 86,960 had an ICD9 code associated with cognitive dysfunction. Of the 86,960 patients identified, 98.98% (n = 86,071) were 65 years old or younger at the time of diagnosis; those older than 65 at the time of diagnosis were excluded. Because some of the ICD9 codes associated with cognitive dysfunction have poor predictive value in a population that is 65 years old or younger at the time of diagnosis (11), we only included those with ICD9 codes with a positive predictive value higher than 0.8 or that had been verified through expert chart review previously (11). The two ICD9 codes for Alzheimer's disease (331.0) and frontotemporal dementia (FTD, 331.1) have positive predictive values of 0.85 and 0.95, respectively, in a younger population (11). Our approach, then, was to consider those with an Alzheimer's or FTD diagnosis positively validated (n = 460). A further 239 cases had been validated by expert chart review in our previous study (11). This gave us a total of 699 cases that we treated as gold standards for training of our NLP system (Moonstone) (10).
Since the primary objective of our research was to discern whether it is possible to identify patients with TBI and cognitive dysfunction who are at heightened risk for developing FTD, it was essential to have a robust control group that mirrors the cases of interest in all respects. Therefore, cases were matched to at least one and up to four controls per case. Controls had to have a similar level of traumatic brain injury, if the case had a traumatic brain injury, but no indicator of cognitive dysfunction based on CTBIE and/or diagnosis codes. Cases were matched by age (± two years by birth year), gender, race, ethnicity, and year of first VA care. Nine of the cases of cognitive dysfunction lacked appropriate control matches, however, and were excluded, leaving us with 690 cases and 2,624 controls. We then randomly chose 200 FTD records with their matched controls (n = 713) for specific analysis of FTD (see Figure 1).
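The matching rule described above can be illustrated with a minimal Python sketch. This is not the study's matching code: the field names (`birth_year`, `first_va_year`, `tbi_level`), the treatment of "similar level of TBI" as exact equality, and the preference for the closest birth years are all assumptions made for illustration, and the sketch does not address whether a control may be reused across cases.

```python
# Fields that must agree exactly between a case and a control (hypothetical names).
EXACT_KEYS = ("gender", "race", "ethnicity", "first_va_year", "tbi_level")

def match_controls(case, candidates, max_controls=4, birth_year_window=2):
    """Return up to max_controls candidates that match the case."""
    eligible = [
        c for c in candidates
        if abs(c["birth_year"] - case["birth_year"]) <= birth_year_window
        and all(c[key] == case[key] for key in EXACT_KEYS)
    ]
    # Assumption: prefer controls whose birth year is closest to the case's.
    eligible.sort(key=lambda c: abs(c["birth_year"] - case["birth_year"]))
    return eligible[:max_controls]

# Toy records for illustration only (not study data).
case = {"id": "case-1", "birth_year": 1980, "gender": "M", "race": "White",
        "ethnicity": "Non-Hispanic", "first_va_year": 2005, "tbi_level": "mild"}
pool = [
    dict(case, id="ctl-1", birth_year=1979),
    dict(case, id="ctl-2", birth_year=1982),
    dict(case, id="ctl-3", birth_year=1984),  # outside the two-year window
    dict(case, id="ctl-4", gender="F"),       # fails the exact match on gender
    dict(case, id="ctl-5", birth_year=1980),
]
controls = match_controls(case, pool)
matched_ids = [c["id"] for c in controls]
```

A case for which this function returns an empty list would correspond to the unmatched cases that were excluded.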

Moonstone ontology/grammar rule building
The Moonstone NLP platform (10) is designed to extract data from clinical text not just by capturing explicitly stated information but also by inferring complex concepts often embedded in the nuanced language of common narrative. Moonstone diverges from typical NLP systems that require unambiguous phrasing, as it was originally developed to recognize social risk factors (SRF) like housing status, whether a patient lives alone, and the presence of social support. The ontology within Moonstone denotes a concept hierarchy that includes both literal and inferred instances: 'patient in communication with family' is a literal example, while 'social support' is more inferred (10).
Training of Moonstone for new NLP tasks involves the expansion of this ontology to encompass new concepts, supplemented by the creation of additional grammar rules until the system achieves satisfactory accuracy. For the purposes of our study, the ontology was augmented to include concepts pertaining to cognitive impairment, poor psychosocial function, and PTSD symptomatology. This enhancement was accomplished using two graphical tools: one for adding novel words and concepts to the ontology, and another for generating new grammar rules. This latter tool operates by allowing trainers to select from the array of "parse trees" that Moonstone produces when it processes sentences containing unknown words. From these trees, a new rule definition is extracted, to which a concept from the ontology is then attached. Consequently, this concept is applied to the interpretation of any phrase or sentence matching the new rule, thereby extending Moonstone's analytical reach.
This technique of expanding Moonstone's capabilities was applied to sentences from a set of reports, which were used to train the platform for this project. Through this iterative process, Moonstone's utility was refined, enabling it to more accurately parse and understand the complexities of clinical narratives related to cognitive and psychological assessments. After the ontology and grammar rule enhancement, and upon training Moonstone with the validated clinical notes, we employed a random forest classifier to identify cases with cognitive dysfunction. The classifier demonstrated a high level of precision, identifying cases with an 88% success rate, consistent with the supervised learning paradigm employed by Moonstone (see Figure 2).

Clinical note type selection and training
Ontology and grammar rule building in Moonstone was trained with manual text annotation of clinical notes from 165 cases of cognitive dysfunction that had been validated by chart review and consensus of three neuropsychologists in a prior study and were considered "gold standard" for this work. All the gold standard cases had neuropsychologist consult notes and neuroimaging notes. Annotators reviewed 15,985 note title types that existed in the electronic health record for the 165 gold standard cases and determined the most relevant note types for FTD. Annotators chose 3,108 note types to review for possible inclusion. Two nurse practitioners reviewed and validated the clinical text of 20% of these 3,108 note types and determined 1,195 note title types for inclusion in this study for training the NLP software, Moonstone (10). The annotators validated these notes for sentence-level evidence of cognitive dysfunction, poor psychosocial function, PTSD symptomology, and symptoms relevant to traumatic brain injury. Then, based on the ontology lexicon, Moonstone read the clinical text and counted the number of times each concept was found in each patient's history. Overall, 39 unique FTD-related concepts were identified by this process. Supplementary Table S1 provides a list of all 39 concepts.

RDoC domain
The 39 NLP-derived ontologies were grouped into Research Domain Criteria (RDoC) domains to improve interpretation. The RDoC framework is a comprehensive approach developed by the National Institute of Mental Health (NIMH) to understand mental disorders based on underlying neurobiological and behavioral dimensions (12). The RDoC framework consists of multiple domains that capture the fundamental dimensions of human functioning and psychopathology. These domains are Cognitive, Positive Valence, Negative Valence, Social Processes, Sensorimotor, and Arousal-Regulatory (12). In this work we additionally added the domain of "Interpersonal Trauma" to increase specificity to some underlying PTSD ontologies. To translate an individual's counts of ontologies into the presence or absence of each RDoC domain, we defined a domain as present in an individual's records if at least half of the domain's underlying ontologies were present in clinical notes.
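The "at least half" rule above can be written as a short Python sketch. The domain-to-concept mapping and the concept names below are hypothetical placeholders; the actual grouping of the 39 concepts is given in Supplementary Table S1 and is not reproduced here.

```python
# Hypothetical mapping from RDoC domain to underlying NLP concepts (illustrative only).
DOMAIN_CONCEPTS = {
    "Cognitive": ["memory", "executive_function", "decision_making", "language"],
    "Arousal-Regulatory": ["insomnia", "fatigue", "agitation"],
}

def domains_present(concept_counts, domain_concepts=DOMAIN_CONCEPTS):
    """Map each domain to True when at least half of its concepts appear in the notes."""
    result = {}
    for domain, concepts in domain_concepts.items():
        n_found = sum(1 for c in concepts if concept_counts.get(c, 0) > 0)
        # "At least half" expressed with integers to avoid floating-point comparison.
        result[domain] = 2 * n_found >= len(concepts)
    return result

# Toy per-patient concept counts (not study data).
patient = {"memory": 4, "language": 1, "insomnia": 2}
flags = domains_present(patient)
```

In this toy example two of four Cognitive concepts are present, so that domain counts as present, while one of three Arousal-Regulatory concepts does not reach the threshold.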

Analysis
Statistical analysis
All analyses were scripted in Python 3. We used Z tests for univariate analyses to test differences in proportions between FTD cases and controls using the statsmodels package.
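For readers unfamiliar with the test, a two-proportion Z test can be sketched with the Python standard library alone. The sketch uses the pooled standard error and a two-sided p-value; statsmodels offers `proportions_ztest` in `statsmodels.stats.proportion` for the same purpose, but which function and options were used in this study is not stated, so the stand-in below is an assumption. The counts are invented for illustration.

```python
from math import erfc, sqrt

def two_proportion_ztest(count_a, n_a, count_b, n_b):
    """Two-sided Z test for equality of two proportions (pooled standard error)."""
    p_a, p_b = count_a / n_a, count_b / n_b
    pooled = (count_a + count_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Made-up counts (not study data): 60 of 200 cases vs. 100 of 713 controls.
z, p = two_proportion_ztest(60, 200, 100, 713)
```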

Clustering
For clustering and dimensionality reduction, we employed the Uniform Manifold Approximation and Projection (UMAP) technique, using the frequencies of the 39 concepts extracted from the medical notes of our entire sample. UMAP is a manifold learning approach that facilitates the reduction of dimensions in the dataset. One of its key advantages lies in its ability to effectively preserve the global structure of high-dimensional data while simultaneously retaining the inter-sample distances. The clusters resulting from the UMAP-based analysis were assessed using the silhouette score method, a statistical measure that evaluates the effectiveness of a clustering technique by considering the quality of the defined subgroups in relation to their number (19).
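To make the silhouette score concrete, the sketch below computes the mean silhouette coefficient in plain Python. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the coefficient is (b - a) / max(a, b). The four two-dimensional points stand in for a UMAP embedding and are invented; in practice a library routine such as scikit-learn's `silhouette_score` would typically be applied to the real embedding, and nothing here should be read as the study's code.

```python
from math import dist  # Euclidean distance, available from Python 3.8

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:  # singleton cluster: coefficient is defined as 0
            scores.append(0.0)
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy 2D "embedding" with two well-separated groups (not study data).
embedding = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = silhouette_score(embedding, [0, 0, 1, 1])
bad = silhouette_score(embedding, [0, 1, 0, 1])
```

Scores near 1 indicate compact, well-separated clusters, and scores below 0 indicate points that sit closer to another cluster than to their own, so comparing the score across candidate numbers of clusters is one way to choose among them.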

Word cloud
Word clouds are qualitative tools for visualizing the relative frequency of terms in text. Word cloud generating software was used to represent the relative frequency of symptom ontologies. The word cloud provides a summary overview of the most frequent FTD-specific concepts occurring in the medical notes for those with FTD relative to controls. In the word cloud, symptom ontologies that were more common in the FTD group are larger, and symptom ontologies that were equally common across the two groups are represented in smaller text.
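One possible way to derive such relative sizes is sketched below in Python. The weighting (ratio of smoothed per-person rates in the FTD group to the control group) is an assumption chosen to reproduce the qualitative behavior described above; the text does not specify the formula or the software used, and the counts are invented.

```python
def relative_weights(ftd_counts, control_counts, n_ftd, n_control, smoothing=1.0):
    """Weight each concept by its FTD rate divided by its control rate.

    Additive smoothing (an assumption) keeps the ratio finite when a concept
    never appears in control notes.
    """
    weights = {}
    for concept, ftd_count in ftd_counts.items():
        ftd_rate = (ftd_count + smoothing) / (n_ftd + smoothing)
        control_rate = (control_counts.get(concept, 0) + smoothing) / (n_control + smoothing)
        weights[concept] = ftd_rate / control_rate
    return weights

# Made-up counts of individuals with each concept (not study data).
ftd = {"impulsivity": 120, "headache": 50}
controls = {"impulsivity": 60, "headache": 180}
weights = relative_weights(ftd, controls, n_ftd=200, n_control=713)
```

A concept with a weight near 1 is about equally common in both groups and would be drawn small, while a larger weight would translate into larger text when the weights are passed to a word cloud renderer.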

Data summary
Table 1 presents sociodemographic and military measures for FTD cases and matched controls. After matching, the groups were statistically similar in terms of age, gender, education, race, ethnicity, marital status, and military branch affiliation (p > 0.05). A significant difference emerged in the distribution of military rank, with the FTD group having a lower proportion of enlisted personnel (p < 0.001) and a higher proportion of officers (p < 0.01).
Table 2 compares the incidence of comorbidities between the FTD group and matched controls. The FTD group showed significantly higher rates of overdose, depression, bipolar disorder, schizophrenia, suicidal ideation/attempt, stroke/CVD, cardiac issues, and seizures (p < 0.001). All RDoC domains showed significant percentage differences between the control group and the FTD group (Table 3). The FTD group showed significantly higher percentages of individuals meeting the criteria for the cognitive, positive valence, negative valence, social processes, sensorimotor, arousal-regulatory, and interpersonal trauma domains. The cognitive-related ontologies showed the strongest association with FTD. Figure 3 presents a word cloud of the most common behavioral characteristics among individuals with FTD relative to matched controls. Dementia, impulsivity, executive symptoms, decision-making, and motor symptoms all featured prominently. A lack of recognition and motivation, alongside difficulties with social processes and interpersonal mannerisms, featured moderately.

Phenotypes of FTD
Figure 4 shows a two-dimensional representation of the seven RDoC domains produced using UMAP dimensional reduction (20). In Figure 4A, a UMAP dimensional reduction is shown color coded by group membership (FTD, n = 200, blue circles; controls, n = 713, white circles), where the distance between points is a preserved estimate of the distance between individuals across all RDoC domains. In Figure 4B, the average percentage of all ontologies present in notes is shown against the percentage of veterans with FTD. Both measures were derived by iterating a boundary of inclusion across the ontology space in Figure 4C. There is a strong positive correlation between the percentage with FTD and the frequency of sign/symptom ontologies in clinical notes. For example, in a cluster where 70% are FTD+, 71% of the 39 ontologies are present on average in clinical notes.
Table 4 presents the incidence of demographic and clinical characteristics across the three phenotypes identified by the clustering approach (see Methods): low distress (N = 149), moderate distress (N = 632), and high distress (N = 132). High distress individuals had a significantly higher incidence of FTD, 71.97% compared to 8.05% and 14.71% in the low and moderate distress groups, respectively (p < 0.001). Similar patterns were observed with total behavioral symptoms and various clinical characteristics like traumatic brain injury (TBI), cardiac issues, insomnia, obesity, stroke, headache, and seizures, with all showing a significantly higher prevalence in the high distress group (p < 0.001). Clinical conditions like schizophrenia, anxiety, bipolar disorder, depression, PTSD, overdose, substance abuse disorder, alcohol abuse, and suicide also showed significantly higher incidence rates in the high distress group (p < 0.001). The average age was significantly lower in the high and moderate distress groups, and there were differences in racial distribution, with significantly more Hispanic and Black individuals in the high distress group.
Figure 5 assesses whether distinct subtypes are identified through clustering. The UMAP dimensional reduction of RDoC domains was performed specifically for the FTD group, comprising 200 cases, resulting in a 2D 'symptom space'. Next, two indices were created: (1) behavioral concepts (e.g., impulsivity, disinhibition, apathy, and behavioral traits), and (2) language concepts (e.g., language, speech, learning, executive functions, and memory). The ratio of these two symptom sets was calculated for each individual, and a color code was assigned based on the ratio: records with more behavioral symptoms were marked red, while those with more language-related issues in text notes were marked blue. Subsequently, the distribution of these color-coded ratios was evaluated across the RDoC space, where clustering of colors would indicate the presence of subtypes.
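The color-coding step can be sketched in Python as follows. The concept identifiers are hypothetical stand-ins for the examples listed above. Expressing the ratio as the behavioral share of all behavioral and language mentions (which avoids division by zero), and coloring ties or empty records gray, are assumptions made for this sketch and not the study's stated procedure.

```python
# Hypothetical concept identifiers based on the examples in the text.
BEHAVIORAL = {"impulsivity", "disinhibition", "apathy", "behavioral_traits"}
LANGUAGE = {"language", "speech", "learning", "executive_functions", "memory"}

def variant_color(concept_counts):
    """Return (behavioral share, color) for one individual's concept counts."""
    behavioral = sum(concept_counts.get(c, 0) for c in BEHAVIORAL)
    language = sum(concept_counts.get(c, 0) for c in LANGUAGE)
    total = behavioral + language
    if total == 0:
        return None, "GRAY"  # assumption: no relevant mentions, left uncolored
    share = behavioral / total
    if share > 0.5:
        return share, "RED"   # more behavioral than language mentions
    if share < 0.5:
        return share, "BLUE"  # more language than behavioral mentions
    return share, "GRAY"      # assumption: exact ties are left neutral

# Toy records (not study data).
share_a, color_a = variant_color({"impulsivity": 5, "speech": 1})
share_b, color_b = variant_color({"apathy": 1, "memory": 3, "language": 2})
```

Plotting each individual at their UMAP coordinates with this color would then show whether red and blue points separate into distinct regions.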

Discussion
In this study, NLP-aided medical chart reviews successfully identified distinct phenotypes of FTD and provided a novel signature of RDoC domain distress. Prior research has leveraged unsupervised learning and clustering approaches applied to dementia cohorts. These include clusters of cognitive impairment using biomarkers, anatomical cluster identification, and genetic variant mapping, although no clustering studies have specifically evaluated post-9/11 era veterans with FTD (8,21). Our findings align with prior work by demonstrating
The diagnosis of early-onset FTD poses challenges due to its relative rarity and its highly variable clinical manifestations, which can mirror psychiatric disorders and neurological conditions such as stroke (2). The FTD diagnostic process is further complicated by the phenotypic heterogeneity of FTD, which encompasses many distinct behaviors, affective changes, and movement and speech difficulties. NLP provides an appropriate framework to capture these complex patterns, because NLP tools can glean valuable information about subtle features buried within a large corpus of clinical text, far beyond the simple presence/absence encodings typically found in health systems data. Future work may benefit from the use of NLP phenotyping pipelines trained on FTD-specific text features.
To facilitate clinical intuition, raw NLP ontologies extracted from text were organized into validated RDoC domains. RDoC domains were then clustered into a low dimensional space to enable visualization and the identification of three distinct phenotypes (Low, Moderate, and High distress). This analysis revealed a continuum of distress within and across FTD variants, with some diagnosed FTD cases showing surprisingly low levels of symptom distress, although the majority were in the Moderate to High groups. This approach demonstrates how unstructured clinical text can be used to assess the heterogeneity of neurological disease. Future work could include an in-depth temporal analysis to better understand how time from diagnosis influences our current model of symptom distress and how different phenotypes progress through the disease over time.
FIGURE 3 This figure provides a summary overview of the difference in words used in medical notes that were classified based on the FTD ontologies between patients with FTD and controls. For example, the largest words represent words that were classified by the ontologies far more frequently for those with FTD relative to controls. The smaller words represent concepts that were classified by the ontologies at about the same frequency for people with FTD relative to controls.
A comparison of the FTD group and matched controls revealed large differences in the incidence of multiple comorbidities. Prior work has found links between military-related TBI and PTSD and FTD (22). The strong associations between specific comorbidities and FTD found in this study reinforce these connections. These findings have implications for identification and care, as these individuals present with a degree of medical complexity that demands detailed and appropriate treatment strategies. Additionally, the FTD group exhibited significantly higher rates of overdose, depression, bipolar disorder, schizophrenia, suicidal ideation/attempt, stroke/CVD, cardiac issues, and seizures. FTD is associated with a higher burden of psychiatric and neurological comorbidities, which may contribute to the complexity of its clinical presentation, as demonstrated by the high prevalence of comorbidities identified among those with FTD. Thus, the broader clinical context is crucial when evaluating individuals for FTD, as the presence of these comorbidities may influence disease progression and treatment efficacy. A limitation of the interpretation of these data is a lack of review of the validity of the psychiatric diagnoses associated with the FTD cases. For example, a patient could be misdiagnosed with bipolar disorder early in the disease process, but then be diagnosed with FTD after consultation with experts and progression over time. It could be helpful for clinicians to continue to consider FTD as a rule-out early in the diagnostic stages, given the large overlap of FTD with psychiatric presentations.
Overall, those with FTD had a higher risk of suicidal ideation and overdose compared to controls, and this could be an important factor when deciding on early intervention approaches and psychoeducation for clinicians and/or caregivers in future studies. Additionally, the FTD cases in this study had the features of emotional lability and interpersonal trauma one might see in psychiatric disorders, but this was often coupled with an impulsivity that could be associated with the high rates of overdose and interpersonal conflict. This is consistent with current studies regarding FTD in the general population (23,24). Future studies looking at the effectiveness of therapeutic and pharmacological approaches aimed at mitigating this impulsivity could help to inform treatment options across phenotypes (24). Our NLP approach is limited in its ability to differentiate between apathy and impulsivity, or even to consistently identify apathy, because it is reliant on clinical bias in reporting during note taking, but it can identify these concepts generally across a large population, which could aid future studies.
From chart review, and verified with NLP analysis across cases, FTD cases had a significantly higher incidence of interpersonal trauma compared to controls, although controls in this population also had incidences of interpersonal trauma. For the cases that were chart reviewed, this interpersonal trauma was related to high reports of distress, substance use, and suicidal ideation. This is consistent with work done by Takeda et al. and Massimo et al. showing the impact of FTD on caregivers and on relationships (25,26). Our work is novel in that we were able to identify these issues with a large-scale NLP approach and validate these findings within our specific population. Future work could evaluate the effectiveness of therapeutic approaches aimed at helping people with FTD and their caregivers manage these interpersonal relationships and the relationship difficulties that arise from the stress of the disease.
In our statistical evaluation of symptoms over time since diagnosis, symptoms seemed to increase over time (Figure 4C). It is unclear, however, if this is due to lack of effective treatment or progression of the disease. Either way, taking the current literature as a whole, managing impulsivity and supporting patients in improving interpersonal relationships across the disease progression and across the lifespan could be key in making a clinical impact on the experience of distress in this patient population.
Cumulative symptom severity across all domains distinguished FTD subtypes in important ways that may complement the typical classification of FTD by variant. Our study explored the existence of distinct subtypes within the FTD population based on symptom presentation. By performing dimensional reduction of RDoC domains for the FTD group and creating two indices for the behavioral and language variants, we assessed the variability in symptom profiles among veterans with FTD and were able to identify a unique subtype with distinct symptom profiles. Our result shows that phenotyping approaches may help to further elucidate the relationship between FTD symptom distress and disease progression, enabling more accurate prognoses. Future work could also explore whether an NLP tool for assessing overall dementia symptom severity could serve as a rapid heuristic for population-level disease progression. Automated NLP screening of distress could also be useful for validating or extending existing tools such as the Frontotemporal Dementia Rating Scale (FRS) in large populations (27). This study highlights how clinical phenotyping and clustering approaches may offer opportunities to better understand rare and heterogeneous diseases and improve early detection and clinical care for individuals living with dementia.

Limitation
This study, focused on identifying the clinical phenotypes of frontotemporal dementia (FTD) among post-9/11 era veterans, has several limitations. The generalizability of results is restricted given the specific study demographic, while the retrospective design could introduce bias due to the potential for incomplete historical medical records. The study relies on ICD-10 codes for identifying FTD cases. The number of FTD cases is relatively small (n = 200), which might limit the statistical power of the study.

Conclusion
This study demonstrated the potential of NLP and phenotyping approaches to enhance the classification of FTD subtypes, considering cumulative symptom severity alongside the traditional variant-based classification. By leveraging NLP and validated domains, valuable insights into distress levels, comorbidities, and interpersonal relationships in FTD patients were gained. The findings revealed that FTD exhibits a continuum of severity and symptom distress, both within and across variants, with distress levels often co-occurring with other conditions. This highlights the importance of sensitivity to overall symptom distress in diagnosing FTD and suggests that incorporating NLP and phenotyping methods could aid in early detection strategies for FTD, ultimately contributing to improved patient outcomes.

FIGURE 1 Cohort development flow chart diagram.

FIGURE 2 Moonstone system architecture.

FIGURE 4 (A) A reduced dimensional representation of all sign/symptom ontologies is shown for all individuals, color coded by group membership (FTD, n = 200, blue circles; controls, n = 713, white circles). Three regions showing individuals with similar symptomology are enumerated (1-3). (B) Like (A), showing the percentage of symptom ontologies present in clinical records per individual. Most FTD ontologies are present for those in region 3, whereas region 1 shows low rates of ontologies in records. (C) The percentage of all ontologies in records is shown as a function of time since first FTD diagnosis. Boxplots broken out per year indicate that more FTD-related signs and symptoms are evident in health records for those with more time since first FTD diagnosis.

FIGURE 3
FIGURE 3 be used to assess the heterogeneity of neurological disease.Future work could include an in-depth temporal analysis to better understand how time from diagnosis influences our current model of symptom distress and how different phenotypes progress through the disease over time.

Table 3 compares the percentage of FTD cases and controls with evidence of each RDoC domain criteria in clinical notes.

TABLE 1
Sociodemographic and military measures for FTD cases and matched controls.

TABLE 2
Comorbidity prevalence for the FTD group and matched controls.
Bold values are significant.

TABLE 3
Percentage incidence of ontologies that fall into the RDoC domains for the control and FTD groups, with p-values testing for group differences per domain.

TABLE 4
Percentage incidence of demographic and clinical characteristics criteria by each phenotype group.

FIGURE 5 UMAP dimensional reduction of RDoC domains for the FTD+ group only (n = 200). To identify FTD variants, individuals were color coded by the relative ratio of behavioral (red) to language (blue) related concepts in their clinical notes. Colored clusters indicate individuals presenting with distinct behavioral and language variants and symptomology.