AUTHOR=Barrios Juan , Poznyak Elena , Lee Samson Jessica , Rafi Halima , Gabay Simon , Cafiero Florian , Debbané Martin TITLE=Detecting ADHD through natural language processing and stylometric analysis of adolescent narratives JOURNAL=Frontiers in Child and Adolescent Psychiatry VOLUME=Volume 4 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/child-and-adolescent-psychiatry/articles/10.3389/frcha.2025.1519753 DOI=10.3389/frcha.2025.1519753 ISSN=2813-4540 ABSTRACT=Introduction: Attention-Deficit/Hyperactivity Disorder (ADHD) significantly affects adolescents' everyday lives, particularly in emotion regulation and interpersonal relationships. Despite its high prevalence, ADHD remains underdiagnosed, highlighting the need for improved diagnostic tools. This study explores, for the first time, the potential of Natural Language Processing (NLP) and stylometry to identify linguistic markers within Self-Defining Memories (SDMs) of adolescents with ADHD and to evaluate their utility in detecting the disorder. A further novel aspect of this research is the use of SDMs as a linguistic dataset, which reveals meaningful patterns while engaging psychological processes related to identity and memory. Method: Our objectives were to: (1) characterize linguistic features of SDMs in ADHD and control groups; (2) assess the predictive power of stylometry in classifying participants' narratives as belonging to either the ADHD or control group; and (3) conduct a qualitative analysis of key linguistic markers of each group. Sixty-six adolescents (25 diagnosed with ADHD and 41 typically developing peers) recounted SDMs in a semi-structured format; these narratives were transcribed for analysis. Stylometric features were extracted and used to train a Support Vector Machine (SVM) classifier to distinguish between narratives from the ADHD and control groups. Linguistic metrics such as word count, lexical diversity, lexical density, and cohesion were computed and analyzed.
A qualitative analysis was also applied to examine stylistic patterns in the narratives. Results: Adolescents with ADHD produced narratives that were shorter, less lexically diverse, and less cohesive. Stylometric analysis using an SVM classifier distinguished between ADHD and control groups with up to 100% precision. Distinct linguistic markers were identified, potentially reflecting difficulties in emotion regulation. Discussion: These findings suggest that NLP and stylometry can enhance ADHD diagnostics by providing objective linguistic markers, thereby improving both the understanding of the disorder and its diagnostic procedures. Further research is needed to validate these methods in larger and more diverse populations.
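
The linguistic metrics named in the abstract (word count, lexical diversity, lexical density) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The abstract does not say how each metric was operationalized, so the sketch assumes lexical diversity is a plain type-token ratio and lexical density is the share of tokens outside a small function-word list. The names `FUNCTION_WORDS`, `tokenize`, and `narrative_metrics` are hypothetical, and the English word list is a toy stand-in for proper part-of-speech tagging.

```python
import re

# Hypothetical, deliberately tiny function-word list for illustration only.
# The abstract does not define lexical density; a real analysis would
# normally rely on part-of-speech tagging rather than a fixed list.
FUNCTION_WORDS = {
    "the", "a", "an", "and", "or", "but", "i", "we", "it",
    "to", "of", "in", "on", "was", "is", "my",
}


def tokenize(text):
    """Lowercase the text and return its word tokens (letters, optional apostrophe)."""
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())


def narrative_metrics(text):
    """Return word count, type-token ratio, and an approximate lexical density."""
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        # Avoid division by zero for empty transcripts.
        return {"word_count": 0, "type_token_ratio": 0.0, "lexical_density": 0.0}
    content_tokens = [t for t in tokens if t not in FUNCTION_WORDS]
    return {
        "word_count": n,
        # Assumed proxy for lexical diversity: distinct tokens / all tokens.
        "type_token_ratio": len(set(tokens)) / n,
        # Assumed proxy for lexical density: content tokens / all tokens.
        "lexical_density": len(content_tokens) / n,
    }


if __name__ == "__main__":
    print(narrative_metrics("The dog saw the cat"))
```

Because the type-token ratio falls as texts get longer, comparing narratives of different lengths would in practice call for a length-normalized diversity measure. The plain ratio is used here only to keep the sketch short.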