- 1Faculty of Psychology, University of Salamanca, Salamanca, Spain
- 2Faculty of Psychology, University of Murcia, Murcia, Spain
- 3Institute of Neuroscience of Castilla y León (INCYL), University of Salamanca, Salamanca, Spain
Background: Depression is highly prevalent among older adults, exceeding rates in the general population. Traditional diagnostic tools, such as interviews and self-reports, are limited by subjectivity, time demands, and overlap with age-related changes. Speech, as a non-invasive behavioral marker, is promising for objective depression assessment, but its specific utility in older populations remains less explored. This systematic review identifies speech characteristics linked to depression in older adults and their clinical potential.
Methods: Following PRISMA guidelines, a search was conducted in Medline, CINAHL, PsycINFO, IEEE, and Web of Science for studies published in the last 10 years. Eligible studies included adults aged over 55, with depression diagnosis or symptoms, and at least one acoustic variable. Sixteen studies met inclusion criteria. Methodological quality was assessed with JBI tools, and speech parameters and classification outcomes were extracted.
Results: Depressed older adults consistently showed slower speech rate, longer and more variable pauses, reduced intensity, and altered voice quality. Predictive studies using machine learning reached accuracies of 76–95%, particularly when age and gender were controlled. Findings were inconsistent for F0 and formants: women often showed lower peak frequency and amplitude, while men displayed higher amplitude change and formant frequencies. Limitations included small clinical samples and insufficient control of confounders, especially cognitive impairment.
Conclusion: Speech analysis appears reliable, non-invasive, and cost-effective for detecting depression in older adults. Temporal, prosodic, and spectral features show strong diagnostic potential. Further research with larger, representative samples is required to validate speech-based biomarkers as complements to existing assessments.
1 Introduction
Depression has become one of the greatest public health challenges worldwide, affecting approximately 3–5% of the world’s population (Yang et al., 2024). The magnitude of this problem is particularly evident in the elderly population, where various meta-analyses estimate prevalences of between 13 and 26% (Abdoli et al., 2022; Hu et al., 2022). These figures demonstrate the remarkable impact of depression on global public health and highlight the importance of effective strategies for its prediction and assessment, especially in older adults.
According to ICD-11, depression is a disorder characterized by a mood disturbance presenting with sadness or irritation, as well as a persistent loss of interest (World Health Organization, 2022). Other symptoms often include changes in weight or appetite, sleep disturbances, fluctuations in energy levels, difficulties in making impactful decisions such as financial ones (Giannouli et al., 2022), and feelings of guilt, among others, with recurrent thoughts of death or suicide being particularly common. Particularly in older adults, cognitive symptoms such as episodic memory impairment are also common (James et al., 2021). Such memory deficits not only limit cognitive performance, but are also closely linked to the loss of daily functioning, increasing vulnerability at this stage of the life cycle (James et al., 2021). In fact, several studies point to the connection between late-life depression and increased risk of dementia (Ly et al., 2021; Muhammad and Meher, 2021).
The methods for diagnosing and monitoring depression are based on clinical observations and self-administered scales that require clinical staff trained in the assessment of mood disorders. Self-administered scales have limitations that are particularly relevant in older adults. Their accuracy may be affected by reduced introspection ability related to cognitive impairment, and by response biases, including the influence of somatic symptoms that can inflate scores (Harvey et al., 2023; Tarailis et al., 2025). The main tool is the semi-structured interview, which checks for the presence and intensity of the disease. Psychometric questionnaires, such as the Beck Depression Inventory (BDI-II, Beck et al., 1996), the Geriatric Depression Scale (GDS, Yesavage and Sheikh, 1986), the Hamilton Rating Scale for Depression (HRSD, Hamilton, 1986), and the Patient Health Questionnaire (PHQ-9, Kroenke et al., 2001) are essential for the accurate assessment of symptoms and for monitoring progress. The BDI demonstrated good sensitivity and specificity with older adult depressed outpatients (Edelstein et al., 2010). However, these questionnaires have limitations. For example, the PHQ-9 (Manea et al., 2015) and the BDI-II (Von Glischinski et al., 2019) show considerable heterogeneity between studies and low sensitivity. The entire process can also be time-consuming before the nature of the disorder and its severity can be determined.
In the elderly population, there are several factors that make it difficult to diagnose depression and lead to higher misdiagnosis rates or difficulties in receiving appropriate intervention. First, there is the difficulty in seeking help due to mobility issues, loneliness, which implies less monitoring by others, or downplaying the importance of the condition by considering it to be a natural part of aging. Actually, episodes of negative mood that may present symptoms similar to depression are common due to the stage of life and its coincidence with events such as retirement or the loss of loved ones. Other obstacles that can mask depressive symptoms are somatic symptoms and cognitive impairment (Devita et al., 2022). In this regard, one of the most common problems has to do with cognitive complaints. Although older adults with depression tend to overestimate their cognitive problems (Edmonds et al., 2014), it is common for them to present objective cognitive deficits. Therefore, for a correct diagnosis of mild cognitive impairment (MCI), it must be ruled out that the deficit is related to a mood disorder. In any case, the relationship between depression and dementia is complex, and patients with MCI also tend to show more depressive symptoms (Anderson, 2019), which can accelerate progressive deterioration and increase the risk of dementia (Lara et al., 2017). Also, the inability to reliably diagnose depression is especially problematic for suicide risk prevention because the risk of suicide is 20 times higher in individuals diagnosed with depression than it is in the general population (Mitchell et al., 2005). It is therefore necessary to seek objective assessment procedures that offer the possibility of screening for depression and predicting its progression with a high degree of reliability, both because of the limitations of current instruments and the need for appropriate and early intervention that reaches this vulnerable population.
Given these limitations, there is a need to explore new methodologies that complement and enrich the diagnostic process. In recent years, the use of technology that directly collects and evaluates an individual’s behavior in search of pathological patterns, such as wearables and smartphones, has gained popularity (Abd-alrazaq et al., 2023). This procedure allows for the passive collection of behavioral data, including speech. This technique has become increasingly important in clinical research and in the field of mental health. This approach allows objective information to be extracted from acoustic, prosodic, or linguistic parameters of the patient’s speech, which can function as biomarkers (Robin et al., 2020).
Speech production is a complex phenomenon that requires the integration and coordination of various cognitive, motor and sensory processes. From the encoding of the linguistic content in the central nervous system to its physical realization through the phonatory organs, multiple systems are involved to generate an articulate and prosodically organized acoustic signal. This sound signal not only conveys linguistic information intended by the speaker, but also reflects their functional and neurocognitive state. Consequently, speech analysis, understood as the process of extracting information from the emitted acoustic signal, emerges as a non-invasive tool with a high diagnostic value (Ramanarayanan et al., 2022). Individuals with neurological or motor disorders often manifest systematic alterations in the various constituents of speech, resulting in acoustic, articulatory, and prosodic deviations from normative patterns. Automatic speech analysis has been shown to be an objective measure (Kourtis et al., 2019), stable over time (Cohen et al., 2012), and highly correlated with the severity of symptoms of different disorders (Zhang et al., 2020) such as Alzheimer’s disease (Martínez-Nicolás et al., 2021), Parkinson’s disease (Solana-Lavalle and Rosas-Romero, 2021), schizophrenia (de Boer et al., 2023), bipolar disorder (Flanagan et al., 2021) or, of course, depression (Cummins et al., 2015).
In addition to transmitting linguistic information, speech production reflects the interaction between emotional, cognitive, and motor processes. In particular, vocal prosody—characterized by pitch, intensity, and rhythm—encodes relevant information about the valence and intensity of affective states (Scherer et al., 2003). Thus, the voice becomes an observable and measurable channel of emotional states, where variations in pitch, rhythm, and speech pauses serve as useful indicators for detecting and assessing emotional disorders such as depression, especially in older adults (Lamers et al., 2014). In this regard, several studies have characterized the speech of people with depression and used acoustic speech parameters for diagnosis and identified acoustic alterations including reduced intensity, narrower pitch range, longer pauses, slower speech rate, and various dysphonic features. For example, Hashim et al. (2022) observed that speech with low intensity and prolonged pauses was linked to greater severity of depressive symptoms and suicidal risk. Even in people with mild symptoms, patterns such as lower fluency and more pauses are detected (Albuquerque et al., 2021). Decreases in F0 have also been observed showing a lower pitch, and changes in spectral features and MFCCs, which point to a breathier voice with lower energy and resonance (Taguchi et al., 2018; Wang et al., 2019). Along the same lines, some markers related to vocal quality would be lower: jitter, shimmer and HNR, which negatively correlate with symptoms of depression (Quatieri and Malyska, 2012). In terms of rhythm, some changes have also been observed in people with depression, such as a reduced speech rate or longer average syllable duration (Alghowinem et al., 2012). With this, some studies have been able to correctly classify these patients with an accuracy of 80–95% (Carrillo et al., 2018; Riad et al., 2024).
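As an illustration of the voice quality markers mentioned above, the following minimal sketch computes jitter and shimmer using the common "local" definition (mean absolute difference between consecutive cycles, divided by the mean value). The function name and the example values are hypothetical, and the studies cited here may have used other variants of these measures or dedicated software.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by their mean.

    Applied to glottal cycle durations this gives local jitter;
    applied to cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("at least two cycles are needed")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical cycle durations (ms) and peak amplitudes (arbitrary units).
periods_ms = [8.0, 8.2, 7.9, 8.1, 8.0]
amplitudes = [0.50, 0.52, 0.49, 0.51, 0.50]

jitter = local_perturbation(periods_ms)
shimmer = local_perturbation(amplitudes)
print(f"jitter = {jitter:.2%}, shimmer = {shimmer:.2%}")
```

Both measures are ratios, so they are usually reported as percentages; a perfectly periodic signal yields zero for both.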
These results correspond to relatively young adult populations; there are fewer studies exploring this phenomenon in older samples. It is quite possible that many of these observed features are not directly applicable to older adults, since the aging process entails several physiological and cognitive changes that would affect voice production. First, there are changes in the vocal tract, lung capacity, and the musculature involved in phonation, which are associated with a reduction in the vocal range, a decrease in the fundamental frequency, and a voice characterized by hoarseness, roughness, and breathiness (Lortie et al., 2015; Mazzetto de Menezes et al., 2014; Schultz et al., 2023). In addition to anatomical changes, we may find other features related to cognitive function, as even older adults with non-pathological aging will show lower speech rate and higher pause frequency (Bóna, 2014). Added to this is the possibility of some type of cognitive impairment, either due to physiological causes or associated with the depressive process. Again, many of these depression-related parameters are common in studies of mild cognitive impairment or various neurodegenerative diseases, such as the aforementioned parameters of rhythm, monotony and voice quality (Qi et al., 2023; Saeedi et al., 2024).
There is, to our knowledge, only one study prior to 2015 that explored the possibility of using acoustic analysis specifically in the elderly population to detect depression. Sanchez et al. (2011) used a database of 1,172 older adults with and without depression and were able to discriminate them with an accuracy of 81.3% (68.8% sensitivity and 93.8% specificity) based on pitch and formant parameters. This study opened the door to the use of this tool despite the difficulties mentioned above. In the following years, several studies have appeared that further develop this idea. For this reason, we intend to compile the evidence provided over the last 10 years and to explore the issue of speech analysis in depression in older adults. We aim to critically evaluate the quality of the evidence on this subject, and therefore propose the following research questions:
1. What are the characteristic acoustic speech patterns in people with depressive symptoms or a diagnosis of depression?
2. Is automatic speech analysis a reliable method for assessing depression in older adults?
2 Method
The PRISMA statement (Page et al., 2021) was followed for conducting this review. The review was not pre-registered in any public review registries.
2.1 Eligibility criteria
Inclusion criteria:
• Use of automated speech and language analysis technologies or acoustic analysis, using specialized software or mobile applications. Although there may be other types of measures, it must contain at least one variable obtained through this procedure.
• Sample made up of adults over 55 years of age.
• Must contain at least one group formed by people with a diagnosis of depression or depressive symptomatology to some degree.
• Both descriptive and diagnostic studies are included.
• Empirical studies, clinical trials, quasi-experimental studies or observational studies.
• Publications within the last 10 years.
The choice of 55 years of age is motivated by the aforementioned relationship between depression and mild cognitive impairment. Although less common, early onset MCI, defined as beginning before age 65, has been shown to be related to the presence of neuropsychiatric symptoms, and many studies identify the onset of these problems around age 55 (Baird et al., 2021; Moon et al., 2018; Seath et al., 2024).
Exclusion criteria were as follows:
• Cooccurrence of serious psychiatric disorders unrelated to the study (e.g., schizophrenia, bipolar disorder).
• Speech disorders interfering with acoustic analysis (e.g., dysarthria or severe stuttering).
• Studies where depression was treated merely as a covariate rather than the primary outcome.
• Systematic reviews or meta-analyses, opinion articles, studies without results, case studies.
No language restrictions were applied; studies published in any language were considered eligible, although only studies in English were found to meet eligibility criteria.
2.2 Method for locating and identifying studies
The search was conducted in the electronic databases Medline, CINAHL, PsycINFO, IEEE, and Web of Science. The last search was conducted on 17 June 2025. The same terms were included in all the databases: (speech OR acoustic* OR voice OR signal OR spoken language) AND (depress*) AND (older adult* OR elderly OR Late-life). Only a filter for publications within the last 10 years was included.
The total results were compiled, and duplicate papers were removed. Then, two reviewers (IM and DC) independently reviewed the titles and abstracts of the studies. The interrater agreement according to Cohen’s Kappa was κ = 0.798. Disagreements between the reviewers were resolved by discussion. Finally, the references of the selected publications were explored to find possible studies that had been missed in the search.
Those articles that did not meet the inclusion criteria were removed.
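For readers less familiar with the agreement statistic used in the screening step, the following minimal sketch shows how Cohen's kappa is obtained from two reviewers' decisions. The screening decisions below are hypothetical and are not the data of this review.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same label.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical title/abstract screening decisions for ten records.
reviewer_1 = ["include", "include", "exclude", "exclude", "exclude",
              "include", "exclude", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "include", "exclude", "exclude", "exclude", "exclude"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.3f}")
```

Because most records are excluded by both reviewers, raw agreement is high by chance alone; kappa corrects for this, which is why it is preferred over simple percentage agreement in screening.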
2.3 Quality assessment and data extraction
The methodological quality and risk of bias of the selected studies were assessed through two checklists: the JBI critical appraisal checklist for analytical cross-sectional studies (Moola et al., 2020) and the JBI checklist for diagnostic test accuracy studies (Campbell et al., 2015). Where a study used a cross-sectional design but also tested the discriminative capacity of parameters, it was evaluated according to its main objective: either the development of a classifier, or the description of parameters and their characteristics, which could include their discriminative power.
To be assessable, the results must contain information on speech features that are altered in people with depression or that allow the evaluation of diagnostic models of depression through speech. The information obtained from the articles was, first, the sample size and its main characteristics. Given that the objective of the study is to identify speech parameters relevant to the characterization of depression, we extracted those parameters that yielded significant results, either in descriptive studies or in algorithm-based classification studies where they were reported. In these classification studies, the most relevant performance metrics were obtained, seeking to obtain classification accuracy, sensitivity, and specificity whenever possible.
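The three performance metrics extracted from the classification studies all derive from the same confusion matrix. The sketch below, with hypothetical counts, shows the relationship; it is an illustration only and does not reproduce the data of any included study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: depressed participants classified correctly/incorrectly.
    tn/fp: non-depressed participants classified correctly/incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both groups must contain at least one participant")
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical study: 50 depressed and 50 non-depressed older adults.
metrics = diagnostic_metrics(tp=40, fn=10, tn=45, fp=5)
print(metrics)
```

When the clinical group is much smaller than the control group, accuracy is dominated by specificity, which is one reason sensitivity and specificity were sought alongside accuracy whenever possible.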
3 Results
The search process has been summarized in Figure 1 through a PRISMA flowchart. A total of 1,843 studies were retrieved, of which 523 duplicates were removed. After screening by title and abstract, 18 studies were selected for a full-text review. Four articles were excluded for not meeting the inclusion criteria, and one additional article was excluded because the full text could not be accessed. Most exclusions occurred because depression was treated only as a control variable rather than as the primary outcome. Within these studies that met the inclusion criteria, the references were explored, finding two new studies that had not been retrieved in the database searches. In total, 16 studies were finally included in the review.
Among the final retrieved studies, five were descriptive studies whose objective was to characterize speech in people with depression through cross-sectional designs, and 11 were predictive studies whose main objective was to obtain classification algorithms for older adults with depression.
3.1 Characteristics of the studies
Across the included studies, several speech parameters were consistently associated with depressive symptoms in older adults. Temporal measures such as slower speech rate, longer pause duration, and greater pause variability were the most frequently reported. Prosodic and spectral features including reduced intensity, changes in F0 and formant frequencies, and alterations in voice quality indices such as jitter, shimmer, and HNR, also showed significant associations with depression. These relationships were identified either through pseudo-experimental designs using groups of patients with depression versus other types and correlational designs, or through feature-selection procedures within machine-learning models. Below, descriptive and predictive studies are presented separately.
We begin by analyzing the descriptive studies. Table 1 shows the information from the studies: sample size and main characteristics, the task used to elicit speech, the statistical analysis used to establish relationships, a parameters column listing the specific speech-related variables or outcomes analyzed in each study, and a findings column summarizing the overall conclusions reported by the authors. The studies are presented in the order of appearance in the text. Harlev et al. (2025) develop a classifier, not as the objective of the study, but to select speech parameters that discriminate depression. Interestingly, they find five voice quality parameters with predictive power that would show a decrease or restriction in the expression of emotion. They then attempt to explain the causes of these changes. Although participants with depression showed greater cognitive impairment in this sample, the authors report that the observed speech changes were more strongly associated with apathy than with cognitive status. As we will see, the relationship with cognitive impairment is common in studies exploring depression in the elderly. In this regard, Mijnders et al. (2023) reported slower speech rate in people with depression. Other parameters, namely formant range and pause duration, interact with medication, gender, and cognitive status. A limitation of this study is that, even though these variables were controlled for, patients with depression showed psychomotor slowing and physical frailty, whereas none of the controls showed such conditions. A study that specifically explores depression in people with mild cognitive impairment is that of König et al. (2021), which, in addition to depression, analyzes anxiety and apathy. They find parameters that correlate with the severity of the symptoms of all three.
In participants with depressive symptoms, women showed lower peak frequency, lower power, and lower amplitude, whereas men exhibited greater average amplitude change. The next study includes samples of several age groups that are analyzed separately, including older adults. This study indicates that several duration parameters would be affected by depression. The authors reported increases in the number and variability of pauses, longer total utterance duration, and a reduced speech rate in participants with depressive symptoms, observed in men. Furthermore, these changes would also be mediated by age (Albuquerque et al., 2021). The last study compared groups with depression and dementia and found changes in several shared voice-quality and spectral features. However, depressed patients exhibited negative correlations, whereas dementia patients showed positive ones (Sumali et al., 2020; Table 2).
Next, we will examine studies that developed automatic classifiers from speech. These studies share the use of a series of techniques such as machine learning and deep learning. Although diverse, they share the goal of identifying features or parameters with discriminative power and combining them so that the model captures relevant patterns in the data. These include linear models such as logistic regression, margin-based classifiers like SVM, tree-based methods such as Random Forest and Gradient Boosting, and deep neural networks capable of learning complex representations from the data. We can identify two trends: on the one hand, studies that, in principle, do not take into account cognitive status and consider it, at most, a covariate; and, on the other hand, studies that use only samples with such impairment. Most of the studies fall into the first group.
Higuchi et al. (2017) analyzed phone calls from a sample of 28 healthy elderly people and only 4 with depression. They performed an analysis of emotional components of the voice (anger, joy, sorrow, and excitement) and obtained a classifier based on logistic regression with an AUC of 0.76. In a follow-up study, they identified those at risk of depression with a slightly lower AUC (Higuchi et al., 2018). The specific features were not reported in these studies. Another study that is relatively opaque in terms of parameters is that of Lin et al. (2022), which uses a pre-specified algorithm (Ma et al., 2016) developed for the general population based on MFCCs, obtaining an AUC of 0.87 in relatively young patients aged 50 to 65 years.
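Since several of these studies report the area under the ROC curve (AUC) rather than accuracy, a brief sketch of what this value means may help. The AUC equals the probability that a randomly chosen depressed participant receives a higher classifier score than a randomly chosen non-depressed participant. The scores below are hypothetical and the function is an illustration, not the procedure used in the cited studies.

```python
def auc_from_scores(scores_positive, scores_negative):
    """AUC as the proportion of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must contain at least one score")
    correct = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                correct += 1.0
            elif pos == neg:
                correct += 0.5
    return correct / (len(scores_positive) * len(scores_negative))


# Hypothetical classifier outputs (higher = more likely depressed).
depressed_scores = [0.9, 0.6, 0.4]
control_scores = [0.1, 0.3, 0.5, 0.7]

auc = auc_from_scores(depressed_scores, control_scores)
print(f"AUC = {auc:.2f}")
```

With very few participants in the clinical group, the number of pairs is small and the estimate changes substantially if a single score shifts, which is relevant when interpreting AUC values from small samples.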
The rest of the studies from this point on do report the features that are sensitive for classification. Next, studies that provide a general classification and those that focus on older adults with cognitive impairment will be presented.
Among the general studies, we find that of Little et al. (2021), which presents very interesting results given its ecological validity. They used a wearable device with a microphone that records sound throughout the day and identifies the speech produced by the patient, discarding all other sounds and speech. They found that older adults with depression produced substantially less speech during the day, which the authors interpret as a proxy for reduced social interaction and loneliness. These results would correlate with attention and psychomotor speed. Another interesting contribution is that of Smith et al. (2020), who not only developed a classifier between severe/moderate and mild/absent depression, but also assessed week-to-week changes, anticipating whether depression questionnaire scores would increase or decrease, thus serving as a predictor of therapy progress. Stasak et al.’s (2022) study again includes young and older populations, developing a general classifier and others specific to age ranges. They observe that when age is taken into account, accuracy improves, based on voice quality parameters. Finally, Lee et al. (2021) developed classifiers taking gender into account, finding spectral and energy-related acoustic features most relevant for males and prosody-related features for females. In this study, however, there is no control of cognitive status.
Zhou et al. (2022, 2023, 2024) published three consecutive studies using the same sample of 319 older adults with cognitive impairment. In each paper, different classifications of neuropsychiatric symptoms were used, including depression, anxiety, and apathy. Across these studies they developed multimodal classifiers using speech and facial expression for those neuropsychiatric symptoms. In all of them, they obtained accuracies above 85% in multi-class classifications. They report a wide range of parameters and their correlations with the symptoms of each of the measured dimensions and conclude that depression is fundamentally related to reduced amplitude, poorer voice quality, monotony, and slowness. Finally, Fraser et al. (2016) obtain a sample of elderly people with Alzheimer’s, some of whom also had depression. First, they verify that in a dementia classifier based on voice and linguistic markers, elderly people with depression were not misclassified as having Alzheimer’s and that, in general, depression does not affect the classification. Next, they tried to distinguish older adults with Alzheimer’s disease and depression from others with dementia alone. However, they obtained a very low classification accuracy, indicating that depression can be difficult to separate from neurodegenerative changes using the available features.
3.2 Methodological quality according to JBI criteria
Among the descriptive studies, only two of the five show no evidence of bias according to the JBI appraisal checklist. Figure 2 shows the assessment of each item in each of the articles, indicating a possible low, high, or unclear risk of bias. Figure 3 shows a summary of the results. In all cases, the possible concern relates to a lack of sufficiently specific information, either regarding the characteristics of the sample or the collection process, or in relation to possible confounding factors such as age or, mainly, cognitive status. Most studies are transparent in terms of sample collection and deal with potential confounding factors either through strict inclusion criteria or the creation of groups with similar age and cognitive status.
Figure 2. Quality assessment of the descriptive studies using the JBI appraisal checklist for analytical cross-sectional studies, and their rating as a high, low, or unclear risk of bias for each question.
As for predictive studies, 8 of the 11 show low risk in all items. The primary concerns were related to the patient selection domain. Several of the evaluated studies did not use a consecutive sequence or random sampling; instead, they selected a group of patients and then selected matched controls. Although it is a common strategy to minimize the effect of factors that influence vocal production, it may limit the generalizability of results (see Figures 4, 5).
Figure 4. Quality assessment of the predictive studies using the JBI checklist for diagnostic test accuracy studies and their rating as a high, low, or unclear risk of bias for each question.
Although in principle, according to the indications of the tools, there would be no major evidence of bias, some limitations must be considered. First, in several of the studies, the number of participants with pathology is excessively small. For example, although Higuchi’s study is methodologically correct in terms of design, sample selection, and gold standard testing, it is doubtful that a sample of four subjects with pathology would yield robust results. Another issue is that in several of the prediction articles, control for cognitive status was insufficient, and in several cases the depression groups had greater cognitive impairment than controls. Although this can be explained by the cognitive symptoms of depression indicated above, it cannot be ruled out that this factor influenced the results, despite the selection of appropriate assessment tools. Finally, several of the studies were conducted on the same sample with some modifications in terms of the specificity of the classification of the subjects, which could lead to an overestimation of the findings of these studies.
4 Discussion
This review sought to identify and analyze the literature on the use of automatic speech analysis to characterize and diagnose depression in older adults. Across studies, a wide range of acoustic parameters—spanning prosodic, temporal, spectral, and voice quality domains—were associated with depressive symptoms. Table 3 summarizes the parameters that were found and the direction of change in older adults with depression.
Despite the variety of parameters found, we can find relative consistency. Parameters such as pause duration, speech rate, and voice intensity appeared repeatedly as markers of depressive symptoms. There are also numerous studies that find altered voice quality parameters and cepstral coefficients that suggest emotional blunting and a breathier voice. Some parameters have shown some inconsistency between studies, for example fundamental frequency, shimmer, or first formants. In this regard, the influence of gender must be highlighted. Some studies find different directions for men and women. The fundamental frequency seems to increase in women and decrease in men. Studies that do not control for this aspect and show contradictory directions could be affected by biases in the composition of the sample. These effects, and those of the other parameters, will have to be confirmed by meta-analysis, given that for the present review there are not enough studies, nor are they sufficiently homogeneous in their techniques, to carry out an analysis of this type.
If we compare the changes in speech found in older people with those in younger populations, we can observe both similarities and differences. The lower F0 is a commonly reported parameter (Wang et al., 2019) and identified as a good predictor of a depressive state (Menne et al., 2024), although not all studies agree on this point (Mundt et al., 2012). We observe the same trend in older adults. The decrease in intensity is more consistent with the characteristics in younger people (Calić et al., 2022), but not so much in the variability of intensity or in shimmer, given that we find studies that indicate greater variability in older adults with depression. In the case of formants, a general decline is usually observed in younger people (Helfer et al., 2013), although there are already studies indicating that functioning differs between men and women (Cummins et al., 2017). In this review, we find that while this effect is observed in women, men have higher average formant frequencies. A similar trend to that observed in the fundamental frequency. Older adults also show higher values in the bandwidths, which in any case seems to indicate that depression would be associated with some lack of motor coordination. Similarly, changes in MFCCs, usually associated with the configuration of the vocal tract and control of the articulatory organs, are common in the general literature on depression and speech (Das and Naskar, 2024; Rejaibi et al., 2022). Finally, in terms of temporal and fluency parameters, we found a reduction in total speech time, with longer and more variable pauses, and an increase in speech rate, indicating slower language production. These characteristics would also be shared with samples from younger subjects with large effect sizes (Cummins et al., 2023). Despite the many similarities in parameters between young and older people, as Stasak et al. (2022) points out, the use of age-matched reference populations increases the accuracy of classifications. 
False positives are much more common in this population because, for example, older adults generally show poorer voice quality indices, which can lead to their being misclassified as depressed when the reference is the voice quality of younger speakers.
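The effect of the reference population can be illustrated with a minimal sketch. The jitter values below are invented purely for illustration (they are not data from any reviewed study), and the helper `z_score` is a hypothetical name. The point is only that the same older speaker can look extreme against young norms and unremarkable against age-matched norms.

```python
from statistics import mean, stdev

def z_score(value, reference):
    """Standardize one measurement against a reference sample."""
    return (value - mean(reference)) / stdev(reference)

# Hypothetical jitter values (%), invented for illustration only.
young_reference = [0.4, 0.5, 0.6, 0.5, 0.5]
older_reference = [0.8, 1.0, 1.2, 0.9, 1.1]

# A non-depressed older speaker with typical age-related voice quality.
speaker_jitter = 1.1

z_vs_young = z_score(speaker_jitter, young_reference)
z_vs_older = z_score(speaker_jitter, older_reference)

# Assumed screening rule: flag values more than 2 SD from the reference mean.
flagged_vs_young = abs(z_vs_young) > 2
flagged_vs_older = abs(z_vs_older) > 2

print(f"z against young norms: {z_vs_young:.2f} (flagged: {flagged_vs_young})")
print(f"z against age-matched norms: {z_vs_older:.2f} (flagged: {flagged_vs_older})")
```

Under these assumed numbers, the speaker would be flagged only when compared with the younger reference group, which is the kind of false positive that age-matched norms are meant to avoid.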
Returning to the dimensions of time and rhythm, these seem especially important in older adults. It has been argued that the reduction in speech rate associated with depression is a potential measure of motor slowing or cognitive impairment (Cannizzaro et al., 2004; Cummins et al., 2015). This reduction may be related to a decrease in the speed of speech sound production, which would reflect motor impairment, or to the production of more pauses, which would suggest a cognitive impairment in which the individual has difficulty choosing words or performing other processes. Given that a decrease in articulation rate, which is not pause-dependent, has also been found, and that various pause-related parameters are affected, both mechanisms may be at play. This is one of the complex challenges facing the field. In this review, we found many speech characteristics shared between depression and cognitive impairment. This overlap could arise because different causes are expressed in the same way, or because cognitive impairment itself was present in the samples. However, we must be cautious in interpreting these data since, as noted, control of cognitive status is deficient in several of the studies.
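The distinction between speech rate and articulation rate can be made concrete with a short sketch. It assumes that the recording has already been segmented into stretches of speech and that a syllable count is available. The function name, the 0.25 s minimum pause threshold, and the example timings are illustrative assumptions, not a protocol taken from the reviewed studies.

```python
from statistics import mean, pstdev

def temporal_measures(segments, n_syllables, min_pause=0.25):
    """Compute basic timing measures from speech segments.

    segments: sorted list of (start, end) times in seconds for stretches of speech.
    n_syllables: total number of syllables produced in the recording.
    min_pause: assumed minimum silence (s) that counts as a pause.
    """
    total_time = segments[-1][1] - segments[0][0]
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(segments, segments[1:])]
    pauses = [g for g in gaps if g >= min_pause]
    articulation_time = total_time - sum(pauses)
    return {
        # Syllables per second including pauses: falls with slower
        # articulation and also with more or longer pauses.
        "speech_rate": n_syllables / total_time,
        # Syllables per second excluding pauses: reflects articulation speed only.
        "articulation_rate": n_syllables / articulation_time,
        "n_pauses": len(pauses),
        "mean_pause": mean(pauses) if pauses else 0.0,
        "sd_pause": pstdev(pauses) if pauses else 0.0,
    }

# Illustrative timings: three speech stretches separated by two pauses.
example = temporal_measures([(0.0, 2.0), (2.5, 4.0), (5.0, 6.0)], n_syllables=18)
for name, value in example.items():
    print(name, round(value, 3))
```

Because speech rate includes pause time and articulation rate does not, a drop in both suggests slowed articulation, whereas a drop in speech rate alone points to pausing behavior.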
It is thought that detecting depression by means of speech is harder in older adults because the natural neuromuscular changes that occur with age could resemble depressive characteristics. Although it seems clear that age-specific scales are needed, the results of classification studies are quite promising. Most obtain algorithms with accuracy above 80%, and even up to 95%. This is consistent with meta-analyses conducted in younger populations, which report an accuracy of 89% (Liu et al., 2024). A notable finding is that studies reporting separate results by gender show better classification for males, an effect that has also been found in young populations (Hashim et al., 2017). Lee et al. (2021) propose that the discriminating features differ by gender, with rhythm-related features being more important in women. Therefore, the need to control for both age and gender is evident and, as the literature suggests, speech biomarkers should be targeted at very specific population groups (Fagherazzi et al., 2021). On the other hand, it should be noted that most models are transparent about the speech features that are selected, which facilitates interpretation and cross-study comparisons. In this sense, as is the case with the detection of other disorders, individual parameters seem to have little specificity and relevance, and it is rather their relationship with others that could determine whether they are indicators of the pathology (Ramanarayanan et al., 2022).
A common issue in studies analyzing speech in different pathologies is the importance of the material or task used to elicit speech. In this regard, the studies analyzed are very heterogeneous, and there has been little suggestion of a unified criterion for choosing the appropriate task. Tasks such as reading, spontaneous speech in interviews, picture description, diadochokinetic tasks, and verbal Stroop tests have been used. The only consistent suggestion is the suitability of tasks that involve emotional expression. Based on the available data, no particular trend can be observed in terms of a possible improvement in classification with tasks of this type, and this issue will require further research. Studies with younger populations do suggest, at least, that spontaneous speech is more appropriate in the assessment of depression (Alghowinem et al., 2013; Jiang et al., 2017).
In methodological terms, the studies analyzed are generally adequate. Limitations arise mainly in terms of sample size and non-randomized participant selection, both of which limit generalizability. In addition, control of factors such as age, gender, or cognitive impairment is sometimes deficient, although controlling for all of them would require very large samples, which are difficult to obtain in this field of research. Being aware of this difficulty, we must take into account that some of the machine learning studies with small sample sizes, especially in the clinical group, may achieve artificially inflated performance metrics due to overfitting.
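One common way in which small samples inflate performance can be sketched with simulated data. This is a toy illustration under stated assumptions, not a reconstruction of any reviewed study: the features are pure random noise, the labels are unrelated to them, and the function names and the single-feature threshold classifier are hypothetical choices made for brevity. The sketch contrasts the accuracy obtained when the best feature is chosen and scored on the same 20 cases with a leave-one-out estimate in which the choice is repeated inside every fold.

```python
import random

def stump_accuracy(xs, ys, threshold, sign):
    """Accuracy of the rule: predict 1 when sign * (x - threshold) > 0."""
    hits = sum((sign * (x - threshold) > 0) == bool(y) for x, y in zip(xs, ys))
    return hits / len(ys)

def fit_best_stump(X, y):
    """Pick the feature and direction with the best training accuracy."""
    best = (0.0, 0, 0.0, 1)  # (accuracy, feature index, threshold, sign)
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        threshold = sorted(col)[len(col) // 2]  # split at the median value
        for sign in (1, -1):
            acc = stump_accuracy(col, y, threshold, sign)
            if acc > best[0]:
                best = (acc, j, threshold, sign)
    return best

def loo_accuracy(X, y):
    """Leave-one-out accuracy with feature selection redone in every fold."""
    hits = 0
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        _, j, threshold, sign = fit_best_stump(X_train, y_train)
        hits += (sign * (X[i][j] - threshold) > 0) == bool(y[i])
    return hits / len(y)

random.seed(0)
n_cases, n_features = 20, 200  # small sample, many candidate features
X = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(n_cases)]
y = [i % 2 for i in range(n_cases)]  # labels unrelated to the features

apparent = fit_best_stump(X, y)[0]  # selected and evaluated on the same cases
honest = loo_accuracy(X, y)         # selection repeated within each fold
print(f"apparent accuracy: {apparent:.2f}")
print(f"leave-one-out accuracy: {honest:.2f}")
```

Since the data contain no real signal, any apparent accuracy well above 50% reflects selection on the evaluated cases, while the within-fold estimate would be expected to stay closer to chance.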
As proposals for the future, we can identify several potential objectives to complement research in this field. One promising direction involves developing ecological and longitudinal designs, as seen in studies using wearables or repeated assessments. These approaches align more closely with clinical realities and could allow for the monitoring of symptom trajectories and therapeutic progress in real time (Lamers et al., 2014). Within the field of study itself, it is necessary to standardize speech protocols that optimize results. To do so, studies comparing performance across different types of tasks will be necessary. On the other hand, it is surprising how little literature combines several sources of behavioral information, given that we found only one study that combines speech and facial expression. One area to explore seems to be the combination with linguistic markers, which have already been explored in other age groups (Tølbøll, 2019; Trifu et al., 2024). Finally, the works of Zhou et al. (2022, 2023, 2024) and Fraser et al. (2016) explore the possibility of focusing on people with cognitive impairment or even diagnosed dementia, populations in which the assessment of depression is particularly complex.
5 Conclusion
This systematic review shows that automatic speech analysis offers a promising approach for detecting depressive symptoms in older adults. Acoustic parameters have been identified that consistently reflect symptoms of the disorder, particularly in prosodic and temporal variables such as pause duration, speech rate, and vocal intensity, as well as spectral and voice quality variables. Together these features appear to capture motor and potentially cognitive impairments associated with depression, which may serve as valuable behavioral markers. In addition, factors such as gender, age, and cognitive status must be considered, given their significant impact on vocal patterns.
Taken together, these findings indicate that speech analysis holds potential as a non-invasive, cost-effective screening tool for depression in older adults that may provide earlier diagnosis and, therefore, better prognosis. This approach may not only enhance diagnostic accuracy but also facilitate continuous monitoring of patients’ emotional and cognitive status in clinical and daily life contexts. However, the current body of research remains preliminary, and future studies with larger, more representative samples and stronger control of confounding variables are needed before speech-based assessment can be implemented as a clinical screening instrument.
Beyond its diagnostic relevance, voice analysis carries important psychosocial implications. Older adults frequently experience social isolation and limited access to mental health services, and the implementation of automated monitoring systems may offer opportunities for preventive interventions and continuous follow-up. Conceptually, this aligns with the frameworks of health psychology and gerontology (Cesari et al., 2022), which advocate for person-centered models of care focused on early detection and the promotion of healthy aging.
Author contributions
IM-N: Conceptualization, Data curation, Formal analysis, Writing – original draft. DC: Data curation, Writing – review & editing. FG: Writing – review & editing. FM-S: Writing – review & editing. JM: Conceptualization, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by the Spanish Ministry of Science, Innovation and Universities [PID2022-139103OB-I00].
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The authors declare that no Gen AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Abd-alrazaq, A., AlSaad, R., Aziz, S., Ahmed, A., Denecke, K., Househ, M., et al. (2023). Wearable artificial intelligence for anxiety and depression: scoping review. J. Med. Internet Res. 25:e42672. doi: 10.2196/42672
Abdoli, N., Salari, N., Darvishi, N., Jafarpour, S., Solaymani, M., Mohammadi, M., et al. (2022). The global prevalence of major depressive disorder (MDD) among the elderly: a systematic review and meta-analysis. Neurosci. Biobehav. Rev. 132, 1067–1073. doi: 10.1016/j.neubiorev.2021.10.041
Albuquerque, L., Valente, A. R. S., Teixeira, A., Figueiredo, D., Sa-Couto, P., and Oliveira, C. (2021). Association between acoustic speech features and non-severe levels of anxiety and depression symptoms across lifespan. PLoS One 16:e0248842. doi: 10.1371/journal.pone.0248842
Alghowinem, S., Goecke, R., Wagner, M., Epps, J., Breakspear, M., and Parker, G. (2012). From Joyous to clinically depressed: mood detection using spontaneous speech. In Proceedings of the 25th International Florida Artificial Intelligence Research Society Conference, FLAIRS-25.
Alghowinem, S., Goecke, R., Wagner, M., Epps, J., Breakspear, M., and Parker, G. (2013). Detecting depression: a comparison between spontaneous and read speech. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 7547–7551.
Anderson, N. D. (2019). State of the science on mild cognitive impairment (MCI). CNS Spectr. 24, 78–87. doi: 10.1017/S1092852918001347
Baird, K., Baillon, S., Lau, L. S. L., Storey, M., Lindesay, J., and Velayudhan, L. (2021). Predictive factors for conversion to dementia in individuals with early-onset mild cognitive impairment. Dement. Geriatr. Cogn. Disord. 50, 548–553. doi: 10.1159/000520882
Beck, A. T., Steer, R. A., and Brown, G. K. (1996). BDI-II: Beck depression inventory. London: Pearson.
Bóna, J. (2014). Temporal characteristics of speech: the effect of age and speech style. J. Acoust. Soc. Am. 136:EL116-EL121. doi: 10.1121/1.4885482
Calić, G., Petrović-Lazić, M., Mentus, T., and Babac, S. (2022). Acoustic features of voice in adults suffering from depression. Psihol. istraz. 25, 183–203. doi: 10.5937/PSISTRA25-39224
Campbell, J. M., Klugar, M., Ding, S., Carmody, D. P., Hakonsen, S. J., Jadotte, Y. T., et al. (2015). Diagnostic test accuracy: methods for systematic review and meta-analysis. Int. J. Evid.-Based Healthc. 13, 154–162. doi: 10.1097/XEB.0000000000000061
Cannizzaro, M., Harel, B., Reilly, N., Chappell, P., and Snyder, P. J. (2004). Voice acoustical measurement of the severity of major depression. Brain Cogn. 56, 30–35. doi: 10.1016/j.bandc.2004.05.003
Carrillo, F., Sigman, M., Fernández Slezak, D., Ashton, P., Fitzgerald, L., Stroud, J., et al. (2018). Natural speech algorithm applied to baseline interview data can predict which patients will respond to psilocybin for treatment-resistant depression. J. Affect. Disord. 230, 84–86. doi: 10.1016/j.jad.2018.01.006
Cesari, M., Sumi, Y., Han, Z. A., Perracini, M., Jang, H., Briggs, A., et al. (2022). Implementing care for healthy ageing. BMJ Glob. Health 7:e007778. doi: 10.1136/bmjgh-2021-007778
Cohen, A. S., Najolia, G. M., Kim, Y., and Dinzeo, T. J. (2012). On the boundaries of blunt affect/alogia across severe mental illness: implications for research domain criteria. Schizophr. Res. 140, 41–45. doi: 10.1016/j.schres.2012.07.001
Cummins, N., Dineley, J., Conde, P., Matcham, F., Siddi, S., Lamers, F., et al. (2023). Multilingual markers of depression in remotely collected speech samples: a preliminary analysis. J. Affect. Disord. 341, 128–136. doi: 10.1016/j.jad.2023.08.097
Cummins, N., Scherer, S., Krajewski, J., Schnieder, S., Epps, J., and Quatieri, T. F. (2015). A review of depression and suicide risk assessment using speech analysis. Speech Comm. 71, 10–49. doi: 10.1016/j.specom.2015.03.004
Cummins, N., Vlasenko, B., Sagha, H., and Schuller, B. (2017). Enhancing speech-based depression detection through gender dependent vowel-level formant features. In A. ten Teije, C. Popow, J. H. Holmes, and L. Sacchi (Eds.), Artificial Intelligence in Medicine (pp. 209–214). Berlin: Springer International Publishing.
Das, A. K., and Naskar, R. (2024). A deep learning model for depression detection based on MFCC and CNN generated spectrogram features. Biomed. Signal Process. Control. 90:105898. doi: 10.1016/j.bspc.2023.105898
de Boer, J. N., Voppel, A. E., Brederoo, S. G., Schnack, H. G., Truong, K. P., Wijnen, F. N. K., et al. (2023). Acoustic speech markers for schizophrenia-spectrum disorders: a diagnostic and symptom-recognition tool. Psychol. Med. 53, 1302–1312. doi: 10.1017/S0033291721002804
Devita, M., De Salvo, R., Ravelli, A., De Rui, M., Coin, A., Sergi, G., et al. (2022). Recognizing depression in the elderly: practical guidance and challenges for clinical management. Neuropsychiatr. Dis. Treat. 18, 2867–2880. doi: 10.2147/NDT.S347356
Edelstein, B. A., Drozdick, L. W., and Ciliberti, C. M. (2010). Assessment of depression and bereavement in older adults. Amsterdam: Elsevier Academic Press.
Edmonds, E. C., Delano-Wood, L., Galasko, D. R., Salmon, D. P., and Bondi, M. W. (2014). Subjective cognitive complaints contribute to misdiagnosis of mild cognitive impairment. J. Int. Neuropsychol. Soc. 20, 836–847. doi: 10.1017/S135561771400068X
Fagherazzi, G., Fischer, A., Ismael, M., and Despotovic, V. (2021). Voice for health: the use of vocal biomarkers from research to clinical practice. Dig. Biomark. 5, 78–88. doi: 10.1159/000515346
Flanagan, O., Chan, A., Roop, P., and Sundram, F. (2021). Using acoustic speech patterns from smartphones to investigate mood disorders: scoping review. JMIR Mhealth Uhealth 9:e24352. doi: 10.2196/24352
Fraser, K. C., Rudzicz, F., and Hirst, G. (2016). Detecting late-life depression in Alzheimer’s disease through analysis of speech and language. Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology, 1–11.
Giannouli, V., Stamovlasis, D., and Tsolaki, M. (2022). Longitudinal study of depression on amnestic mild cognitive impairment and financial capacity. Clin. Gerontol. 45, 708–714. doi: 10.1080/07317115.2021.2017377
Hamilton, M. (1986). The Hamilton rating scale for depression. In N. Sartorius and T. A. Ban (Eds.), Assessment of depression (pp. 143–152). Berlin: Springer.
Harlev, D., Singer, S., Goldshalger, M., Wolpe, N., and Bergmann, E. (2025). Acoustic speech features are associated with late-life depression and apathy symptoms: preliminary findings. Alzheim. Dement. 17:e70055. doi: 10.1002/dad2.70055
Harvey, P. D., Strassnig, A., Strassnig, M., Heaton, A., Kuehn, K., Torre, P., et al. (2023). Mild cognitive impairment, but not HIV status, is related to reduced awareness of level of cognitive performance among older adults. Am. J. Geriatr. Psychiatry 31, 1117–1128. doi: 10.1016/j.jagp.2023.07.009
Hashim, N. N. W. N., Basri, N. A., Ezzi, M. A.-E. A., and Hashim, N. (2022). Comparison of classifiers using robust features for depression detection on Bahasa Malaysia speech. Int. J. Artif. Intell. 11, 238–253. doi: 10.11591/ijai.v11.i1.pp238-253
Hashim, N. W., Wilkes, M., Salomon, R., Meggs, J., and France, D. J. (2017). Evaluation of voice acoustics as predictors of clinical depression scores. J. Voice 31, 256.e1–256.e6. doi: 10.1016/j.jvoice.2016.06.006
Helfer, B. S., Quatieri, T. F., Williamson, J. R., Mehta, D. D., Horwitz, R., and Yu, B. (2013). Classification of depression state based on articulatory precision. Interspeech, 2172–2176. doi: 10.21437/interspeech.2013-513
Higuchi, M., Shinohara, S., Nakamura, M., Omiya, Y., Hagiwara, N., Mitsuyoshi, S., et al. (2017). Study on depression evaluation indicator in the elderly using sensibility technology. International Conference on Information and Communication Technologies for Ageing Well and e-Health, 70–77.
Higuchi, M., Shinohara, S., Nakamura, M., Omiya, Y., Hagiwara, N., Takano, T., et al. (2018). Study on indicators for depression in the elderly using voice and attribute information. In C. Röcker, J. O’Donoghue, M. Ziefle, L. Maciaszek, and W. Molloy (Eds.), Information and communication Technologies for Ageing Well and e-health (pp. 127–146). Berlin: Springer International Publishing.
Hu, T., Zhao, X., Wu, M., Li, Z., Luo, L., Yang, C., et al. (2022). Prevalence of depression in older adults: a systematic review and meta-analysis. Psychiatry Res. 311:114511. doi: 10.1016/j.psychres.2022.114511
James, T. A., Weiss-Cowie, S., Hopton, Z., Verhaeghen, P., Dotson, V. M., and Duarte, A. (2021). Depression and episodic memory across the adult lifespan: a meta-analytic review. Psychol. Bull. 147, 1184–1214. doi: 10.1037/bul0000344
Jiang, H., Hu, B., Liu, Z., Yan, L., Wang, T., Liu, F., et al. (2017). Investigation of different speech types and emotions for detecting depression using different classifiers. Speech Comm. 90, 39–46. doi: 10.1016/j.specom.2017.04.001
König, A., Mallick, E., Tröger, J., Linz, N., Zeghari, R., Manera, V., et al. (2021). Measuring neuropsychiatric symptoms in patients with early cognitive decline using speech analysis. Eur. Psychiatry 64:e64. doi: 10.1192/j.eurpsy.2021.2236
Kourtis, L. C., Regele, O. B., Wright, J. M., and Jones, G. B. (2019). Digital biomarkers for Alzheimer’s disease: the mobile/wearable devices opportunity. Npj Dig. Med. 2:9. doi: 10.1038/s41746-019-0084-2
Kroenke, K., Spitzer, R. L., and Williams, J. B. W. (2001). The PHQ-9: validity of a brief depression severity measure. J. Gen. Intern. Med. 16, 606–613. doi: 10.1046/j.1525-1497.2001.016009606.x
Lamers, S. M., Truong, K. P., Steunenberg, B., de Jong, F., and Westerhof, G. J. (2014). Applying prosodic speech features in mental health care: an exploratory study in a life-review intervention for depression. ACL Workshop on Computational Linguistics and Clinical Psychology 2014: From Linguistic Signal to Clinical Reality, 61–68.
Lara, E., Koyanagi, A., Domènech-Abella, J., Miret, M., Ayuso-Mateos, J. L., and Haro, J. M. (2017). The impact of depression on the development of mild cognitive impairment over 3 years of follow-up: a population-based study. Dement. Geriatr. Cogn. Disord. 43, 155–169. doi: 10.1159/000455227
Lee, S., Suh, S. W., Kim, T., Kim, K., Lee, K. H., Lee, J. R., et al. (2021). Screening major depressive disorder using vocal acoustic features in the elderly by sex. J. Affect. Disord. 291, 15–23. doi: 10.1016/j.jad.2021.04.098
Lin, Y., Liyanage, B. N., Sun, Y., Lu, T., Zhu, Z., Liao, Y., et al. (2022). A deep learning-based model for detecting depression in senior population. Front. Psych. 13:1016676. doi: 10.3389/fpsyt.2022.1016676
Little, B., Alshabrawy, O., Stow, D., Ferrier, I. N., McNaney, R., Jackson, D. G., et al. (2021). Deep learning-based automated speech detection as a marker of social functioning in late-life depression. Psychol. Med. 51, 1441–1450. doi: 10.1017/S0033291719003994
Liu, L., Liu, L., Wafa, H. A., Tydeman, F., Xie, W., and Wang, Y. (2024). Diagnostic accuracy of deep learning using speech samples in depression: a systematic review and meta-analysis. J. Am. Med. Inform. Assoc. 31, 2394–2404. doi: 10.1093/jamia/ocae189
Lortie, C. L., Thibeault, M., Guitton, M. J., and Tremblay, P. (2015). Effects of age on the amplitude, frequency and perceived quality of voice. Age 37:117. doi: 10.1007/s11357-015-9854-1
Ly, M., Karim, H. T., Becker, J. T., Lopez, O. L., Anderson, S. J., Aizenstein, H. J., et al. (2021). Late-life depression and increased risk of dementia: a longitudinal cohort study. Transl. Psychiatry 11:147. doi: 10.1038/s41398-021-01269-y
Ma, X., Yang, H., Chen, Q., Huang, D., and Wang, Y. (2016). DepAudioNet: an efficient deep model for audio based depression classification. Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge, 35–42.
Manea, L., Gilbody, S., and McMillan, D. (2015). A diagnostic meta-analysis of the patient health Questionnaire-9 (PHQ-9) algorithm scoring method as a screen for depression. Gen. Hosp. Psychiatry 37, 67–75. doi: 10.1016/j.genhosppsych.2014.09.009
Martínez-Nicolás, I., Llorente, T. E., Martínez-Sánchez, F., and Meilán, J. J. G. (2021). Ten years of research on automatic voice and speech analysis of people with Alzheimer’s disease and mild cognitive impairment: a systematic review article. Front. Psychol. 12:620251. doi: 10.3389/fpsyg.2021.620251
Mazzetto de Menezes, K. S., Master, S., Guzman, M., Bortnem, C., and Ramos, L. R. (2014). Differences in acoustic and perceptual parameters of the voice between elderly and young women at habitual and high intensity. Acta Otorrinolaringologica (English Edition) 65, 76–84. doi: 10.1016/j.otoeng.2013.11.012
Menne, F., Dörr, F., Schräder, J., Tröger, J., Habel, U., König, A., et al. (2024). The voice of depression: speech features as biomarkers for major depressive disorder. BMC Psychiatry 24:794. doi: 10.1186/s12888-024-06253-6
Mijnders, C., Janse, E., Naarding, P., and Truong, K. (2023). Acoustic characteristics of depression in older adults’ speech: the role of covariates.
Mitchell, A. M., Garand, L., Dean, D., Panzak, G., and Taylor, M. (2005). Suicide assessment in hospital emergency departments: implications for patient satisfaction and compliance. Adv. Emerg. Nurs. J. 27, 302–312.
Moola, S., Tufanaru, C., Aromataris, E., Sears, K., Sfetc, R., Currie, M., et al. (2020). Chapter 7: systematic reviews of etiology and risk. In JBI manual for evidence synthesis. eds. E. Aromataris, C. Lockwood, K. Porritt, B. Pilla, and Z. Jordan. 252–311. doi: 10.46658/JBIMES-20-08
Moon, S. W., Lee, B., and Choi, Y. C. (2018). Changes in the hippocampal volume and shape in early-onset mild cognitive impairment. Psychiatry Investig. 15, 531–537. doi: 10.30773/pi.2018.02.12
Muhammad, T., and Meher, T. (2021). Association of late-life depression with cognitive impairment: evidence from a cross-sectional study among older adults in India. BMC Geriatr. 21:364. doi: 10.1186/s12877-021-02314-7
Mundt, J. C., Vogel, A. P., Feltner, D. E., and Lenderking, W. R. (2012). Vocal acoustic biomarkers of depression severity and treatment response. Biol. Psychiatry 72, 580–587. doi: 10.1016/j.biopsych.2012.03.015
Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., et al. (2021). The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 372:n71. doi: 10.1136/bmj.n71
Qi, X., Zhou, Q., Dong, J., and Bao, W. (2023). Noninvasive automatic detection of Alzheimer’s disease from spontaneous speech: a review. Front. Aging Neurosci. 15:1224723. doi: 10.3389/fnagi.2023.1224723
Quatieri, T., and Malyska, N. (2012). Vocal-source biomarkers for depression: a link to psychomotor activity. Proceedings of Interspeech.
Ramanarayanan, V., Lammert, A. C., Rowe, H. P., Quatieri, T. F., and Green, J. R. (2022). Speech as a biomarker: opportunities, interpretability, and challenges. Perspect. ASHA Spec. Inter. Groups 7, 276–283. doi: 10.1044/2021_PERSP-21-00174
Rejaibi, E., Komaty, A., Meriaudeau, F., Agrebi, S., and Othmani, A. (2022). MFCC-based recurrent neural network for automatic clinical depression recognition and assessment from speech. Biomed. Signal Process. Control. 71:103107. doi: 10.1016/j.bspc.2021.103107
Riad, R., Denais, M., de Gennes, M., Lesage, A., Oustric, V., Cao, X. N., et al. (2024). Automated speech analysis for risk detection of depression, anxiety, insomnia, and fatigue: algorithm development and validation study. J. Med. Internet Res. 26:e58572. doi: 10.2196/58572
Robin, J., Harrison, J. E., Kaufman, L. D., Rudzicz, F., Simpson, W., and Yancheva, M. (2020). Evaluation of speech-based digital biomarkers: review and recommendations. Dig. Biomark. 4, 99–108. doi: 10.1159/000510820
Saeedi, S., Hetjens, S., Grimm, M. O. W., and Barsties Latoszek, B. (2024). Acoustic speech analysis in Alzheimer’s disease: a systematic review and meta-analysis. J. Prev Alzheimers Dis. 11, 1789–1797. doi: 10.14283/jpad.2024.132
Sanchez, M. H., Vergyri, D., Ferrer, L., Richey, C., Garcia, P., Knoth, B., et al. (2011). Using prosodic and spectral features in detecting depression in elderly males. Interspeech, 3001–3004. doi: 10.21437/Interspeech.2011-751
Scherer, K. R., Johnstone, T., and Klasmeyer, G. (2003). “Vocal expression of emotion” in Handbook of affective sciences. eds. R. J. Davidson, K. R. Scherer, and H. H. Goldsmith (Oxford, United Kingdom: Oxford University Press).
Schultz, B. G., Rojas, S., St John, M., Kefalianos, E., and Vogel, A. P. (2023). A cross-sectional study of perceptual and acoustic voice characteristics in healthy aging. J. Voice 37, 969.e23–969.e41. doi: 10.1016/j.jvoice.2021.06.007
Seath, P., Macedo-Orrego, L. E., and Velayudhan, L. (2024). Clinical characteristics of early-onset versus late-onset Alzheimer’s disease: a systematic review and meta-analysis. Int. Psychogeriatr. 36, 1093–1109. doi: 10.1017/S1041610223000509
Smith, M., Dietrich, B. J., Bai, E., and Bockholt, H. J. (2020). Vocal pattern detection of depression among older adults. Int. J. Ment. Health Nurs. 29, 440–449. doi: 10.1111/inm.12678
Solana-Lavalle, G., and Rosas-Romero, R. (2021). Analysis of voice as an assisting tool for detection of Parkinson’s disease and its subsequent clinical interpretation. Biomed. Signal Process. Control 66:102415. doi: 10.1016/j.bspc.2021.102415
Stasak, B., Joachim, D., and Epps, J. (2022). Breaking age barriers with automatic voice-based depression detection. IEEE Pervasive Comput 21, 10–19. doi: 10.1109/MPRV.2022.3163656
Sumali, B., Mitsukura, Y., Liang, K., Yoshimura, M., Kitazawa, M., Takamiya, A., et al. (2020). Speech quality feature analysis for classification of depression and dementia patients. Sensors 20:3599. doi: 10.3390/s20123599
Taguchi, T., Tachikawa, H., Nemoto, K., Suzuki, M., Nagano, T., Tachibana, R., et al. (2018). Major depressive disorder discrimination using vocal acoustic features. J. Affect. Disord. 225, 214–220. doi: 10.1016/j.jad.2017.08.038
Tarailis, P., Lory, K., Unschuld, P. G., Michel, C. M., and Bréchet, L. (2025). Self-related thought alterations associated with intrinsic brain dysfunction in mild cognitive impairment. Sci. Rep. 15:12279. doi: 10.1038/s41598-025-97240-8
Tølbøll, K. B. (2019). Linguistic features in depression: a meta-analysis. J. Lang. Works Sprogvidenskabeligt Studentertidsskrift 4:39.
Trifu, R. N., Nemeș, B., Herta, D. C., Bodea-Hategan, C., Talaș, D. A., and Coman, H. (2024). Linguistic markers for major depressive disorder: a cross-sectional study using an automated procedure. Front. Psychol. 15:1355734. doi: 10.3389/fpsyg.2024.1355734
Von Glischinski, M., Von Brachel, R., and Hirschfeld, G. (2019). How depressed is “depressed”? A systematic review and diagnostic meta-analysis of optimal cut points for the Beck depression inventory revised (BDI-II). Qual. Life Res. 28, 1111–1118. doi: 10.1007/s11136-018-2050-x
Wang, J., Zhang, L., Liu, T., Pan, W., Hu, B., and Zhu, T. (2019). Acoustic differences between healthy and depressed people: a cross-situation study. BMC Psychiatry 19:300. doi: 10.1186/s12888-019-2300-7
World Health Organization (2022). ICD-11: international classification of diseases (11th revision). Geneva: World Health Organization.
Yang, J., Zhang, L., Yang, C., Li, X., and Li, Z. (2024). Global, regional, and National Epidemiology of depression in working-age individuals, 1990–2019. Depress. Anxiety 2024:4747449. doi: 10.1155/2024/4747449
Yesavage, J. A., and Sheikh, J. I. (1986). 9/geriatric depression scale (GDS). Clin. Gerontol. 5, 165–173. doi: 10.1300/J018v05n01_09
Zhang, L., Duvvuri, R., Chandra, K. K. L., Nguyen, T., and Ghomi, R. H. (2020). Automated voice biomarkers for depression symptoms using an online cross-sectional data collection initiative. Depress. Anxiety 37, 657–669. doi: 10.1002/da.23020
Zhou, Y., Han, W., Yao, X., Xue, J., Li, Z., and Li, Y. (2023). Developing a machine learning model for detecting depression, anxiety, and apathy in older adults with mild cognitive impairment using speech and facial expressions: a cross-sectional observational study. Int. J. Nurs. Stud. 146:104562. doi: 10.1016/j.ijnurstu.2023.104562
Zhou, Y., Yao, X., Han, W., Li, Y., Xue, J., and Li, Z. (2024). Measurement of neuropsychiatric symptoms in the older adults with mild cognitive impairment based on speech and facial expressions: a cross-sectional observational study. Aging Ment. Health 28, 828–837. doi: 10.1080/13607863.2023.2280913
Zhou, Y., Yao, X., Han, W., Wang, Y., Li, Z., and Li, Y. (2022). Distinguishing apathy and depression in older adults with mild cognitive impairment using text, audio, and video based on multiclass classification and shapely additive explanations. Int. J. Geriatr. Psychiatry 37. doi: 10.1002/gps.5827
Keywords: depression, older adults, speech, acoustic analysis, cognitive impairment
Citation: Martínez-Nicolás I, Criado D, Gordillo F, Martínez-Sánchez F and Meilán JJG (2025) Speech analysis for detecting depression in older adults: a systematic review. Front. Psychol. 16:1715538. doi: 10.3389/fpsyg.2025.1715538
Edited by:
Grigorios Nasios, University of Ioannina, Greece
Reviewed by:
Vaitsa Giannouli, Aristotle University of Thessaloniki, Greece
Nefeli Dimitriou, University of Ioannina, Greece
Copyright © 2025 Martínez-Nicolás, Criado, Gordillo, Martínez-Sánchez and Meilán. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Israel Martínez-Nicolás, israelmani@usal.es