Association between heart rate variability metrics from a smartwatch and self-reported depression and anxiety symptoms: a four-week longitudinal study

Background Elucidating the association between heart rate variability (HRV) metrics obtained through non-invasive methods and mental health symptoms could provide an accessible approach to mental health monitoring. This study explores the correlation between HRV, estimated using photoplethysmography (PPG) signals, and self-reported symptoms of depression and anxiety. Methods A 4-week longitudinal study was conducted among 47 participants. Time–domain and frequency–domain HRV metrics were derived from PPG signals collected via smartwatches. Mental health symptoms were evaluated using the Patient Health Questionnaire-9 (PHQ-9) and Generalized Anxiety Disorder-7 (GAD-7) at baseline, week 2, and week 4. Results Among the investigated HRV metrics, RMSSD, SDNN, SDSD, LF, and the LF/HF ratio were significantly associated with the PHQ-9 score, although the number of significant correlations was relatively small. Furthermore, only SDNN, SDSD and LF showed significant correlations with the GAD-7 score. All HRV metrics showed negative correlations with self-reported clinical symptoms. Conclusions Our findings indicate the potential of PPG-derived HRV metrics in monitoring mental health, thereby providing a foundation for further research. Notably, parasympathetically biased HRV metrics showed weaker correlations with depression and anxiety scores. Future studies should validate these findings in clinically diagnosed patients.


Introduction
Mental health disorders constitute a major global public health concern, with an escalating number of patients requiring mental health services at considerable social cost (1). Depression and anxiety are the two most disabling mental disorders, ranked among the top 25 leading causes of healthcare burden worldwide in 2019 (2). In South Korea, the problem is severe, as the suicide rate has consistently ranked first in the OECD for more than 10 years, with around 25 people per 100,000 resorting to suicide each year (3). The disease burden of mental and behavioral disorders was estimated to account for 6.4% of the total disease burden in South Korea (4). The importance of mental health services therefore cannot be overstated.
Traditionally, patients receive diagnoses through face-to-face consultations with psychiatrists, which pave the way for various treatments such as counseling, medication, and hospitalization. However, this approach can render mental health services less accessible in certain areas or under specific circumstances (5). For instance, during the recent COVID-19 pandemic, the number of patients with depression surged; however, providing appropriate services proved challenging due to social distancing measures and other factors (6). Even outside of pandemic conditions, it is consistently reported that current mental health services are unable to cope with the rapid increase in the number of psychiatric patients (7). Furthermore, face-to-face consultations have inherent limitations; they rely on individuals' ability to recollect their symptoms, which can introduce significant bias (8).
Given these limitations in the provision of and access to adequate mental health services under the current system, applications of digital technologies are increasing (9)(10)(11). Digital technologies can help overcome the issue of accessibility in providing mental health services and can also alleviate recollection bias by offering real-time physiological digital markers to both physicians and patients. Commercialized services for some conditions, such as insomnia, already exist. For instance, the Sleep Healthy Using the Internet (SHUTi) service has been effective for insomnia and significantly reduced depression and anxiety symptoms (12). Moreover, studies on mobile intervention platforms for insomnia in South Korea have emphasized the potential of wearable devices (13). Mobile and wearable devices enable scalable sampling of the experiences and feelings of a patient (14), thus facilitating the systematic and objective collection of well-being reports at scale (15). Wearable devices not only address accessibility concerns in traditional healthcare, but also enable healthcare providers to achieve more precise diagnoses through continuous collection of real-time patient biometrics, allowing physicians to analyze a patient's condition over a broader spectrum (16).
Nevertheless, the information that wearable devices can collect is somewhat restricted in both quality and quantity compared with medical devices. Identifying which of these signals carry clinically relevant information about the user's mental health is therefore a priority for wearable devices. Research has consequently expanded to heart rate variability (HRV) (17), which refers to the small variations between heartbeat cycles. In a healthy human heart, a dynamic relationship exists between the parasympathetic nervous system (PNS) and the sympathetic nervous system, often referred to as autonomic nervous system balance. Consequently, HRV is associated with numerous psychiatric symptoms. It has been suggested that patients with depression exhibit lower HF power, which indicates a diminished regulatory ability of the parasympathetic nervous system and reduced short-term flexibility of the autonomic nervous system (18). In addition, a meta-analysis has revealed that anxiety disorders are associated with significant reductions in both high-frequency and time-domain HRV metrics. This reduction may signify a failure of inhibition, characterized by a diminished capacity to inhibit responses, leading to decreased vagal outflow and lower HRV (19). Furthermore, it has been suggested that the LF/HF ratio, generally indicative of autonomic balance, may reflect aspects of an individual's resilience profile (20).
However, HRV is traditionally measured using an electrocardiogram, which is time-consuming and resource-intensive. Therefore, various methods to measure HRV using scalable devices, such as wearables, have been developed (21,22). Among them, the photoplethysmography (PPG) method is favored owing to its reliability compared with the gold standard electrocardiogram method (23)(24)(25). In this study, we aimed to advance this exploration and investigate whether HRV measured using wearable devices could be applied to depression and anxiety, that is, whether real-time signals measured through wearable devices correlate with mental health. We therefore examined the association between HRV metrics collected in real time while wearing a smartwatch and self-reported depression and anxiety in healthy adults.

Study participants
The study initially recruited young adults who studied or worked at the Korea Advanced Institute of Science and Technology (KAIST) and the Institute for Basic Science. We excluded individuals with comorbid medical or psychiatric conditions; therefore, those with a formal diagnosis of depression or other psychiatric disorders were not included in the study. However, participants with a certain level of depressive or anxiety symptoms, which did not meet the diagnostic criteria for psychiatric disorders such as major depressive disorder (MDD), were still eligible. Additionally, we excluded individuals with limited access to Wi-Fi and night or shift workers to avoid bias in biomedical signal interpretation. This four-week experiment ran from March 8th to April 4th, 2021. All data were anonymized prior to the analysis. This study was approved by the Institutional Review Board of KAIST (KH2020-027).

Psychiatric symptom assessment
Participants were requested to complete online assessments of psychiatric symptoms, such as depression and anxiety. SurveyMonkey (https://www.surveymonkey.com/), an online survey platform, was utilized to formulate a questionnaire that evaluated symptoms of depression and anxiety. Participants had to complete the survey three times at two-week intervals: baseline, week 2, and week 4. For depressive symptoms, the PHQ-9 questionnaire was used (26), while the GAD-7 questionnaire (27) was used to assess anxiety symptoms. The PHQ-9 and GAD-7 are brief self-report questionnaires comprising 9 and 7 questions, respectively. Higher scores on these questionnaires suggest more severe levels of depression and anxiety.

Collection and processing of biomedical signals
We utilized the Samsung Galaxy Active 2 (Samsung Electronics, Seoul, Korea) for the continuous collection of biomedical signals over a four-week period. Participants were required to wear this device at all times, enabling the uninterrupted gathering of signals. These signals were automatically uploaded to a central web server every 30 minutes, provided the device had Wi-Fi connectivity. On the server side, data were stored in a MongoDB database instance. Among the various signals gathered through wearable devices, this study predominantly focused on collecting PPG signals to measure HRV. The PPG signal was sampled every 100 ms (10 Hz), to enable continuous recordings throughout the day while ensuring battery life. The continuous PPG signal was segmented into consecutive 5-minute slices for later HRV analysis. Each slice was passed through a bandpass filter to remove frequency outliers not corresponding to human heart rates, based on the Nyquist-Shannon theorem. Subsequently, we used the HeartPy algorithm (28, 29) to identify RR intervals. From these intervals, we calculated HRV parameters for each signal slice. Figure 1 presents the processing pipeline for conducting HRV analysis on the raw PPG signal. Further detail is described elsewhere (30, 31).
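The segmentation step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which relies on HeartPy for beat detection); the function names, the choice to drop a trailing partial slice, and the example passband are assumptions made here for illustration. Only the 10 Hz sampling rate and the 5-minute slice length come from the text.

```python
SAMPLING_HZ = 10            # from the text: one sample every 100 ms
SLICE_SECONDS = 5 * 60      # from the text: 5-minute slices
SLICE_LEN = SAMPLING_HZ * SLICE_SECONDS  # 3000 samples per slice


def segment_ppg(samples, slice_len=SLICE_LEN):
    """Split a PPG sample list into consecutive, non-overlapping slices.

    Assumption of this sketch: a trailing partial slice is dropped.
    """
    n_full = len(samples) // slice_len
    return [samples[i * slice_len:(i + 1) * slice_len] for i in range(n_full)]


def nyquist_hz(sampling_hz=SAMPLING_HZ):
    """Highest frequency representable at the given sampling rate."""
    return sampling_hz / 2.0


def passband_is_valid(low_hz, high_hz, sampling_hz=SAMPLING_HZ):
    """Check that a band-pass range lies strictly below the Nyquist frequency."""
    return 0.0 < low_hz < high_hz < nyquist_hz(sampling_hz)
```

At 10 Hz the Nyquist frequency is 5 Hz, so any passband chosen to cover plausible heart rates has to sit below that limit. A hypothetical band of 0.7 to 3.5 Hz (42 to 210 beats per minute) would satisfy `passband_is_valid(0.7, 3.5)`; the band actually used in the study is not stated in this section.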

HRV metrics
Several metrics have been studied for HRV. This study analyzed certain time-domain and frequency-domain measures by referencing existing literature (32). Initially, we evaluated the root mean square of successive differences (RMSSD) between normal heartbeats, the standard deviation of the inter-beat interval of normal sinus beats (SDNN), the standard deviation of successive differences between normal heartbeats (SDSD), and the percentage of successive differences between normal heartbeats that differ from each other by more than 50 ms (PNN50) as time-domain measures to quantify the extent of variability. Next, we assessed the absolute power of the low-frequency band (LF), the absolute power of the high-frequency band (HF), and the LF/HF ratio as frequency-domain measures after dividing HRV into different frequency bands using a fast Fourier transform. Table 1 provides a detailed description of each HRV metric.
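The four time-domain measures defined above can be written out as a short Python sketch operating on a list of RR intervals in milliseconds. This is an illustration of the standard definitions, not the code used in the study; the function names are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which estimator was applied.

```python
import math


def successive_differences(rr_ms):
    """Differences between each RR interval and the one before it."""
    return [b - a for a, b in zip(rr_ms, rr_ms[1:])]


def sample_sd(values):
    """Sample standard deviation (n - 1 denominator; an assumption here)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def rmssd(rr_ms):
    """Root mean square of successive differences."""
    diffs = successive_differences(rr_ms)
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sdnn(rr_ms):
    """Standard deviation of the RR intervals themselves."""
    return sample_sd(rr_ms)


def sdsd(rr_ms):
    """Standard deviation of the successive differences."""
    return sample_sd(successive_differences(rr_ms))


def pnn50(rr_ms):
    """Percentage of successive differences larger than 50 ms in magnitude."""
    diffs = successive_differences(rr_ms)
    return 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)
```

For a toy series such as `[800, 810, 790, 860, 800]`, the successive differences are 10, -20, 70 and -60 ms, so two of the four exceed 50 ms in magnitude and `pnn50` is 50%. RMSSD and SDSD differ only in that SDSD subtracts the mean difference first, which is why the two track each other closely on real recordings.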

Statistical analysis
The collected biomedical signals were analyzed in 2-week increments. Each HRV metric was split into the first 2 weeks and the subsequent 2 weeks, and the mean of the metric was calculated for each period. These variables were then correlated with the self-reported depression and anxiety scores of each participant. Considering the potential delay in the temporal relationship between HRV and psychiatric symptoms, the mean HRV metrics of the first 2 weeks were correlated with clinical measures at baseline and at weeks 2 and 4, while the mean HRV metrics of the subsequent 2 weeks were correlated with clinical measures at weeks 2 and 4. Pearson correlation analysis was applied, and a p-value of < 0.05 was considered statistically significant. Additionally, we performed sensitivity analyses for each gender to account for differences in cardiac electrophysiology (33) and psychiatric symptoms (34). All statistical analyses were conducted using R version 4.1.3 (35).
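The analysis described above (averaging a metric within each 2-week period, then correlating the period means with questionnaire scores across participants) can be sketched as follows. The authors used R; this Python version is only an illustration, and the function names, the `(day, value)` input layout, and the day-14 split index are assumptions. The p-value itself requires the t distribution and is left to a statistics package; the sketch stops at the t statistic on n - 2 degrees of freedom.

```python
import math


def period_means(daily_values, split_day=14):
    """Mean of a metric over days [0, split_day) and over days >= split_day.

    daily_values is a list of (day_index, value) pairs (hypothetical layout).
    """
    first = [v for day, v in daily_values if day < split_day]
    second = [v for day, v in daily_values if day >= split_day]
    return sum(first) / len(first), sum(second) / len(second)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

With 47 participants the test has 45 degrees of freedom, so fairly modest coefficients can reach significance; for example, `t_statistic(-0.329, 47)` is about -2.34. No correction for the number of metric-by-time-point comparisons is described in the text, which is worth keeping in mind when reading the tables.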

Characteristics of study participants
A total of 47 participants were included in the study, 24 males and 23 females, with an average age of 28.7 (5.79) years. The average height of the participants was 169.0 (6.38) cm, and the average weight was 63.1 (11.5) kg. The study participants comprised 13 undergraduates, 17 graduate students, and 17 office workers. A total of 25 participants (53.2%) had used a personal smartwatch before the study. The mean PHQ-9 score of the study participants taken via self-report questionnaires was 3.62 (3.69) at baseline, 3.75 (3.69) at week 2, and 3.75 (3.78) at week 4. The mean GAD-7 score was 3.02 (3.53) at baseline, 2.53 (2.91) at week 2, and 2.45 (2.69) at week 4. There were no reported physiological or psychological adverse effects from wearing the watch throughout the study period. Despite some participants occasionally forgetting to wear the watch, all experiments were successfully completed with the support and continuous monitoring provided by researchers throughout the study period. The demographics and clinical characteristics of study participants are presented in Table 2.

Correlations between HRV and depression
Regarding time-domain measures, the mean RMSSD for the first 2 weeks did not significantly correlate with PHQ-9 scores at any time point. However, the mean RMSSD for the subsequent 2 weeks significantly correlated with the PHQ-9 score measured at week 2 (r = -0.329, p = 0.024). The mean SDNN for the first 2 weeks significantly correlated with the PHQ-9 score measured at the initial assessment (r = -0.310, p = 0.034), while the mean SDNN for the subsequent 2 weeks correlated significantly with PHQ-9 scores measured at the week 2 and week 4 assessments (r = -0.327, p = 0.025; r = -0.301, p = 0.040, respectively). Only the mean SDSD of the subsequent 2 weeks showed a significant correlation with the PHQ-9 score of week 2 (r = -0.356, p = 0.014).
Regarding frequency-domain measures, the mean LF for the first 2 weeks significantly correlated with the PHQ-9 score at the initial assessment (r = -0.369, p = 0.011). The mean LF for the subsequent 2 weeks was significantly correlated with the PHQ-9 scores at weeks 2 and 4 (r = -0.355, p = 0.014; r = -0.341, p = 0.019, respectively). The mean LF/HF ratio of the first 2 weeks significantly correlated with the PHQ-9 score at the initial assessment (r = -0.363, p = 0.012). However, no significant correlation was found between the mean HF for any period and the measured PHQ-9 scores. The correlations between HRV metrics and depressive symptoms measured by PHQ-9 are presented in Table 3.

Correlations between HRV and anxiety
For time-domain measures, the mean RMSSD and PNN50 showed no significant correlations with GAD-7 scores at any time point. The mean SDNN and SDSD showed statistically significant correlations. Specifically, the mean SDNN for the first and the subsequent 2 weeks showed significant correlations with the GAD-7 score measured at week 2 (r = -0.300, p = 0.040; r = -0.293, p = 0.046), and the mean SDSD of the first 2 weeks showed a significant correlation with the GAD-7 score of week 2 (r = -0.310, p = 0.034).
For frequency-domain measures, the mean LF for the first 2 weeks significantly correlated with the GAD-7 score measured at the initial assessment (r = -0.292, p = 0.047). The mean LF for the subsequent 2 weeks also significantly correlated with the GAD-7 score measured at week 2 (r = -0.323, p = 0.027). However, neither the mean HF nor the LF/HF ratio significantly correlated with GAD-7 scores at any time point. The correlations between HRV metrics and anxiety symptoms measured by GAD-7 are presented in Table 4.

Sensitivity analyses
We performed sensitivity analyses for each gender. Overall, the results remained consistent when correlations were evaluated among males, females, or the group as a whole. However, when analyzing correlations within a single gender, the number of significant correlations between HRV metrics and clinical measures decreased. This might be attributed to the smaller sample size, which reduces statistical power. The overall trend of correlation was the same across genders, with differences only in the p-values. The sole exception was observed in male participants, where the mean SDNN for the first 2 weeks significantly correlated with the PHQ-9 score measured at the final assessment (r = -0.452, p = 0.026).

Discussion
The association between HRV and psychiatric symptoms is acknowledged in the literature (18)(19)(20)(36). From a physiological perspective, the abnormal serotonergic system observed in various psychiatric conditions may contribute to cardiovascular dysregulation through alterations in endocrine and autonomic functions (37). This relationship is thought to be partly modulated by the hypothalamic-pituitary-adrenal (HPA) axis. Furthermore, because the serotonin transporter is predominantly found in platelets, where serotonin acts as a vasoconstrictor, a potential pathogenic link between psychiatric conditions and cardiovascular dysregulations may exist (38).
In line with previous literature, we found that some HRV metrics were significantly correlated with clinical measures obtained with self-report questionnaires. What sets our findings apart from previous studies is that we obtained HRV data from PPG signals collected via a highly accessible wearable device, suggesting its potential to monitor mental health across a wide demographic range. Furthermore, we observed somewhat different patterns of associations between HRV metrics and clinical measures. Previous studies have suggested that HF is significantly associated with depression or anxiety, as lower HF power is linked to stress, panic, or anxiety (39). The relationships between time-domain measures and clinical measures are complex, yet multiple correlations have been repeatedly reported (18,19). In contrast, our study results demonstrated correlations between LF and depression or anxiety, rather than HF. Furthermore, the LF/HF ratio showed only one significant correlation with clinical measures. Moreover, apart from SDNN, only a few correlations were observed between time-domain measures and depression or anxiety.
Nevertheless, our results are partially consistent with cardiac electrophysiology. The LF band (0.04-0.15 Hz), which predominantly reflects baroreceptor activity under resting conditions (40), may plausibly be associated with mental health conditions. For the LF/HF ratio, while it is often thought to reflect the balance of sympathetic and parasympathetic nervous systems, it has also been suggested that this ratio does not always reflect autonomic balance (41). Regarding time-domain measures, SDNN, generally considered the gold standard for clinical HRV metrics and medical stratification of cardiac risk (42), showed more than one correlation with clinical measures. A similar HRV metric, SDSD, also showed two significant correlations with clinical measures. On the other hand, the RMSSD, a measure of beat-to-beat variance in heart rate representing vagally mediated changes reflected in HRV and related to parasympathetic activity (43), showed only one significant correlation. Similarly, PNN50, although recognized as a less sensitive measure of the PNS, was not significantly correlated with clinical measures in this study. These results suggest that HRV is related to the balance of sympathetic and parasympathetic nervous systems; however, metrics biased toward parasympathetic activity are less likely to be associated with depression and anxiety.
Among the significant associations between HRV metrics and clinical measures, all metrics from the first two weeks were associated only with clinical measures at baseline or week 2. Similarly, HRV metrics from the following two weeks were associated with clinical measures at weeks 2 or 4. This suggests that HRV metrics may reflect temporal changes in psychiatric symptoms. However, the inconsistency in statistical significance across items, the relatively small number of overall participants and their measured PPG signals, and the division of the entire observation period into two halves create difficulty in comprehensively interpreting the temporal association between HRV and clinical symptoms based solely on the results of this study.
This study had several limitations. First, we recruited participants from specific contexts, limiting the generalizability of our results. Our study population consisted of well-educated young adults, not a diverse demographic. This limits the scalability of our findings, as groups such as the elderly, who have less access to digital technologies, might display different characteristics. Moreover, extending our results to children or adolescents should be done with caution, as their cardiac electrophysiology differs from that of adults (44). Second, most participants were asymptomatic or exhibited mild symptoms, as indicated by mean scores around three on the PHQ-9 and GAD-7. This necessitates cautious statistical interpretation of the correlations due to potential bias. A separate study with a different population is required to confirm clinical utility. Third, all symptom assessments were self-reported, which risks social desirability response bias (45) and may not accurately reflect true symptoms. Furthermore, since the survey was web-based, we cannot confirm that the actual respondents were the intended participants. Meta-analyses have shown that clinician-rated depressive symptoms have a significantly larger effect size than self-reported symptoms (46). Incorporating clinician interviews or clinician-rated scales could yield more comprehensive results. Fourth, although we excluded participants diagnosed with any psychiatric comorbidity, including alcohol and other substance use disorders, we could not rule out the use of substances like caffeine or alcohol during the actual experiment period. Nevertheless, considering our study's aim to evaluate the utility of a smartwatch as a real-time monitor for HRV as a digital biomarker of psychiatric symptoms, the consumption of certain substances within daily intake ranges should be considered part of our research. Lastly, the quality of biomedical signals collected by smartwatches has been a subject of ongoing debate. Specifically, PPG signals are argued to not accurately represent HRV, especially under free-living conditions (47) or without controls for breathing (48). Nevertheless, we did filter out noise from the PPG signals throughout the processing pipeline (31). That study demonstrated that estimated HRV from PPG signals is significantly correlated with ECG-measured HRV metrics, indicating the reliability of our estimated HRV.
Despite these limitations, our study is significant because it demonstrated that biomedical signals obtained through simple methods, such as common smartwatches, are associated with psychiatric symptoms such as depression and anxiety. Further research should aim to address the existing shortage of mental health services. Additionally, with the integration of digital technology, these methods can provide more advanced mental health services in conjunction with the digital platforms that have gained substantial attention. We look forward to future research on a broader range of mental health symptoms based on various biomedical signals, not just PPG, and research on a wider patient population.

FIGURE 1
Diagram of the HRV extracting system architecture. Raw photoplethysmogram (PPG) signals extracted from a smartwatch are processed to compute heart rate variability (HRV) metrics, which are then stored in the server database; adapted from Aitolkyn et al., 2023 (30).

TABLE 1
Overview of investigated HRV metrics.

TABLE 2
Demographics and clinical characteristics of study participants.

TABLE 3
Correlations between HRV metrics and depressive symptoms measured by PHQ-9 scores. The number after the HRV metric refers to the first or second half of the observation period, respectively. *Statistically significant p < 0.05.

TABLE 4
Correlations between HRV metrics and anxiety symptoms as measured by GAD-7 scores. Values are Pearson correlation coefficients. The number after the HRV metric refers to the first or second half of the observation period, respectively. *Statistically significant p < 0.05.