Evaluation of Wearable Technology in Dementia: A Systematic Review and Meta-Analysis

Background: The objective of this analysis was to systematically review studies employing wearable technology in patients with dementia by quantifying differences in digitally captured physiological endpoints. Methods: This systematic review and meta-analysis was based on web searches of the Cochrane Database, PsycInfo, PubMed, Embase, and IEEE conducted between October 25 and 31, 2017. Observational studies providing physiological data measured by wearable technology on participants with dementia with a mean age ≥50 were included. Data were extracted according to PRISMA guidelines and methodological quality was assessed independently using Downs and Black criteria. Standardized mean differences between cases and controls were estimated using random-effects models. Results: Forty-eight studies from 18,456 screened abstracts (Dementia: n = 2,516, Control: n = 1,224) met inclusion criteria for the systematic review. Nineteen of these studies were included in one or multiple meta-analyses (Dementia: n = 617, Control: n = 406). Participants with dementia demonstrated lower levels of daily activity (standardized mean difference (SMD), −1.60; 95% CI, −2.66 to −0.55), decreased sleep efficiency (SMD, −0.52; 95% CI, −0.89 to −0.16), and greater intradaily circadian variability (SMD, 0.46; 95% CI, 0.27 to 0.65) than controls, among other measures. Statistical between-study heterogeneity was observed, possibly due to variation in testing duration, device type, or patient setting. Conclusions and Relevance: Digitally captured data using wearable devices revealed that adults with dementia were less active and demonstrated increased fragmentation of their sleep-wake cycle and a loss of typical diurnal variation in circadian rhythm as compared to controls.


INTRODUCTION AND BACKGROUND
Dementia has been identified by the World Health Organization as a global priority for public health and social care in the twenty-first century (1). Advances in the molecular and genetic understanding of neurodegenerative disease have contributed to improved diagnostic paradigms and helped to foster a new era of personalized medicine for patients with dementia. This has coincided with advances in biological drug development for targeted therapies. These therapies reflect a maturing scientific understanding of dementia that goes beyond raw measurement of cognitive performance. As a case in point, a recent review of active clinical trials showed that 14 biological treatments targeted neuropsychiatric and behavioral symptoms as primary end-points. Challenges remain in capturing the heterogeneity of the clinical course experienced by individuals with dementia and translating it into meaningful end-points.
Technological advances using accelerometers, gyroscopes, and other motion detectors housed in mobile platforms may eventually present a cost-effective way to measure disease burden and deploy personalized treatments (2). Wearable devices that can continuously monitor physiological measures over extended periods, for example in the patient's home, provide unique information not attainable with traditional in-clinic monitoring and hold particular appeal in dementia populations (3). Advances in technology have made these devices increasingly affordable and user-friendly, but their use has been limited by methodological challenges. Specifically, their high resolution and sensitivity leave them susceptible to noisy interference, complicated and time-consuming analytical techniques are required to derive clinically meaningful endpoints from the large amounts of data they produce, and the lack of standards has led to isolated "islands of expertise" (4).
The flexibility of wearable platforms has resulted in a variety of different uses including monitoring of gait, motion tracking, and sleep and circadian rhythm assessment (5). The ability to identify objective measurements of specific endpoints with respect to individual and group-wise subject performance, captured in real time in various settings including at home, provides ecological validity that would otherwise be lost in laboratory settings. The main question that we aimed to address was the potential for wearable devices to provide information on the behavioral and neuropsychiatric fluctuations inherent in the clinical course of dementia. The ability to accurately and objectively measure these fluctuations can provide researchers with viable digital surrogate end-points for use in clinical trials. We undertook a systematic review and meta-analysis to evaluate the utility of wearable technology in patients with dementia for the measurement of these neurophysiological parameters. The objective of this analysis was to systematically review studies employing wearable technology in patients with dementia by quantifying differences in digitally captured neurophysiological endpoints.

Types of Studies
We included observational studies reporting primary data in a peer-reviewed scientific journal. Eligible studies included participants with a mean age ≥50 years and involved no direct intervention (i.e., drug, vitamin, supplement, exercise, cognitive, or behavioral intervention). Studies published before 1970 or translated to English were excluded. Studies that did not provide descriptive statistics for a physiological outcome were excluded. Conference abstracts, review papers, case reports, letters, opinion pieces, editorials, article comments, and corrections were also excluded.

Type of Exposure
We included all-cause dementia (any dementia subtype) as our exposure (6). Exact search terms for the dementia subtypes included can be found in the search strategy (Search Terms). As we included studies from 1970 onwards, diagnostic criteria for dementia differed between studies and are summarized for each study in Table 1.

Types of Outcome Measures
We included studies which provided physiological data as measured by wearable technology. Wearable technology was defined as a non-implantable, body-fixed sensor technology designed to monitor for >24 h and to not interfere with the wearer's normal activity (5,72). By this definition, studies using finger-based pulse oximeters, blood pressure monitors, galvanic skin response sensors, functional near-infrared spectroscopy (fNIRS), and electroencephalograms (EEG) were excluded.
Where studies included measurement devices other than a wearable device, only data from the wearable device were included in the final analysis.

Methods for Literature Secondary Screening
First Selection: Abstract Screening
Two authors (RP and AC) independently screened each record by title and abstract according to eligibility criteria. Eighteen thousand four hundred fifty-six abstracts were included in the initial screening process. There were 525 disagreements in abstract selection between the two reviewers. Conflicts were resolved by two additional authors (NS and JB) using the inclusion and exclusion criteria and definitions outlined in Figure 1.

Final Selection
Two hundred thirty articles were eligible for full text review. Two authors (RP and AC) independently determined eligibility of each article for inclusion. In cases of disagreement or conflict, senior authors (NS, JB, and KT) determined whether the study met eligibility criteria. Forty-eight articles were included in the final systematic review.

Data Collection
Data were extracted by three authors (JB, RP, and AC). Information extracted from each publication is provided in Table 6. To assess the methodological quality of included studies, we used the checklist provided by Downs and Black (73). A total quality score is provided for each study in Table 6 (maximum score = 32).

Statistical Analyses
Data analysis was performed using Stata/SE (StataCorp LP, Texas, Version 15). The age and number of included participants per study, as well as general study results, are provided in Table 6. Initial synthesis of qualitative data revealed a number of common endpoints reported consistently by authors. Subsequent meta-analyses included only observational case-control or cross-sectional studies that presented data for these commonly reported endpoints (Table 7). Meta-analyses were conducted using the standardized mean difference (Hedges' g). Hedges' g values ≤0.20, >0.20 but <0.80, and ≥0.80 were considered small, moderate, and large differences, respectively, between controls and participants with dementia (74). For each single or combined effect size, a positive value indicated a higher mean value of that variable in participants with dementia than in healthy volunteers. Some publications contained two subgroups of dementia participants. A fixed effects meta-analysis was performed on the dementia subgroups within each of these studies to compute a composite effect size and variance (75). This composite effect was used in the across-study random effects analysis.
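As an illustration of the effect-size step described above, the following is a minimal Python sketch (the analyses themselves were run in Stata, and this is not the code used). It assumes the usual textbook formulas for Hedges' g and its variance, applies the size thresholds to the absolute value of g, and treats dementia subgroups as independent when pooling them with inverse-variance fixed-effect weights; the function names are hypothetical.

```python
import math


def hedges_g(mean_dem, sd_dem, n_dem, mean_ctl, sd_ctl, n_ctl):
    """Return (g, variance of g); positive g means a higher mean in dementia."""
    df = n_dem + n_ctl - 2
    pooled_sd = math.sqrt(
        ((n_dem - 1) * sd_dem ** 2 + (n_ctl - 1) * sd_ctl ** 2) / df
    )
    d = (mean_dem - mean_ctl) / pooled_sd          # Cohen's d
    j = 1 - 3 / (4 * df - 1)                       # small-sample correction
    var_d = (n_dem + n_ctl) / (n_dem * n_ctl) + d ** 2 / (2 * (n_dem + n_ctl))
    return j * d, j ** 2 * var_d


def fixed_effect_pool(effects, variances):
    """Inverse-variance fixed-effect composite of subgroup effects.

    Assumption: subgroups are treated as independent, which ignores any
    correlation induced by a shared control group.
    """
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return pooled, 1 / sum(weights)


def magnitude(g):
    """Classify |g| using the thresholds stated in the text."""
    size = abs(g)
    if size <= 0.20:
        return "small"
    if size < 0.80:
        return "moderate"
    return "large"
```

For a study with two dementia subgroups, `hedges_g` would be called once per subgroup against the control group and the two results passed to `fixed_effect_pool` to obtain the composite effect and variance carried into the across-study analysis.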
Across-study heterogeneity was investigated using Cochran's Q test and the I² statistic. Cochran's Q test was performed using the weighted method of moments (75). Cochran's Q statistic was considered significant at p < 0.10. I² values of 25, 50, and 75% were considered indicative of low, moderate, and high heterogeneity, respectively (76).
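The heterogeneity statistics and the random effects pooled estimate can be sketched as follows. This is an illustrative Python sketch rather than the Stata routine used, and it assumes that the method of moments corresponds to the DerSimonian-Laird estimator of between-study variance (tau²). The function name and the returned dictionary keys are hypothetical, and the p-value for Q (from a chi-square distribution with k − 1 degrees of freedom) is left out to keep the sketch dependency-free.

```python
import math


def random_effects(effects, variances):
    """Cochran's Q, I-squared (%) and a DerSimonian-Laird pooled estimate."""
    k = len(effects)
    w = [1 / v for v in variances]                 # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # method of moments
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }
```

The inputs are one Hedges' g and one variance per study; for studies with two dementia subgroups, the composite effect and variance would be supplied instead.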
For each analysis, a funnel plot of standardized mean differences was constructed, and the risk of publication bias evaluated through funnel plot asymmetry and Egger tests. We acknowledge that many other factors including heterogeneity, differences in methodological quality, and selective reporting may produce funnel plot asymmetry (77).
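A minimal sketch of an Egger-type regression is given below for illustration. It assumes the classical formulation, in which each study's standard normal deviate (effect divided by its standard error) is regressed on its precision (one over the standard error) and an intercept far from zero suggests funnel plot asymmetry. The function name is hypothetical, and the p-value (from a t distribution with k − 2 degrees of freedom) is not computed here.

```python
import math


def egger_regression(effects, std_errors):
    """Ordinary least squares of effect/SE on 1/SE.

    Needs at least three studies with differing standard errors.
    Returns the intercept, its standard error, the t statistic and the slope.
    """
    x = [1 / se for se in std_errors]                       # precision
    y = [e / se for e, se in zip(effects, std_errors)]      # normal deviate
    k = len(x)
    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum(
        (yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)
    )
    s2 = residual_ss / (k - 2)                              # residual variance
    se_intercept = math.sqrt(s2 * (1 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return {
        "intercept": intercept,
        "se_intercept": se_intercept,
        "t": t_stat,
        "slope": slope,
    }
```

With few studies per meta-analysis, a test of this kind has low power, which is one reason it is read alongside visual inspection of the funnel plot.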
The influence of each study on a meta-analysis estimate was investigated through influence analysis, where each individual study is omitted in turn and the meta-analysis re-estimated using a random effects model. For publications that included more than one subgroup of participants with dementia, the largest subgroup was included in the influence analysis.
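The leave-one-out procedure can be sketched as follows. This is a hedged illustration in Python, not the Stata routine used; it assumes a DerSimonian-Laird random effects model for each re-estimation, and the function names and study labels are hypothetical.

```python
def dl_pool(effects, variances):
    """DerSimonian-Laird random-effects estimate: (pooled effect, standard error)."""
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, (1 / sum(w_star)) ** 0.5


def leave_one_out(labels, effects, variances):
    """Re-estimate the pooled effect with each study omitted in turn."""
    results = {}
    for i, label in enumerate(labels):
        remaining_effects = effects[:i] + effects[i + 1:]
        remaining_variances = variances[:i] + variances[i + 1:]
        results[label] = dl_pool(remaining_effects, remaining_variances)
    return results
```

A study is flagged as influential when omitting it shifts the pooled estimate or its standard error markedly relative to the estimate from all studies.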
There was no funding source for this study and the corresponding author had full access to all of the data in the study and had final responsibility for the decision to submit for publication.

Systematic Review
Five database searches resulted in 18,456 retrieved abstracts after removal of duplicates (Figure 1). Two hundred thirty of these publications qualified for full-text screening after examination by title and abstract. Forty-eight studies qualified for inclusion in the final qualitative analysis and 19 of these publications qualified for inclusion in one or multiple meta-analyses (Dementia: n = 617, Control: n = 406) (Table 7).
Nineteen studies (39%) enrolled participants only with AD-related diagnoses. Table 2 describes the technical specifications of devices used in individual studies. Thirty-four studies (70%) tested participants using a wrist-worn actigraph. The average assigned duration of wear was 8.26 days (range: 6 min to 28 days). Forty (83%) studies used accelerometry as the main measurement of activity. One study used an accelerometer with a gyroscope, while one further study used an accelerometer, gyroscope, and magnetometer. Six studies (12%) used activity monitors without stating the type of measurement modality.

Daily Activity
Of the 48 included studies, 23 (47%) groups reported outcome data on daily activity counts as measured by actigraphy. Qualitative analysis showed that activity counts were presented in a number of different ways (Table 3). Significant associations of activity counts with other measures, or differences in activity between individuals with dementia and control groups, were reported for the measures of daily activity (eight groups, 34%), peak daily activity (two groups, 8%), mean activity counts (five groups, 21%), daytime activity (five groups, 21%), night time activity (one group, 4%), number of immobile hours (one group, 4%), and activity patterns (three groups, 13%). Quantitative analysis demonstrated that participants with dementia had significantly lower mean daytime activity counts compared to controls (standardized mean difference, −1.60; 95% CI, −2.66 to −0.55) (Dementia: n = 210, Control: n = 136) (Figure 2I).

Non-parametric Measurements of Circadian Rhythm Using Wearable Devices
Sixteen (33%) of 48 studies reported non-parametric measurements of circadian rhythm (Table 4). Qualitative analysis revealed that intradaily variability (IV) was reported by 13 groups: eight (61%) reported an association or difference in dementia groups. Interdaily stability (IS) was reported by 14 groups: nine (64%) reported an association or difference in dementia subjects. Relative amplitude (RA) was reported by 12 groups: seven (58%) reported an association or difference in dementia subjects. Activity of most active 10 h (M10) was reported by nine groups: seven (77%) reported an association or difference in dementia subjects. Activity of least active 5 h (L5)

Cosinor Analysis of Circadian Rhythm Using Wearable Devices
Nine (19%) out of the 48 groups reported a cosinor analysis of circadian rhythm. Qualitative analysis (Table 4) showed that the midline estimating statistic of rhythm (mesor) was reported by five groups: two (40%) reported a significant association or difference in dementia subjects. Amplitude of the cosinor wave was reported by 10 groups: five (50%) reported an association or difference in dementia subjects. Acrophase was reported by eight groups: two (25%) reported an association or difference in dementia subjects. Quantitative analysis was only performed on the amplitude of the cosinor wave. It revealed that subjects with dementia had a significantly lower mean amplitude than controls (mean difference, −1.22; 95% CI, −1.94 to −0.50) (Dementia: n = 174, Control: n = 93) (Figure 2H).
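For readers unfamiliar with the cosinor parameters, the sketch below shows how mesor, amplitude, and acrophase can be estimated from an activity series. It assumes the standard single-component model y(t) = mesor + amplitude · cos(2π(t − acrophase)/period) fitted by ordinary least squares with a 24 h period; individual studies may have used other parameterizations or software, and the function names are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a.x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]     # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]            # partial pivoting
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_cosinor(hours, counts, period=24.0):
    """Return (mesor, amplitude, acrophase in hours) from a least-squares fit.

    The model is linearised as mesor + beta*cos(wt) + gamma*sin(wt).
    """
    omega = 2 * math.pi / period
    rows = [(1.0, math.cos(omega * t), math.sin(omega * t)) for t in hours]
    # Normal equations: (X'X) coef = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, counts)) for i in range(3)]
    mesor, beta, gamma = solve_linear(xtx, xty)
    amplitude = math.hypot(beta, gamma)
    acrophase = (math.atan2(gamma, beta) / omega) % period   # time of peak
    return mesor, amplitude, acrophase
```

Here `hours` holds the sample times in hours from the start of recording and `counts` the corresponding activity counts; a lower fitted amplitude corresponds to a flatter rest-activity rhythm.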

Wearable Actigraphy for Gait Derived Measures
Of the 48 included studies, six (12%) groups reported outcome data on actigraphy to measure posture and gait characteristics (Table 5). Qualitative analysis showed that all six (100%) reported an association or difference in dementia subjects. These studies each reported a different measure of gait or walking activity, and thus a meta-analysis was not possible.

Risk of Bias Within Studies
The average rating of methodological quality of included studies was 15.54 points (SD = 1.47). The median and mode were both 16 points, with a range of 12–18 (Table 6).

Meta-Analysis and Heterogeneity
Low between-study heterogeneity (I² < 50%) was observed for analyses of IV, RA, and L5 variables (Table 7). Moderate to high between-study heterogeneity (I² > 50%) was observed for analyses of IS, TST, amplitude, M10, SE, and daytime activity. Meta-regression or subgroup analyses were performed for all actigraphy measures with moderate to high heterogeneity (I² > 50%), which included IS, TST, amplitude, M10, SE, and daytime activity. Type of dementia, mean age, study design, and quality score were all investigated as explanatory variables.
Subgroup analyses indicate that effect estimates vary markedly between dementia subtypes for variables M10 and SE, suggesting differences in dementia type between studies may account for some of the heterogeneity observed in meta-analyses of M10 and SE measurements.

Risk of Publication Bias Across Studies for Meta-Analysis
Funnel plots for each variable investigated using random effects meta-analysis are provided in Figure 3. These plots were constructed with a measure of study size on the x-axis and a measure of effect size on the y-axis. Dashed lines represent the pseudo 95 and 99.7% confidence limits about the effect estimate (solid line). Funnel plot asymmetry was observed for all but two variables (IV and RA), and significant Egger tests observed for M10 (p = 0.0057) and amplitude variables (p = 0.0078), suggesting evidence of publication bias for these measurements.

Investigation of Influential Studies
The impact of each study on a meta-analysis estimate was investigated through influence analysis. Influence analysis shows that meta-analysis estimates are generally robust (Figure 4), excluding meta-analysis of daytime activity, where the pooled estimate decreases in magnitude markedly and precision of the estimate improves with exclusion of Varma and Watts (65). Even with exclusion of this influential study, the pooled estimate remains significant and shows the same direction of effect as in the full meta-analysis.

DISCUSSION
From our systematic review of the literature we found 48 articles which met our inclusion criteria of wearable technology use in patients with dementia for the measurement of physiological parameters. Wearable devices were utilized most extensively to measure circadian rhythm, the sleep-wake cycle, and daily activity. In the studies which were analyzed using forest plots, groups of participants with dementia were less active than controls, had a difference in their sleep-wake cycle, and showed differences in their circadian rhythms when compared to control groups. To our knowledge, this study is the first

Wearable Devices to Measure Sleep and Circadian Rhythm
The use of actigraphy to measure sleep was the most commonly reported outcome. Participants with dementia demonstrated reduced sleep efficiency as compared to controls. There was also a significant difference between individuals with dementia and controls on non-parametric measures of circadian rhythm including IV, IS, and RA; however, it should be noted that for some measures the combined effects were substantially weighted by the results of Hooghiemstra et al. (31). Meta-analysis of the amplitude measure of circadian rhythm cosinor analysis also demonstrated a moderate but statistically significant difference between groups. Again, a high level of heterogeneity between studies was observed for this outcome measure. Despite evidence of the utility of wearable actigraphy in sleep monitoring, consistent outcome measures and methods of analyzing sleep data and circadian rhythm have not been universally agreed upon (2). In order for actigraphy to become routinely used in clinical and drug treatment trials, consistent outcome measures are needed; as shown in this meta-analysis, such measures may provide a useful endpoint for patients with dementia.

Wearable Devices and Daily Activity
When wearable devices were used to measure daily activity, those with dementia had significantly lower daily activity counts than controls. This effect was demonstrated despite across-study variation in methods of calculating daytime activity including peak activity counts, mean activity, and daily activity.
A meta-analysis of studies measuring daily activity showed that subjects with dementia demonstrate significantly less daily activity as compared to controls. Four groups reported no differences in nocturnal activity between subjects with dementia and controls. It should be noted that two of these studies did not recruit a control group, but instead compared participants with dementia to their caregivers [McCurry et al. (49) and Merrilees et al. (50)]. Physical activity has been examined in longitudinal studies and found to be associated with both development of dementia and disease progression (78). There is increasing evidence that physical activity and exercise as part of multidomain interventions hold benefit for patients with dementia (79). However, as demonstrated in this review, definitions of physical activity differ significantly between studies, and daily activity counts measured by wearable devices are not definite indicators of beneficial exercise, but merely of movement. Some researchers have attempted to convert daily activity counts into variables such as energy expenditure, and this measure was also reduced in participants with dementia as compared to controls (55). With the growing availability of consumer wrist-worn devices for movement and activity tracking, daily activity measurements provide a potential novel end-point for large-scale clinical trials in dementia.

Wearable Devices and Gait
Gait behavior was studied by six groups. Significant differences between controls and those with dementia were reported by all groups for multiple aspects of the gait cycle and behavior. However, due to the variation in reported outcomes, a quantitative analysis could not be performed and conclusions regarding the use of wearable devices for the study of gait could not be reliably made. It is important to note that gait speed and walking speed were reported as significantly different in subjects with dementia when compared to controls, while cadence and step variance were not. Lower gait speed in particular has been shown in numerous longitudinal studies to correlate with increased fall risk in older adults (80). Further work to replicate these findings in subjects with dementia is warranted.

Limitations
The main limitation of the meta-analysis was the between-study heterogeneity (Table 7). Given differences in characteristics of study design such as duration of testing, wearable device type, and diagnosis, statistical heterogeneity was expected between publications included in each meta-analysis. Despite this, effect size comparisons between healthy volunteers and participants with dementia were generally consistent in direction between studies. Methodological considerations specifically for actigraphy testing in dementia have been more thoroughly addressed in a clinical review (81). Also, all papers included in this review corresponded to definitions of both all-cause dementia and wearable devices which were agreed upon by the author group. As a result, studies which did not conform to these definitions have been excluded and the effect these may have had on the analysis cannot be quantified. Lastly, not all devices used have been compared to gold-standard clinical testing and their methods of measurement may differ; their reported differences should therefore be interpreted with caution.

CONCLUSIONS AND IMPLICATIONS
In conclusion, this systematic review and meta-analysis has shown that the wearable devices studied detect differences in those with dementia when compared to controls. Specifically, it provides evidence that wearable devices have utility in measuring levels of activity, changes in circadian rhythm, and changes in the sleep-wake cycle. Included studies were limited by their heterogeneity, the lack of classification of dementia sub-type and stage, as well as the lack of confirmatory clinical trials. Further work is warranted to correlate these findings with clinical changes which may represent surrogate digital end-points, such as the neuro-psychiatric manifestations associated with circadian rhythm changes and the loss of mobility associated with decreased activity.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
JB: concept and design. JB, AC, and RP: acquisition, analysis, or interpretation of data, and drafting of the manuscript. JB, AC, RP, NK, and KT: critical revision of the manuscript for important intellectual content, administrative, technical, or material support. AC: statistical analysis. KT: obtained funding. JB and KT: supervision. AC had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. All authors contributed to the article and approved the submitted version.