Smartphone Accelerometry: A Smart and Reliable Measurement of Real-Life Physical Activity in Multiple Sclerosis and Healthy Individuals

Background: Mobility impairment is common in persons with multiple sclerosis (pwMS) and can be assessed with clinical tests and surveys, which have restricted ecological validity. Commercial research-grade accelerometers are considered more valuable because they measure real-life mobility. Smartphone accelerometry might be an easily accessible alternative. Objective: To explore smartphone accelerometry in comparison to clinical tests, surveys, and a wrist-worn ActiGraph in pwMS and controls. Methods: Sixty-seven pwMS and 70 matched controls underwent mobility tests and surveys. Real-life data were collected with a smartphone and an ActiGraph over 7 days. We explored different smartphone metrics in a technical validation course and afterward computed correlations between ActiGraph (steps per minute), smartphone accelerometry (variance of vector magnitude), clinical tests, and surveys. We also determined the ability to separate patients from controls and to separate different disability groups. Results: Based on the technical validation, we found the variance of the vector magnitude to be a reliable estimate for discriminating wear time from non-wear time of the smartphone. Because it was also associated with different activity levels, it was selected for real-life analyses. In the cross-sectional study, ActiGraph correlated moderately (r = 0.43, p < 0.05) with the smartphone but less with clinical tests (rho between |0.211| and |0.337|). Smartphone data showed stronger correlations with age (rho = −0.487) and clinical tests (rho between |0.565| and |0.605|). ActiGraph only differed between pwMS and controls (p < 0.001) but not between disability groups. In contrast, the smartphone showed differences between pwMS and controls, between RRMS and PP-/SPMS, and between participants with/without ambulatory impairment (all p < 0.001).
Conclusions: Smartphone accelerometry provides better estimates of mobility and disability than a wrist-worn standard accelerometer in a free-living context for both controls and pwMS. Given the fact that no additional device is needed, smartphone accelerometry might be a convenient outcome of real-life ambulation in healthy individuals and chronic diseases such as MS.


INTRODUCTION
Multiple sclerosis (MS) is the most common autoimmune disease of the central nervous system (CNS) and leads to an accumulation of disability by chronic inflammation and neurodegeneration (1). The patterns of disability are heterogeneous, but impaired mobility occurs in up to 75% of persons with multiple sclerosis (pwMS) (2) and represents one of the most disrupting physical features of MS (3). Regarding the perceptions of bodily functions, ambulation is rated as one of the three most valuable abilities (4). In addition, walking is the most frequent type of self-selected physical activity (5) and, accounting for over 50% of dynamic activity over a 24-h period, is the primary mode of physical activity in pwMS (6). Walking impairment could cause physical inactivity, which results in physical deconditioning; in this vicious cycle, walking ability could deteriorate further (7). In the clinical setting, walking impairment can be used to monitor disability progression, and ambulatory improvement can be used as an indicator of efficacy in therapeutic trials (8). However, while the importance of walking ability in MS is widely accepted, the ideal measurement approach is still under discussion (9).
The Expanded Disability Status Scale (EDSS) is an accepted standard of disability measurement in MS and relies in its middle range mainly on walking abilities in the range between 20 and 500 m (10). However, the scale suffers from increased variability for longer walking distances and from the influence of other factors such as fatigue, the patient's mood, and the time at which the test was performed (11). The EDSS is also limited in measuring small but clinically meaningful changes in ambulation, and it fails to capture the fluctuation of performance over time in the natural environment (12). Standard clinical performance-based measures, such as the Timed 25-Foot Walk (T25FW), 2-Minute Walking Test (2MWT), and 6-Minute Walking Test (6MWT) (13), provide objective snapshots of an ambulatory capacity that varies from day to day (14). They may not reflect the continuous walking activity in the real-world environment due to the lack of ecological validity (15). Patient-reported outcome measures (PROMS), e.g., the 12-Item Multiple Sclerosis Walking Scale (MSWS-12) (16) or the Godin Leisure-Time Exercise Questionnaire (GLTEQ) (17), are limited by recall bias and variability in self-perception of physical activity.
To that end, the total ambulatory activity undertaken in the habitual environment while performing a usual range of daily activities is recognized as the gold standard for measuring ambulatory mobility in neurological disorders (18), and there is an emerging body of research supporting the application of accelerometry for measuring physical activity data in MS (19,20). ActiGraph (Pensacola, FL, USA) is one of the most common accelerometers used in research (20,21). Associations between the output of ActiGraph (i.e., activity counts, step counts, moderate-to-vigorous physical activity (MVPA), and sedentary time) and clinical outcomes in the free-living setting have been intensively investigated (22)(23)(24). Nevertheless, there are neither standard protocols for the application of commercially available accelerometers nor a standard accelerometer output (for example, estimates of energy consumption, step number, or walking speed) (19,25). The need for and burden of wearing an additional device restrict its use to short-term measurements and may, due to the perceived invasiveness, affect the ecological validity. Smartphones with built-in accelerometers might overcome this shortcoming and have been considered as a possible means of measuring motion data. The cost and the burden of measurement are low because smartphone usage is high in the general population and among pwMS and because no further device is needed (26). Studies in recent years have supported the application of smartphones for assessing mobility and physical activity in clinical as well as free-living settings (27)(28)(29)(30). However, there is a lack of studies investigating smartphone accelerometry as a putative outcome for neurological diseases such as MS.
Here, we aimed to investigate the value of built-in smartphone accelerometers as a valid outcome for disability and mobility compared to a wrist-worn ActiGraph in a representative group of pwMS compared to healthy controls in a free-living setting.

METHODS
The validation and exploration of the smartphone accelerometry were done in two steps. First, we performed a technical validation course for wear time validation and selection of outcomes. Second, we performed a cross-sectional analysis in pwMS and healthy controls with clinical outcomes, PROMS, and ActiGraph measurements. The value of an outcome metric was estimated by its ability to discriminate between different disability groups (e.g., mild vs. moderate impairment) and its correlation with self-reported physical activity and clinical performance-based measures.

Technical Validation Course
To define periods of wear time and non-wear time and to explore summary measures from the raw accelerometry data, we performed a technical validation course using 28 smartphones (Samsung Galaxy S4 mini) with a built-in tri-axial accelerometer. First, we collected non-wear data over 10 min while all smartphones were lying in different positions on a table. Then, the smartphones were carried by three staff members to collect wear time data, which included sitting, standing, walking, running, and stair climbing for 10 min each. Passive movements were recorded in an elevator and during a bus trip.

Participants
Participants were recruited at the MS outpatient clinic at the University Medical Centre Hamburg-Eppendorf. The inclusion criteria for pwMS were (1) age 18-65 years, (2) a confirmed diagnosis of MS according to the McDonald criteria 2010 (31), (3) an Expanded Disability Status Scale (EDSS) (10) score below 6.5, and (4) no relapse in the last 30 days. The inclusion criteria for the controls were (1) not reporting a disease with potential impact on mobility and (2) matching the age distribution of the sample with MS. The exclusion criteria for both samples were severe medical conditions other than MS, severe cognitive impairment, or any other condition that might relevantly compromise the use of a smartphone (e.g., very low visual acuity or severe ataxia). All participants gave written informed consent prior to any testing under this protocol, and the local ethical review board (Ärztekammer Hamburg, PVN 5001) approved the investigation.
All participants were supplied with an ActiGraph (model GT3X+) and a smartphone (Samsung Galaxy S4 mini). We asked the participants to wear the ActiGraph on the non-dominant wrist (35) and to carry the smartphone in the position in which they habitually carry their own phone for the following 7 days. They were asked to wear both devices during the entire day, except while showering, swimming, or sleeping.

Data Processing
All written data, including demography, clinical performance-based measures, and PROMS, were collected in an electronic case report file. The raw ActiGraph data were processed, and standard outcomes [mean vector magnitude (meanVM), daily MVPA, steps/minute] were downloaded with ActiLife 6 software version 6.13.3 (ActiGraph, Pensacola, FL, USA) in 60-s epoch intervals. Non-wear time was filtered out with the Choi algorithm (36). The smartphone accelerometer data were collected via a small Android-based application, which had been developed by the Institute of Neuroimmunology and Multiple Sclerosis (INIMS). The raw accelerometer axis values (X, Y, and Z) were recorded at a sampling rate of 2 Hz.

Selection of Smartphone Outcomes
For the selection of putative smartphone outcomes in the technical validation course, we computed and explored the following summary metrics for epochs of 60 s (the same epoch length as for the ActiGraph): sum of absolute axis values (sumX, sumY, and sumZ), variance of axis values (varX, varY, and varZ), Pearson's correlations between each pair of axes (corXY, corXZ, and corYZ), sum of all absolute axis values (sumXYZ), mean absolute correlation (corXYZ), sum of absolute vector magnitude (sumVM), mean vector magnitude (VM), and mean variance of the vector magnitude (varVM). Most of these reflect standard accelerometry metrics, such as the vector magnitude and the sum of acceleration of selected axes (37). However, several commonly used accelerometry outcomes, for example the vertical axis counts, rely on a proper orientation of the device in space. For smartphones, such orientation-dependent metrics are not reasonable if the patient's own device is to be used in the future. Thus, we decided to explore orientation-independent metrics. We hypothesized that increasing physical activity might translate into a reduced correlation between the axis counts and an increased variance of the acceleration measurements.
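As an illustration of the orientation-independent metric described above, the following minimal sketch computes the variance of the vector magnitude for consecutive 60-s epochs of 2-Hz samples. The function names are hypothetical, and the use of the population variance is an assumption; the text does not specify which variance estimator or which unit of acceleration was used.

```python
import math
from statistics import pvariance

SAMPLING_RATE_HZ = 2   # sampling rate stated in the text
EPOCH_SECONDS = 60     # epoch length stated in the text
SAMPLES_PER_EPOCH = SAMPLING_RATE_HZ * EPOCH_SECONDS


def vector_magnitude(x, y, z):
    """Euclidean norm of one tri-axial sample; independent of device orientation."""
    return math.sqrt(x * x + y * y + z * z)


def var_vm_per_epoch(samples):
    """Return the variance of the vector magnitude for each complete epoch.

    samples: list of (x, y, z) tuples in recording order.
    Assumption: population variance; incomplete trailing epochs are dropped.
    """
    vm = [vector_magnitude(x, y, z) for x, y, z in samples]
    result = []
    for start in range(0, len(vm) - SAMPLES_PER_EPOCH + 1, SAMPLES_PER_EPOCH):
        result.append(pvariance(vm[start:start + SAMPLES_PER_EPOCH]))
    return result


# Made-up data: a phone lying still versus one with alternating acceleration.
still = [(0.0, 0.0, 9.81)] * SAMPLES_PER_EPOCH
moving = [(0.0, 0.0, 9.0), (0.0, 0.0, 11.0)] * (SAMPLES_PER_EPOCH // 2)
print(var_vm_per_epoch(still), var_vm_per_epoch(moving))
```

Because only the magnitude of the acceleration vector enters the calculation, rotating the phone in a pocket or bag does not change the value, which is the property the metric selection relies on.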
To compare the potential metrics, the available dataset was randomly split in a ratio of 1:1 into an explorative and a validation subset. First, we used the explorative data to visually inspect boxplots of all measurements in order to select candidates with a high ability to discriminate non-wear from wear time and to discriminate between different activities. The potential metrics were then formally tested for their ability to discriminate wear from non-wear time by receiver operating characteristic (ROC) analyses. Finally, we validated the metrics from the explorative dataset in the validation sample and defined cutoff values for the separation of wear and non-wear time. For further analysis, all accelerometry outcomes were wear-time corrected average values.
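The split-and-cutoff procedure above could be sketched as follows. This is not the authors' implementation: the helper names are hypothetical, the example values are made up, and choosing the cutoff by the Youden index (sensitivity + specificity − 1) is an assumption, since the text does not state how the cutoff values were defined.

```python
import random


def split_half(items, seed=0):
    """Randomly split items 1:1 into an explorative and a validation subset."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for the rule score >= threshold.

    labels: 1 for wear-time epochs, 0 for non-wear epochs.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, lab in zip(scores, labels) if lab == 1 and s >= threshold)
        tn = sum(1 for s, lab in zip(scores, labels) if lab == 0 and s < threshold)
        points.append((threshold, tp / positives, tn / negatives))
    return points


def youden_cutoff(scores, labels):
    """Threshold maximising sensitivity + specificity - 1 (assumed criterion)."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)[0]


# Made-up varVM values: three non-wear epochs followed by three wear epochs.
scores = [0.0, 0.01, 0.02, 0.5, 0.8, 1.2]
labels = [0, 0, 0, 1, 1, 1]
print(youden_cutoff(scores, labels))
```

In such a workflow the cutoff would be derived on the explorative subset and then applied unchanged to the validation subset to check that sensitivity and specificity hold up.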

Statistical Analysis
For the statistical analysis, we divided the total sample into healthy controls and pwMS. The pwMS were further divided into the following subgroups: (1) by disease course (relapsing vs. progressive), conceptually representing early and late MS, and (2) by EDSS <3.5 vs. ≥3.5, representing a cutoff for ambulatory impairment in MS (10) (minimally impaired vs. impaired ambulation). We performed descriptive statistics of the demography with mean/SD, median/range, or number/rates according to the nature of the data. Student's t-test was used to detect differences in demography, clinical performance-based metrics, PROMS, wear time, and accelerometry metrics between the above-mentioned groups. Associations between smartphone accelerometry and ActiGraph were first estimated by Spearman's rank-order coefficient within the groups. The metric of each accelerometer that correlated most strongly was then tested against the clinical performance-based metrics and PROMS by Spearman's rank-order coefficient. We used the Mann-Whitney U-test to determine the ability of the accelerometers to separate the groups. In addition, we computed ROC analyses to examine how well the accelerometers predicted disease course and severity of disability. P < 0.05 was considered significant. Due to multiple comparisons, we corrected the p-values with the false discovery rate (FDR). All analyses were performed in R.

RESULTS
Discriminant abilities could be confirmed in the validation subset, and the AUC from the validation set did not differ from the explorative estimation for varVM (p = 0.507). However, varVM showed significantly higher accuracy than sumXYZ (p < 0.001) and was chosen for wear time detection. Moreover, both metrics tended to increase with the estimated physical activity level, and we used these two outcomes for further analyses (Figures 2A,B).

Participants and Clinical Characteristics of the Subgroups
We included 137 subjects: 70 healthy controls (HC) and 67 pwMS (see Figure 3). Demographic data are presented in Table 1. Patients with primary or secondary progressive MS (PP-/SPMS) were older (49.6 vs. 35.9 years) than patients with relapsing-remitting MS (RRMS). pwMS with impaired ambulation had a longer disease duration (12.9 vs. 6.4) than those without. Otherwise, we observed no group differences in age, body mass index (BMI), and waist. Moreover, the median EDSS in patients with PP-/SPMS was 1.8 higher (p < 0.001) than in patients with RRMS. Table 2 shows the descriptive statistics of clinical tests, PROMS, and accelerometry measures of the ActiGraph and the smartphone. The average measurement times within 7 days were 55 h for the smartphone and 76 h for the ActiGraph, which represent an average active wear time of 7.5 and 10.9 h/day, respectively.

Correlations Between Smartphone Metrics and ActiGraph
First, we analyzed the correlation between standard ActiGraph outcomes and smartphone-derived metrics (see Table 3 and Figure e3). Among all metrics, varVM correlated best with ActiGraph steps/minute across all participants (rho = 0.44, p < 0.001). However, this association was mainly driven by the correlation in healthy controls (rho = 0.478, p < 0.001), while it was clearly weaker but still significant in pwMS (rho = 0.29, p = 0.022).
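For readers unfamiliar with the statistic reported here, Spearman's rho is Pearson's correlation applied to the ranks of the two variables. The following minimal sketch uses hypothetical function names and made-up values; the original analyses were performed in R, so this is an illustration rather than the code used in the study.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)


def spearman_rho(a, b):
    """Spearman's rank-order coefficient: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Made-up per-participant values, for illustration only.
var_vm = [0.10, 0.35, 0.20, 0.60, 0.45]
steps_per_minute = [4.0, 9.0, 7.5, 8.0, 12.0]
print(round(spearman_rho(var_vm, steps_per_minute), 3))
```

Because only ranks enter the calculation, the coefficient is insensitive to the different scales of the two devices and to monotone transformations of either metric.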

Correlations Between Accelerometer Outcomes, Clinical Performance-Based Measures, and PROMS
Next, we investigated the association of both accelerometers with demography, clinical measures, and PROMS (Figure 4). In the following, we describe in more detail the smartphone outcome varVM and the ActiGraph outcome steps/minute, which were the most strongly correlating outcomes in the subgroups.

The Ability of Accelerometry to Differentiate Between Subgroups
In addition, we compared the discriminant abilities of the accelerometry data for MS subgroups. Again, we used steps/minute and varVM as outcomes of interest. ROC analysis (Table 4) revealed that varVM was the better classifier for differentiating pwMS from controls (AUC = 0.75 vs. 0.68, Figure 5). Moreover, only varVM was able to differentiate between relapsing-remitting and progressive MS (AUC = 0.946, p < 0.0001, Figure 6) and between patients with severe and mild ambulatory impairment (AUC = 0.728, p < 0.01, Figure 7).
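The AUC values reported here can be read as the probability that a randomly chosen member of one group has a higher metric value than a randomly chosen member of the other group, which links the ROC analysis to the Mann-Whitney U statistic (AUC = U / (n1 · n2)). A minimal sketch with a hypothetical function name and made-up values follows; it is not the code used in the study.

```python
def auc_from_pairs(group_a, group_b):
    """Share of (a, b) pairs in which a exceeds b, with ties counted as one half.

    This equals the Mann-Whitney U statistic of group_a divided by
    len(group_a) * len(group_b), i.e., the area under the ROC curve.
    """
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


# Made-up metric values for two groups, for illustration only.
group_1 = [3.0, 4.0, 5.0]
group_2 = [1.0, 2.0, 3.0]
print(auc_from_pairs(group_1, group_2))
```

An AUC of 0.5 corresponds to no separation between the groups, and an AUC of 1.0 to complete separation.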

DISCUSSION
This study examined smartphone accelerometry as an outcome of real-life ambulation and physical activity in healthy individuals and pwMS. To follow this aim, we analyzed the relationship of putative smartphone metrics with a research-grade accelerometer (ActiGraph) during free-living conditions, with objectively measured walking ability and with self-reported physical activity.
Overall, the results showed that the smartphone accelerometer correlated only moderately with the ActiGraph in HC and pwMS. However, the smartphone accelerometer seems to be more closely associated with walking ability, represented by the clinical performance-based measures, such as TTW, T25FW, 2-/6-MWT, FTSTS, and with ambulatory impairment, represented by MSWS and EDSS. Moreover, the smartphone accelerometer differentiated the levels of ambulation among all participants and ambulatory impairment among the pwMS better than the ActiGraph. In our study, smartphone metrics seem more reliable than those of a wrist-worn research-grade accelerometer. For this study, we used a new metric to capture ambulation and body motion based on accelerometry data: the variance of the vector magnitude. From a conceptual point of view, the metric represents the movement of the smartphone in all dimensions in a given time. The metric was chosen based on a technical validation course and provided two important features: high specificity and sensitivity for identifying wear time periods and a positive association with increased ambulation. An advantage of this metric is its independence from the orientation of the smartphone. The value of this metric was evaluated in comparison to a battery of different outcomes, and it was chosen by applying a strict selection methodology based on an explorative and a validation data set. Moreover, the promising results, namely good discrimination between ambulation levels and an association with ambulatory impairment metrics, indicate a successful proof of concept.
The rather weak to moderate association between smartphone varVM and ActiGraph outcomes in both HC and pwMS contrasts with one study reporting that Android smartphones provided raw counts similar to those of the ActiGraph in a free-living setting (29). Although ActiGraph is a validated tool, most of the validation studies chose the hip-worn position (20,21), and the literature provides controversial data for the wrist-worn position of accelerometers (12,23,38-40). However, the acceptance of the wrist-worn position may be higher (39). In this study, we also used wrist-worn ActiGraph data, which might explain the unexpectedly low to moderate correlations with clinical measures and PROMS. Another reason for the lower correlation could be arm movements during non-walking time, which are included in the high active wear time of the ActiGraph, whereas the participants might have moved the smartphone mostly when they were walking.
Regarding the ecological validity, the ActiGraph could influence exercising behavior and, for example, increase physical activity, because its notable visibility and the discomfort on the wrist could be perceived as invasive and act as a "reminder." On the other hand, the possibly perceived invasiveness of a wrist-worn accelerometer could be intentionally used to motivate users to exercise more. Using smartphones as a ubiquitous, readily available measuring tool might overcome these shortcomings, as usual smartphone positions such as handbags, rucksacks, or pants pockets are less perceivable and provide a high accuracy (40,41). These positions are closer to the body's center of mass, which has been recommended as the best sensor position (42). Thus, smartphone measurements could be closer to real life and offer higher ecological validity. However, the perceived invasiveness was not studied here and needs to be addressed in future studies.
An important result supporting the value of our smartphone-based approach is the clear association with age. Walking abilities, and especially walking speed, are known to be an important indicator of health status and to be associated with age (43). Interestingly, this important relationship with age was well reproduced with our smartphone metric but not with the ActiGraph. Reproducing a well-known and fundamental relationship supports the reliability of smartphone accelerometry as an outcome of real-life ambulation. Furthermore, the discriminating ability of smartphone varVM confirmed the assumption that smartphones could differentiate levels of walking ability and ambulatory dysfunction. However, these findings are based on a cross-sectional study, and the sensitivity of the metric to disability progression or improvement must be analyzed in a longitudinal setting.
It remains uncertain, however, which dimension of physical activity or ambulation is captured by smartphones in general; previous research has reported controversial results on this issue (27,29,44). Here, our approach of using a research-grade accelerometer as a reference did not succeed. However, varVM correlated much more strongly with the clinical measures, which represent walking ability, than with the PROMS, which represent self-reported physical activity. Thus, we assume that the smartphone measures walking ability rather than physical activity. This might be due to a measurement gap during exercise or other vigorous activities performed without the smartphone. Conceptually, this assumption is supported by the fact that smartphones are usually carried during habitual activities like traveling, shopping, and walking outside, whereas they tend to be put aside during exercise and other vigorous activities. However, this assumption needs further investigation.
The association between smartphone metrics and clinical outcomes was, in general, higher than for the ActiGraph. However, both smartphone and ActiGraph correlated more strongly with clinical measures or PROMS among pwMS with mildly to moderately impaired ambulation than among those with severely impaired ambulation. This links to the still open question of whether accelerometry can generally measure walking ability or rather physical activity in patients with very low activity levels and variable gait patterns, such as in progressive MS. However, the ability of clinical tests to mirror real-life ambulation and motion is also limited, and they have a rather low ecological validity (15,45). Thus, a poor association might be due to the low performance of the real-life device or to shortcomings of the clinical tests. Further research is needed to provide better objective estimates of low activity levels in more severely disabled patients. Moreover, our technical validation indicated a meaningful increase in the chosen smartphone metric with increasing physical activity. However, these findings could not be validated in the real-life setting in this study.
One of the limitations of our study was the wear time of the devices, which might have been too short for reliable estimates of real-life walking or activity. The original wristband of the ActiGraph was often reported as unfeasible during specific exercises like weightlifting; on the other hand, the smartphone has a relatively short battery life and needed to be charged at least once a day. Wear time alone cannot be considered as evidence for the smartphone as an outcome of real-life activity. However, the smartphone covered ∼72% of the ActiGraph measurement time. Future studies need to validate against other outcomes or devices. Moreover, we asked the participants to carry the smartphone in their habitual wearing position, aiming to simulate real-life conditions and to avoid the possibly perceived invasiveness. Although the usual positions like handbag, backpack, and pants pocket probably do not differ in the accuracy of measurement (40), a smartphone secured on the upper arm showed a lower accuracy (41). Another limitation is that it is impossible to determine whether the phone estimates would remain comparable with other phone models that have not been tested. However, one of the most prominent Android brands was used in this study. Finally, we only investigated a rather simple summary metric of 60-s epochs, which reduces the complexity of the raw data. Advanced algorithms, for example, for estimating walking speed, might improve the validity of smartphone accelerometry, as has been shown for research-grade devices (25).
Even with these limitations, there seems to be a strong opportunity for smartphone accelerometry in the context of several diseases and healthy living (27,30). It might help clinicians to monitor ambulatory dysfunction, disease progression, or rehabilitation in diverse clinical conditions with high ecological validity. It could also help patients to monitor individual changes in walking ability from a personal baseline over time and to achieve ability goals. Combined with motivational and educational tools, it may also help to improve physical activity independently of disease.

CONCLUSION
Smartphone accelerometry provides better estimates of mobility and disability than a wrist-worn standard accelerometer in a free-living context for both controls and pwMS. Given that no additional device is needed, and pending further validation, smartphone accelerometry might be a convenient outcome of real-life ambulation in healthy individuals and in chronic diseases such as MS. Moreover, activity estimates from smartphones might be more ecologically valid, as the perceived invasiveness of the assessment is lower than for additional and clearly visible devices.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ärztekammer Hamburg. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
YZ: designed and conceptualized the study, major role in the acquisition of data, analyzed the data, and drafted the manuscript for intellectual content. NN: major role in the acquisition of data and revised the manuscript for intellectual content. JP: interpreted the data and revised the manuscript for intellectual content. EG: major role in the acquisition of data, interpreted the data, and revised the manuscript for intellectual content. CH: interpreted the data and revised the manuscript for intellectual content. J-PS: designed and conceptualized the study, analyzed the data, and drafted the manuscript for intellectual content. All authors contributed to the article and approved the submitted version.

FUNDING
Parts of this study were funded by a grant from Biogen.