Ecological validity of a deep learning algorithm to detect gait events from real-life walking bouts in mobility-limiting diseases

Introduction: The clinical assessment of mobility, and walking specifically, is still mainly based on functional tests that lack ecological validity. Thanks to inertial measurement units (IMUs), gait analysis is shifting to unsupervised monitoring in naturalistic and unconstrained settings. However, the extraction of clinically relevant gait parameters from IMU data often depends on heuristics-based algorithms that rely on empirically determined thresholds. These were mainly validated on small cohorts in supervised settings.

Methods: Here, a deep learning (DL) algorithm was developed and validated for gait event detection in a heterogeneous population of different mobility-limiting disease cohorts and a cohort of healthy adults. Participants wore pressure insoles and IMUs on both feet for 2.5 h in their habitual environment. The raw accelerometer and gyroscope data from both feet were used as input to a deep convolutional neural network, while reference timings for gait events were based on the combined IMU and pressure insoles data.

Results and discussion: The results showed a high detection performance for initial contacts (ICs) (recall: 98%, precision: 96%) and final contacts (FCs) (recall: 99%, precision: 94%) and a maximum median time error of −0.02 s for ICs and 0.03 s for FCs. Subsequently derived temporal gait parameters were in good agreement with a pressure insoles-based reference, with a maximum mean difference of 0.07, −0.07, and <0.01 s for stance, swing, and stride time, respectively. Thus, the DL algorithm is considered successful in detecting gait events in ecologically valid environments across different mobility-limiting diseases.



Introduction
Mobility is the ability to move about in the home and community (1). Mobility can be affected by chronic health conditions, including but not limited to neurological, respiratory, cardiac, and musculoskeletal disorders (2). Deficits in mobility have been linked with a reduced quality of life, an increased fall risk, and mortality (2, 3); therefore, mobility is regarded as an essential aspect of health (4). The most common and functionally relevant aspect of mobility that is affected by aging and chronic health conditions is walking (1, 5).
To date, the clinical assessment of mobility is based on functional tests that include short walking tasks (6-9). A common shortcoming of these functional tests is the lack of ecological validity: walking, as measured in clinical settings, does not reflect daily life walking (3, 10-12). The transition to unsupervised monitoring of human motion in naturalistic and unconstrained daily life activities is driven mainly by wearable inertial measurement units (IMUs) (4, 13). It is noteworthy that both European and American notified bodies for the certification of medical devices (Medical Device Regulation and Food and Drug Administration, respectively) have meanwhile put focus on wearable sensors by updating their regulations for the design, pre-clinical validation, and clinical validation of devices that include wearable IMUs (13, 14). Similarly, both the European Medicines Agency and the United States Food and Drug Administration encourage the inclusion of parameters from unsupervised patient monitoring as exploratory endpoints in clinical trials (11, 15).
A critical step for the objective analysis of gait is the segmentation of gait sequences into gait cycles (16-18), i.e., the basic repetitive unit of which gait is composed (19, 20). The beginning and end of each gait cycle, also referred to as a stride, are often determined from two successive initial contacts (ICs) of the same foot (19, 20). Together with the instant at which the foot leaves the ground (i.e., final contact, FC), each stride can be divided into a stance and a swing phase (18-21). ICs and FCs are commonly referred to as gait events (19, 20, 22) and are a prerequisite for any further clinical gait analysis (18). The detection of ICs and FCs from IMUs is typically done using heuristics-based algorithms (23-30). Many of these algorithms use local maxima or minima of the acceleration and/or angular velocity signals along one axis (31), which requires knowledge of the sensor-to-segment alignment (32, 33). However, in unsupervised human gait monitoring, the sensor-to-segment alignment cannot be controlled, as study participants often attach the sensor themselves, for example, after showering (34). Therefore, establishing the technical validity of these algorithms for unsupervised human gait monitoring remains an ongoing challenge, also because labeled free-living gait data are scarce (35-37). Additionally, IMU-based gait signals are affected by disease characteristics, participant activity levels, and the exact context in which walking takes place; therefore, any heuristics-based algorithm that was developed on lab-based gait data might not translate directly to free-living gait (3, 11, 15, 30, 38).
In contrast to the aforementioned heuristics-based algorithms, machine learning-based algorithms do not depend on user-defined sets of rules but rather learn to recognize gait signals directly from annotated data (39-41). Hidden Markov models (HMMs), for example, were successfully applied for gait segmentation in healthy (42, 43) and pathological gait (42, 44), but only in-lab recorded gait data were used to check for validity. A recent study used HMMs to segment gait cycles from free-living gait data and reached 96% recall and 89% precision; however, data were only from participants with Parkinson's disease (PD) (45). Although HMMs thus seem a good fit for modeling the sequential nature of the gait cycle, the number of discrete states still needs to be defined beforehand, and a separate model per activity would be needed if more than just gait were to be detected (46, 47). Deep learning (DL)-based algorithms provide an alternative approach that does not require any heuristic rules but rather learns relevant data representations automatically from a set of input features and reference annotations (40, 41, 48, 49). DL algorithms have been successfully applied for gait event detection from stereophotogrammetric data (50-54) and from inertial measurement unit data (34, 55), however, only for in-lab gait data.
Therefore, the specific aim of the current study was to determine whether a previously in-lab validated DL-based algorithm (34) for the detection of ICs and FCs can be used for the detection of gait events in pre-extracted real-life walking bouts in a heterogeneous cohort of different mobility-limiting diseases. For the current study, walking bouts were defined according to the recently published consensus framework for digital mobility monitoring (2).

Materials and methods

Data collection

Study participants
As part of the Mobilise-D technical validation study (56), a convenience sample of 108 participants was recruited at five independent study sites (Newcastle upon Tyne Hospitals NHS Foundation Trust, UK; Sheffield Teaching Hospitals NHS Foundation Trust, UK; Tel Aviv Sourasky Medical Center, Israel; Robert Bosch Foundation for Medical Research, Germany; University of Kiel, Germany). The sample represented five mobility-limiting disease cohorts [congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD), multiple sclerosis (MS), Parkinson's disease (PD), and proximal femoral fracture (PFF)] and a cohort of healthy older adults (HA) (56). These cohorts cover a range of walking speeds, mobility challenges, and potential events that are of clinical interest, such as improving vs. worsening of function, falls, hospitalization, nursing home admission, and death. Furthermore, as the participants were recruited at five different sites across Europe, they ensured a geographical representation and covered a diverse representation of healthcare organization, such as in- vs. outpatient care, as well as public vs. private health services (1, 56). Participants needed to be able to walk 4 m independently, to give informed consent, and to have a Montreal Cognitive Assessment score > 15 (57). A detailed description of inclusion and exclusion criteria is provided elsewhere (56), and ranges of values for cohort-specific clinical scales are detailed in Table 1.

Study protocol
Study participants were equipped with the INertial module with Distance sensors and Pressure insoles (INDIP) system that included both pressure insoles (PIs) and IMUs to record movement signals from both feet and the lower back (27, 58, 59). Participants wore the INDIP system for 2.5 h in their habitual environment, e.g., home, work, community, and/or outdoor environment, which was chosen by the participant, with no specific restrictions (56). To capture the largest possible range of activities, participants were provided with a list of activities that could be included if relevant to their chosen environment (e.g., rising from a chair, walking to another room, and walking outdoors). No supervision or structure as to how these tasks were completed was given to the participants. The duration of the observation was established as a trade-off between experimental, clinical, and technical requirements (56).
Data processing

Data preparation
Data from the INDIP system were synchronized by setting the clock to have the same timestamp for all the sensors between the left and right foot, and values were recorded at a sampling frequency, $f_s$, of 100 Hz. As input to the DL algorithm, only the raw accelerometer and gyroscope data from both feet were used. Data were split into three different datasets: a training set, a validation set, and a testing set (40, 41). For this purpose, for each of the six cohorts, data from approximately 20% of the participants were assigned to the testing set, data from another 20% of the participants were assigned to the validation set, and data from the remaining participants were used as the training set.
The validation set was used to find an optimal network architecture using grid search (60), and the training set was used to optimize the corresponding model parameters (40, 41). The testing set was only used for the final evaluation, and notably, the numbers presented in the Results section correspond only to the performance on the testing set.
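The participant-wise split described above can be illustrated with a minimal sketch. It assumes a hypothetical mapping `cohort_of` from participant ID to cohort label and a fixed random seed; the function name, the rounding rule, and the shuffling are illustrative choices, not the procedure used in the study.

```python
import random
from collections import defaultdict

def split_participants(cohort_of, test_frac=0.2, val_frac=0.2, seed=0):
    """Assign whole participants to train/val/test, separately per cohort.

    cohort_of: dict mapping participant ID -> cohort label (hypothetical input).
    Splitting by participant keeps all bouts of one person in a single set.
    """
    by_cohort = defaultdict(list)
    for pid, cohort in sorted(cohort_of.items()):
        by_cohort[cohort].append(pid)

    rng = random.Random(seed)
    split = {"train": [], "val": [], "test": []}
    for pids in by_cohort.values():
        rng.shuffle(pids)
        # Approximately 20% each for testing and validation, at least one each.
        n_test = max(1, round(test_frac * len(pids)))
        n_val = max(1, round(val_frac * len(pids)))
        split["test"] += pids[:n_test]
        split["val"] += pids[n_test:n_test + n_val]
        split["train"] += pids[n_test + n_val:]
    return split
```

Splitting at the participant level, rather than at the level of walking bouts or strides, avoids data from the same person appearing in both the training and the testing set.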

Reference system
For all data, the gait events, that is, both ICs and FCs, were detected separately from the PIs and the IMUs of the INDIP system, which is described in detail elsewhere (61), to meet the emerging demands associated with reproducibility and replicability in biomedical research and regulatory qualification (62). Then, the results were combined, and priority was given to the PIs in case both modalities detected an event (63). For the PIs, foot-ground contact was defined when at least three sensing elements from the PI belonging to the same spatial neighborhood were consecutively activated and deactivated (64). For the IMUs, an existing algorithm, originally designed for shank-worn IMUs, was adapted for use with foot-worn IMUs. Previously, it was validated under supervised conditions for the detection of gait events in older, hemiparetic, parkinsonian, and choreic gait (27, 65) and across multiple research centers for parkinsonian gait and gait in mild cognitive impairment (66).
From these gait events, walking bouts (WBs) were formed by merging information from left and right strides (27, 28). Each WB represented a gait sequence with a minimum of two left and two right strides (2, 63). Here, strides were only considered valid if (i) the stride duration was between 0.2 and 3 s and (ii) the stride length was at least 0.15 m. Consecutive WBs were separated by a resting period of more than 3 s; thus, each WB could contain resting periods of ≤3 s.
For the current study, we analyzed only those WBs that lasted ≥10 s (67-70) and for which both the INDIP's PIs and IMUs were used for determining the gait events. These gait events were considered as reference annotations for training and evaluating the DL algorithm.
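The stride validity rules and the bout criteria above can be sketched as follows. This is a simplified illustration, not the INDIP processing pipeline: the tuple layout `(start_s, end_s, length_m, side)`, the function names, and the treatment of the duration bounds as inclusive are assumptions.

```python
def is_valid_stride(duration_s, length_m):
    """Stride duration between 0.2 and 3 s and stride length of at least 0.15 m."""
    return 0.2 <= duration_s <= 3.0 and length_m >= 0.15

def form_walking_bouts(strides, max_rest_s=3.0, min_duration_s=10.0):
    """Group valid strides into walking bouts.

    strides: list of (start_s, end_s, length_m, side) with side "L" or "R".
    A gap of more than max_rest_s between strides starts a new bout.
    Returns (start_s, end_s) of bouts with >= 2 left and >= 2 right strides
    that last at least min_duration_s.
    """
    valid = sorted(s for s in strides if is_valid_stride(s[1] - s[0], s[2]))
    groups, current = [], []
    for s in valid:
        if current and s[0] - max(e for _, e, _, _ in current) > max_rest_s:
            groups.append(current)
            current = []
        current.append(s)
    if current:
        groups.append(current)

    bouts = []
    for g in groups:
        start, end = g[0][0], max(e for _, e, _, _ in g)
        n_left = sum(1 for s in g if s[3] == "L")
        n_right = sum(1 for s in g if s[3] == "R")
        if n_left >= 2 and n_right >= 2 and end - start >= min_duration_s:
            bouts.append((start, end))
    return bouts
```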

Deep learning algorithm
The DL algorithm was based on the neural network (NN) that was previously validated on in-lab gait data from shank-worn IMUs worn by participants with different neurological diseases (34, 71). At the core of the NN was a temporal convolutional network (TCN) (72, 73). The TCN was built by stacking residual blocks (74) with an exponentially increasing dilation factor for the convolutional layers (Figure 1). Specifically, each residual block comprised two sequences of a dilated convolution (Conv) layer (75), a batch normalization (BatchNorm) layer (76), a rectified linear unit (ReLU) activation layer, and a dropout layer (77). A residual connection was used to perform convolution with a kernel size of 1 in case the number of feature maps did not match the number of input channels (72, 73). The outputs of the second dropout layer and the residual connection were summed elementwise and passed to a ReLU activation layer. The convolution layers consisted of 64 filters with a kernel size of 3 and a dilation factor of $2^{m-1}$ with $m = 1, \dots, N_{dil}$ for the $m$-th residual block (with $N_{dil} = 6$, the number of residual blocks, and thus, the maximum dilation factor was $2^{5} = 32$).
The outputs of the last residual block were passed through a fully connected (also referred to as dense) layer followed by a softmax activation layer (78, 79). The final outputs were then regarded as the probability that a certain gait event took place at the given time step, $t_n$.
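To give a sense of how much temporal context such a network can see, the receptive field implied by the stated hyperparameters can be worked out in a few lines. The sketch assumes that both convolutions in a residual block share the block's dilation factor (as in a standard TCN) and that the kernel-size-1 residual connection adds no context; the function name is illustrative.

```python
def tcn_receptive_field(kernel_size=3, n_blocks=6, convs_per_block=2):
    """Receptive field (in samples) of stacked dilated convolutions.

    Each convolution with dilation d widens the field by (kernel_size - 1) * d.
    """
    rf = 1
    for m in range(1, n_blocks + 1):
        dilation = 2 ** (m - 1)  # 1, 2, 4, 8, 16, 32
        rf += convs_per_block * (kernel_size - 1) * dilation
    return rf

rf = tcn_receptive_field()
print(rf, rf / 100.0)  # 253 samples, i.e., 2.53 s at a sampling frequency of 100 Hz
```

Under these assumptions, each output probability depends on roughly 2.5 s of signal, which is on the order of one to two strides.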

Evaluation
As in our previous study (34), the performance was evaluated with the testing set only. The trained model was used to predict, from the IMU data, the probability that any gait event occurred. Peak probabilities, with a minimum probability, Pr = 0.5, and a minimum interpeak distance, t = 0.5 s, were considered detected events.
Performance was evaluated in terms of the overall detection performance, the time agreement between predicted and annotated gait event timings, and the time agreement between subsequently derived stride-specific gait parameters.
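The peak-picking step that turns predicted probabilities into detected events can be sketched as below. This is a plain-Python illustration under stated assumptions: `prob` is a hypothetical list of per-sample probabilities for one event type and one foot, and the greedy "highest peak first" rule for enforcing the minimum interpeak distance is one common choice, not necessarily the one used in the study.

```python
def pick_events(prob, fs=100.0, min_prob=0.5, min_dist_s=0.5):
    """Return sample indices of probability peaks that count as detected events."""
    min_dist = int(round(min_dist_s * fs))
    # Candidate peaks: local maxima at or above the probability threshold.
    candidates = [
        i for i in range(1, len(prob) - 1)
        if prob[i] >= min_prob and prob[i - 1] < prob[i] >= prob[i + 1]
    ]
    # Keep the highest peaks first and drop any peak closer than min_dist
    # samples to a peak that has already been kept.
    kept = []
    for i in sorted(candidates, key=lambda i: prob[i], reverse=True):
        if all(abs(i - j) >= min_dist for j in kept):
            kept.append(i)
    return sorted(kept)
```

Dividing the returned indices by the sampling frequency gives the predicted event timings in seconds.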

Overall detection performance
The overall detection performance quantified how many of the annotated gait events were detected (true positives), how many of the annotated gait events were not detected (false negatives), and how many of the detected events were not annotated (false positives). From these numbers, the recall (also referred to as sensitivity) and precision (also referred to as positive predictive value) were calculated as follows:

$$\text{recall} = \frac{\#\,\text{true positives}}{\#\,\text{true positives} + \#\,\text{false negatives}}, \tag{1}$$

$$\text{precision} = \frac{\#\,\text{true positives}}{\#\,\text{true positives} + \#\,\text{false positives}}. \tag{2}$$

Thus, the recall represented the fraction of annotated events that were detected, and the precision represented the fraction of detected events that were truly gait events.
Here, in case the absolute time difference between an annotated and predicted event was ≤250 ms, it was considered a true positive event (30, 34, 80, 81) (in other words, a tolerance window of 500 ms centered around the reference timing was used).
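A minimal sketch of how detected events might be matched to annotations within the ±250 ms tolerance, and how recall and precision then follow, is given below. The one-to-one nearest-neighbor matching rule and the function names are assumptions made for illustration; the source only specifies the tolerance.

```python
def match_events(ref, pred, tol_s=0.25):
    """Match each annotated event to the nearest unused prediction within tol_s.

    ref, pred: event timings in seconds. Returns (pairs, tp, fn, fp), where
    pairs holds (t_ref, t_pred) for every true positive.
    """
    ref, pred = sorted(ref), sorted(pred)
    used, pairs = set(), []
    for r in ref:
        best = None
        for k, p in enumerate(pred):
            if k in used or abs(p - r) > tol_s:
                continue
            if best is None or abs(p - r) < abs(pred[best] - r):
                best = k
        if best is not None:
            used.add(best)
            pairs.append((r, pred[best]))
    tp = len(pairs)
    return pairs, tp, len(ref) - tp, len(pred) - tp

def recall_precision(tp, fn, fp):
    """Recall = TP / (TP + FN); precision = TP / (TP + FP)."""
    recall = tp / (tp + fn) if tp + fn else float("nan")
    precision = tp / (tp + fp) if tp + fp else float("nan")
    return recall, precision
```

For example, with annotations at 1.0, 2.0, 3.0, and 4.0 s and predictions at 1.02, 1.98, 3.4, and 5.0 s, two events are matched, two annotations are missed, and two predictions are spurious, giving a recall and a precision of 0.5 each.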

Time agreement
For all correctly detected gait events (true positives), the time agreement between the detected and annotated event timings was quantified by

$$\text{time error} = t_{ref} - t_{pred},$$

where $t_{pred}$ is the timing corresponding to the peak probability and $t_{ref}$ is the timing of the INDIP-derived annotations.
As a robust measure for the time agreement and its spread, the median time error and the inter-quartile range (IQR) were computed (82), and time agreements were visualized using box plots.
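The median time error and IQR can be computed with the Python standard library, as sketched below. The sign convention (reference minus predicted timing, so that an early detection gives a positive error) is an assumption taken from the Figure 2 caption, and the "inclusive" quartile method is an illustrative choice; other quartile definitions give slightly different IQR values.

```python
import statistics

def time_error_summary(pairs):
    """Median and inter-quartile range of time errors for matched events.

    pairs: list of (t_ref, t_pred) in seconds for true positives only
    (at least two pairs are needed to compute quartiles).
    """
    errors = [t_ref - t_pred for t_ref, t_pred in pairs]
    q1, _, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    return statistics.median(errors), q3 - q1
```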

Stride-specific gait parameters
For those strides where both ICs and the FC in between were detected, the stance, swing, and stride times were computed (19, 20, 83). Stance time was the time between an FC and the preceding IC of the same foot, swing time was the time between an IC and the preceding FC of the same foot, and stride time was the time between two consecutive ICs of the same foot (34, 83).
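These definitions translate directly into code. The sketch below assumes `ics` and `fcs` are hypothetical lists of event timings (in seconds) for a single foot within one walking bout, and it skips any pair of consecutive ICs that does not enclose exactly one FC; that skipping rule is an illustrative simplification.

```python
def temporal_parameters(ics, fcs):
    """Stance, swing, and stride times (s) for one foot.

    A stride runs from one IC to the next IC of the same foot; the FC in
    between splits it into a stance phase followed by a swing phase.
    """
    ics, fcs = sorted(ics), sorted(fcs)
    strides = []
    for ic0, ic1 in zip(ics, ics[1:]):
        between = [fc for fc in fcs if ic0 < fc < ic1]
        if len(between) != 1:
            continue  # no FC (or more than one) between the two ICs
        fc = between[0]
        strides.append({
            "stance": fc - ic0,   # FC minus the preceding IC
            "swing": ic1 - fc,    # IC minus the preceding FC
            "stride": ic1 - ic0,  # two consecutive ICs
        })
    return strides
```

By construction, stance and swing time sum to the stride time, so an error in the FC timing shifts time between stance and swing without changing the stride time.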
For each of these temporal gait parameters, the mean time difference and the limits of agreement (LoA) based on a 95% confidence interval (CI) were computed (82). Differences were visualized using Bland-Altman plots (84, 85).
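A minimal sketch of the agreement statistics follows. It assumes the common Bland-Altman convention of limits of agreement at the mean difference ± 1.96 sample standard deviations, and it takes differences as algorithm minus reference; both the multiplier and the sign convention are assumptions, since the source does not spell them out.

```python
import statistics

def bland_altman(pred_values, ref_values):
    """Mean difference and 95% limits of agreement between paired measurements.

    pred_values, ref_values: paired gait parameters (s), e.g., stride times
    from the algorithm and from the reference system.
    """
    diffs = [p - r for p, r in zip(pred_values, ref_values)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff, (mean_diff - 1.96 * sd, mean_diff + 1.96 * sd)
```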

Results

Demographics
Data were collected from 108 different participants, and data from 99 participants were used for the current study (Table 1). Data from the other participants were excluded due to incomplete or missing data from the INDIP system or because no WBs ≥ 10 s were recorded. Eventually, the DL-based algorithm was evaluated for its performance in detecting gait events of 13,100 strides divided over 295 bouts recorded from 17 participants in the testing set.

Overall detection performance
The overall detection performance was quantified by the number of true positives, number of false negatives, and number of false positives. From these numbers, the recall and precision were calculated (Table 2). In total, from 13,134 ICs, the algorithm detected 12,985 events (i.e., 99%) and missed 169 events (i.e., 1%), and similarly, from 12,838 FCs, the algorithm detected 12,747 events (i.e., 99%) and missed 91 events (i.e., 1%). When evaluated per cohort, the recall for the IC detection was ≥98%, and the precision was ≥96%. Similarly, the recall was ≥99%, and the precision was ≥94% for FC detection for all cohorts.

Time agreement
For all the correctly detected events, i.e., true positives, the difference between the detected event timing and the annotated event timing was calculated according to Equation (6). The median time error was close to 0 s with the IQR enclosing a zero difference for both ICs and FCs for all cohorts, except for the PFF cohort (Figure 2). The PFF cohort showed a median time error of −0.02 s and an IQR of 0.03 s for IC detection, and a median time error of 0.03 s and an IQR of 0.05 s for FC detection (Table 3).

Stride-specific gait parameters
For those strides that had two correctly detected ICs and a correctly detected FC in between, stride-specific temporal gait parameters (i.e., stance time, swing time, and stride time) were calculated. For all cohorts, the mean differences between the stance, swing, and stride times derived from the detected events and those derived from the annotations were close to zero, with the LoA encapsulating a zero-mean difference (Figure 3). Notably, for the PFF cohort, the mean time difference for the stance time was +0.07 s, and the mean time difference for the swing time was −0.07 s, which resulted in a zero-mean difference for the stride time (Table 4). Similarly, for all gait phases, the absolute errors were 0.04 s or less for all cohorts, except the PFF cohort (Table 5). This resulted in a relative time error for the stride times of ≤2% across all cohorts, but for the swing times, the relative time error for the PFF cohort was 27%, and for the COPD cohort, it was 12%.

Discussion
The specific aim of the current study was to determine whether a previously in-lab validated DL-based gait event detection algorithm (34) could be used for the detection of gait events from real-life walking bouts in a heterogeneous cohort of different mobility-limiting diseases. For that purpose, participants from different disease cohorts (CHF, COPD, MS, PD, and PFF) and a cohort of healthy adults were equipped with the INDIP system that consisted of PIs and IMUs for both feet. Participants wore the INDIP system for 2.5 h in their habitual environment, as chosen by the participants, and a wide range of activities was recorded in these ecologically valid environments. Data from the PIs and IMUs were used to generate reference timings for ICs and FCs, whereas raw data from the accelerometer and gyroscope were used as the input to the DL algorithm to identify ICs and FCs. The recall and precision of gait events were used as a general measure for the detection performance and were considered high (i.e., recall ≥ 98% and precision ≥ 96%). For comparison, in Trojaniello et al. (27), no missed or extra gait events were observed in a heterogeneous sample of elderly, hemiparetic, parkinsonian, and choreic gait, but data were only collected from walking back and forth for 1 min in a 12 m walkway. Similarly, high recall and precision (≥98%) were reported for a continuous wavelet transform (CWT)-based algorithm, but it was evaluated only for 13 healthy participants and 3 hemiplegic participants who walked continuously along a 10 m walkway (86). A recent study (45) found a recall of 96% and a precision of 89% in a cohort of 28 PD participants who wore two IMUs on the feet for 2 weeks; both values are slightly lower than the recall and precision from the current study. Overall, the data of the studies presented here, including the present study, indicate that very high recall and precision values can be achieved with the deep learning approach for the detection of gait events. This, together with the higher flexibility of DL-based algorithms compared to conventional algorithms, speaks for the future use of such algorithms for the detection of gait in mobility-limiting diseases, also in the habitual environment.

Values represent the mean and 95% confidence interval of all stances, swings, and strides of the test subjects for the given cohort. CHF, congestive heart failure; COPD, chronic obstructive pulmonary disease; DL, deep learning; HA, healthy adults; MS, multiple sclerosis; PD, Parkinson's disease; PFF, proximal femoral fracture.
For the correctly detected gait events, the time differences between the predicted event timings and the annotated event timings were quantified as a measure of temporal agreement between the reference system and the DL-based algorithm. The time differences were in the same range as those previously reported for CWT-based (23, 27, 30, 86, 87) and DL-based algorithms (34) validated on in-lab gait data. To put this into perspective, studies that evaluated the time differences of detected gait events from PIs when compared to force plates or instrumented walkways also reported time differences in the range from 0.02 s to 0.04 s (17, 64, 87). For the INDIP pressure insole method, a negligible delay (<10 ms) was observed for FCs, and a consistent IC anticipation (20 ms) was found when compared to force plates (64). This suggests that a certain margin of uncertainty should be considered when interpreting gait event timing differences for the DL-based algorithm.
Finally, stride-specific gait parameters were derived for the correctly detected events. These may be of greatest clinical relevance since changes in spatiotemporal gait parameters were associated with a shorter time to PD diagnosis (88) and with conversion from mild cognitive impairment to Alzheimer's disease (89), and values of temporal gait parameters were different in disease cohorts compared to healthy cohorts (90-92). Here, a zero-mean time difference was found for the stride times for all cohorts. Similarly, the time differences for stance and swing times were centered around a zero-mean difference for all cohorts, with only the mean differences for the stance and swing time of the PFF cohort being somewhat larger (0.07 s and −0.07 s for the stance and swing time, respectively). The mean differences for stance and swing times in the PFF cohort may in part be explained by the altered gait pattern that is observed in this cohort (93, 94). Nonetheless, the time agreement for the stride-specific temporal gait parameters derived from the DL algorithm and the reference system was in a similar range as that reported before for a comparable DL-based approach that evaluated results only from straight-line walking in a supervised laboratory setting (55).
The very good results that were obtained in the current study for IMUs worn on both feet (56), combined with the results for a single shank-worn IMU from our previous study (34), provide evidence that the algorithm performance generalizes to other sensor wear locations and to free-living gait data. The current algorithm has the additional benefit that it does not require knowledge of the exact sensor location and orientation relative to the feet, contrary to many previously validated algorithms (23, 24, 31, 34, 95). This has the practical consequence that there are less stringent requirements for study participants or future patients on how to attach the sensors to their feet. Because the input data for the previous validation consisted of the raw accelerometer and gyroscope signals from a single sensor that was located either laterally above the ankle joint or medially below the knee joint (34), the algorithm was trained, validated, and tested anew for the current validation. Both studies show a high recall and precision, highlighting that the algorithm is capable of detecting gait events from different sensor locations without loss of accuracy, provided that sufficient training data are available for any new sensor location (34). Furthermore, the algorithm performance was evaluated across a broad spectrum of five different mobility-limiting disease cohorts, and although the number of participants in the testing set for each cohort was low, it showed that the algorithm was able to accurately detect gait events in heterogeneous pathological gait patterns. This will ultimately allow future users of the algorithm to perform not only sensitivity analyses for individual cohorts but also specificity analyses across different cohorts.
One limitation of the current study is that only data from detected WBs were used. This means that gait event detection relied on the accurate detection of gait sequences as a preceding step (45). However, several algorithms have been reported for accurate IMU-based gait sequence detection in both healthy and disease cohorts (24, 25, 28, 29, 46, 96-100). Furthermore, data from some participants had to be excluded from analysis due to missing or incomplete data, which was mainly due to issues with the PIs. As reference timings for gait events are still obtained mainly from force or pressure measuring devices (23), this illustrates the difficulty of obtaining a dataset with annotated gait events on completely unsupervised free-living gait data (35-37, 45). To get a better picture of the algorithm's generalizability to other datasets, it needs to be tested on previously unseen datasets, for example, with a slightly different sensor setup, such as in Martindale et al. (46).
In addition, the study did not evaluate clinical aspects in detail, such as medication and symptom fluctuations. This is, in part, due to the heterogeneous sample of participants with different mobility-limiting diseases. Consequently, the current study did not focus on identifying, for example, digital biomarkers of disease progression, for which a greater sample size of a specific disease would be required. However, as this study compares systems in the same person at one point in time on a mobility aspect, we believe that this does not influence the results reported here. Furthermore, it should be stressed that the heterogeneous sample is an asset of the current study, as the results show that the algorithm achieves excellent performance for different pathological gait patterns. Given the time span of 2.5 h, we did not specifically investigate whether disease-associated gait abnormalities, such as freezing of gait in PD (101), were captured by the recording. However, the duration of the assessment was chosen as a trade-off between experimental, clinical, and technical requirements (56) and is five times longer than the recommendations for validation procedures of assessing physical activity in older adults (102). Lastly, the current analysis also relied on a peak detection algorithm to identify the most probable timings of gait events (34, 46, 55). However, from a clinical perspective, this may be regarded as a benefit since it would allow a clinician to decide whether to consider certain strides based on how confidently it can be assumed that each was indeed a stride.

Conclusion
This study aimed to validate a DL algorithm for the detection of gait events in an ecologically valid environment across different mobility-limiting disease cohorts. The performance evaluation showed an excellent detection rate and low time errors for both event timings and subsequently derived temporal gait parameters for all cohorts. The DL algorithm reached a performance that was in a similar range or slightly better than approaches that were to date only validated on in-lab recorded gait data or for a specific disease cohort.
As the DL algorithm does not rely on expert-defined decision rules or hand-crafted features nor on exact sensor-to-segment alignment, it poses fewer requirements on the data collection.
Our next steps include extending the current analysis to data from multiple days and evaluating to what extent the DL network can be trained using participant-specific data to improve gait event detection on an individual level. Future studies may also consider the development of novel gold-standard systems that allow validation approaches beyond lower limb movement, for example, to include upper limb movement.

FIGURE Schematic depiction of the deep learning model architecture with a residual block (ResBlock) that is repeated (in this case, six times) before a dense and softmax layer are applied. Inputs to the network are the raw accelerometer and gyroscope data of both left and right inertial measurement units. The outputs are estimated probabilities for each of the gait events for each time step. BatchNorm, batch normalization; Conv, convolution; DropOut, dropout; ReLU, rectified linear unit.

FIGURE Time difference between the predicted and reference event timings for initial and final contacts evaluated per cohort. A positive time difference corresponded to an advanced detection. CHF, congestive heart failure; COPD, chronic obstructive pulmonary disease; MS, multiple sclerosis; HA, healthy adults; PD, Parkinson's disease; PFF, proximal femoral fracture.

FIGURE Bland-Altman plots for the stance, swing, and stride times evaluated per cohort. The gray solid line corresponds to the overall mean difference, and the dashed lines correspond to the mean difference ± standard deviation. CHF, congestive heart failure; COPD, chronic obstructive pulmonary disease; DL, deep learning; HA, healthy adults; MS, multiple sclerosis; PD, Parkinson's disease; PFF, proximal femoral fracture.
TABLE Dataset details for training, validation, and testing sets, including the total number of bouts and strides.
CAT, COPD assessment test; CHF, congestive heart failure; COPD, chronic obstructive pulmonary disease; EDSS, Expanded disability status scale; FEV1, Forced expiratory volume in 1 s; HA, healthy adults; HandY, Hoehn and Yahr scale; KCCQ, Kansas City cardiomyopathy questionnaire; MS, multiple sclerosis; PD, Parkinson's disease; PFF, proximal femoral fracture; SPPB, short physical performance battery; UPDRS, Movement Disorder Society-sponsored Unified Parkinson's Disease Rating Scale, part III. Age, height, and weight are presented as mean (standard deviation), and the clinical scales are presented as mean [minimum, maximum].
TABLE Time differences between the predicted event timings and the annotated event timings evaluated per cohort.

TABLE Stance, swing, and stride times obtained from the reference and the DL algorithm, and the absolute and relative time errors for comparison.