The Maximum Lyapunov Exponent During Walking and Running: Reliability Assessment of Different Marker-Sets

The maximum Lyapunov exponent (MLE) has often been suggested as the prominent measure for evaluation of dynamic stability of locomotion in pathological and healthy populations. Although the popularity of the MLE has increased in recent years, there is scarce information on the reliability of the method, especially during running. The purpose of the current study was, thus, to examine the reliability of the MLE during both walking and running. Sixteen participants walked and ran on a treadmill completing two measurement blocks (i.e., two trials per day for three consecutive days per block) separated by 2 months on average. Six different marker-sets on the trunk were analyzed. Intraday, interday and between blocks reliability was assessed using the intraclass correlation coefficient (ICC) and the root mean square difference (RMSD). The MLE was on average significantly higher (p < 0.001) in running (1.836 ± 0.080) compared to walking (1.386 ± 0.207). All marker-sets showed excellent ICCs (>0.90) during walking and mostly good ICCs (>0.75) during running. The RMSD ranged from 0.023 to 0.047 for walking and from 0.018 to 0.050 for running. The reliability was better when comparing MLE values between blocks (ICCs: 0.965–0.991 and 0.768–0.961; RMSD: 0.023–0.034 and 0.018–0.027 for walking and running respectively), and worse when considering trials of the same day (ICCs: 0.946–0.980 and 0.739–0.844; RMSD: 0.042–0.047 and 0.045–0.050 for walking and running respectively). Further, different marker-sets affect the reliability of the MLE in both walking and running. Our findings provide evidence that the assessment of dynamic stability using the MLE is reliable in both walking and running. More trials spread over more than 1 day should be considered in study designs with increased demands of accuracy independent of the locomotion condition.


INTRODUCTION
Stability is crucial for uninterrupted task execution in dynamic conditions such as locomotion and requires effective regulation by the CNS (Schöner and Kelso, 1988; Massion, 1992; Patla, 2003; Ting et al., 2009; Bohm et al., 2015). As such, dynamic stability during gait refers to the ability of the system to maintain functional locomotion (i.e., not leading to falls) despite the presence of kinematic disturbances or control errors (England and Granata, 2007; Bruijn et al., 2013). One parameter for numerically evaluating dynamic stability during locomotion is the maximum Lyapunov exponent (MLE), which is calculated using nonlinear time series analysis and has been adopted as a criterion for the occurrence of control errors (Dingwell and Cusumano, 2000; Buzzi et al., 2003; Bruijn et al., 2013, 2014). The MLE is based on Lyapunov's theory of dynamic stability, initially formulated to assess the sensitivity of a mechanical system to small perturbations, and is often used to quantify how the patterns of gait kinematics change in response to small perturbations (Lyapunov, 1992; Ihlen et al., 2017). While arguments can be made for any of the derived stability measures, recent reviews suggested the use of the MLE as a prominent measure of dynamic stability (Hamacher et al., 2011; Bruijn et al., 2013; Mehdizadeh, 2017), which has thus received extensive attention in recent years (Wurdeman et al., 2014; Terrier, 2015, 2017; Hamacher et al., 2016; Wu et al., 2016; Chini et al., 2017; Mehdizadeh, 2017; Vieira et al., 2017; Wickstrom et al., 2017).
Although the popularity of the MLE has increased in the context of movement science, there is scarce information on the reliability of the method, especially when comparing measurements performed in a pre-post design after specific therapy or exercise interventions. Previous studies in walking conditions reported good intrasession reliability (Kang and Dingwell, 2006b; van Schooten et al., 2013; Rábago et al., 2015). Based on this, it was argued that differences between young and older adults (Buzzi et al., 2003; Kang and Dingwell, 2008; Terrier and Reynard, 2015; Mehdizadeh, 2017), as well as in patients with moderate neurological gait disorders where instability is expected, can be detected with the MLE. However, the reliability of the MLE is lower between days (van Schooten et al., 2013). In clinical settings where the evaluation of therapies in a pre-post design is required, the reduced between-days reliability limits the detection of therapy-related alterations. Furthermore, demonstrating acute changes after learning or short-term adaptation (for example, while walking in different environments) requires a high degree of accuracy (Hak et al., 2012). Based on the reported reliability between days (van Schooten et al., 2013), detection of differences after exercise or therapy-induced adaptations might not be feasible. Using a block of measurements within several consecutive days to define a representative value of the MLE might increase the reliability, thus improving the ability to detect small alterations in the MLE. To date there is no information regarding the reliability of the MLE if more measurement days are included in the calculation. Beyond walking, several recent studies investigated the dynamic stability of running using the MLE (refs. 37−40). However, there is no available information regarding intraday or between-days reliability of the MLE during running.
Nonlinear time series analysis is a valuable tool for examining the invariants of a dynamical system, but is sensitive to different methodological approaches (Kantz and Schreiber, 2004; Bradley and Kantz, 2015). To date, no consensus exists regarding the data acquisition strategies for the calculation of the MLE. While the computational aspects of the MLE calculation have been frequently examined (Bruijn et al., 2009b; Bradley and Kantz, 2015; Mehdizadeh and Sanjari, 2017; Reynard and Terrier, 2017), there is no comprehensive study examining marker placement and clustering as data acquisition strategies. Neuromuscular control of the superior segment (trunk) is believed to enable humans to maintain stability (Winter et al., 1993), and trunk control is thought to be prioritized over that of inferior segments (Cromwell et al., 2004). As such, while the trunk is suggested to be representative of the stability of the human system (Kang and Dingwell, 2009; Beurskens et al., 2014), previous studies examining the MLE employed diverse placements and quantities of markers or accelerometers. For instance, the sternum, the first (Dingwell and Marin, 2006) and sixth (Bruijn et al., 2009a; van Schooten et al., 2011) thoracic vertebrae, and the second (Sejdić et al., 2013) and fifth (Terrier and Dériaz, 2011; van Schooten et al., 2013) lumbar vertebrae have been used, while clusters of two or six markers (Kang and Dingwell, 2006a,b, 2008) have also been employed for data acquisition and subsequent analysis of dynamic stability. However, through time series analysis we compute a few characteristic numbers from a large sample of data (Kantz and Schreiber, 2004; Bradley and Kantz, 2015), and data collected from different parts of the system can contain different information regarding its states.
Possible disparities in the resulting MLE deriving from different bony landmarks could influence the results of cross-sectional, interventional or prospective study designs and comparisons across different studies that employ the MLE. Moreover, different information regarding the states of the system during locomotion across the different bony landmark positions could have an effect on the resulting reliability. It can be argued that some specific marker-sets on the trunk may provide higher reliability than others, representing in a more useful way the dynamic states of the human body during walking and running.
The purpose of the current study was thus to examine the reliability of the MLE both during walking and running using six different marker-sets fixed on the trunk. Further, we aimed to investigate the effects of the different marker-sets on the MLE values. In doing so, we included comparisons of trials performed within the same day, across different days and between block measurements (i.e., three consecutive days of measurement) separated by a long period of time (on average 2 months). We hypothesized dissimilar MLE values among the marker-sets and a marker-set specific reliability during walking and running (i.e., different marker-sets would exhibit different reliability values) and that the reliability would improve in the block design.

Experimental Protocol
For the current study we recruited 16 young and healthy adults (five female), who were informed of the study's procedures. Anthropometric data of the participants were as follows: 27 ± 5 years of age; 179 ± 9 cm of height; 72 ± 12 kg of body mass. None of the participants had any musculoskeletal or neuromuscular impairments at the time of the measurements or 6 months prior to them. This study was reviewed and approved by the Ethics Committee of the Humboldt-Universität zu Berlin. All the participants gave written informed consent for the experimental procedure, in accordance with the Declaration of Helsinki. The participants came to the lab on six different days in total. Measurements were conducted in two blocks of three consecutive measurement days. The time between blocks was 2 months on average. Each day participants executed two walking and two running trials at their preferred velocity in random order (based on computer-generated random numbers), separated by a short resting period (60 s). In total, the participants performed 12 walking and 12 running trials (i.e., six of each per measurement block). The schematic representation of the protocol is depicted in Figure 1. All participants walked and ran on a treadmill (mercury, H-pcosmos Sports & Medical GmbH, Nussdorf, Germany) with an integrated pressure plate (FDM-THM-S, Zebris Medical GmbH, Germany). A 40-60 s familiarization time preceded each trial. We recorded 270 s for each walking trial and 120 s for each running trial to ensure that a large number of steps was included in our analysis.
The individuals' preferred velocity was determined while walking and running, through the "method of limits" (Treutwein, 1995). Following a self-selected warm-up, an experienced researcher manipulated the velocity (starting at 0.8 m/s) with varying increments of 0.05-0.08 m/s every 5-10 s. The participant would then affirm when his/her comfortable walking velocity was reached, and the whole procedure would repeat starting from a velocity higher than the selected one. The researcher used similar decrements and the participant once again affirmed his/her preferred pace. The whole process was performed at least two times and until the selected values did not differ by more than 10%. The same procedure was followed to determine the preferred running velocity (starting at 1.9 m/s).

Maximum Lyapunov Exponents
FIGURE 1 | Schematic representation of the measurement design. All participants completed two blocks of measurements. Every block included three consecutive days of measurements (two trials per day). The design was the same in walking and running.
Kinematic data were recorded through the use of five high-speed video cameras (Flare 4M180-CCL, IO Industries Inc., Canada) operating at 80 Hz during the walking trials and at 190 Hz during the running trials. We recorded 11 reflective 10 mm-markers positioned on bony landmarks of the trunk. Markers were positioned on the spine at the first (T1), sixth (T6), tenth (T10), and twelfth (T12) thoracic and the second lumbar vertebrae (L2). Further, the scapulae were recorded bilaterally on the acromia, superior and inferior angles (Figure 2). The video tracking was performed using dedicated software (Simi Motion 9.0.4, Simi Reality Motion Systems GmbH, Germany). A fourth order Butterworth 20 Hz low-pass filter was applied to the registered coordinates, maintaining the maximum dynamics of the system (Sinclair et al., 2013). The coordinates of the markers on the T1, T6, T10, and L2 were analyzed separately. In addition to the time series originating from the individual markers, two clustered marker-sets were created by averaging the coordinates of several markers together on each time frame. The coordinates of all 11 captured markers formed the first clustered marker-set (ALL), while the second marker-set (SP) included only the spine markers (T1, T6, T10, T12, L2) which were clustered together as one.
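The clustering step described above (averaging the coordinates of several markers on each time frame) can be sketched as follows. This is only an illustration under stated assumptions: the function name `cluster_series` and the coordinate values are hypothetical, and only three spine markers and four frames are used to keep the example short.

```python
from statistics import fmean


def cluster_series(marker_series):
    """Average several markers' coordinates frame by frame.

    marker_series: list of equally long lists, one per marker
    (e.g., the vertical coordinate of T1, T6, ...).
    Returns a single list with one averaged value per frame.
    """
    lengths = {len(series) for series in marker_series}
    if len(lengths) != 1:
        raise ValueError("all marker series must have the same number of frames")
    # zip(*...) groups the values of all markers that belong to the same frame
    return [fmean(frame) for frame in zip(*marker_series)]


# Hypothetical vertical coordinates (mm) of three spine markers over four frames.
t1 = [1500.0, 1502.0, 1504.0, 1502.0]
t6 = [1300.0, 1301.0, 1303.0, 1301.0]
l2 = [1100.0, 1102.0, 1102.0, 1100.0]

sp = cluster_series([t1, t6, l2])  # one clustered time series
```

The clustered series has the same length as each input series and can then be treated like any single-marker time series in the subsequent analysis.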
We calculated the maximum Lyapunov exponents (MLE) on the vertical axis of the six time series, namely the "T1," "T6," "T10," "L2," "ALL," and "SP." We analyzed the coordinate data according to the procedure followed in a previous study (Ekizos et al., 2017). In short, we identified the maximum common steps of all participants in all 192 trials (16 participants, 12 trials each) and extracted the data segment corresponding to this amount of steps in each trial. For the walking trials 454 steps were identified in all participants, while in running 279 steps were identified. This segment was then normalized to a uniform data length (based on the recorded steps and the average data points per step). For walking, the data segment consisted of 18614 data points, and for running of 19809 data points.
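The normalization to a uniform data length can be illustrated with a short sketch. The text does not state how the resampling was done, so linear interpolation is an assumption here, and `resample_linear` and `target_length` are hypothetical helper names. The values of 41 and 71 points per step are inferred from the reported totals (18614 / 454 and 19809 / 279) and are not stated explicitly in the text.

```python
def resample_linear(series, n_out):
    """Resample a 1D series to n_out points by linear interpolation."""
    n_in = len(series)
    if n_in < 2 or n_out < 2:
        raise ValueError("need at least two input and two output points")
    out = []
    for k in range(n_out):
        pos = k * (n_in - 1) / (n_out - 1)  # fractional index into the input
        i = min(int(pos), n_in - 2)         # left neighbor, clamped at the end
        frac = pos - i
        out.append(series[i] * (1.0 - frac) + series[i + 1] * frac)
    return out


def target_length(n_steps, points_per_step):
    """Uniform segment length: common number of steps times points per step."""
    return n_steps * points_per_step


walking_points = target_length(454, 41)  # matches the 18614 points reported
running_points = target_length(279, 71)  # matches the 19809 points reported

# Toy example: stretch a three-point series to five points.
stretched = resample_linear([0.0, 1.0, 2.0], 5)
```

Resampling every trial to the same length keeps the number of data points identical across participants, so that differences in the MLE are not driven by differences in series length.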
The state space was reconstructed from each one-dimensional time series using delay-coordinate embedding:

$$S(t) = \left[z(t),\ z(t+\tau),\ \ldots,\ z(t+(m-1)\tau)\right]$$

with S(t) being the m-dimensional reconstructed state vector, z(t) the input 1D coordinate series, τ the time delay and m the embedding dimension. Time delays were selected based on the first minimum of the Average Mutual Information function (Fraser and Swinney, 1986) and the number of embedding dimensions through a Global False Nearest Neighbors analysis (Kennel et al., 1992). Individually selected time delays were chosen by averaging the outcome delays of all individual time series for each of the participants (Ekizos et al., 2017). For our data, m = 3 was sufficient for all participants in both walking and running, while τ ranged from 12 to 16 frames (∼0.34 of average step) in walking and from 21 to 27 frames (∼0.34 of average step) in running. We then calculated the average divergence of each point's trajectory to its closest neighbor, using the Rosenstein algorithm (Rosenstein et al., 1994). The MLE was calculated from the slopes of the resulting average divergence curves' linear fits. The number of data points chosen as the fitting region was equal to one step.
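The pipeline described above (delay embedding, nearest-neighbor divergence, linear fit) can be sketched in plain Python. This is a simplified, brute-force illustration and not the authors' implementation: the function names, the `min_sep` temporal exclusion window, and the logistic-map toy series are assumptions, and the slope is expressed per frame because the time normalization of the reported MLE is not specified here.

```python
import math


def delay_embed(z, m, tau):
    """Return state vectors S(t) = [z(t), z(t+tau), ..., z(t+(m-1)tau)]."""
    n = len(z) - (m - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this m and tau")
    return [[z[t + k * tau] for k in range(m)] for t in range(n)]


def mean_log_divergence(states, horizon, min_sep):
    """Average log distance between nearest-neighbor pairs for lags 0..horizon-1."""
    n = len(states) - horizon  # points that can be followed for `horizon` frames
    pairs = []
    for i in range(n):
        best, best_d = None, math.inf
        for j in range(n):
            if abs(i - j) <= min_sep:  # skip temporally close points
                continue
            d = math.dist(states[i], states[j])
            if 0.0 < d < best_d:
                best, best_d = j, d
        if best is not None:
            pairs.append((i, best))
    curve = []
    for k in range(horizon):
        logs = []
        for i, j in pairs:
            d = math.dist(states[i + k], states[j + k])
            if d > 0.0:
                logs.append(math.log(d))
        curve.append(sum(logs) / len(logs))
    return curve


def slope(y):
    """Least-squares slope of y against its sample index."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((x - x_mean) * (v - y_mean) for x, v in enumerate(y))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Toy chaotic series (logistic map), used only to exercise the functions.
x = [0.3]
for _ in range(299):
    x.append(4.0 * x[-1] * (1.0 - x[-1]))

states = delay_embed(x, m=3, tau=1)
curve = mean_log_divergence(states, horizon=5, min_sep=5)
mle_per_frame = slope(curve)  # slope of the divergence curve over the fit region
```

For gait data, `m` and `tau` would take the values reported above, and `horizon` would be set to the number of data points in one step so that the fitting region matches the one described in the text.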

Statistics
First, we performed a repeated measures two-way ANOVA to examine differences in the MLE between the different marker-sets, with trials and marker-sets as within subject factors. If appropriate, post-hoc comparisons were made with the Bonferroni correction (adjusted p-value for number of comparisons equal to 15) to determine where the effects would be present. Further, a repeated measures three-way ANOVA was employed on the MLE values to test the effect within each day, between days and between blocks separately for the different marker-sets (SPSS v.22, International Business Machines Corp., USA). The two-way mixed single measures absolute agreement intraclass correlation coefficient (ICC) was chosen as the most appropriate for our study to determine the reliability of the measurement. To calculate the ICC between trials of the same day, trial one against trial two of all days were first assessed and the ICC of all days was subsequently averaged. For the ICC between days the averaged values (trial one and trial two) of all days were used. The ICC was then assessed on the 3 days of block one and block two and the resulting values were averaged. The ICC values between blocks were calculated on the averaged values of all trials in each block. Moreover, to determine the magnitude of the variance in the calculated values of the MLE between the trials of each day, between trials of different days and between all the trials of block one and block two, we calculated the root mean square difference (RMSD). Differences in the absolute MLE values between walking and running were examined through a Student's paired t-test. All statistical tests and procedures were performed separately for the six marker-sets (i.e., four independent markers and two clustered sets) and separately for walking and running. The level of significance for all tests was set to α = 0.05.
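The two reliability measures named above can be sketched as follows. The ICC function uses the standard mean-squares formula for a two-way, absolute-agreement, single-measures ICC; the analysis in the text was run in SPSS, so this is an illustration and not the authors' code. The RMSD is written here for two paired sets of values, which is an assumption about its exact definition, and all MLE values in the example are hypothetical.

```python
import math


def icc_absolute_single(data):
    """Two-way, absolute-agreement, single-measures ICC.

    data: list of rows, one per participant; each row holds the k repeated
    measurements (e.g., trial one and trial two of the same day).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    # Mean squares for participants (rows), measurements (columns) and error
    msr = k * sum((r - grand) ** 2 for r in row_means) / (n - 1)
    msc = n * sum((c - grand) ** 2 for c in col_means) / (k - 1)
    sse = sum(
        (data[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def rmsd(a, b):
    """Root mean square difference between two paired lists of values."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical MLE values of four participants for two trials of one day.
trial_one = [1.31, 1.42, 1.18, 1.55]
trial_two = [1.33, 1.40, 1.22, 1.52]

icc_same_day = icc_absolute_single([list(p) for p in zip(trial_one, trial_two)])
rmsd_same_day = rmsd(trial_one, trial_two)
```

Following the procedure in the text, the same-day ICC would be computed for every day and then averaged, while the between-days and between-blocks values would be computed on the correspondingly averaged MLE values.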

RESULTS
Participants' preferred velocity was 1.5 ± 0.1 m/s in walking and 2.9 ± 0.5 m/s in running. Cadence was 116.3 ± 6.0 steps/min in walking and 160.4 ± 8.7 steps/min in running. The values of the MLE, averaged over all trials, were significantly higher (p < 0.001) in running (1.836 ± 0.080) compared to walking (1.386 ± 0.207) in all examined marker-sets, indicating that running was locally more unstable than walking.

Walking Reliability
The first test revealed a statistically significant (p < 0.001) effect of the marker-sets on the resulting MLE values. The post-hoc comparisons showed significantly higher MLE values on the marker-set "T6" compared to both "L2" (p = 0.013) and "ALL" (p = 0.021). Moreover, "T10" exhibited significantly higher (p = 0.035) MLE values compared to "L2" (Figure 3). For the walking condition, detailed values for the results of the ANOVA, the ICCs and the RMSD of all marker-sets are presented in Table 1. Although the reliability values were quite high in all marker-sets based on the ICC and RMSD values, the marker-set "ALL" exhibited the highest ICC and lowest RMSD within days, between days and between blocks, followed by the marker-sets "L2" and "SP." Both ICCs and RMSDs showed superior values between blocks in all marker-sets compared to the within and between days conditions (Table 1). A similar trend was observed when examining the divergence curves of individual participants for all trials, averaged over days and averaged per block (Figure 4).
TABLE 1 | The values refer to the walking condition.
Frontiers in Physiology | www.frontiersin.org

Running Reliability
The effect of the marker-sets on the MLE values was statistically significant (p = 0.041). The post-hoc comparisons revealed significantly higher MLE values of the marker-set "T6" when compared to the marker-set "L2" (p = 0.030) (Figure 5). Similar to the walking condition, we separately tested each marker-set, and the within days, between days and between blocks effect on the MLE. All values for the results of the repeated measures ANOVA, the ICCs and the RMSD of all marker-sets during the running trials are presented in Table 2. We found significant (p = 0.035) differences in the "L2" marker-set when comparing between trials of the same day. No further significant (p > 0.05) differences between trials of the same day, between consecutive days or between blocks were found in any other marker-set. ICCs between trials of the same day ranged from 0.739 to 0.844 for all 6 marker-sets, while between days the ICCs ranged from 0.688 to 0.870. Further, the ICC values between blocks ranged from 0.768 to 0.961. RMSD of trials within the same days ranged from 0.045 to 0.050. The RMSD values for the between days comparison ranged from 0.038 to 0.045. Similar to the walking trials, a decrease in the RMSD values was found when considering the values of blocks. The RMSD values between blocks ranged from 0.018 to 0.027. During running, the marker-set "SP" exhibited the highest ICC and lowest RMSD between days and between blocks, followed by the marker-sets "L2" and "T10." Similar to walking, both ICCs and RMSDs showed superior values between blocks in all marker-sets compared to within and between days conditions (Table 2). This was also observed when examining the divergence curves of individual participants for all trials, averaged over days and averaged per block (Figure 6).

DISCUSSION
In the present study we examined the effect of different marker-sets on the reliability of the MLE computed for each marker-set separately. The study examined these effects on different locomotion conditions, namely walking and running. All marker-sets showed excellent reliability during walking and high reliability in the running condition. The RMSD were lowest when comparing MLE values between blocks, and higher when considering trials of the same day in both walking and running. Further we found that different marker-sets have a significant effect on the MLE values in both walking and running. This effect was more pronounced while walking.
During walking, previous studies on MLE reliability have reported good (i.e., from 0.75 to 0.88) (Portney and Watkins, 2009) intrasession (Kang and Dingwell, 2006b; van Schooten et al., 2013; Rábago et al., 2015) and moderate (i.e., from 0.53 to 0.68) (Portney and Watkins, 2009) intersession ICC values (van Schooten et al., 2013). The ICCs found in our study during walking were clearly higher compared to these previous studies in both intrasession (i.e., from 0.946 to 0.980) and intersession (i.e., from 0.971 to 0.985) comparisons. We recorded kinematic data for 270 s which allowed us to include a high number of steps (i.e., 454 step cycles), and reliability increases substantially as the number of recorded steps increases (Kang and Dingwell, 2006b; Bruijn et al., 2009b). Another source of increased reliability in our study could be the use of the treadmill, in comparison to ambulatory monitoring of gait (van Schooten et al., 2013, 2015). When omitting any averaging between the recorded trials the ICC values between days and between blocks decreased slightly (see Supplementary Material). In walking, all marker-sets were shown to have excellent reliability and provided no significant differences when comparing within days, between days or between blocks. Measuring only one landmark of the trunk during walking could, thus, be sufficient to describe the local dynamic stability of the system and be preferred for reasons of simplification in the study design.
FIGURE 4 | Exemplary (i.e., one participant and one marker-set) divergence curves for all trials, averaged over days and averaged per block during walking.
To our knowledge no study has examined the reliability of the resulting MLE while running. During running, one marker-set (i.e., L2) showed significant differences in MLE within days and one marker-set (i.e., T1) exhibited consistently low reliability values compared to the others. The clustered marker-set "SP" provided the best and most robust values in the running condition, exhibiting consistently high ICCs and low RMSD within days, across days and between blocks. The improved reliability of the clustered "SP" marker-set may be attributed to small inter-vertebrae movements that are present during locomotion (Syczewska et al., 1999). The inter-vertebrae movements add another layer of complexity to the system and can affect the reliability of the MLE values. By using the clustered marker-set "SP," these movements would possibly have minimal effects on the MLE across repeated measurements due to averaging, thus improving the reliability of the marker-set. It could be suggested that, in studies where a higher measurement reliability is needed for the assessment of the system's local dynamic stability during running, more than one landmark of the trunk should be considered. Conversely, the use of the "T1" marker-set might be less preferable. Moreover, the "SP" marker-set presented no significant differences in the absolute MLE values compared to any of the other marker-sets on the trunk (Figures 3, 5), and thus, the resulting MLE values could also be representative of the dynamic stability of the system. The ICC values during running were slightly lower compared to walking in all marker-sets and the effect of averaging the values of the individual trials was more pronounced than in walking (see Supplementary Material). While the motor programming of walking and running remains similar (Cappellini et al., 2006), running exhibits an increased variability and decreased regularity (Estep et al., 2018), which may explain the small decrease of the ICC values.
To ensure that the differences between walking and running do not depend on the number of steps, we also analyzed our walking data including only 279 steps. When matching the analyzed steps of walking to those recorded in running (i.e., when we analyzed 279 steps in walking), the results in all examined parameters for the reliability and the absolute MLE values remained at similar levels compared to when we included all 454 steps. Our findings confirmed the increased instability during running compared to walking. The higher instability during running may be due to an increased demand in recruiting and coordinating the multiple degrees of freedom faster during the task execution (Jordan et al., 2009; Estep et al., 2018), affecting the assessed within days, between days and between blocks ICCs.
TABLE 2 | The values refer to the running condition.
Although reliability was high within and between days, it increased when block measurements were introduced. This is supported by the results of the reliability analysis, with increased ICC and especially lower RMSD values. The divergence curves (Figures 4, 6) in both locomotion conditions further evidenced the higher reliability between the blocks. The minimum relative detectable differences (i.e., RMSD divided by the average MLE of the marker-set) were lower in the comparison between blocks. The minimum relative detectable differences for the "SP" marker-set are 3.1, 2.5, and 2.1% for within days, between days and between blocks in the walking condition and 2.6, 2.1, and 1% respectively for running. It is thus surmised that more trials spread over more than 1 day can significantly improve the reliability of the measurement. To date, studies employing the Lyapunov analysis for examining the stability of gait have focused on differences between groups of young and older age (Buzzi et al., 2003; Kang and Dingwell, 2008; Hamacher et al., 2011, 2015; Terrier and Reynard, 2015; Mehdizadeh, 2017) or health and pathology (Moraiti et al., 2007, 2010; Lamoth et al., 2010; Look et al., 2013; Kao et al., 2014; Hoogkamer et al., 2015). However, interventional or prospective study designs examining the resulting MLE might require higher degrees of accuracy and thus more than one measurement trial and day. Based on our results, the chosen marker-set has a significant effect on the absolute value of the MLE in both walking and running conditions, and this held true whether the values were obtained from a single marker or from clustered markers. These differences may be attributed to the nature of the theoretical concept of the used Lyapunov analysis. Time series analysis tries to identify the true dynamics regarding the states of the system from the observed time-ordered data. By measuring in a specific site or local region of the system we approximate the true dynamics, but as with any data collection we gather imperfect information. As such, different components of the system contain different parts of information regarding the states of the system and can yield altered MLE. The absolute MLE values between marker-sets differed up to 13.3% in walking and up to 1.3% in running and therefore highlight the importance of marker placement. Similar to our results, Rispens et al. reported MLE values that differed by 6.7% between two markers on the spine (i.e., when comparing the vertical component of the second and the fifth lumbar vertebrae; Rispens et al., 2014). During walking, MLE typically increases about 8-25% in older compared to young adults (Buzzi et al., 2003; Bruijn et al., 2014; Hamacher et al., 2015). Further, MLE has been reported to increase 9% in patients with focal cerebellar lesion (Hoogkamer et al., 2015) and 21% in patients with various neurological diseases compared to non-affected adults, while patients receiving orthopedic shoes exhibited a 9% decrease in MLE (Terrier et al., 2013). These values indicate that expected differences can in some cases be affected by different marker-sets or placement errors.
During running, differences of 25% have been reported between people with and without unilateral lower limb amputation (Look et al., 2013), which would not be affected by placement differences. However, in milder cases (such as after an acute transition from the shod to the barefoot condition, with reported changes of 2%; Ekizos et al., 2017) the results could be affected by different marker-sets or erroneous marker placement. This indicates that standardization of marker placement and of the chosen marker-set is important in study designs. Moreover, our findings highlight the difficulty of comparing the absolute values of MLE between studies whose results were obtained with different marker-sets.
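The comparison between expected effect sizes and the minimum relative detectable difference (RMSD divided by the average MLE) can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the inputs combine the largest within-day and smallest between-block RMSD reported for walking with the overall walking mean MLE, so the outputs are not the marker-set specific percentages reported for "SP."

```python
def relative_detectable_difference(rmsd, mean_mle):
    """Return the RMSD expressed as a percentage of the mean MLE."""
    if mean_mle <= 0:
        raise ValueError("mean MLE must be positive")
    return 100.0 * rmsd / mean_mle


def exceeds_threshold(reported_change_pct, threshold_pct):
    """True if a reported change (in %) is larger than the threshold (in %)."""
    return reported_change_pct > threshold_pct


# Illustrative inputs: walking RMSD range (0.023 to 0.047) and the
# overall walking mean MLE (1.386) as reported in the Results.
within_day = relative_detectable_difference(0.047, 1.386)
between_blocks = relative_detectable_difference(0.023, 1.386)

# Reported changes discussed above: 25% (amputation) and 2% (shod to barefoot).
large_effect_detectable = exceeds_threshold(25.0, within_day)
small_effect_detectable = exceeds_threshold(2.0, within_day)
```

Under these illustrative inputs, a large between-group difference clears the threshold easily, whereas a change of about 2% does not exceed the within-day value, which is in line with the argument that small effects call for more trials spread over several days.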

CONCLUSIONS
In the current study we endeavored to examine the reliability of the MLE values using different marker-sets within days, across days and between blocks. The chosen marker-set influences the resulting MLE values. The reliability was acceptable in both walking and running for the detection of expected differences in experimental studies. A clustered marker-set may be preferable in the running condition when higher measurement reliability is necessary. More trials spread over more than 1 day considerably improved the reliability of the MLE measurement and should be considered in study designs with increased demands of accuracy, independent of the locomotion condition.

AUTHOR CONTRIBUTIONS
AE designed the study, carried out the experiments and data analysis and drafted the manuscript. AlS carried out the experiments and participated in editing the manuscript. ArS participated in data analysis and editing of the manuscript. AA conceived and designed the study and drafted the manuscript. All authors gave final approval for publication.