Estimation of Involuntary Components of Human Arm Impedance in Multi-Joint Movements via Feedback Jerk Isolation

Stable and efficient coordination in physical human-robot interaction requires consideration of human feedback behavior. In unpredictable tasks, where voluntary cognitive feedback is too slow to guarantee desired task execution, the human must rely on involuntary intrinsic and reflexive feedback. The combined effects of these two feedback mechanisms and the inertial characteristics can be summarized in the involuntary impedance components. In this work, we present a method for the estimation of the involuntary impedance components of the human arm in multi-joint movements. We apply force perturbations to evoke feedback jerks that can be isolated using a high pass filter and limit the duration of the estimation interval to guarantee exclusion of voluntary cognitive feedback. Dynamic regressor representation of the rigid body dynamics of the arm and first order Taylor series expansion of the feedback behavior yield a model that is linear in the involuntary impedance components. The constant values of the inertial parameters are estimated in a static posture maintenance task and subsequently inserted to estimate the remaining components in a dynamic movement task. The method is validated with simulated data of a neuromechanical model of the human arm and its performance is compared to established methods from the literature. The results of the validation demonstrate superior estimation performance for moderate movement velocities, and less influence of the variability of the movements. The applicability to real data and the plausibility of the limited estimation interval are successfully demonstrated in an experiment with human participants.

Stable and efficient coordination in physical human-robot interaction requires consideration of human feedback behavior. In unpredictable tasks, where voluntary cognitive feedback is too slow to guarantee desired task execution, the human must rely on involuntary intrinsic and reflexive feedback. The combined effects of these two feedback mechanisms and the inertial characteristics can be summarized in the involuntary impedance components. In this work, we present a method for the estimation of the involuntary impedance components of the human arm in multi-joint movements. We apply force perturbations to evoke feedback jerks that can be isolated using a high pass filter and limit the duration of the estimation interval to guarantee exclusion of voluntary cognitive feedback. Dynamic regressor representation of the rigid body dynamics of the arm and first order Taylor series expansion of the feedback behavior yield a model that is linear in the involuntary impedance components. The constant values of the inertial parameters are estimated in a static posture maintenance task and subsequently inserted to estimate the remaining components in a dynamic movement task. The method is validated with simulated data of a neuromechanical model of the human arm and its performance is compared to established methods from the literature. The results of the validation demonstrate superior estimation performance for moderate movement velocities, and less influence of the variability of the movements. The applicability to real data and the plausibility of the limited estimation interval are successfully demonstrated in an experiment with human participants.

INTRODUCTION
Technological advancements in robotics are enabling robotic assistance through physical interaction of human and robot for applications in medical, domestic, and industrial domains. In physical human-robot interaction (pHRI), control strategies are designed to provide efficient and intuitive interaction, during which instabilities must be avoided to guarantee safety and comfort of the human. Fulfillment of these requirements requires consideration of human feedback behavior in the control design process.
During execution of a desired motor task, joint torques produced by the neuromuscular system are composed of a feedforward and a feedback component (Tee et al., 2004). Due to a priori calculation, the feedforward component cannot account for unpredictable dynamics, which can result from incomplete or incorrect internal models (Tomi et al., 2008), inherent neural noise (Faisal et al., 2008), and unexpected external perturbations (Gomi and Kawato, 1997). The resulting deviations are compensated by restoring torques that contain effects of muscle intrinsic viscoelastic properties, reflexive feedback, and cognitive feedback (Gomi and Osu, 1998). Apart from the effects of the muscle intrinsic viscoelastic properties, these torques are produced at different task-dependent delays. As cognitive feedback possesses the longest delays [in the order of 100 ms (Franklin and Wolpert, 2011)], it may be too slow to guarantee desired task execution in unpredictable tasks (Mehta and Schaal, 2002). Thus, in such situations, the human must rely on intrinsic and reflexive feedback. In this work, all cognitive feedback at supraspinal level is referred to as voluntary feedback and the combined effects of all feedback mechanisms with shorter delays than voluntary feedback are grouped into involuntary feedback (see Figure 1).
The restoring torques generated by involuntary feedback can be modeled by a linear system composed of the involuntary impedance components, damping and stiffness (Bennett et al., 1992). Due to length-and velocity-tension relationships of muscle fibers and tendons, both components depend on joint angles and angular velocities. Additionally, the central nervous system (CNS) is able to modulate both components through co-contraction of antagonistic pairs of muscles at the respective joints (Mussa-Ivaldi et al., 1985), e.g., to achieve accuracy requirements (Lametti et al., 2007) or to compensate environmental instabilities (Burdet et al., 2001). Analysis of such modulation strategies in experiments that emulate realistic pHRI may provide valuable information for the control design process, e.g., for stability assessment or control behavior adaptation. However, acquisition of this information requires an estimation method that is compatible with the short estimation interval, in which only involuntary feedback is active. It must enable precise execution of perturbations and accurate isolation of resulting feedback behavior, as estimation errors in the unperturbed states directly influence the estimation accuracy of the involuntary impedance components (Burdet et al., 2013). To the best of the author's knowledge, existing methods are either not able to guarantee exclusion of voluntary feedback (e.g., Gomi and Kawato, 1997;Erden and Billard, 2015) or are limited to application to free, unfettered movements (Piovesan et al., 2013).
In this work, we present a method for the estimation of the involuntary impedance components of the human arm in multi-joint movements. We apply force perturbations in order to evoke deviations during two-dimensional point to point arm movements. These perturbations are designed such that the dominant frequencies of the jerks of the evoked feedback behavior lie above those of the unperturbed movements. Thus, the feedback behavior can be isolated by a high pass filter. The duration of the estimation interval is limited to guarantee the exclusion of voluntary feedback. Dynamic regressor representation of the rigid body dynamics and first order Taylor series expansion of the feedback behavior yield a model that is linear in the involuntary impedance components in joint space. The constant values of the inertial parameters are estimated in a static posture maintenance task and subsequently used for the estimation of the remaining involuntary impedance components in a dynamic movement task. Both the feedback jerk isolation and the involuntary impedance estimation are validated with simulated data of a neuromechanical model of the human arm. We compare the validation results to those obtained by application of the methods presented in Gomi and Kawato (1997) and Erden and Billard (2015). In the validation of the feedback jerk isolation, we additionally analyze the effects of different movement velocities as well as different frequencies and amplitudes of neural noise (see Figure 2). Finally, we perform an extensive evaluation of the applicability of the method to real data based on an experiment with human participants. It includes an analysis of the effects of different durations of the estimation interval.

Related Work
Estimation of multi-joint arm impedance is generally either performed by application of a perturbation paradigm, in which perturbations are used to evoke deviations from unperturbed states to measure variational dynamics, or by application of electromyographic (EMG) sensing modalities. In the latter category of approaches, multiple studies combine measurements of muscle activity with parametric muscle models, e.g., linear models of muscle stiffness (Osu and Gomi, 1999), quadratic models of muscle tension (Shin et al., 2009), or calibrated models of musculotendon unit forces (Buzzi et al., 2017). In Kim et al. (2009), an artificial neural network produces a mapping between EMG data and stiffness estimates, that are obtained from measured joint torques and an empirically determined linear model. In Lakatos et al. (2013) and Ajoudani et al. (2015), similar mappings are produced by pairing stiffness estimates obtained by application of the perturbation paradigm with EMG data. All of these EMG approaches have in common that they exclusively estimate stiffness.
Estimation of multi-joint arm impedance by application of the perturbation paradigm can generally be divided into two main categories: estimation in static posture maintenance tasks and in dynamic movement tasks. In both categories, perturbations are used to evoke deviations from unperturbed states and impedance is estimated based on the variational dynamics. Impedance estimation in static posture maintenance tasks is significantly less complex, as it does not require estimation of unperturbed states of an underlying movement. Furthermore, it allows for application of position perturbations, which are executed by moving the hand along a perturbation trajectory and thus enable a priori definition of the variational kinematics. Some studies exploit this advantage by using perturbation trajectories with plateau phases, during which the deviation remains constant (Mussa-Ivaldi et al., 1985;Dolan et al., 1993;Gomi and Osu, 1998;Darainy et al., 2004;Masia and Squeri, 2015). During the plateau phase, variational velocities and accelerations are zero. Thus, the respective variational forces can FIGURE 1 | Classification of intrinsic, reflexive, and cognitive feedback due to unpredictable dynamics into involuntary and voluntary feedback. All cognitive feedback at supraspinal level is referred to as voluntary feedback and the combined effects of all feedback mechanisms with shorter delays than voluntary feedback are grouped into involuntary feedback. be used to exclusively estimate stiffness. Some studies use position perturbations defined by third (Artemiadis et al., 2010;Patel et al., 2014) or fifth degree polynomials (Lakatos et al., 2011(Lakatos et al., , 2013, which are specifically designed to improve the conditioning of the estimation. Other studies use stochastic position (Perreault et al., 2004;Ajoudani et al., 2015) or force perturbations (Palazzolo et al., 2007) in combination with significantly longer estimation intervals to estimate frequency domain impedance transfer functions.
Impedance estimation during dynamic movement tasks is significantly more challenging than during static posture maintenance tasks, as it requires precise estimation of the unperturbed states of the underlying movement. In Wang et al. (2001), mechanical impedance is solely estimated in terms of feedback forces. In Hondori and Tech (2011), it is represented by the ratio of frequency domain forces and velocities. In Burdet et al. (2000), position perturbations with plateau phases are used to exclusively estimate stiffness in a 60 ms interval that starts 120 ms after perturbation onset. In Darainy et al. (2007) and Wong et al. (2009b), similar methods are used to exclusively estimate stiffness in a 50 ms interval that starts 250 ms after perturbation onset. In Piovesan et al. (2013), force perturbations and time-frequency analysis are used to estimate impedance 135 ms after perturbation onset. Due to reliance on vibrational energy, this method is only applicable to free, unfettered movements. In Gomi and Kawato (1997), force perturbations are used to estimate impedance by least squares estimation in a 280 ms interval. Erden and Billard (2015) use a similar method to estimate impedance in a 250 ms interval.
To the best of the author's knowledge, among existing methods in the literature, only those presented in Gomi and Kawato (1997) and Erden and Billard (2015) are able to estimate stiffness, damping, and inertial characteristics during multijoint arm movements that emulate realistic pHRI. In Gomi and Kawato (1997), participants perform transversal and longitudinal point to point movements. The movements are guided by a target position that moves along a reference trajectory and is displayed on a computer monitor. First order Taylor series expansion is used to obtain a linearized model of the rigid body dynamics and the feedback behavior in joint space. Inertial parameters are estimated a priori and used for the estimation of damping and stiffness. Variational dynamics are obtained by calculating the difference to the averaged unbiased dynamics of all perturbed movements, which are calculated by subtracting the offsets at perturbation onset. In Erden and Billard (2015), participants perform a manual tungsten inert gas welding task in cooperation with a robot. The movements are guided by a straight reference trajectory and the participants are instructed "to do their best to achieve a good welding." Variational behavior is described by a linear model of diagonal inertia, damping, and stiffness matrices in Cartesian space, which are estimated simultaneously. Variational positions and forces are obtained by subtraction of the offsets at perturbation onset and variational velocities and accelerations are obtained by differentiation. Both Gomi and Kawato (1997) and Erden and Billard (2015) are unable to guarantee exclusion of voluntary feedback, as both methods require estimation intervals that are significantly longer than the delay of voluntary feedback. In Gomi and Kawato (1997), this limitation is compensated by application of the do-notintervene-voluntarily paradigm, which means that participants are instructed to not intervene voluntarily in response to the perturbation. However, this approach poses two problems.
FIGURE 2 | Schematic diagram of the major elements of the method and the validation with simulated data. The colored boxes, which are defined in the data acquisition part of the diagram, indicate which sets of simulated data are applied and analyzed in which phases of the validation.
Frontiers in Neuroscience | www.frontiersin.org Firstly, despite these instructions, voluntary intervention is nonetheless possible and exclusion of voluntary feedback cannot be guaranteed. Secondly, instructing the participants to not intervene voluntarily substantially limits the number of possible, realistic pHRI application scenarios.
A preliminary work has been presented in Börner et al. (2018). In this preliminary work, the rigid body dynamics were linearized in joint space and the involuntary impedance components were estimated in Cartesian space. Due to the linearization in joint space, assumptions concerning negligibility of inertia and Coriolis terms were necessary to obtain a linear second order system in Cartesian space. The elements of the Cartesian endpoint inertia were calculated with anthropometric data and subsequently inserted to estimate the remaining involuntary impedance components in a dynamic movement task. Furthermore, the application of a preliminary perturbation design required assumptions regarding correlations of the beginning of the estimation interval and noticeable deviations due to the perturbation.
The remainder of this article is structured as follows: section 2 contains the materials and methods, which include the derivation of the involuntary impedance model as well as the presentation of the feedback jerk isolation, the perturbation design, and the involuntary impedance estimation. Furthermore, the details of the simulation and the experiment are presented. Section 3 contains the results, which include the validation and the comparison with existing methods based on simulated data as well as the evaluation with experimental data. Section 4 contains the discussion and conclusions are summarized in section 5.

MATERIALS AND METHODS
We begin this section by formulating the problem considered in this work. Subsequently, we present the feedback jerk isolation and the involuntary impedance estimation. Finally, we introduce the details of the simulation and the experiment.

Problem Formulation
In this section, we first derive a general human motor behavior model, which then serves as the basis for the derivation of a model of the involuntary impedance components.

Human Motor Behavior Model
The human arm is modeled as a multi-joint two-link system. In order to reduce complexity and neglect gravity, movements are constrained to the horizontal plane. The rigid body dynamics is given by where q is the 2 degree of freedom (DoF) arm configuration in joint space, M q is the symmetric and positive-definite inertia matrix (Hogan, 1985), C is the Coriolis/centrifugal matrix, τ int are the internally generated joint torques, and τ ext are the external torques (Gomi and Kawato, 1997). The arm configuration q = [q 1 , q 2 ] T is defined by the angles of the shoulder q 1 and the elbow q 2 .
The internal torques τ int are produced by the muscle tensions m that act on the musculoskeletal system: where λ are the muscle lengths,λ are the respective derivatives, a are the muscle activations, and J m is the muscle Jacobian that contains the muscle moment arms that are necessary for the transformation to joint space (Shin et al., 2009). During execution of a motor task, the muscle activations a consist of a feedforward term a FF , a feedback term a FB , and a neural noise term a N (Franklin et al., 2008): The feedforward term a FF is calculated by the CNS through a priori optimization of the statistics of the desired motor behavior with respect to costs defined by the task-specific input parameters θ , which depend on factors, such as environmental constraints and task requirements (Harris and Wolpert, 1998). The neural noise term a N represents signal-dependent noise whose variance increases with the magnitude of the muscle activations a (Slifkin and Newell, 1999). Deviations caused by unpredictable dynamics are compensated by the feedback term a FB , which consists of multiple components that depend on delayed afferent sensory information and are produced at different delays (Franklin and Wolpert, 2011): where a FB,r and a FB,v are reflexive and voluntary feedback muscle activations, respectively. The variables δ r and δ v are the corresponding delays and t p = t − t 0 is the time after the onset of the unpredictable dynamics at t 0 . In the interval ]δ r , δ v ], the feedback muscle activations a FB only consist of reflexive feedback muscle activations a FB,r , which are affected by neural conduction delays that depend on the length and type of the nerve fiber. The fastest reflexive feedback is produced by the short-latency monosynaptic stretch reflex, with a delay δ r,s in the order of 10 − 40 ms (Matthews, 1991). Slightly slower reflexive feedback is produced by the cortical component of the longlatency stretch reflex, with a delay δ r,l in the order of 30 − 70 ms (Thilmann et al., 1991). As the feedback behavior in this interval does not recruit higher-order cognitive processes, the reflexive feedback muscle activations a FB,r only depend on the muscle lengths λ and the respective derivativesλ. For all t p > δ v , the voluntary feedback muscle activations a FB,v , which are defined by the task-specific input parameters θ , additionally contribute to the feedback muscle activations a FB . For voluntary feedback in response to haptic motion perception, the transmission of the associated proprioceptive sensory information to the CNS is subject to delays in the order of 100 ms. Conduction delays of descending motor commands from the motor cortex to the arm muscles are ∼ 15 ms (Merton and Morton, 1980). Consequently, the minimum delay δ v of the voluntary feedback muscle activations a FB,v is in the order of 115 ms. The internal torques τ int are composed of a feedforward term τ FF , a feedback term τ FB , and a neural noise term τ N , that represent the equivalents to the respective muscle activations a in (3): Analogous to the feedback muscle activations a FB in (4), the feedback torques τ FB consist of multiple components that are produced at different delays (Lakatos et al., 2011): where τ FB,i , τ FB,r , and τ FB,v are intrinsic, reflexive, and voluntary feedback torques, respectively. In the interval [0, δ r ], the feedback torques τ FB only consist of intrinsic feedback torques τ FB,i , which result from muscle intrinsic viscoelastic properties and are thus not affected by any delays (Tee et al., 2004). In the subsequent interval ]δ r , δ v ] and for all t p > δ v , the feedback torques τ FB additionally contain contributions from reflexive feedback torques τ FB,r and voluntary feedback torques τ FB,v , respectively. Inserting (6) in (5) yields the internal torques τ int for all t > δ v : This general model represents the basis for the derivation of the involuntary impedance model.

Dynamic Regressor Representation
The rigid body dynamics (1) is non-linear with respect to the arm configuration q and the respective derivativesq andq. Nonetheless, the Lagrangian formalism enables the derivation of the dynamic regressor representation, which is linear with respect to the standard inertial parameters of the system: where Y is the regressor matrix and π = [π T 1 , π T 2 ] T is the standard inertial parameter vector, which is composed of the inertial parameters of the upper arm π 1 and the forearm π 2 (Gautier and Khalil, 1988). In general, the inertial parameters consist of the masses and the mass moments of first and second order (Khalil and Dombre, 2004). However, some parameters do not have any effect on the system dynamics which corresponds to a zero column in the regressor matrix Y. For the multi-joint arm, omission of these parameters yields the reduced inertial parameter vector π r = I x 2 x 2 ,1 , I x 2 x 2 ,2 , m 2 l cx 3 ,2 , m 2 T , in which, according to the parallel axis theorem, Thus, the inertial properties are defined by the moments of inertiã I x 2 x 2 ,i , the masses m i , and the lengths from the joints to the centers of gravity of the links l cx 3 ,i . Comprehensive analysis of the reduced regressor matrix Y r reveals that further parameter reduction is possible. Appropriate transformations yield the base inertial parameter (BIP) vector π, which contains the minimal set of parameters, and the corresponding base regressor matrix Y of the multi-joint arm: π = I x 2 x 2 ,1 + m 2 l 2 1 , I x 2 x 2 ,2 , m 2 l cx 3 ,2 l 1 T , Y(q,q,q) = q 1q1 +q 2 y 1,3 0q 1 +q 2 y 2,3 , with y 1,3 = (2q 1 +q 2 )cos(q 2 ) − (2q 1q2 +q 2 2 )sin(q 2 ) , y 2,3 =q 1 cos(q 2 ) +q 2 1 sin(q 2 ) , where l 1 is the length of the upper arm (Klodmann et al., 2015).

Involuntary Impedance Model
While the sum of the intrinsic feedback torques τ FB,i and the reflexive feedback torques τ FB,r can be modeled by a linear system composed of the involuntary impedance components, damping and stiffness (Bennett et al., 1992), the voluntary feedback torques τ FB,v are highly task-specific and can therefore not be modeled by such a general formulation (Franklin and Wolpert, 2011). Therefore, in this work, in order to guarantee exclusion of voluntary feedback, we limit the duration of the estimation interval to T est = δ v . Due to the short latency of the reflexive feedback and the variability of the respective delays, separation of the intrinsic and the reflexive feedback contributions is difficult (Burdet et al., 2013). Thus, we summate the contributions in the involuntary feedback torques For small deviations, the feedback behavior evoked by the force perturbations can be described by a linearized model obtained by first order Taylor series expansion of the general model (7) about the unperturbed states q * ,q * ,q * , a * , τ * int , and τ * ext . According to (2), the variational internal torques τ int = τ * int − τ int after the onset of the perturbation depend on the muscle lengths λ, the respective derivativesλ, and the muscle activations a. Due to the confinement to the estimation interval [0, T est ], the variational muscle activations a = a * − a, as defined by (3) and (4), can only consist of reflexive feedback muscle activations a FB,r .
Considering a = f (λ,λ) as well as λ = J m (λ)q andλ = J m (λ)q, linearization of the internal torques τ int for t p ∈ [0, T est ] yields in which all variational variables, indicated by a symbol, represent the deviations from the unperturbed states, e.g., q = q * − q, and the total derivatives are defined as Inserting (7) for t p ∈ [0, T est ] and (14) in (15) yields In this linearized model, the involuntary joint damping D q and the involuntary joint stiffness K q , which are both part of the involuntary impedance components 1 , are defined as Inserting the BIP vector π of (12) and the base regressor matrix Y of (13) in (8) and using the resulting linear model to calculate the variational internal torques τ int yields Subsequently inserting (17) in (19) yields the involuntary impedance model In some pHRI scenarios, measurements are performed in Cartesian space. In these scenarios, the arm configuration q can be obtained from the arm endpoint configuration x with the inverse kinematics where l 2 is the length of the forearm and the Cartesian origin is located in the shoulder joint. Furthermore, the Jacobian J can be used to transform the external endpoint forces u ext to the external torques The problem considered in this work consists of the estimation of the involuntary impedance components, i.e., BIP vector π , damping D q , and stiffness K q , in the estimation interval [0, T est ]. This is to be achieved given the perturbed observations {x, u ext }, which result from application of force perturbations during multi-joint arm movements, and requires estimation of the unperturbed dynamics {q * ,q * ,q * , τ * ext }.

Involuntary Impedance Estimation
In this section, we first present the estimation of the unperturbed dynamics via feedback jerk isolation and the perturbation design. Subsequently, we present the involuntary impedance estimation.

Feedback Jerk Isolation
According to the minimum jerk principle (Flash and Hogan, 1985), the CNS optimizes the arm endpoint trajectory in a point to point movement through minimization of the total endpoint jerk. We take advantage of this by designing the perturbation such that the dominant frequencies of the jerks of the evoked feedback behavior lie above those of the unperturbed movements. Due to this design, we are able to estimate the evoked feedback behavior in the form of the variational jerks ...
x through application of a high pass filter to the jerks ...
x of the perturbed movement. In order to achieve maximum pass band flatness and fast stop band roll-off, we perform this feedback jerk isolation with a Butterworth filter that is implemented as a digital biquad filter with filter order n HP = 10. It is applied bi-directionally for zero phase distortion and the cut-off frequencies f c,HP are defined based on the energy spectral densities (ESDs) ψ of the jerks ...
x of the unperturbed and perturbed movements (Stein, 2000): We use the set , which contains the ESDs ψ of the jerks ...
x of all unperturbed and perturbed movements, to calculate the averaged ESDs of the unperturbed movements ψ UP , the perturbed movements ψ P , and the resulting feedback behavior ψ FB = ψ P − ψ UP . The cut-off frequencies f c,HP are defined as the frequencies f at which ψ FB > ψ UP , i.e., the averaged ESDs of the feedback behavior ψ FB become larger than those of the unperturbed movements ψ UP . The high pass filtered jerks ...
x HP yield the estimated variational jerks ...
x and the estimated unperturbed jerks ... x * . Integration then yields the estimated variational kinematics { x, ẋ , ẍ } and the estimated unperturbed kinematics {x * ,ẋ * ,ẍ * }. For simplicity, from this point on, we refer to the averaged ESDs ψ UP , ψ P , and ψ FB simply as ESDs.
The apparatus in our experiments is controlled by means of an admittance control scheme. Therefore, the external forces u ext can be calculated with where u pert is the perturbation force, u adm is the admittance force, M handle is the handle inertia, and M adm and D adm are the admittance inertia and damping, respectively. Inserting the estimated unperturbed kinematicsẋ * andẍ * in (25) and (26) yields the estimated unperturbed external forcesû * ext and the estimated variational external forces û ext . The estimated unperturbed arm configurationq * and the respective derivativeṡ q * andq * are calculated using the inverse kinematics (21) and (22) and the estimated unperturbed external torquesτ * ext are calculated using the transformation (23). Finally, the estimated FIGURE 3 | Block diagram of the feedback jerk isolation. For every variational variable χ, the respective unperturbed variable χ * = χ + χ is calculated with the respective perturbed variable χ, and vice versa. Figure 3 presents a block diagram that illustrates the complete procedure of the feedback jerk isolation.

Perturbation Design
The perturbation is designed to meet two essential criteria. First, the dominant frequencies of the jerks of the evoked feedback behavior should lie above those of the unperturbed movements. Second, in order to minimize movement interference and corrective oscillations, it should not only produce deviations, but also move the hand back toward the unperturbed states. In order to fulfill these criteria, we design the perturbation acceleration profileẍ pert through concatenation of two sinusoidal functions with where A p and T p are amplitude and duration. The first of the two functions ξ p,1 with amplitude A p,1 and duration T p,1 is responsible for the deviation from the unperturbed states. The second function ξ p,2 with amplitude A p,2 and duration T p,2 supplies the retracting movement toward the unperturbed states. In order to minimize hardware oscillations due to the perturbation, we define T p,2 = (3/2)T p,1 and scale A p,2 such that all perturbation profiles are zero at the end of the perturbation. Lastly, we combine the perturbation acceleration profileẍ pert and velocity profileẋ pert with the admittance inertia M adm and damping D adm of our apparatus to obtain the perturbation force profile u pert . It is defined by the amplitude of its first peak A pert and its duration T pert = T p,1 + T p,2 . In the dynamic movement task in this work, A pert = 40 N and T p,1 = 70 ms result in T pert = 175 ms. The resulting normalized perturbation force profile u pert and the corresponding kinematics profiles {x pert ,ẋ pert ,ẍ pert , ...
x pert } are illustrated in Figure 4. Given the normalized perturbation force profile u pert , the perturbation force in which the perturbation angle φ pert is defined by the set of perturbation angles pert .

Involuntary Impedance Estimation
Due to the limited duration of the estimation interval T est , we assume that the involuntary impedance components D q and K q are constant for t p ∈ [0, T est ]. Concatenation of the elements of the BIP vector π and the resulting involuntary impedance parameters D q and K q in the unknown model parameter vector ζ = π 1 , π 2 , π 3 , D q,11 , D q,12 , D q,21 , D q,22 , K q,11 , K q,12 , K q,21 , K q,22 T allows expression of (20) by the identification model where A is the observation matrix and b is the output vector. For an interval with a total of N samples in which the independent variable matrix X is defined as The estimated model parameter vectorζ is given bŷ Due to the limited duration of the estimation interval T est and the likewise limited duration of the perturbation T pert , the perturbation in (29) does not possess sufficient richness of frequency components to guarantee persistent excitation (Söderström and Stoica, 1989). A common approach for the compensation of such effects is the a priori estimation of the inertial parameters (Gomi and Kawato, 1997;Darainy et al., 2007;Wong et al., 2009a,b) which only marginally influences the estimated impedance parameters (Gomi and Osu, 1998).
As the inertial parameters possess constant values that are independent of the dynamics, the elements of the BIP vector π can be estimated a priori in a static postural task. The intrinsic feedback behavior of the multi-joint arm is governed by spring-like characteristics that result from the elastic properties of the individual muscles. Thus, the intrinsic feedback forces possess zero curl and the intrinsic stiffness is symmetric (Shadmehr and Arbib, 1992). The reflexive feedback forces may possess non-zero curl which can only result from heteronymous inter-muscular reflex arcs with unequal activation gains (Hogan, 1985). However, as the resulting antisymmetric stiffness elements are significantly smaller than the corresponding symmetric ones, the reflexive feedback behavior is nonetheless governed by predominantly spring-like characteristics (Mussa-Ivaldi et al., 1985;Artemiadis et al., 2010). As the estimation interval in this work is significantly shorter than the ones in Mussa-Ivaldi et al. (1985) and Artemiadis et al. (2010), we assume that the stiffness K q is symmetric for t p ∈ [0, T est ]. As the potential energy of the vector field of the involuntary feedback forces must increase in response to deviations from the unperturbed states, the symmetric stiffness K q must also be positive definite. In order to incorporate both symmetry and positive definiteness of the stiffness K q in the estimation, we apply the Cholesky decomposition K q = L q L T q , where L q is a lower triangular matrix (Horn and Johnson, 2012). With this decomposition and the a priori determined values of the estimated BIP vectorπ , the unknown model parameter vector ζ reduces to ζ = D q,11 , D q,12 , D q,21 , D q,22 , L q,11 , L q,12 , L q,22 T (36) FIGURE 4 | Normalized perturbation force profile u pert and kinematics profiles {x pert ,ẋ pert ,ẍ pert , ...
x pert } of the dynamic movement task in the simulation and the experiment. The first part of the perturbation with duration T p,1 is responsible for the deviation from the unperturbed states and the second part of the perturbation with duration T p,2 supplies the retracting movement toward the unperturbed states. and the elements of the output vector b are replaced by b = τ ext − Y(q,q * ,q,q * ,q,q * )π .
Due to the non-linearity introduced by K q = L q L T q , we can no longer use the identification model in (31). Instead, we apply non-linear least squares analysis to minimize f (ζ ) 2 with to determine the estimated model parameter vectorζ , i.e., estimated dampingD q and stiffnessK q .

Simulation
We validate our method with simulated data of a neuromechanical model of the human arm, which generates transversal movements of a two-link, six-muscle arm through calculation of muscle activities (Franklin et al., 2008). By simulating each movement twice, once with and once without perturbation, we validate the estimation of the unperturbed dynamics. Furthermore, we use the inertial characteristics and the intrinsic impedance to validate the estimation of the involuntary impedance parameters.
We use the neuromechanical model to simulate a static postural task and a dynamic movement task. As the static postural task does not include an underlying movement, we are able to apply the do-not-intervene-voluntarily paradigm. While the do-not-intervene-voluntarily paradigm does not guarantee exclusion of voluntary feedback or constant unperturbed states (Sainburg, 2015), it is widely used and accepted as a plausible approximation of both of these assumptions (Mussa-Ivaldi et al., 1985;Gomi and Kawato, 1997;Osu and Gomi, 1999). Thus, given the absence of an underlying movement, and the fact that the elements of the BIP vector π are all constant, we are able to use an estimation interval duration T est that is longer than the delay of voluntary feedback δ v in the static postural task. As the dynamic movement task does include an underlying movement, it is performed without the do-not-intervene-voluntarily paradigm and we use the limited estimation interval duration T est = δ v .

Human Arm Model
The rigid body dynamics of the neuromechanical model are determined by (1) and the correlations between the internal torques τ int and the muscle tensions m are determined by (2). The muscle tensions where m a and m imp represent the muscle tensions due to muscle activation and mechanical impedance, respectively. The muscle tensions due to muscle activation m a are assumed to be identical to the muscle activations a in (3) and the muscle tensions due to mechanical impedance where D m , K m , and e m represent damping, stiffness, and tracking errors with respect to the desired trajectory at the muscle level and K 0 and K 1 contain intrinsic stiffness parameters. The feedforward muscle activations a FF are calculated a priori based on the inverse kinematics and dynamics of the desired movement. The feedback muscle activations a FB are modeled by proportional-derivative (PD) control that depends on the tracking errors e m , the respective derivativesė m , and a simulated feedback delay δ s = 60 ms. The signal-dependent noise a N is calculated with zero mean Brownian motion, which provides movements similar to those observed in previous experiments (Burdet et al., 2001). It is generated by a normally distributed random variable, which is amplified by a parameter α N = 12.5 and filtered by a fifth order low pass Butterworth filter with a cut-off frequency f c,N = 2 Hz. In the implementation of the neuromechanical model, there is no distinction between reflexive and voluntary feedback. Instead, both feedback mechanisms are combined and the simulated feedback delay δ s is defined to lie between the respective delays δ r and δ v . As the simulated feedback delay δ s is shorter than the duration of our estimation interval T est , the feedback behavior within the estimation interval is influenced by voluntary contributions. In order to avoid this and enable the validation of the estimated involuntary impedance parametersD q and K q through the simulated intrinsic impedance components, we use a variation, in which the simulated feedback delay δ s is defined equal to the delay δ v of voluntary feedback, i.e., δ s = δ v = 115 ms. Although the physiological correctness of the variation is slightly reduced compared to the original implementation, it nonetheless represents a plausible simulation of human behavior. More importantly, it represents a means for validation of our method and for comparison of its performance to those of existing methods. In order to represent the effects of the do-not-intervene-voluntarily paradigm, voluntary feedback is removed in the simulations of the static postural task.

Simulation Design
As the definition of the signal-dependent noise a N is based on a normally distributed random variable, the static task and the dynamic task are each simulated ten times and the respective results are averaged. In both tasks, the simulated manipulandum inertia and damping are defined according to the admittance inertia M adm = diag{5, 5} kg and damping D adm = diag{20, 20} Ns/m of our apparatus.
Static task. In the static task, the arm maintains a total of five different positions that are distributed evenly across the horizontal plane:  Figure 5). In order to obtain perturbations that are similar to those of the static estimations in Gomi and Kawato (1997), we define the perturbation profile duration T p,1 = 160 ms which results in a total perturbation duration T pert = 400 ms. We change the interaction mode from closed-loop to openloop which means that applied forces do not contribute to the movement of the manipulandum. As this means that we are essentially applying a position perturbation, we are able to define the amplitude of the perturbation position profile x pert , which we set to 8 mm. The perturbation angles φ pert are defined by the set pert = {(π/12)k | k = 1 − 24 \ {6, 12, 18, 24}}, which contains 20 angles. Each perturbation angle φ pert is executed once in every position and the order of the resulting perturbations is randomized. Single execution of each of the 20 perturbation angles φ pert in each of the five positions results in a total of 100 trials. The duration of the estimation interval T est = 400 ms and the estimated BIP vectorπ is obtained by calculating the least squares solution for the complete data set of all 100 trials.
Dynamic task. In the dynamic task, the arm performs twodimensional point to point movements along the sagittal axis from x start = [0, 0.30] T m to x goal = [0, 0.55] T m (see Figure 5). The duration of the movements T mov is set to 2 s. The feedforward muscle activations a FF are calculated with the inverse kinematics and dynamics of a positional data set, which is provided by the authors of Franklin et al. (2008). It contains positional data of 50 two-dimensional point to FIGURE 5 | Schematic representation of the human arm task space, the static task, and the dynamic task. In the static task, the arm maintains a total of five different positions that are distributed evenly across the horizontal plane. In the dynamic task, it performs point to point movements along the sagittal axis. point arm movements, which possess identical start and goal positions as the movements in our task. The perturbations are designed to generate sufficiently large deviations in as short a time frame as possible, but nonetheless be physically plausible and kinesthetically renderable under hardware limitations. The perturbation amplitude A pert = 40 N and the perturbation profile duration T p,1 = 70 ms which corresponds to a total perturbation duration T pert = 175 ms. The perturbation is initiated when the hand reaches x 2 = 0.4 m which equals a distance of 0.1 m along the axis of the principal movement. The perturbation angles φ pert are defined by pert , which is identical to the static task. Each perturbation angle φ pert is executed three times and the order of the resulting perturbations is randomized. Three repetitions of each of the 20 perturbation angles φ pert result in a total of 60 trials, of which each consists of an unperturbed and a perturbed movement. The duration of the estimation interval T est = 115 ms and the estimated dampingD q and stiffnessK q are obtained by calculating the least squares solutions for data sets of 20 trials each. Grouping the complete data set of all 60 trials into data sets of 20 trials each, starting from the first 20 trials and moving the respective data set along trial by trial until the last 20 trials, provides a total of 41 individual least squares solutions.

Experiment
In order to evaluate the performance of the proposed method for real data, we conduct an experiment with 12 participants. The experiment design is identical to the simulation design, apart from a randomized time interval before the onset of the perturbation in the static task and a randomized distribution of unperturbed and perturbed trials in the dynamic task.

Apparatus and Experiment Design
The apparatus is presented in Figure 6. It consists of two linear servo motor driven single rail stages (Copley Controls Thrusttube Module) that are mounted on top of each other in orthogonal orientation. Together, they span a 2-DoF workspace of ±0.20 m and are each equipped with an encoder that yields position data with a precision of 1 µm. Additionally, six motion capture cameras (Qualisys Miqus) are available for tracking of passive markers. A vertical handle and a 6-DoF force-torque sensor (JR3-67M25) are mounted on top of the upper servo motor driven cart to measure the forces in the horizontal plane. A custombuilt seat with shoulder belts, a sling attached to the ceiling, and a wrist orthosis are available for limitation of the movements of the participants. Visual feedback is provided on a screen behind the apparatus and implemented with the Psychophysics Toolbox (Brainard, 1997). The position of the cart is visualized by a dot and the workspace safety boundaries are visualized in the form of a boundary box.
Haptic interaction by means of participant force input is enabled by the admittance control scheme defined in (26), where M adm = diag{5, 5} kg and D adm = diag{20, 20} Ns/m. The parameters of the admittance control scheme are chosen to guarantee natural interaction with the apparatus and sufficient attenuation of force-torque sensor noise. Precise rendering of the position is ensured by a high gain PD controller, which is implemented in Matlab/Simulink and executed on a Linux system with RT-preempt real-time kernel (Ubuntu 14.04, 3.14.3-rt4-prt). The sample rate is set to f s = 4 kHz and inputs to the Thrusttube Modules are downsampled to 2 kHz due to hardware limitations. The signals are filtered using a seventh order Savitzky-Golay polynomial filter with a cut-off frequency of f c,SG = 50 Hz.
In order to avoid predictability of the perturbations and adaptation of the participants, a randomized time interval with T random ∈ [1, 3] s is included before the onset of the perturbations in the static task. For the same reasons and to obtain both unperturbed and perturbed movements, the ratio of perturbed trials to total amount of trials is set to 33 % in the dynamic task. In order to avoid effects of fatigue, the resulting 180 trials of the dynamic task are performed in three consecutive blocks. Each of these blocks contains a total of 60 trials that consist of 20 perturbed and 40 unperturbed randomly distributed trials. In both tasks, the visual feedback of the position of the cart is deactivated during the perturbations.

Participants and Experiment Procedure
A total of 12 participants (9 males, 3 females) with betweensubject mean (SD) age of 26.92 (3.40) years, height of 174.83 (7.38) cm, and weight of 69.43 (9.82) kg volunteered to participate in this experiment. All 12 participants had normal or corrected-to-normal vision. One of the participants was left handed and performed the experiment with the non-dominant hand. The remaining 11 participants were right handed and participated with the dominant hand. Informed written consent was obtained from all participants before the experiment, which was conducted according to the principles in the Declaration of Helsinki. The research ethics were obtained from the ethics committee at the Technical University of Munich.
The participants were seated in front of the apparatus and their upper body was restrained by two shoulder belts to fix the position of the shoulder. Their upper arm was supported by a sling attached to the ceiling to constrain all motions to the horizontal plane and reduce effects of fatigue. In order to avoid wrist motions, the participants were wearing a wrist orthosis. Two passive motion capture markers were placed on shoulder and elbow to measure the lengths of the upper arm and the forearm.
In the beginning of the static task, the cart automatically moved to the first position at the bottom of the workspace. After the application of all 20 perturbations, it automatically moved to the next position. This procedure was repeated until all 100 perturbations were completed. Analogous to the simulation, each perturbation angle φ pert was executed once in each of the positions and the order was randomized. The participants were informed about the procedure, including the perturbations and the randomized time interval before the onset of the perturbation. They were told that their objective was to naturally grasp the handle on top of the cart. Additionally, they were instructed to not voluntarily react to the perturbations in any way and not prepare for them in any kind of preemptive manner, e.g., by co-contraction.
In the beginning of the dynamic task, the cart automatically moved to the start position at the bottom of the workspace. As soon as it reached the start position, the admittance turned on. As soon as the participants reached the goal position, represented by a dot at the top of the workspace, the admittance turned off and the cart automatically moved back down to the start position. This procedure was repeated until all 60 trials with 20 randomly distributed perturbed trials were completed. In total, three of these blocks of trials were completed for a total of 180 trials. Analogous to the simulation, each perturbation angle φ pert was executed three times and the order was randomized. The participants were informed about the procedure, including the perturbations and their random distribution. They were told that their objective was to naturally grasp the handle on top of the cart and move it from the start position to the goal position. They were told that the duration of the movement should be ∼ 2 s. In order to help them adjust their movement velocity, a beep sound occurred after 2 s. The participants were informed that they were allowed to voluntarily react to the perturbations. Additionally, they were instructed to not prepare for the perturbations in any kind of preemptive manner, e.g., by co-contraction.

RESULTS
In this section, we first present the results of the simulation, which consist of the validation of the feedback jerk isolation and the involuntary impedance estimation. Subsequently, we present the results of the evaluation with experimental data. The results of the simulation are either presented in within-, between-, or cross-simulation mean (SD). The cross-simulation mean (SD) is obtained by calculating the mean of the withinsimulation means and the mean of the within-simulation SDs. Analogous correlations apply for the results of the experiment, which are presented in within-, between-, or cross-subject mean (SD). As the majority of the results are cross-simulation/crosssubject mean (SD) results, for simplicity, from this point on, the respective results are referred to simply as mean (SD) results.

Validation Feedback Jerk Isolation
In this section, we validate the feedback jerk isolation by analyzing its performance for different movement velocities and variations of neural noise. Furthermore, we compare the results to those obtained by application of the methods in Gomi and Kawato (1997) and Erden and Billard (2015). In order to ensure equal conditions and enable performance comparisons without effects of voluntary feedback, all methodologies are applied to the same sets of simulated data and use the same duration of the estimation interval T est = 115 ms. The estimation of the unperturbed dynamics {q * ,q * ,q * , τ * ext } is validated by analysis of the estimated variational dynamics { x, ẋ , ẍ , û ext }. The estimation accuracy is assessed using the normalized root mean square errors (NRMSEs) in the estimation interval [0, T est ]: where ν is the normalizing value, χ is the real value,χ is the estimated value, and n = 2 is the dimensionality. For the estimated variational dynamics { x, ẋ , ẍ , û ext }, the normalizing value ν is given by the maximum real value in the estimation interval [0, T est ]. In order to analyze the performance for different movement velocities, the duration of the movements T mov is changed (1, 3 s). For analysis of the performance for different variations of neural noise, the parameterization of the zero mean Brownian motion is changed in terms of the cut-off frequency f c,N (1, 3 Hz) and the amplitude α N (5, 20). Figure 7A shows the mean results of the ESDs ψ UP and ψ P .

Filter Configuration
In the unperturbed movements, the energy of the principal movement along the x 2 axis is distinguishable by a peak in the respective ESD ψ UP,2 . A secondary, much lower peak represents effects of neural noise. As the point to point movements do not require movements along the x 1 axis, the respective ESD ψ UP,1 is significantly lower and represents effects of neural noise. The energy of both axes of the unperturbed movements decreases to marginally low values for high frequencies. The opposite applies for the energy of the perturbed movements, which increases to significantly higher values for high frequencies.
The respective ESDs ψ P are much higher than those of the unperturbed movements and have multiple peaks in the high frequency range. Figure 7B shows the ESDs ψ UP , ψ P , and ψ FB of a single simulation in two axis-specific graphs to demonstrate the calculation of the cut-off frequencies f c,HP . The cut-off frequency f c,HP,2 of the x 2 axis is sufficiently high to lie above the energy of the principal movement. Concurrently, it is sufficiently low to lie below the energy of the feedback behavior. Due to the absence of movements along the x 1 axis, the cut-off frequency f c,HP,1 is slightly lower. The between-simulation mean (SD) results of the cut-off frequencies f c,HP of all ten simulations are f c,HP = [1.89 (0.08), 2.24 (0.04)] Hz.

Results
The mean results of the NRMSEs in Figure 8 show that the NRMSEs all increase with t p . The gradual increase of the NRMSEs of the estimated variational accelerations ẍ results from the integration of the estimated variational jerks ...
x , which include estimation errors that originate from the calculation of the high pass filtered jerks ... x of the evoked feedback behavior become larger than the ESDs ψ UP of the jerks ...
x of the unperturbed movements. For clarity, the calculation is exemplarily illustrated for a single simulation and both graphs are limited to the low frequency ranges. The respective cut-off frequencies f c,HP are indicated by vertical lines.
the maximum values of the estimated variational kinematics { x, ẋ , ẍ } are all below 6 % and the NRMSE of the estimated variational external forces û ext is below 8 %.

Comparison
In order to compare the results to those obtained with the methods in Gomi and Kawato (1997) and Erden and Billard (2015), we average the NRMSEs over the complete estimation interval [0, T est ]. For simplicity, from this point on, we refer to the resulting averaged NRMSEs simply as NRMSEs. The mean (SD) results of the NRMSEs for the original simulation are listed in Table 1A and those for different durations of the movements T mov , cut-off frequencies f c,N , and amplitudes α N are listed in Tables 1B-D, respectively. The abbreviations FJI, GOM, and ERD represent our feedback jerk isolation and the methods in Gomi and Kawato (1997)  (2015), respectively. In the discussion of the NRMSEs of the alternative simulations, the changes in configurations and results are evaluated relative to the original simulation. In order to facilitate the comparisons of the NRMSEs of the original and alternative simulations, Tables 1B-D also include the NRMSEs of the original simulation, and the contents of all three tables are additionally illustrated with intermediate values in Figure 9.

and Erden and Billard
The NRMSEs in Table 1A show that FJI achieves high estimation accuracy and outperforms both GOM and ERD. The magnitudes and differences of the individual NRMSEs of FJI are in accordance with those of the NRMSEs presented in Figure 8. The performance difference of FJI and GOM increases from a small difference in ẍ to a large difference in x and the performance difference in û ext is marginally larger than that of ẍ . These correlations are plausible, due to the dependency on the averaged unbiased dynamics of all perturbed movements. While forces and accelerations are of similar magnitude, velocities and especially positions vary between movements. As a consequence, the accuracy of the averaged unbiased dynamics decreases significantly from accelerations down to positions. Similar correlations apply to the performance difference of FJI and ERD, but with significantly larger differences, especially for ẋ and x. The magnitudes of the NRMSEs of ERD are plausible, due to the dependency on the constant values of the offsets at perturbation onset. While these constant values represent accurate approximations of the unperturbed states of the quasi-static movements of the manual welding task in Erden and Billard (2015), they are incapable of accurately representing the kinematics of the unperturbed states of dynamic point to point movements. For this reason, we exclude this method from the remainder of the validation of the feedback jerk isolation and the involuntary impedance estimation.
The NRMSEs in Table 1B show that the performance of FJI is decreased for a higher movement velocity (resulting from a shorter duration of the movement) and marginally increased for a lower movement velocity (resulting from a longer duration of the movement). A higher movement velocity results in an increased cut-off frequency of the high pass filter which causes an increased information loss in the isolation of the feedback jerk. As a result, the estimation accuracy is decreased. The marginal increase in performance for a lower movement velocity suggests that the corresponding NRMSEs are close to the smallest possible errors, which are caused by unavoidable frequency overlap of unperturbed movements and feedback behavior, neural noise, and imperfect properties of a non-ideal high pass filter. The performance of GOM is slightly increased for a higher movement velocity and slightly decreased for a lower movement velocity. As the accumulated influence of the neural noise at the onset of the perturbation is negatively correlated with the movement velocity, the deviations from the averaged unbiased dynamics are larger for slow movements than they are for fast movements. As the performance of GOM is directly influenced by these deviations, the estimation accuracy is positively correlated with the different movement velocities. For T mov = 1 s, GOM outperforms FJI in all NRMSEs except x, for which the performance difference is decreased compared to Table 1A. For T mov = 3 s, FJI outperforms GOM with slightly increased performance differences compared to Table 1A.
The NRMSEs in Table 1C show that the performance of FJI is slightly decreased for lower and higher cut-off frequencies of the noise, with the latter resulting in slightly larger performance differences than the former. It seems that the lower cut-off frequency of the noise results in such a decreased cut-off frequency of the high pass filter, that it is too close to the frequency content of the unperturbed behavior. For the higher cut-off frequency of the noise, the increased cut-off frequency of the high pass filter results in an increased information loss in the isolation of the feedback jerk. Despite these slight decreases in performance, FJI still achieves high estimation accuracy for both alternative simulations. The performance of GOM is increased/decreased for lower/higher cut-off frequencies of the noise due to smaller/larger deviations from the averaged unbiased dynamics. For f c,N = 1 Hz, GOM outperforms FJI in all NRMSEs except x, for which the performance difference is decreased compared to Table 1A. For f c,N = 3 Hz, FJI outperforms GOM with slightly increased performance differences compared to Table 1A.
The NRMSEs in Table 1D illustrate that a lower amplitude of the noise has an almost identical effect on the performance as a lower cut-off frequency of the noise in Table 1C. In contrast, a higher amplitude of the noise has very different effects. For FJI, a higher amplitude of the noise only leads to a marginal decrease of the performance, much smaller than that in Table 1C. For GOM, it leads to a similar decrease of the performance as in Table 1C, except for x, for which the decrease is much larger. For α N = 5, the performance difference of FJI and GOM is almost identical to that of f c,N = 1 Hz in Table 1C. For α N = 20, FJI outperforms GOM with increased performance differences compared to Table 1A, especially for x. The different effects emphasize three issues: (1) For GOM, the estimation accuracy strongly depends on the similarity to the averaged unbiased dynamics.
(2) For FJI, it depends more strongly on the frequency of the noise than it does on the amplitude, due to the influence on the cut-off frequency of the high pass filter. (3) The performance difference in GOM due to a higher amplitude of the noise is larger than the performance difference in FJI due to a higher frequency of the noise, especially for x.   Gomi and Kawato (1997) and Erden and Billard (2015), respectively.

Validation Involuntary Impedance Estimation
In this section, we validate and compare the performance of the involuntary impedance estimation. The estimation accuracy is assessed using the normalized absolute errors (NAEs) of the estimated values. The normalizing values are either given by the real values of the elements of the BIP vector π or the maximum real values of the elements of the damping D q and stiffness K q , which are obtained by averaging the respective elements for the estimation interval [0, T est ]. For comparability of the results, we transform the inertial parameters in Gomi and Kawato (1997) to the elements of the BIP vector π .
In order to obtain additional means of comparing the overall performance of the methods, we calculate two additional performance criteria that are based on the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). When comparing least squares fitted models, the AIC and the BIC are defined by the residual sum of squares (RSS) of the real values and the estimated model outputs: where y is the real value,ŷ is the estimated model output, p is the number of parameters, k is the number of data points, and n = 2 is the dimensionality (Burnham and Anderson, 2002).
As we aim to compare the overall performance of the methods, we define the estimated model outputŷ to be the simulation output that is obtained by inserting the estimated BIP vectorπ , dampingD q , and stiffnessK q into the neuromechanical model of the human arm (within the estimation interval [0, T est ]), and the real value y to be the corresponding simulation output of the original simulation.

Table 2A
contains the mean (SD) results of the NAEs of the estimated BIP vectorπ , dampingD q , and stiffnessK q . The NAEs of the elements of the estimated BIP vectorπ do not possess SDs, because they are estimated with the complete data set of the static task. The NAEs of all three elements are below 2% which indicates high estimation accuracy of the combination of the dynamic regressor representation with the data of the static task. The NAEs of the elements of the estimated dampinĝ D q and stiffnessK q are all approximately equal to or below Frontiers in Neuroscience | www.frontiersin.org  10% and demonstrate high estimation accuracy of the nonlinear least squares estimation with the data of the dynamic task.
The NAEs of the elementsD q,11 andK q,11 are slightly increased compared to the remaining elements of the respective matrices. This slight increase is plausible, as these elements represent the contributions of the single-joint shoulder muscles, which are less involved due to less movement of the shoulder joint.

Comparison
The NAEs of the elements of the estimated BIP vectorπ of GOM are almost identical to those of FJI. The mean NAEs of the elements of the estimated dampingD q of GOM are marginally smaller than those of FJI and the opposite applies to the corresponding SDs which essentially makes these NAEs almost identical as well. In contrast, a considerable difference in estimation accuracy is found in the elements of the estimated stiffnessK q , for which both the mean NAEs and the SDs of GOM are larger than those of FJI, with the mean NAEs being approximately twice as large for GOM as they are for FJI. These results are plausible, as the difference in estimation accuracy of the variational dynamics { x, ẋ, ẍ, u ext } is largest for the variational positions x. According to the involuntary impedance model, the difference in estimation accuracy of the variational angles q directly influences the estimated stiffnessK q . The mean (SD) results of the RSSs, AICs, and BICs of the variational external torques τ ext are presented in Table 2B. FJI outperforms GOM in all three performance criteria. The difference in RSS is especially relevant, as it demonstrates, that the differences in AIC and BIC do not arise solely due to differences in the number of parameters p. Similar to the mean NAEs of the elements of the estimated stiffnessK q in Table 2A, the mean RSSs are approximately twice as large for GOM as they are for FJI. The differences in RSS, AIC, and BIC demonstrate that (1) the differences in the estimated BIP vectorπ, dampingD q , and stiffnessK q , represented by the respective NAEs in Table 2A, have a substantial effect on the replicability of the real simulation output and that (2) the involuntary impedance estimation results of FJI yield a more accurate replication of the real simulation output than those of GOM.

Evaluation
In this section, we evaluate the performance of our method for real data obtained in an experiment with 12 human participants. For simplicity, from this point on, we omit the term estimated when referring to the results of the involuntary impedance estimation, i.e., BIP vectorπ, dampingD q , and stiffnessK q . In the experiment, every trial is either an unperturbed or a perturbed trial. Consequently, we can not evaluate the estimation accuracy of the unperturbed states. Therefore, we only evaluate the calculation of the cut-off frequencies f c,HP of the high pass filter. Figure 10 shows the mean results of the ESDs ψ UP and ψ P , which look very similar to those of the simulation presented in Figure 7A. In the unperturbed movements of the experiment, the energy of the principal movement along the x 2 axis is more widespread than in the simulation. However, the respective ESD ψ UP,2 nonetheless possesses a peak at a similar frequency. The ESD ψ UP,1 is significantly lower, due to the absence of movements along the x 1 axis. Due to hardware noise, the energy of both axes of the unperturbed movements does not decrease to values as low as those of the simulation for high frequencies. However, the respective values are nonetheless significantly smaller than those of the perturbed movements, for which the energy increases to significantly higher values for high frequencies. The respective ESDs ψ P are much higher than those of the unperturbed movements and have multiple peaks in the high frequency range. While the overall energy of the unperturbed movements is slightly decreased, the overall energy of the perturbed movements is increased compared to the simulation. This difference is largely caused by high frequency oscillations in the jerks ...

Feedback Jerk Isolation
x of the perturbed movements which result from oscillations due to the perturbations paired with sensor noise and high gain PD control. Nonetheless, the between-subject mean (SD) results of the cut-off frequencies f c,HP = [1.45 (0.30), 2.09 (0.24)] Hz are very similar to those of the simulation, with both values being slightly lower due to the slight decrease in overall energy of the unperturbed movements. Nonetheless, the cut-off frequency f c,HP,2 is still sufficiently high to lie above the energy of the principal movement. Table 3 lists the within-and cross-subject mean (SD) results of the involuntary impedance estimation. Analogous to the simulation, the elements of the BIP vectorπ do not possess SDs, because they are estimated with the complete data set of the static task. The elements of the mean BIP vector π are similar to the simulated values in the simulation π sim = [0.1945, 0.0737, 0.0838]. The first elementπ 1 of the BIP vector is decreased compared to the respective value in the simulation. As this element largely depends on the inertial characteristics of the upper arm, the difference could be caused by the comparatively little movement of the shoulder joint which results in less involvement of the upper arm. As the inertial parameters in Gomi and Kawato (1997) are defined by the linearized rigid body dynamics, the respective estimation results can not be used for comparison. Thus, we use the elements of the mean BIP vectorπ to calculate the mean inertiaM q , which we transform to Cartesian space to obtain the mean Cartesian endpoint inertiaM x . The elements of this mean Cartesian endpoint inertiaM x are similar to those reported in existing studies (Tsuji et al., 1995;Chang et al., 2013;Dyck and Tavakoli, 2013). Consequently, the same applies to the Cartesian endpoint ellipse, which is illustrated in Figure 11.

Involuntary Impedance Estimation
The elements of the mean dampingD q are similar to the averaged simulated values D q,sim = [2.42, 1.20, 1.20, 1.42], with the exception of the elementD q,11 , which is comparatively large. As the elementD q,11 represents the contributions of the single-joint shoulder muscles, which are less involved due to less movement of the shoulder joint, it is more difficult to estimate. This correlation is also visible in the increased estimation error in the validation with simulated data (see Table 2A). As the remaining elements are decreased compared to the respective values in the simulation, it is possible that some of the contributions of these elements are allocated to the element D q,11 . As the damping results of Gomi and Kawato (1997) are not reported, we can not use them for comparison. However, the elements of the mean dampingD q are of similar order of magnitude as those of static tasks (Tsuji et al., 1995;Lakatos et al., 2011). The overall increase in magnitude compared to the results of these static tasks is to be expected, as similar correlations are observed for estimates of stiffness (Gomi and Kawato, 1997).
The elements of the mean stiffnessK q are similar to the averaged simulated values K q,sim = [29.05, 14.37, 14.37, 17.08], with the elementsK q,12/21 being slightly decreased in comparison. Consequently, the elements of the mean stiffnesŝ K q are also similar to those reported for comparable dynamic tasks (Burdet et al., 2000;Darainy et al., 2007;Wong et al., 2009b), including those reported for the dynamic task in Gomi and Kawato (1997). The difference in size of the SDs of the stiffnessK q and the dampingD q fits to the difference in estimation errors in the validation with simulated data (see Table 2A). It could however also indicate that variations in damping during the course of the experiment are generally lower than those in stiffness or that some of the variations in damping are incorrectly interpreted as variations in stiffness by the non-linear least squares estimation. Figure 11 shows the Cartesian endpoint ellipses of the mean inertiaM x , dampingD x , and stiffnessK x . The shapes and orientations of the ellipses are similar to those in existing studies (Tsuji et al., 1995;Gomi and Osu, 1998;Darainy et al., 2007). Similar to the results for the movements along the sagittal axis in Gomi and Kawato (1997), the major axis of the Cartesian endpoint ellipse of the mean stiffnessK x is oriented slightly more parallel to the x 2 axis, i.e., the axis of the principal movement. Figure 12 shows the between-subject mean (SD) results of the BIP vectorπ , dampingD q , and stiffnessK q for different durations of the estimation interval T est . In the static task, the elements of the BIP vectorπ converge to constant values and do not change for durations T est > 400 ms. Similar behavior is observable in the dynamic task for the elements of the dampingD q , with a slight decrease for durations T est > 115 ms. In comparison, the elements of the stiffnessK q converge at a slower rate and require a longer estimation interval to reach plausible values. Furthermore, the decrease for durations T est > 115 ms is larger than that of the elements of the dampingD q . This decrease could be caused by the return to the unperturbed states. The longer the estimation interval, the larger the percentage of the variational data with small deviations from the unperturbed states. The larger this percentage becomes, the more influence the respective values of the elements of the stiffnessK q have on the solution of the non-linear least squares estimation. The decrease could however also be caused by the activation of voluntary feedback contributions.  The respective Cartesian space matrices are calculated by transformation of the BIP vectorπ , dampingD q , and stiffnessK q , respectively.

DISCUSSION
The mean results of the involuntary impedance estimation for different durations T est in Figure 12 show that the elements of the dampingD q and the stiffnessK q reach plausible values for the estimation interval with duration T est = δ v = 115 ms. While the elements of the dampingD q already reach plausible values at T est ≈ 75 ms, the amount of information necessary for estimation of plausible values of the elements of the stiffnessK q is not reached until T est ≈ 110 ms. The mean results in Figure 12 show that the elements of the BIP vectorπ converge to plausible values for the estimation interval with duration T est = 400 ms. These results, in combination with the mean (SD) results in Table 3, which are similar to those reported for comparable dynamic tasks, successfully demonstrate the applicability of our method to real data. The mean (SD) results of the NRMSEs in Table 1A show that our feedback jerk isolation achieves higher estimation accuracy for the variational dynamics { x, ẋ , ẍ , û ext } than the methods reported in Gomi and Kawato (1997) and Erden and Billard (2015). The difference in estimation accuracy is especially large for the variational positions x. As a consequence, the estimation performance of the involuntary impedance estimation is increased for the elements of the stiffnessK q , which is shown by the mean (SD) results of the NAEs in Table 2A. The mean (SD) results of the NRMSEs for different simulation configurations in Tables 1B-D show that the estimation accuracy of the feedback jerk isolation is decreased for a higher movement velocity. This is plausible, as a higher movement velocity results in an increased cut-off frequency of the high pass filter which causes an increased information loss in the isolation of the feedback jerk. Due to similar reasons, a higher cut-off frequency of the neural noise also leads to a decrease of the estimation accuracy. For all of the remaining simulation configurations, changes in the estimation accuracy are marginal. While the estimation accuracy of the method in Gomi and Kawato (1997) is similarly decreased for a higher cut-off frequency of the neural noise, it is contrastly increased for a higher movement velocity. As this method depends on the similarity of the movement to the averaged unbiased dynamics, an increase of the amplitude of the neural noise leads to a significant decrease of estimation accuracy, current constraints for safe pHRI. For those movements with peak velocitiesẋ peak that are similar to or moderately increased compared to the current constraints, the feedback jerk isolation provides superior estimation performance. As the neural noise parameters in the simulation are calculated to provide movement variability similar to that observed in repetitive, straight reaching movements (Burdet et al., 2001) and both neural and kinematic variability are shown to be interrelated individual characteristics (Haar et al., 2017), it is unlikely that realistic pHRI scenarios possess a lower amount of movement variability. Thus, based on these circumstances, we conclude that the feedback jerk isolation is well-suited for our envisaged application in realistic pHRI scenarios.
We approximate the feedback behavior by a linearized model. This approximation is analogously applied in comparable studies (Dolan et al., 1993;Burdet et al., 2000;Darainy et al., 2007) that include deviations of similar size or larger than the ones in our work. In our static task, the position perturbations with amplitudes of 8 mm result in mean (SD) maximum external force deviations of 4.75 (1.80) N. In our dynamic task, the force perturbations result in mean (SD) maximum position deviations of 3.53 (0.34) mm and external force deviations of 9.48 (3.05) N. The mean (SD) results of the NAEs in Table 2A shows that the estimation errors of the elements of the BIP vectorπ are almost identical for the dynamic regressor representation and the linearized rigid body dynamics of Gomi and Kawato (1997). This indicates that the variational dynamics { q, q, q, τ ext } of the static task are sufficiently small to allow for linearization of the rigid body dynamics without loss of estimation accuracy and supports the assumption that the feedback behavior evoked by the force perturbations can be approximated by a linearized model.
Another advantage of perturbations with comparatively small amplitudes is that they are less likely to lead to instability of the movement than those with larger amplitudes. The region of stability of the movement is influenced by a number of factors, including feedforward and voluntary feedback behavior, which are both highly task-specific. As stiffness is positively correlated with internal torques, movements that require larger feedforward torques, e.g., due to increased movement velocity or interaction forces, coincide with larger stiffness (Tee et al., 2004). Thus, for these movements, perturbations with a certain amplitude result in smaller deviations. Assuming that the voluntary feedback behavior can be modeled by stochastic optimal control, the effects of the perturbations after the delay of voluntary feedback depend on the incorporated cost functions, e.g., defined by the tracking error or the smoothness of the movement (Todorov and Jordan, 2002). Furthermore, the region of stability is also heavily influenced by the level of muscle cocontraction, which is known to increase in response to a number of factors, e.g., incomplete or incorrect internal models (Tomi et al., 2008) and increased accuracy requirements (Lametti et al., 2007). In Burdet et al. (2001), participants increase their level of muscle cocontraction in order to successfully compensate effects of unpredictable environmental instabilities. Due to all of these factors, assessment of the region of stability must take into consideration task-specific influences and constraints. In future work, we aim to apply our method to more complex movements and complement corresponding assessments of the region of stability, e.g., in the form of numerical assessments based on Monte Carlo methods.
Multiple studies present impedance estimation methods based on Kalman filters (Deng et al., 2006;Asao et al., 2013) or extended Kalman filters (Roveda et al., 2013). The advantage of these filters is that, given continuous observations of the relevant variables, they allow for estimation of time-varying values of damping and stiffness. However, their application requires a model of the complete system dynamics, including feedforward and voluntary feedback behavior (if the respective method is used for a duration that is longer than the delay of voluntary feedback). Some studies avoid these limitations by assuming that the combination of feedforward and feedback behavior can be modeled by the sum of damping and stiffness (Asao et al., 2013) or just stiffness (Roveda et al., 2013). In Deng et al. (2006), the authors assume that effects of feedforward and voluntary feedback behavior are neglectable due to application of pseudo-random perturbations in combination with a band pass filter. However, the plausibility of these assumptions is not validated, as the method is only applied to a simulation that models the combination of feedforward and feedback behavior by the sum of damping and stiffness. Thus, existing impedance estimation methods based on Kalman filters are extremely limited in possible application scenarios and the application to realistic pHRI would require significantly more complex models. While our method is not able to provide time-varying values of damping and stiffness, it is able to provide accurate estimates within a limited interval without the need for modeling feedforward and voluntary feedback behavior.

CONCLUSION
In this work, we present a method for the estimation of the involuntary impedance components of the human arm during multi-joint movements by application of force perturbations. These perturbations are designed such that the evoked feedback jerk frequency content can be isolated with a high pass filter. We limit the duration of the estimation interval to 115 ms to guarantee exclusion of voluntary feedback. We estimate the inertial parameters in a static postural task and subsequently insert them to estimate the damping and stiffness in a dynamic movement task.The evaluation of the experimental data shows that our method is able to provide plausible involuntary impedance estimates within the limited estimation interval that guarantees exclusion of voluntary feedback. Furthermore, the validation with simulated data shows that it provides superior estimation performance compared to the results obtained by application of existing methodologies within identical conditions. As the difference in estimation accuracy is especially large for the variational positions, the estimation performance is increased for the elements of the stiffness. The analysis of different movement velocities and variations of neural noise shows that the feedback jerk isolation is able to provide superior estimation performance for movements with moderate to low velocity and is much less affected by an increase in movement variability. We conclude that our method allows for involuntary impedance estimation in experiments that emulate realistic pHRI, without the need to include the donot-intervene-voluntarily paradigm or comparable constraints within the respective dynamic movement tasks. This enables the acquisition of valuable information regarding involuntary impedance modulation strategies, which could be used for assessment of stability and adaptation of robot control behavior during pHRI. In future work, we aim to apply the method to more complex movements, inspired by realistic pHRI scenarios. Furthermore, we intend to complement our current apparatus by EMG sensing modalities, in order to obtain additional insights into the involved muscle activations.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethics committee at the Technical University of Munich. The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
HB and SE adapted the simulation and validated the method with simulated data. HB implemented and conducted the experiment, verified the method with experimental data, and wrote the first draft of the manuscript. SE and SH provided the critical revisions. All authors contributed to the conceptualization and methodology, read and approved the manuscript.