Evaluating Morphological Computation in Muscle and DC-motor Driven Models of Human Hopping

In the context of embodied artificial intelligence, morphological computation refers to processes which are conducted by the body (and environment) that otherwise would have to be performed by the brain. Exploiting environmental and morphological properties is an important feature of embodied systems. The main reason is that it allows to significantly reduce the controller complexity. An important aspect of morphological computation is that it cannot be assigned to an embodied system per se, but that it is, as we show, behavior- and state-dependent. In this work, we evaluate two different measures of morphological computation that can be applied in robotic systems and in computer simulations of biological movement. As an example, these measures were evaluated on muscle and DC-motor driven hopping models. We show that a state-dependent analysis of the hopping behaviors provides additional insights that cannot be gained from the averaged measures alone. This work includes algorithms and computer code for the measures.


Introduction
Morphological computation (MC), in the context of embodied (artificial) intelligence, refers to processes which are conducted by the body (and environment) that otherwise would have to be performed by the brain [11].A nice example of MC is given by Wootton [18] (see p. 188), who describes how "active muscular forces cannot entirely control the wing shape in flight.They can only interact dynamically with the aerodynamic and inertial forces that the wings experience and with the wing's own elasticity; the instantaneous results of these interactions are essentially determined by the architecture of the wing itself [. . .]" MC is relevant in the study of biological and robotic systems.In robotics, a quantification of MC can be used e.g. as part of a reward function in a reinforcement learning setting to encourage the outsourcing of computation to the morphology, thereby enabling complex behaviors that result from comparably simple controllers.The relationship of embodiment and controller complexity has been recently studied in [10].MC measures can also be used to evaluate the robot's morphology during the design process.For biological systems, energy efficiency is important and an evolutionary advantage.Exploiting the embodiment can lead to more energy efficient behaviors, and hence, MC may be a driving force in evolution.
In biological systems, movements are typically generated by muscles.Several simulation studies have shown that the muscles' typical non-linear contraction dynamics can be exploited in movement generation with very simple control strategies [14].Muscles improve movement stability in comparison to torque driven models [16] or simplified linearized muscle models (for an overview see [5]).Muscles also reduce the influence of the controller on the actual kinematics (they can act as a low-pass filter).This means that the hopping kinematics of the system is more pre-determined with non-linear muscle characteristics than with simplified linear muscle characteristics [5].And finally, in hopping movements, muscles reduce the control effort (amount of information required to control the movement) by a factor of approximately 20 in comparison to a DC-motor driven movement [7].
In view of these results we expect that MC plays an important role in the control of muscle driven movement.To study this quantitatively, a suitable measure for MC is required.There are several approaches to formalize MC [8,12,13].In our previous work we have focused on an agent-centric perspective of measuring MC [19] and we have applied an information decomposition of the sensorimotor loop to measure and better understand MC [4].Both publications used a binary toy world model to evaluate the measures.With this toy model, it was possible to show that these measures capture the conceptual idea of MC and, in consequence, that they are candidates to measure MC in more complex and more realistic systems.
The goal of this publication is to evaluate two measures of MC on biologically realistic hopping models.With this, we want to demonstrate their applicability in non-trivial, realistic scenarios.Based on our previous findings (see above), we hypothesize that MC is higher in hopping movements driven by a non-linear muscle compared to those driven by a simplified linear muscle or a DC-motor, for the following reason.Our experiments show that a state-dependent analysis of MC for the different models leads to insights, which cannot be gained from the averaged measures alone.
Furthermore, we provide detailed instructions, including MATLAB R code, on how to apply these measures to robotic systems and to computer simulations.With this, we hope to provide a tool for the evaluation of MC in a large variety of applications.
The quantifications of MC require a formal representation of the sensorimotor loop (see Fig. 1), which is introduced in the next section as far as it is required to understand the remainder of this work.For further information, the reader is referred to [2,9,19,20].

The Sensorimotor Loop
The conceptual idea of the sensorimotor loop is similar to the basic control loop systematics, which is the basis of robotics and also of computer simulations of human movement.In our understanding, a cognitive system consists of a brain or controller, which sends signals to the system's actuators, thereby affecting the system's environment.We prefer to capture the body and environment in a single random variable named world.This is consistent with other concepts of agent-environment distinctions.An example for such a distinction can be found in the context of reinforcement learning, where the environment (world) is everything that cannot be changed arbitrarily by the agent [15].A more thorough discussion of the brainbody-environment distinction can be found in [1,17] and a more recent discussion can be found in [3].A brief example of a world, based on a robot simulation, is given below.The loop is closed as the system acquires information about its internal state (e.g.current pose) and its world through its For simplicity, we only discuss the sensorimotor loop for reactive systems.This is plausible, because behaviors which exploit the embodiment (e.g.walking, swimming, flying) are typically reactive.This leaves us with three (stochastic) processes S(t), A(t), and W (t), t ∈ N, that constitute the sensorimotor loop (see Fig. 1), which take values s, a, and w, in the sensor, actuator, and world state spaces (their respective domains will be clear from the context).The directed edges (see Fig. 1) reflect causal dependencies between these random variables.We consider time to be discrete, i.e., t ∈ N and are interested in what happens in a single time step.Therefore, we use the following notation.Random variables without any time index refer to some fixed time t and primed variables to time t + 1, i.e., the two variables S, S refer to S(t) and S(t + 1).
Starting with the initial distribution over world states, denoted by p(w), the sensorimotor loop for reactive systems is given by three conditional probability distributions, β, α, π, also referred to as kernels.The sensor kernel, which determines how the agent perceives the world, is denoted by β(s|w), the agent's controller or policy is denoted by π(a|s), and finally, the world dynamics kernel is denoted by α(w |w, a).
To understand the function of the world dynamics kernel α(w |w, a) it is useful to think of a robotic simulation.In this scenario, the world state W is the state of the simulator at a given time step, which includes the pose of all objects, their velocities, applied forces, etc.The actuator state A is the value that the controller passes on to the physics engine prior to the next physics update.Hence, the world dynamics kernel α(w |w, a) is closely related to the forward model that is known in the context of robotics and biomechanics.
Based on this notation, we can now formulate quantifications of MC in the next section.

Quantifying Morphological Computation
In the introduction, we stated that MC relates to the computation that the body (and environment) performs that otherwise would have to be conducted by the controller (or brain).This means that we want to measure the extent to which the system's behavior is the result of the world dynamics (i.e., the body's internal dynamics and it's interaction with its world) and how much of the behavior is determined by the policy π (see Fig. 1).
In our previous publication [19] we have defined two concepts to quantify MC, from which the two measures below are taken and derived.

Morphological computation as conditional mutual information (MC W )
The first quantification, that is used in this work, was introduced in [19].The idea behind it can be summarized in the following way.The world dynamics kernel α(w |w, a) captures the influence of the actuator signal A and the previous world state W on the next world state W .A complete lack of MC would mean that the behavior of the system is entirely determined by the system's controller, and hence, by the actuator state A. In this case, the world dynamics kernel reduces to p(w |a).Every difference from this assumption means that the previous world state W had an influence, and hence, information about W changes the distribution over the next world states W .The discrepancy of these two distributions can be measured with the average of the Kullback-Leibler divergence D KL (α(w |w, a)||p(w |a)), which is also known as the conditional mutual information I(W ; W |A).This distance is formally given by (see also Alg. 2 in App.8)

Morphological computation as comparison of behavior and controller complexity (MC MI )
The second quantification follows concept one of [19].The assumption that underlies this concept is that, for a given behavior, MC decreases with an increasing effect of the action A on the next world state W .The corresponding measure MC A ∝ −I(W ; A|W ) cannot be used in systems with deterministic policy, because for these systems I(W ; A|W ) = 0 (see App. 9).Therefore, for this publication, we require an adaptation that operates on world states and is applicable to deterministic systems.The new measure compares the complexity of the behavior with the complexity of the controller.The complexity of the behavior can be measured by the mutual information of consecutive world states, I(W ; W ), and the complexity of the controller can be measured by the mutual information of sensor and actuator states, I(A; S), for the following reason.The mutual information of two random variables can also be written as difference of entropies: which, applied to our setting, means that the mutual information I(W ; W ) is high, if we have a high entropy over world states W (first term) that are highly predicable (second term).Summarized, this means that the mutual information I(W ; W ) is high if the system shows a diverse but nonrandom behavior.Obviously, this is what we would like to see in an embodied system.On the other hand, a system with high MC should produce a complex behavior based on a controller with low complexity.Hence, we want to reduce the mutual information I(A; S), because this either means that the policy has a low diversity in its output (low entropy over actuator states H(A)) or that there is only a very low correlation between sensor states S and actuator states A (high conditional entropy H(A|S)).Therefore, we define the second measure as the difference of these two terms, which is (see also Alg. 4 in Sec.8) For deterministic systems, as those studied in this work, the two measures are closely related.In particular, it holds that MC W − MC MI = H(A|W ) (see App. 10).The inequality MC W ≥ MC MI may not be satisfied always, because discretization can introduce stochasticity.Note that in the case of a passive observer, i.e., a system that observes the world but in which there is no causal dependency between the action and the next world state (i.e., missing connection between A and W in Fig. 1), the controller complexity I(A; S) in Eq. ( 2) will reduce the amount of MC measured by MC MI , although the actuator state does not influence the world dynamics.This might be perceived as a potential shortcoming.In the context discussed in this paper, e.g.data recorded from biological or robotic systems, we think that this will not be an issue.
The next section introduces the hopping models on which the two measures are evaluated.

Hopping models
In a reduced model, hopping motions can be described by a one-dimensional differential equation [6]: where the point mass m = 80 kg represents the total mass of the hopper which is accelerated by the gravitational force (g = −9.81m/s 2 ) in negative y-direction.An opposing leg force F L in positive y-direction can act only during ground contact (y ≤ l 0 = 1 m).Hopping motions are then characterized by alternating flight and stance phases.For this manuscript, we investigated three different models for the leg force.All models have in common, that the leg force depends on a control signal u(t) and the system state y(t), ẏ(t): , meaning, that the force modulation partially depends on the controller output u(t) and partially on the dynamic characteristics, or material properties of the actuator.The control parameters of all three models were adjusted to generate the same periodic hopping height of max(y(t)) = 1.070 m.All models were implemented in MATLAB R Simulink TM (Ver2014b) and solved with ode45 Dormand-Prince solver with absolute and relative tolerances of 10 −12 .The models were solved and integrated in time for T = 8 s and model output was generated at 1 kHz sampling frequency.

Muscle-Fiber model (MusFib)
A biological muscle generates its active force in muscle fibers whose contraction dynamics are well studied.It was found that the contraction dynamics are qualitatively and quantitatively (with some normalizations) very similar across muscles of all sizes and across many species.In the MusFib model, the leg force is modeled to incorporate the active muscle fibers' contraction dynamics.The model has been motivated and described in detail elsewhere [5][6][7].In a nutshell, the material properties of the muscle fibers are characterized by two terms modulating the leg force The first term a(t) represents the muscle activity.The activity depends on the neural stimulation of the muscle 0.001 ≤ u(t) ≤ 1 and is governed by biochemical processes modeled as a first-order ODE called activation dynamics with the time constant τ = 10 ms.The second term in Eq. ( 4) (6) Here we use a maximum isometric muscle force F max = 2.5 kN, an optimal muscle length l opt = 0.9 m, force-length parameters w = 0.45 m and c = 30, and force-velocity parameters lmax = −3.5 ms −1 , K = 1.5, and N = 1.5 [6].
In this model, periodic hopping is generated with a controller representing a mono-synaptic force-feedback.The neural muscle stimulation is based on the time delayed (δ = 15 ms) muscle fiber force F L,MusFib .The feedback gain is G = 2.4/F max and the stimulation at touch down u 0 = 0.027.This model neither considers leg geometry nor tendon elasticity and is therefore the simplest hopping model with muscle-fiber-like contraction dynamics.The model output was the world state w(t) = (y(t), ẏ(t), ÿ(t)), the sensor state s(t) = F L,MusFib (t), and the actuator control command a(t) = u(t).For this model, these are the values that the random variables W , S, and A take at each time step.

Linearized Muscle-Fiber model (MusLin)
This model differs from the model MusFib only in the representation of the force-length-velocity relation, i.e., F L,MusLin = a(t)F lin ( lM ) (see Eq. ( 6)).More precisely, the force-length relation is neglected and the force-velocity relation is approximated linearly with µ = 0.25 m/s.Feedback gain G = 0.8/F max and stimulation at touch down u 0 = 0.19 were chosen to achieve the same hopping height as the MusFib model.

DC-Motor model (DCMot)
An approach to mimic biological movement in a technical system (robot) is to track recorded kinematic trajectories with electric motors and a PD-control approach.The DCMot model implements this approach (slightly modified from [7]).The leg force generated by the DC-motor was modeled as where k T = 0.126 Nm/A is the motor constant, I DC the current through the motor windings, γ = 100 : 1 the ratio of an ideal gear translating the rotational torque T DC and movement φ(t) = γ ẏ(t) of the motor to the translational leg force and movement required for hopping.The electrical characteristics of the motor can be modeled as where −48 V ≤ u DC ≤ 48 V is the armature voltage (control signal), R = 7.19 Ω the resistance, and L = 1.6 mH the inductance of the motor windings.The motor parameters were taken from a commercially available DC-motor commonly used in robotics applications (Maxon EC-max 40, nominal Torque T nominal = 0.212 Nm).As this relatively small motor would not be able to lift the same mass, the body mass was adapted to guarantee comparable accelerations The recorded kinematic trajectory y rec (t) and ẏrec (t) during ground contact was taken from the periodic hopping trajectory of the MusFib model.This trajectory was enforced with a PD-controller u DC (t) = K P (y rec (t) − y(t)) + K D ( ẏrec (t) − ẏ(t)) (12) with feedback gains K P = 5000 V/m and K D = 500 Vs/m.This model is the simplest implementation of negative feedback control that allows to enforce a desired hopping trajectory on a technical system.The model output was the world state w(t) = (y(t), ẏ(t), ÿ(t)), the sensor state s(t) = (y(t), ẏ(t)), and the actuator control command a(t) = u DC (t).

Experiments
This section discusses the experiments that were conducted with the hopping models and the preprocessing of the data.Algorithms for the calculations are provided in the appendix (Sec.8) and implemented MATLAB R code can be downloaded from http://github.com/kzahedi/MC/ (commit c332c18, 30.Nov. 2015).A C++ implementation is available at http://github.com/kzahedi/entropy/.
At this stage, the measures operate on discrete state spaces (see Eqs. ( 1)-( 3) and Algs.2-4).Hence, the data was discretised in the following way.To ensure the comparability of the results, the domain (range of values) for each variable (e.g. the position y) was calculated over all hopping models.Then, the data of each variable was discretised into 300 values (bins).The algorithm for the discretisation is described in Alg. 1. Different binning resolutions were evaluated and the most stable results were found for more than 100 bins.Finding the optimal binning resolution is a problem of itself and beyond the scope of this work.In practice, however, a reasonable binning can be found by increasing the binning until further increase has little influence on the outcome of the measures.
The possible range of actuator values are different for the motor and muscle models.For the muscle models, the values are in the unit interval, i.e., a(t) ∈ [0, 1], whereas the values for the motor can have higher values (see above).Hence, to ensure comparability, we normalized the actions of the motor to the unit interval before they were discretized.
The hopping models are deterministic, which means that only a few hopping cycles are necessary to estimate the required probability distributions.To ensure comparability of the results, we parameterized the hopping models to achieve the same hopping height.

RESULTS
Tab. 1 shows the value of the two MC measures for the three hopping models.Compared to the MusFib model, the two other models result in significantly lower measurement of MC (≈ 30% less).This result complements previous findings
showing that the minimum information required to generate hopping is reduced by the material properties of the non-linear muscle fibers compared to the DC-motor driven model [7].
This also confirms previous findings that the non-linear contraction dynamics reduces the influence of the controller on the actual hopping kinematics in comparison to a linearized muscle model [5,6].To better understand the differences of the models, we plotted the state-dependent MC (see Alg. 3).Fig. 3 shows the values of MC W for each state of the models during two hopping cycles.We chose to discuss MC W only, because the plots of MC W and MC MI are very similar, and hence, a discussion of the state-dependent MC MI will not provide any additional insights.The plots for all models and the entire data are shown in Fig. 4. The full data is shown in Fig. 4. The plots only show a small fraction of the recorded data.For better readability, all the plots for MC are smoothened with a moving average of block size 5.
The orange line shows the state-dependent MC for the linear muscle model (MusLin), and finally, the blue line shows the values for the non-linear muscle model (MusFib).The green line shows the state-dependent MC for the motor model (DCMot).In the figure, the lower lines show the position y of the center of mass over time.The DCMot model is parameterized to follow the trajectory of the MusFib model (see Eq. ( 12)), which is why the blue and green position plots coincide.The original data is shown in Fig. 4.There are basically three phases, which need to be distinguished (indicated by the vertical lines).First, the flight phase, during which the hopper does not touch the ground (position plots are above the red line), second, the deceleration phase, which occurs after landing (position is below the red line but still declining), and finally, the acceleration phase, in which the position is below the red line but increasing.
The first observation is that MC is equal for all models during most of the flight phase (position above the red line) and that it seems to be proportional to the velocity of the systems.During flight, the behavior of the system is governed only by the interaction of the body (mass, velocity) and the environment (gravity) and not by the actuator models.This explains why the values coincide for the three models.
For all models, MC drops as soon as the systems touch the ground.DCMot and MusLin reach their highest values only during the flight phase, which can be expected at least from a motor model that is not designed to exploit MC.The graphs also reveal that the MusLin model shows slightly higher MC around mid-stance phase, compared to the DCMot model.For the non-linear muscle model, the behavior is different.Shortly after touching the ground, the system shows a strong decline of MC, which is followed by a strong incline during the deceleration with the muscle.Contrary to the other two models, the non-linear muscle model MusFib shows the highest values when the muscle is contracted the most (until mid-stance).This is an interesting result, as it shows that the non-linear muscle is capable of showing more MC while the muscle is operating, compared to the flight phase, in which the behavior is only determined by the interaction of the body and environment.

CONCLUSIONS
This work presented two different quantifications of MC including algorithms and MATLAB R code to use them.We demonstrated their applicability in experiments with non-trivial, biologically realistic hopping models and discussed the importance of a state-based analysis of morphological computation.The first quantification, MC W , measures MC as the conditional mutual information of the world and actuator states.Morphological computation is the additional information that the previous world state W provides about the next world state W , given that the current actuator state A is known.The second quantification, MC MI , compares the behavior and controller complexity to determine the amount of MC.
The numerical results of the two quantifications confirm our hypothesis that the MusFib model should show significantly higher MC, compared to the two other models (MusLin, DCMot).We also showed that a state-dependent analysis of MC leads to additional insights.Here we see that the nonlinear muscle model is capable of showing significantly more morphological computation in the stance phase, compared to the flight phase, during which the behavior is only determined by the interaction of the body and environment.This shows that morphological computation is not only behavior-, but also state-dependent.Future work will include the analysis of additional behaviors, such as walking and running, for which we expect, based on the findings of this work, to see a more morphological computation of the non-linear muscle model MusFib.
Note that we use a compressed notion in Alg.2-5, in which x = x(t + 1) and x = x(t).
Algorithm 1 Discretisation of the data.This part is the same for all measures, depending on which time series are required.The min and max were determined of the data of all hopping models.Require: t = 1, 2, . . ., T Require: time series y = (y(t)), ẏ = ( ẏ(t)), ÿ = (ÿ(t)), a = (a(t)), s = (s(t)), t = 1, 2, . . ., T Require: Number of bins B x for time series x

Figure 1 :
Figure 1: Causal model of the sensorimotor loop.This figure depicts a single step of an embodied, reactive system's sensorimotor loop.A detailed explanation is given in Sec. 2

Figure 2 :
Figure 2: Hopping models.The MusFib model considers the non-linear contraction dynamics of active muscle fibers and is driven by a mono-synaptic force-feedback reflex.The MusLin model only differs in the contraction dynamics, where the force-length relation is neglected and the forcevelocity relation is approximated linearly.The DCMot model generates the leg force with a DC-motor.It is controlled by a proportional-differential controller (PD), enforcing the desired trajectory.The desired trajectory is the recorded trajectory from the MusFib Model.The sensor signals are shown as blue arrows, the actuator control signals are shown as green arrows.In case of the muscle models, the sensor signal is the muscle force F M and the actuator control signal is the neural muscle stimulation u.In case of the DC-motor model, the sensor signals are the position and velocity of the mass, and the actuator control signal is the motor armature voltage u DC .

Figure 3 :
Figure 3: Comparison of state-dependent MC for MC W on the three hopping models.The lower plots in each Figure visualize the hopping position (out of proportion for better visibility).The red line indicates stance and flight phases.The full data is shown in Fig.4.The plots only show a small fraction of the recorded data.For better readability, all the plots for MC are smoothened with a moving average of block size 5.

Figure 4 :
Figure 4: Plots of state-dependent MC (first two rows) and the world state (following four rows) for all muscle models.The red line in the position plot indicates the time steps at which the hopper touches ground (position is below the red line).