# HUMAN-LIKE ADVANCES IN ROBOTICS: MOTION, ACTUATION, SENSING, COGNITION AND CONTROL

EDITED BY : Tadej Petric, Kosta Jovanovic, Toshiaki Tsuji and Calogero Maria Oddo PUBLISHED IN : Frontiers in Neurorobotics

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-265-7 DOI 10.3389/978-2-88963-265-7

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# HUMAN-LIKE ADVANCES IN ROBOTICS: MOTION, ACTUATION, SENSING, COGNITION AND CONTROL

Topic Editors:

Tadej Petric, Jožef Stefan Institute (IJS), Slovenia Kosta Jovanovic, University of Belgrade, Serbia Toshiaki Tsuji, Saitama University, Japan Calogero Maria Oddo, Sant'Anna School of Advanced Studies, Italy

Citation: Petric, T., Jovanovic, K., Tsuji, T., Oddo, C. M., eds. (2019). Human-Like Advances in Robotics: Motion, Actuation, Sensing, Cognition and Control. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-265-7

# Table of Contents

*04 Editorial: Human-Like Advances in Robotics: Motion, Actuation, Sensing, Cognition and Control*

Kosta Jovanović, Tadej Petrič, Toshiaki Tsuji and Calogero Maria Oddo

*07 Hebbian Plasticity in CPG Controllers Facilitates Self-Synchronization for Human-Robot Handshaking*

Melanie Jouaiti, Lancelot Caron and Patrick Hénaff


Luca Massari, Calogero M. Oddo, Edoardo Sinibaldi, Renaud Detry, Joseph Bowkett and Kalind C. Carpenter

*47 Development, Analysis, and Control of Series Elastic Actuator-Driven Robot Leg*

Chan Lee and Sehoon Oh

*68 A Variable Stiffness Actuator Module With Favorable Mass Distribution for a Bio-inspired Biped Robot*

David Rodriguez-Cianca, Maarten Weckx, Rene Jimenez-Fabian, Diego Torricelli, Jose Gonzalez-Vargas, M.Carmen Sanchez-Villamañan, Massimo Sartori, Karsten Berns, Bram Vanderborght, J. Luis Pons and Dirk Lefeber

*80 Assistive Arm-Exoskeleton Control Based on Human Muscular Manipulability*

Tadej Petrič, Luka Peternel, Jun Morimoto and Jan Babič

*90 Human-Inspired Online Path Planning and Biped Walking Realization in Unknown Environment*

Mirko Raković, Srdjan Savić, José Santos-Victor, Milutin Nikolić and Branislav Borovac


Udaya Bhaskar Rongala, Alberto Mazzoni, Marcello Chiurazzi, Domenico Camboni, Mario Milazzo, Luca Massari, Gastone Ciuti, Stefano Roccella, Paolo Dario and Calogero Maria Oddo

# Editorial: Human-Like Advances in Robotics: Motion, Actuation, Sensing, Cognition and Control

Kosta Jovanovic´ 1 , Tadej Petricˇ 2 \*, Toshiaki Tsuji <sup>3</sup> and Calogero Maria Oddo<sup>4</sup>

<sup>1</sup> School of Electrical Engineering (ETF), University of Belgrade, Belgrade, Serbia, <sup>2</sup> Department for Automation, Biocybernetics, and Robotics, Jožef Stefan Institute, Ljubljana, Slovenia, <sup>3</sup> Tsuji Laboratory, Department of Electrical and Electronic Systems, Saitama University, Saitama, Japan, <sup>4</sup> The Biorobotics Insititute, Scuola Superiore Sant'Anna, Pisa, Italy

Keywords: motor control, human-like perception and cognition, human-like sensing and actuation, musculoskeletal robots, digitalization of human

**Editorial on the Research Topic**

**Human-Like Advances in Robotics: Motion, Actuation, Sensing, Cognition and Control**

# HUMAN-LIKE ADVANCES IN ROBOTICS

Robots have significantly contributed to the quality of human lives by alleviating exhausting, repetitive and monotonous industry jobs in untidy and risky environments. They have increased the efficiency of logistics in warehouses and factories. Inspection of infrastructures has been advanced by robotics. Just now, robots contribute to state-of-the-art surgery, rehabilitation, and medical diagnostics. However, one of the robotics markets with greatest growth expectations home and service robotics (Litzenberger, 2018), apart for specialized devices such as cleaning robots or cooking assistants, is still struggling to find and adopt a robot which could respond to most challenges and demands in our homes. Our homes are unstructured, with space- and time- variant environments adapting to human needs and commodity. In order to fit such designed humancentered environment, service robots of tomorrow might be human-like machines. Such service robots could not only look like humans, but move and behave like humans which means that they also have to sense and act in a human manner. To that end, understanding biomechanics and sensorimotor control of humans has been a subject of scientific research for centuries. Making an artificial human is a dream of humanity even longer. In modern history this is demonstrated, as an example, by the Mechanical Turk, a human-like machine developed in 1770 by Wolfgang von Kempelen at the House of Habsburg which apparently was automatedly playing chess but was actually operated, through elegant mechanisms, by a chess master hidden inside. The Mechanical Turk has been a source of inspiration for scientists, popular culture, and the business sector, as confirmed by the Amazon Mechanical Turk web-based crowd work platform. This dream to reproduce human capabilities, combined with market needs, in the last few decades, resulted in the rapid progress of humanoid robotics as a scientific discipline. Consequently, bionics of human sensing capabilities, human motion performances, and human behavior patterns, is perceived as an essential part of robotics research.

Working in the field of humanoid robotics we all face typical questions:

• Could we design engineering counterparts for corresponding body parts which have been mastered through human evolution?

Edited and reviewed by: Florian Röhrbein, Technical University of Munich, Germany

> \*Correspondence: Tadej Petricˇ tadej.petric@ijs.si

Received: 30 August 2019 Accepted: 30 September 2019 Published: 16 October 2019

#### Citation:

Jovanovic K, Petri ´ c T, Tsuji T and ˇ Oddo CM (2019) Editorial: Human-Like Advances in Robotics: Motion, Actuation, Sensing, Cognition and Control. Front. Neurorobot. 13:85. doi: 10.3389/fnbot.2019.00085

• Do we really need to strive for faithful engineering replicas of a human body, the way it moves and behaves, or we could just undertake engineering design of robot parts which resemble key functionalities of humans?

# THE RESEARCH TOPIC ON HUMAN-LIKE ROBOTICS

This Research Topic brings theoretical and experimental findings and outlines guidelines to research activities in the field of human-like robotics. It aims at giving insights in the latest related scientific investigations and at presenting some samples of the current level of developed technologies importing those concepts in robotics with a science for robotics and robotics for science methodology (Yang et al., 2016). The main research lines addressed are:


• Cognition and behavior patterns of humans: as the most complex system to be engineered which cannot distinguish and separate the morphological shape, experience and intelligence (Pfeifer and Bongard, 2006), but to do comprehensively research and investigation on how to define and develop the "brain" for future robots through the interaction of perception, control, learning, and cognition as summarized in the work of Li et al. (2019); the papers of the Research Topic mainly pertaining to this cluster present a neurocomputactional architecture for spike-based tactile encoding-decoding of surface features (Rongala et al.), a path planning and walking strategy for humanoids (Rakovic et al. ´ ), and a central patter generator with Hebbian plasticity for human-robot interaction (Jouaiti et al.).

According to these categories, a possible clustering of the papers published within the present Research Topic is proposed in **Figure 1**.

# CONCLUDING REMARKS

The elaborated topics will lead us to a current answer on how far away state of the art humanoids are from humans in terms of mechanics, intelligence, and communication (Fukuda et al., 2001). Papers presented in the Research Topic depict useful pieces of research toward an ultimate goal of all roboticists: to build a fully functional autonomous robot which matches shape and performance of a human in a humancentered stochastic and unstructured environment. Since the Research Topic spans through numerous scientific disciplines, multidisciplinary research activities and variety of engineering and scientific issues, it is not possible to gather all results which will give a comprehensive insight into it. However, we should remember the words of Isaac Asimov in his novel Robots of Down: "A knotty puzzle may hold a scientist up for a century, when it may be that a colleague has the solution already and is not even aware of the puzzle that it might solve."

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## REFERENCES


### ACKNOWLEDGMENTS

We thank all authors and reviewers contributing with their work to this Research Topic. Guest editor CO thanks Dr. Alberto Mazzoni for meaningful conversations.


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Jovanovi´c, Petriˇc, Tsuji and Oddo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Hebbian Plasticity in CPG Controllers Facilitates Self-Synchronization for Human-Robot Handshaking

Melanie Jouaiti <sup>1</sup> \*, Lancelot Caron<sup>2</sup> and Patrick Hénaff 1,2

<sup>1</sup> Université de Lorraine, CNRS, Inria, LORIA, Nancy, France, <sup>2</sup> Information and Systems Department, Ecole Nationale Supérieure des Mines de Nancy, Nancy, France

It is well-known that human social interactions generate synchrony phenomena which are often unconscious. If the interaction between individuals is based on rhythmic movements, synchronized and coordinated movements will emerge from the social synchrony. This paper proposes a plausible model of plastic neural controllers that allows the emergence of synchronized movements in physical and rhythmical interactions. The controller is designed with central pattern generators (CPG) based on rhythmic Rowat-Selverston neurons endowed with neuronal and synaptic Hebbian plasticity. To demonstrate the interest of the proposed model, the case of handshaking is considered because it is a very common, both physically and socially, but also, a very complex act in the point of view of robotics, neuroscience and psychology. Plastic CPGs controllers are implemented in the joints of a simulated robotic arm that has to learn the frequency and amplitude of an external force applied to its effector, thus reproducing the act of handshaking with a human. Results show that the neural and synaptic Hebbian plasticity are working together leading to a natural and autonomous synchronization between the arm and the external force even if the frequency is changing during the movement. Moreover, a power consumption analysis shows that, by offering emergence of synchronized and coordinated movements, the plasticity mechanisms lead to a significant decrease in the energy spend by the robot actuators thus generating a more adaptive and natural human/robot handshake.

#### Edited by:

Tadej Petric, Jožef Stefan Institute (IJS), Slovenia

#### Reviewed by:

Ralf Der, Leipzig University, Germany Nicolas P Rougier, Université de Bordeaux, France

> \*Correspondence: Melanie Jouaiti melanie.jouaiti@loria.fr

Received: 09 February 2018 Accepted: 17 May 2018 Published: 08 June 2018

#### Citation:

Jouaiti M, Caron L and Hénaff P (2018) Hebbian Plasticity in CPG Controllers Facilitates Self-Synchronization for Human-Robot Handshaking. Front. Neurorobot. 12:29. doi: 10.3389/fnbot.2018.00029 Keywords: physical human robot interaction, hebbian learning, central pattern generator (CPG), adaptive behavior, handshaking, plasticity, neural oscillators

# 1. INTRODUCTION

For humans, physical and social interpersonal interactions induce gestural and verbal/non-verbal communications based on rhythmic mechanisms and rhythmic movements. These mechanisms and the associated synchronization phenomena (limit cycles and clamping) could play a fundamental role in physical and social interpersonal interactions (Troje et al., 2006; Yonekura et al., 2012) and could be an emergent feature of the physical and social interactions between humans who adapt to each other and learn from each interaction, generating synchronization phenomena and creating conscious or unconscious links between people (Delaherche et al., 2012). Scientists assume that emotional and social interactions involve a coupling between individuals which is achieved thanks to neural structures with similar properties as those implicated in the neural control of movements. For example, coordination of oscillatory motions between two individuals (two distinct brains) obeys the same rules as for inter-limb coordination within a single individual (single brain) (Schmidt et al., 1990; Tognoli et al., 2007). Thus, distinct individuals can spontaneously interact and successfully perform coordinated actions through an exchange of information by means of their sensorimotor, cognitive and social underpinnings.

In humans and animals, rhythmic movements rely on universal sensory-motor mechanisms (Cruse et al., 1998; Cattaert and Le Ray, 2001; Zehr et al., 2004) and result from learning processes implying chaotic neural oscillators in central pattern generators (CPGs). CPGs endowed with plasticity rules allowing for synchronization with the control body (Shadmehr, 2010), are also implied both in the generation of discrete and rhythmic movements (Grillner, 2006).

In human interactions, handshaking is an important and universally social function allowing social introduction in various contexts, regulating and maintaining human interactions (Schiffrin, 1974; Hall and Spencer Hall, 1983; Bernieri and Petty, 2011; Giannopoulos et al., 2011) but it can also provide information on the health and emotional state of a person (Chaplin et al., 2000), which could be useful for assistive robotics. It is a multimodal physical interaction, socially common but complex to reproduce with a humanoid robot because it involves fine and complex movement coordination which engages the body and gaze throughout the act: from the preparation to the contact, the locking, the rhythmic and synchronized movement until the withdrawal of the hands (Walker et al., 2013). How synchronized motion of two humans arms is established and maintained is still an open question but some aspects have been studied in the movement science and neuroscience fields, such as reaching hands (Lee, 1976; Bastin et al., 2006) and interpersonal synchronization tasks (Oullier et al., 2008; Dumas et al., 2014).

From a neuroscience point of view, handshaking implies interpersonal motor coordination and recent research showed that it also induces the synchronization of the brain activity of both partners Tognoli et al. (2007). Therefore, it can be considered as a paradigm for social and physical interactions, in particular because its multimodality is based on physical and social clamping of rhythmic movements. Consequently, if we want humanoid robots to be able to interact properly with humans, i.e., in a socially acceptable way, shaking hand with humans like a human is an interesting challenge (Der and Martius, 2017). It is then necessary to design bio-inspired robot controllers able to produce rhythmic movements and trigger the emergence of a synchronization in an interaction such as the handshaking gesture. One possible way to achieve this consists in designing robot controllers which are intrinsically rhythmic, such as CPGs, but which also incorporate synchronization learning abilities similarly to the plasticity mechanisms involved in the human motor nervous system for rhythmic movement production.

Several models of CPGs have been proposed for many years in order to understand human and animal motor control mostly aiming at locomotion control in robotics (Ijspeert, 2008; Yu et al., 2014; Nachstedt et al., 2017). CPGs are neuronal structures located in the spinal cord and able to generate rhythmic and discrete activities that can be initiated, modulated and reset by different kinds of signals: descendant signals from high level structures located in the MLR (mesencephalic locomotor region) (Grillner, 2006; Rossignol et al., 2006; Harris-Warrick, 2011) or afferent sensory feedbacks coming from low levels of the body (proprioceptive) or from the environment (exteroceptive) (Marder and Calabrese, 1996; Pearson, 2004). Different levels of CPG modeling exist from the microscopic level (called also biophysical model) that takes into account many details in the biophysical operation of the neurons like the famous Hodgkin-Huxley model (Hodgkin and Huxley, 1952), to the macroscopic level that tries to reproduce the functionality of a population of neurons using non-linear oscillators like Van der Pol (Rowat and Selverston, 1993; Low et al., 2006), Rayleigh (Mottet and Bootsma, 1999), or Hopf (Righetti and Ijspeert, 2006; Nachstedt et al., 2017).

Between the microscopic and macroscopic levels of modeling, there exists an intermediary level, called mesoscopic level, which takes a more realistic biological inspiration but is sufficiently simplified to study the sensorimotor couplings, oscillation properties and learning mechanisms involved in the control of rhythmic tasks. These models are usually based on a pair of two mutually inhibitory oscillating neurons thus creating a CPG, called half-center (Grillner and Wallen, 1985), divided into two parts controlling the extensor and flexor muscles.

The model of half-center CPG for mammal locomotion by McCrea and Rybak (Rybak et al., 2006) takes inspiration from biological structures, such as the rhythmic layer, modulating layer, interneurons, sensory neurons, etc. Its architecture is divided into three layers: Rhythm Generator layer (composed of an inhibitory pair of oscillatory neurons), Pattern Formation layer (composed of inter-neurons) and Motor layer (composed of Motoneurons). It also takes afferent (proprioceptive) and efferent (exteroceptive) sensory feedbacks into account. While this model has been widely used for locomotion (Amrollah and Henaff, 2010; Spardy et al., 2011; Nassour et al., 2014; Danner et al., 2016; Nachstedt et al., 2017), very few works apply it to the control of upper limb movements: to our knowledge, only Teka et al. (2017) used it to study the reaching movement.

Non-linear oscillator models (also called relaxationoscillators) can be used for oscillating neurons in CPGs because they can synchronize effortlessly with an external signal provided the frequency of this signal is not too different from the intrinsic frequency of the oscillator (Pikovsky et al., 2003; Petricˇ et al., 2011). Thus, non-linear oscillators are suitable models to explain and reproduce the synchrony phenomena that emerge in interpersonal coordination, especially if they are implemented at the rhythmic level of a CPG. In this case, by acting like a dynamic attractor, they facilitate the self-synchronization of the CPG with the dynamic of the limb controlled by the CPG.

During the production of movement coordination, the Matsuoka oscillating neuron model (Matsuoka, 1987) exhibits the behavior of a non-linear oscillator and self-synchronization. This model has been used extensively in robotic locomotion or human motor control modeling (Taga et al., 1991; Taga, 1998; Kasuga and Hashimoto, 2005; Degallier and Ijspeert, 2010; Yu et al., 2014; Avrin et al., 2017a,b). However, the main problem of the Matsuoka model is that it cannot produce discrete as well as rhythmic activities as mentioned in Degallier and Ijspeert (2010). Indeed, it is now known that, in motor control, discrete and rhythmic movements are generated by networks of spinal neurons (Grillner, 2006; Degallier et al., 2011). Consequently, in order to be biologically plausible, a CPG model must be able to produce both discrete and rhythmic activities, just like what has been observed in biological neurons implied in locomotion production (Marder and Bucher, 2001). Therefore, CPGs must include oscillating neurons able to operate in discrete and rhythmic modes depending on one or several parameters. Unfortunately, although the Matsuoka model is a non linear oscillator, its nonlinearity is not controllable, meaning the model doesn't have a nonlinear parameterizable function allowing different nonlinear behaviors.

The Rowat-Selverston oscillating neuron model (Rowat and Selverston, 1993) is able to produce discrete and rhythmic activities depending of two parameters as it has been demonstrated in Amrollah and Henaff (2010) and Nassour et al. (2014). However, only a few studies make use of it (Arikan and Irfanoglu, 2011). The Rowat-Selverston oscillating neuron is a generalized Van der Pol oscillator and consequently all known properties of the Van der Pol can be applied to it, especially the dynamic Hebbian learning of frequency introduced by Righetti et al. (2006).

The first originality of this article is to implement Hebbian mechanisms proposed by Righetti et al. (2006), in a bio-inspired CPG, which we previously used for biped locomotion (Nassour et al., 2014), enabling it to learn to synchronize with an external signal. The second originality resides in using this plastic CPG to control a simulated robotic arm which has to learn to synchronize its oscillatory movements with the frequency of an external force applied to its effector, thus reproducing the act of handshaking with a human.

In the first part, we explain how dynamic plasticity is integrated in our CPGs and present the design of our robot controller. In the second part, we validate our model by applying it to the command of a robotic arm interacting physically rhythmically in simulation. We show that the controller learns to synchronize with the imposed rhythm in a given frequency range matching the usual frequencies of handshaking. We also demonstrate the importance of plasticity to achieve fast and stable coordination. In the fourth part, we discuss our results and future prospects.

#### 2. MATERIALS AND METHODS

This section presents the plasticity mechanisms implemented in the neurons of the CPG and finally, the design of the CPG-based controller.

#### 2.1. Dynamic Plasticity in CPGs Based on Rowat-Selverston Neurons

As mentioned above, a non-linear oscillator has the property of self-synchronization with an oscillating external signal applied as its input, provided the frequency of this signal is close enough to the intrinsic frequency of the oscillator. Implementing frequency learning mechanisms inside a CPG would allow to synchronize its rhythmic activity with the external signal even if the frequency of this signal is significantly different from the intrinsic one of the CPG (Ijspeert, 2008; Yazdani et al., 2017). Therefore, the CPG could synchronize with the movements, triggering the emergence of a global coordination between the limbs (Degallier and Ijspeert, 2010). Righetti et al. (2006) proposed such a frequency learning model for a Van der Pol oscillator called Dynamic Hebbian learning. This section demonstrates the application of this idea to the Rowat-Selverston oscillating neuron model.

#### 2.1.1. Recall of Righetti's Model for Dynamic Hebbian Learning Into Van der Pol Oscillators

The free form (i.e., without any input signal applied) of the Van der Pol oscillator can be written as :

$$\begin{aligned} \dot{\mathbf{x}} &= \mathbf{y} \\ \dot{\mathbf{y}} &= -\alpha \left(\mathbf{x}^2 - p\right) \mathbf{y} - \omega^2 \mathbf{x} \end{aligned} \tag{1}$$

where y is the output of the oscillator, p amplitude of y, α controls the degree of nonlinearity of the system and ω mainly influences the frequency of the oscillator.

When the Van der Pol oscillator is forced by an oscillating input signal F(t) the model can be written as:

$$\begin{aligned} \dot{x} &= \dot{\nu} + \epsilon F\\ \dot{\nu} &= -\alpha \left(\dot{x}^2 - p\right)\nu - \alpha^2 x \end{aligned} \tag{2}$$

where ǫ can be seen as a gain or a weight.

In order to synchronize the oscillator with the input F(t) (see Righetti et al., 2006 for details), proposed to learn the frequency of the oscillator following a Hebbian learning rule :

$$
\dot{\phi} = \epsilon F \frac{\mathcal{Y}}{\sqrt{\mathbf{x}^2 + \mathbf{y}^2}} \tag{3}
$$

They showed that this rule allows the oscillator to change its intrinsic frequency to synchronize with the oscillating signal F(t). The oscillator preserves the learned frequency, even after the input signal is cut. It has been applied to the Hopf oscillator and the Fitzhugh-Nagumo oscillator.

#### 2.1.2. Van der Pol Form of Rowat-Selverston Neuron

The free form of the Rowat-Selverston model of a cellular neuron is described by the equations (see Rowat and Selverston, 1993 for details):

$$\pi\_m \dot{V} + V - A\_f \tanh\left(\frac{\sigma\_f}{A\_f} V\right) + q = 0 \tag{4a}$$

$$
\pi\_{\mathfrak{s}} \dot{q} = -q + \sigma\_{\mathfrak{s}} V \tag{4b}
$$

with V being the cellular membrane potential, q the slow current, τ<sup>m</sup> the time constant of the cellular membrane, τ<sup>s</sup> is the time constant of slow current activation (τ<sup>m</sup> ≪ τs), σ<sup>s</sup> and σ<sup>f</sup>

represent respectively the conductance of slow and fast currents, Af influences the amplitude of V.

Because Rowat-Selverston is a generalized Van der Pol oscillator, its equations can be rewritten in a Van der Pol form such as in Equation (1). To do that, Equation (4a) can be differentiated, and q˙ replaced by the expression given in Equation (4b).

We can thus obtain a new expression of the unforced Rowat-Selverston oscillator. In order to identify a Righetti learning rule in the Rowat-Selverston neuron model, we must liken this model to a Van der Pol oscillator expressed by Equation (1). To do that, we approximate the tanh function to a linear one, tanh(x) ≈ x, thus yielding:

$$\tau\_m \dot{V} + \left(\frac{\tau\_m}{\tau\_s} + 1 - \sigma\_f + \frac{\sigma\_f^3}{A\_f^2} V^2\right) \dot{V} + \frac{1 + \sigma\_s}{\tau\_s} V - \frac{\sigma\_f}{\tau\_s} V = 0 \tag{5}$$

We're well aware that approximating tanh(x) to x may seem farfetch and exceedingly inaccurate. Here, we are only trying to identify a Hebbian rule and experiments validate our attempt. It may very well be that, other rules, based on other far-fetched assumptions, are valid too.

By setting, V˙ = y, we can transform the model into the following unforced Van der Pol form see the Appendix for the detailed calculations:

$$\begin{split} \dot{V} &= \chi\\ \dot{\nu} &= \frac{-\sigma\_f^3}{\mathfrak{r}\_m A\_f^2} \left( V^2 - \frac{A\_f^2 (\sigma\_f \mathfrak{r}\_s - \mathfrak{r}\_m - \mathfrak{r}\_s)}{\mathfrak{r}\_s \sigma\_f^3} \right) \nu - \frac{1 + \sigma\_s - \sigma\_f}{\mathfrak{r}\_s \mathfrak{r}\_m} V \end{split} \tag{6}$$

By comparing this equation to Equation (1), we can finally identify the Van der Pol parameters ω, α and p of the unforced Rowat-Selverston oscillating neuron:

$$\omega = \sqrt{\frac{1 + \sigma\_s - \sigma\_f}{\tau\_s \tau\_m}}; \qquad \alpha = \frac{\sigma\_f^3}{\tau\_m A\_f^2};$$

$$p = \frac{A\_f^2 (\tau\_s(\sigma\_f - 1) - \tau\_m)}{\tau\_s \sigma\_f^3}; \text{with } \sigma\_f < 1 + \sigma\_s \tag{7}$$

#### 2.1.3. Implementing Dynamic Hebbian Learning Into the Rowat-Selverston Neuron

When an external signal F(t) is applied to the Rowat-Selverston oscillating neuron, the neuron potential V becomes:

$$
\dot{V} = \dot{\nu} + \epsilon F \tag{8}
$$

where the gain ǫ can be considered like a synaptic weight. Thus, the principle of Hebbian dynamic rule proposed by Righetti et al. (2006) can be applied on the parameters of the Rowat-Selverston model to learn the frequency of F(t).

As shown in Rowat and Selverston (1993), the frequency of the neuron oscillations depends only on τm, τ<sup>s</sup> , σ<sup>f</sup> , and σ<sup>s</sup> : if σ<sup>f</sup> is fixed above a given threshold θ<sup>f</sup> = 1 + τm τs ≈ 1 (τ<sup>m</sup> ≪ τs), σ<sup>s</sup> controls two modes depending on another threshold θ<sup>s</sup> . If σ<sup>s</sup> < θ<sup>s</sup> , there are no oscillations [intrinsic mode called "plateau potentials" in Marder and Bucher (2001)]. On the other hand, for σ<sup>s</sup> > θ<sup>s</sup> , the neuron produces a rhythmic signal [intrinsic mode called "endogenous bursting" in Marder and Bucher (2001)] whose frequency depends on τm, τ<sup>s</sup> , and σ<sup>s</sup> .

Following the idea of Righetti et al. (2006), we propose to implement dynamic Hebbian learning of the oscillations frequency by learning σ<sup>s</sup> depending on the signal F(t) applied to the neuron and weighted by ǫ. Thus, neural plasticity for frequency learning can be obtained by deriving the expression of ω 2 from 7 :

$$\dot{\sigma}\_s = 2\dot{\alpha}\omega\sigma\_m\tau\_s = 2\dot{\alpha}\sqrt{\tau\_m\tau\_s}\sqrt{1+\sigma\_s-\sigma\_f} \tag{9}$$

By applying the dynamic Hebbian learning rule proposed by Righetti et al. (2006) to Equation (3), we obtain :

$$\dot{\sigma\_s} = 2\epsilon F \sqrt{\tau\_m \tau\_s} \sqrt{1 + \sigma\_s - \sigma\_f} \frac{\mathcal{Y}}{\sqrt{V^2 + \mathcal{Y}^2}}; \qquad \sigma\_f < 1 + \sigma\_s \tag{10}$$

We can see that this learning rule depends on the CPG time constants. The presence of σ<sup>s</sup> on the right side of the equation, makes it a closed loop ensuring that the end value of σ<sup>s</sup> does not depend on its initial value.

#### 2.1.4. Plasticity for Afferent and Efferent Signals

Additionally, to improve the control realized by the CPG, we propose to learn the amplitude of neuronal oscillations by learning A<sup>f</sup> depending on F(t), and to maintain the strength of sensitivity of F(t) efficient enough with a learning mechanism of ǫ.

#### **2.1.4.1. Neuronal Plasticity for Amplitude Learning**

A<sup>f</sup> determines the amplitude of the output of the CPG (efferent signal) and thus the amplitude of velocity orders applied to the motors. When A<sup>f</sup> is high, σ<sup>s</sup> will oscillate globally before reaching stability. In Equation 4, the expression A<sup>f</sup> tanh( <sup>σ</sup><sup>f</sup> Af V) influences the amplitude of V and consequently the CPG output. If the amplitude is too big, the CPG becomes unstable due to the rapid switchings of the sigmoid function of interneurons located in the pattern formation layer, and if it is too small, the output of the CPG doesn't have enough energy. Adapting the amplitude of the neuron oscillations in accordance with the applied signal F(t) could solve that. One solution consists in minimizing the error between the quadratic values of F(t) and the argument of tanh() in equation 4 to match the amplitude of V with F(t) :

$$\dot{A}\_f = -\mu \left( \left( \upsilon \frac{\sigma\_f V}{A\_f} \right)^2 - F^2 \right) \tag{11}$$

where ν is a scale factor and µ a learning step.

The presence of A<sup>f</sup> in the equation makes it a closed loop, guarantying the same end value for A<sup>f</sup> no matter the initial value. Empirically, we found that 20 was the best value for ν. Since A<sup>f</sup> is not a constant any more, its derivative should appear in equation 5. This case was studied and the same result with an additional term V 3σ 3 f A 3 f A˙ <sup>f</sup> was obtained. A˙ <sup>f</sup> being extremely small, this last term can be neglected and thus, yields the same result. So this case won't be detailed here any further.

Frontiers in Neurorobotics | www.frontiersin.org

#### **2.1.4.2. Synaptic Plasticity for Sensitivity Learning**

ǫ acts like a learning step for σ<sup>s</sup> (local oscillation) and determines how much σ<sup>s</sup> will oscillate when a new input signal is applied before reaching stability. As a consequence, a small ǫ will allow for a more robust and stable learning but will require more time to reach stability, if the interaction doesn't last long enough, it may never be reached. On the other hand, a large ǫ will lead to a more unstable learning but the final σ<sup>s</sup> may be reached. The parameter ǫ also acts as a synaptic weight to the afferent signal F(t) that feedbacks to the CPG. In Equation 10, we're really only interested in the frequency of F(t) and its magnitude is not relevant.

So, ǫ can be considered as a synaptic weight that could enable the CPG to better sense the external signal F(t) by normalizing it to magnitude 1. Besides, it was empirically determined that if ǫF is too small (< 1), σ<sup>s</sup> changes are too slow and may never reach a stable final value and when ǫF is too big (> 1), σ<sup>s</sup> becomes unstable. Optimal results are obtained when the amplitude of product ǫF equals 1. From there, a learning equation of ǫ can be based also on an error of quadratic values pondered by a variable gain that limits extreme values of F(t):

$$
\dot{\epsilon} = \lambda \tanh(\xi F) \left( 1 - (\epsilon F)^2 \right) \tag{12}
$$

with ξ an empirically determined gain ensuring that the term inside the tanh is big enough (in our case, ξ = 100 yields good results). This term guaranties that learning occurs only when F(t) is not zero.

Here, it could be argued that there is no need for learning ǫ, that manually determining the optimal value of ǫ beforehand would be sufficient. By all means, this could be done but the system would be less versatile and this would be ignoring the fact that the amplitude of the input varies over time. Even if the amplitude seldom varies so drastically, so that the ǫ wouldn't be valid any more, it isn't the optimal value for ǫ and the system could be performing better, especially if the input signal varies over time. In that case, ǫ would be suitable for a range of frequencies but if the frequency becomes too low or too high, the system won't behave as expected, thus requiring an adaptive ǫ.

#### 2.2. Designing the CPGs-Based Controller

An architecture based on CPGs is designed, according to the McCrea and Rybak model, to control a robot interacting physically with a human partner. The robot is a Mico robotic arm from Kinova company (**Figure 1**). One CPG controls the joint motor by applying velocity orders (efferent signals) and receives proprioceptive feedbacks (afferent signals) from the joint: torque and velocity (**Figure 1**). The equations for the generic CPG are the following, with <sup>i</sup> <sup>∈</sup> <sup>N</sup>, designating the joint id.

For the coupled Rhythm Generator cells:

$$\dot{V}\_{i[\![E,\!F]}} = \mathcal{Y}\_{i[\![E,\!F]}} - \dot{W}\_{inhib} \frac{\mathcal{Y}\_{i[\![E,\!F]}}}{1 + e^{-4\mu\_{\{\![E,\!F]}}}} + \epsilon\_{i[\![E,\!F]} F\_{i}} \tag{13}$$

$$\dot{\mathcal{Y}}\_{i[\![E,\!F]}} = \frac{1}{\tau\_{\mathfrak{m}}} \left(\sigma\_{f} - \frac{\tau\_{\mathfrak{m}}}{\tau\_{\mathfrak{s}}} - 1 - \sigma\_{f} \tanh^{2} \left(\frac{\sigma\_{f}}{A\_{f\_{i}}} V\_{i[\![E,\!F]} \right) \right) \mathcal{Y}\_{i[\![E,\!F]}} \tag{14}$$

$$-\frac{1 + \sigma\_{s\_{i[\![E,\!F]}}}{\tau\_{\mathfrak{s}} \tau\_{\mathfrak{m}}} V\_{i[\![E,\!F]} + \\ \tag{14}$$

$$\frac{A\_{f\_{i[\![E,\!F]}}}{\tau\_{\mathfrak{s}} \tau\_{\mathfrak{m}}} \tanh \left(\frac{\sigma\_{f}}{A\_{f\_{i[\![E,\!F]}}} V\_{i[\![E,\!F]} \right)}$$

The term in Winhib models the mutual inhibition between the rhythmic cells for the extensor and the flexor.

The terms σs<sup>i</sup> {E,F} , Af<sup>i</sup> {E,F} , and ǫi{E,F} are defined by Equations (10–12) respectively.

Inter-neurons of pattern formation layer (neuron PF), sensory neurons (neuron SN) for afferent feedbacks and motoneurons (neurons MN) for efferent signals, are defined as a sigmoid function (Debnath et al., 2014; Nassour et al., 2014):

values (in red, 10; in blue, 100; in green, 300). The σs have not been distinguished because we're only interested in the tendency and not in the individual behaviors.

$$PF(V\_{i\_{\{E,F\}}}) = PF\_{i\_{\{E,F\}}} = \frac{1}{1 + e^{\frac{-V\_{i\_{\{E,F\}}}{2}}}} \tag{15}$$

$$\text{SNN}\_{\text{s}}(\nu\_{m\text{es}\_{i}}) = \text{SN}\_{i,\text{s}} = \frac{1}{1 + e^{\alpha\_{i}\nu\_{m\text{es}}}} \tag{16}$$

$$\text{MN(PF}\_{i\_{\{\mathbb{E},\mathbb{F}\}}}, \text{SN}\_{i,\text{s}}) = \text{MN}\_{i\_{\{\mathbb{E},\mathbb{F}\}}} = \frac{1}{1 + e^{\alpha\_m \left(PF\_{i\_{\{\mathbb{E},\mathbb{F}\}}} - \text{SN}\_{i,\text{s}}\right)}} \\ \text{ (17) }$$

With α<sup>s</sup> = −0.061342 and α<sup>m</sup> = 3. These coefficients were chosen to match the parameters of the robot. For instance, the sigmoid slope of the sensory neuron is determined by the range of values of the speed.

## 3. SIMULATION OF HUMAN-ROBOT HANDSHAKING: RESULTS

In this section, we will first present our results with a handshake simulation, then we will study the parameters influence and finally, we will demonstrate the importance of neuronal and synaptic plasticity.

The simulations have been run in the V-REP Simulation software with the Kinova Mico robotic arm. The V-REP simulator cannot realistically compute grasping with a human hand, so we simulate the handshaking gesture with a ball placed inside the gripper. The ball is defined as a static object not subjected to gravity that, unless stated otherwise, moves up and down according to a 2 Hz sinusoidal signal of amplitude 0.16 m. This frequency is coherent for handshaking according to previous experiments dedicated to the study on handshaking between humans (Tagne et al., 2016). Since both objects are collidable, the ball exerts a force on the fingers of the gripper, forcing the arm to move along (see **Figure 1**). Reaching and grasping details are irrelevant to this work and won't be detailed here.

The Mico arm has seven degrees of freedom, but Tagne et al. (2016) showed that arms are moving in the sagittal plan. In the current setup, only the shoulder and elbow (joints 2 and 3 of the Mico robot) are controlled for handshaking simulation, the five other joints are hence locked and unable to move. At the beginning of the simulation, the robot isn't subjected to any external force (other than gravity). The robotic arm raises toward the ball and grasps it. Then, by applying a sinusoidal signal to the ball, it must move in the vertical plane, thus applying a perturbation to the robotic arm. Finally, the ball is released and the interaction stops.

In all simulations here, the robotic arm raises toward the ball between t = 0 and t = 0.68s, then the interaction starts. The length of the interaction varies depending on the test conducted. Finally, when the ball is released, the behavior of the robotic arm is observed during ten more seconds before the simulation stops. Sensory feedbacks are taken into account during the whole process and are fed as an input to the CPG.

## 3.1. Role of Feedbacks and Mutual Inhibition on Plasticity

The choice of the parameters is a crucial step, when inappropriately chosen, the system may not behave as expected or the results may be subpar. So, in order to select the best parameters for the CPG, the role and influence of each parameter were studied.

To have an oscillating system, Rowat and Selverston (1993) determined that σ<sup>f</sup> > 1 + τm τs and a ratio τm/τ<sup>s</sup> of at least 10 is required. Actually, because of our newly derived learning rule (Equation (10)), we also require σ<sup>f</sup> < 1 + σ<sup>s</sup> , and best results are now achieved with σ<sup>f</sup> = 1. For greater σ<sup>f</sup> , the system is too unstable, and for smaller values, the learning of σ<sup>s</sup> slows down because the neurons are in non-rhythmic behavior.

#### 3.1.1. Inhibition Influence on Plasticity

We previously stated that the natural frequency of the oscillator is determined by τm, τ<sup>s</sup> , σ<sup>f</sup> and σ<sup>s</sup> only. However, the natural frequency also depends on W. The higher W, the lower the frequency, hence the higher σ<sup>s</sup> needs to be to compensate. **Figure 2** shows that the value of W influences the final value of σ<sup>s</sup> and below W = 0.05, the result is roughly the same. We can observe a slight demarcation for W = 0.05 and above. For W ≥ 1, the system isn't able to oscillate (Rowat and Selverston, 1997).

The initial value of σ<sup>s</sup> doesn't change the final value reached (see **Figure 2**). For very high or very low values, the final σ<sup>s</sup> may never be reached if the interaction does not last long enough.

#### 3.1.2. Effect of Afferent Sensory Feedbacks on Plasticity

Tests were carried out to determine which articular sensory information is best suited to our purpose and yields the best result in term of synchronization. **Figure 3** shows the comparison between articular position, articular velocity and articular torque as feedback. Position and velocity feedback offer very bad results. Both are neither able to adapt nor synchronize in spite of our best attempts to find better parameter values. Finally force feedback shows the best results. Furthermore, handshaking is a social gesture and as such, provides information about the interaction partner: firmness of grip, strength, vigor. This data can be used to infer personality traits (Chaplin et al., 2000) and can only be sensibly obtained from force feedback. So, the torque measured in the joint will be the afferent input of our CPG for synchronization.

## 3.2. Analyze of the Simulated Handshake

The simulation lasts 50 s. The interaction starts at t = 0.68s and lasts until t = 40s when the gripper opens and releases the ball. In this case, only frequency adaptation (σ<sup>s</sup> learning) is enabled, ǫ and A<sup>f</sup> remain constant. The parameters used for the simulation are as follows: ǫ = 0.02 for the shoulder CPG (joint 2), ǫ = 0.03 for the elbow CPG (joint 3), τ<sup>m</sup> = 0.35, τ<sup>s</sup> = 3.5, W = 0.005, σ<sup>f</sup> = 1.0 and A<sup>f</sup> = 0.05.

#### 3.2.1. Emergence of Synchrony in Handshaking

The simulated act of handshaking can be divided into four phases among which two specific phases appear showing the emergence of synchronization of movement during contact :


clearly visible on the plot. On the right, evolution of φ3E and φ3F during the experiment for the extensor and flexor of the second joint.

s can be observed to be perfectly synchronized and in phase (**Figure 4**). From then onwards, the torque amplitude stays mainly stable at 50 N.m for joint 2 and 25 N.M for joint 3, this shows that the arm learned the movement, it oscillates at the right frequency on its own and the ball isn't forcing on it so much. Besides, we can observe that the σ<sup>s</sup> also reach stability, from t = 12s onwards (by considering a response time at 5% of the final value). The σ<sup>s</sup> for both joints are now completely merged and stable around 192.

• **Withdrawal phase.** Finally the interaction stops at t = 40s and the ball is released so there isn't any force exerted on the arm. We can see that the arm goes on oscillating at the frequency learned during the interaction, though with a smaller amplitude. The σ<sup>s</sup> also remain stable, showing that the new value has indeed been learned. This oscillation could be stopped by setting the value of σ<sup>f</sup> below 1.

#### 3.2.2. Inter-limb Coordination

Inter-limb coordination can be observed thanks to φ<sup>E</sup> and φ<sup>F</sup> which represent the phase difference of the flexor and extensor motoneuron output, respectively of both CPGs (see **Figure 5**):

$$\phi\_{\{E,F\}} = \theta\left(V\_{\mathcal{Z}\_{\{E,F\}}, \mathcal{Y}\_{\mathcal{Z}\_{\{E,F\}}}}\right) - \theta\left(V\_{\mathcal{Z}\_{\{E,F\}}, \mathcal{Y}\_{\mathcal{Z}\_{\{E,F\}}}}\right) \tag{18}$$

with θ(V, y) the phase of the CPG:

$$\theta(V,\chi) = \text{sign}(V)\arccos(\frac{-\chi}{\sqrt{V^2 + \chi^2}}) \tag{19}$$

Both φ start at t = 0. Similarly, to our previous observations, during the transitory phase, φ3<sup>E</sup> increases while φ3<sup>F</sup> decreases. After that, the φ reach stability around π and −π, from t = 20s onwards and retain the same value after the interaction stops at t = 40s.

#### 3.2.3. Dynamic Stability of Synchronization

Dynamic stability of synchronization can be observed through the phase portrait of the CPGs (V-y) and robot articulations (angular velocity-angular position). On the CPG output phase portraits (see **Figure 6**), three different cycles can be observed. First, the starting cycle (most inner circle), when the rhythmic cells oscillate at their own intrinsic frequency. Then, the interaction cycle (most outer circle) when the human and the robot are interacting. Finally, the middle circle is the end cycle. On the other hand, on the position-velocity phase portraits we can clearly distinguish two cycles. The outer cycle corresponds to the interaction part, while the inner cycle is the "arm released" part. The cycle does not change in shape, but changes in size (due to the amplitude decreasing) when the arm is released. This cycles apparition shows that the system is stable, and thus the frequency is learned.

#### 3.3. Plasticity Leads to Frequency Adaptation

In this study, the frequency of the ball movement varies following Heaviside functions simulating different types of

FIGURE 8 | σs learns. Top figure represents the evolution of σs, φ and frequency of F. Below are the torque measured on the joint (red) and send velocity (blue), and at the bottom, the mechanical work provided by each joint.

human handshakes: 2 Hz between 0.68 and 35 s, then 1 Hz between 35 and 70 s, then 2 Hz between 70 and 100 s, finally 2.5 Hz between 100 and 120 s. To demonstrate the importance of frequency adaptation for synchronization, a first simulation was run without learning σ<sup>s</sup> (σ<sup>s</sup> would thus remain constant at 10), while a second was run with σ<sup>s</sup> learning enabled.

Results show that the GPG controller which doesn't learn σ<sup>s</sup> synchronizes with the perturbation signal thanks to its property of natural synchronization, but since it doesn't learn the new frequency, it doesn't reach stability, i.e. the system oscillates at the right frequency, but only because the interaction forces it to. The signals F and s are neither in phase nor in anti-phase, which would be stable regimen. This leads the system to always provide maximum effort throughout the whole interaction, and the force to be constantly saturated. This can be observed on the bottom of **Figure 7** by the important mechanical work provided by the joints.

On the contrary, when the system learns σ<sup>s</sup> , we can see that σs indeed adapts to each new frequency and we can observe that the torque and CPG output are synchronized and in phase (**Figure 8**). The decrease in force, which we previously witnessed in our simple handshaking experiment, occurs here too. In this case, the mechanical work provided by the joints (see bottom of **Figure 8**) is consequently much less important since the force happens to be saturated only during the transition phases.

#### 3.4. Plasticity Decreases the Energy Spend by the Robot

It is interesting to study the role of neuronal plasticity (σ<sup>s</sup> and Af ) and synaptic plasticity (ǫ), on the energy spend by the robot for synchronization. Since last section has shown the positive effects of learning σ<sup>s</sup> on the mechanical work provided by the motors, this section won't talk about σ<sup>s</sup> learning any more, which will be always enabled. So when employing the terms without plasticity, the reader shall understand without any plasticity (A<sup>f</sup> nor ǫ learning) but σ<sup>s</sup> learning.

Again, the frequency of the ball movement varies throughout the interaction: 2 Hz between 0.68 and 35 s, then 1 Hz between 35 and 70 s, then 2 Hz between 70 and 100 s, finally 2.5 Hz between 100 and 120 s. The parameter values are the same as in section 3.2.

Moreover, for the first simulation (without plasticity), A<sup>f</sup> = 0.05, ǫ = 0.01 for joint 2 and ǫ = 0.02 for joint 3. For the second simulation (with neuronal plasticity), <sup>λ</sup> <sup>=</sup> 2.10−<sup>3</sup> , <sup>µ</sup><sup>2</sup> <sup>=</sup> 5.10−<sup>6</sup> and <sup>µ</sup><sup>3</sup> <sup>=</sup> 8.10−<sup>6</sup> .

Results from simulations are evaluated first energetically and second by synchronization time. To calculate the power consumption of the system, we compute the work provided by each joint with following equation:

$$W = \sum\_{t} |F\_{t} \Delta \theta\_{t}| \tag{20}$$

The synchronization time is defined by the 5% response time for both σ<sup>s</sup> to reach the stability value, for each different frequency value.

First, it should be noticed from **Figure 9** that the system without plasticity doesn't do too well in the lower frequency 1 Hz. Indeed, after decreasing, the force increases again and the σ<sup>s</sup> of the different joints never merge. These two phenomena, which by the way are also to be found in the simulation with only A<sup>f</sup> plasticity, are due to the value of ǫ which, while suitable for the other frequencies, is too small to get good results at 1 Hz.

We can also see that the force applied on the joints of the system is much lower when plasticity is applied (**Figure 10**). The force also decreases faster, this can be correlated with the evolution of the σ<sup>s</sup> which is steeper during the transitions but slows down a little before reaching the new stability value. Those observations suggest that, although the synchronization times may appear similar in the **Table 1**, the

TABLE 1 | Comparison table for various simulations with or without neuronal plasticity.


τi is the time required so that σ<sup>s</sup> stabilizes at the ith frequency change.

transitory phase might be shorter, and hence synchronization faster, with plasticity. Furthermore, let us remark that, although the synchronization time appears much smaller without ǫ plasticity for 1 Hz, the validity of the measure for the other two cases could be discussed, since the σ<sup>s</sup> never merge.

In **Table 1**, W<sup>i</sup> represents the sum of the mechanical work provided by joint i during the simulation (synchronization times have been already explained so we won't dwell on the subject any further). We can see that learning the amplitude A<sup>f</sup> decreases the work noticeably for joint 2, but only slightly for joint 3. Besides, learning ǫ alone decreases the work further and the artifacts mentioned before disappear. Finally the association of both ǫ and Af learning yields the best results by virtually halving the original work value.

#### 4. DISCUSSION AND CONCLUSION

In this paper, we implemented Hebbian mechanisms in a bioinspired CPG, thus enabling it to learn to synchronize with an external signal. Furthermore, we used this plastic CPG to control a simulated robotic arm which had to learn to synchronize its oscillatory movements to the frequency of an external force applied on its effector.

We also underlined the relevance of force feedback, which not only yields much better results than velocity and position feedback but is also able to provide useful information, such as firmness of grip, strength, vigor. Such data, as evidenced in Chaplin et al. (2000) can be used to assess personality traits of the handshaking partner. This knowledge would allow the robot to adapt to different personalities (introvert, extrovert...), and thus make the interaction more enjoyable.

The analysis of synchronization phenomena clearly shows two main phases: the transitory phase where the system adapts and learns and the permanent phase where the system has retained the learning and is stable. Our best synchronization time is 6.82 s which is quite long for a handshake. Let us underline that we did not put the system in the best conditions to achieve faster coordination, the initial σ<sup>s</sup> (10, 0.44 Hz) being quite different from the final value (192, 2 Hz). Our main concern here was to show the capacity of the CPG to adapt even to very different frequencies from its own. , the Mico robot is not compliant and thus offers too much resistance to any perturbation. As a matter of fact, most robots are not designed for such tasks: lacking force/torque sensors, and the robot controllers can also be inadequate, providing only position control. So, putting the CPG in better initial conditions and using a more compliant robot would undoubtedly lead to a much faster synchronization.

Moreover, we demonstrated the importance of neuronal and synaptic plasticity which leads to a natural, global synchronization and adapts the neuronal architecture to a wider range of arm dynamics in physical interaction. On the one hand, we showed that learning σ<sup>s</sup> is paramount to have an adaptive system robust to frequency changes. On the other hand, this system can be improved further by learning the amplitude A<sup>f</sup> and the synaptic weight ǫ and hence considerably decreases the power consumption. We showed that local plasticity mechanisms trigger the emergence of a global adaptive stable behavior. In conclusion, it is our belief that plasticity is essential in designing a versatile and reliable bio-inspired controller.

Concerning the methodology followed in this work, it could obviously be argued that a single neuron can simply be used for each joint instead of a whole CPG. Let us answer that we wish to be as biologically close as possible, so our approach uses a mesoscopic model based on Rybak and McCrea's work (Rybak et al., 2006). Apart from that, a CPG offers more possibilities than a simple neuron due to its structure that creates a more robust and stable attractor.

Furthermore, the CPG model used for the rhythmic arm movement during physical interaction is the same as for walking, proving its versatility. On top of that, it should be noted that no dynamic model of the robot was used to control it. The dynamic control of the rhythmic movements relies solely on the natural synchronization abilities of the CPG. This makes the CPG-based control particularly interesting since it can very easily be adapted to another set of joints. Indeed, our simulation was only concerned with handshaking but this plastic CPG model could be applied to any rhythmic movements: walking, waving, cleaning, drumming.

Here, we use the well-known slave-master paradigm where one actor of the interaction imposes its frequency upon the other but we're also interested in studying how two robots would adapt to each other. In the future, we plan on extending the CPG architecture to more than two joints. Using a simulator obviously entails its share of limitations and our oversimplified handshake oversees a lot of subtleties present in human-robot interactions. Our controller will be validated with a real compliant robotic arm interacting with a human. Additionally, in order to better understand handshaking and hence, better reproduce it with robots, we will continue our study of handshaking, its synchronization phenomena and societal impact by performing human psychological/physiological studies.

Our code can be found at http://doi.org/10.5281/zenodo.1222100

## AUTHOR CONTRIBUTIONS

LC, PH, and MJ wrote the paper. LC introduced frequency learning. MJ ran the simulations and introduced amplitude and synaptic weight learning. PH supervised the work.

#### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Jouaiti, Caron and Hénaff. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# APPENDIX

#### Mathematical Details for Section 2.2.2

Let us recall the Rowat Selverston equations:

$$\pi\_m \dot{V} + V - A\_f \tanh\left(\frac{\sigma\_f}{A\_f} V\right) + q = 0\tag{21a}$$

$$
\sigma\_s \dot{q} = -q + \sigma\_s V \tag{21b}
$$

Equation (21a) can be differentiated, yielding:

$$
\sigma\_m \ddot{V} + \dot{V} - \sigma\_f \left(1 - \tanh^2\left(\frac{\sigma\_f}{A\_f}V\right)\right) \dot{V} + \dot{q} = 0\tag{22}
$$

and q˙ replaced by the expression given in equation (21b):

$$\left(\tau\_m \ddot{V} + \dot{V} - \sigma\_f \left(1 - \tanh^2\left(\frac{\sigma\_f}{A\_f} V\right)\right) \dot{V} + \frac{1}{\tau\_s} (\sigma\_s V - q) = 0 \tag{23}$$

Then we replace q by its expression from equation (21a):

$$
\tau\_m \dot{V} + \dot{V} - \sigma\_f \left(1 - \tanh^2\left(\frac{\sigma\_f}{A\_f} V\right)\right) \dot{V} + \frac{\sigma\_s}{\tau\_s} V + \frac{1}{\tau\_s} V + \frac{\tau\_m}{\tau\_s}
$$

$$
\dot{V} - \frac{A\_f}{\tau\_s} \tanh\left(\frac{\sigma\_f}{A\_f} V\right) = 0\tag{24}
$$

and group the terms:

$$
\tau\_m \ddot{V} + \left(\frac{\tau\_m}{\tau\_s} + 1 - \sigma\_f + \sigma\_f \tanh^2\left(\frac{\sigma\_f}{A\_f}V\right)\right)
$$

$$
\dot{V} + \frac{1 + \sigma\_s}{\tau\_s} V - \frac{A\_f}{\tau\_s} \tanh\left(\frac{\sigma\_f}{A\_f}V\right) = 0\tag{25}
$$

# Human-Robotic Variable-Stiffness Grasps of Small-Fruit Containers Are Successful Even Under Severely Impaired Sensory Feedback

Mark Haas 1,2, Werner Friedl <sup>1</sup> , Georg Stillfried<sup>3</sup> and Hannes Höppner <sup>1</sup> \*

<sup>1</sup> German Aerospace Center DLR e.V., Institute of Robotics and Mechatronics, Wessling, Germany, <sup>2</sup> Faculty of Electrical Engineering, University of Applied Sciences Kempten, Kempten, Germany, <sup>3</sup> Agile Robots AG, Wessling, Germany

Application areas of robotic grasping extend to delicate objects like groceries. The intrinsic elasticity offered by variable-stiffness actuators (VSA) appears to be promising in terms of being able to adapt to the object shape, to withstand collisions with the environment during the grasp acquisition, and to resist the weight applied to the fingers by a lifted object during the actual grasp. It is hypothesized that these properties are particularly useful in the absence of high-quality sensory feedback, which would otherwise be able to guide the shape adaptation and collision avoidance, and that in this case, VSA hands perform better than hands with fixed stiffness. This hypothesis is tested in an experiment where small-fruit containers are picked and placed using a newly developed variable-stiffness robotic hand. The grasp performance is measured under different sensory feedback conditions: full or impaired visual feedback, full or impaired force feedback. The hand is switched between a variable-stiffness mode and two fixed-stiffness modes. Strategies for modulating the stiffness and exploiting environmental constraints are observed from human operators that control the robotic hand. The results show consistently successful grasps under all stiffness and feedback conditions. However, the performance is affected by the amount of available visual feedback. Different stiffness modes turn out to be beneficial in different feedback conditions and with respect to different performance criteria, but a general advantage of VSA over fixed stiffness cannot be shown for the present task. Guidance of the fingers along cracks and gaps is observed, which may inspire the programming of autonomously grasping robots.

Keywords: soft manipulation, variable impedance, variable stiffness, grip stiffness, surface electromyography (sEMG), force feedback (FF), visual feedback (VF), environmental constraints

#### 1. INTRODUCTION

Online grocery sales are an intensively growing field with growth rates of more than 25% year-overyear in the USA (Springer, 2017). Some online supermarkets have their own warehouses, where the groceries are packed into delivery totes before they can be distributed to the customers. The consignment of food is partly done in cold storage rooms to preserve their freshness, which makes the working environment un-alluring for human workers. Letting robots do the task might be one solution. However, automated food grasping is still ambitious.

#### Edited by:

Kosta Jovanovic, School of Electrical Engineering, University of Belgrade, Serbia

#### Reviewed by:

Perla Maiolino, University of Cambridge, United Kingdom Lucia Beccai, Fondazione Istituto Italiano di Technologia, Italy

> \*Correspondence: Hannes Höppner hannes.hoeppner@dlr.de

Received: 04 July 2018 Accepted: 12 October 2018 Published: 31 October 2018

#### Citation:

Haas M, Friedl W, Stillfried G and Höppner H (2018) Human-Robotic Variable-Stiffness Grasps of Small-Fruit Containers Are Successful Even Under Severely Impaired Sensory Feedback. Front. Neurorobot. 12:70. doi: 10.3389/fnbot.2018.00070

Robotic grasping and manipulation of grocery items in an online supermarket warehouse poses the challenge of having to deal with thousands of items of different size, shape, or rigidity (Burgess, 2017). Furthermore, some of the items are very delicate but should not be damaged to avoid complaints by customers. To save some of the cost and complexity of perception systems, it is desirable that the robotic systems are able to fulfill their tasks even when high-fidelity sensory feedback is lacking or even missing.

Solutions to these problems are investigated within the European project Soft-bodied intelligence for Manipulation (SOMA). It focuses on the development of manipulation systems with simple, robust, and efficient robotic hands with embodied compliance, which both endure and require the exploitation of environmental constraints like surfaces and edges for guiding their motion.

Concepts for embodied compliance include series-elastic actuators (SEA, Pratt and Williamson, 1995), variable-stiffness actuators (VSA, Vanderborght et al., 2013, Wolf et al., 2015) and other concepts such as pneumatics in combination with continuously deformable material (Deimel and Brock, 2015). While SEA systems and the pneumatic system by Deimel and Brock exhibit fixed relationships between their applied force and grip stiffness, VSA systems are equipped with additional motors to change their stiffness characteristics.

Robotic systems with embodied compliance yield various advantages over mechanically rigid systems with sensor-based actively controlled compliance: shock absorption and energy storage for cyclic or explosive movements. Active compliance control also introduces a controller-dependent delay which can lead to potential damage because of peak loads during impacts (Haddadin et al., 2007) whereas the inherent elasticity of SEA and VSA can be used without delay. Furthermore, embodied mechanical compliance might even be realized at a lower cost than actively controlled compliance because of the eliminated need for stiff torque sensors, which are expensive.

Whenever lack of sensory feedback disables a closed-loop control of forces and positions, compliance, i.e., low stiffness, remains as a possibility to guide the robotic hand along environmental constraints and to shape the grip around objects, increasing possible contact points which in return boost the chances of a successful and efficient grasp (Eppner et al., 2015). On the other hand, high stiffness is still required for withstanding the weight of an object during lifting and yields lower positional errors. Since VSA are able to provide both low and high stiffness, we hypothesize that in the absence of high-fidelity sensory feedback, VSA systems are particularly useful and perform better than SEA systems with a fixed stiffness. To test this, we design a manipulation task where visual and force feedback can be switched on and off.

In this paper, a VSA robotic hand prototype—the DLR Wearable Hand to Investigate Stiffness in Grasping (WHISG) is presented and its performance in a particular grocery manipulation task—picking small-fruit containers from storage boxes—is examined.

The WHISG bases its VSA technology on the one of the DLR Hand Arm System (Friedl et al., 2011) and aims to provide decent force and speed at low cost and light weight.

The present study aims at testing the hypothesis that variable stiffness helps better to compensate lack of sensory feedback than fixed stiffness, i.e., that VSA systems suffer less performance deficits than SEA system when sensory feedback is reduced. Furthermore, it aims at answering the questions how in VSA systems stiffness should be modulated and how environmental constraints can guide the robotic hand to a successful grasp position.

For the comparison of VSA and SEA, the manipulation task is attempted with three different stiffness modes of the WHISG hand: variable stiffness, fixed low stiffness, and fixed high stiffness. Please note: even if the underlying mechanism is always a VSA, the hand is exactly used as SEA hand for the two fixed stiffness modes.

To initially save programming effort and learn from human manipulation strategies, instead of being positioned by a robotic arm and controlled by a computer, the robotic hand is positioned and controlled by human operators. For the positioning, the robotic hand is physically attached to the operator's forearm. For the control, the desired robotic hand opening angle is electronically mapped from the operator's hand opening angle. Additionally, in the variable-stiffness mode, the robotic hand stiffness is electronically mapped from the operator's grip stiffness.

The operator's grip stiffness is acquired using surface electromyography (sEMG) of intrinsic hand muscles according to Höppner et al. (2017), a method which is known in literature as "teleimpedance." One of the most prominent studies focusing on teleimpedance was conducted by Ajoudani et al. (2012) who teleoperated the Cartesian stiffness of a lightweight robot during a peg-in-hole and a ball-catching task using sEMG of eight arm muscles and visual feedback. Both tasks were performed better with the teleimpedance control (i.e., reduced positional errors and less force exertion) compared to a constantly low- or high-stiffness control, which demonstrates the effectiveness of this method for impedance-controlled robotic arms.

Teleimpedance using sEMG has also been studied for VSA hands (Hocaoglu and Patoglu, 2012; Godfrey et al., 2013; Ajoudani et al., 2014). In these studies, the sEMG signal was acquired from the extrinsic hand muscles in the forearm. Godfrey et al. (2013) found that adding impedance control and vibrotactile feedback to a teleoperation setting improved the user experience and reduced the physical and mental effort when grasping objects. Laghi et al. (2017) investigated the role of

**Abbreviations:** DIP, distal interphalangeal (joint); DoF, degrees of freedom; DLR, Deutsches Zentrum für Luft- und Raumfahrt, German Aerospace Center; EC, environmental constraints; FAS, Flexible Antagonistic Spring (VSA mechanism); FF, force feedback; GFM, Grip Force Master (grip position control and force feedback device); GP, grasping phase; HS, high stiffness; LS, low stiffness; MANOVA, multivariate analysis of variance; MCP, metacarpophalangeal (joint); MVC, maximum voluntary contraction; PIP, interphalangeal (joint); SEA, serieselastic actuator; sEMG, surface electromyography; SOMA, Soft-bodied Intelligence for Manipulation (EU project); SP, searching phase; VF, visual feedback; VF+FF, full feedback (visual and force feedback); VS, variable stiffness; VSA, variablestiffness actuator; WHISG, Wearable Hand to Investigate Stiffness in Grasping (robotic hand).

force feedback, visual feedback and communication delay in a teleimpedance approach, as well. The aim of their study was to verify the usefulness of a combined teleimpedance mode with force feedback in comparison to standard control modes. They found their newly introduced method to be working even when in the presence of a communication delay.

The present study builds upon the existing teleimpedance studies, but differs in some aspects. Firstly, the sEMG signals of intrinsic instead of extrinsic hand muscles are measured, as they have been shown to provide a more accurate estimate of grip stiffness (Höppner et al., 2017). The better signal may be due to lower electrical resistance and less cross-talk from other muscles, which both lead to a better signal-to-noise ratio.

Secondly, grip forces are fed back to the operator as grip forces instead of vibrotactile stimuli. We assume that this more intuitive feedback enables the operators to better adjust the stiffness more effectively. Thirdly, the hand is positioned in space by the human arm via an attached beam instead of via a robotic arm. This is mainly done to simplify the experiment setup.

Like the robotic VSA hand, the human hand can be considered a variable-stiffness mechanism (Höppner et al., 2011, 2017). Generally speaking, a higher contractile muscle force will result in a higher stiffness. Through co-contracting antagonistic muscle parts the human is able to decouple force from stiffness and to increase stiffness without changing joint torque (McIntyre et al., 1996; Perreault et al., 2002, 2004) by about 20% for the human hand (Höppner et al., 2017).

Because of this familiarity with variable stiffness and since humans are known to adapt very well to new situations, we assume that in the present task—after trials of learning—the human operators will outperform any existing robotic controller. Hence, the research questions are focused not primarily on the controller, but on the robotic hand hardware, the stiffness mode and the level of sensory feedback.

Section 2 describes the experiment materials and methods. It introduces the design of the WHISG hand—a new soft robotic hand based on VSA technology—, briefly describes the acquisition of sEMG signals for teleoperating the robotic hand stiffness and explains the experimental and statistical design in detail. In section 3, the results are listed. The influence of the factors force and visual feedback as well as stiffness mode on the three metrics—duration to complete one trial, number of single grasping actions and required thumb torque is presented. Finally, section 4 discusses the results in the context of the research question, i.e., the effect of stiffness modes and sensory feedback modes on grasp performance and on the stiffness modulation strategy and draw possible conclusions and implications for soft robotics.

## 2. MATERIALS AND METHODS

The comparison of SEA and VSA grasping under different levels of sensory feedback entailed the design of a suitable robotic hand, interfacing it to the human operators, setting up the experimental environment and measuring the grasping performance during different experimental conditions.

## 2.1. Design of the WHISG Hand

The WHISG hand, at only 500 g, is light enough to be handheld or mounted on a lightweight robot or, as is the case in the present study, on the forearm of an operator (see **Figure 1** and **Video 1**). The hand is designed to study grasping a broad variety of delicate groceries, such as cucumbers, mangoes, small-fruit containers, salad etc. However, for this first version the kinematic design is roughly estimated rather than based on a detailed kinematic analysis. It is a three-fingered hand with inherent compliance due to elastic elements between motor and joint (see **Figure 2**), enabling shape adaptation and safe interaction with the environment. We choose three instead of two fingers in order to increase the grasp stability as well as to ensure a more evenly distribution of applied object forces. The housing parts are 3D-printed from polylactic acid filament. Motor control is done with two Arduino Leonardo micro-controllers that communicate using their serial peripheral interface.

The hand contains two types of fingers: one thumb-like main finger with four degrees of freedom (DoF) and two underactuated fingers with two DoF each, which oppose the thumb. The two thumb DoF closest to the base constitute the metacarpophalangeal (MCP) joint. It allows sideways (adduction/abduction) and bending (flexion/extension) movement. The MCP is followed by the proximal interphalangeal (PIP) and distal interphalangeal (DIP) joints with one flexion/extension DoF each. The opposing fingers contain only one PIP and one DIP joint each.

For adjusting the stiffness of the joints, the Flexible Antagonistic Spring (FAS; Friedl et al., 2011) mechanism of the DLR Hand Arm System (Grebenstein et al., 2011) is used (**Figure 2**). The joint is connected to two tendons, each of which can move it in one of two opposing directions (e.g., flexion and extension). The tendons each run to electric servo motors via elastic elements with non-linear forcedisplacement characteristics, where the force is a convex, increasing function of the displacement. The convexly increasing force-displacement characteristics cause the stiffness of the elastic element to rise when the tendon force increases. Due to the antagonistic arrangement of the tendons, the stiffnesses of both tendons add up while the forces cancel each other. Thereby, the stiffness of the joint can be varied independently from the joint torque. The difference in the actuation of the motors defines whether position or stiffness of the joint is changed. Motion at the joint can be achieved by moving both servos in the same direction, changing the stiffness by moving in opposite directions. The maximum joint torque is limited by the maximum motor torque because each motor can only apply forces in one direction (Grebenstein, 2012).

**Figure 3** shows the mechanical design of the elastic elements and how the change of the angles between the tendons and the lever lead to the convexly increasing force-displacement characteristics. Initial stiffness depends on the position of the spring and its constant as well as on the distance between winder and spring pulley (Friedl et al., 2011). Tensioning the tendon by actuating the winder increases the spring deflection angle φ. Said angle is measured by hall sensors (iC-MP sensors by iC-Haus)

and custom-made magnets, and is used to calculate F<sup>s</sup> , the force exerted on the spring pulley.

Having two motors per DoF (2N, where N is the number of DoF) allows setting the stiffness of each joint separately from the other joints. This concept is followed, e.g., in the DLR Hand Arm System. To reduce weight and cost, the WHISG hand uses less than 2N motors.

The four-DoF thumb is actuated by four Bluebird BMS 385 servo motors (**Figure 4**). The PIP joint and the DIP joint share one set of tendons and move differentially, i.e., only the sum of the movement can be controlled. The DIP joint is equipped with an extension spring so that in the absence of external forces only the PIP joint moves. The DIP joint only comes into play when the PIP joint comes into contact with an object and cannot move any further, therefore ensuring a hook-like grip on the object. A change in stiffness affects all four thumb joints at the same time, i.e., the whole thumb can be made stiffer or softer, but not single joints. The number of motors is thus N = N − 1 + 1, where −1 stands for the underactuation of the PIP and DIP joints and +1 stands for the ability to vary one common stiffness value. The four motors each provide a maximum torque of 0.45 Nm resulting in a maximum force at the fingertip of around 10 N.

The resulting stiffness behavior at the thumb tip is exemplarily shown in **Figure 5** for a linear excursion of the thumb tip from a posture of 30◦ flexion of the MCP joint and 60◦ flexion of the PIP and DIP joints. Increasing the pretension of the antagonistic tendons of the joints leads to a shift in the force-stiffness relationship.

The two fingers with two DoF each are even more underactuated by only two (N/2 − 1 + 1) Bluebird BMS 390 DMH servo motors together (**Figure 6**). Each servo motor actuates one of the differential winders, each of which in turn differentially actuates one movement direction of both fingers, thereby dividing the required number of motors by two. Furthermore, one motor is saved by the combined actuation of PIP and DIP. For tuning the common stiffness of the two fingers, an additional motor is used.

# 2.2. Interfacing the Robotic Hand to the Human Operator

For positioning and moving around the WHISG hand, it was physically attached to the right forearm of the human operator via a beam and a splint (**Figure 1**).

The gripper opening angle was controlled by the operator's thumb and index finger via the Grip Force Master (GFM) by Force Dimensions (**Figure 1** top left). It basically consists of a small lever arm tendon-coupled to an electrical motor, which sets the distance between human index finger and thumb. Forcecontrol is achieved by measuring the motor current. In the force feedback mode, the GFM could feed back the grip force of the WHISG hand to the human hand.

In the variable-stiffness mode, the pretension of the WHISG thumb and fingers, and thereby their stiffness behavior, was controlled via the operator's grip stiffness, which was acquired via sEMG signals from intrinsic hand muscles according to Höppner et al. (2017). To measure the sEMG signals, wireless Trigno Standard Sensor electrodes from Delsys Inc. were used

FIGURE 3 | (A) Non-linear elastic element of the WHISG FAS consisting of a guide pulley, a leg spring, a lever, and a spring pulley. (B) Soft configuration. (C) Stiff configuration. An increase of 1l of the tendon excursion leads to an increase of the spring angle φ, which leads to a higher spring force Fs acting on the spring pulley via the lever. The relationship between spring force Fs, bearing reaction force Fr and tendon force F results from an equilibrium around the spring pulley. As the spring angle φ increases, the portion of the tendon forces F that points radially to the center of the spring pulley decreases and therefore the forces increase progressively to keep counteracting the spring and bearing reaction forces.

in connection with a Trigno Lab base station. These electrodes comply with the requirements put forth by the Medical Device Directive 181 93 / 42 / EEC, and the experiment complied with their intended use. The sEMG signals were processed on a custom-designed low-cost analysis box based on an Arduino Duo microcontroller board. The signals were band-pass filtered by a Butterworth 2nd order filter with a cutoff frequency of 20–500 Hz, rectified by root mean square and smoothed using a moving-average filter with a window size of 75 ms. Furthermore, the signals were calibrated to the individual operator by subtracting the baseline noise (recorded while resting the muscles) and divided by a signal recorded during maximum voluntary muscle contraction. The sum of the calibrated sEMG signals was mapped to the range of available pretension levels of the WHISG joints, which modify its stiffness behavior.

A block diagram depicting the connections between the subsystems is shown in **Figure 7** (bottom). The sEMG electrodes send continuously analog data (yellow arrow) to the low-cost analysis box. The calculated stiffness is send to the Linux PC using the serial port of the Arduino microcontroller. The Linux, the GFM, and the WHISG hand are connected to a Linux real-time PC running Matlab Simulink which ensures a proper synchronization between sEMG electrodes, force feedback device and robotic hand (green arrows demonstrate real-time signals). The grip force is measured using the deflection of the springs of the WHISG hand and fed back to the GFM. Subjects noticed higher grip forces through an increased resistance for pressing

the GFM lever arm. The lever arm position was sent back to the WHISG hand for controlling its grasp width. An increase in stiffness results in an overall pretension of all 6 FAS elements of thumb and fingers (see section 2.1).

# 2.3. Experimental Setup

Six healthy subjects, all male, five right-handed and one lefthanded, age 24–30, all naive to the experiment, took part as operators and performed the experimental protocol described below. The whole procedure lasted between 60 and 75 min per participant. Oral and written descriptions of the experiment were provided to the subjects. After all questions were answered a written consent form was signed by all participants. These experiments are compliant with the World Medical Association's Declaration of Helsinki, regarding the ethical principles for medical research involving human subjects, last version, as approved at the 59th WMA General Assembly, Seoul, October 2008. Necessary approvals for the subject studies were received from the organization-wide works council of the German Aerospace Center as well as its institutional board for data privacy. A physician is part of the works council. The collection and processing of experimental data were approved by both committees.

#### 2.3.1. Task: Object Grasping With Different Stiffness Settings

Operators were asked to grip one out of eight tightly packed small-fruit containers (LxWxH: 140 mm x 90 mm x 75 mm) out of a cardboard box (LxWxH: 400 mm x 300 mm x 90 mm) using the WHISG hand (see **Figure 7** and **Video 1**). The fruits were replaced with water-filled plastic bags to simulate their weight, 125 g. The cardboard box was placed on a waist-high table and fixed with adhesive tape such that no movement of the box was possible. One single grasping action was conducted as follows: The operator started approximately two steps away from the table to hinder learning of the optimal, initial grasping position. The WHISG hand was held on the right side of the operator's body, pointing downwards. With the start signal of the experimenter, the operator walked up to the table upon which data acquisition started and the operator began acquiring a grasp of the container. After successfully grasping the container, operators were asked to lift it up and place it right next to the box. After placing it, operators had to take two steps back and return to the starting position, which was the trigger to end data acquisition. Afterwards the plastic box was placed back into the cardboard box by the experimenter before the next grasping action will be performed.

#### 2.3.2. Trials With Different Combinations of Force Feedback, Visual Feedback and Stiffness Mode

Each participant conducted 60 grasping trial in total: 12 were training trials to minimize learning effects for the remaining 48, which were used for statistical analysis. Every 12 trials, operators were asked if they wanted to take a small break to minimize fatigue. In each of the 12-trials sub blocks, all combinations of stiffness modes and feedback modes were tested.

The stiffness modes consisted of variable stiffness (VS), low fixed stiffness (LS), and high fixed stiffness (HS). VS corresponded to the range of 0 to 55% pretension of the FAS mechanisms, LS to 5% pretension and HS to 55% pretension. Each subject was assigned a different permutation of stiffness modes to reduce effects of learning or fatigue on the results.

Within each stiffness mode, the following feedback modes were tested:


switched off; the welding visor glasses were replaced with plastic disks which were processed with sandpaper and an added adhering plastic sheet; objects and their outlines seen through these glasses were blurred;


A questionnaire (5-point Likert scale) was filled about the usefulness of the different stiffness modes, about exploiting the environment to get a secure grasp on the object and about the helpfulness of obtaining force feedback.

# 2.4. Statistical Design

To answer the research questions and evaluate the hypotheses of the study, several types of data were collected and analyzed during the trials.

For investigating the effects of stiffness and feedback modes on the grasp performance and for evaluating the hypothesis that variable stiffness helps to compensate lack of sensory feedback better than fixed stiffness, four performance measures were recorded:

• the grasp success rate; a grasp was counted as successful whenever the small-fruit container was grasped, lifted, and placed on the table next to the cardboard box within the time limit of 2 min; the grasp success rate was calculated for each experimental condition (i.e., for each combination of stiffness and feedback modes) as the number of trials with grasp success divided by the total number of trials within this experimental condition over all operators and repetitions;


For the grasp success rate, 12 values for twelve the experimental conditions were recorded, while the other three recorded performance measures contained 288 values for 288 trials.

The three latter performance criteria were analyzed with a linear mixed regression model,

yijkmn = β<sup>0</sup> + βstiffness\_mode,<sup>i</sup> + βvisual\_feedback,<sup>j</sup>

+βforce\_feedback,<sup>k</sup> + βstiffness\_mode×visual\_feedback,ij


where yijkmn is the response variable, i.e., any of the three above-mentioned performance measures—task completion time, number of grasping attempts or mean thumb gripping torque— , i is the stiffness mode, j denotes the presence of unimpaired visual feedback, k denotes the availability of gripping force feedback, m is the within-operator trial number, n is the operator number, β<sup>0</sup> is the intercept, which is a constant term, βstiffness\_mode,<sup>i</sup> is the fixed effect of the stiffness mode, βvisual\_feedback,<sup>j</sup> is the fixed effect of visual feedback, βforce\_feedback,<sup>k</sup> is the fixed effect of force feedback, βstiffness\_mode×visual\_feedback,ij, βstiffness\_mode×force\_feedback,ik, βvisual\_feedback×force\_feedback,jk, and βstiffness\_mode×visual\_feedback×force\_feedback,ijk are the fixed effects of their interactions, m βtrial\_number is the trial-number-dependent fixed effect of learning or fatigue, ǫoperator,<sup>n</sup> is the operatorspecific random effect and ǫmn is the residual random error. The random effects are assumed to follow normal distributions as follows:

$$
\epsilon\_{mn} \stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2) \text{ and} \tag{2}
$$

$$
\epsilon\_{\text{operator},m} \stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(0, \mathfrak{r}^2). \tag{3}
$$

How operators experienced the helpfulness of the different stiffness modes and the availability of force feedback was measured using a questionnaire (5-point Likert scale) with three questions for the stiffness modes and one question for force feedback. The answers were grouped into positive, neutral, and negative groups and reported summarily over all operators.

For the analysis of the stiffness modulation strategies in the variable-stiffness mode, the mean normalized stiffness values in the searching phase and the grasping phase were recorded.

They were analyzed using the following mixed regression model:

$$\begin{split} K\_{\text{mean},jklmn} &= \beta\_0 + \beta\_{\text{visual\\_feedBack},j} + \beta\_{\text{fore\\_feedback},k} + \beta\_{\text{phase},l} \\ &+ \beta\_{\text{visual\\_feedback}\times\text{phase},jl} + \beta\_{\text{fore\\_feedback}\times\text{phase},kl} \\ &+ \epsilon\_{\text{operator},tt} + \epsilon\_{\text{HH}}, \end{split} \tag{4}$$

where Kmean,jklmn is the response variable, i.e., the mean normalized stiffness, l denotes the phase (searching or gripping phase), βphase,<sup>l</sup> its fixed effect on the response variable, βvisual\_feedback×phase,jl and βforce\_feedback×phase,kl its interactions with the feedback modalities and the other variables as in Equation (1).

The parameters of the mixed models were fitted to the measured outcome measures using the lmer function of the lme4 library (Bates et al., 2015) of the R statistics software (R Core Team, 2015).

The exploitation of environmental constraints was observed by the experimenter, classified and reported summarily. The response of operators to the questionnaire whether they used the environment to obtain a secure grasp on the object was reported summarily.

#### 3. RESULTS

The main results of the experiments consist of (a) the effects of stiffness and feedback modes on the grasp performance and user experience, (b) in the case of the variable-stiffness mode, observations of stiffness modulation strategies, and (c) observations of strategies for exploiting environmental constraints.

The effect of the trial number, which can account for learning or fatigue, is two to three orders of magnitude smaller than the effects of stiffness and feedback.

#### 3.1. Effects of Stiffness and Feedback Modes on the Grasp Performance

The grasp performance is measured by four criteria: grasp success rate, task completion time, number of grasp attempts, and mean gripping torque.

The grasp success rate is 100% for all stiffness and feedback modes.

For the other three grasp performance measures—task completion time T, number of grasp attempts Nga, and mean gripping torque τmean—, the absolute measurement values at the different combinations of stiffness and feedback modes are shown as samples and box plots in **Figure 8**. The task completion time ranges from 4 to 63 s, the number of grasping attempts from 1 to 11 and the mean thumb gripping torque from 0.07 to 0.86 Nm, plus one outlier at 3.67 Nm, which could not be explained and remained in the data set for the analysis. For all three measures, low values are desirable: lower task completion times enable higher productivity, a lower number of task attempts indicates higher reliability and lower gripping torques indicate gentler handling of the manipulated goods.

How these three measures are affected by the presence or lack of sensory feedback and by the variable or fixed grip stiffness modes is shown in **Figures 9**, **10** as 95% confidence intervals of the results of the fitted mixed model of Equation (1). Whenever the confidence interval does not include zero, the effect is statistically significant at a significance level of α = 0.05. In some cases of a slight overlap between the confidence interval and the zero level we carefully speak of tendencies. Since the significance level is not corrected for multiple comparisons, the confidence intervals are interpreted in conjunction with each other as aggregate results, rather than independently as separate results.

The distributions of the residuals of the mixed models of task completion time and number of grasping attempts were somewhat positively skewed compared to a normal distribution, but still unimodal and smooth. While the method is robust to deviations from the normal distribution, this adds some uncertainty to the results.

The effect of the sensory feedback is shown in **Figure 9**. Different sensory impairments are compared against the baseline of full sensory feedback (VF+FF). Taking away the grip force feedback leaves visual feedback (VF) remaining. Conversely, impairing the visual feedback leaves grip force feedback (FF) remaining. Taking away both the force feedback and impairing the visual feedback leads to the "no-feedback" (NF) mode, where the remaining feedback is actually limited to blurred visual feedback as well as auditory feedback and some direct force feedback via the splint on the forearm, which are present in all feedback conditions. Lack of visual feedback deteriorates all performance measures in the low-stiffness mode (1T ≈ +11 s. . . +15 s, 1Nga ≈ +2 . . . +3, 1τmean ≈ range.

+0.1 Nm . . . +0.2 Nm; **Figures 9A–C** left column, FF and NF) and shows a tendency of slightly deteriorating them in the other two stiffness modes (1T ≈ +4 s. . . +5 s, 1Nga ≈ +0.2 . . . +0.8, 1τmean ≈ +0.04 Nm . . . +0.1 Nm; **Figures 9A–C** middle and right columns, FF and NF). Lack of force feedback shows a tendency to deteriorate the gripping torque (1τmean ≈ +0.08 Nm . . . +0.15 Nm; **Figure 9C** VF and NF). In the absence of visual feedback and when using the low-stiffness mode, lack of force feedback shows a slight tendency to improve the performance criteria (1T ≈ −3 s, 1Nga ≈ −1, 1τmean ≈ −0.05 Nm; **Figures 9A–C** left column, NF vs. FF).

**Figure 10** shows the effect of the stiffness modes on the grasping performance. The fixed-stiffness modes low stiffness (LS) and high stiffness (HS) are compared against the baseline of variable stiffness. In the absence of visual feedback, low fixed stiffness increases the task completion time on average by about 6 to 10 s (**Figure 10A**) and the number of grasp attempts by about 1.1 to 1.8 (**Figure 10B**). High fixed stiffness shows a slight tendency to increase the gripping torque by about 0.02 to 0.08 Nm (**Figure 10C** HS), while low fixed stiffness shows a tendency to reduce the gripping torque by about 0.06 to 0.08 Nm, except in the presence of force feedback only (**Figure 10C** LS).

In the questionnaire, operators gave one neutral and five negative responses for the low stiffness, five positive and one neutral for the high stiffness and four positive and two neutral for the variable stiffness setting. Four out of six operators found the presence force feedback helpful.

## 3.2. Observations of Stiffness Modulation Strategies

To answer the question how the operators adjust the stiffness in the variable-stiffness mode during the course of the trial, it is interesting how the stiffness changes between the searching phase and the grasping phase and how it is affected by the sensory feedback mode. **Figure 11** compares the grasping phase to the baseline of the searching phase. In the grasping phase, the mean normalized stiffness is about 0.2 units higher than in the searching phase. This difference between searching phase and grasping phase occurs during all sensory feedback conditions.

The effect of sensory feedback itself on the mean normalized stiffness is shown in **Figure 12**. In the grasping phase (diagram on the right), lack of force feedback leads to a decrease in mean normalized stiffness by about 0.08 units (VF), lack of visual feedback to an increase by about 0.04 units (FF), and lack of both feedback modalities to a decrease by about 0.04 units (NF). While not evident from the mean stiffness data, subjects reported to increase grip stiffness in the searching phase under conditions of sensory impairment in order to generate higher contact forces which could be felt at the lower arm via the splint. For exemplary time series of stiffness tuning and feedback force under different sensory conditions please see the **Appendix**.

# 3.3. Observations of Strategies for Exploiting Environmental Constraints

In the questionnaire, five out of six operators reported using the environment to get a secure grasp on the object.

**Figure 13** shows a comparison of different strategies that were used to grasp the object. Red dots depict the placement of the WHISG fingers whereas the green dot is the placement of the WHISG thumb. All strategies show a distinct exploitation of the environment: in the upper left, one finger was moved along the crack between two boxes to find the free space. By rotating the operator's arm and therefore the WHISG hand, the second finger was placed in the space above, before the thumb was placed in the space opposite of the fingers, ensuring a secure three-fingered grasp. However, in 94% of all grasp

trials, operators placed one finger on top of the box and fixed the object between thumb and one finger (see **Figure 13** upper right). In most trials, operators reported to have trouble seeing the lower of the two fingers. The bottom two strategies show three finger grasps with higher contact forces because either the upper or lower WHISG finger had to be placed in the crack between two boxes which required more force than placing the fingers in the free spaces. This sometimes led to squashing the lid of the box and therefore potential damage to the object content.

# 4. DISCUSSION AND CONCLUSIONS

In this study, picking and placing of small-fruit containers from a tightly packed set by the variable-stiffness WHISG robotic hand was investigated under different stiffness modes and different levels of sensory feedback. The hand was positioned and controlled by human operators as proxies for a robotic arm and a control computer. It was shown that the WHISG hand is suitable for consistently performing the task well within the time constraint of 2 min even when the controller received only a

limited amount of sensory feedback. However, it was also shown that the task could even be completed by a mechanically simpler hand with a fixed stiffness behavior.

Regarding the influence of sensory feedback on the grasping performance, it was shown that high-quality visual feedback tends to decrease task completion time, number of grasping attempts, and gripping torques. The influence of force feedback was less decisive, with a tendency of increasing task completion time, number of grasping attempts, and gripping torques when using a low-stiffness hand in the absence of visual feedback, but slightly decreasing gripping torques otherwise. This matches the result of Laghi et al. (2017) that visual feedback plays a more significant role in contact recognition than force feedback. Regarding the further result of Laghi et al. (2017) that the presence of force feedback decreases the influence of visual feedback, our results show a confirming tendency under some experimental conditions and a contradicting tendency under other experimental conditions.

Despite the lack of clear benefit on the measured grasping performance, the majority of operators reported in the questionnaire that they found force feedback helpful. This is similar to the findings of Godfrey et al. (2013), where subjects reported lower physical and mental effort when vibro-tactile feedback was present.

The hypothesis that variable stiffness can better compensate lack of sensory feedback than fixed stiffness could not be

confirmed comprehensively. Under conditions of impaired visual feedback, high fixed stiffness tended to provide the lowest task completion times and least number of grasping attempts. When high-quality visual feedback was given, low fixed stiffness tended to perform at similar speed and number of attempts as the high-stiffness and variable-stiffness hands, but tended to apply somewhat lower gripping forces, which may be beneficial for the handling of delicate goods. Variable stiffness tended to provide a compromise between high and low stiffness when both visual and force feedback were lacking. While under these conditions it performed similarly fast and with a similar number of grasping attempts as high fixed stiffness and faster and with less attempts than low fixed stiffness, it tended to apply somewhat lower gripping torques than high fixed stiffness but higher gripping torques than low fixed stiffness.

The grasp success rate was 100% under all stiffness and feedback modes, which we found surprising. Apart from proving the suitability of the WHISG hand for the given task, this also shows that a more difficult task might have been needed for better discovering possible benefits of variable stiffness in situations of sensory deprivation. In future studies, the task could be made more difficult by tightening the time constraint, by grasping more delicate objects without hurting them and by positioning the robotic hand by a robotic arm to eliminate the inadvertent force feedback on the operator's forearm via the splint.

The investigation of stiffness variation strategies showed that operators apply higher stiffness in the grasping phase than in the searching phase and somewhat higher stiffness in the presence of force feedback than in its absence. The higher stiffness in the grasping phase corresponds to the presumption that the fingers need to withstand the weight of the object during lifting. The higher stiffness in the presence of force feedback could be explained by the fact that the stiffness of the robotic hand is not only a function of the robotic grasping force, but also of the sEMG-controlled FAS

pretension level, which is in turn influenced by the contraction and cocontraction of the operator's muscles. In the absence of force feedback, no forces are applied to the operator's fingers, hence the operator can only modulate the stiffness by cocontraction. When force feedback is turned on, the operator additionally contracts to counteract the forces on the finger, which further increases his grip stiffness and the sEMG signals. This is actually more an artifact of the teleimpedance setup than an insight into an optimal stiffness modulation strategy.

The observation of environmental constraint strategies yielded four distinct grasping patterns, one of which enjoyed a striking preference by being used in 94% of the trials. Interestingly, this pattern, which exploits gaps between the containers near their corners, places only two fingers in grasping positions, as opposed to the other patterns, which place three fingers and exploit the cracks between the containers in addition to the gaps. One possible explanation lies in the fact that subjects had difficulties to see the second finger that often ended up unused. A further possible explanation is that forcing the fingers in the crack between the containers required more force and was therefore harder to realize than the placement in the corner gaps.

In conclusion, the study showed that the VSA WHISG hand is suitable for the task of picking and placing small fruit containers from a tightly packed set but that the task may also be done with a simpler SEA hand with fixed stiffness. Furthermore, it showed operators' strategies of exploiting the cracks and gaps between containers to guide compliant fingers to relevant grasp contact points, which may also be useful for autonomously operated robots.

# AUTHOR CONTRIBUTIONS

MH contributed to literature review, implementation of electronics, calibration methods, experiments, data analysis and paper writing. WF developed the robotic hand, supervised the project overall, and contributed in proof reading. GS contributed to the interpretation of data, writing, and proof reading the manuscript. HH developed the idea, supervised the project overall and contributed in data analysis, writing, and proof reading.

# ACKNOWLEDGMENTS

This work has been funded by the European Commission's Eighth Framework Program as part of the project SOMA (grant number H2020-ICT-645599). We would like to thank Holger Urbanek for the photography (same affiliation as all authors).

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnbot. 2018.00070/full#supplementary-material

Supplementary Figure 1 | Grasping action with VF but without FF; black line: raw sEMG signals for the first and second dorsal interossei muscle (MID1, MID2), blue line: commanded stiffness, red dashed line: GFM position, green dashed line: normalized feedback force, black dashed line: maximum WHISG stiffness.

Supplementary Figure 2 | Grasping action with VF and FF; black line: raw sEMG signals for the first and second dorsal interossei muscle (MID1, MID2), blue line:

### REFERENCES


commanded stiffness, red dashed line: GFM position, green dashed line: normalized feedback force, black dashed line: maximum WHISG stiffness.

Supplementary Figure 3 | Grasping action without VF and FF; black line: raw sEMG signals for the first and second dorsal interossei muscle (MID1, MID2), blue line: commanded stiffness, red dashed line: GFM position, green dashed line: normalized feedback force, black dashed line: maximum WHISG stiffness.

Supplementary Figure 4 | Grasping action without VF but with FF; black line: raw sEMG signals for the first and second dorsal interossei muscle (MID1, MID2), blue line: commanded stiffness, red dashed line: GFM position, green dashed line: normalized feedback force, black dashed line: maximum WHISG stiffness.

force task," in IEEE International Conference on Robotics and Automation, vol. 1 (Shanghai), 3312 – 3316.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Haas, Friedl, Stillfried and Höppner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# APPENDIX

## A1 Stiffness Tuning Process

The necessary time to tune the stiffness is dependent on two factors: the sampling rate of the finger stiffness of the WHISG hand (100 Hz) and the small latency of the motors to get in the commanded position. Overall it takes less than 150 ms to go from the lowest to the highest stiffness setting, assuming that the muscular activity is at or above the determined MVC level (see e.g., **Supplementary Figure 2** at around 5.8 s). The stiffness tuning process is independent of the handled object and is only dependent on the muscular activity of the human operator. Assuming the hand of the operator is not perturbed, external disturbances don't affect the finger stiffness of the WHISG hand.

# A2 Stiffness Tuning and Feedback Force Under Different Sensory Conditions

The following figures show four trials with sEMG-based variable stiffness and different sensory conditions, all performed by operator 5. In all figures, raw sEMG signals for both electrodes are shown in black. MID1 denotes the activity of the first dorsal interossei muscle, MID2 of the second dorsal interossei muscle. Superimposed in red dashed lines is the commanded position of the GFM, in green dashed lines the feedback force normalized to the maximum feedback force over all trials of the operator (both lines percentaged to each particular scale). The black dashed line shows the maximum finger stiffness of the DLR WHISG hand.

**Supplementary Figure 1** depicts a trial with VF but without FF available to the operator. The first 2.75 s are spent searching for the open space between the boxes, before the grasp is finally initiated. Due to the VF being intact, the delayed feedback force doesn't matter for the grasping action. With FF disabled, the operator almost used the full range of the GFM which in return, with the high stiffness, resulted in a big feedback force.

In **Supplementary Figure 2**, both VF and FF was available. In total, two grasping actions were needed for a successful grasp of the object. After the first failed attempt, the operator voluntarily increased finger stiffness and was able to grasp the object—despite the lowered GFM position in comparison to the first attempt. The feedback force is considerably lower than in **Supplementary Figure 1** despite the commanded stiffness being equally high which is a result from the lowered GFM position.

With both VF and FF unavailable to the operator (see **Supplementary Figure 3**), the operator required the full range of the GFM to grasp the object. The low commanded stiffness could be a result of no sensory information available, which would protect the robotic hand in case of unforeseen impacts with the environment. Feedback force is low due to the lower stiffness despite the increased GFM position.

In **Supplementary Figure 4**, the operator increased the stiffness before initiating the grasp. With the increased feedback force, it was possible to detect the open space even without the visual information. Due to the reduced GFM position, feedback force overall is lowered even though the commanded stiffness is high.

# A3 VIDEO

The video attached to this submission exemplary shows the difference between the low and high stiffness setting of the WHISG hand as well as the inherent compliance when handling soft objects, e.g. fruits. Additionally the experimental procedure with and without visual feedback is presented.

# Tactile Sensing and Control of Robotic Manipulator Integrating Fiber Bragg Grating Strain-Sensor

Luca Massari 1,2 \*, Calogero M. Oddo<sup>1</sup> \*, Edoardo Sinibaldi <sup>3</sup> , Renaud Detry <sup>4</sup> , Joseph Bowkett <sup>5</sup> and Kalind C. Carpenter <sup>4</sup> \*

<sup>1</sup> Polo Sant'Anna Valdera, The BioRobotics Institute, Scuola Superiore Sant'Anna, Pontedera, Italy, <sup>2</sup> Department of Linguistics and Comparative Cultural Studies, Ca' Foscari University of Venice, Venice, Italy, <sup>3</sup> Center for Micro-BioRobotics, Istituto Italiano di Tecnologia (IIT), Pontedera, Italy, <sup>4</sup> Jet Propulsion Laboratory, California Institute of Technology, NASA, Pasadena, CA, United States, <sup>5</sup> Department of Mechanical and Civil Engineering, California Institute of Technology, Pasadena, CA, United States

Tactile sensing is an instrumental modality of robotic manipulation, as it provides information that is not accessible via remote sensors such as cameras or lidars. Touch is particularly crucial in unstructured environments, where the robot's internal representation of manipulated objects is uncertain. In this study we present the sensorization of an existing artificial hand, with the aim to achieve fine control of robotic limbs and perception of object's physical properties. Tactile feedback is conveyed by means of a soft sensor integrated at the fingertip of a robotic hand. The sensor consists of an optical fiber, housing Fiber Bragg Gratings (FBGs) transducers, embedded into a soft polymeric material integrated on a rigid hand. Through several tasks involving grasps of different objects in various conditions, the ability of the system to acquire information is assessed. Results show that a classifier based on the sensor outputs of the robotic hand is capable of accurately detecting both size and rigidity of the operated objects (99.36 and 100% accuracy, respectively). Furthermore, the outputs provide evidence of the ability to grab fragile objects without breakage or slippage e and to perform dynamic manipulative tasks, that involve the adaptation of fingers position based on the grasped objects' condition.

#### Edited by:

Ganesh R. Naik, Western Sydney University, Australia

#### Reviewed by:

Ashley Kleinhans, Ford Motor Company, United States Tomas Kulvicius, University of Göttingen, Germany

#### \*Correspondence:

Luca Massari luca.massari@santannapisa.it Calogero M. Oddo calogero.oddo@santannapisa.it Kalind C. Carpenter kalind.c.carpenter@jpl.nasa.gov

Received: 01 November 2018 Accepted: 04 March 2019 Published: 05 April 2019

#### Citation:

Massari L, Oddo CM, Sinibaldi E, Detry R, Bowkett J and Carpenter KC (2019) Tactile Sensing and Control of Robotic Manipulator Integrating Fiber Bragg Grating Strain-Sensor. Front. Neurorobot. 13:8. doi: 10.3389/fnbot.2019.00008 Keywords: robotics, manipulation tasks, fiber bragg gratings, tactile sensors, sensorized hand

# INTRODUCTION

The sense of touch is a key sensory modality of prehensile manipulation. Through tactile perception, humans can perceive object properties such as size, hardness, temperature, contour, etc. Information arises from the multiple receptors available within the human skin, especially across hand and fingers (Johansson and Vallbo, 1979; Johansson et al., 1982). During manipulation, the hand partially occludes the object from sight. Tactile sensing enables measurements to be obtained in areas that are inaccessible through vision. Prior behavioral studies have demonstrated the tactile reliance of human manipulation, for both simple grasping and dexterous manipulation (Johansson and Flanagan, 2009). In the last few years, the field of robotics has expanded toward more complex environments (Dahiya et al., 2010), including dangerous and unaccessible scenarios such as nuclear meltdown disasters and space missions to other planets, where robots are demanded to take over human jobs. The successful automation of complex human-like manipulative tasks depends on robot's perception capabilities, including through a tactile sensor, to characterize the relation between the operated objects and the robotic manipulator (Tegin and Wikander, 2005; Hoshi and Shinoda, 2006; Yousef et al., 2011). Although the human hand represents a point of inspiration for many prehensile robotic hardware (Bicchi, 2000; Murray, 2017), the field of artificial tactile sensing covers a large spectrum of underlying principles (Chi et al., 2018). The literature, for instance, shows relative success with capacitive, piezoelectric, piezoresistive, and resistive sensors. Such sensing systems rely on changes in the measured variable (i.e., capacitance, electrical charge, resistance, etc.) that involve different advantages and disadvantages. Capacitive sensors consist of two conductive plates interfaced by means of a compressible dielectric material (Golpaygani et al., 2009; Wong et al., 2012; Vogt et al., 2013). The transduction principle relies on the capacitance variations that occur when, during the loading phase, the gap between the plates changes. Such transducers entail high sensitivity and frequency response but are susceptible to electro-magnetic noise, tend to be nonlinear and to have hysteresis. Capacitive sensors are extensively used in robotic applications for tactile feedback, (Romano et al., 2011; Schmitz et al., 2011; Heyneman and Cutkosky, 2012; Jara et al., 2014). Piezoelectric sensors depend on the electrical charge generation in the quartz crystal, as it deforms by applying a load. Such sensors are frequently employed for dynamic sensing, due to a very high frequency response and can be used to build flexible tactile sensors (Sirohi and Chopra, 2000; Cutkosky et al., 2008; Qasaimeh et al., 2009; Chuang et al., 2013; Seminara et al., 2013; Canavese et al., 2014; Kim et al., 2014; Acer et al., 2015). On the other hand, piezoelectric sensors suffer temperature sensitivity and are generally fragile (Dahiya and Valle, 2008). Piezoresistive sensors rely upon the electrical changes in resistance occurring to the material during load/pressure application (Girão et al., 2013; Ma et al., 2015; Oddo et al., 2016). Such sensors are widely used as they are relatively easy to produce and can be flexible (Someya and Sekitani, 2014). The main drawbacks of these transducers refer to the low repeatability, fragility to shear forces, non-linear response and hysteresis. Among all the technologies, the use of optical fibers as transducers for tactile sensors is spreading due to the multiple advantages such as: electromagnetic immunity, flexibility, high sensitivity, multiplexing capability, and lightness (Polygerinos et al., 2010; Udd and Spillman, 2011; Wang and Wolfbeis, 2012). Several studies promote such sensors for different fields of application such as: automotive (da Silva et al., 2010), medicine (Silvestri and Schena, 2011) and smart textile (Massaroni et al., 2015) among the others. Depending on the working principle, fiber optic based sensors entail different ways of operation: micro and macro bending (Heo et al., 2007; Pirozzi, 2012), interferometry (Liu et al., 2012), hybrid optoelectronics (Ascari et al., 2007) and Fiber Bragg Grating (FBG) (Liang et al., 2018). In parallel to the development of tactile sensors, the robotics community has produced a vast amount of research on hand design. Hand design is typically applicationdriven, leading to different arrangements ranging from simple two-finger grippers to complex contraptions that mimic the mechanics of the human hand (Eusebi et al., 1994; Ramos et al., 1999; Townsend, 2000; Butterfaß et al., 2001). This paper presents the case of a four-finger under-actuated hand (Cam-Hand) that endows Jet Propulsion Laboratory's (JPL) quadruped RoboSimian robot with both manipulation and versatile mobility capabilities (Hebert et al., 2015; Karumanchi et al., 2018). This robot uses its limbs for mobility and manipulation such as grasping. Each seven degree of freedom limb consists of a set of three elbow assemblies and an actuator mechanically linked to the main body. The limbs end with a six-axis force sensor which interfaces the Cam-Hand (**Figure 1A**). The hand consists of an aluminum body and four aluminum fingers configured for many uses including being used as a gripper tool. The chosen design, conceived for use in scenarios that require robust manipulation, resulted in a system that enhances grip strength and robustness over dexterity and flexibility. Not being designed to prioritize complex manipulation tasks or handling fragile objects limits the variety of tasks the robot is able to perform. The present work is aimed at overcoming these limitations and enhancing safety and control during interaction with the surrounding environment. The RoboSimian Cam-Hand has been redesigned by sensorizing the artificial fingers to enable tactile feedback. New sensorized robotic fingers have been devised, embedding optical fiber sensing technologies, to gain information on grasped object properties as well as the contact conditions. The choice of the robotic hand sensorization was based on some crucial requirements such as (i) the ability to provide information about the contact (i.e., intensity), (ii) the ability to provide information about the grasped objects (i.e., size, rigidity, etc.), and (iii) the ability to perform manipulation tasks (i.e., estimation of grasp stability) (Kappassov et al., 2015). Moreover, it is worth mentioning that the robotic hand presents additional constraints related to the physical integration of the sensors. The adopted technology has to meet the requirements imposed by the robotic hand layout and design. Hence, the sensorization needs to be achieved without affecting the dexterity of the hand, i.e., without drawbacks in terms of bulkiness and encumbrance. Considering the aforementioned physical and tactile requirements, FBG technology was chosen to realize the sensor due to its adaptability to the design of the artificial hand, for its reliability in strain measurements and for the multiplexing capabilities that entail a high spatial resolution without an overwhelming amount of wires (**Supplementary Material Video 4**). State-of-art regarding applications adopting FBGs as transducers provide evidence of tactile sensors used in different scenarios. Compared to related works (i.e., soft tactile sensors embedding FBGs) (Heo et al., 2006; Saccomandi et al., 2015; Song et al., 2015; Jiang and Xiang, 2017; Negri et al., 2017; Li et al., 2018; Pedroso et al., 2018), the present sensor shares the concept of encapsulating the optical fiber in a soft matrix. Such cover not only protects the fiber from mechanical ruptures but also affects the transduction of the signal by mediating the transmission of pressure to the buried FBGs. In comparison to previous studies embedding FBGs in prototypical matrices (e.g., parallelogram bricks), the location of our FBGs were based on the design of the robotic hand, expressly functional to gripping tasks. One common elastomer used as encapsulating material is PDMS (Heo et al., 2006; Saccomandi et al., 2015; Song et al., 2015; Jiang and Xiang, 2017), while in this study a soft Dragon Skin silicone (20 medium, Smooth-on, USA) was

used, due to its higher flexibility and lower delamination between layers. Further details on the adopted elastomer, are given in the Materials and Methods section.

The scope of the present study goes beyond the development of a soft and flexible tactile sensor. The novelty of the work also relies in the demonstration of a closed-loop control strategy for fine manipulation (Fragile Task), and in extracting features of manipulated objects, whereas in state of the art studies FBG wavelength variations were used to estimate several quantities (e.g., pressure, force, hardness) but within an openloop scheme, without affecting the control variable. The work is organized as follows: Section Materials and Methods describes the transduction principle of the FBG transducers as well as the Cam-Hand design, the fabrication process and the control system. Furthermore, the same section reports about the experimental protocols and the data analysis. Results are presented in section Results, followed by the discussion and conclusion in section Discussions and Conclusions.

# MATERIALS AND METHODS

# Fiber Bragg Grating Transduction Principle

An FBG is a reflector, formed by systematic variation of refractive index, inscribed in the core of an optical fiber. This resonant microstructure acts as a narrow band filter. When light propagates along the optical fiber, and reaches the etched FBGs, part of the source is reflected. This reflected signal is called Bragg Wavelength (λB) and it depends on the grating spatial period (1B) and the effective refraction index (ηeff) of the optical fiber as in Equation (1):

$$
\lambda\_{\rm B} = 2 \cdot \eta\_{\rm eff} \cdot \Lambda\_{\rm B} \tag{1}
$$

Strain conditions and temperature variations imparted on the FBGs lead to variation of λ<sup>B</sup> resulting in changes of the grating spatial period (3B), or effective refraction index (ηeff). In the present work the contribution of temperature is negligible, since the whole experimental session was performed at room temperature.

# Sensorized Robotic Hand Design

The Cam-Hand body houses the driving electronics and three brushed DC motors (Maxon precision motor, Sachseln, Switzerland). The finger geometry follows a cam profile on the outside and a hook style shape on the inner profile. The system is comprised of two outer fingers slaved together and two inner fingers that are independent. Through continuous rotation of the fingers the Cam-Hand is able to achieve a huge number of configurations and grasping angles.

The Cam-Hand includes four fingers and the inner fingers were sensorized due to their independent actuation. Optical fibers (Technica Optical Components, Atlanta, GA, USA) that exhibit a diameter of 80µm (100µm with polyimide coating) were chosen as small diameters allow for low bending radius configurations. The fibers house 6 FBGs, each grating is 8 mm long and located at a distance of 10 mm, center-to-center, from the adjacent FBG. **Table 1** provides further details about the chosen technology. The optical fibers were encapsulated in a soft polymeric material integrated into the rigid artificial finger. According to previous works (Massari et al., 2018), Dragon Skin (20 medium, Smooth-on, USA) was chosen as soft material for encapsulating the optical fibers. This polymer shows remarkable physical properties such as high elongation at break and high flexibility (Cai et al., 2013). Moreover, during grasping, silicone mediated the transmission of pressure to the buried FBGs, applied by the grasped object to the robotic finger. Maintaining the same design of the previous Cam-Hand, new customized fingers were developed, in aluminum, with a notch to allow the insertion of the soft material and the relative optical fibers (**Figure 1B**). Such a groove held an irregular shape that followed the curvature of the robotic fingers. Both sides of

#### TABLE 1 | Datasheet of the optical fibers integrating FBG transducers.


\*SLSR, Side Lobe Suppression Ratio. \*\*FWHM, Full Width Half Max.

the finger presented the groove and were connected by means of a series of holes (3.5 mm diameter) whose purpose was to hold the polymer in a fixed position. Approximately, the notch resulted 62 mm in length, 4.4 mm in width and 2.5 mm in height (**Supplementary Figure 1**). The liquid polymer was casted to fill the notch and thus filling out the shape of the artificial finger when not sensorized. The final design includes two optical fibers located at each side of the finger. The sensitive area of the finger is approximately 60 mm, which corresponds to the front part of the finger, namely the area responsible for the grip. Several iterations of molds were created to realize the polymeric filling and fabrication process. This involves three consecutive steps:


In step (i) and (iii) silicone was degassed to minimize air bubbles and cured at room temperature until solidification was reached.

#### Cam-Hand Controller

The movements of the Cam-Hand were piloted by means of a DC voltage supply (HMC804x Power Supply, Rohde & Schwarz, Munich, Germany), a relays circuit (4-channel 5V USB Relay Module, SainSmart, USA) and an optical interrogator (Hyperion si155, Micron Optics, GA, USA), that was reading the FBGs output. A Graphical User Interface was realized in LabVIEW (National Instruments, TX, USA) to control the previous units and for data acquisition (**Figure 2**). A positive voltage was applied to the DC motor to close the hand and vice versa a negative voltage applied to open it. At constant load, higher voltage values entailed higher motor speed (rpm) and consequentially faster movements of the fingers. Through the power supply, the voltage flow was regulated to set the velocity and the relays were switched on/off to close/stop/open (+V/0/-V) the fingers. The initial configuration, also called free configuration corresponded to a condition where the fingers were open and ready to perform the grasp, while the grasp configuration matched with the condition of the fingers closed around the objects. Two controllers were developed: (i) static controller and (ii) dynamic controller. In the first case a fixed voltage equal to 13.5 V was given to the motor, thus allowing the fingers to close or open at constant speed. Depending on the FBGs output and through switching the relays, the static controller achieved the action of closing, stopping and opening the fingers. Two thresholds, lower and upper, were set on the mean of all FBGs wavelength variation. Enabling the static controller to activate the transition from the free configuration to

the grasp configuration. When the mean wavelength variation was lower than the first threshold, the controller allowed the flow of a positive voltage and the corresponding closing movement. When the mean wavelength variation trespassed such threshold, due to the higher pressure applied from the object to the sensorized finger and the consequentially higher strain suffered by the optical fiber, and entered in the grasp configuration, the controller disabled the voltage flow and stopped the hand. Opening the hand, thus giving negative voltage, took place when the mean wavelength variation raised over the upper threshold. In the second controller, instead, the given voltage was not constant but function of the mean wavelength variation. Through a PID controller (Proportional—Integrative—Derivative), in a closed loop, different values of voltage (based on the mean wavelength variation) were given to maintain a steady grab condition. A desirable value was established for the mean wavelength variation, corresponding to a certain grab condition, hereafter called set point. The controller was aimed at regulating the voltage values and switching the relays on/off in order to reach this value and sequentially to maintain it. Scope of such controller was to respond, dynamically, with different voltages to changes in size of the grasped object without slipping or breaking (**Supplementary Material Video 3**).

#### Experimental Materials and Protocols

The performance of the proposed version of the Cam-Hand was evaluated through different tasks that involved the action of grabbing several objects in various conditions (**Supplementary Figure 2**). Within this work, four tasks were performed: (i) Size Task, (ii) Material Task, (iii) Fragile Task, and (iv) Dynamic Task. The first and second tasks assessed the capability of the sensorized fingers to estimate mechanical properties of grasped object, namely size and rigidity. The third task, representing a qualitative test, evaluated the ability of the system to grab fragile objects without slipping or breaking them, thus obtaining a measure of the sensitivity of the Cam-Hand (**Supplementary Material Videos 1, 2**). The last task measured the Cam-Hand capacity to dynamically adapt its position based on objects that could change size (**Supplementary Figure 3**). For the Size Task 5 plastic cylinders, 3D printed in ABS, with varying diameter from 10 mm to 50 mm with step of 10 mm were realized. The height of such cylinders was constant and equal to 150 mm (**Figure 3A**). For the Material Task 4 cylinders, 30 mm in diameter and of 150 mm height, realized in different materials were used. Such cylinders had increasing Young

GPa. (C) Fragile task: Nachos used to test the sensitivity of the Cam-Hand. (D) Dynamic task: mechanical jack with variable length.

Modulus: Dragon Skin E ≈ 0.34 MPa, Vytaflex E ≈ 2 MPa (60 A, Smooth-on, USA), ABS E ≈ 2.2 GPa and Aluminum E ≈ 70GPa (**Figure 3B**). For the Fragile Task commercial nachos were used, that can be considered to be very fragile objects (**Figure 3C**). The Dynamic task involved the realization of a mechanical jack composed by two concentric cylinders, 29 mm and 23 mm in diameter and 40 mm and 28 mm in height, respectively. These objects were realized in ABS with a 3D printer and the top and bottom part were covered by a thin layer (5 mm) of Dragon Skin 20. The mechanical jack was attached to a motorized linear stage (A-LST0500B-C, Zaber Technologies Inc., Vancouver, Canada) that could control its length by moving backwards and forwards (**Figure 3D**).

For the first three tasks the procedure consisted of performing a grasp with the Cam-Hand piloted by the static controller. The sample was manually held between the robotic fingers in free configuration. Then, by enabling voltage flow, the Cam-Hand started closing its fingers until reaching the grasp configuration. After a few seconds of grasping, the robotic hand was manually released and brought back to the free configuration. Each trial of the first two tasks was executed 10 times for repeatability, thus having 50 tests for Size Task and 40 tests for Material Task. For the Fragile Task, 20 repetitions were achieved. For the first and the second task, the lower threshold was set to 0.01 nm, while for the Fragile Task it was set to 0.16 nm. In the Dynamic Task a grasp of the mechanical jack was performed with the Cam-Hand piloted by the dynamic controller. The set-point was set to 0.16 nm, this value was reached after enabling voltage flow and passing from free configuration to grasp configuration. After the grasping action, random values of travel range (from −25 to 25 mm) and velocity (from 0.5 to 3 mm/s) were given to the linear motorized stage. Consequentially, the mechanical jack linked to the stage started to move with different random velocity into a different random position. The Cam-Hand adapted its position, with a velocity proportional to the velocity of the stage, based on the mean wavelength variation (**Figure 4**). Moreover, another test was performed in which random values of travel range were given, but the values of velocity were increasing, starting from 0.5 mm/s and increasing by 0.1 mm/s each 0.75 s. After reaching the destination the velocity was reinitialised to 0.5 mm/s. To measure the force required to break the commercial nachos, an experimental task was performed. Fifteen nachos were brought to

fracture by applying a compressing force using a robotic platform composed by a load cell (Nano 43, ATI Industrial Automation, Apex, NC, USA) and a motorized vertical stage (8MVT120-25- 4247, STANDA, Vilnius, Lithuania). During the experiments, the motorized stage was commanded with a speed of 2.5 mm/s until breakage of the nachos samples was achieved, while the load cell was tracking the applied load force. We thus estimated the sensitivity of the FBG sensor as the ratio between the peak wavelength variation measured during grasping and the load breaking the nachos.

#### Data Analysis

The Neural Network Pattern Recognition App, developed within the Neural Network Toolbox in Matlab (MathWorks, Inc., MA,

USA) was employed to predict the diameter and the rigidity of the grasped objects from the FBGs wavelength variation, in the Size Task and in the Material Task, respectively. In the Size Task, the proposed classifier comprised 12 input neurons, namely all the FBGs reflected the wavelength exerted during the grab, while in the Material Task there was 1 input neuron, namely the slope of the wavelength variation function of time (1λ/1t) tracked during the grab. Both the neural network comprised 10 hidden neurons and 1 output neuron, which was the cylinder diameter value in mm in the first task and the material rigidity in the second task. The neural networks were trained using the conjugate gradient backpropagation method. The experimental data were divided into three complementary subsets: (i) training set, (ii) validation set, and (iii) testing set. For each class (i.e., diameter), 10 repetitions were performed and 8 trials were used for training, 1 trial for validation and 1 trial for test. To reduce variability, multiple rounds of cross-validation, using different partitions, were performed. The "Leave-one-out crossvalidation" method was adopted, which used one observation as a test set (and one as a validation set) and the remaining as training set. This partition was repeated, each time changing the test set and consequentially the other two subsets, until all the 10 trials were considered one time as test set. The results of the different cross-validation were combined (i.e., averaged) to assess the neural network's predictive performance by means of a confusion

matrix. Within the Fragile Task, the ability of the Cam-Hand to deal with fragile objects was assessed by calculating the number of broken samples during grasps. To assess the performance of the Cam-Hand to follow the changes in the grasped objects (Dynamic Task) the Root Mean Square Error (RMSE), the Normalized RMSE (NRMSE) and the NMRSE calculated for the data included in the interquartile range (NRMSE(IQR)) were calculated as expressed in Equations (2–4).

show the complete range across samples.

$$\text{RMSE} = \sqrt{\frac{\sum\_{n=1}^{N} \left(\varkappa\_{1,n} - \varkappa\_{2,n}\right)^2}{N}} \tag{2}$$

$$\text{NRMSE} = \text{RMSE} / \overline{\text{x}} \tag{3}$$

$$\text{NRMSE}\_{\text{(IQR)}} = \text{RMSE}\_{\text{(IQR)}} / \overline{\mathfrak{X}}\_{\text{(IQR)}} \tag{4}$$

#### RESULTS

Through different tasks, the capability of the proposed FBGsbased robotic hand was assessed to provide tactile feedback. By evaluating the performance of the proposed classifier for size recognition of different grasped objects, an overall accuracy of 99.36% was achieved. Individual accuracy values were calculated for each diameter: 99.3% for 10 mm, 99.4% for 20 mm, 99.6% for 30 mm, 98.5% for 40 mm and 99.9% for 50 mm. Moreover, it is relevant to observe that misclassification, within the different classes, happened mainly with their relative neighbors (**Figure 5**). Physical properties of grasped objects strongly affected the FBGs wavelength variation. The slope of the wavelength variation function of time (1λ/1t) increased monotonically with increasing Young Modulus across the different materials (**Figure 6**). High repeatability was achieved across all trials: the median value of the slope was 0.87 ± 0.02 nm/s for Dragon Skin

20, 1.61 ± 0.04 nm/s for Vytaflex, 2.48 ± 0.05 nm/s for ABS and 3.54 ± 0.12 nm/s for Aluminum (**Figure 6**). Furthermore, the classifier used to predict the rigidity showed an accuracy of 100% for all the classes. Within the Fragile Task, when performing a grasp on the nachos, only one sample among the 20 executed trials was broken (**Figure 7**). The results of the Dynamic Task presented similar performances in the two experimental conditions, both in the experiments with constant velocity (**Figure 8**) and in those where velocity increased from one position to the next (**Figure 9**). Error values were: RMSE = 0.019 nm, NRMSE = 12% and NMRSE(IQR) = 1.2% for the first condition and RMSE = 0.014 nm, NMRSE = 9% and NMRSE(IQR) = 2.2% for the second condition. The mean force value needed to break the sample was experimentally estimated to be 9.49 N ± 3.13 N. Combining this result with the FBG wavelength variations recorded in the fragile task turns out in a sensitivity estimation of at least 139 pm/N.

## DISCUSSIONS AND CONCLUSIONS

The results obtained through the different tasks had very high precision in identifying relevant properties of grasped objects (Size Task, Material Task and Fragile Task) as well as the contact conditions (Dynamic Task). Within the Size Task, the sensorized hand allowed recognition of the diameters of the

FIGURE 8 | Graph showing the dynamic task results. In the upper plot, the red line represents the set point value equal to 0.16 nm, the blue line represents the process variable, namely the wavelength variation of the mean of the FBGs. In the bottom plot the blue line represents the position of the motorized translational stage.

cylinders from Bragg wavelength variations, using the proposed neural network for pattern recognition (99.36% accuracy). Since the trials of the Material Task were performed using the static controller, thus selecting a constant velocity for fingers movements, such condition also allowed for estimation of the hardness of the different materials via the temporal variation of the Bragg wavelength (100% accuracy using the proposed classifier). Our hypothesis relied on the evidence that, at constant speed, harder material involved faster Bragg wavelength signal variation. Within the Fragile task, observations made on several grabbed samples (nachos) allowed for the reliability of the sensorized finger to be evaluated in handling such kind of fragile objects. The achieved experimental results are quite generalized, since each nacho had a different shape and size. The scope of the task was to understand qualitatively the sensitivity of the sensor. The results of the Dynamic Task provided evidence for the capability of the sensorized finger to adapt its position based on the variation of the length of a mechanical jack. Furthermore, it is clear that in both the performed conditions and during the entire travel range, the difference between the set point and the process variable was very low as demonstrated from the NRMSE(IQR) values. However, the results presented some peaks related to the phase of inversion of the motion of the stage, as highlighted from the NRMSE values, which were based on the entire raw dataset and not only the interquartile range. When the stage reached a position, it moved immediately into another position and therefore direction. Consequentially, the robotic finger passed from the action of closing to the action of opening (or vice versa) that caused an error of the controller in maintaining constant grasp conditions. Although the PID controller was not always accurate, the maximum and minimum values of wavelength variation were acceptable for keeping a good grasp without breaking or slipping the object. We believe that the peaks encountered in the task are not related to the sensor performance but instead related to the used motor drivers (relays circuit). The scope of the work was mainly centered around the evaluation of the proposed tactile sensorization and not around the realization of a perfect controller. Furthermore, the mechanical jack linked to the motorized stage through a long steel bar could have influenced the presented results. Future work will address the integration of the sensorized Cam-Hand in a robotic arm, thus bypassing the issues of the controller, since the actuation part will be managed by the motor controllers of the arm. Moreover, further investigations will carry out experimental tasks with other shapes (not only cylinders) but will also evaluate wider ranges of diameters and rigidity to increase the variety of the grasped samples. Further studies will also deal with the calibration of the system in order to estimate the relationship between the wavelength variation and the applied pressure to the robotic finger. A limitation of the present study is related to temperature compensation ability. FBGs directly respond to strain and temperature changes. Such intrinsic sensitivity to both physical variables requires a compensation method to split the contribution of mechanical actions from the contributions of possible temperature changes. Considering that the environmental conditions of the laboratory were stable within the performed experiments, the temperature contribution was not considered. Future studies will aim at introducing temperature compensation solutions, for example by means of dummy FBGs not being affected by strain but by temperature changes only.

The present paper introduced a robotic hand sensorized with optical fibers, embedding FBGs transducers, to convey tactile feedback in robotic manipulative tasks. To the best of our knowledge this is the first study that demonstrates the application of FBG technology in a robotic hand, in order to achieve fine object manipulation and features extraction based on closed-loop control. The choice to sensorize such a gripper with optical fibers is based on their flexibility in the integration process, but also on their high sensitivity in strain measurements. Thanks to this integration of tactile sensors, the new Cam-Hand design targeted the following abilities: (i) estimation of the grasped object size, (ii) detection of a value related to the Young modulus of the grasped objects (iii) grasping of objects with different mechanical properties (i.e., fragile, deformable, stiff) without their slippage or breakage, and (iv) dynamic adaptation of the fingers in order to maintain constant wavelength variation, independent of the shape of the objects. We believe that multiple advantages of FBG technology, demonstrated throughout the paper, can move forward the current state of the art. Beyond the aforementioned advantages, optical fibers ensure light weight solutions and distributed sensor capabilities. Another interesting advantage of using such a technology is the multiplexing capability. Multiple FBGs can be housed along one single optical fiber by means of just minor arrangements, hence improving the sensing capabilities without drawbacks in terms of complexity and bulkiness. Finally, FBGs pave the way for RoboSimian to operate in those application scenarios that require electromagnetic immunity, where most of the conventional sensors are unsuitable.

# AUTHOR CONTRIBUTIONS

LM, CO, and KC conceived the sensorized hand. LM and KC realized the sensorized hand. ES supported the design by means of engineering tools and contributed to data interpretation. LM, CO, ES, and KC designed the experimental protocols. LM, RD, and KC performed the experiments. LM, RD, and JB analyzed the data. LM, CO, ES, and KC discussed the results. LM, CO, and KC wrote the paper. CO and KC supervised the study.

# ACKNOWLEDGMENTS

Part of the research described in this paper was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration (80NM0018D0004). Furthermore, this project received seed funding from the Dubai Future Foundation through Guaana.com open research platform and from the Italian Ministry of Education, Universities and Research within the Smart Cities and Social Innovation Under 30 program through the PARLOMA Project (SIN\_00132).

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnbot. 2019.00008/full#supplementary-material

#### REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Massari, Oddo, Sinibaldi, Detry, Bowkett and Carpenter. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Development, Analysis, and Control of Series Elastic Actuator-Driven Robot Leg

#### Chan Lee and Sehoon Oh\*

*Motion Control Lab, Department of Robotics Engineering, Daegu Gyeongbuk Institute of Science and Technology, Daegu, South Korea*

The mass-spring system-like behavior is a powerful analysis tool to simplify human running/locomotion and is also known as the Spring Loaded Inverted Pendulum (SLIP) model. Beyond being just an analysis tool, the SLIP model is utilized as a template for implementing human-like locomotion by using the articulated robot. Since the dynamics of the articulated robot exhibits complicated behavior when projected into the operational space of the SLIP template, various considerations are required, from the robot's mechanical design to its control and analysis. Hence, the required technologies are the realization of pure mass-spring behavior during the interaction with the ground and the robust position control capability in the operational space of the robot. This paper develops a robot leg driven by the Series Elastic Actuator (SEA), which is a suitable actuator system for interacting with the environment, such as the ground. A robust hybrid control method is developed for the SEA-driven robot leg to achieve the required technologies. Furthermore, the developed robot leg has biarticular coordination, which is a human-inspired design that can effectively transmit the actuator torque to the operational space. This paper also employs Rotating Workspace (RW), which specializes in the control of the biarticulated robots. Various experiments are conducted to verify the performance of the developed robot leg with the control methodology.

Keywords: biarticular actuator coordinate, series elastic actuator, rotating workspace, leg force control, impedance control

# 1. INTRODUCTION

Humans and animals' dynamic walking has been attracting many engineers' and scientists' attention, and the dynamic walking of a robot itself has been a very prominent research topic for robotic engineers.

Researches on dynamic walking can be categorized into two groups: an engineering approach and a scientific approach. The engineering approach aims to build legged robots and to find a control algorithm that can realize the dynamic walking of the robot walk. Various research aiming to build biped robots and quadruped robots (Boaventura et al., 2013; Englsberger et al., 2014; Hutter et al., 2016; Kuindersma et al., 2016; Jung et al., 2018) is considered to be within this approach. Walking algorithm theories such as Zero Moment Point stabilization are also categorized into this research approach, whereas the scientific approach attempts to analyze actual human walking and then derive a walking dynamic model (Seyfarth et al., 2002; Geyer et al., 2006). Medical researchers, physical therapists and even zoologists are becoming more involved in this research

#### Edited by:

*Kosta Jovanovic, University of Belgrade, Serbia*

#### Reviewed by:

*Wesley Roozing, University of Twente, Netherlands Tom Verstraten, Vrije University Brussel, Belgium Jonathan Hurst, Oregon State University, United States Helei Duan, Oregon State University, United States, in collaboration with reviewer JH*

#### \*Correspondence:

*Sehoon Oh sehoon@dgist.ac.kr*

Received: *24 January 2019* Accepted: *11 April 2019* Published: *07 May 2019*

#### Citation:

*Lee C and Oh S (2019) Development, Analysis, and Control of Series Elastic Actuator-Driven Robot Leg. Front. Neurorobot. 13:17. doi: 10.3389/fnbot.2019.00017*

(Farley et al., 1993; Ferris et al., 1998; Zajac et al., 2002, 2003). Moreover, the advances in motion measurement technologies boost the demand for better algorithms of these approaches. There also are attempts to integrate the two approaches—the realization of dynamic walking using an actual legged robot based on dynamic characteristics.

The Spring Loaded Inverted Pendulum (SLIP) model is a promising dynamic model based on a scientific approach that can explain humans' dynamic walking and running in a simple formulation (Poulakakis and Grizzle, 2009), which is different from the complicated engineering approach. In the SLIP model, it is assumed that a (robotic) leg behaves as a pure spring. Still, building an actual robot leg that can exhibit the pure dynamic characteristics of a spring is challenging (Oh and Kong, 2014). Most recent studies have focused on 'utilizing SLIP dynamics for locomotion/hopping' after the work of Raibert (1986). Raibert has developed a mechanical leg design that is a combination of a linear actuator for the interaction and a rotary actuator for the leg swing motion. This mechanism is intuitively matched to the motion of SLIP dynamics, thus enabling a successful SLIPbased locomotion. The other straightforward mechanical design that has been designed is called 'clock-actuated SLIP' (Altendorfer et al., 2001; Seipel and Holmes, 2007). The mechanism also matches motions the mechanical characteristics of the SLIP dynamics. Even though the mechanisms have successfully realized SLIP-based running/locomotion, they are still different from the real human- or animal-like leg. Therefore, these approaches can be utilized only for their respective application.

Furthermore, some researchers have intensively analyzed this problem and came up with control algorithms that can address the realization of dynamic motion based on robot dynamics (Yamaguchi et al., 1999; Khatib et al., 2008; Tsagarakis et al., 2013; Koolen et al., 2016; Oehlke et al., 2018). For instance, Hutter et al. (2010) claimed that the problem is caused by the inherent dynamics of the articulated robot in the operational space and proposed an operational space dynamics control. However, this research is still in a bottleneck because the method does not utilize any feature of the articulated leg or the specific workspace. Therefore, the method needs improvement of the dynamic modeling, with additional complexity coming from the modeling based on the Cartesian workspace. However, it can be said that there is still a large discrepancy between the two approaches. Therefore, the previously proposed walking dynamics still cannot be applied to the actual robot, because the dynamic behavior of the actual robot is quite different from the theoretical model, especially during fast and dynamic motions. Moreover, there is no recent significant finding after (Hutter et al., 2010) to provide the solution for enabling pure SLIP dynamics of the articulated leg.

The other significant issues due to the discrepancy are the difference of the dynamic model disturbed by external forces and the nominal dynamic model, and as a result, it is more difficult to control. Specifically, in the dynamic reaction of a robot walking, the ground reaction force affects the robot continuously. Hence, force control is applied to the leg to render the dynamic response of a robot leg against external force. Nowadays, many humanoid robots have employed a force/torque control platform (Englsberger et al., 2014; Hyon et al., 2017). However, force control of a robot leg has various challenges such as friction and noise in the force sensors. To solve this problem, a robot leg that is fully driven by two Series Elastic Actuators (SEAs) is developed in this research.

Since SEAs enable high-performance force control, the proposed robot leg exploits this high-performance force control to realize ideal SLIP dynamics. There are remarkable studies which equip SEAs for the biped robot called "ATRIAS" (Rezazadeh et al., 2015; Martin et al., 2017). However, the main difference with this paper is that ATRIAS utilizes the SEAs as a velocity source to regulate the ground reaction force under the SLIP-based high-level controls and switches the control method depending on the gait phase (flight/stance). In this paper, the SEAs are focused on a realization of pure SLIP dynamics based on their force controllability.

In addition to the employment of SEAs, the introduction of the biarticular actuation mechanism is applied to facilitate the realization of SLIP dynamics of the articulated robot leg. The biarticular actuation is known for its specific functionalities and roles in human motions (van Ingen Schenau et al., 1987; Zajac et al., 2002, 2003). Also, many biomimetic robots have employed this mechanism for jumping and running (Hurst et al., 2010; Niiyama et al., 2010). Taking into consideration this advantage of the biarticular actuation mechanism, the proposed robot leg is designed based on this mechanism. The dynamic analysis of joint-space motion is conducted based on the biarticular actuator coordination.

The approaches above are considered as mechanical contributions to realizing SLIP dynamics using an articulated robot. Contribution from software can be seen in the development of Hybrid Control (HC). HC is a combination of impedance control for the realization of spring dynamics and position control for attack angle control of the SLIP dynamics. To implement this controller, suitable operational space—which is called Rotating Workspace (RW)—is employed (Oh and Kong, 2014). The dynamic behavior of the developed robot leg in the RW is analyzed to improve the performance of HC. This paper proposes a serial and systematic procedure to achieve SLIP dynamics by fully utilizing the advantage of the biarticular mechanism as well as the employment of SEAs and HC in RW.

## 2. MATERIALS AND METHODS

#### 2.1. Design of Series Elastic Actuator-Driven Robot Leg

**Figure 1** illustrates the force-controllable robot leg which is developed in this research. The development of the robot leg is 2-fold: the introduction of the biarticular actuator mechanism and the application of the Series Elastic Actuator. The following subsections provide the advantages of these design components and methodologies of force control.

**Abbreviations:** DOB, Disturbance Observer; SEA, Series Elastic Actuator; SLIP, Spring Loaded Inverted Pendulum.

#### 2.1.1. Design Motivation of Bio-inspired Robot Leg Using Biarticular Actuator Mechanism

In the musculoskeletal system, biarticular muscles are muscles that cross two joints to generate torque at both joints. Most of the human muscles that generate a large amount of force or power are usually biarticular muscles, such as the rectus femoris at the front thighs, and the hamstring at the back thighs as shown in **Figure 2**.

In the human locomotion mechanism, biarticular muscles play a vital role to link the movement of muscles in the limb (Zajac et al., 2002, 2003). When a joint is locked by the co-contraction of a pair of monoarticular muscles, the associated biarticular muscles act as powerful monoarticular muscles that actuate the whole extremity. Also, the length change of the biarticular muscles is much smaller than that of the monoarticular muscles for generating normal human motion (van Ingen Schenau et al., 1987).

Consequently, the biarticular muscle mechanism enables the high efficiency and performance of the human lower extremity movement. This high efficiency has been theoretically (Oh et al., 2011) analyzed and evaluated through experiments (Choi et al., 2016; Roozing et al., 2018).

#### 2.1.2. Implementation of Biarticular Actuator Mechanism

Muscle coordination is known to play an essential role in various multi-joint human motions (Zajac, 1993). Similar to this concept, the actuator coordination of a robot is also to be well-designed to enhance the performance of actuators in a system. **Figure 3** illustrates possible actuator coordinations to configure a twolink robot. **Figure 3A** is the conventional serial type two link robot, each joint of which is equipped with an actuator. Oh et al. (2011) has shown the inefficiency of this actuator coordinate in terms of torque/force generation. Moreover, the actuator is located on each link in this configuration which imposes the weight of the actuator on every link. It is also noticeable that this configuration needs to utilize the relative joint coordinates to describe its motion.

**Figures 3B,C** illustrates the biarticular actuator coordination in a two-link robot, where the belt/wire in **Figure 3B** and the linkage in **Figure 3C** are utilized to transmit the torque of one actuator in the proximal joint to the distal link. This transmission enables the biarticular actuator coordination in a robotic system (Choi et al., 2016; Roozing et al., 2016). As this mechanism can locate the actuators on the proximal joint, the weight of the distal link is reduced, which is considered as another advantage.

This research adopts the link mechanism in **Figure 3C** to employ the biarticular actuator coordination as well as to guarantee large force transmission along with robustness against an impact from the environment.

**Figure 4** shows the details of the developed SEA-driven biarticular robot leg. Two SEAs are connected on the body to provide mono- and biarticular torques. The torques are transmitted to the links through the timing belts, which have a gear ratio of 2:1. For simplicity of the dynamics, both monoand biarticular-side links are designed to be equal to 0.3 m. The specifications of the parameters are summarized in the table on the right-hand side of **Figure 4**.

#### 2.1.3. Actuator Design: Transmission Force Sensing Type Series Elastic Actuator

SEA of Pratt and Williamson (1995) utilizes a spring as a part of its transmission to measure and control the interacting force between the motor and the load. As explained in section 1, SEA is employed as the actuator system in this research since it can provide high-fidelity torque/force while it is robust against the impact due to its inherent compliance.

FIGURE 3 | Various configurations of a two link robot. (A) series configuration, (B) biarticular configuration (wire/belt-driven), (C) biarticular configuration (linkage-driven).

There are various types of SEA configurations (Lee et al., 2017) such as the direct force sensing type (FSEA) (Pratt and Williamson, 1995), the reaction force sensing type (RFSEA) (Paine et al., 2014; Park et al., 2018) and the transmission force sensing SEA (TFSEA), Lee and Oh (2016). TFSEA is utilized as the actuators for the proposed robot leg.

**Figure 5** shows the TFSEA utilized in this research, where the spring is located between the ring gear of the planetary gear and the ground. This configuration allows the measurement of the absolute transmission force by using a single encoder and the robustness against the dead zone caused by gear backlash. The details of the configuration of TFSEA are described in Lee and Oh (2016).

The SEA in **Figure 5** developed for the robot leg consists of a current/torque-controlled BLDC motor with a servo driver (Maxon motor, EC 4-pole 48V and ESCON70/10), a harmonic gear (Harmonic Drive <sup>R</sup> , CSF-11), a torsional spring (custommade), and two encoders (incremental type motor-side encoder is 5,000 count/turn resolution and absolute type spring deflection encoder is 19-bit resolution). The maximum continuous torque of the SEA is 55.2 Nm (peak: 686 Nm), and the maximum permissible velocity is 80 rpm. The stiffness of the spring is

FIGURE 5 | Developed transmission force sensing type series elastic actuator (TFSEA).

128 Nm/rad, which leads to a high force sensing resolution of 0.3835 mN-m/tick with a 19-bit absolute encoder. Note that these specifications including output stiffness are obtained in the biarticular joint coordinates after timing belts and gears as shown in **Figure 4**.

#### 2.1.4. Robust Torque Control for the Developed SEA to Provide Ideal Torque Source

To achieve high-performance force control of the cPEA in the proposed robotic leg, model-based force control developed by Oh and Kong (2017) is applied in this research.

To design the model-based force control, the general dynamics of the SEA is derived from the free-body diagram in **Figure 6** as follows:

$$\begin{aligned} J\_m \ddot{\theta}\_j &= \ \mathfrak{r}\_{\dot{f}} - \mathfrak{r}\_s - B\_{\dot{f}} \dot{\theta}\_{\dot{f}}, \\ J\_l \ddot{\theta}\_l &= \ \mathfrak{r}\_s - B\_l \dot{\theta}\_l, \\ \mathfrak{r}\_s &= K\_s \theta\_s, \end{aligned} \tag{1}$$

where J<sup>j</sup> and J<sup>l</sup> are the moment of inertias of the motor and the load, B<sup>j</sup> and B<sup>l</sup> are the joint damping coefficients of the motor and the load, and θ<sup>j</sup> and θ<sup>l</sup> are the joint position of the motor and the load. θ<sup>s</sup> is the spring deflection. τ<sup>j</sup> is the torque generated by the motor and τ<sup>s</sup> is the torque transmitted through the spring. K<sup>s</sup> is the spring stiffness.

From the dynamics of the SEA in Equation (1), the transfer function from the motor torque τ<sup>j</sup> to the SEA transmission torque τ<sup>s</sup> can be obtained as follows.

$$P\_s(\mathbf{s}) = \frac{\mathbf{r}\_s}{\mathbf{r}\_{\dot{j}}} = \frac{K\_s(f\_{l\dot{s}}\mathbf{s} + B\_{l\dot{l}})}{(f\_{l\dot{s}}\mathbf{s} + B\_{l\dot{l}})(f\_{\dot{j}}\mathbf{s}^2 + B\_{\dot{j}}\mathbf{s}) + K\_s((f\_{l\dot{l}} + f\_{\dot{j}})\mathbf{s} + (B\_{l\dot{l}} + B\_{\dot{j}}))} \tag{2}$$

The actuator dynamics of SEA is basically linear. However, the value of the load-side moment of inertia J<sup>l</sup> is subjected to varying joint angle conditions, particularly in a multi-link robot like the proposed robotic leg. Therefore, a robust torque controller is required to provide an ideal torque source to the proposed robotic leg regardless of the joint angle.

To that end, a Disturbance Observer (DOB)-based robust torque controller is applied for the torque control of two

SEAs. The overall control scheme of the SEA torque control is illustrated in **Figure 7**. In **Figure 7**, P −1 sn (s) indicates the inverse model of the nominal transmission dynamics in Equation (2), and Q(s) is the second-order Q-filter of DOB. Cfb(s) and Cff (s) are the feedback and feedforward controller, respectively.

DOB can nominalize the dynamics of SEA in a specific frequency bandwidth determined by the Q-filter, leading to a robust force control against varying load conditions. Cff (s) is designed as the inverse model of the nominalized SEA dynamics, with a low-pass filter that can enhance fast-tracking performance.

The performance of the torque control algorithm is verified using the closed-loop frequency response measurements. To investigate the robustness of the torque control, these experiments were conducted with two different load conditions: with free-load, which means no additional load is attached to the SEA, and with the fixed-load environment. To analyze the performance of the frequency responses in these load conditions, a chirp-type excitation signal is applied with frequencies of 0.1–100 Hz. The output torques of SEA are measured by a torque sensor, and the experiments were conducted ten times to calculate the average frequency responses.

Note that the performance of the closed-loop response of the SEA torque control depends on the magnitude of the torque reference. This concept is called Large Force Bandwidth (LFB) (Hutter et al., 2016). **Figure 8** shows the characteristics of the torque-controlled SEA in this paper as well as the bandwidth variance regarding the magnitude values of the reference. Moreover, 5 Nm (red-colored) is the critical magnitude value before the safety protection of the motor driver is enabled (safety stop has occurred at the high-frequency region with 6 Nm magnitude reference). Therefore, 5 Nm is chosen as the magnitude of the reference torque for the evaluation of the robustness of the SEA torque control.

FIGURE 9 | Frequency responses of the proposed force control of the TFSEA. (A) force control with free load condition, and (B) with locked-load condition.

The experimental results for the robustness evaluation are shown in **Figure 9**, where the plots in **Figure 9A** are the responses with the free-load, and the plots in **Figure 9B** are the responses with the fixed-load. The gray-colored lines are the results of ten repetitions, and the red-colored lines are the averaged response. The responses are matched with second-order low-pass filters: 1 0.005<sup>2</sup> s <sup>2</sup>+0.004s+<sup>1</sup> and <sup>1</sup> 0.0045<sup>2</sup> s <sup>2</sup>+0.003s+<sup>1</sup> . The bandwidths are 32 and 35 Hz, respectively, which can be calculated from the filter models.

The results verify that the SEA with the model-based force control can provide high-fidelity force generation with small performance variation (8.57%). It can also be validated that the closed-loop frequency bandwidth of the force generation is up to 32 Hz regardless of the load condition.

## 2.2. Theoretical Analysis of Biarticular Actuator Coordination

For the realization of SLIP dynamics using SEAs, the kinetic characteristics of the robot are analyzed in the joint space. Since the developed robot incorporates the biarticular actuator coordination, the kinematic and dynamic characteristics should be formulated in this coordination.

#### 2.2.1. Kinematics and Statics Analysis of Biarticular Coordination

The basic actuator configurations of a two-link robot are illustrated in **Figure 10** where **Figure 10A** is the conventional serial link with two independent (monoarticular) actuators (which generate torques τ<sup>1</sup> and τ2), and **Figure 10B** is the proposed configuration with one monoarticular actuator and one biarticular actuator (which generate torques τ<sup>m</sup> and τ<sup>b</sup> ).

The kinematics of the conventional serial robot configuration shown in **Figure 10A** is given as

$$P\_{\chi} = l \cos \theta\_1 + l \cos \left(\theta\_1 + \theta\_2\right)$$

$$P\_{\mathcal{Y}} = l \sin \theta\_1 + l \sin \left(\theta\_1 + \theta\_2\right), \tag{3}$$

where P<sup>x</sup> and P<sup>y</sup> are the end effector position in the Cartesian coordinate system, and θ<sup>1</sup> and θ<sup>2</sup> represent the angles of two joints. l is the length of links. Note that the length of the first and second links are the same as l in this paper.

On the other hand, the kinematics of the biarticular actuator-coordinated link (in **Figure 10B**) is described using the monoarticular joint angle θ<sup>m</sup> and the absolute biarticular joint angle θ<sup>b</sup> , which are related to the conventional joint angles θ<sup>1</sup> and θ<sup>2</sup> as follows.

$$\begin{aligned} \theta\_m &= \theta\_1, \\ \theta\_{b.r} &= \theta\_2, \\ \theta\_b &= \theta\_1 + \theta\_2 = \theta\_m + \theta\_{b.r} \end{aligned} \tag{4}$$

where θb.<sup>r</sup> is relative angle between the links of the monoarticular and biarticular joints.

Based on this relationship, the end effector position P<sup>x</sup> and P<sup>y</sup> can be re-described using the biarticular coordination θ<sup>m</sup> and θ<sup>b</sup> as follows.

$$P\_{\chi} = l \cos \theta\_m + l \cos \theta\_b$$

$$P\_{\mathcal{V}} = l \sin \theta\_m + l \sin \theta\_b \tag{5}$$

The Jacobian matrix for the biarticular coordination can be derived by utilizing the partial differentiation of Equation (5) as follows:

$$
\begin{bmatrix} \dot{P}\_{\chi} \\ \dot{P}\_{\mathcal{V}} \end{bmatrix} = l \begin{bmatrix} -\sin\theta\_m & -\sin\theta\_b \\ \cos\theta\_m & \cos\theta\_b \end{bmatrix} \begin{bmatrix} \dot{\theta}\_m \\ \dot{\theta}\_b \end{bmatrix} = \mathbf{J}\_b \begin{bmatrix} \dot{\theta}\_m \\ \dot{\theta}\_b \end{bmatrix}, \tag{6}
$$

where **J**<sup>b</sup> is Jacobian for the biarticular coordination. The statics of the biarticular coordination can also be derived utilizing this Jacobian **J**<sup>b</sup> as follows.

$$
\begin{bmatrix} \tau\_m \\ \tau\_b \end{bmatrix} = J\_b^T \begin{bmatrix} F\_\chi \\ F\_\mathcal{Y} \end{bmatrix} = l \begin{bmatrix} -\sin\theta\_m \cos\theta\_m \\ -\sin\theta\_b \cos\theta\_b \end{bmatrix} \begin{bmatrix} F\_\chi \\ F\_\mathcal{Y} \end{bmatrix} \tag{7}
$$

This statics can be reorganized as follows, which verifies that the biarticular torque τ<sup>b</sup> works on the first joint and the second joint at the same time.

$$
\begin{bmatrix} \tau\_1\\ \tau\_2 \end{bmatrix} = \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix} \begin{bmatrix} \tau\_m\\ \tau\_b \end{bmatrix} \tag{8}
$$

#### 2.2.2. Dynamics of Two-Link Robot With Biarticular Actuator Coordination

Equation (9) describes the dynamics of a two-link robot with the conventional actuator coordination, which has two independent torque inputs τ<sup>1</sup> and τ<sup>2</sup> shown in **Figure 10A**, where θ¨ <sup>1</sup> and θ¨ <sup>2</sup> are the accelerations of two joints. g is the acceleration of gravity, lic is the distance from the center of a joint i to the center of the mass point of the link i, m<sup>i</sup> is the weight of the link i and I<sup>i</sup> is the moment of inertia.

$$\begin{aligned} \begin{bmatrix} I\_1 + I\_2 + m\_2 l^2 + 2m\_2 l\_{2c} l \cos \theta\_2 & I\_2 + m\_2 l\_{2c} l \cos \theta\_2 \\\ I\_2 + m\_2 l\_{2c} l \cos \theta\_2 & I\_2 \end{bmatrix} \begin{bmatrix} \ddot{\theta}\_1 \\\ddot{\theta}\_2 \end{bmatrix} \\ + \begin{bmatrix} -m\_2 l\_{2c} l \sin \theta\_2 (\dot{\theta}\_2^2 + 2 \dot{\theta}\_1 \dot{\theta}\_2) \\\ m\_2 l\_{2c} l \sin \theta\_2 \dot{\theta}\_1^2 \end{bmatrix} \\ + \begin{bmatrix} g(m\_1 l\_{1c} + m\_2 l) \cos \theta\_1 + g m\_2 l\_{2c} \cos(\theta\_1 + \theta\_2) \\\ gm 2l\_{2c} \cos(\theta\_1 + \theta\_2) \end{bmatrix} = \begin{bmatrix} \tau\_1 \\\ \tau\_2 \end{bmatrix} \end{aligned} \tag{9}$$

This dynamics description can be rewritten utilizing the biarticular coordinates with θm, θ<sup>b</sup> , τ<sup>m</sup> and τ<sup>b</sup> . Equation (10) represents the relationship between the biarticular coordinates and the conventional coordinates θ1, θ2, τ<sup>1</sup> and τ2.

$$
\begin{bmatrix} \theta\_1 \\ \theta\_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} \theta\_m \\ \theta\_b \end{bmatrix}, \begin{bmatrix} \tau\_m \\ \tau\_b \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \tau\_1 \\ \tau\_2 \end{bmatrix} \tag{10}
$$

Utilizing this relationship, the dynamics description in Equation (9) can be rewritten as in Equation (11), which is the dynamics description in the biarticular coordinates.

$$
\begin{split}
\begin{bmatrix}
\boldsymbol{\tau}\_{m} \\
\boldsymbol{\tau}\_{b}
\end{bmatrix} &= \begin{bmatrix}
I\_{m} & I\_{c}(\boldsymbol{\theta}\_{b,r}) \\
I\_{c}(\boldsymbol{\theta}\_{b,r}) & I\_{b}
\end{bmatrix} \begin{bmatrix}
\boldsymbol{\dot{\theta}}\_{m} \\
\boldsymbol{\dot{\theta}}\_{b}
\end{bmatrix} \\ &+ \begin{bmatrix}
\boldsymbol{\tau}\_{m,c}(\boldsymbol{\theta}\_{b,r},\dot{\boldsymbol{\theta}}\_{b}) \\
\boldsymbol{\tau}\_{b,c}(\boldsymbol{\theta}\_{b,r},\dot{\boldsymbol{\theta}}\_{m})
\end{bmatrix} + \begin{bmatrix}
\boldsymbol{\tau}\_{m,\boldsymbol{\xi}}(\boldsymbol{\theta}\_{m}) \\
\boldsymbol{\tau}\_{b,\boldsymbol{\xi}}(\boldsymbol{\theta}\_{b})
\end{bmatrix},
\end{split} \tag{11}
$$

where the inertia is reformulated as

$$\begin{aligned} I\_m &= I\_1 + m\_2 l^2\\ I\_b &= I\_2\\ I\_c(\theta\_{b,r}) &= m\_2 l\_{2c} l \cos \theta\_{b,r} = C\_I \cos \theta\_{b,r} \end{aligned} \tag{12}$$

Coriolis force and gravity effect in the biarticular coordinate are reorganized as follows:

$$\begin{aligned} \tau\_{m.c}(\theta\_{b.r}, \dot{\theta}\_{b}) &= -m\_{2}l\_{2c}l\sin\theta\_{b.r}\dot{\theta}\_{b}^{2} = -C\_{I}\sin\theta\_{b.r}\dot{\theta}\_{b}^{2}, \\ \tau\_{b.c}(\theta\_{b.r}, \dot{\theta}\_{m}) &= m\_{2}l\_{2c}l\sin\theta\_{b.r}\dot{\theta}\_{m}^{2} = C\_{I}\sin\theta\_{b.r}\dot{\theta}\_{m}^{2}. \\ \text{And} \\ \tau\_{m.g}(\theta\_{m}) &= g(m\_{1}l\_{1c} + m\_{2}l\_{1})\cos\theta\_{m} = G\_{m}\cos\theta\_{m}, \\ \tau\_{b.g}(\theta\_{b}) &= g m\_{2}l\_{2c}\cos\theta\_{b} = \frac{g}{l}C\_{I}\cos\theta\_{b} = G\_{b}\cos\theta\_{b}. \end{aligned} \quad \text{(13)}$$

Notice that Coriolis torques τm.c(θb.<sup>r</sup> , θ˙m), τb.<sup>c</sup> (θb.<sup>r</sup> , θ˙ b ) and gravity effects τm.<sup>g</sup> (θm), τb.<sup>g</sup> (θ<sup>b</sup> ) consist of sole physical values. Namely, Coriolis torque τm.<sup>c</sup> is proportional only to θ˙<sup>2</sup> b , and τb.<sup>c</sup> is also proportional only to θ˙<sup>2</sup> <sup>m</sup>. Moreover, the gravity terms τm.<sup>g</sup> and τb.<sup>g</sup> are also proportional to cos θ<sup>m</sup> and cos θ<sup>b</sup> , respectively.

Furthermore, the common coupling coefficient C<sup>I</sup> appears in many equations including Coriolis torque in (13) and the gravity effect G<sup>b</sup> in (14), which are basically constant coefficients. In other words, the coupling coefficient and the gravity effects can be identified in a simple way with these biarticular coordinates.

#### 2.3. Bio-inspired Leg Interaction Control to Realize SLIP Dynamics Using SEA-Driven Robot Leg

This section introduces the control algorithm to realize human locomotive characteristics by using SLIP dynamics. As shown in **Figure 11**, the section includes four subsections as follows:


• **Section 2.3.4:** The joint-space controllers are designed to improve the performance of hybrid control for pure SLIP behavior of the developed biarticular robot leg.

#### 2.3.1. Introduction of SLIP Model Based on Characteristic of Human Locomotion

Due to the recent improvements in the anthropometric techniques, human motions can be measured and analyzed in a theoretical way more than ever. One of the most simplified and well-formulated approaches to deriving the dynamic model of human walking and running focuses on the spring-like interaction of the leg with the ground during the stance phase and the free motion of the leg during the swing phase. The model that is developed to describe this feature is the Spring Loaded Inverted Pendulum model (Blickhan, 1989). Since the SLIP model simplifies the description and analysis of human locomotion, it is becoming a powerful template to realize humanlike locomotion using various types of robots.

As described in **Figure 11A**, the main requirements to realize stable and periodic running of a robot are the realization of spring dynamics and attack angle control. These correspond to the SLIP model-based locomotion as 2-fold: a pure massspring behavior during the ground interaction or stance phase is rendered, and the correct attack angle with regards to the momentum of the mass at the landing moment or incidence angle of the leg is required.

One successful engineering approach to achieve the SLIP model-based running of a robot is a combination of a linear actuator for the interaction and a rotary actuator for the leg swing motion (Raibert, 1986). In this approach, the robot can realize SLIP model-based running even if the configuration of the robot is far from a human-like design. The robot is not developed to mimic the human muscular-skeletal configuration but to realize SLIP dynamics.

Articulated robot legs, such as the proposed robot leg, are realized by using the scientific approach to meet humanlike locomotion specification. However, the conventional approach suffers from a technical bottleneck since the operational coordinate system of articulated robots differs from the operational space of SLIP dynamics, even though the configuration of the robot leg is derived from a bio-inspired design. Therefore, it is challenging to implement SLIP dynamics of the articulated robot directly.

The novel point of this paper is the transformation of the operational coordinate to develop a complementary coordinate for SLIP motion. In other words, the operational coordinate system of the articulated leg has to be transformed to include the interactive operational and swing operational axes for the SLIP dynamics. In this perspective, the new Hybrid Control (HC) described in **Figure 11B** is designed in the Rotating Workspace (RW) described in **Figure 11C** to match the articulated robot with the operational space of SLIP dynamics. The overall control structure to realize SLIP dynamics is described in **Figure 12**. The proposed robust Rotating Workspace Hybrid Control (RWHC) algorithm consists of three layers, which are HC, Rotating Workspace transformation and joint-level controllers, and is shown as a dashed square in **Figure 12**. In the following sections, the details of these control parts are designed and analyzed.

#### 2.3.2. Proposed Approach to Realize SLIP Dynamics: Hybrid Control in Rotating Workspace

This subsection introduces the HC method, which is a control method to realize SLIP dynamics. HC refers to the simultaneous control of θ- and r-directions as shown in **Figure 11B**. These

directions are the operation directions in RW and are required for realizing attack angle control and impedance control of the spring dynamics of SLIP in **Figure 11A**. As shown in **Figure 11B**, both the position controller with respect to θ-direction for the attack angle control and impedance control with respect to r-direction for the spring dynamics are implemented to accomplish HC in RW.

A robust position controller is chosen to overcome the influences of inherent coupling dynamics of the robot leg and external disturbances and to achieve precise attack angle control. For this requirement, the DOB-based two-DoF controller is adopted once more, but it uses a different dynamic model with the case of SEA torque control in section 2.1.4. The basic concept of this control approach has been proposed in Oh and Kong (2014).

The nominal inverse model of DOB is obtained based on a dynamic relationship between the θ-directional motion and the θ-directional force of the robot leg in the RW. In the viewpoint of the implementation, the statics and kinematics are also obtained to transform from the measured joint angles θ<sup>m</sup> and θ<sup>b</sup> to θ-directional motion and to generate the appropriate torques through the joint actuators. The coordinate transformation (kinematics) is required to project the reference position from the attack angle to the joint position.

On the other hand, the realization of the spring dynamics requires impedance control with respect to r-direction in the RW. The impedance control method enables rendering of desired dynamic behavior between the plant and the environment, e.g., the interaction of a second-order spring-mass system with the ground. There are two typical ways to render the mechanical impedance of the robot: position-based admittance control and force-based impedance control. The former needs a positioncontrolled robot system and end-effector force sensor, and the latter requires a force-controlled actuator system and joint position sensors. Since the proposed robot leg does not have an end-effector force sensor but includes force-controlled SEAs with measurable joint positions, the force-based impedance control is utilized to realize spring dynamics of the SLIP model with respect to r-direction in the RW. Moreover, in this the force-based impedance control, the inherent coupled dynamic behavior also appears on the closed-loop system in the RW. Therefore, the dynamics of the robot leg in the RW is analyzed for the proper rendering of the desired spring dynamics.

The r-directional impedance control law consists of the r-directional position displacement as a feedback signal and the reference impedance model. The r-direction position displacement can be calculated by using the kinematic relationship between the biarticular joint space angles and the r-directional motion, and the reference impedance model is defined based on the desired spring model of SLIP dynamics. Furthermore, the r-direction position displacement is multiplied by the reference impedance model to calculate the required r-directional force. The statics of the robot leg is required to generate the appropriate torques in the joint-space for rendering the reference impedance.

In summary, for the implementation of the proposed RWHC in the low-level joint control, workspace coordination including the derivations of kinematic and statics—is required. Furthermore, the dynamics of the proposed robot leg in the RW is necessary for realization of the DOB-based position control and analysis of the behavior of the impedance-controlled system.

#### 2.3.3. Analysis of the Developed Robot Leg in Rotating Workspace for Hybrid Control

A novel operational space that is suitable for the coordination of the robot leg and the realization of spring dynamics and control of attack angle is needed to implement SLIP dynamics. The coordination that can perfectly match the biarticular joint space and the demands of SLIP dynamics is RW transformation, unlike the conventional workspace dynamics and its control which are designed in the Cartesian coordinate system.

In this section, the Jacobian, the statics, kinematics and the dynamics are defined in the RW, which conforms with the implementation of SLIP dynamics of the biarticular robot leg as follows:


The RW is first introduced by Oh and Kong (2014). The paper provides the whole procedure for obtaining the transformation mathematics in detail and briefly shows the procedure for better understanding the controller design based on a RW formulation.

The kinematics between RW and biarticular joint-space is represented by a coordinate system that is rotated by the angle θ as shown in **Figure 11B**. The angle θ is the relative angle between the reference coordinate system and the robot endpoint position r, which are given by

$$r = \sqrt{P\_{\chi}^2 + P\_{\mathcal{Y}}^2} = 2l \cos \frac{1}{2} (\theta\_b - \theta\_m),$$

$$\theta = \tan^{-1} \frac{P\_{\mathcal{Y}}}{P\_{\chi}} = \frac{1}{2} (\theta\_b + \theta\_m) \tag{15}$$

The physical representations of the coordinates in the RW can be exactly matched with the SLIP motion. The r-directional movement of the robot end effector corresponds to ground interactive motion and θ-directional movement corresponds to leg swing motion, i.e., determination of the attack angle of the SLIP dynamics.

The Jacobian is obtained by considering the end effector velocities in the RW, which is given by

$$
\dot{\boldsymbol{x}}\_{..}^{R} = \dot{\boldsymbol{r}} \tag{16}
$$

$$
\dot{\jmath}^R = r \dot{\theta}\_r \tag{17}
$$

where x˙ <sup>R</sup> denotes <sup>r</sup>-directional velocity and <sup>y</sup>˙ R represents θdirectional velocity as shown in **Figure 11C**. The velocities can be described also by multiplication of the rotation matrix and biarticular Jacobian matrix **J**<sup>b</sup> , and this yields the RW Jacobian **J**<sup>R</sup> as follows:

$$\begin{aligned} \begin{bmatrix} \dot{\boldsymbol{x}}^R \\ \dot{\boldsymbol{y}}^R \end{bmatrix} &= \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \boldsymbol{J}\_b \begin{bmatrix} \dot{\theta}\_m \\ \dot{\theta}\_b \end{bmatrix} \\ &= \boldsymbol{l} \begin{bmatrix} \sin\frac{\theta\_{br}}{2} & -\sin\frac{\theta\_{br}}{2} \\ \cos\frac{\theta\_{br}}{2} & \cos\frac{\theta\_{br}}{2} \end{bmatrix} \begin{bmatrix} \dot{\theta}\_m \\ \dot{\theta}\_b \end{bmatrix} = \boldsymbol{J}\_R \begin{bmatrix} \dot{\theta}\_m \\ \dot{\theta}\_b \end{bmatrix} \end{aligned} (18)$$

By using the obtained RW Jacobian **J**R, the static force relationship between the biarticular joint space torques and the RW forces can be described as follows:

$$\begin{bmatrix} \tau\_m\\ \tau\_b \end{bmatrix} = J\_R^T \begin{bmatrix} F\_\chi^R\\ F\_\mathcal{V}^R \end{bmatrix} = l \begin{bmatrix} \sin\frac{\theta\_{hr}}{2} & \cos\frac{\theta\_{hr}}{2}\\ -\sin\frac{\theta\_{hr}}{2} & \cos\frac{\theta\_{hr}}{2} \end{bmatrix} \begin{bmatrix} F\_\chi^R\\ F\_\mathcal{V}^R \end{bmatrix} \tag{19}$$

where F R x and F R y indicate r-directional and θ-directional forces. The derived kinematics in Equation (15) and statics in Equation (19) enable the RW coordinate transformation in the middle layer of **Figure 12** to realize RWHC. Using these, the leg dynamics in the RW is also derived for the analysis of the coupling behavior with inherent dynamic characteristics.

In order to obtain the dynamics of the robot leg in the RW, the coordinate description of accelerations x¨ R and y¨ R is obtained. The acceleration relationship of two operational spaces can be derived by time differentiation and rearrangement (from **J**<sup>R</sup> to **J** −1 R ) of Equation (18) as follows.

$$
\begin{bmatrix} \stackrel{\vec{\partial}\_m}{\ddot{\partial}\_b} \end{bmatrix} = \mathbf{J}\_R^{-1} \begin{bmatrix} \ddot{\mathbf{x}}^R \\ \ddot{\mathbf{y}}^R \end{bmatrix} - \mathbf{J}\_R^{-1} \mathbf{j}\_R \begin{bmatrix} \dot{\partial}\_m \\ \dot{\partial}\_b \end{bmatrix} \tag{20}
$$

where ˙**J**<sup>R</sup> denotes a time differentiation of the RW Jacobian. RW dynamics of biarticular actuator mechanism can be derived by using the inverse of Equation (19) and the Equation (20) as follows:

$$\begin{split} \boldsymbol{J}\_{R}^{T} \begin{bmatrix} F\_{\mathbf{x}}^{R} \\ F\_{\mathbf{y}}^{R} \end{bmatrix} &= \begin{bmatrix} \boldsymbol{\tau}\_{m} \\ \boldsymbol{\tau}\_{b} \end{bmatrix} = \begin{bmatrix} \boldsymbol{I}\_{m} & \boldsymbol{I}\_{\mathcal{L}}(\boldsymbol{\theta}\_{b,r}) \\ \boldsymbol{I}\_{\mathcal{L}}(\boldsymbol{\theta}\_{b,r}) & \boldsymbol{I}\_{\mathcal{b}} \end{bmatrix} \begin{bmatrix} \boldsymbol{\vartheta}\_{m} \\ \ddot{\boldsymbol{\vartheta}}\_{b} \end{bmatrix} \\ &+ \begin{bmatrix} \boldsymbol{\tau}\_{m,\mathcal{L}}(\boldsymbol{\theta}\_{b,r},\boldsymbol{\vartheta}\_{b}) + \boldsymbol{\tau}\_{m,\mathcal{S}}(\boldsymbol{\theta}\_{m}) \\ \boldsymbol{\tau}\_{b,\mathcal{L}}(\boldsymbol{\theta}\_{b,r},\dot{\boldsymbol{\phi}}\_{m}) + \boldsymbol{\tau}\_{b,\mathcal{S}}(\boldsymbol{\theta}\_{b}) \end{bmatrix}, \\ \boldsymbol{I}\_{\mathcal{T}}^{R} \begin{bmatrix} F\_{\mathbf{x}}^{R} \\ F\_{\mathbf{y}}^{R} \end{bmatrix} &= (\boldsymbol{\mathcal{U}}\_{R}^{T})^{-1} \begin{bmatrix} \boldsymbol{I}\_{m} & \boldsymbol{I}\_{\mathcal{L}}(\boldsymbol{\theta}\_{b,r}) \\ \boldsymbol{I}\_{\mathcal{L}}(\boldsymbol{\theta}\_{b,r}) & \boldsymbol{I}\_{\mathcal{b}} \end{bmatrix} \boldsymbol{I}\_{\mathcal{R}}^{-1} \left( \begin{bmatrix} \ddot{\boldsymbol{\mathcal{X}}}^{R} \\ \ddot{\boldsymbol{\mathcal{Y}}}^{R} \end{bmatrix} - \boldsymbol{\mathcal{Y}}\_{R} \begin{bmatrix} \boldsymbol{\vartheta}\_{m} \\ \dot{\boldsymbol{\vartheta}}\_{b} \end{bmatrix} \right). \end{split}$$

$$+\left(\boldsymbol{J}\_{R}^{T}\right)^{-1}\left[\begin{array}{c}\boldsymbol{\tau}\_{m.c}(\boldsymbol{\theta}\_{b.r},\dot{\boldsymbol{\theta}}\_{b})+\boldsymbol{\tau}\_{m.g}(\boldsymbol{\theta}\_{m})\\\boldsymbol{\tau}\_{b.c}(\boldsymbol{\theta}\_{b.r},\dot{\boldsymbol{\theta}}\_{m})+\boldsymbol{\tau}\_{b.g}(\boldsymbol{\theta}\_{b})\end{array}\right]\tag{21}$$

Based on Equation (21), it can be concluded that the RW coordinate corresponds kinematically to SLIP motion. However, the dynamics is complicatedly coupled in the RW coordinate. Because of these dynamic characteristics (dynamics coupling and gravity effect), the motions in the RW, x R and y R , which are to be controlled, adversely affect each other. Therefore, the performance of HC is deteriorated as shown in **Figure 13**. The next section shows how to remove the dynamic coupling effect and improve its performance effectively.

#### 2.3.4. Joint Space Dynamic Decoupling to Improve Performance of Robust Rotating Workspace HC

As described above, the attack angle control for the SLIP dynamics can be implemented using the high gain feedback without consideration of dynamics behavior in the RW. However, for the realization of the spring behavior of SLIP dynamics, it is not straightforward to render the desired dynamic behavior without considering the influence of mechanical characteristics such as inertia coupling and gravity effect. In this section, systematic dynamic decoupling methods in the joint space are designed to enhance the performance of the proposed RWHC based on the structural advantage of the biarticular robot leg.

A study on dynamics modification such as dynamic decoupling for a general multi-link robot has been first proposed in Khatib (1987). Khatib has developed the 'Operational Space Nonlinear Dynamic Decoupling' method to directly modify the robot's dynamic characteristics based on an identification of the coupling and gravity terms appearing in the workspace without using the structural features. The method requires correct identifications of complicated inertia coupling, Coriolis and gravity terms; hence, it is not effective to directly apply Khatib's approach to the biarticular robot leg.

In order to decouple the dynamics of the leg effectively, a systematic approach is required based on a structural advantage of the biarticular robot leg. In section 2.2.2, the structural advantage was found to be that the robot legs on the biarticular coordinate system are configured to be identified easily. Therefore, the proposed method utilizes joint level identification and decoupling in the biarticular coordinate to simplify the complex dynamics in Equation (21) (detailed in section 2.3.4.1). Furthermore, inertia control is applied to the simplified joint space dynamics to achieve better HC performance in the RW (detailed in section 2.3.4.2).

#### **2.3.4.1. Compensation of Inherent Dynamics - Inertia Decoupling and Gravity Compensation**

As mentioned before, because of the torque control, measurement capabilities of the SEA and biarticular actuator mechanism, the identification and compensation of the gravity, Coriolis and inertia torque in the joint space are simpler than in RW. Hence, the compensation of inherent dynamic coupling is performed in the joint space.

The controller for inertial and gravity compensation is designed based on the joint-space dynamics derived in Equations (12)–(14). Based on these equations, required compensation torques τ c <sup>m</sup> and τ c b are obtained as follows:

$$\begin{aligned} \tau\_m^\varepsilon &= \mathcal{C}\_I \cos \theta\_{b.r} \ddot{\theta}\_b - \mathcal{C}\_I \sin \theta\_{b.r} \dot{\theta}\_b^2 + \mathcal{G}\_m \cos \theta\_m \\ \tau\_b^\varepsilon &= \mathcal{C}\_I \cos \theta\_{b.r} \ddot{\theta}\_m + \mathcal{C}\_I \sin \theta\_{b.r} \dot{\theta}\_m^2 + \mathcal{G}\_b \cos \theta\_b \end{aligned} \tag{22}$$

θ<sup>m</sup> and θ<sup>b</sup> are measured by position sensors of each joint SEA and θb.<sup>r</sup> is calculated by Equation (4). θ˙m, θ˙ b , θ¨<sup>m</sup> and θ¨ <sup>b</sup> can be calculated by numerical differentiation of θ<sup>m</sup> and θ<sup>b</sup> . Three coefficients, Gm, G<sup>b</sup> and C<sup>I</sup> , are required to implement this compensation; however, the gravity term of biarticular-side joint Gb includes the common coupling coefficient C<sup>I</sup> as derived in Equation (14). Therefore, simple identification of G<sup>m</sup> and G<sup>b</sup> enable this compensation. Note that the identification method for G<sup>m</sup> and G<sup>b</sup> is described in section 3.3.

This compensation results in a fully decoupled joint-space dynamics in the biarticular coordinate system as follows:

$$
\begin{bmatrix} \boldsymbol{\tau}\_{m} - \boldsymbol{\tau}\_{m}^{c} \\ \boldsymbol{\tau}\_{b} - \boldsymbol{\tau}\_{b}^{c} \end{bmatrix} = \begin{bmatrix} I\_{m} & 0 \\ 0 & I\_{b} \end{bmatrix} \begin{bmatrix} \boldsymbol{\partial}\_{m} \\ \boldsymbol{\ddot{\partial}}\_{b} \end{bmatrix} = \mathbf{I}\_{bi} \begin{bmatrix} \boldsymbol{\partial}\_{m} \\ \boldsymbol{\ddot{\partial}}\_{b} \end{bmatrix},\tag{23}
$$

where **I**bi is the diagonal inertia matrix. τ<sup>m</sup> − τ c <sup>m</sup> and τ<sup>b</sup> − τ c b are the new torque references for the closed loop system with the proposed compensation in Equation (22). Because of the compensation, the dynamics of the joint-space are decoupled and simplified.

In order to analyze the effect of the compensation in RW, the modified dynamics is obtained in a similar way as in Equation (21).

$$\begin{aligned} \begin{bmatrix} F\_{\vec{\chi}}^{R\*} \\ F\_{\vec{\chi}}^{R\*} \end{bmatrix} &= (\mathbf{J}\_R^T)^{-1} \mathbf{I}\_{bi} \mathbf{J}\_R^{-1} \left( \begin{bmatrix} \ddot{\vec{\chi}}^R \\ \ddot{\boldsymbol{\chi}}^R \end{bmatrix} - \dot{\mathbf{J}}\_R \begin{bmatrix} \dot{\theta}\_m \\ \dot{\theta}\_b \end{bmatrix} \right) \\ &= \begin{bmatrix} \frac{I\_m}{l - l\cos\theta\_{b,r}} & 0 \\ 0 & \frac{I\_b}{l + l\cos\theta\_{b,r}} \end{bmatrix} \left( \begin{bmatrix} \ddot{\vec{\chi}}^R \\ \ddot{\boldsymbol{\chi}}^R \end{bmatrix} - \dot{\mathbf{J}}\_R \begin{bmatrix} \dot{\theta}\_m \\ \dot{\theta}\_b \end{bmatrix} \right) \end{aligned} (24)$$

where F R∗ x and F R∗ y are the new force reference regarding the RW coordinates with the proposed compensation in Equation (22). (**J** T R ) −1 **I**bi**J** −1 R indicates operational space inertia (Khatib, 1987).

Even though the inertial compensation is achieved in Equation (24), the diagonal components of the inertia matrix vary with θb.<sup>r</sup> . These variations deteriorate the performance of RWHC. In particular, the variation of the first inertia term Im l−l cos θb.<sup>r</sup> , which is the projected inertia with respect to rdirection, affects the realization of spring behavior of SLIP dynamics. In order to solve this phenomenon, the first inertia term should be controlled as a fixed inertia by adding a separate inertia control.

#### **2.3.4.2. Inertia Control for Robust Rotating Workspace Hybrid Control**

In this subsection, the additional inertia control is designed to fix r-directional inertia for better performance of RWHC.

In order to fix the first term <sup>I</sup><sup>m</sup> l−l cos θb.<sup>r</sup> in an operational space inertia matrix, I<sup>m</sup> should be modulated to include the term l − l cos θb.<sup>r</sup> . Therefore, the goal of inertial control can be defined as modulating the joint-space inertia I<sup>m</sup> as M<sup>R</sup> d (l − l cos θb.<sup>r</sup> ), where M<sup>R</sup> is the desired inertia (mass) of the SLIP model.

d The result of the compensated dynamic system in the biarticular actuator coordinates is linear and is simple enough as shown in Equation (23). Hence, the inertia control can be easily implemented in the biarticular joint-space. The required joint torques τ m <sup>m</sup> and τ m b to modulate the inertia I<sup>m</sup> as M<sup>R</sup> d (l−l cos θb.<sup>r</sup> ) are given by

$$
\begin{bmatrix} \tau\_m^m \\ \tau\_b^m \end{bmatrix} = \begin{bmatrix} M\_d^R(l - l\cos\theta\_{b,r}) - I\_m & 0 \\ 0 & M\_d^R(l - l\cos\theta\_{b,r}) - I\_b \end{bmatrix} \begin{bmatrix} \ddot{\theta}\_m \\ \ddot{\theta}\_b \end{bmatrix} \tag{25}
$$

Note that the second term of inertia control can be designed in another way; however, this paper chooses M<sup>R</sup> d (l − l cos θb.<sup>r</sup> ) − I<sup>b</sup> for simplicity.

By applying the inertia control to fix the r-directional inertia, the dynamics in Equation (24) can be reformulated in the RW by utilizing the RW Jacobian and statics in Equations (18) and (19) as follows:

$$\begin{split} \begin{bmatrix} \boldsymbol{F}\_{\ddot{\boldsymbol{x}}}^{\boldsymbol{R}\*} \\ \boldsymbol{F}\_{\ddot{\boldsymbol{y}}}^{\ddot{\boldsymbol{\alpha}}\*} \end{bmatrix} &= \begin{bmatrix} \boldsymbol{M}\_{d}^{\boldsymbol{R}} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{M}\_{d}^{\boldsymbol{R}} \tan^{2} \frac{\boldsymbol{\theta}\_{b,r}}{2} \end{bmatrix} \begin{bmatrix} \ddot{\boldsymbol{x}}^{\boldsymbol{R}} \\ \ddot{\boldsymbol{\nu}}^{\boldsymbol{R}} \end{bmatrix} \\ &+ \frac{1}{2} \begin{bmatrix} \boldsymbol{M}\_{d}^{\boldsymbol{R}} (\cos \boldsymbol{\theta}\_{b,r} - 1) \dot{\boldsymbol{\phi}}\_{b,r} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{M}\_{d}^{\boldsymbol{R}} (1 - \cos \boldsymbol{\theta}\_{b,r}) \dot{\boldsymbol{\phi}}\_{b,r} \end{bmatrix} \begin{bmatrix} \dot{\boldsymbol{x}}^{\boldsymbol{R}} \\ \dot{\boldsymbol{\nu}}^{\boldsymbol{R}} \end{bmatrix} \end{split} \tag{26}$$

Even though the closed-loop dynamic behavior in Equation (26) with respect to r-direction contains the damping-like term 1 2M<sup>R</sup> d (cos θb.r−1)θ˙ b.r , the inertia is fixed and decoupled as desired inertia M<sup>R</sup> d . For the perfection of spring behavior of SLIP model, the r-directional impedance model of HC can be redesigned to compensate the remaining damping-like effect as

$$F\_{\mathbf{x}}^{R\*} = -K\_d^R \mathbf{x}^R - \frac{1}{2} M\_d^R (\cos \theta\_{b,r} - 1) \phi\_{b,r} \dot{\mathbf{x}}^R \tag{27}$$

where desired spring constant K R d and mass M<sup>R</sup> d are given by the desired SLIP model. This redesigned RWHC with joint level controls including decoupling control, gravity compensation and inertia modulation renders pure mass-spring behavior with respect to r-direction.

Interestingly, the θ-directional inertia in Equation (26) is also simplified as M<sup>R</sup> d tan<sup>2</sup> <sup>θ</sup>b.<sup>r</sup> 2 . Considering that the leg posture is mainly placed in a common position, i.e., θb.<sup>r</sup> ≈ π 2 , the term tan<sup>2</sup> <sup>θ</sup>b.<sup>r</sup> 2 in the inertia can be regarded as 1. Therefore, the nominal inverse model in the DOB in **Figure 13** can be defined as M<sup>R</sup> d s 2 . Even though the r-directional motion driven by impedance control of HC causes variation of θ-directional dynamic behavior (because of θb.<sup>r</sup> ), dynamics variation and unmodeled terms can be actively rejected as a disturbance by the ability of DOB. Therefore, the developed SEA-driven robot leg with the proposed RWHC, including two-axes controllers, realizes SLIP dynamics robustly.

The overall structure of the Robust HC in the RW for biarticular robot leg is described in **Figure 14**.

#### 3. RESULTS

#### 3.1. Experimental Realization of Robust Rotating Workspace Hybrid Control

Experiments are conducted using the developed SEA-driven robot leg to verify the following points:


The experiments are conducted by way of the following four steps to systematically verify the contribution points above.

**Section 3.3** System identification using motor position control of SEA

**Section 3.4** Verification of Statics using gravity compensation and Rotating Workspace force control of the robot leg

**Section 3.5** Verification of inertia decoupling control using decoupling control in the biarticular coordinate

**Section 3.6** Performance verification of Robust Rotating Workspace Hybrid Control investigating the effectiveness of inertia modulation in the biarticular coordinate, impedance control and position control in the Rotating Workspace

#### 3.2. Experimental Setup

The experiments for the verifications are set in two ways: the dynamic experiment and static experiment. The experimental setups are shown in **Figure 15**.

The setup for the dynamic experiment in the left-hand side of **Figure 15** is utilized for the system identification and performance verification of proposed HC with joint level control. Then, the robot leg moves according to the control algorithms. Note that the system identification in section 3.3 uses this experimental setup for a variety of leg postures. However, the actual experimental behavior can be more properly described as 'quasi-static' than 'dynamic' (detailed experimental protocol is given in section 3.3).

The setup for the static experiment in the right-hand side of **Figure 15** is utilized for verification of the force transmissibility from SEAs to the end effector through the biarticular mechanism and Rotating Workspace transformation. For this purpose, the endpoint of the leg is connected to a load cell through a universal joint, which distresses any unexpected kinematic constraint in the setup. A single axis (vertical direction) load cell is chosen to measure the vertical force exerted from the endpoint. The load cell (CAS, SBA-50L) used in this setup has a resolution that can measure up to 0.02 N of force with low-pass filtering (100 Hz). In this setup, the body position and the load cell position can be adjusted with respect to the vertical guide and the horizontal guide to change the leg posture.

## 3.3. System Identification of SEA-Driven Robot Leg With Biarticular Actuator Coordination

As explained in section 2.2.2, the coefficients G<sup>m</sup> and G<sup>b</sup> of the gravity terms in (Equation 14) are characteristic parameters, which also include the inertia coupling coefficient C<sup>I</sup> . The identification of the gravity coefficient is a vital identification process for the whole system. Moreover, the SEA in the proposed robotic leg enables measurement of the interacting torques and gravitational torques by the link weights for the system identification. Taking advantage of these two points, the system identification of the robotic leg is proceeded as follows (in **Figure 16**); the motor-side angle (θ<sup>j</sup> in **Figure 7**) is controlled to be constant, and the spring torque (τ<sup>s</sup> in **Figure 7**) is measured after the system becomes steady state. Then, the load angle θ<sup>l</sup> in **Figure 7** is utilized as the mono and biarticular joint angle values of θ<sup>m</sup> and θ<sup>b</sup> , and the measured spring torque τ<sup>s</sup> is utilized as the mono and biarticular gravitational torques τm.<sup>g</sup> and τb.<sup>g</sup> . This measurement process has been conducted with several different angles of θjs, and **Figure 17** shows the results of the measured torques τ<sup>s</sup> and angles θ<sup>l</sup> . Two results are plotted as the measurement results using two SEAs: one for monoarticular joint and the other for the biarticular joint.

The plots show the relationship between θ<sup>l</sup> (= θm, and = θ<sup>b</sup> ) and τs(= τm.<sup>g</sup> , and = τb.<sup>g</sup> ) and exhibit a cosine function pattern,

FIGURE 17 | Gravity coefficient identification result. (A): relationship between θ*m* and τ*m*.*g*, and (B): Relationship between θ*b* and τ*b*.*g*.

which corresponds to the derived relationship in Equation (14). By fitting the curves with appropriate scale selection, the gravity coefficients Gm and Gb can be identified. In this case, the coefficient of monoarticular joint gravity term G<sup>m</sup> is identified as 3.15 Nm, and the biarticular joint coefficient G<sup>b</sup> = g l CI is identified as 0.33 Nm. Therefore, the coupling inertia term C<sup>I</sup> is calculated as 0.01 <sup>N</sup> m/s<sup>2</sup> based on the parameter values g = 9.81m/s<sup>2</sup> and l = 0.3 m.

# 3.4. Statics Verification of the Proposed Robot Leg With Biarticular Actuator Coordination

The statics for the biarticular actuator torque coordinate and the force in the Rotating Workspace given in Equation (19) are verified through experiments. The joint torques τ<sup>m</sup> and τ<sup>b</sup> are provided by SEA torque control, and the end effector force is measured by the load cell. The relationship between the provided joint torques and the measured end effector force is compared to verify Equation (19). Various poses of the robot leg are set as shown in (a–g) of **Figure 18**. Notice that the gravitational force is compensated in the following experiments utilizing the identified Gˆ <sup>m</sup> and Gˆ b .

Statics is verified through the force tracking at the end point where step-wise force references are given in the Rotating Workspace, and these references are converted to the required torque references for SEAs. The required torques are generated by SEA torque control, and the force generated at the end-point by these SEA torques are measured by the load cell and compared with the initial references.

**Figure 18** shows the results of the force tracking and thus the statics verification. Various force references are given changing directions as shown in the right graphs of **Figure 18**. The blacksolid lines in the right plots are the reference signals, and the dash-colored lines indicate the sensor measured forces.

The rise time, which measures the rate of the rise from 10 to 90% of the steady-state response, is calculated and averaged based on the measured data sets, and the result is 6.1 ms. The averaged root mean square error (RMSE) is also calculated, and the result is about 0.15 N. Note that the responses appear negligible steadystate errors in most cases; however, case (f) indicates about 1 N of steady state error, and this may be caused by misalignment

between load cell and leg in the experimental setup with respect to critical posture of the leg.

The results imply that the force of the endpoint in the Rotating Workspace can be controlled using statics in Equation (19) and the high-performance torque control of SEAs. In particular, the results show that the torque control of the SEA can guarantee the stability and fast response time for the step-wise reference with a stiff environment.

#### 3.5. Performance of Decoupling Control in the Biarticular Coordinate

Dynamics of the robotic leg with the biarticular actuator coordination can be decoupled as Equation (23) by applying decoupling control (Equation 22). The performance of this decoupling control is examined in this experiment.

The experimental setup and protocol are given as follows: a sinusoidal torque reference is provided to only one SEA to generate only one torque input τ<sup>m</sup> or τ<sup>b</sup> , and the dynamic behavior of all joint angles θ<sup>m</sup> and θ<sup>b</sup> are examined with and without decoupling control (Equation 22).

A sinusoidal torque with an amplitude of 5 Nm and a frequency of 3 Hz is generated by a SEA of the specific joint (τ<sup>m</sup> or τ<sup>b</sup> ) as an excitation. In addition to this excitation torque, additional decoupling control torques designed as in Equation (22) are added to both SEAs, and all the joint angular velocities θ˙<sup>m</sup> and θ˙ <sup>b</sup> are measured under this condition. After several seconds (1.5 s in the following experiments) of excitation, the decoupling control torques are turned off, and the change of the joint angular velocities are measured and compared with the behavior of the decoupling control.

**Figure 19** shows the results of the decoupling control. The upper graphs are the angular velocities θ˙<sup>m</sup> of the monoarticular joint, and the bottom graph is the angular velocity θ˙ <sup>b</sup> of the biarticular joint. The left result in **Figure 19A** is when the excitation signal is given to the monoarticular SEA τm, and the right result in **Figure 19B** is when the excitation signal is given to τ<sup>b</sup> .

The comparison of the unexcited joint velocities before and after 1.5 s shows that unexcited joints in the both cases are not affected by the motion of the other joint. This result verifies that the proposed decoupling control can successfully decouple the effect from the other joint torque.

### 3.6. Robustness and Tracking Performance Verification of Rotating Workspace Hybrid Control for Realization of SLIP

The performance of the proposed Rotating Workspace Hybrid Control to realize SLIP template-based movement is verified through experiments. To assess the performance in detail, the following points are investigated.


For this purpose, two types of experiments are designed.

Firstly, the leg is dragged and released in the radial direction to examine the controlled impedance against external perturbation. The results of dynamic behavior of the leg are compared with the theoretic solution that is analytically calculated based on the reference impedance model. The several types of reference impedance models implemented on the leg vary with the stiffness and damping values, which are 100, 150, and 250 N/m (low, mid and high stiffness with mid damping), and 5, 15, 25 Ns/m (low, mid, and high damping with mid stiffness). This varying stiffness and damping setting enables verification of the antiinertia variation effect of the inner-loop inertia modulation control in the biarticular coordinate. The desired inertia M<sup>R</sup> d in Rotating Workspace is set to 1.5 Kg, the length of the leg in the radial directional is set to 0.4 m, and the end point is dragged by a hand as far as 0.25 m.

FIGURE 19 | Performance verification of the proposed decoupling control. (A) excitation signal on monoarticular actuator, and (B) excitation signal on biarticular actuator.

**Figure 20** is the result of the drag-and-release test, which shows the behavior of the end point in the radial direction. The graph in **Figure 20A** shows the result with varying stiffness conditions, and the graph in **Figure 20B** shows the result with varying damping conditions. The red-, green- and bluecolored lines indicate low, mid and high stiffness/damping cases, respectively. The solid lines are the theoretical responses, and the dashed lines are the measured responses. Most of the experimental results match well with theoretical responses except the low damping case, which suffers from the nonlinear behavior after first oscillation. Even though the magnitude of the response is different from the theoretical model in this case, the natural frequencies are still well matched with the response of theoretical models.

Second, the dynamic response in the radial direction is investigated while the motion in the tangential direction is controlled. To examine the dynamic behavior of the end point, it is pushed in the radial direction, and the response in the radial direction against this pushing perturbation is measured and investigated. During this push and recovery process in the radial direction, the end point is position-controlled in the tangential direction. The reference for this motion is a sinusoidal pattern with 2 Hz frequency and 0.2 rad magnitude. The reference stiffness and damping for the radial direction impedance are set to 75 N/m and 10 Ns/m. In this experiment, the end point is pushed and recovered two times. In the first pushrecovery process, the inertia modulation and the decoupling control are turned off, and it is turned on in the second push-recovery process.

**Figure 21** is the result of this experiment. The upper graph shows the position of the end point in the radial direction, while the bottom graph shows the position in the tangential direction. The end point is pushed at 4 s at first, and it is pushed again at 13 s. Even though the endpoint recovers after the first push, tt can be verified in that the radial motion suffers from the coupling by the tangential motion up to 10.4 s.

At 10.4 s, the decoupling control and inertia modulation control are switched on, and the coupling in the radial

motion decreases dramatically. The radial motion exhibits the dynamic response designed by the radial impedance control against the second push. While the two pushing perturbations work in the radial direction, the tangential motion is not affected by the perturbation in the radial direction, which validates the robustness of the tangential position control.

# 4. DISCUSSION AND CONCLUSION

## 4.1. Discussion for Contribution Points of the Paper

The paper proposed a mechanical design analysis of SEA-driven robot leg and controller design to realize the SLIP dynamics. The discussions for the contribution of this paper are categorized as shown in the following subsections.

#### 4.1.1. Advantage of SEA-Driven Robot

The SEA, utilized in this paper, is one emerging actuator system in the robotics field. The benefits of the SEA are the torque control capability based on a spring deflection measurement. This paper has analyzed the dynamics of SEA and designed the DOB-based torque controller based on the dynamics. The robustness has been examined with regards to the variation of the load inertia. This is a significant feature when the SEA is used in multi-joint robots such as the robot leg in this paper, since the inertia varies continuously under the dynamic condition. The other advantageous feature of SEA is that the SEA can be used as an identification tool with a specific mechanical design such as the biarticular mechanism in this paper. SEA has various output states; namely, it is a multi-output system. In this paper, the motor side position control has been utilized to identify the physical parameters of the robot leg. This enables the dynamic parameter identification without any extra-measurement of link mass or center of mass, and it allows an accurate dynamics-based control including gravity compensation, inertia decoupling and inertia modulation.

#### 4.1.2. Benefits of Biarticular Mechanism and Rotating Workspace Coordination in Terms of SLIP Dynamics Realization

This paper has developed the SEA-driven articulated robotic leg with the biarticular actuation mechanism taking advantage of biarticular muscles in their design. The design has complemented the robot actuator coordination and human musculoskeletal system. In addition, the RW, which is suitable to the coordinate system of the SLIP model, has been utilized and analyzed in the operation of the biarticular actuator-coordinated robot to satisfy the requirement for human-like locomotion.

Taking advantage of biarticular actuator coordination, a systematic strategy was shown to compensate for the complex dynamic behavior of the robot leg in the Rotating Workspace within bio-inspired coordination. Based on this, the RWHC has been developed to enable the biarticular mechanism to realize SLIP behavior robustly.

#### 4.1.3. Control Performance of the Proposed Hybrid Control in Rotating Workspace

The experiments have verified the performance of the proposed Rotating Workspace Hybrid Control: impedance control performance of the robotic leg in the interactive direction and robust position tracking in the tangential direction, and robustness of radial directional impedance control during high accuracy tangential control. It has been shown through the experimental results that the developed SEA-driven robot leg with the proposed Rotating Workspace Hybrid Control including two-axes controllers could realize the SLIP dynamics robustly.

## REFERENCES

Altendorfer, R., Moore, N., Komsuoglu, H., Buehler, M., Brown, H., McMordie, D., et al. (2001). Rhex: a biologically inspired hexapod runner. Auton. Rob. 11, 207–213. doi: 10.1023/A:1012426720699

The dynamic performance depends on the body state; hence, this approach cannot be generalized for every legged robot. For example, the dynamic reaction of the body of quadrupeds and bipeds are totally different. Therefore, the theory and experiments have been analyzed and performed in a given fixed body condition. In order to extend the proposed approach, the dynamics with the unfixed body condition should be considered and compensated, even though the controller performs well in a given condition.

#### 4.2. Conclusions and Future Work

Human locomotion leads to a simple result of moving the body position; however, the dynamical operation, which is realized in the human leg, is complicated. In order to analyze human locomotion, an analytical tool that can make descriptions from the viewpoint of being outside of human motions as simple as possible is required, rather than a complex neuromuscular analysis approach that considers all the muscles' coordination from a vantage point inside the human body. Robotics, which has been developed mainly from the perspectives of engineering, now starts to turn to humans and nature. The bioinspired actuation and coordination systems and their insights obtained through the observation on the human and nature will provide a new perspective to establish fundamentals for bio-inspired robotics.

In this article, biarticular actuator coordination for designing, analyzing and controlling robotic systems has been investigated. This paper showed a serial procedure to realize the SLIP dynamics for the articulated robot. Based on the realized SLIP dynamics, the foundation for extending the robot leg to various locomotion such as running, gaiting and so on has been discussed. Such mechanical design and control technology is an underlying technology that can be applied to bipedal and quadrupedal walking robots. Therefore, the future work of this research is to utilize the proposed technique in hopping and walking motions of the robot leg and to extend the technique to a quadruped robot.

## AUTHOR CONTRIBUTIONS

CL contributed to literature writing and review, fabrication of the robot, controller design and experiments. SO developed workspace coordinates and supervised the project overall, and contributed to data analysis and writing the paper.

## ACKNOWLEDGMENTS

This work is partly supported by Korea government through the National Research Foundation (NRF-2016R1A2B4016163) and the Ministry of Trade, Industry & Energy (10080547).

Boaventura, T., Medrano-Cerda, G. A., Semini, C., Buchli, J., and Caldwell, D. G. (2013). "Stability and performance of the compliance controller of the quadruped robot hyq," in 2013 IEEE/RSJ International Conference on

Blickhan, R. (1989). The spring-mass model for running and hopping. J. Biomech. 22, 1217–1227. doi: 10.1016/0021-9290(89)90224-8

Intelligent Robots and Systems (Tokyo), 1458–1464. doi: 10.1109/IROS.2013. 6696541


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Lee and Oh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Variable Stiffness Actuator Module With Favorable Mass Distribution for a Bio-inspired Biped Robot

David Rodriguez-Cianca1,2 \*, Maarten Weckx <sup>1</sup> , Rene Jimenez-Fabian<sup>1</sup> , Diego Torricelli <sup>2</sup> , Jose Gonzalez-Vargas 2,3, M.Carmen Sanchez-Villamañan<sup>2</sup> , Massimo Sartori <sup>4</sup> , Karsten Berns <sup>5</sup> , Bram Vanderborght <sup>1</sup> , J. Luis Pons <sup>2</sup> and Dirk Lefeber <sup>1</sup>

 Robotics and Multibody Mechanics Research Group, Vrije Universiteit Brussel (VUB) and Flanders Make, Brussels, Belgium, Cajal Institute, Spanish National Research Council (CSIC), Madrid, Spain, <sup>3</sup> Ottobock GmbH, Duderstadt, Germany, Department of Biomechanical Engineering, University of Twente, Enschede, Netherlands, <sup>5</sup> Robotics Research Lab, University Kaiserslautern, Kaiserlslautern, Germany

Achieving human-like locomotion with humanoid platforms often requires the use of variable stiffness actuators (VSAs) in multi-degree-of-freedom robotic joints. VSAs possess 2 motors for the control of both stiffness and equilibrium position. Hence, they add mass and mechanical complexity to the design of humanoids. Mass distribution of the legs is an important design parameter, because it can have detrimental effects on the cost of transport. This work presents a novel VSA module, designed to be implemented in a bio-inspired humanoid robot, Binocchio, that houses all components on the same side of the actuated joint. This feature allowed to place the actuator's mass to more proximal locations with respect to the actuated joint instead of concentrating it at the joint level, creating a more favorable mass distribution in the humanoid. Besides, it also facilitated it's usage in joints with centralized multi-degree of freedom (DoF) joints instead of cascading single DoF modules. The design of the VSA module is presented, including it's integration in the multi-DoFs joints of Binocchio. Experiments validated the static characteristics of the VSA module to accurately estimate the output torque and stiffness. The dynamic responses of the driving and stiffening mechanisms are shown. Finally, experiments show the ability of the actuation system to replicate the envisioned human-like kinematic, torque and stiffness profiles for Binocchio.

Keywords: variable stiffness actuator, bio-inspired biped robot, mass distribution, muti-DoFs joints, human-like locomotion

# 1. INTRODUCTION

Creating bipedal robots that can walk stably and efficiently as humans has been an open challenge since long time in robotics research (Vukobratovic and Borov ´ ac, 2004). Traditional approaches focus on multiple degree-of-freedom (DoF) platforms controlled by classic control paradigms that ensure quasi-static stability, e.g., the Zero-Moment Point (ZMP) (Vukobratovic, ´ 1975; Vukobratovic´ and Borovac, 2004). Despite their good performance on flat terrain, most of these robots show important limitations, such as high energetic costs, slow walking motion, poor robustness on uneven terrains, and unnatural kinematic patterns (Torricelli et al., 2016). Different from this approach, the "dynamic walking" principle emerged to improve the human-like properties of bipeds, realizing natural, and efficient motion with little or no actuation

#### Edited by:

Kosta Jovanovic, School of Electrical Engineering, University of Belgrade, Serbia

#### Reviewed by:

Vincent Bonnet, Université Paris-Est Créteil Val de Marne, France Stefan S. Groothuis, University of Twente, Netherlands Veljko Potkonjak, Belgrade Metropolitan University, Serbia

#### \*Correspondence:

David Rodriguez-Cianca david.rodriguez.cianca@vub.be

Received: 23 January 2019 Accepted: 17 April 2019 Published: 17 May 2019

#### Citation:

Rodriguez-Cianca D, Weckx M, Jimenez-Fabian R, Torricelli D, Gonzalez-Vargas J, Sanchez-Villamañan MC, Sartori M, Berns K, Vanderborght B, Pons JL and Lefeber D (2019) A Variable Stiffness Actuator Module With Favorable Mass Distribution for a Bio-inspired Biped Robot. Front. Neurorobot. 13:20. doi: 10.3389/fnbot.2019.00020

(McGeer, 1990; Collins et al., 2005; Hobbelen and Wisse, 2005). These solutions, while minimizing kinematic and control complexity, show poor stability, versatility and controllability in realistic environments. Compliant actuation has been proposed to narrow the gap between these two approaches. In humans, the intrinsic compliant properties of joints and muscles are at the basis of robust, energy efficient, and versatile locomotion. Humans modulate the stiffness of the joints through the co-contraction of agonist and antagonist muscles, producing large ranges from rigid to highly compliant behaviors (Sartori et al., 2015). These mechanisms are central for adapting to a large variety of terrains (Ferris et al., 1998) and for naturally adjusting to biomechanical and energetic demands (Farley and Gonzalez, 1996).

Series Elastic Actuators (SEAs) introduce compliance to humanoid robotics, resulting in safer human-robot interaction, shock absorption, and greater energy efficiency compared to stiff actuators (Vanderborght, B. et al., 2013). However, modulation of joint stiffness can only be achieved through impedance control of the SEAs (Kim et al., 2012; Tsagarakis et al., 2013; Paine et al., 2014; Pierce and Cheng, 2014; Negrello et al., 2016). Variable Stiffness Acuators (VSAs), on the other hand, can inherently change the mechanical compliance of a humanoid's joints, having positive outcomes in terms of energy efficiency, robustness against disturbances and similarity with human motions due to their inherent compliant behavior. Previous research regarding the application of VSAs in humanoids showed modulation of walking velocity and step length in passive dynamic walkers (Huang et al., 2013), and the postural control of a biped (Hettich et al., 2011). Currently, only few examples of bipeds with VSAs can be found: Veronica (Huang et al., 2013), Lucy (Vanderborght et al., 2008), and BLUE/miniBLUE (Enoch and Vijayakumar, 2015). Several working principles for actuators with variable-stiffness capabilities have been proposed for various robotic applications. Examples of these are the Compliant Asymmetric Antagonistic Actuator (Roozing et al., 2015), vsaUT-II (Groothuis et al., 2014), AwAS-II (Jafari et al., 2012), Variable Torsion Stiffness Actuator Schuy et al. (2013), Mechanically Adjustable Compliance and Controllable Equilibrium Position Actuator (MACCEPA) (Van Ham et al., 2007), and ARES (Cestari et al., 2015). Because of the importance of multi-DoF joints in the human body, bio-inspired robotic applications often require multi-DoF actuated joints (Mizuuchi et al., 2007; Potkonjak et al., 2011; Nakanishi et al., 2012). In these applications, nonetheless, no variable stiffness capabilities were implemented. Instead, the compliance of such actuators was fixed. The research on multi-DoF actuators with variable stiffness is, therefore, still limited. Generally, cascades of single-degree of freedom actuators are used (Catalano et al., 2011). One of the few examples that follows a more integrated approach is the multi-DoF actuator with variable stiffness based on two antagonistic setups of ANLES actuators (Koganezawa et al., 2012). Another example, based on the MACCEPA concept, is proposed in Weckx et al. (2014). VSAs bring about increased mechanical complexity and weight since generally two motors are required for the independent control of both equilibrium position and stiffness. Mass distribution of the legs is an important design parameter, because it can have detrimental effects on the cost of transport. It is reported that the net metabolic cost of walking increases with more distal location of increased load mass. Hence increased load mass at the foot has greater effect on the metabolic cost than at the thigh (Browning et al., 2007; Schertzer and Riemer, 2014). Added mass to lower extremities also increases the swing leg's moment of inertia about the hip joint, resulting in higher moments at the knee and ankle compared to normal walking (Royer and Martin, 2005). This in turn leads to heavier motors in a humanoid's legs. Mass and the distribution of mass are therefore key aspects in the design of humanoids.

This work presents a novel VSA module designed to be implemented in the sagittal DoFs of the legs of Binocchio, a bio-inspired humanoid robot designed as platform for the validation of biomimetic controllers and the understanding of the neuromechanical processes of human movement, including the role of compliance during walking. The presented VSA module places both motors, the driving and the stiffening motor, on the same side of the actuated joint, previously not possible with existing concepts, and in-line with the housing link. This feature allows to place the actuator's mass to more proximal locations with respect to the actuated joint instead of concentrating it at the joint level, creating a more favorable mass distribution in the design of the humanoid's leg . Besides, it also facilitates it's usage in joints with centralized multi-DoFs joints instead of cascading single DoF modules . These innovations have been illustrated with the implementation of the VSA module in the compliant multi-DoF joints of the humanoid biped Binocchio. The requirements for the VSA module, it's overall working principle, mechanical design and first prototype are presented in section 2. Section 2.5 shows the integration of the proposed actuator in the multi-DoFs joints of Binocchio. Subsequently, the static characteristics of the VSA module, the dynamic responses of the driving and stiffness modulation mechanisms and the ability of the actuator to follow human-like kinematics, torque and stiffness profiles are experimentally validated in section 3. Discussion of the presented results and conclusions from this work are given in sections 4 and 5.

## 2. A NOVEL VSA MODULE FOR A BIO-INSPIRED HUMANOID ROBOT

#### 2.1. Requirements

**Figure 1A** shows the defined joint actuation scheme for the Binocchio biped, based on a study of the key principles of human locomotion with a special focus on the relevance of lower limb joints' compliance during walking (Torricelli et al., 2016). The actuator presented in this paper is aimed at being implemented in all the biped's sagittal joints, i.e., waist, hip, knee and ankle as, based on the previous work, variable stiffness seems to play a bigger role in these DoFs during ground-level human locomotion. We therefore used modeling and simulation approaches in combination with evidence from human studies in order to define the actuation requirements. We employed a

FIGURE 1 | Actuation requirements for Binocchio. (A) Kinematic and actuation concept, (B) Estimates of human stiffness modulation for the human ankle and knee sagittal joints. (C) Kinematic and actuation profiles derived from the human-like B4LC simulator.

simulated biomimetic biped, called B4LC (Luksch and Berns, 2010), i.e., Bio-inspired Behavior-Based Bipedal Locomotion Control, to generate realistic estimates of human-like kinematic and torque patterns based on the biped's weight (35 kg) and height (170 cm) for level ground walking at a cadence of 1.4 s/stride (**Figure 1C**). As all the VSA actuators were meant to have the same specifications, we based the requirements on the maximum values observed in the simulations. We defined a maximum required range of motion (ROM) of 90◦ and a maximum torque of 40 Nm. As for the requirements on the joint stiffness modulation, we used a model, developed by Sartori et al. (2015), to predict stiffness changes in the joints throughout the gait cycle for the knee and ankle joints (**Figure 1B**). Based on this model we defined a required stiffness range between 0 and 5 Nm/deg. As for the hip joint, we defined the same stiffness range as in agreement with the results presented in Shamaei et al. (2013). Since no evidence was available on the stiffness variation in the waist, the same stiffness range was defined.

#### 2.2. Conceptual Development

The VSA actuator presented in this work is based on the MACCEPA concept (Van Ham et al., 2007). In the original concept (**Figure 2A**), the spring is housed on the output link, which means both the mechanisms used to drive the lever arm and the one to change the pre-compression on the spring, P, are housed on different links of the joint (namely input and output link, respectively). In contrast, the novel MACCEPA concept presented in this paper (**Figure 2B**) allows both mechanisms to be housed in the input link. This represents an advantage since the actuator's mass can be placed in more proximal locations to reduce the inertia of the output link. Besides, since the actuator's components are housed on the same link of the joint (the input link), the remaining link (the output link) can house a second actuator to drive a second DoF by means of a universal joint, where each of the actuators drive one of the axis of the 2 DoFs joint. A strap is attached to the output link (OL) at a distance D from the axis of rotation, denoted by O in the diagram. The strap is subsequently guided through the lever arm (LA). The strap is guided through the joint's axis of rotation O to a linear spring housed on the input link, to which it is rigidly attached. The effective length of LA, B, is defined as the distance between the axis of rotation and the point where the strap makes initial contact with LA. When LA and OL are aligned, the force exerted by the strap, due to the initial deformation of the spring, or initial pre-compression, is aligned and no torque is exterted on OL. When α is different from zero, an additional deformation is added to the strap, which exerts a force on the spring FS. The exerted force is no longer aligned with OL, resulting in a torque determined by the perpendicular component of F<sup>S</sup> with respect to the OL at a distance D from the joint's center of rotation. By changing the initial pre-compression force P in the spring, the actuator's output stiffness can be adjusted. One of the practical features of the proposed design is that the actuator's output torque and stiffness characteristics (or quasi-stiffness Rouse, E.J. et al., 2013) can be easily and accurately predicted using only the deviation angle α, and the force in the strap, FS(t), or the spring linear deformation, p(t), induced by the compression, or stiffening mechanism. The torque-angle relationship is given by

$$\begin{aligned} T &= T(\alpha, p) = D \cdot f\_{\mathbb{S}} \\ &= k \left( A(\alpha) + B + p(t) - D \right) \frac{BD}{A(\alpha)} \sin(\alpha) \end{aligned} \tag{1}$$

or

$$T = T(\alpha, F\_{\rm S}) = F\_{\rm S}(t) \frac{BD}{A(\alpha)} \sin(\alpha) \tag{2}$$

where FS(t) is the force in the strap and

$$A(\alpha) = \sqrt{B^2 + D^2 - 2BD\cos(\alpha)}\tag{3}$$

is the length between the attachment points of the strap at the output link and the lever arm. T = T(α, FS) can be used to improve the torque prediction if F<sup>S</sup> is directly measured, which is convenient in situations when the spring stiffness k is unknown or non-linear. The force in the strap and the deformation

of the spring induced by the compression mechanism are related through

$$p(t) = \frac{1}{k} F\mathbf{s}(t) - A(\alpha) - B + D.\tag{4}$$

When p(t) = p is held constant, the force in the spring due to it's initial deformation is P = k · p = FS(0) when the actuator is at equilibrium position (α = 0).

This relatively simple representation, given by (1)–(4), can effectively be used to estimate the actual output torque for design purposes as well as in closed-loop feedback schemes. The partial derivative of (1) with respect to α yields the output stiffness S of the actuator (or quasi-stiffness) as a function of α for a given initial spring deformation p:

$$\begin{split} S &= S(\alpha, p) = \frac{dT(\alpha, p)}{d\alpha} \\ &= k \left( A(\alpha) + B + p(t) - D \right) \\ &\times \frac{BD}{A(\alpha)} \left[ \cos(\alpha) - \left( \frac{BD \sin^2(\alpha)}{A(\alpha)^2} \right) \right] \\ &\quad + k \left( \frac{BD \sin(\alpha)}{A(\alpha)} \right)^2 \end{split} \tag{5}$$

or

$$\mathcal{S} = \mathcal{S}(\alpha, F\_{\mathcal{S}}) = \frac{dT(\alpha, F\_{\mathcal{S}})}{d\alpha}$$

Frontiers in Neurorobotics | www.frontiersin.org

$$=F\_{\mathbb{S}}(t)\times\frac{BD}{A(\alpha)}\left[\cos(\alpha)-\left(\frac{BD\sin^{2}(\alpha)}{A(\alpha)^{2}}\right)\right]$$

$$+k\left(\frac{BD\sin(\alpha)}{A(\alpha)}\right)^{2}\tag{6}$$

if the force in strap is available.

When the system is at it's equilibrium position, S accounts for the apparent stiff reaction of it's output link to external perturbations producing a deviation angle from the equilibrium position, as in the case of a passive torsional spring.

#### 2.3. Mechanical Design

**Figure 3A** shows the mechanical design of the proposed actuator based on the previously explained concept. The system consists of two mechanisms: the driving mechanism (DM) and the stiffening mechanism (SM), connected by means of a Kevlar <sup>R</sup> strap.

The driving mechanism defines the position of the lever arm, and as such the position where the actuator does not generate any torque, i.e., the equilibrium position. The purpose of the stiffening, or pre-compression, mechanism is 2-fold: On the one hand, it transforms any deviation angle between the lever arm and the output link into a deformation of the spring during the operation of the actuator. On the other hand, the stiffening mechanism is able to set the initial deformation on the spring to adjust the level of compliance, or stiffness, of the output link when interacting with the environment. **Figure 4** illustrates the working principle of the actuator.

#### 2.3.1. Driving Mechanism

The driving mechanism consists of a custom-made inverted slider-crank configuration that drives the lever arm (**Figure 3A**). A motor (EC-4pole 22, 200 W, 24 V, Maxon motor) in combination with a ball spindle drive (GP 32S Maxon motor, <sup>1</sup> : 1 transmission <sup>∅</sup><sup>10</sup> <sup>×</sup> 2 mm ball) acts as the slider. The spindle's nut is connected to the crank, allowing it to rotate with respect to the nut. The crank itself is rigidly connected to the lever arm. The motor-spindle-drive combination is hinged with respect the input link so that the slider's length equals the distance between the crank-nut connection and the hinge point. The angle between the crank and the lever arm, the length of the crank, and the relative position of the hinge to the axis of rotation can be used to tune the transmission between spindle drive and the lever arm. When the spindle drive rotates, it transforms the rotational movement of the motor into a translation of the nut. When the position of the nut changes along the spindle screw, the length of the slider changes. This, in turn, changes the configuration of the inverted slider-crank and alters the angular position of LA. The linear force exerted by the spindle drive, therefore, is transformed into torque on the lever arm. A strap is fixed to the free end of the OL and enters the LA through a pair of pulleys. It exits the LA through a second pair of pulleys that guides it through the joint's axis of rotation to avoid parasitic torques. To accommodate this, the axis of rotation is not a continuous axle, but rather two axles, one on each side of the lever arm. The strap is subsequently guided toward the stiffening mechanism where it is attached to the shuttle.

#### 2.3.2. Stiffening Mechanism

The stiffening mechanism uses a carriage system to transform a deviation angle α into a deformation of the spring. A motor (EC-4pole 22, 200 W, 24 V, Maxon motor) directly connected in-line with a trapezoidal spindle drive (Spindle Drive GP 32 S, Maxon motor, 4.1 : 1 transmission TR <sup>∅</sup><sup>10</sup> <sup>×</sup> 2 mm) is attached to the input link. The spindle's screw is centrally positioned through a carriage system. The spindle's nut is located on the far end of the motor side. As indicated in **Figure 3A**, the shuttle consists of a set of clamps, Platform A, and Platform B. The shuttle is centered around the spindle and is able to translate linearly relative to it. A clamp rigidly attaches the strap to the shuttle through platform A. A die spring (68.68 N/mm, Lesjöfors) is placed around the screw and is held in place by Platform B on one end and the spindle's nut on the other end. By changing the position of the nut with the stiffening motor while the shuttle is kept in-place, the initial compression of the spring is changed. When a force, parallel to the screw and directed away from the motor side, is applied to the shuttle, the spring compresses against the nut.

### 2.4. Prototype

**Figure 3B** shows the mechanical construction of the first prototype. The driving and the stiffening mechanisms are attached to different sides of an aluminum H profile. As for the materials, we employed aluminum 7,075 for most of the actuator's parts except for the joint's center of rotation, where we chose stainless steel. The DM includes two 14-bit magnetic encoders (AS5048A, Austriamicrosystems), to measure the deviation angle between LA and OL (α) and the angular position of the output link with respect to the input link. The shuttle accommodates a donut compression load cell (SS1108 4448N, Toledo Transducers) between the spring and Platform B to measure the force acting on the strap (FS) and derive the output torque. The actuator has a total weight of 1.2 kg excluding the weight of the H-profile and can generate a maximum torque of 40.0 Nm. The maximum spring compression is 21.7 mm, resulting in a maximum force in the strap of around 1, 500 N. The main configuration parameters of the actuator are summarized in **Table 1**. These values were the result of an optimization process


to minimize the lever arm energy requirements for the specified torque trajectories.

# 2.5. Integration in the Two-Degree-Freedom Joints of Binocchio

A common approach to construct multi-DoF joints in robotics is cascading single-DoF joints. This approach can become bulky and complex when VSAs are implemented due to the required extra motor and stiffening mechanism. A Cardan centralizes 2 DoFs in one joint and therefore is an excellent alternative to reduce the complexity and size of multi-DoFs joints. It consists of two axes rigidly and orthogonally connected to each other in the center of the joint. The 2-DoFs joints of the biped are driven by Cardan, or universal, joints. Each of the axes drives either the saggital or the frontal DoF of the joint. As shown in **Figure 5**, an H-shaped and a U-shaped structural profile are connected by means of bearings to the sagittal and frontal axis, respectively, of the Cardan. The VSA module is housed on the custommade H-profile, driving the sagittal axis while a MACCEPAbased SEA module, presented in Rodríguez-Cianca et al. (2015), is housed on the U-profile, driving the frontal axis. **Figure 5** shows a magnified view of the biped's 2-DoF ankle joint and an illustration showing the construction of the Cardan joint. This modular construction is used in the entire biped for the ankle, hip and trunk joints. This results in a mass-wise human-like, tapered leg with one SEA housed on the ankle, one VSA in the shank, 2 VSAs in the thigh, one SEA on the hip, and one VSA in the trunk (**Figure 6**).

# 3. EXPERIMENTS

# 3.1. VSA Characterization

The actuator's output torque and compliant behavior is a function of the initial spring pre-compression P and deflection angle α. We carried out two static tests in order to characterize this behavior

and compare it with the theoretical models. We attached both the actuator and the output link to an external rigid frame, while varying the α angle by acting on the driving mechanism. We recorded the resulting torque Text by means of a bidirectional load cell (LSB200 445 N, FUTEK) placed between the output link and the external frame. We repeated the experiment for different values of initial compressions: 50 , 250 , 500, and 750 N, corresponding to percentages of the maximum pre-compression values of 3.3, 16.6, 33, and 50%, respectively. The readings of the internal and external load cells, as well as the lever arm encoder, where captured at a sample frequency pf 1 kHz by a real-time data-acquisition (DAQ) system (PCI-6229 National Instruments DAQ board on an Intel Core2 Duo 2.16 GHz C5102 Beckhoff Automation industrial computer running under MathWorks Real-Time Windows Target <sup>R</sup> 4.2 and Simulink <sup>R</sup> 8.1 on Microsoft Windows <sup>R</sup> XP). Each of the actuator's motor was driven by a commercial motor drive (ESCON 70/10 700W, Maxon motor) set in velocity control mode commanded by a proportional-integral- derivative (PID) force controller, in the case of the SM motor, and a proportional (P) position controller, in the case of the DM motor, with external reference inputs commanded by the DAQ system. **Figure 7A** shows the output torque as a function of the deflection angle α for different constant values of the initial compression force P in the spring using a sinusoidal angular displacement command for the lever arm with a 10◦ amplitude at 0.5-Hz. A comparison is made

between the externally measured torque Text with respect to the theoretical approximations provided by (1) and (2). Using T(α, p), the root mean square error (RMSE) is 7.0%, with a maximum normalized error of 21.5%. Using T = T(α, FS), the RMSE is reduced to 6.1% with a maximum normalized error of 20.7%, possibly thanks to some dynamic effects captured by the internal load cell, errors when measuring the spring deformation or due to the fact that the spring constant k might not be perfectly lineal.

**Figure 7B** compares the output stiffness estimation using (5) and (6) for different values of the initial spring deformation. The comparison is also made against the instantaneous approximation of the derivative of Text(t) with respect to α(t) using finite differences, denoted by Sext. This numerical approximation of S depicts a tendency similar to that obtained by the analytical estimation, showing a stiffening behavior as both the initial spring deformation (P) and deviation angle (α) values increase. The lack of accuracy of Sext, specially at higher values of P, could be explained by numerical errors due to the presence of noise in the input signals and possibly due to very small errors on the angle measure, which induces larger output torque estimation errors at higher pre-compression values.

# 3.2. Closed-Loop Frequency Response

#### 3.2.1. Driving Mechanism

The torque control of the actuator is carried out by means of the driving mechanism. Similar to former MACCEPA designs, the frequency response of this mechanism depends on the level of initial deformation applied to the actuator's spring. In order to determine this influence, we set the torque controller to follow a chirp trajectory with a peak amplitude of 10 Nm centered around 0 Nm in a frequency range between 0.05 to 10 Hz during 40 s for three different levels of initial spring compression P of 250 N (16% of Pmax), 500 N (33%) and 750 N (50%), while the output link was kept locked. The torque controller used the approximation T = T(α, FS) to estimate the output torque based on the position of the lever arm, α, measured by a magnetic encoder, and the force in the strap, FS, measured by a load cell placed in series with the spring. The lever arm was controlled by the DM motor, which was set in velocity control mode with a nested limiting current controller using a proportional-integral closed loop controller using the torque estimation as feedback.

The system is able to exhibit different frequency-response characteristics as a function of the initial pre-compression force, as can be seen in **Figure 8**. The controller presented a bandwidth, calculated as the cutoff frequency at an attenuation of −3 dB, of 5.27 , 6.36, and 6.53 Hz for initial pre-compression values of 250 , 500, and 750 N, respectively.

#### 3.2.2. Stiffening Mechanism

The stiffening mechanism of the actuator is responsible of the output stiffness variation. Actuator's output stiffness is estimated using the approximation S = S(α, FS) by measuring the linear force in the strap F<sup>S</sup> and the lever arm deviation angle α. The bandwidth of this mechanism was calculated in the absence of any external perturbance, i.e., α = 0. By modulating the force in the strap, therefore, the passive stiffness of the system can be modified. We used a simple proportional control loop to regulate the force in the Kevlar strap F<sup>S</sup> by means of the driving mechanism motor while set in velocity control mode. The force controller was set to follow a chirp signal with a frequency swept from 0.05 Hz to 10.00 Hz in 40 s and an amplitude of 350.0 N centered around 400.0 N, corresponding to an output stiffness in the range [0.25-3.68] Nm/deg. The frequency response of the stiffening mechanism is shown in **Figure 9**. For the specified force command, the mechanism presents an average bandwidth of 2.23 Hz.

### 3.3. Human-Like Profiles Tracking

The third set of experiments aimed at testing the ability of the VSA actuator to replicate the human-like torque, kinematics, and stiffness profiles defined in section 2.1. Results on the human-like performance of the actuator for the biped's hip, knee and ankle joints are reported in **Figure 10**.

To reproduce the human-like joint angle characteristics (**Figure 10**, left), the output link of the actuator was left free to move. The output angle was measured by means of a magnetic encoder (AS5048, AMS) with 14-bit of resolution, and used as feedback in a PID controller. In order to cancel the effect of gravity, the actuator was placed horizontally. Experiments were performed for two conditions: in the presence and in the absence of an external load, and repeated for hip, knee and ankle reference trajectories. The output load consisted of a mass of 2.5 kg attached at a distance of 15 cm from the actuator's axis of rotation, generating an inertia of 0.05625 kg m<sup>2</sup> . All tracking experiments were performed at a fixed pre-compression value of 200 N. PID values were manually tuned for each condition so that overshoot was limited to 5%, and data were collected for 15 consecutive cycles, which were segmented offline to get the mean and standard deviation values.

At the hip joint the actuator can follow the desired kinematic trajectory with a RMSE of 2.71 ± 0.41◦ , representing a

normalized error of 5.89 %. In the presence of an external load, the RMSE increased to a value of 6.71 ± 0.32◦ (14.59 %). The knee shows an RMSE of 1.43 ± 0.45◦ (2.86 %) without the presence of any load, and of 7.18 ± 0.34◦ (14.33 %) in the presence of an external load. At the ankle joint, RMSE values were 1.41 ± 0.40◦ (4.33 %) and 5.51 ± 0.22◦ (16.96 %), without and with external load, respectively.

To test the output torque tracking performance for hip, knee and ankle joints, the generated output torque was estimated by T = T(α, FS) and used as feedback to a PI controller. The actuator could accurately follow the desired torque trajectories for the three different joints (**Figure 10**, center). It generated a maximum torque of 40 Nm, with peak to peak values of 65 Nm at the specified gait frequency. The maximum RMSE was 5.89 ± 0.47 Nm (9.14 %), at the hip joint. The knee and ankle joint showed an RMSE of 2.98 ± 0.24 Nm (10.37 %), and 1.62 ± 0.14 (6.90 %) respectively. These results show the actuator is able to accurately follow the desired trajectories with small tracking errors.

To generate the human-like stiffness modulation profiles (**Figure 10**, right), we imposed a stiffness trajectory provided by simulation (Sartori et al., 2015) on human stiffness during walking for a speed of 1.4 s/stride. The stiffness trajectory was then converted into a desired force in the strap (FS) using the approximation provided by (6) and used as setpoint for the stiffening controller. For this experiment, the angle α was kept constant at 0◦ . Experiments show a mean error of 1.036 ± 0.001 Nm/deg (26.61 %) for the knee and 0.424 ± 0.017 Nm/deg (10.14 %) for the ankle joint. In these experiments, the maximum required force in the spring (FS) was 790 N and 870 N for the knee and ankle, respectively. Hip measurements could not be performed due to the lack of human reference data on this joint.

#### 4. DISCUSSION

Experiments showed that the theoretical model of the actuator provides a good approximation of it's output torque and stiffness (**Figures 7A,B**), which suggests that the actuator can be controlled using a real-time model-based approach instead of relying on load cells, as done by most torquecontrolled humanoids, with also direct advantages on mechanical complexity. The frequency response of the driving and variable stiffness mechanisms showed an average bandwidth of 6 and 2 Hz respectively (**Figures 8**, **9**). In practical terms, the system can

simulator.

go from an initial output stiffness of 0 to 5 Nm/deg in about 0.5 s in the absence of external perturbations. It is important to note that these results strongly depend on the used controller and that the main purpose of the tests was to show the ability of the system to change it's frequency response as a function of the stiffness settings. Other control strategies are currently being explored in order to achieve a better performance. However, with the current control strategy, the actuator could effectively modulate it's output position, torque and stiffness in the same range, amplitude and frequency as human joints during walking based on our simulations, as it can be concluded from the small tracking errors observed in **Figure 10**. The actuator could generate a maximum torque of 40 Nm, with peak to peak values of 65 Nm at the specified gait frequency.

This, in combination with the high position, torque and stiffness control accuracy, suggests the actuator could perform the desired function when implemented in the robot. However, in the current state of the research, these results not yet allow for a conclusive assessment of the biped's performance given the tracking accuracy of the actuator, and other additional higher level control strategies might be required to compensate for the existence of tracking errors.

As presented in section 2.5, the current design only allows to implement 2 compliant actuators (2 VSAs, 2 SEAs or 1 VSA and 1 SEA) into one centralized joint using a cardan. If a third DoF is required, for instance, it becomes necessary to cascade a third actuator connected to the 2-DoF module (see **Figure 6**), as it's done in the case of the hip and the waist of Binocchio (see **Figure 1A**). However, based on an analysis done in a previous study from the same authors providing an overview of the key principles of human bipedal walking (Torricelli et al., 2016), we believe it is fair to say that the maximum required amount of DoFs per joint is limited to 3 in order to achieve humanlike biped locomotion, including walking under unperturbed and perturbed conditions, and mostly at the hip and waist level. Finally, due to the lack of data for the human waist, we assumed that the same stiffness range as for the rest of the joints applied, which is not necessary correct. However, from the authors point of view, this is a question related to the design choices of the biped robot itself, and not on the design and performance of the VSA, which is the main focus of this study. Therefore, in the authors opinion, the answers to these questions are out of the scope of this paper, and will have to be further investigated in a future study.

Currently, compliance in humanoid robots is mostly used with the goal of improving torque control and/or ensuring safety to shocks, rather than truly replicating human-like joint functions. Our results show that the presented VSA can span from a rigid configuration, up to 5 Nm/deg, to values very close to zero, resulting in a passive behavior of the joint useful when the limb should move freely under inertial or gravitational effects, e.g., during the swing phase. Besides, the actuator presents a stiffening effect with respect to the deviation angle, which has been found beneficial for locomotion (Seyfarth et al., 2006; Vanderborght et al., 2011). Finally, in a previous study the authors demonstrated the ability of the proposed VSA to reduce it's electrical energy consumption online during the execution of repetitive tasks (Jimenez-Fabian et al., 2018). These results suggest that the presented actuator could also reduce the energy requirements of the Biped for walking due to the repetitive nature of this task, which would highly reduce it's cost of transport.

## 5. CONCLUSIONS

The understanding of human locomotion has led to usage of VSAs in humanoids due to their inherent advantages. Striving toward a more accurate biologically inspired robotic counterpart of humans, these VSAs have to be implemented in mechanical multi-DoF joints to replicate the rich variety of movements human present. Furthermore, distribution of the extra mass that these VSAs bring forth, due to the required additional motors, is paramount for minimizing the cost of transport of humanoids. However, research toward multi-DoF VSAs is still scarce. This work presented and tested a novel VSA concept designed to be implemented in the sagittal DoFs of the legs of a bio-inspired humanoid robot designed as platform for the validation of biomimetic controllers and the understanding of the neuromechanical processes of human movement, including the role of compliance during walking. The presented actuator allows to place the motors of the VSA in-line with the actuated link and house both motors on the same side of the actuated joint. This not only places the actuator's mass to more proximal locations to create a more favorable mass distribution in the design of the humanoid's leg, but also facilitates it's usage in multi-DoFs joints. The construction of the VSA module, it's main mechanisms, and overall working principle have been explained. The equations necessary to determine it's characteristics have been derived and experimentally validated. The experimental results confirm the functionalities expected from the proposed concept and the accuracy of it's mathematical description. These innovations have been finaly illustrated with the implementation of the VSA module in the multi-DoF joints of the biped Binocchio.

Under a neuroscientific perspective, Binocchio represents a biorobotic test bench that may serve in the future to understand the biomechanical mechanisms of walking performance. The replication of biological joint stiffness dynamics is an emerging and largely unexplored issue that may produce significant step changes in robots operating in real-life environments. This has important implications in robotics, enabling to experimentally validate the hypothesis that stiffness modulation is a determinant for robust and efficient walking, providing invaluable understanding of the neuro-mechanical processes of human movement. Binocchio represents an advanced mechatronic platform that can allow new biologically-motivated walking and standing control algorithms to be directly validated in real-life environment. Besides, beyond humanoid applications, our work has also potential impact in the field of rehabilitation and assistive robotics, where the role of variable compliance has been recently identified as a key factor for the achievement of truly human-like behavior.

## AUTHOR CONTRIBUTIONS

DR-C, MW, MS-V, and DL conceptualization. DR-C, MW, RJ-F, JG-V, and DT methodology. DR-C and RJ-F software. DR-C and RJ-F validation. DR-C formal analysis. DT, BV, JP, and DL resources. DR-C and MW writing original draft preparation. DR-C, MW, RJ-F, DT, JG-V, MS-V, MS, KB, BV, JP, and DL writing review and editing. JP and DL funding acquisition.

#### FUNDING

This work was funded by the European Commission under the FP7 project H2R Integrative Approach for the Emergence of Human-Like Robotic Locomotion under grant no. 600698.

#### REFERENCES


DR-C is a FWO-SB fellow of the Research Foundation - Flanders (FWO).

#### ACKNOWLEDGMENTS

The authors would like to thank all researchers participating in the H2R project, who contributed directly or indirectly to the discussions leading to the design and development here presented.


**Conflict of Interest Statement:** JG-V is employed by company Ottobock GmbH, but was not employed at the time when the research was conducted. All other authors declare no competing interests.

The reviewer SG declared a shared affiliation, though no other collaboration, with one of the authors MS to the handling Editor.

Copyright © 2019 Rodriguez-Cianca, Weckx, Jimenez-Fabian, Torricelli, Gonzalez-Vargas, Sanchez-Villamañan, Sartori, Berns, Vanderborght, Pons and Lefeber. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Assistive Arm-Exoskeleton Control Based on Human Muscular Manipulability

#### Tadej Petricˇ 1 \*, Luka Peternel <sup>2</sup> , Jun Morimoto<sup>3</sup> and Jan Babicˇ 1

<sup>1</sup> Laboratory for Neuromechanics and Biorobotics, Department for Automatics, Biocybernetics and Robotics, Jožef Stean Institute, Ljubljana, Slovenia, <sup>2</sup> Department of Cognitive Robotics, Delft University of Technology, Delft, Netherlands, <sup>3</sup> Department of Brain-Robot Interface, ATR Computational Neuroscience Labs, Kyoto, Japan

This paper introduces a novel control framework for an arm exoskeleton that takes into account force of the human arm. In contrast to the conventional exoskeleton controllers where the assistance is provided without considering the human arm biomechanical force manipulability properties, we propose a control approach based on the arm muscular manipulability. The proposed control framework essentially reshapes the anisotropic force manipulability into the endpoint force manipulability that is invariant with respect to the direction in the entire workspace of the arm. This allows users of the exoskeleton to perform tasks effectively in the whole range of the workspace, even in areas that are normally unsuitable due to the low force manipulability of the human arm. We evaluated the proposed control framework with real robot experiments where subjects wearing an arm exoskeleton were asked to move a weight between several locations. The results show that the proposed control framework does not affect the normal movement behavior of the users while effectively reduces user effort in the area of low manipulability. Particularly, the proposed approach augments the human arm force manipulability to execute tasks equally well in the entire workspace of the arm.

#### Edited by:

Yongping Pan, National University of Singapore, Singapore

#### Reviewed by:

Duygun Erol Barkana, Yeditepe University, Turkey Tairen Sun, Jiangsu University, China Venketesh N. Dubey, Bournemouth University, United Kingdom

# \*Correspondence:

Tadej Petricˇ tadej.petric@ijs.si

Received: 30 January 2019 Accepted: 08 May 2019 Published: 29 May 2019

#### Citation:

Petric T, Peternel L, Morimoto J and ˇ Babic J (2019) Assistive ˇ Arm-Exoskeleton Control Based on Human Muscular Manipulability. Front. Neurorobot. 13:30. doi: 10.3389/fnbot.2019.00030 Keywords: exoskeleton control, manipulability analysis, robot assistance, arm exoskeleton, human-robot interaction

# 1. INTRODUCTION

To date, many exoskeleton systems have been designed and controlled to either assist or resist the human motion depending on the application type. These systems enclose either a larger part of the human body or just individual joints. Most of the control methods here are focused on augmenting the effectiveness of the users in terms of joint motion or joint torque. To enable safe interaction between the exoskeleton and the user, a common approach is to use an impedance or admittance controller (Tsagarakis and Caldwell, 2003; Marchal-Crespo and Reinkensmeyer, 2009), where the interaction forces are controlled through a mass-spring-damper system (Hogan, 1985). Contrarily, in power augmentation tasks the exoskeleton needs to provide additional joint torques to augment the existing body capabilities. Here movement intentions of the user and corresponding joint torques are obtained by either direct force/torques measurements (Pratt et al., 2004; Kong and Jeon, 2006) or muscle activity measurements (Petric et al., 2011 ˇ ). The most common approach for measuring the muscle activity in exoskeleton control methods is the use of electromyography (EMG) (Fleischer et al., 2005). Different methods can be used to map muscle activities into the joint torques, such as biomechanical models (Rosen et al., 2001; Fleischer and Hommel, 2008), proportional mapping (Ferris et al., 2006; Lenzi et al., 2012; Koller et al., 2017; Toxiri et al., 2018) or machine learning algorithms (Peternel et al., 2016).

Many of these power augmentation methods are design to amplify the force evenly regardless of the limb configuration and the desired direction of movement. However, the force capability of the user's limb endpoint is heavily dependent on its current configuration and the direction of movement. A common approach to evaluate the biomechanical performance of a limb endpoint is to use manipulability measure that was initially derived for analysis of anthropomorphic robots (Yoshikawa, 1985). Manipulability is a measure that describes the relationship between joints and limb endpoint with respect to velocity (Yoshikawa, 1985; Vahrenkamp et al., 2012), acceleration (Chiacchio and Concilio, 1998; Yokokohji et al., 2009) or force (Bicchi et al., 1997; Gravagne and Walker, 2002; Tanaka et al., 2015). These measures are used to evaluate the effects of instantaneous variation in joints on the variation at the endpoint, and is usually represented by a spheroid around the endpoint. The distance from the endpoint to the spheroid surface in a given direction represents the maximal feasible velocity, acceleration or force capacity in that direction.

A few studies explored how to exploit manipulability measures for human motion augmentation. In a study (Petric et al., 2016 ˇ ), we proposed a control approach that compensates the anisotropic property of the kinematic manipulability related to the human arm. Similarly, in Shen et al. (2017), the control approach was improved by incorporating endpoint loading conditions into the modified manipulability models. In Kim et al. (2010), the dynamic manipulability was used to generate an energy efficient gait pattern. However, these control methods (Kim et al., 2010; Petric et ˇ al., 2016; Shen et al., 2017) were only based on human limb kinematics without considering also biomechanical specifics of human arms, i.e., muscles. In contrast to typical robotic actuators with gears and motors, the human joints are actuated by sets of antagonistically coupled muscles. The force generation capacity is configuration dependent and the relationship between the muscle forces and joint torques is nonlinear.

Many studies in human biomechanics already thoroughly analyzed the relationship between the joint torque and joint angle in lower limbs (Anderson et al., 2007), upper limbs (Leedham and Dowling, 1995; Kentel et al., 2011) and whole body (Millard et al., 2013). Analysis of the human force manipulability was introduced in Jacquier-Bret et al. (2012); Yu and Liang (2012), but without consideration of the specifics of the human actuators. Nevertheless, a detailed study which would address the relationship between joint torques and forces at the limb endpoint in terms of movement or force capabilities was still missing. The quantitative evaluation of force generation at the endpoint was reported in Sasaki et al. (2010), where manipulability models were developed using the human joint torque characteristic. To properly account for the effect of specific characteristics of human joint on endpoint manipulability, a study recently derived a manipulability model of the endpoint via human muscle forces (Ohta et al., 2014).

To address the limitations of the control method in Kim et al. (2010); Petric et al. (2016); Shen et al. (2017) ˇ , we propose a novel control method for an upper body assistive device which takes into account the human arm muscular force manipulability (Ohta et al., 2014). The proposed method derives from biomechanical studies to account for configuration dependent force capabilities of the human arm and selectively augments the user endpoint force capabilities based on the current arm configuration and motion direction. As a result, the exoskeleton provides more support to the arm in configurations and directions of motion where the force manipulability is smaller, and vice versa, less support to arm in configurations and directions of motion where the force manipulability is high. As a consequence, the proposed exoskeleton controller effectively maintains a spherical endpoint force manipulability of the human arm in the entire workspace.

To analyze the effects of the controller on the human motion, we hypothesize that the proposed control approach will reduce the human effort without diverting from the normal unassisted motion trajectory. To validate the proposed approach and hypothesis, we performed an experimental study on nine subjects, who were wearing a two degrees-of-freedom (DoF) arm exoskeleton. Their task was to move a 4 kg weight between two different target locations. We used a surface EMG to measure the effort of each subject during the task execution.

A preliminary study was presented at 2017 IEEE International Conference on Robotics and Automation (Goljat et al., 2017), where the method was introduced and evaluated only on a single subject. The specific contributions of this paper are: an extended evaluation based on data from nine naive subjects supported with a statistical analysis, an extended method formulation with a more in-depth explanation, a more thorough overview of the related work, and an additional discussion of novel results.

FIGURE 1 | Illustrated representation of proposed method using a muscular manipulability model (Goljat et al., 2017). The human arm is modeled as a two-segment rigid-body mechanism that is actuated by ten muscles: three shoulder muscle (sternal and clavicular part of Pectoralis major and Deltoid muscle), two bi-articular muscles (Triceps long head and Biceps short head), and five elbow muscles (Triceps lateral and medial head, Biceps long head, Brachialis and Brachioradialis). The red ellipse represents the human arm muscular force manipulability, and the blue circle shows the resultant force manipulability of the combined system of human arm with assistive device.

#### 2. FORCE MANIPULABILITY

Manipulability is defined by the kinematics of the mechanism, where the joint angle variations are propagated into the endpoint variations (Yoshikawa, 1985). The arm manipulability can be expressed with an ellipsoid around the endpoint whose radius represents the capacity of movement in different directions of Cartesian space. Orthogonally to the manipulability ellipsoid is the force ellipsoid, whose radius represents the capacity of exerting a force in different directions. The direction of the largest force capacity is also the direction where the robot is the least sensitive to the actuator errors (Gravagne and Walker, 2002).

The classic manipulability measures assume that the joints are driven by the actuators (e.g., motors) that can produce equal joint torque in both directions, independently of the configuration. However, human arm is driven by muscles, whose torque production characteristics change with the configuration of the arm. Therefore, the classic manipulability measures need to be updated to account for these properties. In literature, the models that can account for such human specifics are called muscular manipulability models (Tanaka et al., 2005; Ohta et al., 2014). In this paper, we extended the muscular manipulability model of the arm (Ohta et al., 2014) to used it for controlling an arm exoskeleton. The main goal of the control concept is to augment the human motion using the muscular manipulability model. As a result, the human arm force manipulability becomes spherical throughout the entire workspace. The conceptual idea is illustrated in **Figure 1**.

The following sub-sections first provide the mathematical formulation of the classical manipulability measure and then its extension toward the muscular manipulability model. For the sake of clarity, the method is explained on a planar case, where ellipsoids are reduced to ellipses. Nevertheless, the method is general and operates in the 3D space.

#### 2.1. Force Manipulability

The Jacobian matrix **J** describes the relationship between the joint velocities and endpoint velocities, while **J** <sup>−</sup><sup>T</sup> describes the relationship between the joint torques and endpoint forces. In case of a non-redundant mechanism, eigenvalues and eigenvectors of either matrix define manipulability ellipse and force manipulability ellipse, respectively. In a general case when mechanism has redundant DoFs, the ellipse can be derived by mapping all possible variables in joint space, contained within a unit circle, into the endpoint variables in the Cartesian space. A set of all joint toque variables contained within the unit circle is described by:

$$||\mathfrak{r}||^2 = \mathfrak{r}^T \mathfrak{r} \le 1,\tag{1}$$

where τ is joint torque vector. In general, the transformation from the joint torques to the endpoint forces is given by:

$$\mathbf{r} = \mathbf{J}^T(\mathbf{q})\mathbf{F},\tag{2}$$

where **F** is the endpoint Cartesian force/torque vector and **q** is the joint angle vector. By inserting (2) into (1) we get:

$$\left|\left|\mathbf{J}^{T}F\right|\right|^{2} = F^{T}(\mathbf{J}\mathbf{J}^{T})F \le 1,\tag{3}$$

where the inner product **JJ**<sup>T</sup> <sup>=</sup> **<sup>M</sup>**<sup>v</sup> is used to compute the manipulability and (**JJ**<sup>T</sup> ) <sup>−</sup><sup>1</sup> <sup>=</sup> **<sup>M</sup>**<sup>F</sup> is used to compute the force manipulability. Using singular value decomposition (SVD) of **M**,

$$\mathbf{M} = \mathbf{U}\Sigma\mathbf{V}^\*,\tag{4}$$

where **U** is a unitary matrix, 6 is a diagonal matrix with nonnegative real numbers on the diagonal, **V** is unitary matrix, and **V** ∗ is the conjugate transpose of **V**, we can obtain the singular vectors, which correspond to the minor and major axes of the manipulability ellipse (Yoshikawa, 1985). Minor and major axes represent the directions in which the lower and the higher forces can be generated respectively. Even though this is a kinematic-based metric, it has still been used in several studies of human motion (Sabes and Jordan, 1997; Hara et al., 1998; Tanaka et al., 2005).

#### 2.2. Muscular Force Manipulability

To account for the forces that are generated by the muscles acting on the joints, we derive the muscular manipulability measure, which describes the transformation between the muscle forces and the endpoint forces. First, the transformation of the muscle forces to the joint torques is governed by:

$$\boldsymbol{\pi} = \mathbf{J}\_m^T(\boldsymbol{\varrho}) \boldsymbol{F}\_m,\tag{5}$$

where **J**<sup>m</sup> is the muscle Jacobian matrix that maps muscle forces **F**<sup>m</sup> into joint torques. **J**<sup>m</sup> matrix also represents the muscle moment arms for each joint. The moment arms for the extensor muscles were defined as the shortest distances between the centers of the joints and the lines connecting the origins and the insertions of the muscles. The parameters for the origin and insertion points of the muscles were selected from the literature (Wood et al., 1989). By merging the (Equations 2, 5) we get the relationship between the muscles forces and the endpoint forces as:

$$F = \mathbf{J}^{-T} \mathbf{J}\_m^T F\_m. \tag{6}$$

To account for the muscular activation levels we use the Hill's muscle model. The relationship between muscle forces and endpoint forces now derives into:

$$F = \mathbf{J}^{-T} \mathbf{J}\_m^T F\_h \mathbf{a},\tag{7}$$

where **F**<sup>h</sup> is the diagonal matrix of the Hill's muscle force equation and the muscular activation levels are bounded to ||α|| < 1. Note that muscle activation is greater than 0 at rest and less than 1 at max. In the same manner as in (3), by using (7) we get the expression that determines the muscular manipulability

$$\mathbf{M}\_m = (\mathbf{J}^{-T}\mathbf{J}\_m^T\mathbf{F}\_h)(\mathbf{J}^{-T}\mathbf{J}\_m^T\mathbf{F}\_h)^T. \tag{8}$$

Similarly as before, we use a singular value decomposition of **M**<sup>m</sup> to obtain singular vectors that correspond to the minor and major axes of the muscular force manipulability ellipse.

#### 3. EXOSKELETON CONTROL

In this section, we describe the proposed exoskeleton control method based on the muscular force manipulability model. The anthropometric data for the arm, muscles, muscle-tendon lengths and moment arms were obtained from Langenderfer et al. (2004); Holzbaur et al. (2005). The muscle force was modeled with Hill-type representation (Hill, 1938; Zajac, 1989) given by:

$$F\_{m,i} = (f\_{0,i} f\_{l,j} f\_{\nu,i} \alpha + F\_{\mathcal{P},i}) \cos(\phi),\tag{9}$$

where i is the i-th muscle, f<sup>l</sup> is the active force-length relationship, α is the activation level, φ is the muscle-tendon pennation angle, f<sup>0</sup> is the optimal muscle force and f<sup>v</sup> is the forcevelocity relationship. We neglected the passive part since its force contribution is low due to the constant muscle activation during the motion (Jo, 2011). The normalized tendon slack lengths are also small, therefore we assume that tendons are stiff and have a negligible effect on the generated force (Zajac, 1989). Furthermore, all human arm muscles have a pennation angle smaller than 20:

$$F\_{m,i} = f\_{0, \dot{\jmath}} f\_{l, \dot{\jmath}} f\_{\nu, \dot{\imath}} \alpha. \tag{10}$$

Here the product of the parts f0,<sup>i</sup> , fl,<sup>i</sup> and fv,<sup>i</sup> is equal to the diagonal matrix of the Hill's muscle force **F**<sup>h</sup> used in (8). The detailed parameters of the optimal muscle length, maximal muscle force, and tendon slack can be found in Ning Lan (2002); Buchanan et al. (2004); Colacino et al. (2012). In our model we have included a total of ten muscles: three shoulder muscle (sternal and clavicular part of Pectoralis major and Deltoid muscle), two bi-articular muscles (Triceps long head and Biceps short head), and five elbow muscles (Triceps lateral and medial head, Biceps long head, Brachialis and Brachioradialis), as shown in **Figure 1**. To compute the muscle Jacobian **J**<sup>m</sup> we used the parameters of muscle origins and insertions from Wood et al. (1989). By inserting the Jacobian **J**<sup>m</sup> in (8), we computed the muscular manipulability matrix **M**m. Furthermore,

FIGURE 4 | Illustration of experimental setup. The image shows the subject coupled with the exoskeleton where the arm is strapped to the subject arm with straps. The subject is holding a weight in the arm and was instructed to do a periodic lifting motion between designated targets. The lifting period was controlled by a metronome.

by using the singular value decomposition we got the minor and major axes of the muscular manipulability. The minor axis represents the direction in which the ability to produce the endpoint force is low, while the major axis represents the direction in which the ability to produce the force is high. Examples of the muscular force manipulability ellipses for the two DoF arm model are shown in **Figure 2**. Note that for computing the muscular force manipulability ellipsoids we do not need to record or capture any EMG data, i.e., it si computed based on a model whose input is the configuration of the arm.

The proposed control method augments the human motion in a way that the force manipulability shape of the human arm endpoint results in a circle, i.e., human force production capacity is equal in all directions throughout the entire workspace. Note that the level of support is not discrete but varies continuously based on the calculated muscular manipulability at any sample time in online manner. To do so, the arm exoskeleton device scales the user endpoint force based on the force manipulability model. The illustration of the scaling is illustrated in **Figure 1**. The supporting force F<sup>e</sup> that the exoskeleton produces is governed by:

$$F\_e = K(\mathbf{M}\_m, F\_u) F\_u,\tag{11}$$

where K(**M**m, **F**u) is a function that is computed based on the muscular manipulability model in the direction of user's force and is defined as:

$$K(\mathbf{M}\_m, F\_u) = \frac{\lambda\_m}{\hat{F}\_u} - 1,\tag{12}$$

FIGURE 5 | Different configurations of arm and their corresponding muscular manipulability ellipses for the two motions. (A) Motion in the high manipulability region (B) Motion in the low manipulability region. Green arrows point into the current direction of motion and their size correspond to the current manipulability.

where λ<sup>m</sup> = max(diag(6)) is the maximal singular value of **M**m. Here **F**ˆ <sup>u</sup> is the force manipulability capacity in the direction of user's force **F**<sup>u</sup> and is defined as:

$$
\hat{F}\_{\mu} = \mathbf{M}\_m \frac{F\_{\mu}}{||F\_{\mu}||}, \ \hat{F}\_{\mu} \in \{0, \ \lambda\_m\}. \tag{13}
$$

As a result, the exoskeleton provides a supportive force **F**e, which is based on the muscular manipulability model in the direction of the user's force. Note that with this approach the exoskeleton provides no supportive force when the direction of the user's force is aligned with the major axis of muscular manipulability ellipse. In this case, the supporting force of the exoskeleton as define in (11) results in **F**<sup>e</sup> = 0. In all other cases, the exoskeleton will provide a supportive force to compensate for the difference between major manipulability and the manipulability in the direction of movement as illustrated in **Figure 1**. The block diagram of the control concept is shown in **Figure 3**.

## 4. EVALUATION

#### 4.1. Subjects

Nine healthy male subjects participated in this study with an average age of 29.4 years (SD = 2.02 years), weight of 70.8 kg (SD = 2.01 kg) and height of 175.9 cm (SD = 1.29 cm). Prior to their participation, the subjects were informed about experimental procedures, potential risks and the aim of the study. The study and the informed consent signed by subjects was approved by Advanced Telecommunication Research Ethics Committee (Nos. 730, 731).

#### 4.2. Experimental Setup

The proposed method was evaluated on a pneumatically actuated arm exoskeleton as illustrated in **Figure 4**. Nevertheless, the proposed method is general and can be used with any exoskeleton that has force/torque sensing capabilities. The exoskeleton was developed at the Department of Brain Robot Interface, ATR, Japan (Noda et al., 2014). For evaluating the manipulabilitybased assistance, the motion was limited to a sagittal plane and we used only shoulder and elbow joints. The human arm was modeled as a planar two-segment serial mechanism. In this model the first joint represent the shoulder and the second joint represent the elbow. Note that we considered the wrist as a part of the forearm. The arm configuration and endpoint force were measured in real-time by encoders in exoskeleton joints and a force sensor, respectively.

#### 4.3. Experimental Protocol

Each subject was wearing the arm exoskeleton and was holding a 4 kg weight in their hands as shown in **Figure 4**. They were asked to move toward two different targets from the same starting position, at which the posture of the subject's arm was aligned with the body. Both targets, the starting position, the initial arm pose and the final arm pose are illustrated in **Figure 5**. To accentuate the differences, the motion paths were selected so that the most part of the motion toward the target A is in the area of high muscular manipulability and the motion path toward the target B is mostly in the area of low muscular manipulability. Note that along the path the muscular manipulability characteristics are not constant and the proposed method adaptively assisted the human motion accordingly. The length of the path for both motions was the same and about 70 cm.

The experiment was divided into four sessions:


FIGURE 6 | Left plot shows trajectory area for all four sessions and all subjects. Results show that was no significant difference (⋄) between sessions with or without exoskeleton support for a low manipulability sessions and for a high manipulability sessions. Right plot shows iEMG for all subjects and sessions. Results show that only low-unsupported motion was significantly different (∗) with others. Note that there was no significant difference (⋄) between low-supported and high-supported motion.

The statistical analysis was performed using Statistics and Machine Learning Toolbox in MATLAB. We calculated average movement times required for the motion, trajectory area, and iEMG during each of the four sessions for each subject. We then used these average values of each subject for statistical analysis. We investigated the effects of the exoskeleton device with the proposed controller on the movement times, movement variations and human effort using two-way repeated-measures ANOVA with independent variables [controller(2) × targets(2)]. The differences between the trajectory areas and the differences between the iEMGs were tested with post-hoc t-tests with Bonferroni correction. The level of statistical significance used was .05 for all statistical tests.

#### 4.5. Results

Analysis of variance showed no significant effects of the exoskeleton device on the movement times between both high and low manipulability motion [F(1, 8) = 2.47, p = 0.15] and supported and unsupported motion [F(1, 8) = 1.36, p < 0.28]. There was no significant interaction [F(1, 8) = 0.06, p = 0.93] between the effects of low and high manipulability motion, and the supported and unsupported motion on the time to reach the target.

Analysis of variance showed significant effects of the exoskeleton device on the both high and low manipulability motion [F(1, 8) = 3.23, p = 0.01] and supported and unsupported motion [F(1, 8) = 108.12, p < 0.01] on the trajectory area. There was no significant interaction [F(1, 8) = 1.26, p = 0.29] between the effects of the exoskeleton device on the low and high manipulability motion and supported and unsupported motion on the trajectory area. Post-hoc t-tests showed that trajectory area of the Low- Unsupported and Supported is statistically different from the trajectory area of the High- Unsupported and Supported [t(9) = 9.33 − 12.58, p < 0.01]. There is no difference between trajectory areas of Low-Unsupported and Low-Supported, and trajectory areas of High-Unsupported and High-Supported. The left diagram in **Figure 6** shows the means and standard errors (SEM) of the trajectory areas for all supported and unsupported motions.

Average motion paths and their standard deviations are shown in **Figure 7**, where we can see a negligible difference between the unsupported motion and the supported motion paths. Right plot on **Figure 7** shows the gain K(**M**m, **F**u) and their standard deviations with respect to the path for High and Low manipulability targets.

Analysis of variance showed significant effects of the exoskeleton device on both high and low manipulability motion [F(1, 8) = 7.22, p = 0.03] and supported and unsupported motion [F(1, 8) = 48.12, p < 0.01] on the iEMG activities. There was no significant interaction [F(1, 8) = 2.09, p = 0.19] between the effects of exoskeleton device on low and high manipulability motion and supported and unsupported motion on the iEMG activities. Posthoc t-tests showed that iEMG activities of the Low-Unsupported motion is statistically different from any of the others [t(9) = 3.52 − 5.64, p < 0.01]. The right diagram on **Figure 6** shows mean values for all supported and unsupported motions.

• Low-Supported:low manipulability motion with exoskeleton support.

Note that in the cases without the exoskeleton support we consider that the exoskeleton was only compensating its own mass and did not provide any additional support to the user, and in the cases with the exoskeleton support we consider that the arm exoskeleton was compensating its own mass and at the same time providing an additional support for the user based on the manipulability controller. Each session lasted for about 60 s, which resulted in 20 cycles of motion. The movement period of motion was maintained by asking the subjects to follow the rhythm of a metronome.

## 4.4. Data Processing

In each session, we collected motion data with the sampling frequency of 100 Hz and EMG data with the sampling frequency of 1kHz. The motion variation was assessed by the deviation of movement with respect to the straight line between the starting position and the target position. The deviation was quantified as the unsigned area between the actual movement and the straight line and is denoted as trajectory area.

The human effort required to perform the motion was assessed by measuring and analyzing the EMG signals of Biceps long head and Pectoralis minor muscles. These two muscles are among the most dominant arm flexors for the arm motion in the sagittal plane. Each EMG signal was rectified and filtered with a second-order low-pass filter with a cut-off frequency of 3 Hz. To obtain the muscular activation, we normalized the processed EMG signal by the EMG measured during the maximum voluntary contraction of the respective muscle. To quantify the human effort, the processed and normalized EMG signal was integrated over time. From now on, the muscular activity will be denoted as 0 ≤ EMG ≤ 1 and its time integral as iEMG.

A trace of the EMG signal for one subject is shown in **Figure 8**, where we can see a significant reduction of human effort for lowsupported motion compared to the low-unsupported motion, but there is no difference between others (high-unsupported, high-supported and low-supported).

#### 5. DISCUSSION

The goal of this study was to introduce a novel exoskeleton control approach that selectively augments the performance of the human user. This study also evaluate the impact of the controller on user motion and effort with multisubject experiments. We hypothesized that the proposed control approach will reduce human effort without diverting from the normal unassisted motion trajectory.

It is evident from the results that by exploiting the anisotropic effect of the controller, the human effort remains similar for the low-supported case compared to the high-supported and high-unsupported case, i.e., there was no significant statistical difference between these three. In effect, the human effort of subjects wearing the exoskeleton device with the proposed method became equal for both motions. On the basis of the results we assume that the approach would generalize for arbitrary motion in the entire workspace.

In addition, it is also evident from **Figures 6**, **8** that the unsupported motion in the low-manipulability area requires considerably more muscular effort than the unsupported motion in the high-manipulability area. The results of the supported motion indicate that the proposed method was able to effectively reduce the human effort for the motion in the low-manipulability area. Results also showed that the level of the human effort in the low-manipulability area with the proposed controller is comparable to the motion in the high-manipulability area.

By augmenting the human end-point force capabilities considering the instantaneous arm configuration and the direction of motion we showed that a spherical end-point force manipulability can be effectively maintained throughout the entire workspace of the human arm. We also found out that the proposed control approach did not alter user motion trajectory (**Figures 6**, **7**), since the difference between the two was statistically insignificant. This suggests that the method can augment the force manipulability without affecting the normal movement characteristics of the exoskeleton users.

In the analysis we were interested in normal human behavior therefore we used healthy subjects. The goal of the paper was to a design controller aimed at power-augmentation scenarios. Rehabilitation scenarios include disabilities and abnormal human behavior, and were therefore not in the scope of this paper. However, the results obtained within the scope of this paper are the basis for our future research, where we will be

#### REFERENCES

Anderson, D. E., Madigan, M. L., and Nussbaum, M. A. (2007). Maximum voluntary joint torque as a function of joint angle and angular velocity: Model development and application to the lower limb. J. Biomechan. 40, 3105–3113. doi: 10.1016/j.jbiomech.2007.03.022

interested to see if this method can be applied in rehabilitation scenarios for subjects with disabilities.

The proposed manipulability-based power-augmentation method fundamentally differs from assist-as-need methods usually employed in rehabilitation scenarios. Assist-as-needed controllers (Wang et al., 2010; Pehlivan et al., 2016; Shahbazi et al., 2016; Li et al., 2017; Luo et al., 2019) basically provide an exoskeleton assistance when the user is not able to follow the therapy-based predefined trajectories, i.e., the level of assistance is based the error between the desired motion and the actual motion. On the other hand, the proposed controller employs no predefined trajectories and the user is free to perform the movements as desired, while the the level of assistance is based on the measured manipulability in a given configuration at any given sample time. The main advantage of the assist-as-needed rehabilitation methods is that the the controller can operate with predefined desired trajectories, which is paramount for various therapy programs. The main advantage of the proposed manipulability-based power-augmentation method is that the controller does not take any predefined reference trajectories and therefore the users can define the motion themselves.

The proposed method was tested on two DoF arm exoskeleton that was available to us at the given time. In future, we will develop more complex exoskeletons and test the proposed method on more DoF.

#### AUTHOR CONTRIBUTIONS

TP, LP, JM, and JB contributed to the design, execution, and drafting of this work, and approved the final manuscript. Experimental data was collected by JB. Data was analyzed by TP.

#### FUNDING

This work was supported by the European Union's Horizon 2020 RIA programme (grant agreement No 687662 - SPEXOR) and by the Strategic Research Program for Brain Sciences from Japan Agency for Medical Research and development, AMED, ImPACT Program of Council for Science, Technology and Innovation (Cabinet Office, Government of Japan), JSPS KAKENHI JP16H06565, a project commissioned by the NICT, NEDO, and MIC-SCOPE.

#### ACKNOWLEDGMENTS

Technical assistance provided by Rok Goljat was greatly appreciated. The authors thank the members of the Department of Brain-Robot Interface at ATR Computational Neuroscience Labs for their assistance with participant recruitment.

Bicchi, A., Prattichizzo, D., and Melchiorri, C. (1997). "Force and dynamic manipulability for cooperating robot systems," Proceedings of the 1997 IEEERSJ International Conference on Intelligent Robot and Systems Innovative Robotics for RealWorld Applications IROS 97 (Grenoble), 1479–1484.

Buchanan, T. S., Lloyd, D. G., Manal, K., and Besier, T. F. (2004). Neuromusculoskeletal modeling: Estimation of muscle forces and joint moments and movements from measurements of neural command. 20, 367– 395. doi: 10.1123/jab.20.4.367


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Petriˇc, Peternel, Morimoto and Babiˇc. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Human-Inspired Online Path Planning and Biped Walking Realization in Unknown Environment

Mirko Rakovic´ 1,2 \*, Srdjan Savic´ 1 , José Santos-Victor <sup>2</sup> , Milutin Nikolic´ <sup>1</sup> and Branislav Borovac<sup>1</sup>

*<sup>1</sup> Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia, <sup>2</sup> Institute for Systems and Robotics, Instituto Superior Tecnico, University of Lisbon, Lisbon, Portugal*

The focus of research in biped locomotion has moved toward real-life scenario applications, like walking on uneven terrain, passing through doors, climbing stairs and ladders. As a result, we are witnessing significant advances in the locomotion of biped robots, enabling them to move in hazardous environments while simultaneously accomplishing complex manipulation tasks. Yet, considering walking in an unknown environment, the efficiency of humanoid robots is still far from being comparable with the human. Currently, bipeds are very sensitive to external changes and they have severe constraints for adaptation of walk to conditions from such a complex environment. Promising approaches for efficient generation and realization of walking in a complex environment are based on biological solutions that have been developed for many years of evolution. This work presents one such human-inspired methodology for path planning and realization of biped walk appropriate for motion in a complex unfamiliar environment. Path planning results in calculating clothoid curves that represent well the human-like walking path. The robot walk is realized by the composition of parametric motion primitives. Such an approach enables on-line modification of planned path and walk parameters at any moment, instantly. To establish the relationship between high-level path planner and the low-level joint motion realization, we had to find a way to extract the parameters of the clothoid paths that can be linked with the parameters of the walk and consequently to motion primitive parameters. This enabled the robot to adopt its walking for avoiding the obstacles and for a smooth transition between different paths. In this paper we provide a complete framework that integrates the following components: (i) bio-inspired online path planning, (ii) path-dependent automatic calculation of high-level gait parameters (step length, walking speed, direction, and the height of the foot sole), and (iii) automatic calculation of low-level joint movements and corresponding control terms (driving motor voltage) through the adaptation of motion primitives which realize walking pattern and preserves the dynamic balance of the robot.

#### Edited by:

*Calogero Maria Oddo, Sant'Anna School of Advanced Studies, Italy*

#### Reviewed by:

*Piotr Skrzypczynski, Poznan University of Technology, ´ Poland Yixiang Liu, Harbin Institute of Technology, China*

> \*Correspondence: *Mirko Rakovic´ rakovicm@uns.ac.rs*

Received: *01 November 2018* Accepted: *17 May 2019* Published: *04 June 2019*

#### Citation:

*Rakovic M, Savi ´ c S, Santos-Victor J, ´ Nikolic M and Borovac B (2019) ´ Human-Inspired Online Path Planning and Biped Walking Realization in Unknown Environment. Front. Neurorobot. 13:36. doi: 10.3389/fnbot.2019.00036*

Keywords: humanoid robot, bipedal locomotion, motion primitives, path planning, clothoid, walk realization

# 1. INTRODUCTION

We are witnessing that jobs which were in the past exclusively handled by humans, spanning from the care of patients in hospitals to playing musical instruments, are now possible to be carried out by robots. One of the reasons for this achievement lies in the fact that biomechanics of human activity (including walk) has been deeply studied and explored (Winter, 2009; Lin and Pandy, 2017; Young et al., 2017). The processing power of the computer has increased to the extent where it is possible to adopt a complex control algorithm for the real-time whole body control of the robot with numerous DOFs (Degrees Of Freedom) (Moro and Sentis, 2019). The focus of biped locomotion research has moved toward real-life scenario applications, like walking on uneven terrain, passing through doors, climbing stairs, and ladders, etc. This is mainly motivated by the Fukushima disaster and the DARPA Robotics Challenge (Krotkov et al., 2017). As a result, numerous robots have been rapidly developed to become machines that can move in unknown environments, simultaneously accomplishing complex tasks (Schwarz et al., 2016; Tsagarakis et al., 2017; Asfour et al., 2019).

Humanoid robot walk is usually evaluated against human walking performance. However, humans develop the walking skills since the very first year of their life, with constant extensive improvements and adaptation to the changes in the kinematics and dynamics of their body. Walking, as time goes by, becomes so perfected and automated that we are not aware of the complexity of the underlying sensory-motor control system. Most humans can seamlessly perform various types of walk, compensate for external disturbances, adapt to different complex ground structures, and simultaneously perform other tasks while communicating with others, manipulating objects etc. In combination with perception (mainly vision), humans can easily cope with an unknown surrounding, and reach their goal while avoiding and/or overcoming various obstacles, stairs, doors, and other objects commonly found in the human environment (Smulders et al., 2012).

Development of humanoids with such skills is obviously a challenging task. Humanoids should be able to plan the path online and adapt the walk realization in order to follow the path close enough. This task is accompanied by the need to constantly prevent the robot from falling, i.e., to preserve its dynamic balance. The well-known concept of Zero Moment Point (ZMP) introduced by Vukobratovic and Borov ´ ac (2004), defines the conditions to preserve a dynamic balance of bipeds (Vukobratovic et al., 2012). One of the most recurring ZMP based approaches for walk synthesis is ZMP preview controller introduced by Kajita et al. (2003). In this method used by Morisawa et al. (2007) and Perrin et al. (2012) among others, the robot needs to predefine the position of footprints, calculate the trajectory of the ZMP, the motion of the center of mass and finally the motion of the rest of the system.

However, humans do not plan and define the footprints on the ground before they start to walk. If it is expected from the robots to operate in the human everyday environment, the walking skills of the robot have to be as close as possible to those of humans. The possibility to adapt to the current state of the environment (to modify the online walking path and adapt gait parameters) is a necessity. Also, due to the always present disturbances, the control system has to be robust, fast, and responsive in order to successfully compensate for them. In addition, path generation (approaching, avoiding, and going around obstacles) should be inspired by the way humans perform this. Thus, the system for motion synthesis has to offer a biologically inspired online modification of planned path and walking parameters and be supported by a robust motion control system that can compensate disturbances while constantly preserving dynamic balance. Path planner introduced by Dornbush et al. (2018) decomposes the task into a sequence of smaller tasks and focuses the planning efforts to reason over much smaller search spaces. We find this is a human-like approach were local obstacles should be treated first, but always working on solving the main task. Kumagai et al. (2018) proposed an efficient footstep planning for the robot to traverse an unknown narrow space in a human-like manner. Although the robot is demonstrating the ability to move in highly cluttered space, the algorithm that does not require the footstep planning should be considered.

Taking into consideration adaptability, responsiveness, and robustness without the need for precise trajectory execution, a methodology that enables synthesis of online modifiable walk based on reconfigurable adaptive motion primitives (RAMPs) is introduced in Rakovic et al. (2014) ´ . The RAMP is a parametric movement with the relationship established between its parameters and the walking characteristics. We find that this approach offers an intuitive interface for integration with higher level path planning algorithms. Thus, any requirement for modification of characteristics of walking (for example, speed, stride, direction etc.) will instantly influence the change in RAMPs and as a result, a modified gait will emerge. RAMPs are inspired by neurological studies (Giszter et al., 1993) that showed the synergistic motion of leg joints caused by electrical microstimulation. The experiment showed that the same stimulation drives the leg toward the equilibrium point irrespective of the initial position. The motion of the foot corresponds to a vector field that was convergent toward an equilibrium point characterized by the location and intensity of stimulation (Mussa-Ivaldi et al., 1994). The same approach is embedded in the properties of RAMPs.

The use of primitive movements as building blocks for generating more complex movements is not a unique approach. In the last decade, several motion primitive methodologies have been defined. Hauser et al. (2008) introduced a library of steps, where each different step is one primitive. Zhang et al. (2008) used recordings of the human to segment the movement of the leg into motion primitives. Moro et al. (2014) introduced kinematic movement primitives derived with principal component analysis from recorded human movements that can be combined to define a complex motion applicable for the realization of the robot walking. Ijspeert et al. (2002) and Schaal (2006) defined most known motion primitive methodology called Dynamic Movement Primitives. It models the robotic joint or end-effector trajectories as a combination of nonlinear dynamical systems. Statistical learning techniques are used to learn how to combine these primitive movements to code basic rhythmic or discrete behavioral patterns. In Gams et al. (2014), extended Coupling Movement Primitives are presented to model the interaction of robot with objects and humans.

In this paper, the RAMPs methodology for walk realization is augmented with the biologically inspired on-line path planning algorithm. The path planning algorithm is developed, that is inspired by the behavior of humans walking in an unknown environment. It enables the robot to avoid obstacles in a humanlike manned in a cluttered environment. Simulation results show that the biped can adapt its walking path and gait parameters online. The path can be re-planned when needed, depending on the localization of the objects, i.e., obstacles on the robot's way toward the goal position. The proposed approach searches for the transitory goal position in close robot's surrounding and generates a path that is far enough from the obstacles. The parameters of the path are linked with the parameters of the walk (speed, direction, step length) that are thus driving the robot to adapt to a new path geometry.

The contribution of this work is the framework for online human-like path planning and walking that integrates (i) the human-inspired path planning algorithm, (ii) the link between the parameters of the walk and the RAMP parameters for defining joints motion, and (iii) the dynamic balance controller that compensates for disturbances and ensures the dynamic balance of the robot. The developed planning and control ensures that the robot follows the online generated walking path and preserves its balance. The automatic calculation of the walking parameters with respect to the geometry and current position of the robot is proposed. In section 2.1 the motivation and a brief overview of the methodology for generating an online modifiable walk based on primitives are presented. Afterward, in section 2.2, the algorithm for the path generation and calculation of walk parameters are introduced, followed by the description of the algorithm for path planning, given in section 2.3. To show the abilities of the framework, section 3 presents the simulation results of the biped walking in the environment with different obstacles. The simulation involves a complete dynamic robot model with many degrees of freedom. The results show that the robot can realize dynamically balanced walk while online changing the path and modifying the walking parameters. Examples show hot the robot avoids static and moving obstacles in order to reach the goal position while following the human-like shape of the walking path. The paper concludes and proposes the direction for further research in section 4.

#### 2. MOTIVATION, APPROACH, AND METHODS

#### 2.1. Motivation for Composing the Walk From Motion Primitives

Let us suppose that the robot has the task to go to the other part of the unknown room to pick an object. The task a robot needs to solve is similar to what the human is solving. The human will scan for the obstacles and make a plan for avoiding one obstacle at a time (Erni and Dietz, 2001). The common approach for solving this task in humanoid robotics is based on the use of well-known planners to generate the safe path toward the goal (Kuindersma et al., 2016), usually followed by the foot placement algorithm and precise realization of generated trajectories. Other approaches, such as Zaytsev et al. (2015), propose that the planner should focus on local tasks, and plan only a few steps ahead. This approach is more biologically inspired since it does not require precise execution of the preplanned foot placements and precise realization of calculated joint trajectories.

Reconfigurable Adaptive Motion Primitives is the methodology that enables the realization of complex motion such as walking that does not require the programming of joint trajectories for a full step in advance. The realization of motion is defined with the set of parameters that are defining the goal position of the end of the kinematic chain with respect to the base coordinate frame of the robot. The result is the simultaneous motion of joints that are driving the end-effector (foot, head, or hand) toward parametric equilibrium point. These parameters can be changed at any time instant. In Borovac and Rakovic´ (2011), different methods based on model learning and inverse kinematics are presented.

The shape of the motion of the kinematic chain is smooth, including migration between the execution of the consequent primitives. The smoothness is achieved by a gradual change in the velocities of end-effectors toward the new equilibrium. An example of a primitive that is a part of a walking cycle is leg stretching (**Figure 1**) that incorporates the simultaneous motion of leg joints. It is followed by the completion of leg bending, with the smooth transition of the foot velocity toward the new end goal position.

The velocity of foot **s**˙<sup>A</sup> is determined from the following equation:

$$\dot{\mathbf{s}}\_{A}(t\_{i}) = \left(1 - b\left(t\_{i}\right)\right) \cdot \dot{\mathbf{s}}\_{A}^{0} + b\left(t\_{i}\right) \cdot \begin{bmatrix} \boldsymbol{\nu}\_{int} \cdot \mathbf{p}\_{e}^{ort} \\ \boldsymbol{\alpha}\_{int} \cdot \mathbf{o}\_{e}^{ort} \end{bmatrix} \tag{1}$$

where **p** ort e and **o** ort e are the unit vectors of the error of position and orientation of the foot with respect the end-goal position. The coefficient b ensures the smooth transition in intensity and direction of the heel velocity.

moment when stretching starts.

The basic walk (i.e., walking on a flat surface) can be decomposed into simple movements, similar to the previously described leg stretching (**Figure 2**).

Altogether five different primitives have been identified for leg movement, as well as one for the trunk and one for the arms. The legs can perform: (i) leg bending, (ii) leg stretching, (iii) inclining the robot forward during single support phase, (iv) making the foot surface contact after the heel strike, and (v) transferring the body weight onto the subsequent supporting leg during double support phase. Two additional primitives are the (vi) primitive for maintaining the trunk in the upright posture and the (vii) arm swing primitive, that represents natural motion opposing the leg motion. Arm swinging reduces the angular momentum of the body, compensates the rotational motion caused by the legs and contributes to the preservation of dynamic balance.

A critical segment in walk realization is the preservation of dynamic balance. Each motion primitive ensures an appropriate shape of the end-effector trajectories and smooth transition from the previous primitive, but it does not ensure the preservation of dynamic balance. To fulfill these requirements, from path planning to execution of motion on a joint level, we adopted a cascade controller (**Figure 3**) that consists of four blocks: (i) block for path planning, (ii) block for tying motion primitives, (iii) dynamic balance controller, and (iv) joint motion controller.

The input to the control system is a goal position that the robot should reach. The first block calculates the path toward the goal or transitory position in case of obstacle avoidance. The second block determines the desired joint angular velocities based on a set of tied-up primitives. Since motion primitives do not consider the dynamic balance of the system, the third block is introducing the corrections of desired joint velocities taking into consideration current position of ZMP and projection of the center of mass to ensure the preservation of dynamic balance. Finally, the fourth block is calculating the joint control values. Since the humanoid robot is highly non-linear and highly coupled system, the fourth block is composed of feedback linearization, sliding mode control and disturbance estimator.

Walk synthesized in this way can be changed online, meaning that overall motion parameters are introduced: walk speed WSpeed, the height of the leg during the swing phase WHeight, step length WLength and walking direction WDir. The relationship between these parameters and parameters of the primitives is established which causes an immediate and automatic change of the parameters of the primitives with the change in the parameters of the walk. An example of the simulated biped walk where the robot is changing the parameters of the walk online, in order to pass between the tables and step over the bar on the ground, is shown in **Figure 4**.

#### 2.2. Path Generation With Clothoid Curves

The automatic calculation of four walking parameters (WSpeed, WDir, WHeight, and WLength) can be determined from the geometry of the path. Thus, an approach for generating the path, and for the calculation of the walking parameters based on its geometry is introduced. Several problems were addressed in our analysis. An important question is whether the four walking parameters are independent or is there some relationship established among them. Another important question concerned the shape of the human walking path. Some insights from physiology and biomechanics were used in our research to tackle these problems and to achieve biologically inspired, anthropomorphic robot walk. In Arechavaleta et al. (2008), possible strategies were investigated, focusing on the formation of a walking path with given goal position. The assumption was that the path is chosen according to some optimization principle. The results in Arechavaleta et al. (2008), have shown that the cost function to be minimized is a variation (time derivative) of the path curvature which implies that clothoid arcs are a good approximation of a walking path. This conclusion was adopted and clothoid arcs have been chosen for path shape in our path planning algorithm.

Having defined the coordinates of initial (i.e., current) position of the robot and the desired goal position<sup>1</sup> of the robot, a G1 fitting (Bertolazzi and Frego, 2015) with clothoid curve

<sup>1</sup> Initial and goal position refer to both position (i.e., x and y coordinates) and orientation (i.e., rotation about z-axis denoted by ϕ) of the robot's base coordinate frame in the pelvis.

can be used to fit between these two coordinates. This way, it is possible to set a new goal point at any time instant and to fit a new clothoid arc between the current position and the new goal position. This enables online modification of a walking path in case of unpredicted or dynamic obstacles while ensuring the smooth path transition.

In this paper, forward goal-oriented locomotion without backward and side steps has been considered. In this case, robot locomotion can be simply approximated with a nonholonomic unicycle model which implies the dependency between the walking direction and the tangent to the walking trajectory. This dependency is used to calculate one of the walk parameters, i.e., walking direction. In each time instance, a minimal distance is calculated from the clothoid curve to the coordinate system attached to the robot pelvis. When the point, with the minimal distance from the robot base, is determined, the tangent on the curve in that point is calculated. The idea is to maintain the current orientation of the x-axis of the robot base frame (xpelvis) and the x-axis of the frame in the nearest point (xdes), determined by the path tangent in that point (**Figure 5**).

The angle between the curve tangent (xdes) and x<sup>0</sup> axis of the global coordinate frame, is denoted as ϕtangent. If we imagine that the robot started with some initial offset from the desired path, the robot would walk parallel to the path, keeping the initial offset. Therefore, it is necessary to include some corrective, feedback term, which would return the robot on the desired path if it gets off the track. This feedback term is given with the second term in the equation:

$$W\_{Dir} = \varphi\_{tangent} + K\_P \cdot \operatorname{arctg}\left(\frac{d}{l}\right) \tag{2}$$

where ϕtangent is the angle of the path tangent, K<sup>P</sup> is proportional gain, d is the distance from the path and l is the step length.

When the robot base frame crosses the path curve, the sign of the proportional gain K<sup>P</sup> changes. This sign is determined on the basis of the sign of the cross product of the path tangent vector, in the point with minimal distance from the robot, and the vector pointing from the robot base frame toward the nearest point on the path. This feedback term tends to minimize the distance between the path and the robot base frame and it has only a proportional gain K<sup>P</sup> which depends on the distance from the current to the goal position. If the robot is far from the goal this gain is smaller, and the robot gradually compensates the offset from the desired path. If the robot is close to the goal this gain is higher and the robot acts more rapidly in order to get on the desired course on time. This simple controller actually behaves like a bang-bang controller and causes the robot to walk zig-zag around the desired path. Of course, this zig-zag motion is very small, cannot be noticed and does not disturb the dynamic balance. To the human eye, the robot appears to follow the desired path accurately.

Recent studies in the field of gait physiology and biomechanics (Egerton et al., 2011) confirms that there exists a relationship between stride length and cadence which contributes to the automatic gait control mechanism. Stride length is defined as the distance between one heel strike to the next of the same foot in the walking plane. It has been proven that this relationship is linear when the subjects are walking at the self-selected speed. So the faster walking implies longer steps to a certain boundary, which is called the breakpoint. Breakpoint happens at cadences greater than 150 steps/min. At this point, stride length starts to decrease with a quadratic relationship and a further increase in cadence. This is an extreme case which is not considered in this paper. Thus, the linear relationship (y = b<sup>1</sup> · x + b0) between stride length and cadence, with the slope b<sup>1</sup> = 0.01 and the intercept b<sup>0</sup> = 0.54 is adopted. Since walking speed is the product of cadence and stride length, the relationship between two walking parameters, Wspeed and WLength, is provided. It is sufficient therefore to define just one of them while the other can be calculated.

Another result provides the relationship between the path curvature and walking speed. This relation, known as the 1/3 power law, is given with the following equation:

$$\mathcal{W}\_{\text{Speed}} = \boldsymbol{K} \cdot \boldsymbol{R}^{\beta} \tag{3}$$

where K is the velocity gain factor, R is the radius of the curve and β is the power (having the value of 1/3). Velocity gain factor K is calculated as WDes Speed · (1/2) β , where WDes Speed is speed of straight walking. In this case, if the radius of the curve is 2 m, the walking speed is equal to WDes Speed, if the radius is >2 m, the speed will increase, and if it is <2 m the speed will decrease. Additionally, the minimum and maximum speeds are introduced as 0.5·WDes Speed and 2 · <sup>W</sup>Des Speed respectively.

Although this relationship was considered to be valid for a long period of time, recent studies (Olivier and Cretual, 2007) have shown that this coupling between velocity and curvature is not general. In Olivier and Cretual (2007), authors studied the velocity/curvature relationship during a single turning task. Their results have shown that power law does not apply to this situation where subjects are free to choose a walking trajectory. However, power law has been proven to be valid for predefined paths and long-term control of the turning task. Also, it was shown that the value of the power is not always 1/3 and that it depends on the geometry of the path. Since the path in our algorithm is predefined, in the form of the clothoid, and there is no final, unambiguous conclusion on this issue—the 1/3 power law has been adopted as the relationship between path curvature and walking speed. The fourth walking parameter WHeight is independent of other parameters and depends only on

the estimated value of the obstacle's height if there is an obstacle to be stepped over.

In **Figure 6**, the robot is walking on a path that is generated according to a previously described procedure. The path is generated first to reach the goal position with coordinates x = 1.5 m, y = 2 m, and ϕ = π/2 (orange line). In randomly selected time instants the path is re-planned with the goal position x = 3.5 m, y = 1.5 m, and ϕ = 0 (purple line). Again, nearly before reaching the second goal position, the third goal position x = 6.5 m, y = 0 m, and ϕ = −π/2 is selected (green line) and a path is generated for the third time. For all three paths, as a starting point either the nearest point on the existing path is selected, or the current position of the pelvis in the case when the path is not generated. **Figure 6** shows the robot's online change in walking direction, as well as in walking speed and step length to comply with the generated paths.

## 2.3. Online Path Planning Algorithm

The inputs for generating the clothoid curves are the current position of the robot, given with coordinates x, y, and ϕ, and the desired goal position given with coordinates xdes, ydes, and ϕdes. If there is an obstacle between the current and the goal position [see **Figure 7** (left)], an algorithm for finding via points for avoiding obstacles is needed. In this section, an approach is described, for determining the coordinates of the intermediate desired goal positions for generating the clothoid curves, in order to avoid the obstacles and to reach the final goal position.

The algorithm first tries to generate the shortest clothoid between current and goal position. When the clothoid is generated it is checked whether the path will ensure minimum clearance between the robot and the obstacles. The clearance value takes into account the minimum desired margin between the robot and the obstacle<sup>2</sup> . If the clearance is satisfied, the

clothoid is generated, and the robot starts to follow the path. If not, the procedure for finding two intermediate targets is executed. These intermediate targets provide two possible paths for bypassing the obstacle. This procedure takes into account, the current position of the robot and visible part of the obstacles [see **Figure 7** (right)]. These positions are used to calculate the convex surface. As a result, three points (current robot position and two points on the obstacle) constitute the corners of the convex surface.

Current position of the robot and two points are used to calculate vectors Ed<sup>1</sup> and Ed2. The vectors Ed<sup>1</sup> and Ed<sup>2</sup> are the unit vectors perpendicular to the line that connects the current robot's position and two possible target positions [Target 1 and Target 2 in **Figure 7** (right)]. Initial values of two target positions are actually two points on the obstacle. The target positions are

<sup>2</sup>The robot is modeled to have the distance between the left and right shoulder joint 36 cm. The clearance in the simulation examples in this papers is set to 30 cm and thus the margin between the robot and the obstacle is 12 cm, which is considered to be enough to avoid the collision with the obstacle.

iteratively modified by shifting them in the direction determined by the corresponding unit vectors Ed<sup>1</sup> and Ed2. The modified target positions are used to generate two clothoid curves for which it is checked if they are satisfying the clearance from the obstacles. If there is sufficient clearance the clothoid is generated and the procedure is stopped. The second condition for stopping the procedure is that the orientation of the target positions is more than 180<sup>o</sup> from the current orientation, which means the target points are on the back side of the robot. If this situation for both points occurs, the robot is surrounded by obstacles and it is not possible to find intermediate points to calculate the clothoid path.

Out of the first clothoid curve that has a clearance of two possible intermediate goal positions, the one that is closer to the robot's final position is selected. This algorithm is executed for every obstacle in the scene that is in the visible range<sup>3</sup> . Generated clothoid path to the goal target has to fulfill the minimum clearance constraint for each obstacle. In the case of multiple possible targets, the one with a minimal distance from the robot's current position is chosen.

Our path planning algorithm, based on clothoids, has been compared with another path planning algorithm, based on particle swarm optimization (Poli et al., 2007; YarpizPSO, 2019). Particle Swarm Optimization (PSO) has been reported in literature many times for robot path planning in dynamic environment (Raja and Pugazhenthi, 2012). However, our focus is on the comparison of geometrical properties of the paths, generated by the two aforementioned approaches. Therefore, we considered that, without loss of generality, static environment with static obstacles is sufficient for such comparison. **Figure 8** shows the comparison of the two paths, generated between the starting position (yellow square) and the goal position (green star), avoiding a set of static obstacles (blue circles).

To geometrically compare the two paths, a curvature of each path has been calculated as the inverse of the path radius at each point. The comparison of the two curvatures is shown in **Figure 9**. It can be seen that the clothoid path, generated with our approach, has smaller curvature values, compared to the path generated by the PSO-based approach. According to the

Equation (3), this implies that our approach, due to a smaller curvature, may provide higher walking speeds, then the approach based on PSO. However, it should be noticed that the clothoid path has the sharper changes of the curvature, meaning that its derivative would have higher peaks, compared to the PSObased path. It means that PSO algorithm provides smoother changes in the curvature, but at the cost of higher values of this curvature, leading to the less intensive change in the direction of the walk, but limiting the walking speed, while our approach does the opposite.

The time to generate path using PSO took 13.3349 s, whereas the time to generate the path using clothoids took 0.0233 s [on Intel(R) Quad Core(TM) i5-4590 CPU at 3.30 GHz]. This significant difference emphasizes the main contribution of our approach and qualifies clothoid path planning algorithm as suitable for online path planning. We have shown that our path planning approach is computationally less expensive then the approach based on particle swarm optimization, while providing the minimum curvature of the generated path.

<sup>3</sup>This range is set to 5 m in this simulation experiment and can be changed to meet different requirements.

of ZMP (red cross) and PCM (blue circle).

# 3. SIMULATION MODEL AND RESULTS

The concept of a free-flying mechanism has been used for robot modeling. The model can have multiple kinematic chains, composed of links with one rotational degree of freedom. Multi-DOFs, like hip or shoulder, are modeled with a series of one-DOF joints. More details about the used simulation software can be found in Vukobratovic et al. ´ (2007). The whole robot has 46 links and 52 DOFs since there are six additional DOFs for the free-flying base link. Kinematic structure of the robot is given in **Figure 10**.

The foot is not a single rigid object, but a two segment foot with a sole link and a fingers link. Both links were chosen in the form of trapezes and the contact between the foot and the ground is described by six points (four on the sole and two on the fingers). Each foot link was modeled as a rigid, nondeformable body with a thin, deformable visco-elastic layer of negligible mass. The visco-elastic layer was modeled as an isotropic Kelvin-Voigt material (Nikolic et al., 2014, 2018 ´ ). Without loss of generality, all actuators were assumed to be the same, and modeled as DC motors with permanent magnets.

In order to verify the proposed bioinspired motion planning algorithm, simulation case studies have been conducted, demonstrating the robot's ability for online path planning and walk realization in an unstructured environment. In all simulations, the above-described model of humanoid robot was used.

#### 3.1. Walking Around Walls

Simulation scenarios cover three different obstacle configurations, chosen to test the performance of the path planning algorithm. In simulation results, obstacles configuration and stick diagram of the robot motion are given. Additionally, footprints of the robot and projections of ZMP and CoM for the full path are plotted, which shows that dynamically balanced walk has been achieved, with successful obstacle avoidance. In each simulation, the goal of the robot is to reach a goal position given the coordinates x = 8 m and y = 0 m and the orientation φ = 0 o .

In the first simulation experiment, the robot was supposed to walk from the starting point to the goal point in the environment which includes two parallel, nonplanar, overlapping walls. The results of the first simulation are shown in **Figure 11** (left). The robot first tries to calculate the clothoid path directly from the starting point to the goal point and checks for the obstacles in the visible range. Since there is an obstacle in front of the robot (blue wall) robot finds an intermediate target point, in order to avoid the obstacle. A convex surface is calculated using three points, the robot's current position and two corners of the wall. Following the path planning algorithm, described in the previous chapter, two intermediate targets are calculated, since the obstacle may be circumvented from the left or from the right. The right target, which is closer to the shortest path without the obstacles, is chosen and the clothoid path is generated.

Each time when the robot starts new phase (see **Figure 2**), it recalculates a clothoid path, following the algorithm described in section 2.3. At this point a new obstacle (green wall) in front of the robot becomes visible. Since it is not possible to calculate a clothoid path directly to the goal point, because of the obstacle, the path planner calculates another path for obstacle avoidance. The path is calculated, which ensures the robot to bypass the green wall. When the robot avoids the second wall, the goal position becomes visible and the clothoid path toward the final goal position is generated.

The second simulation has a setup similar to the first one. There are also two parallel, nonplanar walls in the scene and the robot has to walk between the walls to reach the goal point. The

FIGURE 12 | Robot is walking toward the goal position and turns left to avoid the obstacle: Stick diagram of robot motion (top); footprints and projections of ZMP (red crosses) and PCM (blue circles) (down).

with the shorter clothoid path around the obstacle toward the goal: Stick diagram of robot motion (top); footprints and projections of ZMP (red crosses) and PCM (blue circles) (down).

major difference, compared to the first simulation scenario is that the robot, after circumventing the first wall has a clear sight to the goal position. Only one obstacle had to be avoided. The results of the second simulation are shown in **Figure 11** (middle).

In the third simulation scenario, the robot is searching for the path in the scene with two parallel, non-planar walls. At the beginning of the simulation, the second wall (green one) is completely hidden behind the blue wall. Since there is an obstacle between the robot and the goal position, an intermediate target is calculated, to bypass the obstacle. Robot chooses the intermediate target on the right corner of the convex surface since it is closer to the robot's current position. After avoiding the first wall, another obstacle (green wall) is on the robot's path toward the goal position. Thus, a new convex surface is calculated and two intermediate targets, which fulfill the clearance condition are calculated. The closer target is chosen as an intermediate goal for the clothoid generation and the robot circumvents the second wall. When the robot avoids the second obstacle, the clothoid path which leads the robot to the goal point is found. The results of the third simulation are shown in **Figure 11** (right).

#### 3.2. Avoiding Moving Obstacle

For illustrating the possibility of the robot to plan the path online and adapt its walking parameters to follow the path, the following simulation scenarios are prepared. The robot's goal position is the same as in previous examples, i.e., x = 8 m, y = 0 m, φ = 0 o . The obstacle, represented as a blue box that is moving from the right side of the robot, crosses the robot's path. The transparent blue trace represents the path of the obstacle. Same as is the section 3.1, the figure shows the stick diagram of the robot, the footprints and the projections of the ZMP and CoM of the robot.

The first example (**Figure 12**) shows the simulation of the robot walking toward the goal position, and the obstacle is crossing its path. The obstacle stops in front of the robot. The robot's path planning algorithm finds the point to generate the clothoid and turns left to avoid the obstacle. The obstacles stopped at the position in which the closest path to avoid the obstacle is always on the same side of the obstacle.

The second example (**Figure 13**) shows the simulation of the robot walking toward the same goal position. The obstacle stops in front of the robot but in this case, the end position of the obstacle is such that the robot first finds the path to turn left, and then switches to turn right. The reason for this switch is that at one moment, the path to avoid the obstacle is shorter if the robot turns right. The sequence of the figure shown in **Figure 14** illustrates the significant output of the path planning algorithm.

The last example (**Figure 15**) shows the simulation of the robot when the moving obstacle crosses its path, but instead of stopping in front of the robot, it continues to move and eventually clears the way for the robot to pass. In this example, we introduced the rule to slow down the walking speed when the obstacle is crossing the robot's shortest path toward the goal. When the obstacle crosses the robot's path, the robot will slow down and plan to turn left in order to avoid the obstacle. When the obstacle is out of the robot's path, the robot calculates the new

FIGURE 15 | Robot slows down the walking speed, turns left and then turns right when the obstacle crosses the robot's path. After the obstacle is not crossing the path, the robot continues with normal speed toward the goal position following the shortest clothoid path: Stick diagram of robot motion (top); footprints and projections of ZMP (red crosses) and PCM (blue circles) (down).

path toward the goal and speeds up to normal walking speed<sup>4</sup> . The link to the simulation experiment video of the robot avoiding the obstacles can be found in **Supplementary Material**.

# 4. CONCLUSION

This paper presented a novel methodology for biped walk path planning and walk synthesis based on motion primitives and neurological observations of human behavior. The presented approach showed the capacity for online modification of walk in an unknown environment. The first simulation showed how the robot can modify its walking parameters online, in order to avoid obstacles and walk between them. Results from **Figure 11** highlighted the path planning algorithm that finds a transitory position, and illustrated how a walk is modified without upfront calculation of reference foot placements, joint trajectories, and its execution. Simulation examples given in **Figures 12**–**15** showed the ability of a path planning algorithm to change the walking path online and adapt the high-level walking parameters. The framework for online path planning and walk realization showed the potential when the surrounding environment is unknown and dynamically changing.

The algorithm for walk synthesis was interfaced to the algorithm for biologically inspired, human-like path planning based on clothoid arcs. The algorithm for path planning prescribes an automatic change of parameters of walk based on the shape of a generated path. The outcome parameters of walk change over time and the realized walk changes simultaneously to comply with the laws derived from physiological studies and clinical research.

The proposed path planning approach, based on clothoid arcs smooth concatenation, was compared with another approach based on particle swarm optimization. It was shown that our approach provides a path with a smaller curvature, but a higher curvature derivative, i.e., sharper change of curvature, thus

# REFERENCES


leading to the sharper change in walking direction, but higher possible walking speeds.

In future work, we plan to take into account the derivative of the path curvature, beside the curvature itself, as another cost function for optimization, in order to obtain smoother changes of the walking direction. Also, we plan to analyze how ascending and descending stairs can be integrated into our path planning algorithm in order to avoid obstacles on the stairs. The focus will be on the analysis of a human approaching the stairs with static and dynamic obstacles and transition of the human-like behavior onto the biped humanoid robot.

# AUTHOR CONTRIBUTIONS

MR developed the idea and defined the relationship between the overall walking parameters and RAMP parameters. SS contributed to the implementation and adaptation of nonlinear sliding mode control and disturbance estimator and the extraction of walking parameters from planned clothoid curves. JS-V contributed to the development of the biologically inspired path planning algorithm and in structuring the paper. MN improved the dynamic model to better simulate the contact between the foot and the ground and upgraded the existing software for multibody dynamic simulation. His contribution was also in preparing and conducting the simulations. BB contributed to the development of the idea of biped walking using RAMPs and in defining simulation scenarios. All the authors participated in writing and proofreading.

# FUNDING

This work has been funded partially by the MPNTR of Republic of Serbia (III44008) and partially by the FCT of Republic of Portugal (UID/EEA/50009/2013) and RBCog-Lab research infrastructure.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnbot. 2019.00036/full#supplementary-material


<sup>4</sup>The video simulations of the robot avoiding the obstacle can be found on the following link: https://youtu.be/WBCesxnep0s, and the source code with the software for general modeling tool and the simulation examples from this paper can be found at https://github.com/rakovicm/ Online\_Planning\_Walking\_Unknown\_Environment


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Rakovi´c, Savi´c, Santos-Victor, Nikoli´c and Borovac. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Simultaneous sEMG Classification of Hand/Wrist Gestures and Forces

Francesca Leone<sup>1</sup> \*, Cosimo Gentile<sup>1</sup> , Anna Lisa Ciancio<sup>1</sup> , Emanuele Gruppioni <sup>2</sup> , Angelo Davalli <sup>2</sup> , Rinaldo Sacchetti <sup>2</sup> , Eugenio Guglielmelli <sup>2</sup> and Loredana Zollo<sup>1</sup>

<sup>1</sup> Unit of Biomedical Robotics and Biomicrosystems, Universiã Bio-Medico di Roma, Rome, Italy, <sup>2</sup> Italian Workers' Compensation Authority (INAIL), Vigorso di Budrio, Bologna, Italy

Surface electromyography (sEMG) signals represent a promising approach for decoding the motor intention of amputees to control a multifunctional prosthetic hand in a non-invasive way. Several approaches based on proportional amplitude methods or simple thresholds on sEMG signals have been proposed to control a single degree of freedom at time, without the possibility of increasing the number of controllable multiple DoFs in a natural manner. Myoelectric control based on PR techniques have been introduced to add multiple DoFs by keeping low the number of electrodes and allowing the discrimination of different muscular patterns for each class of motion. However, the use of PR algorithms to simultaneously decode both gestures and forces has never been studied deeply. This paper introduces a hierarchical classification approach with the aim to assess the desired hand/wrist gestures, as well as the desired force levels to exert during grasping tasks. A Finite State Machine was introduced to manage and coordinate three classifiers based on the Non-Linear Logistic Regression algorithm. The classification architecture was evaluated across 31 healthy subjects. The "hand/wrist gestures classifier," introduced for the discrimination of seven hand/wrist gestures, presented a mean classification accuracy of 98.78%, while the "Spherical and Tip force classifier," created for the identification of three force levels, reached an average accuracy of 98.80 and 96.09%, respectively. These results were confirmed by Linear Discriminant Analysis (LDA) with time domain features extraction, considered as ground truth for the final validation of the performed analysis. A Wilcoxon Signed-Rank test was carried out for the statistical analysis of comparison between NLR and LDA and statistical significance was considered at p < 0.05. The comparative analysis reports not statistically significant differences in terms of F1Score performance between NLR and LDA. Thus, this study reveals that the use of non-linear classification algorithm, as NLR, is as much suitable as the benchmark LDA classifier for implementing an EMG pattern recognition system, able both to decode hand/wrist gestures and to associate different performed force levels to grasping actions.

Keywords: pattern recognition, surface electromyography, hand gestures recognition, prostheses, gestures classifier, force classifiers, non-linear logistic regression, linear discriminant analysis

#### Edited by:

Toshiaki Tsuji, Saitama University, Japan

#### Reviewed by:

Levi Hargrove, Rehabilitation Institute of Chicago, United States Farong Gao, Hangzhou Dianzi University, China

#### \*Correspondence:

Francesca Leone f.leone@unicampus.it

Received: 02 February 2019 Accepted: 31 May 2019 Published: 19 June 2019

#### Citation:

Leone F, Gentile C, Ciancio AL, Gruppioni E, Davalli A, Sacchetti R, Guglielmelli E and Zollo L (2019) Simultaneous sEMG Classification of Hand/Wrist Gestures and Forces. Front. Neurorobot. 13:42. doi: 10.3389/fnbot.2019.00042

# 1. INTRODUCTION

The use of surface electromyography (sEMG) allows the noninvasive extraction of pattern information useful to control active prosthetic hands. In the last 70 years, several solutions have been proposed to extract gestures information from sEMG (Ciancio et al., 2016, 2017); the most simple were based on on-off (Scott and Parker, 1988), on Agonist/Antagonist (Popov, 1965) and Proportional Control (Fougner et al., 2012). Targeted Muscle Reinnervation (TMR) enabled amputees with shoulder disarticulation or transhumeral amputation to control motorized prosthetic devices with multi-DoFs (Aszmann et al., 2015) in a natural way. Pattern recognition methods enabled performance improvements to reach an intuitive and coordinated control (Li et al., 2018). Moreover, these techniques allowed the increasing of the number of controllable Degree of Freedoms (DoFs) (Ciancio et al., 2016). Different classification algorithms have been proposed in literature, including Euclidean Distance, Non-Linear Logistic Regression, k-Nearest Neighbors (kNN), Hidden Markov Model (HMM), Artificial Neural Network (ANN), Support Vector Machine (SVM), Linear Discriminant Analysis (LDA) (Chowdhury et al., 2013). However, different arm positions (Geng et al., 2012), electrode shift (Young et al., 2012a), signal non-stationarity (Lorrain et al., 2011) and force variation (Scheme and Englehart, 2011) can affect the pattern-recognition accuracy and robustness. In addition, physiological factors as motor unit (MU) recruitment, MU firing rate and contraction type (e.g., isometric, isotonic, concentric, or eccentric) make difficult the extraction of sEMG-force relationship due to nonlinear factors (Farina et al., 2007; Disselhorst-Klug et al., 2009; Staudenmann et al., 2009).

In literature two main approaches have been proposed to find a relationship between muscular activation and force: mathematical models and machine learning techniques. Force estimation, based on surface electromyographic measurements, was determined through a sEMG-force mathematical relationship, by applying Non-linear Wiener Hammerstein (NLWH) and Spectral Analysis Frequency Dependent Resolution (SPAFDR) models (Potluri et al., 2013). Buchanan et al. (2004) presented a computational neuro musculoskeletal model of the human arm with the aim to estimate muscle forces, joint moments and joint kinematics from neural signals. Moreover, "crosstalk risk factors" (CRF), as the dependency of the relationship between the sEMG signals, muscle length and isometric contraction force, had to be quantified to understand the effectiveness of the muscular co-ordination in generating force (Disselhorst-Klug et al., 2009).

Related to machine learning techniques, Srinivasan et al. (2012) proposed a method for estimating forces from surface electromyography (sEMG) signals with Artificial Neural Network (ANN). Wu et al. (2017) proposed a force estimation method employing a Regression Neural Network (GRNN) trained with sEMG and force signals. In the most recent study, Ren et al. (2017) divided force signals in different grades from 0 N to 16 N, expressed as percentage of the Maximum Voluntary Contraction (MVC). They used SVM to establish non-linear regression relationship between sEMG and force. Lv et al. (2017) used Linear Discriminant Analysis (LDA) to classify five finger gestures at two different levels of force (i.e., 10% MVC and 50% MVC), by using EMG and accelerometer signals. Li et al. (2018) proposed a method based on deep neural network to derive sEMG-force regression model for force prediction at eight different force levels. Al-Timemy et al. (2016) reported that force level variations negatively affected the performance of PR system and caused the increase of the classification error rates. However, an increasing of 6 − 8% in the classification performance can be reached by applying Time-Dependent Power Spectrum Descriptors (TD-PSD) features extraction to four classifiers [ i.e., LDA, Random Forest (RF), Naive Bayes (NB), k-Nearest Neighbor (kNN)] and training with all forces across nine transradial amputees. In order to investigate the performance of PR system in presence of variations in force, Scheme and Englehart (2011) evaluated a LDA classifier with Time-Domain (TD) features extraction, by using data of 10 classes performed at 20% and 80% of the strongest and reproducible contraction, except for the tenth class of no motion. The LDA classifier performed an error rate equals to 17% when trained and tested using data of 11 healthy subjects at all force levels. The error increased at 31 − 44% when trained at one force level and tested with all force levels. Subsequently, the effect of contraction strength on pattern recognition based control was studied in Scheme and Englehart (2013). By using a LDA classifier trained with dynamic ramp data of 10 healthy subjects, the classification error significantly improved (11.16 ± 0.54%).

Different strategies have been developed by combining the above techniques to make the control most fluid and intuitive for the user. Two proportional control algorithms were used to obtain a robust and proportional velocity commands that could improve the usability of PR (pattern recognition) based control (Scheme et al., 2014). Fougner et al. (2014) presented a novel pattern recognition system with mutex on-off control or proportional control of a commercial prosthetic hand and wrist. In Young et al. (2012b) three classification strategies were introduced and compared in order to provide simultaneous DoFs control. The first classification approach used a single linear discriminant analysis (LDA) classifier to discriminate both discrete and combined motions. All the discrete and combined gestures were considered as separated classes. The second proposed approach was based on a hierarchical classification strategy and consisted of a hierarchy of LDA classifiers. The highest classifier in the hierarchy determined a motion class for a single DoF by using both discrete and combined motion data. The output of this classifier determined which classifier of the second level could be used for discriminating the motion class of a second DoF. Finally, the parallel classification strategy employed one LDA classifier for each DoF and the decision of the single classifier is independently defined. The parallel classification strategy was presented also to either allow the simultaneous control of three-digits of a monkey (Baker et al., 2010) or to control the elbow and hand/wrist movement of an active myoelectric transhumeral prosthesis (Boschmann et al., 2011). No hierarchical strategy has ever been proposed to simultaneously identify desired gestures and forces.

This work aims at proposing and testing a hierarchical pattern recognition strategy to contemporary identify desired hand/wrist gestures and force levels. In details, a Finite State Machine (FSM) scheme (**Figure 2**) is introduced to manage desired hand/wrist gestures and force levels, following a hierarchical approach.

Differently from Young et al. (2012b) the hierarchical classification system is used to discriminate simultaneously hand/wrist gestures and desired force levels. In details, the highest NLR classifier, i.e., the "hand/wrist gestures classifier," is devoted to identify the desired hand/wrist class among seven gestures. The output of this classifier determines the next classifier used in the hierarchy. If the output of this classifier is "Spherical" motion class, then the "Spherical force classifier" is used to determine the desired force level to exert on an object. This second classifier is conditioned on the decision of the first classifier. The same strategy is adopted if the output of the first classifier is "Tip" motion class. In this case, the "Tip force classifier" lower in the hierarchy is used to determine the desired force levels. Thus, the classifiers of the second level of the hierarchy discriminate the force levels applied during the related grasping class. If, instead, the output of the first classifier is any hand/wrist gestures different from "Spherical" or "Tip" gesture, no force classifiers are activated. Hand and wrist gestures are classified by using the single "hand/wrist gestures classifier." When the "Spherical" or "Tip" state is chosen, the "Spherical force classifier" or the "Tip force classifier" is respectively activated in a hierarchical way to discriminate between three different force levels (i.e., Low, Medium, and High). The FSM use allowed the two classifiers of different grades of the hierarchy to work simultaneously. Until the "Spherical" or "Tip" state is classified by "hand/wrist gestures classifier," the "Spherical force classifier" or the "Tip force classifier" intervenes to discriminate force levels.

The NLR and LDA algorithms are employed for implementing the hierarchical classification approach, since LDA and NLR retained statistically similar value for F1Score performance and computational burden, despite LDA has the fewest number of classification parameters (Bellingegni et al., 2017). The number of gestures is increased from five (Bellingegni et al., 2017) to seven and the classification is extended to three different force levels, by using the same number of sensors. Force information is provided only for the two grasping classes (i.e., "Spherical" and "Tip") in which an object interaction is expected. Seven hand/wrist gestures (i.e., Rest, Spherical, Tip, Platform, Point, Wrist supination, and Wrist pronation) had been discriminated by using a Non-linear Logistic Regression (NLR) algorithm. When the "Spherical" or the "Tip" class are identified, a second NLR-algorithm-based classifier, i.e., respectively, "Spherical force classifier" or "Tip force classifier" is activated simultaneously in order to discriminate three force levels (i.e., Low, Medium, and High).

The same hierarchical pattern recognition strategy was implemented with three linear classifiers ("hand/wrist gestures classifier," "Spherical force classifier" and "Tip force classifier"), based on LDA with time domain features extraction. The performance of each algorithm (NLR and LDA) were measured by means of F1Score value and statistical analysis had been based on the Wilcoxon Signed-Rank test, which had been shown to be appropriate for comparing different classifiers in common datasets (Demšar, 2006). A comparative analysis among NLR and LDA with features extraction was implemented in order to define the most suitable classification algorithm for the realization of a gestures and forces classification architecture to control of a prosthetic device. Thus, in literature, several studies have been considered the LDA classifier with features extraction as ground truth (Simon et al., 2011; Young et al., 2014) and it can be used for the online control of prosthetic devices (Resnik et al., 2017) that is commercially available by COAPT (https://www.coaptengineering.com).

The performance of the proposed approach are evaluated during an experimental session involved 31 healthy subjects. The users are asked to perform seven hand/wrist motions and to replicate three different force levels during the "Spherical" and "Tip" grasps.

The paper is organized as follows. Section II describes the proposed force/gesture classification approach and the experimental setup used to collect sEMG and force data. Section III reports the results in terms of F1Score and accuracy of each classifier trained with both NLR and LDA algorithms. Section IV discusses the achieved results and then it reports the comparative analysis among NLR and LDA classifiers in terms of F1Score. The last section draws the conclusion, including some considerations regarding the comparative analysis between NLR and the LDA benchmark classifier, limits and future works.

# 2. MATERIALS AND METHODS

### 2.1. Forces/Gestures Classification Approach

A hierarchical pattern recognition strategy was proposed for the classification of the desired hand/wrist gestures and force levels from muscular signals (**Figure 1**). The FSM coordinated the hierarchical activation of the three classifiers implemented both with NLR and LDA algorithms for doing a comparison in terms of F1Score performance.The highest classifier in the hierarchy was a single classifier able to discriminate seven discrete hand/wrist motion classes. The output of this classifier determined the desired hand/wrist gesture and, in case of "Spherical" or "Tip" class, the force classifier, lower in the hierarchy, to be activated. Thus, the force classifiers were activated for force levels recognition.

The described hierarchy was implemented adopting NLR algorithm for both gesture and force classifiers. The same hierarchy was then reproduced using LDA algorithm in order to perform a comparative analysis. The Linear Discriminant Analysis (LDA), using time domain of the EMG signal, was frequently employed in literature because it was considered an efficient algorithm, simple to train and with an optimal compromise in terms of computational burden (Young et al., 2014). The Wilcoxon Signed-Rank test applied to the F1Score values was performed with significance threshold set to 0.05.

the classification of three force levels for "Spherical" gesture.

The FSM coordinated the three classifiers activation (i.e., one for hand/wrist gestures and two for force levels). The FSM approach was characterized by the following key features:


The proposed classification system was characterized by three different classifiers (**Figure 2**):


**Figure 2** in the red box). It was active if the "Spherical" gesture was identified and it was lowest in the hierarchy (**Figure 2**).

• The "Tip force classifier" was able to discriminate three force levels (i.e., Low, Medium, and High Level shown in **Figure 2** in the red box). It was active if the "Tip" gesture was identified and it was lowest in the hierarchy (**Figure 2**).

FSM determined the following different scenarios: a single classification approach used "hand/wrist gestures classifier" to recognize seven discrete hand/wrist motion classes; the classification approach become hierarchical when the output of this classifier was the "Spherical" or "Tip" motion class. In this case, a second classifier (force classifier) was activated. Until the FSM system remained in one of these two states (i.e., "Spherical" or "Tip"), the output of the FSM system provided hand/wrist gestures and the force levels information. Otherwise, if the FSM system was in a different state from the "Spherical" or "Tip," only the single "hand/wrist gestures classifier" was activated and the gesture information was supplied.

The force classifiers managed only a three classes classification problem related to three different force levels ( i.e., Low, Medium, and High).

The raw sEMG recording for the six EMG channels, related to all the seven performed movements of a single acquisition session, was reported in (**Figure 3**). In details, the enveloped EMG signal was acquired at 1 KHz to create three Datasets, used for both the NLR and LDA algorithms (**Figure 4**). For the NLR classifiers, the "raw" sEMG signals were used as input features in order to speed up the training and cross validation of the NLR algorithm (**Figure 4A**). On the other hand, for the LDA classifiers, five commonly used time domain features were extracted: Mean Absolute Value (MAV), Root Mean Square (RMS), Slope Sign Change (SSC), Waveform Length (WL) and Variance (σ 2 ) (**Figure 4B**). In this case, the features extraction avoided the generation of large-scale-dataset without performing the downsampling step and the time to complete the training is not too long. The TrainingSet of the "hand/wrist gestures classifier" was composed by sEMG signals related to all the seven states of FSM. This TrainingSet included the recording of Spherical and Tip gestures performed at three different force levels in order to correctly classify gestures independently from muscular contraction changes due to force variations.

The TrainingSets of "Spherical and Tip force classifiers" were composed only by sEMG data expressing different muscular contraction levels for these gestures. The NLR and LDA machine learning algorithms and dataset organization are provided below.

## 2.2. NLR Classification Algorithm and Dataset Organization

In each subject's acquisition, the sEMG data were organized in a 84000 ∗ 6 dimensions matrix. Each column of the matrix was coupled with an EMG sensor. Firstly three-way data split approach (Ripley, 2007) was applied to the dataset (84000 ∗ 6 sEMG data) and the Training Set (TR), the Cross Validation Set (CVS) and the Test Set (TS) were set to contain 60, 20, and 20% of the data, respectively. A random shuffle was implemented for filling these subsets with a proper proportion of all classes samples distribution.

The unique operation done on sEMG data was the scaling: it consists of subtracting the mean value to each signal and dividing the result by the range, as done in Bellingegni et al. (2017).Then,downsampling (with a step = 10, 100 Hz) was applied to reduce the data dimensions and training process.

The scaled "raw" sEMG data were directly used as input for the NLR model, without performing any features extraction. The use of only "raw" sEMG signals allowed a significant reduction of the classification time and of the response time without loss of system performance (Nazarpour, 2005; Dohnalek et al., 2013; Benatti et al., 2014). Moreover, the use of "raw" scaled sEMG signals (**Figure 3**) as input features approximated the class evaluation time and system readiness to the sampling time (Bellingegni et al., 2017). The discarded data rising from the downsampling process (90% of initial data) composed a new set of data called Generalization Set (GS) used as a second test to obtain an estimation of the generalization capability of each classifier. The three way data split approach was applied on the data coming from downsampling process (10% of initial data): TR, CVS, and TS were set to contain 6, 2, and 2% of the data, respectively. The TR and CV were used to train and cross validate the classifiers and the TS and GS were employed to test the performance of the classifiers. In details, the TR was used to train the supervised classification algorithm by minimizing a specific cross-entropy error cost function:

$$\begin{aligned} J(\theta, \theta\_0) &= -\frac{1}{m} \left[ \sum\_{i=1}^m \boldsymbol{\wp}^{(i)} \cdot \ln \boldsymbol{g} \left( \boldsymbol{\theta}^T \cdot \boldsymbol{\varkappa}^{(i)} + \theta\_0 \right) \right] \\ &- \frac{1}{m} \left[ \sum\_{i=1}^m \left( 1 - \boldsymbol{\wp}^{(i)} \right) \cdot \ln \left( 1 - \boldsymbol{g} \left( \boldsymbol{\theta}^T \cdot \boldsymbol{\varkappa}^{(i)} + \theta\_0 \right) \right) \right] \text{(1)} \end{aligned}$$

where m is the number of samples of TrainingSet, y (i) is the known class membership of the i-th sample, θ and θ<sup>0</sup> are the classification parameters and g(·) is the logistic function. Resilient Backpropagation (RProp) was chosen as minimization algorithm (Baykal and Erkmen, 2000; Bellingegni et al., 2017) for the NLR. Each single classifier was iteratively trained with all possible configurations of its internal parameters that had an appropriate range of values (Bellingegni et al., 2017).

To avoid overfitting and explore the best model, the CV was used to evaluate the performance of classifiers for each set of internal parameters (Bellingegni et al., 2017). In this study, the goodness of the classification was evaluated in terms of F1Score because it was considering more robust, in lieu of accuracy, to assess the performance (Powers, 2011). Once the optimal classification model had been chosen, TS was used to evaluate the performance of classifier when new features were introduced as input.

(σ 2 ).

Sensitive Resistors (FSR), Model 402 by Interlink Electronics.

The NLR algorithm calculated the class membership probability by using the following logistic function:

$$P(1 \mid \mathbf{x}, \boldsymbol{\theta}) = \begin{cases} \mathbf{g}(\boldsymbol{\theta}^T \cdot \mathbf{x}) = \frac{1}{1 + e^{-\left(\boldsymbol{\theta}^T \cdot \mathbf{x} + \boldsymbol{\theta}\_0\right)}}\\ 1 - P(\mathbf{y} = \mathbf{0} \mid \mathbf{x}, \boldsymbol{\theta}) \end{cases} \tag{2}$$

where θ and θ<sup>0</sup> are, respectively, the classification parameters vector and bias term, while g(·) is the logistic function. Additional polynomial features (e.g., x1, x2, x<sup>1</sup> ∗ x2, x 2 1 , x 2 2 ) were introduced to make non-linear this logistic regression model. The prediction of class labels hθ for the NLR algorithm was achieved by comparing the probability distribution P(y|x) with a decision threshold (TH):

$$h\_{\theta} = \begin{cases} P(1 \mid \mathbf{x}, \theta) \ge TH \to 1 \\ P(1 \mid \mathbf{x}, \theta) < TH \to 0 \end{cases} \tag{3}$$

#### 2.3. LDA Classification Algorithm and Dataset Organization

In order to create linear classifiers able to provide accurate movement classes and force levels recognition, a proper features set needs to be chosen to represent the sEMG signals (Hargrove et al., 2007). In our study, for each of the three LDA classifiers, five time-domain (TD) features (Mean Absolute Value (MAV), Root Mean Square (RMS), Slope Sign Change (SSC), Waveform Length (WL) and Variance (σ 2 ) were extracted from the corresponding channels of "raw" EMG data (**Figure 3**), in each analysis windows of 150 ms with an overlap of 100 ms (Smith et al., 2011). Since the LDA classifiers don't required the setting of internal parameters (Bellingegni et al., 2017), the training and test rely on a two ways data split approach (Ripley, 2007). Thus, the initial dataset was divided in this way: the TrainingSet (TR) contains 70% of the data and the test set contains the remaining 30% of the data. The training of the classifiers was performed by using the Equations (4,5). The subset were iteratively filled

FIGURE 6 | Subject positioning and data acquisition during experimental validation of the proposed approach. The subject was sitting in a comfortable chair in front of a PC monitor and was asked to perform six repetitions of each hand/wrist gesture. The subject performed "Spherical" and "Tip" gestures during the grasping of a rectangular object and executed three force levels. Written informed consent for the publication of this image was obtained.

trough a random shuffle in order to obtain a configuration with proportionated class number (Bellingegni et al., 2017). The downsampling step wasn't necessary because the features extraction avoided the generation of large-scale-dataset and guaranteed a short time for the training of the classifiers. In details, the Linear Discriminant Analysis (LDA) with features extraction is a binary supervised machine learning algorithm able to transform the features into a lower dimensional space, which maximizes the ratio of the between-class variance to the withinclass variance. This guarantees the maximum class separability (Welling, 2005). The following decision function is used to discriminate between only two different classes and to assign class label 1 or 2 to unknown data:

$$h\_{\beta}(\mathbf{x}) = \begin{cases} (\boldsymbol{\beta}^T \cdot \boldsymbol{\varkappa} + \beta\_0) \ge 0 \to 1\\ (\boldsymbol{\beta}^T \cdot \boldsymbol{\varkappa} + \beta\_0) < 0 \to 2 \end{cases} \tag{4}$$

where β and β<sup>0</sup> are, respectively, the classification parameters vector and the bias term. In details, the classification parameters can be evaluated in this way:

$$\begin{cases} \beta = \Sigma^{-1} \cdot (\mu\_1 - \mu\_2) \\ \beta\_0 = -\beta^T \cdot \left(\frac{\mu\_1 + \mu\_2}{2}\right) + \ln\left(\frac{\Pi\_1}{\Pi\_2}\right) \end{cases} \tag{5}$$

where 6 is the pooled covariance matrix, µ1, µ2, 51, 5<sup>2</sup> are the mean vectors and the prior probabilities of class 1 and 2, respectively. Since LDA is a binary algorithm a one vs. all approach was implemented to solve the multi-class classification problem. The class label (c) is predicted as following:

$$h\_{\beta}(\mathbf{x}) = \max\_{\boldsymbol{\varepsilon}} \left( \boldsymbol{\varepsilon} \boldsymbol{\beta}^{T} \cdot \boldsymbol{\chi} + \boldsymbol{\varepsilon}\_{\varepsilon} \beta\_{0} \right) \\
\text{and} \\
\begin{cases} \boldsymbol{\varepsilon}\beta = \boldsymbol{\Sigma}^{-1} \cdot (\boldsymbol{\mu}\_{\varepsilon}) \\
\boldsymbol{\varepsilon}\beta\_{0} = -\boldsymbol{\varepsilon}\_{\varepsilon} \boldsymbol{\beta}^{T} \cdot \left( \frac{\boldsymbol{\mu}\_{\varepsilon}}{2} \right) + \ln \left( \boldsymbol{\Pi}\_{\varepsilon} \right) \end{cases} \tag{6}$$

where <sup>c</sup>β and <sup>c</sup>β<sup>0</sup> are the classification parameters vector and the bias term of c class, respectively. An ad hoc developed software was implemented in Matlab for the construction of the three LDA classifiers.

The performance were evaluated through F1Score values and a Wilcoxon Signed-Rank test at p < 0.005 had been employed for comparing NLR and LDA classifiers in common datasets (Ortiz-Catalan et al., 2014). The LDA were trained and tested at 1KHz (without downsampling step) and for this reason the NLR model was evaluated considering the F1score on GS for the comparative analysis of the performance.

#### 2.4. Experimental Setup

Thirty-one healthy participants (age: 28 ± 7.6 years) were involved in the experiments. Six commercial active sEMG sensors (Ottobock 13E200 = 50, 27mmX18mmX9.5mm) were equidistantly fixed on an elastic adjustable bracelet and then were placed on the forearm of the able-bodied subjects in order to acquire sEMG signals (**Figure 5**).

The bracelet was located about 5cm below the subjects elbow, in line with the positioning of the electrodes, commonly used to control a prosthetic hand (Riillo et al., 2014). This type of electrodes output an enveloped signal of the "raw" signal

FIGURE 7 | Normalized confusion matrix of the "hand/wrist gestures classifier" obtained with NLR algorithm (A) and LDA algorithm (B). The confusion matrices are normalized with respect to the number of data belonging to the "GS" for the NLR classifier and to the "TS" for the LDA classifier. On the main diagonal the cardinality of the correct classifications is reported; in the top left dial and bottom right dial, the cardinality of the misclassified data related to the 7 output classes representing the hand gestures are reported.

FIGURE 8 | Normalized confusion matrix of the "Spherical force classifier" obtained with NLR algorithm (A) and LDA algorithm (B). Normalized confusion matrix of the "Tip force classifier" obtained with NLR algorithm (C) and LDA algorithm (D). The confusion matrices are normalized with respect to the number of data belonging to the "GS" for the NLR classifier and to the "TS" for the LDA classifier. The cardinality of the correct classifications is reported on the main diagonal; in the top left dial and bottom right dial, the cardinality of the misclassified data related to the 3 output classes that represented the force levels are reported.

(after amplification, filtering and rectification). The number of sEMG sensors was chosen equal to six because it was considered the highest number that was possible to place into the socket (Riillo et al., 2014). Moreover, it allowed to reduce the data dimensionality and complexity (Bellingegni et al., 2017). The EMG sensors operated in the range 0 − 5V with a bandwidth of 90 − 450Hz and a common rejection ratio higher than 100dB.

Five Force Sensitive Resistors (FSR), Model 402 by Interlink Electronics, were placed on a glove to verify the effective forces executed by the subjects. The relationship between the FSR voltage value V and the force value F was established with a statistically characterization as explained in Romeo et al. (2015). The relation between voltage and force is described trough the following mathematical expression:

$$F = p\_1 V^5 + p\_2 V^4 + p\_3 V^3 + p\_4 V^2 + p\_5 V + p\_6 \tag{7}$$

obtained with the polynomial model:

$$\wp = \sum\_{i=1}^{n+1} p\_i \mathfrak{x}^{(n+1-i)} \tag{8}$$

TABLE 1 | Mean value and standard deviation of F1Score and Accuracy of the "hand/wrist gestures classifier" calculated for 31 healthy subjects with NLR and LDA algorithms.


The classification performance for the NLR classifiers are evaluated on "GS," while LDA classifiers are tested on "TS," with data sampled at 1 KHz (without downsampling).

TABLE 2 | Mean value and standard deviation of F1Score and Accuracy of the "Spherical force classifier" calculated for 31 healthy subjects with NLR and LDA algorithms.

#### Spherical force classifier


The classification performance for the NLR classifiers are evaluated on "GS," while LDA classifiers are tested on "TS" with data sampled at 1 KHz (without downsampling).

TABLE 3 | Mean value and standard deviation of F1Score and Accuracy of the "Tip force classifier" calculated for 31 healthy subjects with NLR and LDA algorithms.


The classification performance for the NLR classifiers are evaluated on "GS," while LDA classifiers are tested on "TS," with data sampled at 1 KHz (without downsampling).

where n + 1 represents the number of fitting coefficients, while n (1 ≤ n ≤ 9) is the degree of the polynomial. The Anderson loop was used as signal conditioning circuit (Anderson, 1998).

CPU @ 1.80 GHz 2.40 GHz) and DAQ communicated by means of an USB port.

The EMG and force data were simultaneously acquired at 1 KHz using a suitable software on Labview platform by DAQ USB 6002 device. The PC (Samsung Intel(R) Core (TM) i7-4500U The subject was sitting in front of a monitor (**Figure 6**) and was asked to perform the following seven hand gestures: Rest (hand relax), Spherical (hand with all fingers closed), Tip (hand with thumb and finger touching as if picking a small object),

FIGURE 9 | (A) Average F1Score values calculated on 30 healthy subjects using NLR "hand/wrist gestures classifier" algorithm, tested on "GS," and LDA "hand/wrist gestures classifier" with 5 time domain features, tested on "TS." (B) Average F1Score values calculated on 30 healthy subjects using NLR "Spherical force classifier" algorithm, tested on "GS," and LDA "Spherical force classifier" with 5 time domain features, tested on "TS." (C) Average F1Score values calculated on 30 healthy subjects using NLR "Tip force classifier" algorithm, tested on "GS," and LDA "Tip force classifier" with 5 time domain features, tested on "TS." Statistical non-significance is indicated by "ns".

FIGURE 10 | Force sum average values are obtained, by FSR measurements, for 31 healthy subjects during, respectively, the "Spherical" and "Tip" gestures, performed six times: the blue, red and black values represent the mean value and standard deviation of respectively low, medium, and high force values performed by each subject.




Platform (hand completely open and stretched), Point (hand with all fingers closed except for the index finger), Wrist Supination and Wrist Pronation. The participants were asked to produce each of these gestures for six times and hold it for 2 s with an interval of rest state about 2 s between each repetition.

In a initial phase before the training, each subject was asked to produce maximum muscle contractions in order to perform the highest peak of force, while grasping a stiff object of rectangular shape (weight 66 g, dimensions 50 × 100 × 17 mm) with "Spherical" and "Tip" grasps. The object was used also during the training session.

Three force thresholds were established at 30% (low), 60% (medium), and 90% (high) of the sum of all force contributions recorded from FSR sensors. Three force bands were defined as follows to reduce the difficult to perform a punctual value of force: the low level was fixed between the ±15% of the lowest threshold (i.e., 30%), the medium level was fixed as ±15% of the medium threshold (i.e., 60%), while the high level starts from −15% of the highest threshold (i.e., 90%) and continued until the maximum value. These bands were used to give a visual feedback to the subject during the recording of "Spherical" and "Tip" gestures.

#### 3. RESULTS

The results of the "hand/wrist gestures classifier" are reported in **Table 1** in terms of the average accuracy and F1Score for NLR and LDA algorithms. The results of LDA classifiers with time domain features extraction were obtained with data sampled at 1 KHz (without downsampling). Thus, for the comparative analysis, we reported the results of NLR classifiers tested on "GS" because they represent the behavior of the classifiers when data sampled at 1 KHz are provided as input (Bellingegni et al., 2017). To this purpose, the Wilcoxon Signed-Rank test applied to the F1Score values was performed with significance threshold set to 0.05. Average classification accuracy for the NLR "hand/wrist gestures classifier," the NLR "Spherical force classifiers" and "Tip force classifiers" are respectively equals to 98.78, 98.80, and 96.09%. The LDA "hand/wrist gestures classifier" reaches an average classification accuracy equals to 95.41%, while the LDA "Spherical force classifiers" and "Tip force classifiers" show an average classification value of 98.74 and 97.60%, respectively.

The results of the two force classifiers, "Spherical force classifier" and "Tip force classifier" are shown, respectively, in **Tables 2**, **3**, in terms of the average F1Score and accuracy for the NLR and LDA classifiers. The average classification accuracy of the NLR "Spherical force classifier" is 98.35% for the low force level, 98.25% for the medium force level and 99.80% for the high force level.The LDA "Spherical force classifier" shows an average classification accuracy of 98.7% for the low force level, 98.05% for the medium force level and 99.48% for the high force level. The average classification accuracy of the NLR "Tip force classifier" is 94.36% for the low force level, 94.46% for the medium force level and 99.26% for the high force level. The LDA "Tip force classifier" shows an average classification accuracy of 97.79% for the low force level, 96.36% for the medium force level and 98.66% for the high force level.

**Figure 7** shows the average confusion matrix when testing the NLR and LDA "hand/wrist gestures classifier" on "GS" and "TS," respectively. In details, **Figure 7A** reports the normalized confusion matrix for the NLR "hand/wrist gestures classifier," while **Figure 7B** is related to the LDA "hand/wrist gestures classifier." **Figure 8** shows the average confusion matrices when testing the NLR "Spherical force and Tip force classifiers" on the "GS" (**Figures 8A,C**) and the LDA "Spherical force and Tip force classifiers" on "TS" (**Figures 8B,D**). As shown in **Figure 9** the NLR and LDA "hand/wrist gestures classifier" were able to identify seven hand gestures with an average F1Score of 96.01% and 95.41% respectively (**Figure 9A**). The "Spherical force classifier" identified the force level reaching an average F1 score of 98.75 and 98.79% with NLR and LDA classifiers, respectively (**Figure 9B**). The "Tip force classifier" was able to define the force level with an average F1 score of 94.04 and 97.53% with NLR and LDA classifiers, respectively (**Figure 9C**). The Wilcoxon Signed-Rank test applied to the F1Score values points out no statistically significant difference ('ns") between NLR and LDA algorithms (at p < 0.05).

In **Figure 10** the average values of the sum of all the FSR measurements for 31 healthy subjects are showed. The misclassification error rates are presented in **Tables 4**, **5** for both the NLR and LDA "hand/wrist gestures classifier". The NLR "hand/wrist gestures classifier" performed the highest misclassification errors (i.e., 9%) with "Point" class, while the LDA "hand/wrist gestures classifier" performed misclassification errors greater than 10% (i.e., 11, 12.7, and 11.3%) for the "Spherical, Tip and Point" classes, respectively. The NLR and LDA "Spherical force classifier" reached the maximum misclassification error (i.e., 3 and 8%, respectively) for the "Medium" force level. The "Tip NLR and LDA force classifier" presented the same maximum misclassification error (i.e., 8.5%) for the "Medium" force level.

#### 4. DISCUSSION

As shown in **Table 1** and **Figure 9** the NLR and LDA "hand/wrist gestures classifier" were able to identify seven hand gestures


TABLE 5 | Misclassification error rates of Spherical and Tip Force Classifier calculated with NLR and LDA algorithms.

with an average F1Score of 96.01 and 95.41% respectively. The "Spherical force classifier" identified the force level reaching an average F1 score of 98.75 and 98.79% with NLR and LDA classifiers, respectively (**Table 2**). The "Tip force classifier" was able to define the force level with an average F1 score of 94.04 and 97.53% with NLR and LDA classifiers, respectively (**Table 3**). These results seem to be very promising if we consider that similar values of average F1Score have been achieved only for gesture classification (Duan et al., 2016; Bellingegni et al., 2017). The comparative analysis between NLR and LDA classifier, applied to F1Score values, reported no statistically significant difference (p < 0.05) between them.

Confusion matrices, reported in **Figures 7**, **8**, confirmed the positive results of the accuracy parameter. The cardinality of the correct classifications on the main diagonal underlined the high classification accuracy even if some misclassified data out of the main diagonal suggested a bit minus performance of "Tip force classifier" respect to "Spherical force classifier." This is due to the major difficulty encountered by few subjects to modulate between low and medium force levels during a Tip grasp. The high force levels were always well discriminated at 99% of average accuracy for both the NLR and LDA force classifiers. In **Table 4**, the LDA "hand/wrist gestures classifier" obtained a greater misclassification error rates than NLR "hand/wrist gestures classifier" ranging from 1% to maximum 12, 7% for discriminating seven hand/wrist gestures classes by using data including different muscular activations related to desired force levels. This may due to the fact that linear classifiers, with straight line or plane decision boundary, could not be the most appropriate method for a seven multi-class problem with features not linearly separable at all. In comparison to the results presented in Scheme and Englehart (2011), the misclassification error values, obtained for the "Spherical" and "Tip" classes with the LDA "hand/wrist gestures classifier," were lower than 17% and, thus, it can be considered an effective result. Moreover, the misclassification error values, obtained for the "Spherical" and "Tip" classes with the NLR "hand/wrist gestures classifier" were, respectively, equals to 6 and 5% and these results can be considered positive for an usable system (< 10%) (Scheme and Englehart, 2011). Finally, the misclassification error rates for the "Spherical and Tip force classifiers" are similar (**Table 5**), ranging from 1% to maximum 8, 5% for both the NLR and LDA classifiers.

Almost all healthy subjects were able to modulate the force levels and fall into the range displayed by the visual feedback, without generating high variance values, as shown in **Figure 10**. Fewer subjects difficulty reproduced the force values within the force intervals, despite the visual feedback as reference. For instance, in **Figure 10**, the subjects 25 and 3 were not able to well differentiate between medium and high force levels during Tip grasp (represented as red and black points), while subject 28 performed the three force levels too closed during Tip grasp. This depended on the subject's difficulty to maintain the applied force within the force intervals.

These results are also more appreciable if we take into account that NLR, used for the classification of both hand/wrist gestures and force levels, was trained and tested using only raw scaled sEMG signals as input features. On the other hand, the LDA algorithm employed the minimum number of classification parameters and computational burden. However, the use of time domain features extraction based on time windowing, make the class evaluation time equals to the window shift and the system delay approximates to the time window length (Bellingegni et al., 2017). Furthermore, the same number of sensors were adopted to classify seven gesture classes respect to the previous five (Bellingegni et al., 2017) and to identify three levels of force during the execution of "Spherical" and "Tip" grasps. The proposed hierarchical classification architecture permitted to decode the user's motion intention and desired force levels with high reliability. Despite the proposed PR approach was tested only on healthy subjects, the reported results are promising for future developments on trans-radial amputees. The proposed hierarchical pattern recognition approach has obtained effective results with both NLR and LDA algorithms that have been demonstrated to be suitable for discriminating both hand/wrist gestures and force levels applied during grasping tasks. Moreover online performance will be evaluated for controlling a multifunctional prosthetic device.

## 5. CONCLUSION

In this study a hierarchical classification approach was developed and tested to discriminate both hand/wrist gestures and force levels applied during grasping tasks. The proposed PR system, implemented with both NLR and LDA classifiers, was tested on 31 healthy subjects by using 6 commercial sEMG sensors and five FSR placed on a glove. The method employed three different classifiers to discriminate both desired gestures and forces. To this purpose, the NLR and LDA algorithms were adopted for implementing the hierarchical classification approach and a comparative analysis among the performance of these two algorithms was done. The statistical analysis based on the Wilcoxon Signed-Rank test, applied to the F1Score values, revealed no statistically significant difference between NLR and LDA. The NLR classifiers exhibited excellent results in terms of accuracy both for gestures (i.e., 98.78%) and forces (Spherical 98.80%, Tip 96.09%). In particular, the force classifiers were able to robustly discriminate the same class of movement performed at different muscle contractions because they were trained with data containing the modulation of different force levels. Also the LDA classifiers achieved effective results in terms of accuracy both for gestures (i.e., 95.41%) and forces (Spherical 98.74%, Tip 97.60%). The misclassification errors of the NLR classifiers was limited to a maximum values of 9% for the "hand/wrist gestures classifier," 3% for the "Spherical force classifier" and 8.5% for the "Tip force classifier." On the other hand, the misclassification errors of the LDA classifiers reached the maximum values at 12, 7% for the "hand/wrist gestures classifier," 4% for the "Spherical force classifier" and 8.5% for the "Tip force classifier." In particular, the results of misclassification values obtained by the NLR and LDA "hand/wrist gestures classifier" for the "Spherical" and "Tip" classes, were particularly noteworthy and promising. Based on these outcomes, a new potential strategy should be introduced for mitigating the effect of different exerted forces within a given movement class. Another innovative contribution is represented by the use of FSM theory for the management of three classifiers. This control strategy avoids to face a more seven multi-class problem using a single classifier and make the system controllability less complex by activating the force classifiers only when the "hand/wrist gestures classifier" returns an output class belonging to a closure hand gesture. This classification approach, implemented both with NLR and LDA algorithms, have obtained positive results and seems to be very promising for identifying simultaneously desired gestures and force levels.

In conclusion, the proposed method allowed to extract from EMG signals all the valuable information regarding not only muscle contractions related to hand/wrist motions but also the changes of muscle activation patterns depending on the influence of different force levels. This approach will allow to improve the performance of the currently adopted prosthesis EMG control architectures thanks to the possibility to manage desired gestures and force levels in a more natural way. The ultimate goal will be to produce an intuitive controlled hand prosthesis integrating force regulation. Although the type of the recruited subjects did not allow to verify the performance in a real application scenario, this study permitted to provide a general indication about the performance of the proposed approach. Future works will be focused on the validation of the presented method on trans-radial amputees controlling multi-fingered hand prostheses. Moreover,

#### REFERENCES


online performance will be evaluated in real application scenario. After reaching an advanced grade of real time accuracy, an embedding version of this classification system will be developed to control a prosthetic device. Measures of system robustness and reliability will be performed testing the proposed approach during the control of prosthetic devices. Advanced control strategies (Ciancio et al., 2015; Barone et al., 2016) will be adopted to allow force regulation and slippage management during grasping (Cordella et al., 2016).

# ETHICS STATEMENT

The experimental protocol was approved by the local Ethical committee (Comitato Etico Université; Campus Biomedico di Roma) and complied with the Declaration of Helsinki. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

# AUTHOR CONTRIBUTIONS

FL analyzed the literature, designed the proposed approach, acquired, and analyzed the experimental data and wrote the paper. CG contributed to the design of the proposed approach, to the analysis of the experimental data and wrote the paper. AC contributed to the design of the proposed approach and of the experimental setup, wrote the paper, and supervised the study. EGr contributed to the design of pattern recognition algorithm and contributed to the manuscript writing. AD contributed to the manuscript writing. RS contributed to the manuscript writing. EGu contributed to the manuscript writing. LZ designed the paper and supervised the study. All the authors read and approved the manuscript.

# FUNDING

This work was supported partly by the Italian Institute for Labour Accidents (INAIL) prosthetic center with PPR AS 1/3 (CUP: E57B16000160005) and PCR 1/2 (CUP: E57B16000160005) projects, partly by Fondazione ANIA with the project Development of bionic upper limb prosthesis characterized by personalized interfaces and sensorial feedback for amputee patients with macro lesion after car accident, partly by INAIL with the RehabRobo@work project (CUP: C82F17000040001) and partly by BANDO INTESE Innovation and technology transfer to support the fruition of research results on the territory co-financed by Regione Lazio (FILAS RU 2014 1193).

Aszmann, O. C., Roche, A. D., Salminger, S., Paternostro-Sluga, T., Herceg, M., Sturma, A., et al. (2015). Bionic reconstruction to restore hand function after brachial plexus injury: a case series of three patients. Lancet 385, 2183–2189. doi: 10.1016/S0140-6736(14)61776-1

Baker, J. J., Scheme, E., Englehart, K., Hutchinson, D. T., and Greger, B. (2010). Continuous detection and decoding of dexterous finger flexions with implantable myoelectric sensors. IEEE Trans. Neural Syst. Rehabil. Eng. 18, 424–432. doi: 10.1109/TNSRE.2010.2047590


distance and electrode configuration. IEEE Trans. Biomed. Eng. 59, 645–652. doi: 10.1109/TBME.2011.2177662


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Leone, Gentile, Ciancio, Gruppioni, Davalli, Sacchetti, Guglielmelli and Zollo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Tactile Decoding of Edge Orientation With Artificial Cuneate Neurons in Dynamic Conditions

Udaya Bhaskar Rongala1,2 \*, Alberto Mazzoni <sup>1</sup> , Marcello Chiurazzi <sup>1</sup> , Domenico Camboni <sup>1</sup> , Mario Milazzo<sup>1</sup> , Luca Massari 1,2, Gastone Ciuti <sup>1</sup> , Stefano Roccella<sup>1</sup> , Paolo Dario<sup>1</sup> and Calogero Maria Oddo<sup>1</sup> \*

<sup>1</sup> Scuola Superiore Sant'Anna, The BioRobotics Institute, Pisa, Italy, <sup>2</sup> Department of Linguistics and Comparative Cultural Studies, Ca' Foscari University of Venice, Venice, Italy

Generalization ability in tactile sensing for robotic manipulation is a prerequisite to effectively perform tasks in ever-changing environments. In particular, performing dynamic tactile perception is currently beyond the ability of robotic devices. A biomimetic approach to achieve this dexterity is to develop machines combining compliant robotic manipulators with neuroinspired architectures displaying computational adaptation. Here we demonstrate the feasibility of this approach for dynamic touch tasks experimented by integrating our sensing apparatus in a 6 degrees of freedom robotic arm via a soft wrist. We embodied in the system a model of spike-based neuromorphic encoding of tactile stimuli, emulating the discrimination properties of cuneate nucleus neurons based on pathways with differential delay lines. These strategies allowed the system to correctly perform a dynamic touch protocol of edge orientation recognition (ridges from 0 to 40◦ , with a step of 5◦ ). Crucially, the task was robust to contact noise and was performed with high performance irrespectively of sensing conditions (sensing forces and velocities). These results are a step forward toward the development of robotic arms able to physically interact in real-world environments with tactile sensing.

Edited by:

Jeffrey L. Krichmar, University of California, Irvine, United States

#### Reviewed by:

Alex Pitti, Université de Cergy-Pontoise, France Martin James Pearson, Bristol Robotics Laboratory, United Kingdom

#### \*Correspondence:

Udaya Bhaskar Rongala udayabhaskar.rongala@santannapisa.it Calogero Maria Oddo calogero.oddo@santannapisa.it

> Received: 20 January 2019 Accepted: 07 June 2019 Published: 02 July 2019

#### Citation:

Rongala UB, Mazzoni A, Chiurazzi M, Camboni D, Milazzo M, Massari L, Ciuti G, Roccella S, Dario P and Oddo CM (2019) Tactile Decoding of Edge Orientation With Artificial Cuneate Neurons in Dynamic Conditions. Front. Neurorobot. 13:44. doi: 10.3389/fnbot.2019.00044 Keywords: force and tactile sensing, neuro-robotics, conduction delays, mechanoreceptors, cuneate neurons, biologically-inspired robots, spiking neural networks

# INTRODUCTION

As robots become more accepted to be part of our daily social and work environments, the research focus has taken a diversion toward more human centric design and learning paradigms. Many research studies in recent time have taken inspiration from nature and its evolutionary principles, to exploit the robustness and low computational costs in performing a dynamic task in un-trained surroundings (Ijspeert, 2014).

In the last few decades several neurophysiological studies in mammals focused on understanding the role of the various families of mechanoreceptors (sensory receptors that are sensitive to mechanical distortions) spread across the human skin, and their role in projecting information about external world to brain (Johansson and Flanagan, 2009; Abraira and Ginty, 2013). Such studies, subsequently led to enhanced understanding of the profile of sensory information, that neuronal circuits receive during simple object manipulation tasks. These studies describe the nature of spatiotemporal information (spiking responses) in tactile sensory afferents. Further, there is evidence that tactile feature extraction can happen already at afferent stages (Johansson and Flanagan, 2009; Gollisch and Meister, 2010; Weber et al., 2013), complementing with central information processing in the sensory cortex (Bensmaia et al., 2008a; Hsiao, 2008). Explicit studies on the effects of geometric features such as edge orientations (vertical line tilted in an angle, **Figure 1C**) on sensory afferents reported about information processing at peripheral stages of sensory perception (Pruszynski and Johansson, 2014). In alike manner, in the present study our research focuses on developing a robust tactile perception for robots, based on bioinspired paradigms. Toward this goal, we aimed at reproducing the intelligence embodied in the connectivity of the peripheral human tactile sensory system. By reproducing this connectivity structure in our robotic system we mimicked the hardwired architecture that has been hypothesized in humans (Johansson and Flanagan, 2009). In particular, we captured the features of the conduction delays along the neural pathways from peripheral to the central processing stages, to allow the tactile processing of geometric features such as edge orientations.

Many research studies in recent time have created neurorobotic systems by combining computational models based on various neurophysiological data (Saal et al., 2017) along with tactile sensors to reconstruct tactile afferent like responses (Lee et al., 2017; Rongala et al., 2017; Osborn et al., 2018). Recently, biomimetic computational models of peripheral tactile perception were extended to take into account the second neuronal layer of decoding (Bologna et al., 2013; Rongala et al., 2018a,b) and this feature was also used for edge detection (Hay and Pruszynski, 2018). Some robotic studies have also focused on geometric feature extraction techniques, where Ruben and colleagues (Ponce Wong et al., 2014) studied edge orientation

FIGURE 1 | Methods. (A) Robot setup with finger and compliant wrist integration. Insert demonstrates the integration of tactile sensor onto robot end effector with the help of a compliant wrist. (B) Illustration of the active touch protocol, where the finger is moved across the stimuli (red dotted line). (C) Representation of all the 9 ridge stimuli (3D printed with a height and width of 0.5 × 0.5 mm, with placement of three ridges).

with similar approach of robotic-arm based exploration and classification. They used support vector regression methods to learn and classify the stimuli, which requires offline training. One other study from Hernandez and colleagues (Martinez-Hernandez et al., 2013) conducted static edge perception experiments by tapping the stimuli with tactile sensor to demonstrate passive tactile perception for contour following. This approach used a probabilistic classifier based on Bayesian formalism for tactile perception. The edge detection was done only on right angles (0◦ , 90◦ , 180◦ , 270◦ ) with controlled sensing force.

One characteristic feature (Ijspeert, 2014) of all the aforementioned studies is that they were conducted under controlled sensing conditions. However, in order to build biomimetic devices that are able to process tactile information in the real world, we need to capture the way this process is robust to varying sensing conditions. Therefore, here we adopted a neuro-inspired paradigm to create a tactile feature extractor and verified it to be an effective information decoding strategy. Combining a two-layer neurocomputational model based on discrete events and delays, along with soft robot interfaces led us to develop a functional tactile system, that was able to deliver effective decoding of geometric edge orientations under varying sensing conditions (sensing forces and velocities). We have also assessed that our system performance was robust to variation in sensing forces.

# MATERIALS AND METHODS

# Tactile Sensor

For this research study we used a tactile fingertip, with a core element of MEMS (Micro Electro-Mechanical System) sensor. This sensor comprises of 2 × 2 piezoresistive sensor array (**Figure 2A**), arranged with 2.36 mm pitch [SensorPitch (SP) in Equation (1)]. Each sensor array comprises of four sensory channels (four piezoresistors, with cross-shape arrangement), constituting for a total of 16 sensory-channels in a 22.3 mm<sup>2</sup> area (Beccai et al., 2005; Oddo et al., 2011). This tactile sensor demonstrated sensitivity for both tangential and normal forces (Oddo et al., 2011) and precision and repeatability in the neuromorphic encoding-decoding of a varied range of stimuli that include ridges (Oddo et al., 2016) and naturalistic textures (Rongala et al., 2017). In this research study we consider data from 2 sensory channels [sensory channel 8 (SC8) and sensory channel 11 (SC11), **Figure 3A**], that are sensitive to the tangential force arising along the stimulus sliding direction. These channels are also space shifted along same axis with respect to the stimulus direction, which makes them appropriate to validate the conductional delay hypothesis. We convert this analog sensory information to neuromorphic spikes (event-based representation of data alike in neurons) using neuron models that are described in further sections of this article.

# Compliant Wrist

A compliant wrist was assembled on a rigid end effector of a 6 DoFs anthropomorphic robot (Comau Racer-7-1.4) through a loadcell (6-axis Nano 43, ATI Industrial Automation, Apex,

mechanosensors inputs (green and red spike trains, MS1, and MS2) for a single stimulation sequence converge to the same cuneate nucleus neuron (CNi , 2nd order neuron), through specific conduction delays along the pathway. In the illustrated example, CN1 generates output spikes (blue spike trains) only when it receives MS inputs that coincide. (B) The other CNs (CN2 and CN3) respond to different angle stimuli, depending on the set of conduction delays and subsequent temporal relations formed between the inputs.

USA). Whereas, the other end of compliant wrist carries the tactile sensor (**Figure 1A**). The design of the wrist was aimed at realizing a soft joint, enabling adaptation upon contact (between the fingertip and the external surfaces). Such a compliant element also prevents damage to the tactile sensor without affecting its sensitivity. The structure of the compliant wrist was shaped as a cylinder with a diameter and height of 40 × 60 mm. Moreover, the soft wrist was realized with helicoidally flextures to increase its flexibility. The joint was manufactured through molding of a polymeric viscous material, namely Dragon Skin (10 medium, Smooth-on, USA). Two metallic plates were used on the top and bottom surfaces of the cylinder, as a mechanical interface between the robot end-effector and the tactile sensor. Under the typical loads for this specific application (in the order of 1 N), analyses with a finite element model (COMSOL Multiphysics, COMSOL Inc., USA) showed a compressive stiffness of 2.5 N/mm and a stiffness of 0.14 N/mm along the tangential directions (**Figures 3A,B**).

## Stimuli

We used 9 rectangular shaped ridges as stimuli with dimension of 0.5 × 0.5 mm (height and width). Each of this ridge is fabricated with an inclination angle, ranging from 0 to 40◦ angles with a step of 5◦ (**Figures 1C**, **4A**). These ridges were fabricated using 3D printing technology. For these experiments, we printed three ridges with consecutive angles onto a single stimuli bar, resulting in total 3 stimuli bars bearing the 9 stimuli (as shown in **Figure 1C**). As these ridges are printed on three different stimuli bars, they have slightly varied physique because of different bending and other small deformations that occur in stimuli due to the 3D printing technology. This further pose additional challenges in the generalization abilities of the developed neuro bioinspired architecture.

# Experimental Protocol

We adopted an active touch protocol, where the stimuli are kept fixated on a rig and the robot end-effector (hooked with tactile fingertip) is maneuvered across the surface of the stimuli (as illustrated with blue arrows in **Figure 1A**; **Supplementary Video**). The robot is controlled using a real-time industrial controller (IC3173, programmed with NI LabVIEW, LabVIEW Real-Time and LabVIEW FPGA, along with C5G COMAU robot controller), that received position control commands. The robot default home position was put just above (along z-axis of the robot) the surface of the tactile stimuli. Further, the robot end-effector was moved along its z-axis (toward the stimuli), until a reference sensing force (force exerted between the tactile fingertip and stimuli) of ∼200 mN was reached. This is considered as the initial z-axis position (Z1, in **Figure 3C**) for all the stimuli.

Once the "Z1" position reached, keeping the z-axis of the robot locked, the fingertip was slid across the surface of stimulus (robot end-effector translated in cartesian space) for a length of 15 mm with a fixed sliding velocity, covering the whole surface of the ridged stimuli. The fingertip was held still for 1 s in this position, at the end of sliding. Finally, the fingertip was retracted away from the stimulus along the z-axis. This trajectory (**Figure 1B**, red dotted line) was maintained across all the ridges. Once the sliding was finished across all three ridges, the robot was brought back to the home-position. The robot was progressed 0.5 mm in z-axis (toward the stimulus) from the previous z-axis position (Z1–4) to generate varied sensing/contact forces, but without implementing precise force-feedback control. This experimental protocol was repeated 5 times across each stimulus, in-order to assess the repeatability of the system. In the following study we have presented and validated the neuro-inspired architecture for

sensing force between the tactile sensor and stimuli. (A) Tangential stiffness (across x-axis) of the compliant wrist displacement as the function of applied force. (B) Compressive stiffness (across z-axis) of the compliant wrist displacement as the function of applied force. (C) Sensing forces recorded during finger contact with all stimuli (1–9) associated to the different z-axis positions of the robot (Z-positions 1–5). The z-positions are obtained with a progressive increase by a step of 0.5 mm toward the stimuli from home position (as described in Methods). The boxplot illustrates the range of force across 5 mechanical repetitions and all 9 stimuli. The forces are recorded using a loadcell that is placed between the compliant wrist and the robotic arm.

five different sensing forces (**Figure 3C**, with P < 0.001, using ANOVA test) and five different sensing velocities (5, 10, 15, 20, and 25 mm/s).

# Neuron Model (Mechanoreceptors, 1st Order Neurons)

The tactile sensor analog data was fed as virtual current input (Iinput) to a custom implementation of the Izhikevich neuron model (Izhikevich, 2003) to generate the mechanosensorslike spatiotemporal spike responses (**Figures 3A**, **4B**, 1st order neurons). The Izhikevich neuron model was chosen in order to reproduce the adaptation dynamics, which is an important

TABLE 1 | Izhikevich neuron model parameters.


characteristic that is observed in mechanoreceptors (Johansson and Flanagan, 2009). The basic Izhikevich neuron model was defined by the following nonlinear differential equations, where v is membrane potential and u is the adaptation variable.

$$\begin{aligned} \dot{\nu} &= A\nu^2 + B\nu + C - u + \frac{I\_{input}}{C\_m} \\ \dot{u} &= a(bV - u) \end{aligned} \tag{1}$$

When the membrane potential reached the spike threshold of 30 mV, an output spike was produced followed by a reset,

$$\text{if } V \ge 30 \text{ } m\nu \text{, then } \begin{cases} \quad \nu \precsim \mathcal{c} \\ u \precsim u + d \end{cases} \tag{2}$$

A, B, C are the standard Izhikevich model parameters, whereas the a, b, c, and d parameters were chosen as specified in **Table 1**, to reciprocate regular spiking behavior.

### Neuron Model (Cuneate Neurons, 2nd Order Neurons)

The cuneate neurons (CNs) were also modeled as regular spiking Izhikevich neurons, based on the similar differential equations described above (Equations 1, 2). Whereas, the input current (Iinput)to the CNs was modeled as the summation of currentbased post-synaptic potential(PSPtot, Equation 4) (Cavallari et al., 2014) from mechanosensor neurons along with addition of specific differential conduction delays (1T) (**Figure 3**). At each given spike time of mechanosensor output (t ∗ i ) is converted to a single PSP<sup>i</sup> , who's kernel was given by Equation (3).

$$\begin{aligned} PSP\_i &= \frac{\tau\_l}{\tau\_d - \tau\_r} \times \left[ \exp \left( -\frac{t - \tau\_l - t\_i^\*}{\tau\_d} \right) \right] \\ &- \exp \left( -\frac{t - \tau\_l - t\_i^\*}{\tau\_r} \right) \end{aligned} \tag{3}$$

$$\text{PSP}\_{tot} = \sum\_{i \in prt} \text{PSP}\_i \tag{4}$$

The parameters, decay time (τd), rise time (τr) and latency time (τl ) defines the shape of the PSP<sup>i</sup> kernel. The basic configuration values τ<sup>d</sup> = 12.5 ms, τ<sup>r</sup> = 4 ms, and τ<sup>l</sup> = 21 ms (constant to calculate the ratio) are chosen based on the previous assumptions of calcium concentration induced in the synapse as presented in our previous work (Rongala et al., 2018b). t ∗ i gives the input spike-time from i th mechanosensor.

#### Conduction Delays

A conduction delay is the time step (1 T) that was added to the whole output spike-train of mechanoreceptors (1st

order neurons) along the afferent pathway. These delays bear a resemblance to the conduction times in nerves that connects tactile afferents (in hand) to the cuneate neurons (in brainstem) of humans (Johansson and Flanagan, 2009). We tested differential conduction delays ranging from 175 ms (MS1 ahead of MS2) to −700 ms (MS2 ahead of MS1) with a step of 1 ms, constituting for 876 conduction delays.

# Classification Algorithm

Given the characteristics of the data we have chosen a linear discrimination method trained with supervised learning. The classifier was trained and tested using a 5-fold cross-validation, which was repeated for 100 iterations to ensure the robustness of the classifier and training procedure. We have taken advantage of the inbuilt MATLAB <sup>R</sup> functions to perform this computation.

A probability density of the CN spike responses is calculated using the histogram function in MATLAB <sup>R</sup> , with a binsize of 10. Further, the median of this probability distribution was chosen as an input vector to the above described classifier (one-dimensional input). While considering single force-based decoding (**Figures 5B**, **7B**, **10A**), the input vector data is binned as 9 classes, representing all the 9 stimuli. The same followed for generalized decoding (**Figures 6B**, **8**, **10B**), the input vector data was binned as 9 classes (9 stimuli), irrespective of sensing forces.

# RESULTS

To explore the possibility and the efficacy of a biomimetic approach to tactile sensing, we developed an experimental set-up comprising a biomimetic tactile sensor (Oddo et al., 2011) linked

(Figure 4C) across all the nine stimuli for a given force. The boxplot demonstrates a gradual shift in the CNs spiking respective to the stimuli angle and conduction delay. The spiking probability is calculated across 5 experimental repetitions of each CN. (B) Confusion matrix demonstrates a decoding performance achieved across all the 9 stimuli (for a fixed sensing force (Z1) and sensing velocity (5 mm/s)) based on CNs responses, using supervised linear discrimination classifier. The decoding accuracy was 100% across 5 stimuli with a step size of 10◦ (S1, S3, S5, S7, and S9), and 88.2% with a step size of 5◦ .

to a customized soft wrist and 6 degree-of-freedom robotic arm (**Figure 1A**). This interaction led to the efficient classification of stimuli containing 9 angle ridges selected in a range from 0◦ to 40◦ with a step of 5◦ (**Figure 1C**) based on the contact with the biomimetic fingertip (**Figure 1C**).

FIGURE 6 | Classification of CNs responses irrespective of sensing forces. (A) Spiking probability across all the nine stimuli and five different sensing forces (Figure 3C), for each conduction delays. These CNs responses are for a single sensing velocity (5 mm/s). This plot clearly demonstrates almost similar spiking probability, robust to the variation of the sensing force, thus supporting generalization ability of the proposed approach. (B) Confusion matrix illustrating the decoding performance achieved by CNs irrespective of forces (labeling stimuli irrespective of sensing force), across all the 9 stimuli. The decoding accuracy was 100% across 5 stimuli with a step size of 10◦ (S1, S3, S5, S7, and S9), and 94.5% with a step size of 5◦ .

is simulated for 5 experimental repetitions with an addition of 19 noise repetitions each, constituting a total of 100 repetitions for each CN configuration. The noise was generated by a gaussian distribution with σ = 10 ms. The continuous plot alongside each raster plot illustrates the density of spiking for that respective differential conduction delay. The green line illustrates the theoretical value (Equation 5) of conduction delay. The median value of the density is indicated in red dot. The red line across stimuli illustrates the linear shift of spiking probability across different conduction delays. (B) Confusion matrix illustrating the decoding performance that is achieved by CNs across all the 9 stimuli (for a single sensing force and sensing velocity), using supervised linear discrimination classifier. The decoding accuracy was 100% across 5 stimuli with a step size of 10◦ (S1, S3, S5, S7, and S9), and 91% with a step size of 5◦ .

A two layer neuro-computational model was used in processing these tactile sensory information. In the first layer we make use of the Izhikevich neuron model to convert the output of the 16-channel tactile sensor data to multiple neuron spiking responses (**Figure 2A**) (Rongala et al., 2017). This allowed mimicking the response properties of human mechanosensors (MSs). In the second layer, we emulate cuneate neurons (CNs), again as regular spiking Izhikevich neurons. The inputs to CNs were modeled as summation of current based postsynaptic potentials from the MSs (see Methods). The key of the decoding mechanism is the connectivity between the two layers. This connectivity embodies a model of event-based encoding of tactile responses (**Figure 2A**), emulating the discrimination properties of cuneate neurons (CNs) based on pathways with differential delay lines (**Figure 2A**). The CN is considered in this model as a coincidence detector (Johansson and Flanagan, 2009), meaning that a specific CN responds when there is a superimposition of input mechanosensor spike timing. Based on this hypothesis, each CN encodes a specific angle, depending on the match of the stimulus-driven delay between the activation of the MSs with the difference between their conduction delays, which results in a synchronized and thus effective stimulation of the appropriate CN recipient (**Figure 2A**). Thanks to this mechanism, the probability of each CN to respond to a certain stimulus depends of the combination of specific conduction delays connecting it to its presynaptic MSs (**Figure 2A**).

To validate this approach, initially we conducted the experimental protocol across all the 9 stimuli with a single sensing force (Z1, **Figure 3C**) and a single sensing velocity (5 mm/s, V1). Two sensory channels data (SC8 and SC11,

**Figure 2A**, see Material and Methods for details) were selected as inputs to the neuronal processing to emulate mechanosensorslike responses (MS1 and MS2, **Figure 4B**). These spike trains were then fed into CNs, with a given conduction delay time (1T). The second layer was constituted by 876 different cuneate neurons, each identified by a specific conduction delay between the inputs received from MS1 to MS2. The conduction delays are ranged from 175 to −700 ms with a step size of 1 ms. **Figure 4C** show the responses of CNs based on the MS1 and MS2 inputs for a given differential conduction delay, across all stimuli. The CNs responses demonstrated in **Figure 4C** (raster plots) are for all 5 experimental repetitions. Further, in order to test the effectiveness of the designed system we analyzed the spike distribution probability across all the CNs for each stimulus (**Figure 5A**). The spiking distribution illustrated in **Figure 5A**, are based on the CNs responses shown in **Figure 4C**. We found a gradual shift in the median of the spiking probability across the range of conduction delays, as a function of the stimuli ridge orientation (**Figure 5A**). This shift in probability illustrates that certain CNs respond more to a specific stimulus, i.e., those whose differential conduction delay tends to compensate the theoretical latency (TTheory) between the spiking activation of the related MSs.

$$T\_{\text{Theory}} = \text{SP} \times \frac{\tan(RA)}{V} \tag{5}$$

Where, SP denotes the Sensor Pitch (see Methods), RA denotes the Ridge Angle (edge orientation angle of the stimuli), and V defines the sensing velocity. This relationship suggests that it might be possible to invert the process and decode the presented orientation, looking at the distribution of firing across the CNs population. This decoding strategy was inspired by that hypothesized in humans (Johansson and Flanagan, 2009), where the vast amount of information from 10,000's of tactile afferents across the hand is reduced into a small and useful sensory dimension, and then further transmitted to higher level cognitive processing. For validation of the information content in CN responses we used a linear classification technique (see Methods for details), which yielded 88.2% accuracy in decoding across all the 9 stimuli (**Figure 5B**, chance level 11.1%). As demonstrated in the confusion matrix (**Figure 5B**), in some cases only consecutive stimuli are confounded, which is also contributed by the little angle difference between two stimuli along with dynamic sensing conditions in real time robot operation. Whereas, we achieved 100% correct classification (for all 5 sensing forces) restricting the decoding across five stimuli, with stimuli angle variation of 10◦ (S1, S3, S5, S7, and S9). This led us concluding that the actual accuracy of our device is better than 10◦ ridge identification.

In the real world the sensing conditions might change, hence we wanted to test that the performance of our device was not restricted to a particular force of contact. For this generalization test we conducted experiments similar to the one described above, but with 5 different levels of target sensing forces and a fixed sensing velocity (5 mm/s, V1). To further stress the generalization ability, these 5 target force levels were not generated by precise force-feedback control, but by setting 5 different z-axis positions of the robot end-effector (**Figure 3C**, see Methods for details). We then computed the CNs spiking probability across all the 9 stimuli and 5 forces (**Figure 6A**). We grouped then across all the forces the median of spiking probability belonging to same stimulus and used it as the feature vector for classifier to validate the CNs responses across all stimuli irrespective of sensing forces. We attained 94.5% correct decoding across all the nine stimuli (**Figure 6B**), with a stimuli angle variation of 5◦ . This shows that high level of accuracy can be achieved independently of the variations of the sensing force. This highly-generalized decoding performance proves our encoding strategies to be robust to varying sensing forces, thanks to an architecture mimicking the intelligence embodied in the neural pathways from periphery to the brain.

To further stress the robustness of our neuromorphic device, we introduced temporal jitters (with a gaussian noise of σ = 10 ms) in each experimental repetition. This contributed 100 repetitions of each CN encoding for each stimulus and sensing force level. **Figure 7A** shows the responses of CNs to these 100 repetitions, along with the spiking density for each conduction delay (continuous plot beside raster plot). We observed a coherent phenomenon of gradual shift in spiking intensity across the range of conduction delays for each stimulus (**Figure 7A**, red dot depicting the mean of spiking intensity) even with additional temporal noise. We also observed that the mean spiking intensity across each stimulus (**Figure 7A** red dotted line) fell near to the theoretically estimated conduction delays (**Figure 7A** green dotted line) based on Equation (5). We then

yielded a high decoding accuracy for these CN responses across all the 9 stimuli, even in the presence of the introduced temporal noise in MS spiking activity. In this condition with added spike jitter we attained a correct decoding between 91% and 100% across nine stimuli, for individual sensing force and fixed sensing velocity (**Figures 7B**, **10A**). Further, we attained a decoding accuracy of 94.7% irrespective of sensing forces and fixed sensing velocity (**Figure 8**).

Further, we have also tested our system for five different sensing velocities (5, 10, 15, 20, and 25 mm/s) for a given sensing force. The mechanosensor spike responses (MS1 and MS2) demostrate a gradual shift in their spike timing with respect to the stimuli, along with homogenous transformation of total spike time with respect to the sensing velocities (**Figure 9A**). Further we analyzed the spiking probability of CNs responses for each stimulus and given sensing velocity (**Figure 9B**). We found a gradual shift in the median of the spiking probability across the range of conduction delays (**Figure 9B**), showing that some CNs are sensitive to a specific stimulus, under an optimal sensing velocity.

In order to evaluate the effect of these dynamic sensing conditions on our neuromorphic device, we have performed stimulus classification based on the CNs responses (with additional temporal jitter in the MSs) for a combination of all 5 sensing forces and 5 sensing velocities. We achieved more than 90% correct decoding in 20 out of 25 experimental conditions (**Figure 10A**). A low decoding performance was observed at high sensing velocities with high sensing forces, where the spiking probability boundaries overlap. Further, we attained a high decoding performance irrespective of sensing force, for each sensing velocity (**Figure 10B**). This high decoding accuracy proved that the proposed architecture was highly robust to noise and dynamic sensing conditions.

# DISCUSSION

We developed an artificial tactile system, with a bioinspired tactile sensor mounted onto a traditional 6 degrees-of-freedom industrial robot using a compliant wrist, that allowed adaptation to irregular sensing dynamics present in the surrounding tactile world. Further, we used a neuroinspired two-layer architecture to process the tactile sensory information. Thanks to such a synergy, we achieved an excellent orientation decoding performance (100% for 10◦ and 94.9% across 5◦ orientation step, for

stimuli ranging from 0◦ to 40◦ ). This is an excellent result as compared to the state-of-art orientation detection in robotic applications (Martinez-Hernandez et al., 2013; Ponce Wong et al., 2014). Taking advantage of the precisions in existing tactile sensors and computational systems, and combining them with biomimetic architectures can lead to building functional systems that are more capable in sensing when compared to the psychophysical studies that report about 20◦ angular perceptual threshold in humans (Bensmaia et al., 2008b). Further, exploiting neuromorphic hardware systems to build spiking neuronal networks across the population of sensors enabled a computationally efficient implementation of a functional tactile system. Moreover, we were able to capture another peculiarity of human tactile detection, i.e., to perform decoding in a way largely irrespective of sensing forces, across different sensing velocities.

The multiple experimental sensing conditions, the irregularities of 3D printed stimuli, the intrinsic limitations in robot precision along with soft compliance in wrist, created a versatile sensing condition that generated a variety of sensory responses. The tactile system presented in this research, capable to cope with such realistic variability in experimental conditions, plays a keen role in robotic applications. Such biomimetic approach will allow the robots to adapt and perform effectively irrespective of the continuously changing environments.

Note that in this research study we considered nerve conduction delays that are larger than what have been observed in mammals (Johansson and Flanagan, 2009). In this study what we were interested into was to reproduce the principle of differential delay matching the encoding of the stimulus orientation. As the pitch of our tactile sensors was much larger than in humans, where each hand is densely populated with 10,000s of tactile sensors afferents, we used proportionally larger conduction delays coherently with the prediction of Equation (5). Anyway, the used sensor density allowed achieving a high decoding across the varied set of stimuli considered, providing evidence that robotics technology can benefit of neuroscientific advances and in turn robotics science can contribute to investigating neurophysiological hypotheses (Yang et al., 2016).

# AUTHOR CONTRIBUTIONS

UR, AM, and CO conceived and designed the study, analyzed data, discussed the results, and wrote the paper. UR and MC performed experiments. UR, MC, DC, MM, LM, GC, SR, PD, and CO developed the experimental setup. UR implemented the cuneate-based model. CO ideated the cuneatebased model, supervised, and coordinated the study. All authors contributed to manuscript revision, read and approved the submitted version.

# FUNDING

This work was supported in part by the Tuscany Region by means of the CENTAURO project (CUP D88C15000210008) funded under the FAR-FAS call for proposals, by the NEBIAS European project (NEurocontrolledBIdirectional Artificial upper limb and hand prosthesis; EU-FP7-ICT-611687), by the NanoBioTouch European project (Nano-resolved multiscale investigations of human tactile sensations and tissue engineered nanobiosensors; EUFP7-NMP-228844), and by the PARLOMA project (SIN\_00132) funded by the Italian Ministry of Universities, Education and Research within the Smart Cities and Social Innovation Under 30 program. This project received seed funding from the Dubai Future Foundation through Guaana.com open research platform.

# ACKNOWLEDGMENTS

The authors thank the partners of the CENTAURO project, namely Piaggio & Co. SpA, Robot System Automation srl, Roggi srl and Robotech srl. The authors thank Dr. Andrea Aliperta, for contribution toward preparation of multimedia material. The authors also thank Mr. Davide Bray, Mr. Tommaso Rizzo, Mr. Francesco Bruni, and Mr. Lorenzo Collodi, for the technical contributions given in the development of the compliant wrist.

#### REFERENCES


### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnbot. 2019.00044/full#supplementary-material


**Conflict of Interest Statement:** The authors have a pertinent patent application pending.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Rongala, Mazzoni, Chiurazzi, Camboni, Milazzo, Massari, Ciuti, Roccella, Dario and Oddo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.