EDITED BY : Shuai Li, Yangming Li, Muhammad Umer Khan, Xin Luo and Huanqing Wang PUBLISHED IN : Frontiers in Neurorobotics

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use. ISSN 1664-8714 ISBN 978-2-88945-697-0 DOI 10.3389/978-2-88945-697-0

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# NEURAL & BIO-INSPIRED PROCESSING AND ROBOT CONTROL

Topic Editors: Shuai Li, Hong Kong Polytechnic University, Hong Kong Yangming Li, Rochester Institute of Technology, United States Muhammad Umer Khan, Atilim University, Turkey Xin Luo, Chongqing Institute of Green and Intelligent Technology (CAS) Chongqing, China Huanqing Wang, Carleton University Ottawa, Canada

This Research Topic presents bio-inspired and neurological insights for the development of intelligent robotic control algorithms. This aims to bridge the inter-disciplinary gaps between neuroscience and robotics to accelerate the pace of research and development.

Citation: Li, S., Li, Y., Khan, M. U., Luo, X., Wang, H., eds. (2019). Neural & Bio-inspired Processing and Robot Control. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-697-0

# Table of Contents


Zhijun Zhang, Yongqian Huang, Siyuan Chen, Jun Qu, Xin Pan, Tianyou Yu and Yuanqing Li

*20 Learning by Demonstration for Motion Planning of Upper-Limb Exoskeletons*

Clemente Lauretti, Francesca Cordella, Anna Lisa Ciancio, Emilio Trigili, Jose Maria Catalan, Francisco Javier Badesa, Simona Crea, Silvio Marcello Pagliara, Silvia Sterzi, Nicola Vitiello, Nicolas Garcia Aracil and Loredana Zollo

*34 Hybrid EEG–fNIRS-Based Eight-Command Decoding for BCI: Application to Quadcopter Control*

Muhammad Jawad Khan and Keum-Shik Hong


Junqing Yang, Ruituo Huai, Hui Wang, Wenyuan Li, Zhigong Wang, Meie Sui and Xuecheng Su

*63 A New Noise-Tolerant Obstacle Avoidance Scheme for Motion Planning of Redundant Robot Manipulators*

Dongsheng Guo, Feng Xu, Laicheng Yan, Zhuoyun Nie and Hui Shao

*76 Bio-Inspired Genetic Algorithms With Formalized Crossover Operators for Robotic Applications*

Jie Zhang, Man Kang, Xiaojuan Li and Geng-yang Liu

*88 Different-Level Simultaneous Minimization Scheme for Fault Tolerance of Redundant Manipulator Aided With Discrete-Time Recurrent Neural Network*

Long Jin, Bolin Liao, Mei Liu, Lin Xiao, Dongsheng Guo and Xiaogang Yan


# Editorial: Neural & Bio-inspired Processing and Robot Control

Ameer Hamza Khan<sup>1</sup> , Shuai Li <sup>1</sup> \*, Xuefeng Zhou<sup>2</sup> , Yangming Li <sup>3</sup> , Muhammad Umer Khan<sup>4</sup> , Xin Luo<sup>5</sup> and Huanqing Wang<sup>6</sup>

<sup>1</sup> Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong, <sup>2</sup> Institute of Intelligent Manufacturing, Guangdong Academy of Sciences, Guangzhou, China, <sup>3</sup> Robotic Autonomy and Collaboration (RAC) Lab, Rochester Institute of Technology, New York, NY, United States, <sup>4</sup> Department of Mechatronics Engineering, Air University, Islamabad, Pakistan, <sup>5</sup> Chongqing Institute of Green and Intelligent Technology (CAS), Chongqing, China, <sup>6</sup> Department of Systems and Computer Engineering, Carleton University Ottawa, Ottawa, ON, Canada

Keywords: bio-inspired, robot control, neural processing, processing, intelligent algorithm

**Editorial on the Research Topic**

**Neural & Bio-inspired Processing and Robot Control**

### INTRODUCTION

The special issue on Neural & Bio-inspired Processing and Robot Control has successfully completed its 3 years of activity. In the beginning, we anticipated that the modern understanding of the biological and neurological system can drive the progress in robotics research in the forward direction. This special issue was started with the aim to promote the inter-disciplinary interaction between bio-inspired systems, and robotics. The profound understanding of both biological and robotics systems will help in bridging the gaps in our understanding of complex problems.

#### Edited by:

Mehdi Khamassi, Centre National de la Recherche Scientifique (CNRS), France

#### Reviewed by:

Mathias Quoy, Université de Cergy-Pontoise, France Frederic Alexandre, Inria Bordeaux - Sud-Ouest Research Centre, France

#### \*Correspondence:

Shuai Li shuaili@polyu.edu.hk

Received: 19 September 2018 Accepted: 19 October 2018 Published: 08 November 2018

#### Citation:

Khan AH, Li S, Zhou X, Li Y, Khan MU, Luo X and Wang H (2018) Editorial: Neural & Bio-inspired Processing and Robot Control. Front. Neurorobot. 12:72. doi: 10.3389/fnbot.2018.00072 ABOUT THE RESEARCH TOPIC

We are pleased to present 11 research articles, related to robot control, motion planning, and learning. 59 authors from several Asian and European institutions contributed to these articles. The contributions presented in this special issue are focused on the application of bio-inspired insight for the improvement in performance and accuracy of robotic systems. Thus it can be claimed that this special issue plays an important role in the development of a rapidly growing field of research (Cuperlier et al., 2007; Pfeifer et al., 2007; Arbib et al., 2008; Khamassi et al., 2011; Caluwaerts et al., 2012; Li and Zhang, 2018) at the intersection of bio-inspired systems and robotics. In our selection of the research topics, we classified the contributions into two main groups: (a) those applying the insight obtained from the biological systems to improve the performance and accuracy of the control algorithms in robots and (b) those applying the latest developments in neuro-sciences to develop intelligent robotic systems capable of autonomous decision making. We believe that the impact of this research topic can be better described in term of the improvements in robotic systems performance and development of intelligent algorithms by taking inspiration from simple biological mechanisms present in the nature.

One of the novel contribution in this special issue presents an intention-driven mechanism to help the severely disabled people in performing daily tasks (Zhang et al.). They used noninvasive brain machine interface to decode the human intentions and developed a robot which can perform the tasks accordingly. Another contribution presents how the learning algorithms can be applied to teach a robot to perform different tasks (Lauretti et al.). Such medical applications of robotics system will greatly help in improving the life quality of patients suffering from severe disabilities. Another similar work uses fNIRS as a non-invasive brain machine interface to control the quadcopter using human thoughts (Khan and Hong.).

**4**

Another novel contribution presented in this special issue includes the development of noise tolerant and robust schemes for the robot's motion planning in complicated environments while minimizing the effect of model uncertainties and errors [Ding et al.; Yang et al.; Guo et al.). Besides, another contribution focuses on the theoretical analysis of the genetic algorithms for safety-critical robotic applications and presents mechanisms to prove their safety (Zhang et al.).

Redundancy resolution of redundant robotic manipulators (Jin et al.), motion planning (Xiao et al.; Guo et al.), and control and development of control algorithms to enable the robots to perform human-like postures, gestures, and movements (Tommasino and Campolo) also happens to be a topic of great interest for the contributions in this special issue. Such work will greatly improve the quality of interaction between humans and robots, paving the way for seamless incorporation of robotic system in our daily life.

### REFERENCES


### NEXT STEP

With the tremendous success of the past three Special issues of this Research Topic, we will ask for contributions to these research topics again in next year. We hope that our efforts will contribute to accelerating the robotic research and bridge the inter-disciplinary gaps between the bio-inspired systems, neuroscience, and robotics.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### ACKNOWLEDGMENTS

The authors gratefully acknowledge the contributions of participants in this Special issue.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Khan, Li, Zhou, Li, Khan, Luo and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **An Intention-Driven Semi-autonomous Intelligent Robotic System for Drinking**

*Zhijun Zhang, Yongqian Huang, Siyuan Chen, Jun Qu, Xin Pan, Tianyou Yu and Yuanqing Li\**

*School of Automation Science and Engineering, South China University of Technology, Guangzhou, China*

In this study, an intention-driven semi-autonomous intelligent robotic (ID-SIR) system is designed and developed to assist the severely disabled patients to live independently. The system mainly consists of a non-invasive brain–machine interface (BMI) subsystem, a robot manipulator and a visual detection and localization subsystem. Different from most of the existing systems remotely controlled by joystick, head- or eye tracking, the proposed ID-SIR system directly acquires the intention from users' brain. Compared with the state-of-art system only working for a specific object in a fixed place, the designed ID-SIR system can grasp any desired object in a random place chosen by a user and deliver it to his/her mouth automatically. As one of the main advantages of the ID-SIR system, the patient is only required to send one intention command for one drinking task and the autonomous robot would finish the rest of specific controlling tasks, which greatly eases the burden on patients. Eight healthy subjects attended our experiment, which contained 10 tasks for each subject. In each task, the proposed ID-SIR system delivered the desired beverage container to the mouth of the subject and then put it back to the original position. The mean accuracy of the eight subjects was 97.5%, which demonstrated the effectiveness of the ID-SIR system.

#### *Edited by:*

*Shuai Li, Hong Kong Polytechnic University, Hong Kong*

#### *Reviewed by:*

*Dongsheng Guo, Huaqiao University, China Weibing Li, University of Leeds, United Kingdom Ning Tan, National University of Singapore, Singapore Ke Chen, Tampere University of Technology, Finland*

*\*Correspondence:*

*Yuanqing Li auyqli@scut.edu.cn*

*Received: 28 June 2017 Accepted: 18 August 2017 Published: 08 September 2017*

#### *Citation:*

*Zhang Z, Huang Y, Chen S, Qu J, Pan X, Yu T and Li Y (2017) An Intention-Driven Semi-autonomous Intelligent Robotic System for Drinking. Front. Neurorobot. 11:48. doi: 10.3389/fnbot.2017.00048* **Keywords: assistive robot, neural network, semi-autonomous control, brain–machine interface, object recognition and localization**

## **1. INTRODUCTION**

Independent living is essential for the patients with motor deficit due to stroke, spinal cord injures, etc. (Kim et al., 2012;Carlson and del R Millan, 2013; Susko et al., 2016). In order to assist the patients to live independently, intelligent robotics technology is an attractive solution (Hochberg et al., 2012; Wu et al., 2015; He et al., 2016).

With less burden during the task execution period, it is a challenging work to accurately and realtime obtain the intentions of patients, locate the desired object, and efficiently control the robot manipulator to grasp the object and deliver it to the user. Evidently, intention obtaining approach, robot control, and object perception are three key points.

### **1.1. Intention Obtaining Approach**

Brain–machine interface (BMI) technology is one of the favored solutions, as it can decode directly the users' intentions in terms of their brain signals without nervous peripherals. In 1999, some researchers applied the invasive BMI to train rats to control a robot arm (Chapin et al., 1999).

**6**

In 2011, Kim et al. (2011) used the microelectrode array signals to control a point-and-click cursor, which made it possible for patients with tetraplegic to use the computer. Later, as a representative work, Hochberg et al. (2012) proposed an invasive BMI technology based on the microelectrode array signals, allowing two patients with long-standing tetraplegia to control a robotic arm for drinking. Even though the invasive BMI is a good solution, it needs an operation on users in advance. The patients may suffer from the expensive craniotomy and additional risks, such as infections and side effects from operations. Therefore, atraumatic non-invasive BMI technology is a better choice for most people.

As Onose et al. pointed out, EEG is the only realistically practical non-invasive BMI approach at present among the existing non-invasive BMI technologies, because it is relatively affordable and easy to set-up (Onose et al., 2012; Ferracuti et al., 2013; Li et al., 2013, 2016; Yu et al., 2013). Other non-invasive BMI technologies, such as functional magnetic resonance imagery, magneto-encephalography, and positron emission tomography, are quite expensive and not portable in terms of the size and electrical energy usage (Onose et al., 2012). Therefore, a number of EEG-based BMI paradigms and systems are exploited and developed in recent years (Schröer et al., 2015; Wang et al., 2015). Active/voluntary paradigm (e.g., Motor imagery, for short as MI) and passive paradigm (e.g., P300 and steady-state evoked potentials, for short as SSVEP) are two basic strategies for the interaction between users and computers. Although some researchers employed the MI-BMI to control a robot arm to perform a task of picking and placing (Wang et al., 2015), the disadvantages are inherent and difficult to accept, such as less control options, more preliminary training, low accuracy, and instability (Li and Yu, 2015). By contrast, P300 evoked potential is more suitable to detect users' intention. It has been verified that the P300 allows very high accuracy and more optional orders with little training time (less than 5 min), which is essential in practical applications (Prezmarcos et al., 2011; Li and Yu, 2015). In addition, P300-BMI systems do not require subjects to learn how to modulate their EEG, and the P300-BMI was about two times faster than the equivalent Mi-BMI systems (Prezmarcos et al., 2011). A comparison research between P300-BMI system and SSVEP-BMI system has also proved that P300-BMI is more robust for subjects, though SSVEP-BMI has higher bit rate (Lijing et al., 2012). Moreover, the SSVEP-BMI needs to flash consistently in real time to obtain the corresponding signals, which is more tiresome for users. Considering the safety, robustness and less burden, P300-BMI system is more suitable and applied in the ID-SIR system.

In order to improve the accuracy and information transmission rate, efficient classification algorithms are necessary. Among numerous P300-BMI applications, support vector machine (SVM) and linear discriminant analysis (LDA) have been used to achieve acceptable results (Lenhardt et al., 2008; Schröer et al., 2015; Simbolon et al., 2015). As pointed out in Lenhardt et al. (2008), compared with other complex classifiers such as SVM, LDA was capable due to its good classification performance as well as low computational and training requirements. Hoffmann et al. successfully applied the LDA to obtain high classification accuracies and bit rates for severely disabled subjects (Hoffmann et al., 2008). Different from most existing LDA-P300 systems with a fixed training-round number (Townsend et al., 2010; Akram et al., 2015; Chang et al., 2016), a self-adaptive Bayesian linear discriminant analysis algorithm is exploited in this paper to classify the P300 signals to obtain the user's intention. It can effectively decrease the cost of recognition time. The user's intention is then translated into control commands that are used to control the robot manipulator to execute desired tasks.

### **1.2. Robot Control**

For the severely disabled patients, the less brain burden the system brought in, the better patients may feel. The designed intention-driven semi-autonomous intelligent robotic (ID-SIR) system seeks to decrease the need for user continuously sending commands through "shared control" to realize it. Here, shared control means that it is a semi-autonomous robot, which only needs very limited high-level commands of users. It indicates that users do not need to continuously send instructions to the BMI system. In practical applications, the user only needs to send one command to "tell" the BMI block which object is desired. All the other work will be finished automatically by the robot.

### **1.3. Object Perception**

Object perception is realized by embedding with the computer vision. Considering the complexity of objects in home/hospital environments, a region-growing algorithm, and a deep convolutional neural network (CNN) are implemented in the system for cup detection, as well as a depth information based vision localization technology is exploited and applied. Compared with the state-of-art system with color-based classifier (Schröer et al., 2015), the deep CNN method is more powerful and accurate. For instance, robot in Schröer et al. (2015) can only grasp a very specific cup in a predefined place, but the proposed ID-SIR system can grasp any learnt object from any initial position in the range of vision and robot attainability.

Before ending this section, the main contributions of this paper lie as below.


The remainder of this paper is organized in four sections. Section 2 presents the whole system in detail. The methods used in the ID-SIR system are stated in Section 3. The experiment results are discussed in Section 4. Section 5 concludes this paper with final remarks.

# **2. SYSTEM OVERVIEW**

In this section, the working mechanism and information transmission process of the proposed ID-SIR system (as shown in **Figure 1**) is stated in detail. From **Figure 1**, we can see that the ID-SIR system includes triple layers, i.e., the perception layer, decision-making layer, and execution layer. The perception layer of the system includes a P300-based brain–machine interface subsystem and a visual detection and localization subsystem. The decision-making layer is about how to convert and transmit the intention of users to the control commands of robots. The execution layer is used for robot control.

First, in the BMI subsystem, an EEG cap and a direct-current amplifier (NuAmps) are applied to acquire brain signals. After preprocessing of the signals and feature extraction, a self-adaptive Bayesian linear discriminant analysis (SA-BLDA) algorithm is employed for classification, and the intention of the user is obtained. Finally, an intention command is sent to the decisionmaking layer and the visual detection and localization subsystem as an output signal.

Second, in the visual detection and localization subsystem, two Microsoft Kinects are applied as the vision input sensors. With the help of region growing algorithm and deep neural network, the positions of the beverage containers are detected and obtained. Applying the Kinect software development kit (SDK), the position

information of the user's mouth is detected. The position information of the desired beverage container and the user's mouth are then sent to the decision-making layer in real time.

Third, the decision-making layer works as a connector and coordinator between the other modules, which is responsible for information transition and decision-making. It should decide when and how to deliver which beverage container to the mouth of the user according to the inputs from perception layers and feedback from the execution layer.

Fourth, in order to grasp the desired beverage container and deliver it to the mouth flexibly, a robot manipulator of six degreesof-freedom (DOF) with three fingers (KINOVA JACO<sup>2</sup> robot manipulator) is applied. Through motion planning and control, the executive commands, generated by the decision-making layer, are well preformed on the robot manipulator to move along the expected path and finish the drinking task.

## **3. METHODS**

In this section, the algorithms and working mechanism of three layers in the ID-SIR system are presented in detail. Specifically, it includes perception layer (including BMI and computer vision), decision-making layer, and execution layer.

### **3.1. Brain–Machine Interface**

In this section, the BMI subsystem of the proposed ID-SIR system is stated in detail. Specifically, it includes data acquisition and amplification, graphical user interface, time series and control mechanism, and mapping intentions to execution commands.

### 3.1.1. Data Acquisition and Amplification

First of all, EEG cap is worn by the user and the software setting is prepared. With the application of the cap, the scalp signals referenced to the right ear are detected.

In the experiment, a 32-channel Quik-CapTM (from Compumedics, Neuroscan, Inc.) is employed. The horizontal electrooculograph (HEOG) and vertical electrooculograph (VEOG) are about eye movements that are not necessary in our data analysis process. Therefore, the two channels are ignored in the designed BCI of the ID-SIR system. The corresponding names of electrodes and distribution of remaining 30 channels are shown in **Figure 2**. As the P300 signals are mainly produced in parietal lobe and occipital lobe, most of the sampling electrodes distribute in these zones.

Second, the captured EEG signals from the cap is amplified, recoded, and transmitted to the computer by a NuAmps device (Compumedics). In the signal acquisition process, all the impedances of the electrodes should be less than 5 KΩ, the sampling rate of the signals is 250 Hz, and the output band pass of the NuAmps device is between 0.5 and 100 Hz.

### 3.1.2. Experiment GUI Design

In order to attain the user's intention to control the robot manipulator to deliver the object (such as a bottle or a cup) to the month of the user, a P300-based speller system with 4 symbols are designed (see **Figure 3**). It is displayed as a 2 *×* 2 matrices, and each button is attached with a white word and black background in idle periods.

As can be seen from **Figure 3**, the 4 symbol buttons of the GUI are "cup1," "cup2," "cup3," and "back." Here, the number of symbols denotes the intention which can drive the robot to grasp the *i*th object (cup/bottle) and deliver it to the user's mouth (i = 1, 2, and 3). Symbol button "back" denotes the intention that drives the robot to put the object back.

### 3.1.3. Time Series and Control Mechanism

In the proposed ID-SIR system, the user's intention is recognized by a self-adaptive P300-based BMI system that works with the time series shown in **Figure 4**. When the button flashes, it changes into green background and black words.

A session is a user's off-line training or online testing time period, i.e., a subject's time cost in the training/testing experiment. In the training process, a character training time is a trial; and in the testing process, a character recognition time is a trial.

In the proposed ID-SIR system, one session includes *N* trials and each trail corresponds to the recognition time cost of a symbol button. Moreover, one trial is divided into *M* rounds. The number of rounds *M* is a self-adaption value determined by the user's mental state. The time period of a round spans from the flash of the first button to the recover of the final button. The corresponding time is denoted by *t*round. In general, the more rounds it takes, the more accurate the recognition will be and the more time the system will spend.

In order to enhance the efficiency, a small number of rounds are expected if the accuracy is satisfied to some extent. According to the actual applications, a trial is set as 10 rounds in the proposed ID-SIR system. Each of the four buttons flashes only once per round, and the total time cost of a round is*t*round = 1.2 s. The stimulus duration is the time cost when one button keeps continuous

lighting. In this system, the stimulus duration is 100 ms. Moreover, it is not necessary to start one button's flashing after others finish. The delay time between one button's flash and another button's staring point is called inter-stimulus interval (ISI). The ISI is 200 ms in the ID-SIR system. An epoch is the time period within which P300 signal is recorded and detected. In other words, the P300 signal can be found in an epoch if the user pays attention to the flash button during the corresponding epoch. In the ID-SIR system, *t*epoch = 600 ms.

### 3.1.4. Mapping Intentions to Executive Commands

In order to accurately map the intentions to task commands, a GUI and a decision-making block are necessary. As can be seen from **Figure 3**, there are four executive commands totally, i.e., "cup1," "cup2," "cup3," and "back." During the task execution, the flashing button stimulates the eyes, and the P300 signals are detected, recognized, and converted to executive commands. The robot manipulator is driven by the executive commands to deliver the expected cup to the user's month.

To do so, a self-adaptive Bayesian linear discriminant analysis (SA-BLDA) algorithm is exploited. In this self-adaptive algorithm, the round number *M* is dynamically and automatically determined on the basis of the user's mind state and the quality of the signals. The presented SA-BLDA algorithm considers both of the accuracy and the recognition speed.

#### *3.1.4.1. BLDA Algorithm Description*

To recognize the acquired P300 signals, a Bayesian linear discriminant analysis (BLDA) is exploited in the proposed ID-SIR system. Considering a regularization parameter, the BLDA algorithm can avoid overfitting problem (Hoffmann et al., 2008).

(a) Assume that a training set (*x*, *t*) consists of *P* sampling points, denoted by vector *x ∈ R P* , and *x* = (*x*1, *· · ·* , *xP*) T . Since we need to estimate whether it is a P300 signal or not, it is a logical problem, and thus target value *t ∈* {*−*1, 1}.

For Bayesian regression theory, target values *t* consists of *x* linearly weighted by *w* with Gaussian noise *nnoise* as bellow.

$$\mathbf{t} = \mathbf{w}^{\mathrm{T}}\mathbf{x} + \mathbf{n}\_{noise}.\tag{1}$$

The uncertainty over the value of the target variable can be described by using a Gaussian probability distribution. That is to say, *t* has a Gaussian distribution with the mean *µ* = *w* T *x*, and the variance *σ* <sup>2</sup> = *β −*1 , i.e.,

$$p(\mathbf{t}|\mathbf{x}, \mathbf{w}, \boldsymbol{\beta}) = \mathcal{N}\left(\mathbf{t}|\boldsymbol{\mu}, \sigma^2\right) = \mathcal{N}\left(\mathbf{t}|\mathbf{w}^\mathrm{T}\mathbf{x}, \boldsymbol{\beta}^{-1}\right),\tag{2}$$

where parameter *β* is the reciprocal of the variance, which denotes the precision of the Gaussian probability distribution.

For the convenience of analysis, we suppose the P300 signals of all the trials are independently and identically distributed. If the number of training samples is denoted by *Q*, for *Q* independent experiment samples and *P* total sampling numbers, inputs *X* can be denoted as *X* = *{x*1*, x*2*, · · · , xQ} ∈ R P×Q* . Considering the number of functional keys *K*, number of trails *N*, and number of rounds *M*, the experiment samples *Q* = *N · M · K*. If *C*Channels channels are used, and the sampling number of a section selected P300 signal is denoted by *S*Samples, the total sampling number *P* = *C*Channels*·S*Samples. According to the definition of a joint probability, the joint probability of independent experiment samples is determined by the product of the marginal probabilities for each sample value separately. Therefore, the likelihood function is

$$\begin{split} p(t|\mathbf{X},\mathbf{w},\boldsymbol{\beta}) &= \prod\_{n=1}^{Q} \mathcal{N}\left(t\_{n}|\mathbf{w}^{\mathrm{T}}\mathbf{x}\_{n}, \boldsymbol{\beta}^{-1}\right), \\ &= \left(\frac{\boldsymbol{\beta}}{2\pi}\right)^{Q/2} \cdot \exp\left(-\frac{\beta ||\mathbf{X}^{\mathrm{T}}\mathbf{w} - \mathbf{t}||^{2}}{2}\right). \end{split} \tag{3}$$

(b) For utilizing the Bayesian framework and for the convenience of analysis, a prior distribution over the polynomial coefficients *w* is considered. For simplicity, a zero mean Gaussian distribution is formulated as

$$\begin{split} \boldsymbol{\varrho}(\boldsymbol{\omega}|\boldsymbol{\alpha}) &= \mathcal{N}\Big(\boldsymbol{\omega}|\boldsymbol{0}, \boldsymbol{\alpha}^{-1}\boldsymbol{I}\Big) \\ &= \left(\frac{\boldsymbol{\alpha}}{2\pi}\right)^{\frac{p+1}{2}} \left(\frac{\boldsymbol{\epsilon}}{2\pi}\right) \exp\Big(-\frac{\boldsymbol{\alpha}}{2}\boldsymbol{\mathsf{w}}^{\mathsf{T}}\boldsymbol{I}'(\boldsymbol{\alpha})\boldsymbol{\omega}\Big), \end{split} \tag{4}$$

where parameter *α* decides the precision of this Gaussian distribution. For the linear regression with *P*th order polynomial, the total element number of feature vector *w* is *P* + 1. In practical applications, parameter *ε* is usually a small value. Matrix *I* is a unit matrix, and *I ′* (*α*) is

$$I'(\alpha) = \begin{bmatrix} \alpha & 0 & \cdots & 0 \\ 0 & \alpha & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \epsilon \end{bmatrix}.$$

Based on Bayes theorem (Bishop, 2006), the posterior distribution for *w* is

$$p(\mathbf{w}|\mathbf{X}, \mathbf{t}, \alpha, \beta) = \frac{p(\mathbf{t}|\mathbf{w}, \beta)p(\mathbf{w}|\alpha)}{\int p(\mathbf{t}|\mathbf{w}, \beta)p(\mathbf{w}|\alpha)d\mathbf{w}}.\tag{5}$$

For simplify, training set {*X*, *t*} can be replaced by **D**. Equation (3) is reformulated as

$$p(\mathbf{D}|\mathbf{w}, \beta) = \left(\frac{\beta}{2\pi}\right)^{Q/2} \cdot \exp\left(-\frac{\beta \left\|\mathbf{X}^\mathrm{T}\mathbf{w} - \mathbf{t}\right\|^2}{2}\right),\tag{6}$$

and equation (5) can be rewritten as

$$p(\mathbf{w}|\mathbf{D},\alpha,\beta) = \frac{p(\mathbf{D}|\mathbf{w},\beta)p(\mathbf{w}|\alpha)}{\int p(\mathbf{D}|\mathbf{w},\beta)p(\mathbf{w}|\alpha)d\mathbf{w}}.\tag{7}$$

From equation (7), we see that the posterior distribution of *w* is proportional to the product of the prior distribution and the likelihood function, i.e.,

$$
\mathfrak{p}(\mathfrak{w}|\mathsf{D},\alpha,\beta) \propto \mathfrak{p}(\mathsf{D}|\mathsf{w},\beta)\mathfrak{p}(\mathsf{w}|\alpha),\tag{8}
$$

where *w* can be determined by finding the most probable value of *w* given data set {*X*, *t*}. In equation (8), the likelihood *p*(*D*|*w*, *β*) and prior *p*(*w*|*α*) are computed by equations (6) and (4), respectively. The posterior distribution of *w* is Gaussian because both of the prior and likelihood are Gaussian, and the mean *m* and covariance *C* are

$$\mathcal{I}m = \beta \left(\beta \mathbf{X} \mathbf{X}^{\mathrm{T}} + I'(\alpha)\right)^{-1} \mathbf{X}t,\tag{9}$$

$$\mathbf{C} = \left(\beta \mathbf{X} \mathbf{X}^T + I'(\alpha)\right)^{-1},\tag{10}$$

where *α* and *β* can be computed by an iterative algorithm (Mackay, 1992).

(c) When a new input sample ˆ*x* is obtained, the distribution function of its predictive regression valueˆ*t* is

$$p(\hat{t}|\beta,\alpha,\hat{\mathbf{x}},\mathbf{D}) = \int p(\hat{t}|\beta,\hat{\mathbf{x}},\mathbf{w})p(\mathbf{w}|\beta,\alpha,\mathbf{D})d\mathbf{w}.\tag{11}$$

The predictive distribution (11) is also a Gaussian distribution, and the mean and variance are, respectively, as

$$\boldsymbol{\mu} = \boldsymbol{m}^{\mathrm{T}} \hat{\mathbf{x}}, \ \sigma^{2} = 1/\beta + \hat{\mathbf{x}}^{\mathrm{T}} \mathbf{C} \hat{\mathbf{x}}.\tag{12}$$

In this ID-SIR system, the decision is made by mean *µ*.

#### *3.1.4.2. Self-Adaptive Algorithm Design*

First, during each stimulus period, epoch data need to be preprocessed. Specifically, the sampled EEG data (about 150 discrete points) in 600 ms in each channel are filtered by a narrowband filter with frequency 0.1–20 Hz. In order to compress the data, the narrowband signal data are then sampled again once every 6 points. They are denoted by symbol *S*Samples (see **Figure 5**). All the 30-channel signals (i.e., *C*Channels = 30) are combined as a new vector *x* with *P* = *C*Channels*·S*Samples dimensions. During online test,

4 functional keys flash per round, and we can get 4-epoch EEG data. It means 4 feature vectors can be obtained at each round.

Second, in order to recognize the 4 characters (i.e., "cup1," "cup2," "cup3," and "back"), an SA-BLDA algorithm is exploited, and the corresponding flowchart is shown in **Figure 6**. When the first P300 signal of one trail comes, the round number *M* is set to zero after the system initializes. When a new EEG signal including 4-epoch data of a new round comes, the round number *M* is set to be *M* + 1. Afterward, all the EEG signal data of 4 epochs at the *M*th round are preprocessed and 4 feature vectors are constituted (each vector includes 30 channels data). The algebraic mean value of the previous *M* rounds feature vectors is computed. In the SA-BLDA algorithm, 4 characters are used, so 4 averaged eigenvectors corresponding to the 4 characters are obtained (i.e., each character corresponding to one ˆ*x* in (12)). From equation (12), 4 regression scores (i.e., *µ* in equation (12)) can be obtained. These scores are then normalized between 0 and 1, and denoted by notation *S*. Parameters *M*min and *M*max denote the minimum and maximum number of repeated rounds, respectively. In the proposed ID-SIR system, *M*min = 3 and *M*max = 8. Threshold *θ*<sup>0</sup> is set in view of training results. The specific selection method is described in the next section.

### *3.1.4.3. Selection of Threshold Parameter θ<sup>0</sup>*

Selection of threshold parameter *θ*<sup>0</sup> is a balance issue between classification accuracy and information transfer rate (ITR). A practical system is expected to have high classification accuracy and ITR. To achieve this aim, curves of accuracy and ITR with various *θ*<sup>0</sup> are firstly presented. In the ID-SIR system, since the ITR drops while *θ*<sup>0</sup> increases, *θ*<sup>0</sup> is set at the point where the curve of accuracy first reaches its highest value. A concrete application example is illustrated in Section 4.

### **3.2. Visual Detection and Localization**

In order to realize the automatic task of assistive drinking, it is essential to recognize and locate the desired object as well as the

user's mouth. As shown in **Figure 7**, two Kinect sensors are applied to execute perception tasks. One is placed in front of the user to detect the position of the user's mouth, while the other is set up beside the table to recognize and locate the cup, bottle, and pop can. The robot manipulator is placed on one side of the table

**FIGURE 7** | An illustration of coordinate transformation.

between Kinect and the user's chair. In the ID-SIR system, the desired object (such as a cup, a bottle, and a pop) can be put at anywhere in the cross field of Kinect's scanning zone and robot manipulator's working region (i.e., the area around the black and white calibration board in **Figure 7**).

In the ensuing sections, the coordinate transformations from camera coordinate system to the world coordinate system and further to the robot coordinate system are first discussed. The methods of the mouth and object (cup, bottle, and pop can) detection and localization are then analyzed in detail.

#### 3.2.1. Coordinate Transformation

In order to control the robot manipulator to grasp and move an object, the position information of the object in the robot coordinate system needs to be known.

First, camera calibration and transformation from the camera coordinate system to the calibration-board coordinate system are implemented.

The camera coordinate systems of two Kinects (denoted by K1 and K2), calibration-board coordinate systems (denoted by C1 and C2), and robot coordinate system (denoted by R) are shown in **Figure 7**. The relationship between the camera coordinate system and the calibration-board coordinate system is formulated as

$$
\begin{bmatrix} \mathbf{X\_K} \\ \mathbf{Y\_K} \\ Z\_\mathbf{K} \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{^K R} & \mathbf{^K T} \\ \mathbf{0} & 1 \end{bmatrix} \begin{bmatrix} \mathbf{X\_C} \\ \mathbf{Y\_C} \\ Z\_\mathbf{C} \\ 1 \end{bmatrix}, \tag{13}
$$

where *X*K, *Y*K, and *Z*<sup>K</sup> represent the three-dimensional position information in the camera coordinate system of Kinect; *XC*, *YC*, and *Z<sup>C</sup>* represent the three-dimensional position information in the calibration-board coordinate system; *<sup>K</sup> <sup>C</sup>R* and *<sup>K</sup> <sup>C</sup>T* stand for rotation matrix and translation matrix.

In the ID-SIR system, a common camera calibration method is used to determine intrinsic and extrinsic parameters of Kinect (Zhang, 2000), with which parameters *<sup>K</sup> <sup>C</sup>R* and *<sup>K</sup> <sup>C</sup>T* are obtained. By using the SDK of Kinect, the three-dimensional position information of all the points of the object is obtained. The method of getting the three-dimensional position information of the object and mouth in the camera coordinate system will be illustrated in the following sections.

Second, the three-dimensional position information of the object and mouth in the camera coordinate system is transformed into the calibration-board coordinate system as

$$
\begin{bmatrix} \mathbf{X\_{C}} \\ \mathbf{Y\_{C}} \\ \mathbf{Z\_{C}} \\ 1 \end{bmatrix}\_{\text{Obect}} = \begin{bmatrix} \mathbf{\stackrel{K}{C}R} & \mathbf{\stackrel{K}{C}T} \\ \mathbf{0} & 1 \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{X\_{K}} \\ \mathbf{Y\_{K}} \\ \mathbf{Z\_{K}} \\ 1 \end{bmatrix}\_{\text{Object}},\tag{14}
$$

where *<sup>K</sup> <sup>C</sup>R* and *<sup>K</sup> <sup>C</sup>T* are obtained during camera calibration of Kinect.

Third, the three-dimensional position information of the object and mouth in the calibration-board coordinate system is transformed into the robot coordinate system as

$$
\begin{bmatrix} X\_{\text{R}} \\ Y\_{\text{R}} \\ Z\_{\text{R}} \\ 1 \end{bmatrix}\_{\text{Obect}} = \begin{bmatrix} \prescript{\text{R}}{\text{C}}{R} & \prescript{\text{R}}{\text{C}}T \\ \mathbf{0} & 1 \end{bmatrix}^{-1} \begin{bmatrix} X\_{\text{C}} \\ Y\_{\text{C}} \\ Z\_{\text{C}} \\ 1 \end{bmatrix}\_{\text{Object}},\tag{15}
$$

where *<sup>R</sup> <sup>C</sup>R* and *<sup>R</sup> <sup>C</sup>T* stand for rotation matrix and translation matrix.

Fourth, the three-dimensional position information of the object (cup, bottle, and pop can) and mouth in the robot coordinate system are sent to decision-making layer to implement the drink delivering task.

#### 3.2.2. Object Detection and Localization

As mentioned above, a Kinect sensor is employed to collect the three-dimensional point cloud in the camera coordinate system. We first implement a plane extraction algorithm for background detection and elimination. Next, an object segmentation in the non-background proportion of the point cloud is applied. According to the collection of potential objects' three-dimensional point sets in the camera coordinate system, the corresponding RGB images of potential objects are isolated and identified with the recognition algorithm based on the library which includes images of the target object. After the recognition and coordinate transformation, the three-dimensional position information of the selected potential object in the robot coordinate system is obtained and sent to the decision-making layer to implement robot manipulator control.

#### *3.2.2.1. Background Extraction*

In order to recognize and locate the desired object on the table rapidly and accurately, plane extraction for background– foreground separation is essential. In the ID-SIR system, a region growing (RG) algorithm is exploited to search the horizontal background plane *H*Plane.

In the point cloud, we assume that the horizontal plane is a plane where all the normal vectors of points are nearly perpendicular. According to this assumption, all the neighboring points with nearly perpendicular norm vectors are considered as the points on the same horizontal plane. Based on this hypothesis, the RG algorithm is developed, and the corresponding flowchart is shown in **Figure 8**.

First of all, the normal vectors of each point in the point cloud are calculated. Without loss of generality, the point P, whose coordinate information is (*xk*, *yk*, *zk*) in the point-cloud space (or termed camera coordinate system), maps to the point of which the pixel coordinate is (*ik*, *jk*) in the pixel space (or termed image coordinate system). As shown in **Figure 9**, the normal vector*⃗v*<sup>P</sup> of

**FIGURE 8** | A flowchart describing the procedure of the region growing algorithm.

point P*<sup>k</sup>* is computed as

$$
\vec{\nu}\_{\mathbb{P}} = \vec{\nu}\_1 \times \vec{\nu}\_2,\tag{16}
$$

where *⃗v*<sup>1</sup> = P<sup>1</sup> *−* P<sup>3</sup> and *⃗v*<sup>2</sup> = P<sup>2</sup> *−* P4, P1(*ik*, *j<sup>k</sup>−*<sup>1</sup>), P2(*ik*+1, *jk*), P3(*ik*, *jk*+1), and P4(*i<sup>k</sup>−*<sup>1</sup>, *jk*) are the four surrounding points beside *P<sup>k</sup>* in the image coordinate system. All the normal vectors of the points in the point cloud are computed according to equation (16).

Second, search all the normal vectors that are nearly parallel to the perpendicular direction and add them into a potential horizontal plane set *M*Point. Here, the point P*<sup>k</sup>* can be seen as a seed (i.e., a starting point) of the region growing, in which the four surrounding points P<sup>S</sup> of the seed are checked whether their normal vectors are perpendicular and the distance D<sup>S</sup> between P<sup>S</sup> and the seed are smaller than a threshold value DThreshold.

The surrounding qualified points are collected into the potential planar point set *M*Point and inserted into a queue. They work as new seeds of the region growing. The circulation of the region growing will stop only when the queue is empty. Moreover, if the number of potential point set *n<sup>M</sup>*Point is larger than a certain value *n*C, the potential planar point set *M*Point would be added into the plane set *C*Plane. Finally, when the scanning of all the normal vectors *⃗v*<sup>P</sup> is completed, the plane set *C*Plane will be output as the horizontal plane *H*Plane.

#### *3.2.2.2. Object Segmentation*

In order to segment the expected object from the background, convex hull searching and two-times region growing (RG) algorithms are exploited. The flowchart of the algorithm is shown in **Figure 10**. The schematic diagram of two-times region growing algorithm is illustrated in **Figure 11**.

First, according to plane set *C*Plane, the convex hulls of objects in the RGB image are computed. A convex hull is the minimum polygon, which roughly describes the outline of an object.

Second, two-times region growing algorithm is proposed to obtain a complete object. The first-time region growing is applied to obtain all the point sets within the convex hulls, and the secondtime region growing algorithm is used to handle the convex hull boundary so as to obtain a complete object. Specifically, there are three steps.


*•* **Step 3.** In order to avoid erroneous judgment of the points near the convex hull boundary, two-times region growing algorithm is exploited to obtain the complete object. In **Figure 11**, the green dotted line represents the convex hull and the red solid line represents the object region after two-times region growing process. If a point belonging to object point set *M*Object is on the convex hull boundary (i.e., the yellow points on the green dotted line), then the point is considered as a seed of twotimes region growing. If any of the four points (i.e., the orange points) surrounding the seed are outside the convex hull and

the distance between two points is less than a threshold, then the corresponding points surrounding the seed are considered as the part of the object and put into potential object point set *M*Object. In addition, the points surrounding the seed are considered as new seeds as the next round judgment until there are no such points. Finally, all the potential object point sets *M*Object are put into the total object set *O*.

#### *3.2.2.3. Object Recognition*

In order to recognize objects effectively, a deep convolutional neural network (CNN) is designed and applied. Specifically, the architecture of our CNN is presented in **Figure 12**.

The network contains eight layers with weights: the first four are convolutional layers and the remaining are fully connected layers. Every convolutional layer is followed by a max-pooling layer with kernels of size 2 *×* 2. The neurons in the fully connected layers are linked to all neurons in the previous layer. The rectified linear units (Relu) is applied to every convolutional layer and fully connected layer as the activation function.

The first convolutional layer filters the 3 *×* 200 *×* 200 input image with 32 kernels of size 3 *×* 3 *×* 3. The second convolutional layer takes the max-pooled output of the first convolutional layer and filters it with 64 kernels of size 32 *×* 3 *×* 3. The third convolutional layer has 128 kernels of size 64 *×* 3 *×* 3 connected to the max-pooled output of the second convolutional layer. The fourth convolutional layer has 256 kernels of size 128 *×* 3 *×* 3. After the convolutional layers, a flatten layer is employed to transformed the multidimensional feature maps into single dimensional feature maps, which can be put into the fully connected layers. Four fully connected layers have 256, 128, 64, and 4 hidden units, respectively. Between the third and forth fully connected layers, "dropout" technique is applied to reduce the overfitting problem by setting the output of each hidden neuron to zero with probability 0.5. The output of the last fully connected layer is connected to a 4-way softmax which produces a distribution over the 4 class labels (i.e., background, cup, bottle, and pop can).

#### *3.2.2.4. Object Location*

After the recognition, the position information of the desired beverage container in the camera coordinate system is calculated

as the mean value of position information of all the points of the actual object point set. With the coordinate transformation from the camera coordinate system to the robot coordinate system (see Section 3.2), the three-dimensional position information of the desired beverage container in the robot coordinate system is obtained and is sent to the decision-making layer.

### *3.2.2.5. Mouth Detection and Localization*

In order to complete the automatical assistive drinking task, the position information of the user's mouth is required. As mentioned at the beginning of Section 3.2, a Kinect sensor is put in front of the user and capture the mouth. With the assistance of the Kinect SDK 2.0, the 3D location of the user's mouth in the camera coordinate system is obtained. By using the coordinate transformation mentioned in Section 3.2.1, the three-dimensional position information of the user's mouth in the robot coordinate system is obtained and is sent to the decision-making layer.

### **3.3. Robot Manipulator Control**

As shown in **Figures 7** and **13**, KINOVA JACO<sup>2</sup> robot manipulator is employed in the ID-SIR system. The robot manipulator has six joints and three fingers. Each finger has a controllable joint and a passive joint. When the controllable joint is grasping an object, the passive joint rotates automatically so that it can hold the object more firmly. Consequently, the robot manipulator has an adequate ability to grasp an object firmly and deliver it to the user's mouth steadily.

By using the official API, the end-effecter of the robot manipulator (i.e., the three fingers) can be controlled to move from an initial position to an expected position automatically and smoothly. Therefore, only several separated key points in the task space are required to obtain the continuous tracking trajectories of joint space. At present, only the positions of the desired beverage container and the user's mouth are variable. The remaining position points in the delivering process are predefined. With the consideration of the manipulator's stability and user's safety during the task execution, joint velocities of the manipulator are limited at an appropriate speed. Moreover, the manipular state, including position and direction information, is captured and transferred to the robot controller in real time so as to perform accurate control. The manipular state is also sent back to the decision-making layer to make sure that the task is finished.

**FIGURE 13** | The ID-SIR system assists a user for drinking (the individual agrees to publish his photo).

# **4. EXPERIMENTS AND USER STUDY**

The study was approved by the Ethics Committee of South China University of Technology. Written informed consent was obtained from each subject. In order to verify the effectiveness of the proposed ID-SIR system, two experiments are designed: one is the CNN training and the other is whole system evaluation. Moreover, comparisons among existing BMI-based assistive robotic systems and our ID-SIR system are also presented. **Figure 13** shows a scenario of a user drinking with the help of the ID-SIR system.

### **4.1. CNN Training**

In order to train our CNN to recognize the desired object, a specific data set needs to be established. Without loss of generality, we task three kinds of objects (i.e., a cup, a bottle, and a pop) as an example. The data set was designed to contain 4 classes, i.e., cup, bottle, pop can, and background. Thus, 26,564 images in total, approximately 6,500 samples for each class, were gathered through a Kinect applying the region growing algorithm. The data set was then divided into training set and validation set randomly with a rate of 7:3.

Before training, data augmentation was implemented as generating new images with rescaling and horizontal reflections to reduce the overfitting problem. After 5 epoch of training with "adam" optimization scheme, our CNN finally achieved 0.9905 accuracy on the validation set.

### **4.2. Whole System Evaluation**

Eight volunteers were asked to attend the evaluation experiment. The whole system evaluation process consisted of two parts: offline training and online testing. These volunteers were all healthy subjects (19–21 years old), among which only one subject (i.e., subject 8) had experience in using P300-based BMI system before and the other seven subjects had no experience in BMI system.

#### 4.2.1. Off-Line Training

The EEG signal data were acquired by the following three steps. First of all, a target symbol was given randomly by the computer and displayed in the text box above the four buttons. Second, the subject was asked to pay attention to the given target symbol. Third, the buttons flashed in a random order. Each subject had to complete 40 off-line trials (i.e., *N* = 40) and the chain of potential signals, including useful EEG P300 signals and noises from 30 channels, were recorded in this training process.

After the data acquisition, the data set was processed by the method of self-adaptive Bayesian linear discriminant analysis (SA-BLDA) illustrated in Section 3.1, and the classifier model of the subject was obtained, which was employed to detect the intention command in online testing process.

**Figure 15** illustrates the relationship among accuracy, ITR and *θ*<sup>0</sup> in the SA-BLDA algorithm. As analyzed in Section 3.1.3, parameter *θ*<sup>0</sup> is set at the point where the curve of accuracy first reaches its highest value. According to this rule, the final selections of *θ*<sup>0</sup> of all the subjects and the corresponding accuracy and ITR are listed in **Table 1**. From the table, we can see that the off-line training process is fast, and the accuracy is high. Specifically, all the accuracies are greater than 95%, and all the ITR are less than 20 bits/min.

#### 4.2.2. Online Testing

During the online testing process, each subject was asked to control the robot manipulator to finish 10 times assistive drinking tasks. In each task, the subject chose a beverage container through the P300-based BMI subsystem and controlled the robot to deliver the beverage container to his mouth, and then sent the "back" command to drive the robot manipulator to send the drink back to its original position. Evidently, two commands were required to complete each task: (i) grasp and deliver the desired beverage container to the month, (ii) put the beverage container back. Therefore, during the online testing experiment, each subject was asked to finish 20 control commands (i.e., trial number *N* = 20). The snapshots of a subject experiencing one assistive drinking task are shown in **Figure 14**.

The experimental results of eight subjects' online testing are shown in **Tables 2** and **3**. In the second and third columns of **Table 2**, the average round number *M*<sup>a</sup> and the corresponding average time of P300 signal recognition *t*P300 of each subject are presented, respectively. The fourth and fifth columns list eight average time and average accuracy when users completed 10 times drinking tasks. It is worth pointing out that a drinking task includes delivering process and returning process. In other words, the time cost of a drinking task includes time periods of P300 signal recognition, object recognition, object localization, and robot operating. As seen from **Table 3** that the mean time of P300 signal recognition is 5.25 s and the average time of completing one task is 84 s in the online testing. The average accuracy of 10 times drinking tasks controlling the robot manipulator is 97.50%. The eight online experiments verify the effectiveness of the proposed ID-SIR system.

**Table 3** shows the evaluation of the eight subjects to the proposed ID-SIR system after their experiences. The first four questions are about the functions of the ID-SIR system and the average scores are 4.25, 4.75, 4.13, and 4.75, respectively. These four high scores demonstrate well that the ID-SIR system is very capable and suitable for the assistive drinking tasks. The scores of Q5 and Q6 (reaching to 3.5 and 4.25, respectively) shows that subjects did not bear so much burden in the experiment and the user experience of the ID-SIR system is acceptable. The 4.13 score of the last question indicates that it is possible for the ID-SIR

**FIGURE 14** | Snapshots of a subject completing one delivering task by using the ID-SIR system (The individual agrees to publish his photo).

**TABLE 1** | Selections of *θ*<sup>0</sup> and the corresponding accuracy and ITR of eight subjects during off-line training process.


**TABLE 2** | Results of ten times online assistive drinking testing.


**TABLE 3** | Evaluation of eight subjects in experiments.


system to continue to perform experiments on patients with stroke and neurodegenerative diseases.

### **4.3. Comparisons with the Existing Systems**

In order to highlight the advantages and effectiveness of our system, comparisons among existing BMI-based assistive robotic systems and the ID-SIR system are shown in **Table 4**.

As shown in **Table 4**, a robotic system in Hochberg et al. (2012) first applied the invasive MI-based BMI technology with a robot manipulator to complete foam balls reaching and grasping tasks and achieved the accuracy as 95.6% (touch) and 62.2% (grasp) spending about 7 s per task. Later, a female patient with tetraplegia and anarthria was assisted by the system to drink coffee from a bottle speeding more than 85 s each time with 67.7% accuracy. However, this system is inefficient and cause great burden on users. Users have to concentrate continually to control the robot

manipulator in real time. Besides, sensors need to be implanted in users' brains and more than 1 month is required for the operation recovery and training. The robotic assistive systems in Wang et al. (2015) and (Katyal et al., 2013) employed non-invasive BMI technology and eye-tracking technology to a control robot manipulator to grasp or pick objects. Besides, vision algorithms, such as Euclidean clustering extraction (ECE) algorithm or sample consensus (SC) algorithm, were also used to locate objects in RGB-D images. However, they did not consider about detection or assistive drinking problems. Regarding the assistive drinking problem, the system in Schröer et al. (2015) incorporated noninvasive MI-based BMI technology with object localization and mouth detection to control the robot. However, the system took almost 2 min to complete one task and the color-based classifier for recognizing a specific colorful plastic cup limited the choices for users.

In order to overcome the deficits of existing systems listed in **Table 4**, our ID-ARR system applies non-invasive



P3000-based BMI technology to complete the assistive drinking task automatically and reduce great burdens on users. It only requires users to have short time training at the beginning and concentrate only two times to give out commands during each whole drinking process. Besides, two-times region growing algorithm and convoluted neural network are applied to recognize and locate the object, which are more effective and generalizable in practical environments.

### **5. CONCLUSION**

In this paper, an intention-driven semi-autonomous intelligent robotic (ID-SIR) system has been designed. The system is composed of a P300-based brain–computer interface (BMI) subsystem, a robot manipulator and an automatic-visual-inspection subsystem. It can detect a desired object and deliver it to the mouth of the user. In order to detect the intention of the user, a self-adaption Bayesian linear discriminant analysis algorithm has been exploited and performed to improve training efficiency and accuracy. Besides, a novel two-times region growing algorithm has been proposed to obtain the complete object. One of the important contributions of this paper is that the combination of BMI and semi-autonomous robot technologies eases the burden on the brain and satisfy user's assisted-living requirement. By using our system, eight subjects successfully complete 10 times assistive drinking tasks with satisfactory accuracies (*≥*97.5%). The experiment results have verified the capability of the proposed ID-SIR system and the corresponding algorithms. Compared with

### **REFERENCES**


the existing BMI system, the advantages of the proposed ID-SIR system are that (1) the object is not predefined and can be put at anywhere in the cross field of sensor's scanning zone and robot manipulator's region and (2) both the accuracy and efficiency are considered in the P300-BMI subsystem. Further studies will be conducted to set up the system on a mobile platform and investigate the practical performance on patients.

### **AUTHOR CONTRIBUTIONS**

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

### **FUNDING**

This research has been supported by the National Natural Science Foundation (No. 61603142 and No. 61633010), Guangdong Foundation for Distinguished Young Scholars (No. 2017A030306009), Science and Technology Program of Guangzhou (No. 201707010225), the Fundamental Research Funds for Central Universities (No. 2017MS049), Scientific Research Starting Foundation of South China University of Technology, National Key Basic Research Program of China (973 Program, No. 2015CB351703), Guangdong Natural Science Foundation (No. 2014A030312005), and National Training Program of Innovation and Entrepreneurship for Undergraduates (No. 201710561206).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Zhang, Huang, Chen, Qu, Pan, Yu and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Learning by Demonstration for Motion Planning of Upper-Limb Exoskeletons**

*Clemente Lauretti <sup>1</sup> \*, Francesca Cordella<sup>1</sup> , Anna Lisa Ciancio<sup>1</sup> , Emilio Trigili <sup>2</sup> , Jose Maria Catalan<sup>3</sup> , Francisco Javier Badesa<sup>4</sup> , Simona Crea<sup>2</sup> , Silvio Marcello Pagliara<sup>5</sup> , Silvia Sterzi <sup>6</sup> , Nicola Vitiello2,7, Nicolas Garcia Aracil <sup>3</sup> and Loredana Zollo<sup>1</sup>*

*<sup>1</sup>Research Unit of Biomedical Robotics and Biomicrosystems, Università Campus Bio-Medico, Rome, Italy, <sup>2</sup> The BioRobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy, <sup>3</sup>Biomedical Neuroengineering Research Group, Miguel Hernandez University, Elche, Spain, <sup>4</sup>Departamento de Ingeniería en Automática, Electrónica, Arquitectura y Redes de Computadores, Universidad de Cádiz, Cádiz, Spain, <sup>5</sup>GLIC—Italian Network of Assistive Technology Centers, Bologna, Italy, <sup>6</sup>Unit of Physical and Rehabilitation Medicine, Università Campus Bio-Medico, Rome, Italy, <sup>7</sup> Fondazione Don Carlo Gnocchi, Firenze, Italy*

The reference joint position of upper-limb exoskeletons is typically obtained by means of Cartesian motion planners and inverse kinematics algorithms with the inverse Jacobian; this approach allows exploiting the available Degrees of Freedom (i.e. DoFs) of the robot kinematic chain to achieve the desired end-effector pose; however, if used to operate non-redundant exoskeletons, it does not ensure that anthropomorphic criteria are satisfied in the whole human-robot workspace. This paper proposes a motion planning system, based on Learning by Demonstration, for upper-limb exoskeletons that allow successfully assisting patients during Activities of Daily Living (ADLs) in unstructured environment, while ensuring that anthropomorphic criteria are satisfied in the whole human-robot workspace. The motion planning system combines Learning by Demonstration with the computation of Dynamic Motion Primitives and machine learning techniques to construct task- and patient-specific joint trajectories based on the learnt trajectories. System validation was carried out in simulation and in a real setting with a 4- DoF upper-limb exoskeleton, a 5-DoF wrist-hand exoskeleton and four patients with Limb Girdle Muscular Dystrophy. Validation was addressed to (i) compare the performance of the proposed motion planning with traditional methods; (ii) assess the generalization capabilities of the proposed method with respect to the environment variability. Three ADLs were chosen to validate the system: drinking, pouring and lifting a light sphere. The achieved results showed a 100% success rate in the task fulfillment, with a high level of generalization with respect to the environment variability. Moreover, an anthropomorphic configuration of the exoskeleton is always ensured.

**Keywords: motion planning, machine learning, learning by demonstration, dynamics movement primitives, assistive robotics**

# **1. INTRODUCTION**

Understanding trajectory planning in human movements plays a paramount role in upper-limb exoskeletons for rehabilitation and assistive purposes because of the tight physical human-robot interaction. A typical strategy for determining the desired trajectory to be tracked by the exoskeleton in complex tasks, such as the Activities of Daily Living (ADLs), is to replicate human movements

#### *Edited by:*

*Shuai Li, Hong Kong Polytechnic University, Hong Kong*

#### *Reviewed by:*

*Yuqing Lin, Beijing Institute of Technology, China Leslie Samuel Smith, University of Stirling, United Kingdom Long Jin, Lanzhou University, China*

> *\*Correspondence: Clemente Lauretti c.lauretti@unicampus.it*

*Received: 27 July 2017 Accepted: 31 January 2018 Published: 23 February 2018*

#### *Citation:*

*Lauretti C, Cordella F, Ciancio AL, Trigili E, Catalan JM, Badesa FJ, Crea S, Pagliara SM, Sterzi S, Vitiello N, Garcia Aracil N and Zollo L (2018) Learning by Demonstration for Motion Planning of Upper-Limb Exoskeletons. Front. Neurorobot. 12:5. doi: 10.3389/fnbot.2018.00005* (An et al., 1988). Joint trajectories from unimpaired volunteers, caregivers, or therapists can be pre-recorded and later executed by the robotic system throughout specific mapping methods, i.e., spline decomposition (Jiang et al., 2013), or else optimization of *ad hoc* developed objective functions (Provenzale et al., 2014). However, these methods are successful in structured environments, since they cannot manage variability in the environment and external perturbations.

For ADLs in unstructured environment, a Cartesian motion planner can be conveniently adopted (Marchal-Crespo and Reinkensmeyer, 2009) and a purposely developed mathematical model of human motor behavior should be formulated in order to plan the desired trajectories in a way similar to humans. This is the case, for example, of the minimum jerk criterion (Flash and Hogan, 1985) or the minimum torque model (Svinin et al., 2010) for point-to-point reaching tasks.

For the exoskeletons, the approach based on Cartesian motion planning requires that inverse kinematics (IK) (Kim et al., 2012) is applied for computing joint motion, with the consequent increase of the computational burden. Moreover, the traditional IK algorithm with inverse Jacobian allows exploiting the available DoFs of the robot kinematic chain to achieve the desired end-effector pose; however, it does not guarantee that anthropomorphic criteria in the whole human-robot workspace are satisfied, especially in non-redundant structures. Alternative methods that account for anthropomorphic configurations in the joint space are based on the computation of the swivel angle. It can be estimated by means of geometric methods (Mihelj, 2006) or analytical methods based on the augmented Jacobian (Papaleo et al., 2015); however, in the case of non-redundant exoskeletons (as most of the commercially available ones (Marchal-Crespo and Reinkensmeyer, 2009)), the computation of the swivel angle causes the reduction of the number of Cartesian DoFs to be controlled, since the swivel angle is computed in lieu of one of the controlled Cartesian coordinates; as a consequence, this entails a reduction of the success rate in the fulfilment of the ADLs. Other approaches are based on hybrid Cartesian joint motion planners (Pattacini et al., 2010); nevertheless, as for the methods based on the computation of the swivel angle, they cannot be adopted in non-redundant exoskeletons without reducing the Cartesian DoFs to be controlled.

An alternative approach is represented by Learning By Demonstration (LbD), where the human subject is observed during the task execution and the robotic system replicates the learnt movement. It allows avoiding motion planning in the Cartesian space and inverse kinematics, but it requires to learn the target joint configuration to be reached through supervised learning. In literature, supervised learning methods, based on NNs, are widely used by researchers to learn the IK of redundant and nonredundant robots as in Oyama et al. (2001). Due to their adaptability to several contexts, NNs are employed in several robotic applications. In Li et al. (2017), they are used for redundancy resolution in presence of noise; in Jin and Li (2016) and Jin et al. (2017), NNs are adopted for motion control of multiple cooperating redundant manipulators; and in Noda et al. (2014), they are used for robot motion generation based on data from multimodal sensory systems. Nevertheless, to the best of our knowledge, how learning methods based on NNs can improve performance of LbD approaches during the learning of motion features and robot IK is not fairly explored.

This work proposes a motion planning system grounded on LbD for generating reference trajectories in the joint space for upper-limb exoskeletons, starting from the observation of the human motion during the execution of ADLs. The paper contribution is mainly addressed to extend the LbD approach in Ijspeert et al. (2013) for the control of upper-limb exoskeletons and to significantly improve it by introducing a Neural Network (NN), that learns the motion features and the robot inverse kinematics. The proposed method offers the following three main advantages with respect to the available techniques used in literature to plan the motion of upper-limb exoskeletons (i.e., motion planning in the Cartesian space and inverse kinematics): (i) it does not require the formulation of mathematical models of human motor behavior in order to accomplish the task in a way similar to humans; (ii) it allows performing the task also in unstructured environments (where a variability can be caused, for example, by the object position changes and subject different anthropometries); (iii) it guarantees the task accomplishment in the feasible workspace by preserving anthropomorphic configurations of the assisted human arm.

The proposed motion planner is based on Dynamic Movement Primitives (DMPs), with a well-defined landscape attractor (Ijspeert et al., 2013). This attractor allows replicating the recorded trajectory by means of a weighted sum of optimally spaced Gaussian Kernels; weight parameters (DMP parameters) are extracted from demonstrated movements with a Locally Weighted Regression (LWR) algorithm and are used to train a neural network through supervised learning. The neural network has the aim to define DMP parameters and joint target position and receives in input context factors (such as object position or task type). The DMP parameters are then processed by the DMP computation module that provides the exoskeleton reference joint trajectories.

The proposed motion planner was tested on an upper-limb exoskeleton during ADLs tasks. The exoskeleton was made of a 4-DoF shoulder-elbow exoskeleton (i.e., NeuroExos Shoulder-Elbow Module (NESM) (Crea et al., 2016)) for reaching movements, and a 5-DoF wrist-hand exoskeleton responsible for the grasping phase. The system was experimentally validated on four patients with Limb Girdle Muscular Dystrophy (LGMDs). They were asked to perform one ADL (i.e., the drinking task) and two activities belonging to the Southampton Hand Assessment Procedure (SHAP) clinical test (i.e., pouring and lifting a light sphere, consisting in reach-grasp-move-release a spherical object). The position of the object to be grasped was acquired by means of an external camera (Optitrack).

A comparative analysis with the traditional approach based on path planning and IK for upper-limb exoskeletons was carried out. Moreover, the data acquired during the experimental session were used to assess the generalization capability of the proposed motion planning system with respect to the different anthropometry of the patients and the different object positions. Performance of the proposed motion planning system was measured through a set of performance indicators, consisting of success rate, distance from target position, distance from the physiological behavior, and computational burden.

The paper is organized as follows: in Section 2, the exoskeleton, the proposed motion planner, and the experimental setup and protocol are presented. Experimental results are illustrated and discussed in Section 3 and Section 4, respectively. Finally, conclusion and future works are reported in Section 5.

## **2. MATERIALS AND METHODS**

### **2.1. Exoskeletons**

The upper-limb exoskeleton used to validate the proposed motion planning system is shown in **Figure 1**. It consists of the NESM 4- DoF exoskeleton and a 5-DoF Wrist-hand exoskeleton, described in the following.

### 2.1.1. NESM

NESM is an upper-limb exoskeleton consisting of four active DoFs addressing the shoulder abduction/adduction (sA/A), flexion/extension (sF/E) and internal/external rotation (sI/E), as well as the elbow flexion/extension (eF/E) movements (Crea et al., 2016). Additional passive degrees of freedom and size regulations are included within the kinematic chain to improve the safety and wear ability of the device: this system automatically compensates for joint misalignments of the elbow and shoulder complex and allows users with different anthropometries wearing the device.

Each actuation unit has a series-elastic actuator (SEA), comprising a DC motor and reduction gear in series with a custom spring. Two absolute encoders placed at both sides of the spring allow sensing the joint torque by measuring the spring deformation, and at the same time, the encoder mounted more proximally to the human joint provides the joint angular value. By virtue of the SEA architecture, both position and torque control strategies have been implemented.

The sA/A and sF/E actuation units are identical and are able to withstand peak torques up to 60 *Nm*. Similarly, the sI/E and eF/E

**FIGURE 1** | NESM upper-limb exoskeleton with the wrist-hand exoskeleton.

actuation units can deliver up to 30 *Nm* of peak torques. These features make the exoskeleton suitable to assist users having highly reduced or null residual motion capabilities of their upper arm. Notably, in this study, the position control modality is employed to perform completely passive mobilization of the user's arm.

Each joint can move within the following range of motion (the zero configuration is with the arm parallel to the trunk): 0° to*−*90° for sA/A and sF/E, *−*75° to 25° for sI/E and 0° to 120° for eF/E.

### 2.1.2. Wrist-Hand Exoskeleton

The wrist-hand exoskeleton is composed of two modules, the hand and the wrist, that can be used separately or in combination. The wrist exoskeleton guarantees the activation of the prono/supination movements. It consists of a DC motor with a reduction stage, which drives a geared ring guide. The guide is attached to an orthosis that aligns the forearm with the guide axis. Joint limits are mechanically provided, but, if necessary they can be reduced via software for increasing the safety in the humanrobot interaction.

The hand exoskeleton has 4 active DoFs: F/E of the index finger Metacarpophalangeal (MCP) joint, F/E of the middle finger MCP joint, F/E of the ring and little finger MCP joints, and F/E of the thumb MCP joint. A linkage mechanism between the M regulator as well. When a reference MCP and the Proximalinterphalangeal (PIP) joint is adopted on each finger and is driven by a linear actuator, for moving both PIP and MCP joints. A unique linear actuator is used for driving the PIP and MCP joints of both the third and the fourth fingers. The thumb A/A is fixed in a suitable position.

The wrist exoskeleton can be easily connected to the shoulderelbow exoskeleton. In fact, by simply removing the forearm cuff from the NESM, the cuff integrated to the wrist exoskeleton can be attached to the output frame of the elbow actuation unit. The resulting device is a full-arm robotic exoskeleton.

## **2.2. Low Level Control (LLC)**

The control system used to operate the NESM implements two control strategies: joint position and joint torque control modes. When controlled in position, each actuation unit drives the joint position along a reference value or trajectory. The controller is based on a proportional-integral-derivative (PID) regulator, which operates on the difference between the reference joint angle and the measured one. The output is a current commanded to the driver of the SEA actuation unit to provide the torque necessary to achieve the movement with null steady-state error.

In the torque control mode, each motor is controlled to provide a certain amount of torque. The closed-loop torque controller output is dependent on the difference between the desired joint torque and the measured one, and it is built on a PID regulator as well. When a reference torque of 0 *Nm* is commanded on each joint, the device can be used in transparent mode, allowing the user to freely move the arm. Conversely, the wrist module and the hand exoskeleton could be controlled only in position; the controller used to operate these devices is based on a PID regulator, which operates on the difference between the reference joint angle and the measured one.

### **2.3. Motion Planning Based on LbD for Upper-Limb Exoskeletons**

The proposed motion planning for upper-limb exoskeletons is shown in **Figure 2**. A variation of LbD method used in Ijspeert et al. (2013) is presented. In particular, in this work, differently from Ijspeert et al. (2013), a combination of DMP and supervised learning is adopted with the aim of avoiding motion planning in the Cartesian space and inverse kinematics. The proposed motion planning consists of two main stages, named off-line neural network training and DMP computation. In the off-line neural network training, the trajectories executed by a healthy human subject, e.g., the therapist or the caregiver, are recorded by means of motion tracking devices such as magneto inertial sensors or the robot itself when backdriven, and distinctive features, named DMP parameters, are subsequently extracted using a LWR algorithm ("Motion recording and DMP parameters extraction" block in **Figure 2**). Hence, a neural network is trained through the Levenberg-Marquardt (LM) supervised learning algorithm in order to associate DMP parameters and robot joint target position to context factors taken in input (i.e., object position and task to be performed).

In the DMP computation, the patient can perform an ADL task with the assistance of the exoskeleton. Depending on the task and object position, the trained neural network provides the proper set of DMP parameters and robot joint target positions for computing the set of DMPs that best fit the desired task ("DMP computation" block).

### 2.3.1. DMP Computation

The computation of the DMPs is obtained through the resolution of a non-linear second order system, expressed as

$$
\tau \ddot{q} = \alpha\_q \left( \beta\_q \left( \mathbf{g} - q \right) - \dot{q} \right) + f \tag{1}
$$

where *τ* is a time constant, *α<sup>q</sup>* and *β<sup>q</sup>* are positive constants, *q*<sup>0</sup> and *g* are the initial and final points of the trajectory, respectively, and *f* is a forcing term that implements the landscape attractor of the system. In equation (1), *q* refers to a generic joint position of the robot that needs to be computed for each joint of the exoskeleton (i.e., *q*1, *q*<sup>2</sup> *. . . q*5).

A possible formulation of the forcing term, namely the landscape attractor (Ijspeert et al., 2013), is

$$f(\mathbf{x}) = \frac{\sum\_{i=1}^{N} \Psi\_i(\mathbf{x})\omega\_i}{\sum\_{i=1}^{N} \Psi\_i(\mathbf{x})} \mathbf{x}(\mathbf{g} - \mathbf{q}\_0) \tag{2}$$

where *ω<sup>i</sup>* is the DMP parameters, namely the weight parameters adopted to reconstruct the recorded motion, while *x* is the state variable of the system that makes equation (1) a time-independent system. It is defined as

$$
\tau \dot{\mathfrak{x}} = -\alpha\_{\mathfrak{x}} \mathfrak{x} \tag{3}
$$

where *α<sup>x</sup>* is a positive constant. On the other hand, Ψ*i*(*x*) is Gaussian kernels expressed as

$$\Psi\_i(\mathbf{x}) = \exp\left(-\frac{1}{2\sigma\_i^2} \left(\mathbf{x} - c\_i\right)^2\right) \tag{4}$$

where *σi*, *ci*, and *N* are the width, the centers, and the number of Gaussian functions, respectively. The state variable *x* as well as centers *c<sup>i</sup>* range between 0 and 1.

As in our previous work (Lauretti et al., 2017a), an optimized spatial allocation of the Gaussian kernels is adopted, depending on the complexity of the recorded trajectory. Hence, *c<sup>i</sup>* and *σ<sup>i</sup>* are defined as

$$c(x) = \frac{\int\_0^x V\_\epsilon(z) \, dz}{||\int\_0^1 V\_\epsilon(z) \, dz||} \tag{5}$$

$$V\_{\varepsilon}(z) = 1 - \alpha\_{\varepsilon} \sum\_{k=1}^{P} \exp\left(-\beta\_{\varepsilon} \left(z - z\_{k}\right)\right) \tag{6}$$

$$
\sigma(\mathbf{x}) = \gamma\_z \frac{V\_c(\mathbf{x})}{N} + \delta\_z \tag{7}
$$

$$c\_i = c\left(\frac{i}{N}\right) \tag{8}$$

$$
\sigma\_i = \sigma \left(\frac{i}{N}\right) \tag{9}
$$

where *αz*, *βz*, *γz*, and *δ<sup>z</sup>* are positive constants, *P* is the number of critical points of the recorded trajectory, and *z<sup>k</sup>* is the normalized time instant of each critical point. A graphical representation of *c* and *σ* functions is provided in **Figure 3**.

#### 2.3.2. Off-Line Neural Network Training *2.3.2.1. DMP Parameters Extraction*

DMP parameters *ω<sup>i</sup>* are extracted through a LWR algorithm (Schaal and Atkeson, 1998). The recorded motion and derivatives, i.e., *qd*, *q*˙ *<sup>d</sup>*, and ¨*q<sup>d</sup>* are inserted in the forcing term in equation (2) as follows

$$f\_t = \tau \ddot{q}\_d - \alpha\_q \left(\beta\_q \left(\mathbf{g} - q\_d\right) - \dot{q}\_d\right),\tag{10}$$

and a function approximation problem is formulated. Hence, a locally weighted quadratic error is minimized by means of the following cost function

$$J\_i = \sum\_{t=1}^{p} \Psi\_i(t) \left( f\_t(t) - \omega\_i \epsilon(t) \right)^2 \tag{11}$$

$$\epsilon(t) = \mathbf{x} \left(\mathbf{g} - q\_0\right) \tag{12}$$

and *ω<sup>i</sup>* parameters that make *f<sup>t</sup>* as close as possible to *f* are found, for each kernel function Ψi(*t*), in order to reconstruct the trajectory *qd*, *q*˙ *<sup>d</sup>*, and ¨*qd*, through *q*, *q*˙, and ¨*q*, respectively. In equation (12), *ϵ* is the error between the target joint position to be reached *g* and the initial joint position of the exoskeleton *q*0.

#### *2.3.2.2. Neural Network*

A Levenberg-Marquardt algorithm (LM) has been adopted for the off-line neural network training (Lourakis, 2005). Given a

parameter vector *p ∈ ℜ<sup>n</sup>* and a measurement vector *x ∈ ℜ<sup>m</sup>* , the LM algorithm finds the functional relation (*f*) that maps the parameter vector *p* into an estimated measurement ˆ*x* (ˆ*x* = *f*(*p*)). A linear approximation of *f* in the neighborhood of *p* is provided by a Taylor series expansion

$$f(\mathfrak{p} + \delta\_{\mathfrak{p}}) = f(\mathfrak{p}) + f\delta\_{\mathfrak{p}} + o(\mathfrak{p}) \tag{13}$$

Neglecting the higher order terms *o*(*p*), equation (12) could be approximated as

$$f(\mathbf{p} + \delta\_{\mathbb{P}}) \approx f(\mathbf{p}) + I\delta\_{\mathbb{P}} \tag{14}$$

where *J* is the Jacobian matrix *<sup>δ</sup>f*(*P*) *δP* .

At each step of the iterative process, LM looks for the *δ<sup>p</sup>* that minimizes the error defined as *∥x − f*(*p* + *δp*)*∥* = *∥x − f*(*p*) + *Jδp∥* = *∥ϵ−Jδp∥*. The error is minimized when *Jδp*–*ϵ* is orthogonal to the column space of *J*, namely when the following condition holds

$$J^T(J\delta\_p - \epsilon\_p) = 0 \tag{15}$$

$$\mathbf{J}^T \mathbf{J} \boldsymbol{\delta}\_{\mathfrak{p}} = \mathbf{J}^T \boldsymbol{\epsilon}\_{\mathfrak{p}}.\tag{16}$$

In the LM method, equation (16), called *normal equation*, is written as

$$\mathbf{N}\delta\_{\mathbf{p}} = \mathbf{J}^{\mathrm{T}}\epsilon\_{\mathbf{p}} \tag{17}$$

$$\mathbf{N} = \mu + \mathbf{J}^T \mathbf{J} \tag{18}$$

where *J T J* and *µ* are called damping and damping term, respectively. One iteration of LM algorithm consists of finding an acceptable value of the damping term that reduces the error *ϵp*. In other words, if *δ<sup>p</sup>* computed from equation (17), leads to a reduction of the error *ϵp*, the damping term is decreased and the following iteration is processed; otherwise, the damping term is increased and equation (17) is solved again. The LM algorithm stops running when, at least, (i) *J T ϵ<sup>p</sup>* of equation (17) is lower than a preset threshold *ϵ*<sup>1</sup> or (ii) *δ<sup>p</sup>* is lower than a threshold *ϵ*<sup>2</sup> or (iii) a maximum number of iteration *NMAX* is reached. For the sake of brevity, the complete LM algorithm is not shown; further theoretical details about the implemented method could be found in Lourakis (2005).

The structure of the adopted neural network is reported in **Figure 4**. A two layer feed-forward network with *M* sigmoid hidden neurons and *N* + 1 linear output neurons is used for each joint and for each task the user wants to perform. The inputs of each network are the Cartesian target positions to be reached, *Px*, *Py*, and *P<sup>z</sup>* (e.g., object position); on the other hand, the outputs of each network are the DMP parameters, *ω*1, *ω*2*. . .ωN*, and the target joint angles, *Q<sup>i</sup>* (N is the number of DMP parameters computed for the i-th joint).

#### *2.3.2.3. Adapting NN Outputs to Different Subject Anthropometries*

In order to adapt the proposed method to different human bodies, a recursive method that adjusts the NN outputs for distinct subject anthropometries was used. Its functioning principle is shown in **Figure 5**.

In the block scheme, *P<sup>d</sup>* is the Cartesian target position to be reached, *Q* is the output configuration of the robot joints, subject 1 is the person involved in the NN training phase while subject 2 is the assisted person, who wants to perform an ADL thanks to the exoskeleton assistance. It is worth noting that, with the aforementioned exoskeletons and the described tasks, two loops of the recursive algorithm are suitable to obtain an acceptable error in reaching the target position (less than 10 mm).

### **2.4. Traditional Path Planning and IK**

A simple path planning, based on a third-order polynomial function, was implemented in order to generate Cartesian trajectories with null velocity at the beginning and at the end of the movement. It can be written as

$$z = -2\frac{z\_f - z\_i}{D^3}t^3 + 3\frac{z\_f - z\_i}{D^2}t^2 + z\_i \tag{19}$$

where *z* is the desired exoskeleton Cartesian pose, z*<sup>f</sup>* and *z<sup>i</sup>* are the final and initial desired Cartesian pose, respectively, and *D* is the motion duration.

Hence, two IK methods were adopted to generate the reference joint position for the exoskeleton. They are IK based on the computation of the swivel angle (named in the following *IK with swivel angle*) and IK with the inverse Jacobian (named in the following *IK Inverse Jacobian*).

#### 2.4.1. IK with Swivel Angle

The IK algorithm with swivel angle was *ad hoc* developed for a 4-DoF spherical-revolute (S-R) manipulator (i.e., the shoulderelbow exoskeleton), based on geometrical considerations. An additional constraint was imposed to calculate the analytical solution for the last revolute DoF of the upper-limb exoskeleton (i.e., the wrist prono-supination). For the sake of clarity, the Denavit-Hartenberg model and parameters of the upper-limb exoskeleton are reported in **Figure 6**.

The IK algorithm for the shoulder-elbow exoskeleton manages three Cartesian coordinates and one orientation coordinate and consists of the following steps:

*•* Being the target position known (vector*⃗p*), the solution for the elbow angle is derived geometrically:

$$q\_4 = \pi - a \cos\left(\frac{d\_3^2 + d\_5^2 - |\vec{p}|^2}{2d\_3d\_5}\right) \tag{20}$$


$$q\_1 = \operatorname{atan} \left( \frac{\mathbf{y}\_{o\_3}}{\mathbf{x}\_{o\_3}} \right) \tag{21}$$

$$q\_2 = a \cos \left(\frac{z\_{o\_3}}{d\_3}\right) \tag{22}$$

$$q\_3 = -\arccos\left(\frac{z\_{ec} - d\_3 \cos q\_2 - d\_5 \cos q\_2 \cos q\_4}{d\_5 \sin q\_4 \sin q\_2}\right) \quad \text{(23)}$$

	- (1) Palm of the hand pointing downward (for pouring and sphere reaching-moving):

$$q\_5 = \arctan\left(\frac{\cos q\_4 (\cos q\_1 \sin q\_3 + \cos q\_2 \cos q\_3 \sin q\_1)}{+ \sin q\_1 \sin q\_2 \sin q\_4}\right) \tag{24}$$

(2) Palm of the hand pointing left (for drinking):

$$q\_5 = \arctan\left(\frac{\cos q\_1 \cos q\_3 - \cos q\_2 \sin q\_1 \sin q\_3}{\cos q\_4 (\cos q\_1 \sin q\_3 + \cos q\_2 \cos q\_3 \sin q\_1)}\right) \tag{25}$$

#### 2.4.2. IK with Inverse Jacobian

The IK algorithm with inverse Jacobian is well-described by the following equation (Siciliano et al., 2010),

$$
\dot{q} = J\_A^{-1}(q)(\dot{\mathbf{x}}\_d + \mathbf{K}\mathbf{e})\tag{26}
$$

where *J −*1 *A* is the analytical inverse Jacobian computed on the kinematic chain of **Figure 6**, *q* and *q*˙ are the joint angle and its derivative, respectively, *x*˙ *<sup>d</sup>* is the desired velocity in the Cartesian space, K is a positive definite matrix (usually diagonal), and *e* is the *operational space error* defined as *e* = *xd*–*xe*. The desired joint configuration *q* is obtained by numerically integrating equation (26) through the Euler method.

### **2.5. Experimental Setup**

The experimental platform for the validation of the proposed motion planning system based on LbD is shown in **Figure 7**.

A user graphical interface is used to show the action to perform to the subject. The control system architecture consisted in a finite-state machine, which divides the main task (i.e., drinking, pouring and reach-grasp-move-release a sphere) into several elementary actions (corresponding to the subtasks listed in **Table 1**) that the different devices can accomplish (e.g., waiting for the trigger, reaching the glass, grasping, etc.). Each subtask is triggered by the user by means of the combined M-IMU/EMG interfaces, letting him/her to control the exoskeletons. An abort function was also implemented in the state machine to safely abort the execution of the task at any time.

The communication within the subsystems composing the platform is managed by the Yet Another Robotic Platform (YARP) messaging system. The motion commands acquired by the user are sent, through the YARP server, to the exoskeletons. All the acquired data are synchronized and saved under YARP.

The platform components are shown in **Figure 8** and are detailed in the following.

### 2.5.1. User Interface

The interface adopted to detect the user movement intention is based on the combined use of 2 push-buttons and 2 M-IMUs (Lauretti et al., 2017b).

The 2 push-buttons were placed on a table in order to be activated by the index and the thumb fingers and to be used as a switch. Moreover, the two M-IMUs (XSens MTw) were placed on the user trunk and head in order to detect his/her neck motion. An Awinda Station was used to record at 100 Hz of synchronized wireless data from the two M-IMUs.

By means of the developed interface the user may exploit: (i) the head yaw motion in the negative direction to operate the upper-limb exoskeleton movements and the head yaw motion in the positive direction to abort the task; (ii) the index finger and thumb residual motion to trigger the hand opening and the hand closing.

### **2.6. Experimental Protocol**

#### 2.6.1. Off-Line Neural Network Training

The developed neural network was trained off-line on a healthy volunteer subject (with upper arm length *LUpper Arm* = 0.33 *m* and forearm length *Lforearm* = 0.3 *m*). He was asked to perform the drinking task, with 41 different glass positions (**Figure 9A**) and two activities belonging to the SHAP clinical test, i.e., pouring (for 15 different positions of the glass and the bottle as in **Figure 9B**) and reach-grasp-move-release a sphere (for 25 different positions of the sphere as in **Figure 9C**). Each task was performed 5 times per each object position and arm motion was recorded. The shoulder motion, i.e., the sA/A, sF/E, sI/E, and

#### **TABLE 1** | Tasks description.

#### **Task 1: Drinking**


eF/E movements were recorded through the NESM used in transparent mode; conversely, the wrist Prono-Supination wP/S was recorded by means of two M-IMUs placed on the subject forearm and hand.

About 70% of the recorded data was used to train the neural network; the remaining 30% was used to validate and test the neural network in order to avoid over-fitting issues.

#### 2.6.2. DMP Computation and Control

The experimental session was aimed to measure performance of the proposed motion planning system, compare with the traditional approach based on inverse kinematics described in Section 2.4 and assess generalization capability. The system was tested during the fulfillment of the same ADLs used for training, i.e., drinking, the pouring, and reach-grasp-move-release a sphere. In the following, they are named task 1, 2, and 3, respectively. Additionally, each task is divided into a number of subtasks listed in **Table 1**.

The validation was performed in simulation and in the real setting with patients. Simulation tests allowed evaluating system

**FIGURE 8** | A representative subject performing the task (the subject signed an informed consent document to authorize publication of this picture).

performance in the whole human-robot workspace (238, 75, and 125 object positions were considered for task 1, 2, and 3, respectively). On the other hand, in the real setting, system performance was assessed on four patients with Limb Girdle Muscular Dystrophy (LGMDs). They, aged 38.5 on average (Standard Deviation 14.6), volunteered to participate in this study. The experimental protocol was approved by the local Ethical committee (Comitato Etico Università Campus Biomedico di Roma, reference number: 01/17 PAR ComEt CBM), by the Italian Ministry of Health (Registro—classif. DGDMF/I.5.i.m.2/2016/1096) and complied with the Declaration of Helsinki. The subjects were asked to perform three repetitions of the three tasks thanks to the assistance of the 4-DoF upper-limb and 5-DoF wrist-hand exoskeletons (3 object positions for each task were considered in this case).

### *2.6.2.1. Comparative Analysis (CA) with Inverse Kinematics Methods*

The CA was aimed to measure performance of the proposed motion planning based on LbD during the control of the exoskeleton and compare the results with the traditional approaches based on path planning and inverse kinematics described in Section 2.4. The comparative analysis (CA) was carried out in simulation on a subject modeled with 30 cm upper arm and forearm lengths and with 238, 75, and 125 different object positions.

#### *2.6.2.2. Generalization Capability Assessment (GCA)*

The GCA was aimed to evaluate the generalization level of the proposed motion planning with respect to the different anthropometries of the patients and the different object positions. First, it was tested in simulation environment (GCA–sim) for 238, 75, and 125 object positions (for task 1, 2, and 3 respectively) per 25 different subject anthropometries, i.e., the combination of the following upper arm and forearm lengths: *LUpper Arm* = 30 cm, 32 cm, 34 cm, 36 cm, 38 cm and *Lforearm* = 30 cm, 32 cm, 34 cm, 36 cm, 38 cm. Subsequently, the proposed motion planning was tested on the four patients (GCA–real), with *LUpper Arm* = 33 *cm* and *Lforearm* = 30 cm, 30 cm, 35 cm, 37 cm, for 3 object positions per task.

System performance was measured through three quantitative indicators reported below.

#### *2.6.2.3. Performance Indices*

The proposed performance indicators are: (i) *Position Err1*, *Orientation Err1*, *Position Err2*, *Orientation Err2*, (ii) PhJL (iii) Success Rate and (iv) mean cycle time. They are aimed at evaluating (i) distance from target position, (ii) distance from anthropomorphic configurations, taking into account the physiological joint limits, (iii) the success rate of the task execution, and (iv) the computational burden.

### *2.6.2.4. Distance from Target Position*

The error was measured during subtasks 1-1, 2-1, 3-1, and 3-2 (**Table 1**) as

$$Position\,Err = \frac{1}{2}\sqrt{\left(\mathbf{x}\_t - \mathbf{x}\right)^2 + \left(y\_t - y\right)^2 + \left(z\_t - z\right)^2} \tag{27}$$

$$\text{Orientation } Err = ||\alpha\_t - \alpha||\tag{28}$$

where *xt*, *yt*, and *z<sup>t</sup>* are the coordinates of the target position and *x*, *y*, and *z* are the coordinates of the actual position reached by the robot end-effector; *α<sup>t</sup>* is the desired angle *α* that needs to be 0 for a successful task fulfillment; *α* is illustrated in **Figure 10** and is defined for subtask 1-1 and 2-1 as

$$\alpha\_1 = a \cos \left( \frac{X\_{\text{ce}}^T Y\_0}{||X\_{\text{ce}}|| \, ||Y\_0||} \right) \tag{29}$$

Conversely, for subtask 3-1 and 3-2 *α* is defined as

$$\alpha\_2 = a \cos \left( \frac{Z\_{\text{cc}}^T Y\_0}{||Z\_{\text{cc}}|| \, ||Y\_0||} \right) \tag{30}$$

where *Y*0, *Xee*, and *Zee* are defined in the base reference frame [*XB*, *YB*, *ZB*] as *Y*<sup>0</sup> = [0, 1, 0], *Xee* = *T ee <sup>B</sup>* [1*,* 0*,* 0*,* 1] and *Zee* = *T ee <sup>B</sup>* [0*,* 0*,* 1*,* 1] (*T ee <sup>B</sup>* is the base/end-effector transformation matrix).

For subtask 2-2, the position and orientation error are expressed as

$$PositionErr\_{2-2} = \frac{1}{2}\sqrt{\left(\mathbf{x}\_{\text{bottle}} - \mathbf{x}\_{\text{glass}}\right)^2 + \left(z\_{\text{bottle}} - z\_{\text{glass}}\right)^2} \tag{31}$$

**FIGURE 10** | **(A)** a graphical representation of the end-effector and the base reference frame is shown; **(B)** the *α* angle for task 1-1, 2-1 is shown; **(C)** the *α* angle for task 3 is shown; **(D)** The base reference frame and bottle, end-effector, and glass reference frames are shown; **(E)** the *β* angle for task 2-2 is shown.

$$\text{Orientation } Err\_{2-2} = ||\beta\_t - \beta||\tag{32}$$

$$
\begin{bmatrix} \chi\_{bottle} \\ \mathcal{Y}\_{bottle} \\ Z\_{bottle} \\ 1 \end{bmatrix} = \begin{bmatrix} \mathbf{c}^e \\ \mathbf{c}^e \\ \mathbf{0} \\ 1 \end{bmatrix} \tag{33}
$$

$$\beta = \frac{\pi}{2} - a \cos \left( \frac{X\_{\text{cc}}^T Y\_0}{||X\_{\text{cc}}|| \, ||Y\_0||} \right) \tag{34}$$

where *xbottle*, *zbottle*, *xglass*, and *zglass* are expressed in the [*XB*, *YB*, *ZB*] reference frame (**Figure 10**), *T bottle ee* is the end-effector/bottletip transformation matrix during the whole subtask 2-2 (i.e., when the hand exoskeleton is grasping the bottle) and *β*<sup>t</sup> is the desired *β* that needs to range from 0 to *<sup>π</sup>* 3 in orderto successfully accomplish the pouring task. Thus, defining *β<sup>t</sup>* = 0, an acceptable value of the orientation error, for a successful task fulfillment, ranges from 0 to *<sup>π</sup>* 3 .

#### *2.6.2.5. Distance from the Physiological Joint Limits*

The distance from the physiological joint limits is measured to assess the level of anthropomorphism of the reached configuration during motion. It is expressed as

$$PHJL = \left\| \frac{2(q\_i - \bar{q}\_i)}{q\_{iM} - q\_{im}} \right\|\,\tag{35}$$

where *q<sup>i</sup>* is the actual position of the i-th joint, *qiM* and *qim* are the upper and lower physiological limit of the i-th joint, respectively, and ¯*q<sup>i</sup>* is the mean value between *qiM* and *qim*. An acceptable value of *PhJL* for the considered tasks ranges in between 0 and 1.

#### *2.6.2.6. Success Rate of the Task Execution*

The success rate of the task execution is evaluated as

$$\text{Success rate} = \frac{N\_{\text{succ}}}{N\_{\text{tot}}} \cdot 100\tag{36}$$

where *Nsucc* is the number of trials successfully accomplished and *Ntot* is number of all the performed trials. Trials of tasks 1 and 3 are considered successfully accomplished if all the following conditions are satisfied:


Conversely, trials of Task 2 are considered successfully accomplished if all the following conditions are satisfied:


The aforementioned ranges were experimentally retrieved.

#### *2.6.2.7. Computational Burden*

The computational burden of the three compared methods is evaluated through the mean cycle time; it is the time required to complete one cycle of the algorithm that computes the desired joint trajectory starting from the object position and the task type. The computational time of the 3 methods was evaluated under the same hardware conditions (Processor: Intel(R) Core(TM)2 Duo CPU 3.00 GHz) and development environment (MATLAB R2014b).

#### *2.6.2.8. Statistical Analysis*

For motion planner comparative analysis, mean value and SD of the computed performance indices were calculated for each task on the different object positions and subject anthropometries. For the generalization tests, mean value and SD of the computed performance indices were also calculated for all the subjects and the number of repetitions of each task. A statistical analysis based on Wilcoxon paired-sample test was performed for the comparative analysis between the proposed motion planning system and the traditional motion planner based on inverse kinematics. The analysis was carried out on multiple comparisons with Bonferroni correction; hence, significance was achieved for *p-value <* 0.05/*nc*, where *n<sup>c</sup>* is the number of multiple comparisons.

## **3. RESULTS**

The results of the comparative analysis are reported in **Figure 11**. In particular, mean value and standard deviation of the position error, orientation error, and PhJL computed on the 238, 75, and 125 object positions (for task 1, 2, and 3, respectively) are reported.

One can observe that the DMP-based control always exceeded the other two algorithms based on inverse kinematics in terms of success rate. The DMP-based control always achieved 100% while the IK inverse Jacobian reached 71.4% and the IK swivel angle reached 84.7%. The differences are statistically significant with *p-value <* 0.0083 (for the DMP-based control compared to IK inverse Jacobian *p-value* = 0.0031 and for DMP-based control compared to IK swivel angle *p-value* = 0.0045).

On the other hand, as expected, the DMP-based control suffers from a position error higher than the one achieved with the other two algorithms (this difference is statistically significant with *pvalue* = 0.0012 for the DMP-based control compared to IK inverse Jacobian and *p-value* = 0.0008 for DMP-based control compared to IK swivel angle), for all the subtasks except for subtask 2-2.

Indeed, about the position error of the subtask 2-2, the results achieved with the DMP-based control are comparable to the one

#### **TABLE 2** | Experimental results obtained for GCA.


achieved with IK inverse Jacobian (*p-value* = 0.09), but are better than the one achieved with IK swivel angle (*p-value* = 0.0033).

Conversely, the orientation error achieved for each task with the DMP-based control is comparable to the one achieved with IK with swivel angle (*p-value* = 0.12). The difference is statistically significant between the orientation error achieved by the DMPbased control and the one achieved with IK inverse Jacobian, which is lower (*p-value* = 0.0028).

Moreover, the results clearly show that the DMP-based control and IK with swivel angle ensure a more anthropomorphic configuration than IK inverse Jacobian, measured through *PhJL*. The differences are statistically significant, with *p-value* = 0.0024 for the comparison between DMP-based control and IK inverse Jacobian and *p-value* = 0.0019 for the comparison between IK swivel angle and IK inverse Jacobian.

Finally, considerations about the computational burden of the three methods have been made; a mean cycle time of 0.4 ms, 7.2 ms, and 0.1 ms for the DMP-based control, IK inverse Jacobian, and IK with swivel angle, respectively, has been estimated. As expected, IK inverse Jacobian has a higher computational burden compared the other two methods, since it is an iterative method. Conversely, it is interesting to note that the proposed DMP-based method, once trained, has a relatively low computational burden (comparable to the geometrical approach based on the swivel angle) since the DMP resolution is not computationally heavy.

The experimental results of the GCA are shown in **Table 2**. Mean value and standard deviation of position error, orientation error, and PhJL are reported. They were computed for GCA–sim on 238, 75, and 125 object positions (for task 1, 2, and 3, respectively) and 25 different subject anthropometries. Instead, for GCA–real they were calculated on the four subjects and 3 object positions for each task. It is interesting to note that performance achieved in the real setting are very close to the simulation results; moreover, the proposed motion planning based on DMP has a high generalization level with respect to the different object positions and subject anthropometries, since the success rate achieved for the 3 task is 100%.

### **4. DISCUSSIONS**

The comparative analysis (**Figure 11**) showed that the IK inverse Jacobian has better performance than the DMP-based control in terms of position and orientation error, but it does not guarantee physiological configuration and always the success of the operation in the whole human-robot workspace. Conversely, the IK with swivel angle reached better results than DMP-based control in terms of position and orientation error for tasks requiring the control of only one orientation parameter (e.g., tasks 1, 2-1 and 3). Instead, it increased in more complex tasks that required the control of more than one orientation parameter (task 2-2).

Nevertheless, it is worth pointing out that the position error obtained with the DMP-based control (even though higher than the traditional approaches) is fully compatible with the considered application domain which does not require very high accuracy. In fact, it is shown in the literature that accuracy of human movements during the execution of ADLs is around 1-2 cm (Merlo et al., 2013). The achieved position error is moreover well balanced by the very high success rate and the guarantee of an anthropomorphic configuration (which also entail system reliability and safety during the task fulfilment).

Furthermore, the high generalization level of the proposed approach ensures higher robustness to the environmental changes than the two other traditional methods, especially the one based on the computation of the swivel angle, which needs to be *a priori* specified. A geometrical approach for inverting kinematics (Section 2.4) has the clear advantage of a low computational burden, but it is not guaranteed that it can be easily applied on all the kinematic chains. Conversely, the proposed DMP based method offers the advantage of being applicable to any kinematic chain, thanks to the offline training, and has a good computational time (which is comparable with the IK swivel angle and significantly lower than the IK algorithm with inverse Jacobian).

### **5. CONCLUSION**

A learning by demonstration method for planning motion of upper-limb exoskeletons was presented in this work. It is grounded on the computation of DMPs and machine learning techniques to construct the task- and patient-specific joint trajectories based on the learnt trajectories. Distinctive features, namely the DMP parameters, were firstly extracted from the motion recorded during certain activities performed by a human subject wearing the upper-limb exoskeleton. They were subsequently used, together with the recorded joint angles and Cartesian positions, to train a supervised neural network (a two layer feed-forward network). The neural network provided the more appropriate set of DMP parameters to generate the task- and patient-specific trajectories of the exoskeleton joints.

The proposed motion planning was preliminarily validated in simulation and later experimentally validated on 4 patients with LGMDs, who used the combined M-IMU/EMG interface for controlling the upper-limb exoskeleton. The validation session was aimed to (i) assess performance of the proposed motion planning system by means of quantitative indicators and compare it with traditional methods used to operate upper-limb exoskeletons, which are based on path planning and inverse kinematics (IK inverse Jacobian and IK swivel angle); (ii) investigate the generalization level of the proposed approach with respect to the variability in the experimental scenario, given for example by different anthropometry of the patients and different object positions.

The results achieved for the comparative analysis showed that the DMP-based control guarantees a 100% success rate in the task fulfillment, with an acceptable position and orientation error for the targeted application. Moreover, it also ensures that the exoskeleton always has configurations within the physiological joint limits, differently from methods based on path planning and inverse kinematics. Furthermore, the computational time required by the proposed approach is lower than the one required by the IK algorithm with inverse Jacobian and comparable with the IK with swivel angle.

Finally, the results achieved in simulation as well as in the experimental setting also showed a high generalization level of the DMP based motion planning with respect to the different object positions and subjects anthropometries. A success rate of 100% for all tasks was reported.

Future works will be addressed to extend the study to a higher number of patients and grasping and manipulation tasks, by applying the proposed motion planning approach also to the hand exoskeleton (which in this study was used to perform grasping tasks).

### **ETHICS STATEMENT**

The experimental protocol was approved by the local Ethical committee (Comitato Etico Università Campus Biomedico di Roma, reference number: 01/17 PAR ComEt CBM), by the Italian Ministry of Health (Registro—classif. DGDMF/I.5.i.m.2/2016/1096) and complied with the Declaration of Helsinki. All subjects gave

### **REFERENCES**


written informed consent in accordance with the Declaration of Helsinki.

## **AUTHOR CONTRIBUTIONS**

CL designed the paper, analyzed the literature, designed and developed the proposed motion planner, analyzed the experimental data and wrote the manuscript. FC organized the experimental sessions, acquired the data, analyzed the literature and contributed to the manuscript writing. ALC analyzed the literature and partly contributed to the manuscript writing. ET and SC designed and developed the IK method used as benchmark for the comparative analysis, i.e., IK based on the computation of the swivel angle, and partly contributed to the manuscript writing. JMC and FJB designed and developed the wrist-module and the hand exoskeleton. SMP recruited the patients. SS, NV, NGA contributed to the design of the experiments and discussed the results. LZ contributed to the design of the motion planner and the experiments, discussed the results, wrote the paper and supervised the study. All the authors read and approved the final version of the manuscript.

### **FUNDING**

This work was supported partly by the Italian Institute for Labour Accidents (INAIL) with the PPR 2 (CUP: E58C13000990001), PCR 1/2 (CUP: E57B16000160005), PPR AS 1/3 (CUP: E57B16000160005) and RehabRobo@work (CUP: C82F17000040001) projects and partly by the European Project H2020/AIDE: Adaptive Multimodal Interfaces to Assist Disabled People in Daily Activities (CUP: J42I15000030006).


Svinin, M. M., Goncharenko, I. A., Hosoe, S., and Osada, Y. (2010). *Optimality Principles and Motion Planning of Human-Like Reaching Movements*. London, UK: INTECH Open Access Publisher.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2018 Lauretti, Cordella, Ciancio, Trigili, Catalan, Badesa, Crea, Pagliara, Sterzi, Vitiello, Garcia Aracil and Zollo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

#### *Muhammad Jawad Khan1 and Keum-Shik Hong1,2\**

*1School of Mechanical Engineering, Pusan National University, Busan, South Korea, 2Department of Cogno-Mechatronics Engineering, Pusan National University, Busan, South Korea*

In this paper, a hybrid electroencephalography–functional near-infrared spectroscopy (EEG–fNIRS) scheme to decode eight active brain commands from the frontal brain region for brain–computer interface is presented. A total of eight commands are decoded by fNIRS, as positioned on the prefrontal cortex, and by EEG, around the frontal, parietal, and visual cortices. Mental arithmetic, mental counting, mental rotation, and word formation tasks are decoded with fNIRS, in which the selected features for classification and command generation are the peak, minimum, and mean ΔHbO values within a 2-s moving window. In the case of EEG, two eyeblinks, three eyeblinks, and eye movement in the up/down and left/right directions are used for four-command generation. The features in this case are the number of peaks and the mean of the EEG signal during 1 s window. We tested the generated commands on a quadcopter in an open space. An average accuracy of 75.6% was achieved with fNIRS for four-command decoding and 86% with EEG for another four-command decoding. The testing results show the possibility of controlling a quadcopter online and in real-time using eight commands from the prefrontal and frontal cortices *via* the proposed hybrid EEG–fNIRS interface.

#### *Edited by:*

*Xin Luo, Chongqing Institute of Green and Intelligent Technology (CAS), China*

#### *Reviewed by:*

*Huanqing Wang, Carleton University, Canada Shuai Li, Hong Kong Polytechnic University, Hong Kong*

#### *\*Correspondence: Keum-Shik Hong kshong@pusan.ac.kr*

*Received: 25 October 2016 Accepted: 24 January 2017 Published: 17 February 2017*

#### *Citation:*

*Khan MJ and Hong K-S (2017) Hybrid EEG–fNIRS-Based Eight-Command Decoding for BCI: Application to Quadcopter Control. Front. Neurorobot. 11:6. doi: 10.3389/fnbot.2017.00006*

Keywords: brain–computer interface, hybrid EEG–fNIRS, mental task, classification, quadcopter control

# INTRODUCTION

Brain–computer interface (BCI) or brain–machine interface (BMI) is a method of communication between brain and hardware by means of signals generated from the brain without the involvement of muscles and peripheral nervous system (Naseer and Hong, 2015b; Schroeder and Chestek, 2016). Although prosthetic devices utilize muscles or peripheral nerve signals (Ravindra and Castellini, 2014; Chadwell et al., 2016; Chen et al., 2016), brain signals are equally viable for provision of direct neural signals for interface purposes (Waldert et al., 2009; Quandt et al., 2012; Kao et al., 2014). A BCI, specifically, is an artificial intelligence system that can recognize a certain set of patterns generated by brain. The BCI promises as a platform to improve the quality of life of individuals with severe motor disabilities (Muller-Putz et al., 2015). The BCI procedure when acquiring control commands from the brain consists of five steps: signal acquisition, signal enhancement, feature extraction, classification, and control-interfacing (Nicolas-Alonso and Gomez-Gil, 2012).

The complicated surgical procedures performed for microelectrode implantation and establishment of BCI have been outstandingly successful in achieving control of robotic and prosthetic arms by means of neuronal-signal acquisition (Hochberg et al., 2006, 2012). These methods, however, are far from perfect options for BCI purposes, as they are all invasive and incur significant risks (Jerbi et al., 2011; Schultz and Kuiken, 2011; Rak et al., 2012; Ortiz-Rosario and Adeli, 2013).

The alternative non-invasive methods measure brain activities *via* either detection of electrophysiological signals (Li et al., 2010; Bai et al., 2015; Weyand et al., 2015) or determination of hemodynamic response (Bhutta et al., 2014; Ruiz et al., 2014; Hong et al., 2015; Naseer and Hong, 2015a; Weyand et al., 2015). Electrophysiological activity is generated by the neuronal firings prompted in the performance of brain tasks (Guntekin and Basar, 2016). The hemodynamic response is the increase of hemoglobin as a result of the neuronal firing that occurs when the brain performs an activity (Ferrari and Quaresima, 2012; Boas et al., 2014). The leading non-invasive BCI modalities in terms of cost and portability are electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) (von Luhmann et al., 2015; Lin and Hsieh, 2016). The selection criterion for each modality is task dependent.

Electroencephalography has applications for active-, passive-, and reactive-type BCIs (Turnip et al., 2011; Zander and Kothe, 2011; Turnip and Hong, 2012; Urgen et al., 2013; Yoo et al., 2014). It is most widely employed with reactive-type tasks in the performance of which the brain output is generated in reaction to external stimulation. Commands are generated by detection of steady-state visually evoked potentials (SSVEP) and P300-based activations (Li et al., 2010, 2013; Turnip and Hong, 2012; Cao et al., 2014; Bai et al., 2015). fNIRS-based BCIs, meanwhile, are most commonly of the active type, which obtains brain activity output *via* user intentionality, independent of external events. For the purposes of fNIRS-based active BCIs, mostly mental (e.g., math, counting, etc.) and motor-related tasks (e.g., motor imagery) are selected (Naseer et al., 2014; Hong et al., 2015; Hong and Naseer, 2016). Although recent studies have shown the importance of fNIRS-based BCI for reactive and passive tasks (Hu et al., 2012; Santosa et al., 2014; Bhutta et al., 2015; Khan and Hong, 2015), active-type tasks are primarily used to increase the number of commands for this modality. The active-type BCI is preferred over the reactive BCI, as it allows a person to communicate with a machine at will. For both EEG and fNIRS, the drawback of increasing the number of active commands is the decrease in accuracy for BCI (Vuckovic and Sepulveda, 2012; Naseer and Hong, 2015a).

As a means of compensating for the accuracy reduction problem is the use of a single-brain signal acquisition modality, the hybrid BCI concept was proposed (Pfurtscheller et al., 2010). The design of a hybrid BCI entails the combination of either two modalities (at least one of which is a brain signal acquisition modality) or different brain signals (e.g., SSVEP and P300). The EEG–fNIRS-based hybrid BCI has been reported to enhance classification accuracy (Fazli et al., 2012; Putze et al., 2014; Tomita et al., 2014) and increase the number of commands (Khan et al., 2014). Classification accuracy can be improved by simultaneously decoding EEG and fNIRS signals for the same activity and combining the features. The number of active commands can be increased by decoding brain activities from different brain regions (e.g., motor tasks for EEG and mental tasks for fNIRS). However, for these cases, the reported window size using fNIRS for optimal classification is around 10 s (Tomita et al., 2014). The problem of window size reduction and others relevant to real-time/online BCI applications require further research. **Table 1** summarizes the most recent work (Kim et al., 2014; Bai et al., 2015; Combaz and Van Hulle, 2015; Hortal et al., 2015; Ma et al., 2015; Naseer and Hong, 2015a; Ramli et al., 2015; Yin et al., 2015) in terms of command number, accuracy, and window size as those parameters relate to robotic-control applications.

In the present BCI research, we decoded eight active commands using signals from the frontal and prefrontal cortices. Four tasks (mental math, mental counting, word formation, and mental rotation) were decoded using fNIRS, and four eye movement signals (up/down eye movement, left/right eye movement, twice or three times eyeblinks) were decoded using EEG. In the fNIRS classification and generation of commands, a 0- to 2-s window was used, whereas in the case of EEG, a 1-s window was used. The commands thus generated were used to update a quadcopter's movement coordinates (six movements and start/stop). Revealing the obtained results briefly, the signal mean, peak, and minimum-value features obtained using oxyhemoglobin data in 0–2 s window provided 76.5% accurate classification. For EEG, signal peak and number of peaks achieved 86% accurate results. The testing of the drone in an arena showed the possibility of quadcopter control using eight-brain commands from the frontal cortex. To the authors' best knowledge, this is the first fNIRS study to decode and classify brain activity in 0–2 s window. Also, this is the first study to decode four commands from the prefrontal cortex using fNIRS. Moreover, this work shows the first hybrid EEG–fNIRS-based decoding of eight active commands from the frontal and prefrontal cortices.

### MATERIALS AND METHODS

### Subjects

A total of 10 healthy adults were recruited (all right-handed males; mean age: 28.5 ± 4.8). Right-handers had been sought in order to minimize any variations in the electrophysiological and hemodynamic responses due to the hemispheric dominance difference. None of the selected participants had participated in any previous brain signal acquisition experiment, and none had a history of any psychiatric, neurological, or visual disorder. All of them had normal or corrected to normal vision, and all provided a written consent after having been informed in detail about the experimental procedure. Experiments with fNIRS and EEG were approved by the Institutional Review Board of Pusan National University, and they were conducted in accordance with the ethical standards encoded in the latest Declaration of Helsinki.

### Electrode/Optode Placement

The frequency domain system ISS Imagent (ISS Inc., USA) was used for the signal acquisition. A total of eight sources and two detectors, making a combination of 16 channels, were positioned around the prefrontal cortex. The FPz location was positioned between the two detectors. The Emotiv EEG headset (Emotiv Epoc, USA) was used to acquire the EEG signals. The

#### TABLE 1 | Comparison of our proposed method with recent electroencephalography (EEG)-based work on command generation, accuracy, and window size.


electrodes/optodes were positioned on the head according to the International 10–20 system (Jurcak et al., 2007). The electrode and optode placement is illustrated in **Figure 1**.

### Experimental Procedure

The experimental procedure consisted of two sessions: training and testing. The subjects were trained to perform eight tasks detected simultaneously by EEG and fNIRS, after which the recorded brain activities were tested using real-time/online analysis.

functional near-infrared spectroscopy–electroencephalography (fNIRS–EEG) experiment. (A) 16-channel fNIRS with 2 detectors and 8 emitters in the prefrontal brain region and (B) 14-electrode configuration of the Emotiv EEG headset.

### Training Session

For the training session, the subjects were seated in a comfortable chair and told to relax. A computer monitor was set up approximately 70 cm in front of the subjects. The session began with a resting period of 2 min to establish a data baseline. After the resting period, the screen cued the participants to perform one of eight specific tasks. The tasks were as follows:


The mental tasks were recorded mainly using fNIRS, as the previous work (Weyand et al., 2015) has shown its utility for high-accuracy detection of the above-noted tasks. The eye movement tasks were recorded principally using EEG, as the Emotiv EEG head set, as noted earlier, is commercially available as a system for detection of various facial movements and motor signals. The training session was divided into two parts: mental task training and eye movement training. In the first part, the subjects were trained for mental arithmetic, counting, rotation, as well as word formation tasks. Each task consisted of five 10-s trials separated by a 20-s resting session. In the second part of the training session, the subjects were instructed to move their eyes according to the cue given. Each trial in this case was 5 s in duration, and the resting period was 10 s. Details on the experimental paradigm and data recording sequence are provided in **Figure 2**.

### Testing Session

As part of the testing session, the training data were used to test the movement of a quadcopter (Parrot AR drone 2.0, Parrot SA., France). Specifically, the eight commands recorded during the training session were used to navigate the quadcopter in an open arena. The data were translated into commands and the subjects were asked to move the quadcopter in a rectangular path.

### Signal Acquisition and Processing

The data for both modalities (EEG and fNIRS) were independently processed and filtered to acquire the desired output signals. In both cases, band-pass filtering was used to remove physiological noise from the acquired signals.

### fNIRS Signal Processing

The frequency domain fNIRS system used two wavelengths (690 and 830 nm) to determine the changes in the concentration of hemoglobin. The sampling rate of 15.625 Hz was used to

Several channels were activated for both EEG and fNIRS. Proper channel selection is essential to the high-accuracy generation of commands. Previous work has employed *t*-value-based approaches (Hong and Nguyen, 2014; Hong and Santosa, 2016), bundled-optode-based approaches (Nguyen and Hong, 2016; Nguyen et al., 2016), and channel-averaging approaches (Khan and Hong, 2015; Naseer and Hong, 2015a). Other studies alternatively have employed their own algorithms, for instance, independent component analysis, etc. (Hu et al., 2010; Kamran and Hong, 2013; Santosa et al., 2013). We used the following criteria for selection of fNIRS and EEG channels.

#### fNIRS Channel Selection

For fNIRS channel selection, we calculated the peak (max) value of ΔHbO in the baseline and in the first trials of the mental arithmetic, mental counting, mental rotation, and word formation tasks, respectively. If the difference between the max value of the trial and the baseline value was positive, the channel was selected for classification; if neutral or negative (equal to or less than zero), it was discarded.

#### EEG Channel Selection

In case of EEG, we measured the power spectrum for each channel. The selected channels were those in which the signal power corresponding to the eyeblink and movement tasks was significant. Mostly the channels near the frontal brain region were active in this case.

### Feature Extraction and Classification

In order to generate commands, we first extracted the relevant features for classification. We selected signal peak and signal mean as features as, according to the literature (Khan and Hong, 2015), they provide better performance for fNIRS-based BCI systems. Also, in consideration of a recently reported possibility of an initial fNIRS signal dip (Hong and Naseer, 2016; Zafar and Hong, 2017), we added a minimum (min) signal value as a feature. We also investigated the possibility of minimizing the time for command generation by means of 0–1, 0–1.5, and 0–2 s windows.

For EEG signals, following channel selection we selected the signal mean and number of peaks as features for command generation. In this case, we used a moving window of 1 s to extract the relevant feature values.

For both modalities, MATLAB®-based functions were used to calculate the features of the mean, peak, min, and number of peaks. For offline processing, the extracted features were rescaled between 0 and 1 by the equation

$$a' = \frac{a - \min a}{\max a - \min a},\tag{3}$$

where *a* ∈ *Rn* represents the feature value, *a*′ is the rescaled value between 0 and 1, max *a* denotes the largest value, and min *a* indicates the smallest value. These normalized feature values were used in a four-class classifier for training and testing of the data. We used linear discriminant analysis (LDA) to classify the

acquire the data. The modified Beer-Lambert law (Baker et al., 2014; Bhatt et al., 2016) was utilized to convert the data into concentrated changes of oxy- and deoxy-hemoglobin (ΔHbO and ΔHbR):

$$A(t; \lambda) = \ln \frac{I\_{\text{in}}(\lambda)}{I\_{\text{out}}(t; \lambda)} = \alpha(\lambda) \times c(\lambda) \times l \times d(\lambda) + \eta\_{\text{b}} \tag{1}$$

$$
\begin{bmatrix}
\Delta \boldsymbol{\omega}\_{\text{HbO}}(t) \\
\Delta \boldsymbol{\omega}\_{\text{HbR}}(t)
\end{bmatrix} = \begin{bmatrix}
\boldsymbol{\alpha}\_{\text{HbO}}(\boldsymbol{\lambda}\_{1}) & \boldsymbol{\alpha}\_{\text{HbR}}(\boldsymbol{\lambda}\_{1}) \\
\boldsymbol{\alpha}\_{\text{HbO}}(\boldsymbol{\lambda}\_{2}) & \boldsymbol{\alpha}\_{\text{HbR}}(\boldsymbol{\lambda}\_{2})
\end{bmatrix}^{-1} \begin{bmatrix}
\Delta \boldsymbol{A}(t; \boldsymbol{\lambda}\_{1}) \\
\Delta \boldsymbol{A}(t; \boldsymbol{\lambda}\_{2})
\end{bmatrix} \frac{1}{l \times d(\boldsymbol{\lambda})},\tag{2}
$$

where *A* is the absorbance of light (optical density), *I*in is the incident intensity of light, *I*out is the detected intensity of light, α is the specific extinction coefficient in µM<sup>−</sup>1 cm−<sup>1</sup> , *c* is the absorber concentration in micromolars, *l* is the distance between the source and the detector in centimeters, *d* is the differential path-length factor, and η is the loss of light due to scattering.

The data were first preprocessed to remove physiological noises related to respiration, cardiac, and low-frequency drift signals. In order to minimize the physiological noise due to heart pulsation (1–1.5 Hz for adults), respiration (approximately 0.4 Hz for adults), and eye movement (0.3–1 Hz), the signals were low-pass filtered using a fourth-order Butterworth filter at a cutoff frequency of 0.15 Hz. The low-frequency drift signals were minimized from the data using a high-pass filter with a cutoff frequency of 0.033 Hz (Kamran and Hong, 2014; Bhutta et al., 2015; Hong and Santosa, 2016).

#### EEG Signal Processing

The 14-channel EEG data were acquired at a sampling rate of 128 Hz. The α*-,* β*-, Δ-*, and θ-bands, acquired by band-pass filtering between 8 and 12 Hz, 12 and 28 Hz, 0.5 and 4 Hz, and 4 and 8 Hz, respectively, enabled isolation of the electrodes corresponding to the eye movement activities (Lotte et al., 2007; Ortiz-Rosario and Adeli, 2013; Ma et al., 2015).

signals for EEG and fNIRS, as, in one of our previous studies, we found it to be faster than support vector machine (Khan and Hong, 2015).

For our case, *xi* ∈ *R*<sup>2</sup> , where, for fNIRS, *i* denotes the classification class, μ*i* is the sample mean of class *i*, and μ is the total mean over all of the samples *l*. That is,

$$
\mu\_i = \frac{1}{n\_i} \sum\_{\mathbf{x} \in \text{class } i} \mathbf{x},
\mu = \frac{1}{n} \sum\_l \mathbf{x}\_l,\tag{4}
$$

where *ni* is the number of samples of class *i* and *n* is the total number of samples. The optimal projection matrix *V* for LDA that maximizes the following Fisher's criterion is

$$J(V) = \frac{\det(V^T \mathcal{S}\_\mathbb{B} V)}{\det(V^T \mathcal{S}\_\mathbb{W} V)},\tag{5}$$

where *S*B and *S*W are the between-class scatter matrix and the within-class scatter matrix, respectively, given by

$$S\_{\rm B} = \sum\_{i=1}^{m} n\_i (\mu\_i - \mu)(\mu\_i - \mu)^T,\tag{6}$$

$$S\_{\mathbf{w}} = \sum\_{i=1}^{m} \sum\_{\mathbf{x}\_{l} \in \text{class } i} (\mathbf{x}\_{l} - \boldsymbol{\mu}\_{i})(\mathbf{x}\_{l} - \boldsymbol{\mu}\_{i})^{T},\tag{7}$$

where the total number of classes is given by *m*. Equation 5 was treated as an eigenvalue problem in order to obtain the optimal vector *V* corresponding to the largest eigenvalue. In the case of offline testing, 10-fold cross-validation was used to estimate the classification accuracy (Lotte et al., 2007; Hwang et al., 2013; Ortiz-Rosario and Adeli, 2013).

### Control Scheme for Quadcopter

For control of the quadcopter, we formulated eight commands for classification: up/down movements, clockwise/counterclockwise rotations, forward/backward movements, and start/stop. After classification, we updated the quadcopter's movement coordinates by Wi-Fi communication. The quadcopter has navigated using the transmitted commands. **Figure 3** provides a block diagram of the BCI scheme for quadcopter control.

### RESULTS

**Figure 4** plots Subject 2's ΔHbO values for all 16 channels and four activities. It can be seen that not all of the channels were active when performing a brain activity. However, for all four of the mental tasks, the activation pattern appears in the same channels.

The plots in **Figure 4** serve to emphasize the necessity of selecting proper channels for distinguishing of brain activities. As per our channel-selection criterion, we subtracted the max value in the baseline from the max value of the first trial. Accordingly, channels 4, 9, and 10 were selected for Subject 2, whereas channel 8 was not, due to having a higher value of baseline. We intended to identify different brain channels for different activities; therefore, as per our criterion, the subtraction of the first trial for each activity can identify different channels. However, in this case, for all subjects selected, the common channels were activated as corresponding to the mental tasks.

As various windows sizes have been used for detection of fNIRS features in different studies (Utsugi et al., 2008; Luu and Chau, 2009; Power et al., 2010; Naseer and Hong, 2013; Schudlo et al., 2013; Schudlo and Chau, 2015; Weyand and Chau, 2015; Weyand et al., 2015; Naseer et al., 2016a,b), we intended to minimize the window size applicable to BCI applications. We therefore selected 0–0.5, 0–1, 0–1.5, and 0–2 s windows for feature acquisition and investigated both hemodynamic and initial dip features to acquire the best window size for reduced computation time. The signals of Subject 2 as averaged over all of the trials in the reduced window are plotted in **Figure 5**.

In the case of EEG, we examined the power spectrum in order to identify the activated channels. The F3, F4, O1, and O2

regions were most active. We selected the channel showing the highest power corresponding to the eye movement task. **Figure 6** plots the normalized power spectrum of the selected channel for Subject 2.

The accuracies obtained for fNIRS are shown in **Table 2**. The accuracies achieved using EEG for the selected channels are shown in **Table 3**.

For real-time/online testing, we associated each activity with quadcopter movement. The associated activity for each is shown in **Figure 7**.

We associated opposite movements with EEG/fNIRS activities; for example, if forward movement was associated with EEG signals, backward movement was associated with fNIRS signals. This was done to ensure safety from the quadcopter and anyone in the area. This scheme has benefits in any case, as, if a command is misclassified/misinterpreted, a command with the second modality can be generated to countermove the misclassified movement. As per **Figure 7**, when EEG was used for one motion, fNIRS was used for its counter motion. This selection reflected EEG's demonstratively higher accuracy for most of the subjects.

We have tested the movement of the quadcopter in an arena. The subjects were asked to move the quadcopter in a rectangular path. They were asked to land the quadcopter near to the takeoff position. After take off, the subjects were informed to move the drone almost 3 m in forward direction, then 2 m to the left. The subjects had to increase the height by almost 0.5 m when reaching the left corner. After increasing the height, the subjects were to move the quadcopter backward 3 m and then 2 m to the right to reach the takeoff spot. After reaching the final position, the subjects were asked to land the drone. The path followed by the drone from Subject 2 is shown in **Figure 8**.

Since the drone requires quick response commands to maneuver, it can be clearly seen that the path was not properly followed. The subject had to adjust the path to the desired path to reach the final position. This is due to the delay in command generation and transmission to the drone for movement control. Further improvement can be made by incorporating an adaptive control algorithm to the drone's control and reduction of window size to stabilize the trajectory followed by the drone.

### DISCUSSION

In this study, we decoded eight active brain commands using hybrid EEG–fNIRS for BCI. The generated commands were tested using a quadcopter. To the best of the authors' knowledge, there are only two previous studies that have tested their BCI schemes for control of a quadcopter in 3D space (LaFleur et al., 2013; Kim et al., 2014). Our work has an advantage over both studies, as we were able to control the quadcopter with a greater number of active commands. LaFleur et al. (2013) controlled the height and rotation of a quadcopter using motor imageries for the left, right, and both hands. However, in this case, the quadcopter was given a fixed forward speed. Therefore, this work did

not give full quadcopter control to the user for navigation. Also, as it is impossible for some subjects to perform motor imagery (Vidaurre and Blankertz, 2010), their scheme is suited only to a specific set of users who can in fact perform it. Kim et al. (2014) integrated EEG with an eye tracker for generation of eight commands. Although their eye tracking was effective, an LED-based flash light was used to monitor the eye movement. As such, with the illuminating LED enhancing the contrast between the pupil and iris, it would be difficult to maintain quadcopter-controlling concentration for any significant span of time. In our case, there are no such drawbacks, as the necessary mental and eye movement commands are easy to generate. Also, our scheme yields more freedom to the user for drone control. It also includes a measure—specifically EEG/fNIRS integration—for avoidance of any miss-directional movement. A given EEG command has been matched with an opposing fNIRS command (see **Figure 7**). Thus, in the event that an EEG command is incorrectly classified and the quadcopter follows a wrong direction, fNIRS signals can be used to counteract that command.

Our proposed scheme for fNIRS classification incorporates features for both hemodynamics and initial dips. To the best of our knowledge, this is the first work to generate commands in a

TABLE 2 | Classification accuracies of four functional near-infrared spectroscopy window sizes (based upon the mean, peak, and minimum values of **Δ**HbO).


0–2 s window for BCI. Whereas previously, different windows have been reported for fNIRS-based classification using fNIRS, in the current work, the smallest window size for classification

electrode.

was used. Although the reported optimal window size, which is to say, the size allowing for the highest classification accuracy, is 2–7 s (Naseer and Hong, 2013), the 0–2 s window size, albeit causing a decrease in accuracy, is still usable for BCI. Also, we used the mean, min, and peak values of ΔHbO for classification.


Further investigation of feature selection and the use of adaptive algorithms will improve the results in both window size and classification.

Another advantage of the proposed scheme is its decoding of all four fNIRS activities from the prefrontal cortex. In previous research (Naseer and Hong, 2015a), four choices have been decoded using fNIRS in a 2–7 s window. However, in this case, only two classes were decoded from the prefrontal cortex, the other two commands having been generated using data from the motor cortex. Our proposed work is more advantageous for more users, as those suffering from locked-in syndrome cannot properly perform motor tasks. Another advantage of our work is the reduced command-generation time using fNIRS. This enables patients who are partially locked-in (with minor eye movements) to use eight commands in controlling a robot in online/real-time scenarios.

A previous fNIRS study (Hong and Santosa, 2016) proposed channel selection based on a *t-*value criterion. It classified four sound categories from the auditory cortex using the channels'

highest *t*-values. In our approach, we used the baseline as a reference for selection of channels. For classification, we selected, for four prefrontal activities, the common channels yielding a positive value after taking the difference between the peak value and the baseline. The drawback here is that only a limited number of channels can be selected. The algorithm can be further improved by adding the "difference of mean" for channel selection. In the comparison of our approach with *t-*value-based channel selection (see **Table 4**), most of the selected channels were common. It can be seen that the *t*-value-based scheme can identify more active channels. However, channel detection time also is an important factor for real-time applications, and our proposed scheme can identify the activated channels much quicker than the previous schemes. Thus, our method allows much room for further development in terms of command generation and real-time control.

A limitation of this method is the acquisition of activities using eye movement tasks. Although the use of eye movement for robot control has already been demonstrated to be effective (Ma et al., 2015), eye movements are related to motor activity, and so, it is difficult for motor-disabled patients to generate four EEG-based commands. The selection of different active tasks for EEG can improve the results. Another, fNIRS-related limitation of the proposed method is the variation in hemodynamic responses in subjects due to trial-to-trial variability (Hu et al., 2013). Granted, the proposed features (peak, mean, and min ΔHbO) might not yield the best performance for each subject in the 0–2 s window. However, this problem is not insoluble, and certainly, it will be addressed in further investigations into feature selection. Also, the use of adaptive algorithms promises improvement in fNIRS command generation time.

### CONCLUSION

In this study, we investigated the possibility of decoding eight commands from the frontal and prefrontal cortices

TABLE 4 | Comparison of selected channels and time between the proposed method and the *t*-value-based method.


by combining electroencephalography and functional nearinfrared spectroscopy (fNIRS) for a BCI. Four EEG commands were generated by eye movements (two and three blinks as well as left/right and up/down movements), using the number of peaks and the mean value as features. In the case of fNIRS, we chose mental counting, mental arithmetic, mental rotation, and word formation tasks for the purpose of activity decoding. We selected a 0–2 s window to generate the commands using fNIRS signals. The signal mean, peak, and minimum values were used as features for incorporation of hemodynamic signals and initial dip features in the classifier. The obtained 76.5% accuracy indicates the possibility of classifying the activities in reduced windows. We tested the generated commands in a real-time scenario using a quadcopter. The movement coordinates of the quadcopter were updated using the hybrid EEG–fNIRS-based commands. The performed experiments served to demonstrate the BCI feasibility and potential applications of the proposed eight-command decoding scheme. Further research on better feature selection and minimization of time window for command generation can improve the controllability of the quadcopter. Moreover, the incorporation of adaptive algorithms for flight control along with brain signal decoding for stable flight can further strengthen the results.

### AUTHOR CONTRIBUTIONS

MJK has conducted all the experiments, carried out the data processing, and made initial manuscript. K-SH has suggested

### REFERENCES


the theoretical aspects of the current study and supervised all the process from the beginning. All the authors have approved the final manuscript.

### FUNDING

This research was supported by the Mid-Career Researcher Program (grant no. NRF-2014R1A2A1A10049727) and the Convergence Technology Development Program for Bionic Arm (grant no. 2016M3C1B2912986) through the National Research Foundation of Korea (NRF) under the auspices of the Ministry of Science, ICT and Future Planning, Republic of Korea.


Kao, J. C., Stavisky, S. D., Sussillo, D., Nuyujukian, P., and Shenoy, K. V. (2014). Information systems opportunities in brain-machine interface decoders. *Proc. IEEE* 102, 666–682. doi:10.1109/JPROC.2014.2307357

Khan, M. J., and Hong, K.-S. (2015). Passive BCI based on drowsiness detection: an fNIRS study. *Biomed. Opt. Express* 6, 4063–4078. doi:10.1364/BOE.6.004063


motor imagery-based brain-computer interface. *J. Neural Eng.* 10, 046003. doi:10.1088/1741-2560/10/4/046003


brain-computer interface. *IEEE Trans. Neural Syst. Rehabil. Eng.* 23, 548–561. doi:10.1109/TNSRE.2015.2399392


systems in general. *J. Neural Eng.* 8, 025005. doi:10.1088/1741-2560/8/2/ 025005

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Khan and Hong. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **An Improved Recurrent Neural Network for Complex-Valued Systems of Linear Equation and Its Application to Robotic Motion Tracking**

*Lei Ding, Lin Xiao\*, Bolin Liao, Rongbo Lu and Hua Peng*

*College of Information Science and Engineering, Jishou University, Jishou, China*

#### *Edited by:*

*Shuai Li, Hong Kong Polytechnic University, Hong Kong*

#### *Reviewed by:*

*Weibing Li, University of Leeds, United Kingdom Yinyan Zhang, Hong Kong Polytechnic University, Hong Kong Dechao Chen, Sun Yat-sen University, China Ke Chen, Tampere University of Technology, Finland*

> *\*Correspondence: Lin Xiao xiaolin860728@163.com*

*Received: 30 May 2017 Accepted: 11 August 2017 Published: 01 September 2017*

#### *Citation:*

*Ding L, Xiao L, Liao B, Lu R and Peng H (2017) An Improved Recurrent Neural Network for Complex-Valued Systems of Linear Equation and Its Application to Robotic Motion Tracking. Front. Neurorobot. 11:45. doi: 10.3389/fnbot.2017.00045* To obtain the online solution of complex-valued systems of linear equation in complex domain with higher precision and higher convergence rate, a new neural network based on Zhang neural network (ZNN) is investigated in this paper. First, this new neural network for complex-valued systems of linear equation in complex domain is proposed and theoretically proved to be convergent within finite time. Then, the illustrative results show that the new neural network model has the higher precision and the higher convergence rate, as compared with the gradient neural network (GNN) model and the ZNN model. Finally, the application for controlling the robot using the proposed method for the complex-valued systems of linear equation is realized, and the simulation results verify the effectiveness and superiorness of the new neural network for the complex-valued systems of linear equation.

**Keywords: complex-valued systems of linear equation, recurrent neural network, finite-time convergence, robot, gradient neural network, motion tracking**

### **1. INTRODUCTION**

Today, the complex-valued systems of linear equation has been applied into many fields (Duran-Diaz et al., 2011; Guo et al., 2011; Subramanian et al., 2014; Hezari et al., 2016; Zhang et al., 2016; Xiao et al., 2017a). In mathematics, the complex-valued systems of linear equations can be written as

$$Az(t) = b \in \mathbb{C}^n,\tag{1}$$

where*A ∈* C *n×n* and *b ∈* C *n* are the complex-valued coefficients, and *z*(*t*) *∈* C *n* is a complex-valued vector to be computed. Xiao et al. (2015) proposed a fully complex-valued gradient neural network (GNN) to solve such a complex-valued systems of linear equation. However, the corresponding error norm usually converges to the theoretical solution after very long time. So to increase the convergence rate, a kind of neural network called Zhang neural network (ZNN) is proposed to make the lagging error converge to 0 exponentially (Zhang and Ge, 2005; Zhang et al., 2009). However, in Xiao (2016) and Xiao et al. (2017b), Xiao pointed that the original ZNN model cannot converge to 0 within finite time, and its real-time calculation capability may be limited (Marco et al., 2006; Li et al., 2013; Li and Li, 2014; Xiao, 2015). So, Xiao (2016) presented a new design formula, which can converge to 0 within finite time for the time-varying matrix inversion.

Considering that a complex variable can be written as the combination of its real and imaginary parts, we have *A* = *A*re + *jA*im, *<sup>b</sup>* <sup>=</sup> *<sup>b</sup>*re <sup>+</sup> *jb*im, and *<sup>z</sup>*(*t*) <sup>=</sup> *<sup>z</sup>*re(*t*) <sup>+</sup> *<sup>z</sup>*im(*t*), where the symbol *<sup>j</sup>* <sup>=</sup> *<sup>√</sup> −*1 means an imaginary unit. Therefore, the equation (1) can be presented as

$$[A\_{\rm re} + jA\_{\rm im}][z\_{\rm re}(t) + jz\_{\rm im}(t)] = b\_{\rm re} + jb\_{\rm im} \in \mathbb{C}^{\rm n},\tag{2}$$

where *A*re *∈* R *n×n , A*im *∈* R *n×n* , *z*re *∈* R *n , z*im *∈* R *n* , *b*re *∈* R *n* , and *b*im *∈* R *n* . According to the complex formula, the real (or imaginary) part of the left-side and right-side of equation is equal (Zhang et al., 2016). Then we have

$$\begin{cases} A\_{\rm re} z\_{\rm re}(t) - A\_{\rm im} z\_{\rm im}(t) = b\_{\rm re} \in \mathbb{R}^n, \\ A\_{\rm im} z\_{\rm re}(t) + A\_{\rm re} z\_{\rm im}(t) = b\_{\rm im} \in \mathbb{R}^n. \end{cases} \tag{3}$$

Thus, we can express the equation (3) in a compact matrix form as:

$$
\begin{bmatrix} A\_{\rm re} & -A\_{\rm im} \\ A\_{\rm im} & A\_{\rm re} \end{bmatrix} \begin{bmatrix} z\_{\rm re}(t) \\ z\_{\rm im}(t) \end{bmatrix} = \begin{bmatrix} b\_{\rm re} \\ b\_{\rm im} \end{bmatrix} \in \mathbb{R}^{2n}.\tag{4}
$$

We can write the equation (4) as

$$\text{Cx}(t) = e \in \mathbb{R}^{2n},\tag{5}$$

where *C* = [ *A*re *−A*im *<sup>A</sup>*im *<sup>A</sup>*re ] , *<sup>x</sup>*(*t*) = [ *z*re(*t*) *z*im(*t*) ] , and *e* = [ *b*re *b*im] . Now the complex-valued system of linear equation can be computed in real domain. In this situation, most methods for solving realvalued system of linear equation can be used to solve the complexvalued system of linear equation (Zhang and Ge, 2005; Zhang et al., 2009; Guo et al., 2011). For example, a gradient neural network (GNN) can be designed to solve such a real-valued system of linear equation. The GNN model can be directly presented as follows (Xiao et al., 2015):

$$\dot{\mathbf{x}}(t) = -\gamma \mathbf{C}^{\mathrm{T}} (\mathbf{C}\mathbf{x}(t) - e),\tag{6}$$

where design parameter *γ >* 0 is employed to adjust the convergence rate of the GNN model. Zhang et al. (Zhang et al., 2016) used the recurrent neural network to solve the complex-valued quadratic programming problems. Hezari et al. (2016) solved a class of complex symmetric system of linear equations using an iterative method. However, the above mentioned neural networks cannot converge to the desired solution within finite time. Considering that the complex-valued system of linear equation can be transformed into the real-valued system of linear equation, a new neural network can be derived from the new design formula proposed by Xiao for solving the complex-valued system of linear equation (Xiao et al., 2015). In addition, the new neural network possesses a finite-time convergence property.

In recent years, the research on robot has become a hot spot (Khan et al., 2016a,b; Zanchettin et al., 2016; Guo et al., 2017), and the neural network has been successfully applied into the robot domain (He et al., 2016; Jin and Li, 2016; Woodford et al., 2016; Jin et al., 2017; Xiao, 2017). However, the application of the new design method for the complex-valued system of linear equation in robot domain has not been reported. So this is the first time to propose a new neural network, which can convergence within finite-time for solving the complex-valued system of linear equation and its application to robot domain.

The rest of this paper is organized into four sections. Section 2 proposes a finite-time recurrent neural network (FTRNN) to deal with the complex-valued system of linear equation, and its convergence analysis is given in detail. Section 3 gives the computersimulation results to substantiate the theoretical analysis and the superiority. Section 4 gives the results of the application for controlling the robotic motion planning. Finally, the conclusions are presented in Section 5. Before ending this section, the main contributions of the current work are presented as follows.


### **2. FINITE-TIME RECURRENT NEURAL NETWORK**

Considering that the complex-valued system of linear equation can be computed in real domain, the error function *E*(*t*) of traditional ZNN can be presented as

$$E(t) = \mathbb{C}x(t) - e \in \mathbb{R}^{2n}.\tag{7}$$

Then, according to the design formula *E*˙(*t*) = *−γ*Φ(*E*(*t*)), the original ZNN model can be presented as

$$\mathbf{C}\dot{\mathbf{x}}(t) = -\gamma \Phi(\mathbf{C}\mathbf{x}(t) - e),\tag{8}$$

where Φ(*·*) means an activation function array, and *γ >* 0 is used to adjust the convergence rate. In this paper, the new design formula in Xiao (2016) for *E*(*t*) can be directly employed and written as follows:

$$\frac{\mathrm{d}E(t)}{\mathrm{d}t} = -\gamma \Phi \Big(\rho\_1 \mathrm{E}(t) + \rho\_2 \mathrm{E}^{i/f}(t)\Big),\tag{9}$$

where the parameters *ρ*<sup>1</sup> and *ρ*<sup>2</sup> satisfy *ρ*<sup>1</sup> *>* 0, *ρ*<sup>2</sup> *>* 0, and *f* and *j* mean the positive odd integer and satisfy *f > j*. Then we have

$$\mathbf{C}\dot{\mathbf{x}}(t) = -\gamma \Phi \Big(\rho\_1 (\mathbf{C}\mathbf{x}(t) - \mathbf{e}) + \rho\_2 (\mathbf{C}\mathbf{x}(t) - \mathbf{e})^{j/f}(t)\Big). \tag{10}$$

To simplify the formula, Φ(*·*) uses the linear activation function. Then we have

$$\frac{\mathrm{d}E(t)}{\mathrm{d}t} = -\gamma \left(\rho\_1 E(t) + \rho\_2 E^{j/f}(t)\right),\tag{11}$$

and

$$\text{Tolving set at } \epsilon.$$

and

$$\mathbf{C}\dot{\mathbf{x}}(t) = -\gamma \left(\rho\_1 (\mathbf{C}\mathbf{x}(t) - \mathbf{e}) + \rho\_2 (\mathbf{C}\mathbf{x}(t) - \mathbf{e})^{j/f}\right),\tag{12}$$

which is called the finite-time recurrent neural network (FTRNN) model to online deal with the complex-valued system of linear equation. In addition, for design formula (11) and FTRNN model (12), we have the following two theorems to ensure their finitetime convergence properties.

**Theorem 1.** *The error function E*(*t*) *of design formula (11) converges to zero within finite-time t<sup>u</sup> regardless of its randomly generated initial error E*(0)*:*

$$t\_{\mathfrak{u}} = \frac{f}{\gamma \rho\_1 (f - j)} \ln \frac{\rho\_1 h\_{\mathcal{M}}(\mathbf{0})^{(f - j)/f} + \rho\_2}{\rho\_2},$$

*where hM*(0) *means the maximum element of the matrix E*(0).

P. For design formula (11), we have

$$\frac{\mathrm{d}E(t)}{\mathrm{d}t} = -\left(\gamma\rho\_1 E(t) + \gamma\rho\_2 E^{j/f}(t)\right). \tag{13}$$

To deal with the dynamic response of the equation (13), the above differential equation can be rewritten as below:

$$E^{-j/f}(t) \diamond \frac{\mathrm{d}E(t)}{\mathrm{d}t} + \gamma \rho\_1 E^{(f-j)/f}(t) = -\gamma \rho\_2,\tag{14}$$

where the matrix-multiplication operator *⋄* means the Hadamard product and can be written as

$$\mathcal{W}\diamond\mathcal{S} = \begin{bmatrix} \mathcal{W}\_{11}\mathcal{S}\_{11}, & \mathcal{W}\_{12}\mathcal{S}\_{12}, & \cdots & \cdot, & \mathcal{W}\_{1n}\mathcal{S}\_{1n} \\ \mathcal{R}\_{21}\mathcal{S}\_{21}, & \mathcal{W}\_{21}\mathcal{S}\_{21}, & \cdots & \cdot, & \mathcal{W}\_{2n}\mathcal{S}\_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \mathcal{W}\_{m1}\mathcal{S}\_{m1}, & \mathcal{W}\_{m2}\mathcal{S}\_{m2}, & \cdots & \cdot, & \mathcal{W}\_{mn}\mathcal{S}\_{mn}, \end{bmatrix} \in \mathbb{R}^{m \times n}.$$

Now let us define *Y*(*t*) = *E* (*f*–*j*)/*f* (*t*). Then, we have

$$\frac{\mathrm{d}Y(t)}{\mathrm{d}t} = \frac{f-j}{f} E^{-j/f}(t) \diamond \frac{\mathrm{d}E(t)}{\mathrm{d}t} .$$

Thus, the differential equation (14) can be equivalent to the following first order differential equation:

$$\frac{\mathrm{d}Y(t)}{\mathrm{d}t} + \frac{f-j}{f}\gamma\rho\_1 Y(t) = -\frac{f-j}{f}\gamma\rho\_2 I. \tag{15}$$

This is a typical first order differential equation, and we have

$$Y(t) = \left(\frac{\rho\_2}{\rho\_1}I + Z(0)\right) \exp\left(-\frac{f-j}{f}\gamma\rho\_1 t\right) - \frac{\rho\_2}{\rho\_1}I.\tag{16}$$

So we have

$$E^{(f-j)/f}(t) = \left(\frac{\rho\_2}{\rho\_1}I + E^{(f-j)/f}(0)\right) \exp\left(-\frac{f-j}{f}\gamma\rho\_1 t\right) - \frac{\rho\_2}{\rho\_1}I,\tag{17}$$

$$E(t) = \left[ \left( \frac{\rho\_2}{\rho\_1} I + E^{(f-j)/f}(0) \right) \exp\left( -\frac{f-j}{f} \gamma \rho\_1 t \right) - \frac{\rho\_2}{\rho\_1} \right]^{f/(f-j)}.\tag{18}$$

From the equation (18), we can find the error matrix *E*(*t*) will converge to 0 in *tu*, and

$$\left(\frac{\rho\_2}{\rho\_1}I + E^{(f-j)/f}(\mathbf{0})\right) \exp\left(-\frac{f-j}{f}\gamma\rho\_1\mathbf{t}\_\mu\right) - \frac{\rho\_2}{\rho\_1}I = \mathbf{0}.\tag{19}$$

Considering each element of the matrix *E*(*t*) has the same identical dynamics, we have

$$t\_{ik} = \frac{f}{\gamma \rho\_1 (f - j)} \ln \frac{\rho\_1 h\_{ik}^{(f - j)/f}(\mathbf{0}) + \rho\_2}{\rho\_1},\tag{20}$$

where *hik* means the *ik*th element of the matrix *E*(0), and *tik* means the *ik*th finite-time convergence upper bound of the matrix *E*(*t*). Let *hM*(0) = max(*hik*). Then for any *ik*th element of the matrix *E*(*t*), we have the maximum convergence time:

$$t\_u = \frac{f}{\gamma \rho\_1 (f - j)} \ln \frac{\rho\_1 h\_M(0)^{(f - j)/f} + \rho\_2}{\rho\_2}.$$

According to the above analysis, we can draw a conclusion that the error matrix *E*(*t*) will converge to 0 within the finite time *t<sup>u</sup>* regardless of its initial value *E*(0). Now the proof is completed.

**Theorem 2.** *The state matrix X*(*t*) *of FTRNN model (12) will converge to the theoretical solution of (5) in finite time t<sup>u</sup> regardless of its randomly generated initial state x*(0)*, and*

$$t\_u \in \left\{ \frac{f}{\gamma \rho\_1 (f - j)} \ln \frac{\rho\_1 h\_L(\mathbf{0})^{(f - j)/f} + \rho\_2}{\rho\_2}, \right.$$

$$\frac{f}{\gamma \rho\_1 (f - j)} \ln \frac{\rho\_1 h\_M(\mathbf{0})^{(f - j)/f} + \rho\_2}{\rho\_2} \right\}$$

*where hM*(0) *and hL*(0) *mean the largest and the smallest elements of the matrix E*(0), *respectively.*

P. Let *x*(*FT*)(*t*) represent the solution of the FTRNN model (12), *x*(*org*)(*t*) represent the theoretical solution of the equation (5), and ˜*x*(*t*) represent the difference between *x*(*FT*)(*t*) and *x*(*org*)(*t*). Then, we can obtain

$$
\tilde{\mathbf{x}}(t) = \mathbf{x}\_{(FT)}(t) - \mathbf{x}\_{(org)}(t) \in \mathbb{R}^{2n \times 2n}.\tag{21}
$$

The equation (21) can be written as

$$
\varkappa\_{(FT)}(t) = \tilde{\varkappa}(t) + \varkappa\_{(org)}(t) \in \mathcal{R}^{2n \times 2n}.\tag{22}
$$

Substitutes the above equation into FTRNN model (12), we have

$$\mathcal{C}(\dot{\tilde{\mathbf{x}}}(t) + \dot{\mathbf{x}}\_{\text{(org)}}(t)) = -\gamma \Big(\rho\_1 (\mathcal{C}(\tilde{\mathbf{x}}(t) + \mathbf{x}\_{\text{(org)}}(t)) - \mathbf{e})$$

$$+ \rho\_2 (\mathcal{C}(\tilde{\mathbf{x}}(t) + \mathbf{x}\_{\text{(org)}}(t)) - \mathbf{e})^{\dagger/\dagger} \Big). \tag{23}$$

Frontiers in Neurorobotics | www.frontiersin.org

*,*

Considering *Cx*(*org*)(*t*) *− e* = 0 and *C x*˙(*org*)(*t*) = 0, the above equation can be written as

$$\mathbf{C}\dot{\tilde{\mathbf{x}}}(t) = -\gamma \Big(\rho\_1(\mathbf{C}\tilde{\mathbf{x}}(t) - e) + \rho\_2(\mathbf{C}\tilde{\mathbf{x}}(t) - e)^{j/f}\Big).$$

Furthermore, considering *E*(*t*) = *C*(*x*˜(*t*) + *x*(*org*)(*t*)) *− e*, *Cx*(*org*)(*t*) *− e* = 0, and *E*(*t*) = *Cx*˜(*t*), the above differential equation can be written as

$$\frac{\mathrm{d}E(t)}{\mathrm{d}t} = -\gamma \Big(\rho\_1 (E(t) - e) + \rho\_2 (E(t) - e)^{j/f} \Big).$$

Let *E*˜(*t*) = *E*(*t*) *− e*, then we have

$$\frac{d\tilde{E}(t)}{dt} = -\gamma \left(\rho\_1 \tilde{E}(t) + \rho\_2 \tilde{E}^{j/f}(t)\right). \tag{24}$$

So according to the equation (20), we have

$$\tilde{t}\_{\rm ik} = \frac{f}{\gamma \rho\_1 (f - j)} \ln \frac{\rho\_1 \tilde{h}\_{\rm ik}^{(f - j)/f}(\mathbf{0}) + \rho\_2}{\rho\_1},\tag{25}$$

where ˜*tik* means the time upper of *ik*th solution of the matrix *E*˜(*t*), and ˜*hik* means the *ik*th initial error value of the matrix *E*˜(0).

Let us define ˜*h<sup>M</sup>* = max( ˜*hik*(0)), and ˜*h<sup>L</sup>* = min( ˜*hik*(0)) with *i, k* = 1, 2, *. . . n*. Then for all possible *i* and *k*, we have

$$\begin{aligned} &\frac{f}{\gamma \rho\_1(f-j)} \ln \frac{\rho\_1 \tilde{h}\_L^{(f-j)/f}(\mathbf{0}) + \rho\_2}{\rho\_1} \\ &\leqslant \tilde{t}\_{\vec{k}}(t) \leqslant \frac{f}{\gamma \rho\_1(f-j)} \ln \frac{\rho\_1 \tilde{h}\_M^{(f-j)/f}(\mathbf{0}) + \rho\_2}{\rho\_1} .\end{aligned}$$

The above equation shows that the state matrix *x*˜(*t*) = *x*(*FT*)(*t*)*−x*(*org*)(*t*) will converges to 0 within finite time regardless of its initial error value. In another word, the matrix *x*(*FT*)(*t*) for the FTRNN model (12) will converge to the theoretical solution *x*(*org*)(*t*) for the theoretical model (5) within finite time regardless of its randomly generated initial state *x*(0). Now the proof is completed.

### **3. COMPUTER SIMULATION**

In this section, a digital example will be carried out to show the superiority of FTRNN model (12) to GNN model (6) and ZNN model (8). We can choose the design parameters *f* and *j*, which satisfy *f > j*. For example, we choose *f* = 5 and *j* = 1 in this paper. In addition to this, to substantiate the superiority of FTRNN model (12), we choose the same complex-valued matrix *A* and *b* as these of the paper (Xiao et al., 2015). Then we have

$$A = \begin{bmatrix} -0.7597 + 0.6503j & -0.8391 - 0.5440j & 0.2837 - 0.9589j & 1 \\ 0.7597 + 0.6503j & -0.8391 + 0.5440j & -0.2837 - 0.9589j & 1 \\ 0.7597 - 0.6503j & -0.8391 - 0.5440j & -0.2837 + 0.9589j & 1 \\ 0 - 1.0000j & -1.0000 & 0 + 1.0000j & 1 \end{bmatrix}.$$

So we have

$$A\_{\rm re} = \begin{bmatrix} -0.7597 & -0.8391 & 0.2837 & 1 \\ 0.7597 & -0.8391 & -0.2837 & 1 \\ 0.7597 & -0.8391 & -0.2837 & 1 \\ 0 & -1.0000 & 0 & 1 \end{bmatrix},$$

and

$$A\_{\rm im} = \begin{bmatrix} 0.6503 & -0.5440 & -0.9589 & 0 \\ 0.6503 & 0.5440 & -0.9589 & 0 \\ -0.6503 & -0.5440 & 0.9589 & 0 \\ -1.0000 & 0 & 1.0000 & 0 \end{bmatrix}.$$

Now the randomly generated vector *b* = [1.0000, 0.2837 + 0.9589*j*, 0.2837 *−* 0.9589*j*, 0]<sup>T</sup> in Xiao et al. (2015) is employed in this paper. The theoretical solution of the complex-valued linear equation system can be written as *z*(*org*) = [*−*0.4683*−*0.2545*j*, 1.2425 + 0.3239*j*, *−*0.6126 + 0.0112*j*, 1.5082 + 0.4683*j*]. Then according to the equation (5), we have


and *e* = [1.0000, 0.2837, 0.2837, 0, 0, 0.9589, *−*0.9589, 0]<sup>T</sup> . So the theoretical solution of the complex-valued linear equation system can be rewritten as *x*(*org*) = [*−*0.4683, 1.2425, *−*0.6126, 1.5082, *−*0.2545, 0.3239, 0.0112, 0.4683]<sup>T</sup> .

First, a zero initial complex-valued state *z*(0) *∈* C 4 is generated, which can be transformed into the real-valued state *x*(0) *∈* R 8 in real domain. To help facilitate the contrast, we choose the design parameter *γ* = 5 and *γ* = 500, respectively.

Now GNN model (6), ZNN model (8), and FTRNN model (12) are applied to solve this complex-valued systems of linear equation problem. The output trajectories of the corresponding neuralstate solutions are displayed in **Figures 1**–**3**. As seen from such three figures, we can conclude that the output trajectories of the neural-state solutions can converge to the theoretical solutions, but the convergence rates are different. By comparison, we can easily find that FTRNN model (12) has a fastest convergence property.

To directly show the solution process of such three neuralnetwork models, the evolution of the corresponding residual errors, measured by the norm ||*E*(*t*)||2, is plotted in **Figure 4** under the conditions of *γ* = 5 and *γ* = 500. From **Figure 4A**, the results are consistent with those of **Figures 1**–**3**. In addition, from **Figure 4B**, the convergence speeds of GNN model (6), ZNN model (8), and FTRNN model (12) can be improved as the value of *γ* increases.

Now we can draw a conclusion that, as compared with GNN model (6) and ZNN model (8), FTRNN model (12) has the most superiority for solving the complex-valued system of linear equation problem.

### **4. APPLICATION TO ROBOTIC MOTION TRACKING**

In this section, a five-link planar manipulator is used to validate the applicability of the finite-time recurrent neural network (FTRNN) (Zhang et al., 2011). It is well known that the kinematics equations of the five-link planar manipulator at the position level and at the velocity level are, respectively, written as follows (Xiao and Zhang, 2013, 2014a,b, 2016; Xiao et al., 2017c):

$$r(t) = f(\theta(t))\tag{26}$$

$$
\dot{r}(t) = J(\theta)\dot{\theta}(t) \tag{27}
$$

where *θ* denotes the angle vector of the five-link planar manipulator, *r*(*t*) denotes the end-effector position vector, *f*(*·*) stands for a smooth non-linear mapping function, and *J*(*θ*) = *∂f*(*θ*)/*∂θ ∈ R m×n* .

To realize the motion tacking of this five-link planar manipulator, the inverse kinematic equation has been solved. Especially, equation (27) can be seen as a system of linear equations when the end-effector motion tracking task is allocated [i.e., *r*˙(*t*) is known and ˙*θ*(*t*) needs to be solved]. Thus, we can use the proposed FTRNN model (12) to solve this system of linear equations. Then, based on the design process of FTRNN model (12), we can obtain the following dynamic model to track control of the five-link planar manipulator [based on the formulation of equation (27)]:

$$\mathbf{C}\dot{\mathbf{x}}(t) = -\gamma \left(\rho\_1(\mathbf{C}\mathbf{x}(t) - \mathbf{e}) + \rho\_2(\mathbf{C}\mathbf{x}(t) - \mathbf{e})^{j/f}\right),$$

where *C* = *J*, *x* = ˙*θ* and *e* = *r*˙(*t*).

In the simulation experiment, a square path (with the radius being 1 m) is allocated for the five-link planar manipulator to

track. Besides, initial state of the mobile manipulator is set as *θ*(0) = [*π*/4, *π*/4, *π*/4, *π*/4, *π*/4]<sup>T</sup> , *γ* = 10<sup>3</sup> and task duration is 20 s. The experiment results are shown in **Figures 5** and **6**. From the results shown in such two figures, we can obtain that the fivelink planar manipulator completes the square path tracking task successfully.

### **5. CONCLUSION**

In this paper, a finite-time recurrent neural network (FTRNN) for the complex-valued system of linear equation in complex domain is proposed and investigated. This is the first time to propose such a neural network model, which can convergence within finite time to online deal with the complex-valued system of linear equation in complex domain, and the first time to apply this FTRNN model for robotic path tracking by solving the system of linear equation. The simulation experiments show that the proposed FTRNN model has better effectiveness, as compared to the GNN model and the ZNN model for the complex-valued system of linear equation in complex domain.

### **AUTHOR CONTRIBUTIONS**

LD: experiment preparation, publication writing; LX: experiment preparation, data processing, publication writing; BL: technology support, data acquisition, publication review; RL: supervision of data processing, publication review; HP revised the manuscript.

### **FUNDING**

This work is supported by the National Natural Science Foundation of China under grants 61503152 and 61363073, the Natural Science Foundation of Hunan Province, China under grants 2016JJ2101 and 2017JJ3258), the National Natural Science Foundation of China under grants, 61563017, 61662025, and 61561022, and the Research Foundation of Jishou University, China under grants 2017JSUJD031, 2015SYJG034, JGY201643, and JG201615.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, YZ, and handling editor declared their shared affiliation.

*Copyright © 2017 Ding, Xiao, Liao, Lu and Peng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

*Junqing Yang1,2, Ruituo Huai2 , Hui Wang2 , Wenyuan Li1 \*, Zhigong Wang1 , Meie Sui2 and Xuecheng Su2*

*<sup>1</sup> Institute of RF and OE-ICs, Southeast University, Nanjing, China, 2College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao, China*

An evaluation method is described that will enable researchers to study fight control characteristics of robo-pigeons in fully open space. It is not limited by the experimental environment and overcomes environmental interference with flight control in small experimental spaces using a compact system. The system consists of two components: a global positioning system (GPS)-based stimulator with dimensions of 38 mm × 26 mm × 8 mm and a weight of 18 g that can easily be carried by a pigeon as a backpack and a PC-based program developed in Virtual C++. The GPS-based stimulator generates variable stimulation and automatically records the GPS data and stimulus parameters. The PC-based program analyzes the recorded data and displays the flight trajectory of the tested robo-pigeon on a digital map. This method enables quick and clear evaluation of the flight control characteristics of a robo-pigeon in open space based on its visual trajectory, as well as further optimization of the microelectric stimulation parameters to improve the design of robo-pigeons. The functional effectiveness of the method was investigated and verified by performing flight control experiments using a robo-pigeon in open space.

#### *Edited by:*

*Xin Luo, Chongqing Institute of Green and Intelligent Technology (CAS), China*

#### *Reviewed by:*

*Hiroaki Wagatsuma, Kyushu Institute of Technology, Japan Eiji Uchibe, Advanced Telecommunications Research Institute International, Japan*

#### *\*Correspondence:*

*Wenyuan Li liwyseu@163.com*

*Received: 12 April 2017 Accepted: 31 July 2017 Published: 14 August 2017*

#### *Citation:*

*Yang J, Huai R, Wang H, Li W, Wang Z, Sui M and Su X (2017) Global Positioning System-Based Stimulation for Robo-Pigeons in Open Space. Front. Neurorobot. 11:40. doi: 10.3389/fnbot.2017.00040*

Keywords: brain–computer interface, robo-pigeon, bio-robot, flight control, stimulator

### INTRODUCTION

Some animals have amazing senses of smell that enable them to detect narcotics, explosives, and pipeline leaks (Britt et al., 2008). Engineers have not developed any device with odor detection capabilities comparable to those of canines. Furthermore, some animals can traverse a variety of terrain types more efficiently than electromechanical robots or humans (Grinke et al., 2015); for instance, engineers have not developed a micro air vehicle with flight abilities comparable to those of pigeons. Animals could be employed to conduct search and rescue missions more efficiently than electromechanical robots if they could be controlled. Therefore, increasing interest is developing in the prospect of controlling animals and utilizing them as new kinds of robots.

Special kinds of bionic robots called bio-robots have been developed based on brain–computer interfaces (BCIs), which involve direct communication between the brain and an external device. The first BCI-based robo-rat was developed according to a "virtual reward" behavior model in 2002 (Talwar et al., 2002). Micro-electrodes were implanted into three regions in a rat's brain: the medial forebrain bundle (MFB) and the whisker representations in the left and right somatosensory cortices (SIs). The MFB was stimulated to generate intense excitement as a virtual reward, while the left

and right SIs were stimulated to generate "virtual touches" in the corresponding directions. With stimulation of both the MFB and SIs, the robo-rat could be trained to perform certain behaviors, such as "walk forward," "turn left," and "turn right" in certain circumstances (Xu et al., 2004). These BCI-based bio-robots were developed successfully, and proof-of-concept tests were conducted in which motion along a complicated route was performed by remote control, even in 3-D terrain, by utilizing a telemetry system for brain microstimulation. BCI-based bio-robots are different from electromechanical robots as they are controlled by electrical stimulation of specific regions within the brains of the employed animals. In addition, bio-robots incorporate animals' visual, audio, and tactile sensory capabilities, which increase their intelligence without requiring any extraneous attachments. Bio-robots' movements are not dependent on motors, which are necessary in electromechanical robots and have high energy consumptions. Consequently, bio-robots are not limited by energy shortage capabilities when traveling over long distances and are more skilled than electromechanical robots when conducting complex missions. As bio-robots are superior to electromechanical robots in many potential applications, researchers have been investigating different types of bio-robots, such as rats (Feng et al., 2007; Huai et al., 2009; Pi et al., 2010; Zhang et al., 2012; Su et al., 2014; Zheng et al., 2015; Yu et al., 2016), geckos (Guo et al., 2009), sharks (Gomes et al., 2006), goldfishes (Kobayashi et al., 2009), carps (Peng et al., 2011), cockroaches (Holzer and Shimoyama, 1997), pigeons (Su et al., 2012), beetles (Hirotaka et al., 2008), and honeybees (Bao et al., 2011).

We developed the first BCI-based robo-pigeon using a new "virtual fear" behavior model in 2007 (Xinhua, 2007). In this type of robo-pigeon, microelectrodes are implanted into three motion-related nuclei in the brain: the left and right dorsalis intermedius ventralis anterior (DIVA) nuclei and the periaqueductal gray (PAG) region. The robo-pigeon can then be controlled *via* neural reactions to functional electrical stimulation. In this method, charge is transferred into the three motion-related regions in the brain of the robo-pigeon, externally exciting the membrane potentials of the neurons and inducing the neurons to fire in response. The efficacy of a robo-pigeon closely depends on the amount of charge transferred to the three motion-related neural tissues. The transferred charge is determined by several factors, including the frequency, number, and duration of the stimulation pulses, as well as the locations and surface coating of the stimulating electrodes. It is impractical to maintain the same precise electrode positioning in the brains of different pigeons (i.e., pigeons of different species, weights, or ages) during surgical implantation. Electrode positioning inaccuracy makes it essential to vary and optimize the stimulus settings in every stimulation channel for each robo-pigeon.

We previously reported on the remote control of robopigeons using a conventional neural stimulator with an RF transceiver (Yang et al., 2015). Forward motion and turns to the left and right were achieved by stimulating the PAG region and the left and right DIVA nuclei, respectively, by utilizing a telemetry system for brain microstimulation in laboratory environments, i.e., small enclosed spaces, typically experimental chambers or workshops, which are completely different from the open spaces in which robo-pigeons would actually be employed, limit their flight, and interfere with optimization of the stimulus settings. In addition, robo-pigeons would be out of the sight of their handlers in open space, and the telemetry system previously employed in laboratory environments is useless in such situations. These problems will be solved by the new system and method described in this paper, which can be utilized for flight control and optimization of the stimulus settings of robopigeons in open space.

### MATERIALS AND METHODS

### Overview

We describe the design of the system in the context of robo-pigeon fight control experiments in this section. The system consists of two separate components: an integrated global positioning system (GPS)-based microstimulator and a customized C++ program. The former is mounted on the back of a robo-pigeon and connected to the electrodes implanted in its brain and is responsible for generating micro-electrical stimulation and recording the experimental data during the flight control test. The latter is run on a PC or laptop and is in charge of analyzing the experimental data and displaying the results.

### GPS-Based Microstimulator

The GPS-based microstimulator in the proposed system is primarily composed of a microprocessor (ATmega8L, Atmel Inc.), a trans-flash (TF) card module, and a micro GPS module (SR-92) with a built-in patch antenna (ProGin Technology Inc.). All of the components are assembled on a printed circuit board and powered by a 3.7 V polymer battery.

**Figure 1** illustrates the circuitry of the GPS-based microstimulator in detail. Reg710-3.3 is a 3.3 V regulator. The programmable microprocessor (ATmega8L) has 23 digital I/O pins: two of these pins are employed to communicate with the GPS module as a serial port, and three pins are used to operate the TF card as a serial peripheral interface (SPI) port. Six pins are retained to generate a biphasic pulse as three separate output channels. In each output channel, two pins (PC0 and PC1, PC2 and PC3, or PC4 and PC5) are used to stimulate one of the three motion-related brain regions with constant voltage biphasic stimulus pulses. The GPS-based microstimulator also records the experimental data, including the location and velocity of the bio-robot and the corresponding stimulation parameters (pulse number, duration, and frequency) during its flight.

One program runs on the microprocessor onboard the robopigeon and generates gradient stimulation in every output channel of the GPS-based microstimulator. After the flight control test has begun, each channel is initiated with the same stimulation parameters: pulse number = 2, pulse duration = 0.2 ms, and pulse frequency = 80 Hz. Then, the corresponding pulse trains are applied alternately to stimulate the three motion-related brain regions. Simultaneously, the GPS data and stimulation parameters are recorded on the TF card. During the robo-pigeon flight control test, the stimulus is delivered independently to DIVA or PAG according to different circumstances of flight test

Table 1 | Formatted data recording.


or take-off test. The stimulus parameters are increased gradually and alternately according to the following rules: the pulse duration, number, and frequency are increased in increments of 1 (unit: 0.1 ms), 5, and 10, respectively. One parameter is changed in each stimulation cycle until each parameter reaches its maximum value (pulse number = 20, pulse duration = 0.8 ms, pulse frequency = 120 Hz.).

The microprocessor receives the GPS data and extracts the latitude, longitude, and speed from the GPRMC [one of many sentences in the National Marine Electronics Association (NMEA) standard for GPS receiver, for all sentences start with GP, RMC-recommended minimum data for GPS]-formatted GPS data stream. These data and the corresponding stimulation parameters are written into a file allocation table file on the TF card as a record with the format shown in **Table 1**. Each record is composed of a pre-header (PH), GPRMC-formatted GPS data, a command, and stimulation parameters. For example, in "@3559.8758, N, 12006.8668, E, 20.91, L, 10, 4, 80," "@" is a header identifying the start of the record, the location of the object is 35°59.8758′ N and 120°6.8668′ E, the speed of the object is 20.91 knots (1 knot = 1.852 km/h), "L" indicates stimulation of the left DIVA (L: stimulation of left DIVA, R: stimulation of right DIVA, T: stimulation of PAG, I: idle), and the stimulation parameters are: pulse number = 10, pulse duration = 0.4 ms, and pulse frequency = 80 Hz. Each update of the GPS data triggers a new recording; therefore, all of the experimental data are recorded one by one on the TF card during the flight of the robo-pigeon in open space.

### Data processing

The proposed data processing system includes a TF card reader and a PC-based program developed in Visual C++. The program processes the experimental data that are recorded on the TF card and displays the experimental results, as shown in **Figure 2**. The results are presented in the form of the 2-D trajectory of the investigated robo-pigeon that is drawn automatically by the program on a digital map based on the experimental data. Each trajectory is composed of numerous dots with four different shapes. Each dot corresponds to a data record and a position of the bio-robot during its flight. The shapes represent the following command types: ▲ = L command, ■ = R command, ● = T command, and ☆ = idle, i.e., without a command.

The trajectory in **Figure 2** demonstrates how the control characteristics of the robo-pigeon can be evaluated. Each segment of the trajectory can be categorized as corresponding to effective control or ineffective control. The effective control situations include stimulating the right DIVA nucleus and inducing a right turn (SRDRT), stimulating the left DIVA nucleus and inducing a left turn (SLDLT), and stimulating the PAG region and inducing take-off (SRTO), while the ineffective control situations include stimulating the right DIVA nucleus but not inducing a turn (SRDNT), stimulating the left DIVA nucleus but not inducing a turn (SLDNT), and stimulating the PAG region but not inducing take-off (SRNT). For effective control, the stimulus parameter settings can be determined and optimized for every stimulation channel and the corresponding stimulation parameters recorded. If ineffective control is observed, further investigation should be

conducted on the locations and surface coating of the stimulating electrodes and robo-pigeon design optimization strategies should be identified.

## RESULTS

We manufactured the modules shown in **Figure 3A**: a micro-GPS module, output leads, a TF card, and a motherboard. All of the other units were mounted on the motherboard through slots to assemble the GPS-based microstimulator depicted in **Figure 3B**. The GPS-based microstimulator was powered by a rechargeable 3.7 V, 240 mAh polymer battery attached to the back side of the motherboard and electrically connected to the motherboard when used by turning on the microswitch. The battery of the system lasts about 2.3 h, which is enough for conducting a complete test of robo-pigeon, compared with an average of slightly more than 1 h needed for each test. The GPS-based microstimulator was mounted on the back of a robo-pigeon as a backpack, and the terminals of the six output leads were connected to the stimulating electrodes located in the three brain regions (the left and right DIVA nuclei and the PAG region). The backpack measured 38 mm × 26 mm × 8 mm, weighed 18 g, and could easily be carried as a backpack by the pigeon, as shown in **Figures 3C,D**.

The microprocessor was preloaded with a C program to generate stimulation and record the data automatically following the process illustrated in **Figure 4**. As soon as a valid GPS position was received by the microprocessor through the serial port, it entered the test or idle state. The idle state was employed to avoid

fatigue due to continuous testing. In the following test, the speed of the robo-pigeon was first determined based on the GPS data. If the speed was more than 0.3 m/s, the flight test was conducted as follows: stimulation was applied to the left or right DIVA nucleus according to the rules mentioned above. Otherwise, a take-off test involving PAG stimulation was conducted. In each test cycle, the GPS position data and corresponding stimulation parameters were arranged based on the format of **Table 1** and written into a text file on the TF card. Only the GPS position data were recorded for the idle state.

When the test was completed, the text file on the TF card was transferred to a computer *via* a TF card reader and processed using a custom PC-based program to display the test results presented in **Figure 5**. The satellite picture was vectorized into a digital electronic map using SuperMap Deskpro 5.0 (SuperMap Software Co., Ltd.). Based on the data on the TF card, a series of vector graphs representing the flight control characteristics was constructed on the digital map using different types of identification dots. In **Figure 5**, each dot represents a position of the robo-pigeon during its flight, each ▲ denotes a position at which left DIVA stimulation was issued, each ■ represents a position at which right DIVA stimulation was issued, and each ☆ indicates a position at which no stimulation was issued.

For convenience of illustration, a solid auxiliary line with arrows was drawn manually and divided into several segments, which are labeled as L1, R1, F1, L2, R2, L3, R3, and F2 in **Figure 5**. Each segment corresponds to the command type and stimulation

Figure 5 | Results of robo-pigeon flight control test in open space.

parameters listed in **Table 2** according to the data recorded on the TF card.

Command and stimulation are determined by a program running on the GPS stimulator, the whole vector graphs are drawn automatically by our custom PC-based program based on the GPS data on the TF card. **Table 2** shows command and stimulation corresponding to flight trajectory in the **Figure 5**. L1 corresponds to left DIVA stimulation and contains ten points, each point represents a position on the digital map and has the same stimulation (pulse number = 20, pulse duration = 0.4 ms, pulse frequency = 100 Hz) shown in **Table 2**. R1 corresponds to right DIVA stimulation, it has the same relation between the stimulation and the GPS data with L1. F1 is corresponding to idle without stimulation. According to stimulation gradient rules mentioned above, the frequency was increased in increments of 10 and a new combination of parameters (pulse number = 20, pulse duration = 0.4 ms, pulse frequency = 110) was generated, L2 and R2 are flight trajectory corresponding to left and right DIVA stimulation with the new parameter.


Then, the pulse duration was increased in increments of 1 (unit 0.1 ms). L3 and R3 are results of fight control with the new stimulation (pulse number = 20, pulse duration = 0.5 ms, pulse frequency = 110).

According to the above results, the following conclusions can be drawn: (1) the tested robo-pigeon can be controlled in left and right direction in open space, (2) the multiple parameter combinations shown in **Table 2** are effective for the tested robo-pigeon, (3) the minimal stimulus parameters (pulse number = 20, pulse duration = 0.4 ms, pulse frequency = 100 Hz) could be determined as the optimized stimulus parameters for flight control of the robo-pigeon, taking into account energy consumption and potential nerve damage.

A solid auxiliary line with arrows was drawn manually and divided into two segments, which are labeled as T\_1 and F\_1 in **Figure 6**. T\_1 demonstrates an effective take-off using the stimulation listed in **Table 3** according to the data recorded on the TF card. F\_1 is corresponding to idle without stimulation. The stimulus parameters (pulse number = 20, pulse duration = 0.6 ms, pulse frequency = 110 Hz) could be determined as the optimized stimulus parameters for take-off of the robo-pigeon.

Results such as those obtained in this study will enable researchers to analyze and evaluate the control characteristics of robo-pigeons and optimize the parameter settings for each stimulation channel. Furthermore, such results will facilitate the identification of problems that could not be exposed in laboratory environments and the development of strategies to improve automatic robo-pigeon navigation in potential applications.

### DISCUSSION

The stimulus parameters and position are not recorded synchronously due to difficulties of positioning a flying robo-pigeon

Figure 6 | Results of robo-pigeon take-off test in open space.



in the indoor environment, so, the results of flight control experiment in the laboratory need to be manually analyzed and judged by the experimenter. Moreover, it is impossible to draw accurately the motion trajectory of robo-pigeons only based on experimental videos, in addition, this work is time consuming and laborious. Therefore, the experimental results cannot be objectively displayed and quantitatively analyzed. More unfortunately, those experimental data are not complete due to space constraints in the laboratory. So, it does not apply to open space, which is actual environment of potential application of robopigeons. In view of the above deficiencies, we proposed a method based on the newly designed GPS-based stimulator, it has the capabilities of collecting experimental data and generating visual trajectory of robo-pigeons automatically.

Random errors originating from the electrode positioning, electrode coating, and devices used for robo-pigeon development are inevitable. Those errors cause the parameter settings to differ between individuals and could even lead to the functional failure of a robo-pigeon. Therefore, it is essential to test and evaluate the control characteristics of robo-pigeons in open space. The method proposed just provides a simple and feasible solution to the complex and tedious tests.

The system described in this report overcomes some of the drawbacks of previous telemetry systems by providing an evaluation method of robo-pigeon flight control testing in open space. A special feature of the system is that it can record experimental data and display experimental results automatically. The system is not limited by the experimental environment and

### REFERENCES


will enable researchers to evaluate the control characteristics of robo-pigeons in open space quickly and clearly, to optimize the parameter settings for every stimulation channel, and to identify methods of improving the robo-pigeon design process. Simultaneously, the system will enrich brain stimulation research methods and enable bio-robot experiments to be conducted in open environments.

### ETHICS STATEMENT

Experiments were done on adult pigeons, microelectrodes were implanted into three emotion-related nuclei in pigeon's brain: left and right DIVA and PAG. This study was approved by the Ethics Committee of Shandong University of Science and Technology (Agreement number: SDUST201406-1-036). All of the experiments were performed in accordance with the guidelines issued by the Ethics Committee of Shandong University of Science and Technology, and they complied with the Guidelines for the Use of Animals of the International Brain Research Organization. All of the pigeons used in this study were well cared for and lived in groups in a special dovecote, never sacrificed.

### AUTHOR CONTRIBUTIONS

Conception or design of the work: JY, WL, RH, ZW, and XS. Experiment: JY, WL, RH, HW, MS, and XS. Data analysis: JY, WL, HW, XS, and MS. Drafting the article: JY, RH, HW, WL, and MS. Critical revision of the article: WL, JY, ZW, and XS. All the authors have approved the final manuscript.

### FUNDING

This work was supported by the National Natural Science Foundation of China (Nos. 61203370, 61305129, 61471119, and 61534003) and Qingdao Economic and Technological Development Zone Program (No. 2013-1-36).

tetherless microsystem," in *2008 IEEE 21st International Conference on Micro Electro Mechanical Systems* (Tucson), 164–167.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Yang, Huai, Wang, Li, Wang, Sui and Su. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A New Noise-Tolerant Obstacle Avoidance Scheme for Motion Planning of Redundant Robot Manipulators

#### Dongsheng Guo\*, Feng Xu, Laicheng Yan, Zhuoyun Nie and Hui Shao

College of Information Science and Engineering, Huaqiao University, Xiamen, China

Avoiding obstacle(s) is a challenging issue in the research of redundant robot manipulators. In addition, noise from truncation, rounding, and model uncertainty is an important factor that affects greatly the obstacle avoidance scheme. In this paper, based on the neural dynamics design formula, a new scheme with the pseudoinverse-type formulation is proposed for obstacle avoidance of redundant robot manipulators in a noisy environment. Such a scheme has the capability of suppressing constant and bounded time-varying noises, and it is thus termed as the noise-tolerant obstacle avoidance (NTOA) scheme in this paper. Theoretical results are also given to show the excellent property of the proposed NTOA scheme (particularly in noise situation). Based on a PA10 robot manipulator with point and window-shaped obstacles, computer simulation results are presented to further substantiate the efficacy and superiority of the proposed NTOA scheme for motion planning of redundant robot manipulators.

Keywords: obstacle avoidance, noise tolerant, pseudoinverse-type formulation, redundant robot manipulators, motion planning

### 1. INTRODUCTION

Recently, redundant robot manipulators have played an increasingly important part in many scientific and industrial fields. Motion planning is the fundamental issue, and has been extensively studied (Siciliano and Khatib, 2008; Siciliano et al., 2009; Flacco and De luca, 2015; Qiu et al., 2016; Zhang et al., 2016; Li M. et al., 2017; Jin and Li, 2018). A collision-free motion is necessary for a redundant robot manipulator because collision would lead to the failure of the motion planning task. Moreover, such collision(s) may cause considerable damage to the robot manipulator. Avoiding environmental obstacle(s) is an important issue in the motion planning of redundant robot manipulators. Many effective obstacle avoidance schemes have thus been developed for redundant robot manipulators (Maciekewski and Klein, 1985; Wang and Hamam, 1992; Chen et al., 2002; Lee and Buss, 2007; Guo and Zhang, 2012, 2014; Marcos et al., 2012; Chen and Zhang, 2016; Xiao and Zhang, 2016; Guo and Li, 2016). For example, Lee and Buss (2007) investigated obstacle avoidance by using the Jacobian transpose method. In Marcos et al. (2012), Machado et al. presented a multi-objective method for redundant robot manipulators to avoid obstacles. Note that most reported obstacle avoidance schemes are assumed to be free of noise in the simulations or experiments.

Given its practical application in the industry, another important issue for redundant robot manipulators that requires consideration is the simultaneous handling of environmental noise(s)

#### Edited by:

Florian Röhrbein, Technische Universität München, Germany

#### Reviewed by:

Bolin Liao, Jishou University, China Zhijun Zhang, South China University of Technology, China Dechao Chen, Sun Yat-sen University, China Weibing Li, University of Leeds, United Kingdom Xing He, Southwest University, China Jörn Fischer, Mannheim University of Applied Sciences, Germany

> \*Correspondence: Dongsheng Guo gdongsh@hqu.edu.cn

Received: 15 May 2017 Accepted: 07 August 2018 Published: 29 August 2018

#### Citation:

Guo D, Xu F, Yan L, Nie Z and Shao H (2018) A New Noise-Tolerant Obstacle Avoidance Scheme for Motion Planning of Redundant Robot Manipulators. Front. Neurorobot. 12:51. doi: 10.3389/fnbot.2018.00051 during the end-effector task execution (Yildirim and Eski, 2010). Noise is inevitably encountered when implementing the scheme for obstacle avoidance of redundant robot manipulators; this noise comes in the form of truncation error, rounding error, model uncertainty, and external disturbance (Jin et al., 2017a; Li et al., 2018). The robustness against additive noise is an important factor in designing a reliable obstacle avoidance scheme. Many researchers have thus examined robot manipulators in the presence of noise (Gaudiano et al., 1996; Florchinger, 2000; Siu et al., 2010; Yildirim and Eski, 2010; Ting et al., 2011; Guo et al., 2017; Jin et al., 2017a; Li et al., 2018). For example, Yildirim and Eski (2010) presented a noise analysis of robot manipulators using neural networks. In Li et al. (2018) designed a new approach for redundant robot manipulators that was combined with the neural controller inherently tolerant to noise.

The aforementioned noise (e.g., realization error and/or external error) has significant effects on the efficacy of the obstacle avoidance schemes for redundant robot manipulators. Such noise may also cause scheme invalidation or the failure of the end-effector primary task. For time-critical end-effector tasks, denoising must be integrated with motion planning for redundant robot manipulators (Li et al., 2018). Time is precious for the obstacle avoidance of redundant robot manipulators in practice, and any noise-reduction preprocessing may consume additional time (which may be in breach of the real-time requirement) (Li et al., 2018). Thus, an effective obstacle avoidance scheme is worth investigating for redundant robot manipulators in a noisy environment. Such a scheme must be inherently tolerant to various types of noise and able to make the robot manipulator avoid obstacles simultaneously.

In recent years, neural dynamics (as an important branch of artificial intelligence) has attracted considerable attention (Zhang et al., 2002; Zhang and Yi, 2011; He et al., 2014a,b; Zhang and Guo, 2015; Li et al., 2016, 2017a,b). It has been studied for solving different mathematical problems arising in motion planning of redundant robot manipulators. In particular, an exponent-type design formula was proposed by Zhang et al. (Zhang et al., 2002; Zhang and Yi, 2011; Zhang and Guo, 2015), and different neural-dynamics models were further developed to solve various types of time-varying problem. Note that some of these models have been applied effectively to redundant robot manipulators, showing well the application prospect of the neural-dynamics approach. By following the aforementioned successful work, another neural-dynamics design formula, which has noise suppression capability, was proposed and investigated by Jin et al. (2016a,b, 2017b). In this paper, such a design formula is exploited to develop an effective obstacle avoidance scheme for motion planning of redundant robot manipulators in the presence of noise.

Specifically, by using the neural-dynamics design formula in Jin et al. (2016a,b, 2017b) to solve the system of time-varying nonlinear kinematic equations, the corresponding obstacle avoidance scheme with the pseudoinverse-type formulation is proposed for redundant robot manipulators in this paper. Such an obstacle avoidance scheme can suppress constant and bounded time-varying noises, and is thus termed as the noise-tolerant obstacle avoidance (NTOA) scheme. For the proposed NTOA scheme, theoretical results are presented to show its excellent property. Computer simulation results are illustrated based on the PA10 robot manipulator with point and window-shaped obstacles to further substantiate the efficacy and superiority of the proposed NTOA scheme for motion planning of redundant robot manipulators.

The rest of this paper is organized into five sections. In section 2, the formulations of the neural-dynamics design formula and the NTOA scheme are given. Section 3 presents the theoretical results of the proposed NTOA scheme. In section 4, simulation results that are synthesized by the proposed NTOA scheme are provided. Section 5 shows the discussion about the NTOA scheme. Section 6 concludes this paper with final remarks. The main contributions of this paper are as follows.


### 2. DESIGN FORMULA AND SCHEME FORMULATION

In this section, the formulation of the neural-dynamics design formula is presented. Then, by exploiting this design formula to solve the system of time-varying nonlinear kinematic equations, the corresponding obstacle avoidance scheme is developed for motion planning of redundant robot manipulators in the presence of noise.

### 2.1. Neural-Dynamics Design Formula

Let us consider the following time-varying problem:

$$
\phi(t) = 0 \in \mathbb{R}^n.
$$

In the neural-dynamics design methodology (Zhang et al., 2002; Zhang and Yi, 2011; Zhang and Guo, 2015), to solve this problem, the following vector-valued error function is defined:

$$e(t) := \phi(t) \in \mathbb{R}^n.$$

Then, the time derivative of the error function e(t), i.e., e˙(t), is selected such that e(t) converges to zero. In Zhang et al. (2002), an exponent-type design formula was originally developed by Zhang et al. to determine e˙(t). Such a design formula is written as follows:

$$
\dot{e}(t) = -\gamma e(t), \tag{1}
$$

where design parameter γ > 0 ∈ R is used to scale the convergence rate of the solution (Zhang et al., 2002; Zhang and Yi, 2011; Zhang and Guo, 2015). Corresponding to the specific time-varying problem to be solved, expanding (1) yields the related neural-dynamics model. Furthermore, based on (1), another neural-dynamics design formula with noise suppression capability was developed by Jin et al. (2016a,b, 2017b). This design formula is written as follows:

$$
\dot{e}(t) = -k\_{\rm Pe} e(t) - k\_{\rm I} \int\_0^t e(\tau)d\tau,\tag{2}
$$

where k<sup>P</sup> > 0 ∈ R and k<sup>I</sup> > 0 ∈ R are the design parameters. Please refer to Jin et al. (2016a,b, 2017b) for details about the property of (2). Based on previous research, such a neural dynamics design formula (2) is applied to the obstacle avoidance of redundant robot manipulators in the ensuing section.

### 2.2. NTOA Scheme

The redundancy-resolution problem related to the motion planning of redundant robot manipulators is described as follows: given the desired end-effector path rd(t) ∈ R <sup>m</sup>, the corresponding joint trajectory θ(t) ∈ R n should be obtained in real time t. In mathematics, solving the redundancy-resolution problem can be equivalent to solving the following system of time-varying nonlinear kinematic equations:

$$f(\theta) = r\_{\mathsf{d}}.\tag{3}$$

where f(·) is a differentiable nonlinear mapping.

To solve (3), the error function e(t) is defined as follows:

$$e(t) = f(\theta) - r\_{\mathrm{d}} \in \mathcal{R}^m.$$

Evidently, expanding (2) yields the following result:

$$J\dot{\theta} = \dot{r}\_{\rm d} - k\mathbb{p}(f(\theta) - r\_{\rm d}) - k\_{\rm l} \int\_{0}^{t} (f(\theta) - r\_{\rm d}) d\tau,\tag{4}$$

where J ∈ R m×n is the Jacobian matrix of the robot manipulator, θ˙ ∈ R n is the joint-velocity vector, and r˙<sup>d</sup> ∈ R <sup>m</sup> is the time derivative of rd. By generalizing the conventional pseudoinversetype formulation (Siciliano and Khatib, 2008; Siciliano et al., 2009), the following redundancy-resolution scheme for the motion planning of redundant robot manipulators is obtained:

$$\dot{\boldsymbol{\theta}} = \boldsymbol{J}^{\dagger}(\dot{\boldsymbol{r}}\_{\rm d} - k\_{\rm P}(f(\boldsymbol{\theta}) - \boldsymbol{r}\_{\rm d}) - k\_{\rm I} \int\_{0}^{t} (f(\boldsymbol{\theta}) - \boldsymbol{r}\_{\rm d}) d\boldsymbol{\tau}) + (\boldsymbol{I} - \boldsymbol{J}^{\dagger} \boldsymbol{J}) \boldsymbol{z}. \tag{5}$$

As mentioned previously, avoiding environmental obstacle(s) is an important issue in the motion planning of redundant robot manipulators. Thus, by choosing a suitable z in (5), the NTOA scheme proposed in this paper for redundant robot manipulators is formulated as follows:

$$\dot{\theta} = J^{\uparrow}(\dot{r}\_{\rm d} - k\_{\rm P}(f(\theta) - r\_{\rm d}) - k\_{\rm I} \int\_{0}^{t} (f(\theta) - r\_{\rm d}) \mathrm{d}\tau) + \kappa \sum\_{i=1}^{n-\xi} V\_{Ni}^{\rm T} \nabla H(\theta) V\_{Ni,i} \tag{6}$$

where κ ∈ R is a real scalar (Li et al., 2001), ς = rank(J), and superscript <sup>T</sup> denotes the transpose operator. In addition, ∇H(θ) is the gradient of a performance criterion H(θ) (being a scalar function of the joint-angle vector θ ∈ R n ), and VNi is the ith column vector of V<sup>N</sup> ∈ R <sup>n</sup>×(n−ς) with V<sup>N</sup> = [Vς+<sup>1</sup> Vς+<sup>2</sup> · · · Vn]. V<sup>j</sup> (with j = ς + 1, ς + 2, · · · , n) is the jth column vector of the orthogonal unitary matrix V ∈ R n×n , which is obtained by using the singular value decomposition (SVD) of J (Lee and Buss, 2007). That is,

$$J = USV^{\mathrm{T}}\_{\cdot}$$

where U ∈ R <sup>m</sup>×<sup>m</sup> is the orthogonal matrix and S ∈ R m×n contains the singular values of J in its main diagonal. For obstacle avoidance, ∇H(θ) in (6) is replaced with the following escape velocity θ˙<sup>C</sup> in the joint space (Lee and Buss, 2007):

$$\phi\_C = \sum\_{i=1}^k f\_{Ci}^\mathrm{T}(\theta) \nu\_{C\vec{v}j},$$

where k is the number of critical points and JCi(θ) is the Jacobian matrix of the ith critical point C<sup>i</sup> . In addition, vCij is defined as a function of the minimum distance dij (between the ith link and the jth obstacle) along the direction away from the critical point C<sup>i</sup> (Lee and Buss, 2007):

$$\nu\_{\mathbf{C}\ddot{\boldsymbol{\eta}}} = \begin{cases} 0 & \text{for } d\_1 < d\_{\dot{\boldsymbol{\eta}}}, \\ \frac{\nu\_0}{2} \{ \cos(\pi \frac{d\_{\ddot{\boldsymbol{\eta}}} - d\_2}{d\_1 - d\_2}) + 1 \} u\_{\dot{\boldsymbol{\eta}}} & \text{for } d\_2 < d\_{\dot{\boldsymbol{\eta}}} \lesssim d\_1, \\ \nu\_0 u\_{\dot{\boldsymbol{\eta}}} & \text{for } d\_{\dot{\boldsymbol{\eta}}} \lesssim d\_2, \end{cases}$$

where v<sup>0</sup> is the maximum escape velocity and uij is the unit vector from the critical point C<sup>i</sup> on the ith link to the jth obstacle. In addition, the predefined thresholds d<sup>1</sup> and d<sup>2</sup> are the outer and inner safety thresholds, respectively. Descriptions of the outer and inner safety thresholds can be seen in Guo and Zhang (2014) and/or Guo and Li (2016).

Note that, for the proposed NTOA scheme (6), this paper limits the investigation that the two design parameters k<sup>P</sup> and k<sup>I</sup> satisfy k 2 <sup>P</sup> > 4k<sup>I</sup> numerically. Furthermore, theoretical results of (6) are presented in the ensuing section to show its effectiveness on motion planning and noise suppression.

### 3. THEORETICAL RESULTS

In this section, four theorems are provided to investigate the performance of the proposed NTOA scheme (6) in three situations, namely, the zero noise, constant noise, and bounded time-varying noise.

### 3.1. Obstacle Avoidance Without Considering Noise

In this subsection, the proposed NTOA scheme (6) is studied for redundant robot manipulators without considering the existence of noise (i.e., zero noise situation).

**Theorem 1:** The trajectory of the Cartesian error e(t) for the proposed NTOA scheme (6) is asymptotically stable.

**Proof:** See **Appendix A** in Supplementary Material.

**Theorem 2:** In addition to Theorem 1, the Cartesian error e(t) synthesized by the proposed NTOA scheme (6) has the property of exponential convergence.

**Proof:** See **Appendix B** in Supplementary Material.

For the proposed NTOA scheme (6), the results of Theorems 1 and 2 indicate that the corresponding Cartesian error e(t) has the property of global and exponential convergence. This means that the magnitude of e(t) synthesized by (6) can be kept within the region of a small value. By choosing the k<sup>P</sup> and k<sup>I</sup> values appropriately, the e(t) magnitude can be small enough during the motion task execution (whether noise exists or not), then the performance of (6) is considered satisfactory (De Luca et al., 1992). In summary, these results theoretically guarantee that the proposed NTOA scheme (6) is effective in the motion planning of redundant robot manipulators.

### 3.2. Obstacle Avoidance With Noise Considered

In this subsection, the proposed NTOA scheme (6) is investigated considering the existence of noise (i.e., the situations of constant noise and bounded time-varying noise). For further investigation, the proposed NTOA scheme (6) under the pollution of noise is formulated as follows:

$$\dot{\theta} = f^\dagger \left( \dot{r}\_{\rm d} - k\_{\rm P}(f(\theta) - r\_{\rm d}) - k\_{\rm I} \int\_0^t (f(\theta) - r\_{\rm d}) d\tau + \delta(t) \right) + \kappa \sum\_{i=1}^{n-\xi} V\_{\rm Ni}^T \nabla H(\theta) V\_{\rm Ni}, \tag{7}$$

where δ(t) ∈ R <sup>m</sup> denotes the vector-form noise (e.g., constant realization errors, time-varying bias errors, and the superposition of these errors). Now, the performance of the noise-polluted NTOA scheme (7) is studied for redundant robot manipulators.

**Theorem 3:** In the case of the vector-form constant noise δ(t) = c ∈ R <sup>m</sup>, the Cartesian error e(t) synthesized by the noise-polluted NTOA scheme (7) has a convergence property.

**Proof:** See **Appendix C** in Supplementary Material.

As mentioned before, the convergence property can guarantee a small magnitude of the Cartesian error e(t) during the motion task execution. Based on the proof in **Appendix C** in Supplementary Material, no matter how large the unknown vector-form constant noise δ(t) = c ∈ R <sup>m</sup> is, the Cartesian error e(t) for the noise-polluted NTOA scheme (7) is convergent, with the steady-state error being zero. Thus, as synthesized by (7) with the appropriate values of k<sup>P</sup> and k<sup>I</sup> , the magnitude of e(t) can be small enough.

In many practical applications, the noise can be time-varying. For the fast time-varying noise, the low-pass filter could be exploited to tackle the problem of error compensation. However, the noise-reduction preprocessing may consume extra time and violate the real-time requirement. Thus, investigating the performance of the noise-polluted NTOA scheme (7) in the presence of bounded time-varying noise (or even random noise) is necessary.

**Theorem 4:** In the case of the bounded vector-form timevarying noise δ(t) ∈ R <sup>m</sup>, the Cartesian error e(t) synthesized by the noise-polluted NTOA scheme (7) is bounded, with the steady-state error being bounded by

$$2\sqrt{\frac{m}{k\_{\rm P}^2 - 4k\_{\rm I}}} \max\_{0 \le \mathfrak{r} \le t} |\delta\_i(\mathfrak{r})|,$$

where δi(t) denotes the ith element of δ(t) and the symbol | · | denotes the absolute value of a scalar.

**Proof:** See **Appendix D** in Supplementary Material.

From Theorem 4, the upper bound of the steady-state error is in inverse proportion to the k<sup>P</sup> value. By choosing a large enough k<sup>P</sup> value and an appropriate k<sup>I</sup> value, the steady-state error can be arbitrarily small, which means that the magnitude of e(t) synthesized by (7) can also be small enough during the task execution. This result further shows that the noisepolluted NTOA scheme (7) is effective in the motion planning of redundant robot manipulators in the presence of bounded time-varying noise.

In summary, the results in this section (i.e., Theorems 1– 4) theoretically substantiate the efficacy and superiority of the proposed NTOA scheme (6) for redundant robot manipulators whether noise exists or not.

### 4. SIMULATIVE VERIFICATIONS

In this section, based on the PA10 robot manipulator with an equipped long tool (Zhang and Wang, 2004; Guo and Zhang, 2012, 2014), computer simulations with the existence of point and window-shaped obstacles are performed to verify the efficacy and superiority of the proposed NTOA scheme (6). For comparison, the obstacle avoidance method presented in Lee and Buss (2007) is simulated as well.

### 4.1. Point Obstacle Avoidance

In this example, the PA10 robot manipulator is simulated, in which there exists a point obstacle. The obstacle avoidance method in Lee and Buss (2007) and the proposed NTOA scheme (6) are both applied to such a robot manipulator for the endeffector tracking of a circular path considering three situations (i.e., the situations of zero, constant, and bounded time-varying noises).

#### 4.1.1. Zero Noise

The obstacle avoidance method in Lee and Buss (2007) and the proposed NTOA scheme (6) are both tested in the zero-noise situation, and the corresponding simulation results are presented in **Figures 1**–**3** and **Table 1**.

**Figure 1** shows the simulation results synthesized by the general minimum velocity norm (MVN) scheme [i.e., the proposed NTOA scheme (6) using k<sup>P</sup> = k<sup>I</sup> = κ = 0], in which the existence of the point obstacle is not considered. As seen from **Figure 1A**, the desired motion is achieved successfully by the

TABLE 1 | Maximal absolute value (MAV) of Cartesian error synthesized by the proposed NTOA scheme (6) with zero noise considered and with different kP and kI values used.


PA10 robot manipulator. However, **Figure 1B** indicates that the minimum link-obstacle distance d<sup>m</sup> is smaller than 0.05 m during [4.91, 6.25] s. This can be viewed as a collision phenomenon (Guo and Zhang, 2014; Guo and Li, 2016) that may cause damage to the robot manipulator and the point obstacle. Thus, exploiting an effective obstacle avoidance method/scheme for PA10 robot manipulator is necessary.

**Figure 2** illustrates the simulation results synthesized by the obstacle avoidance method in Lee and Buss (2007), where the existence of the point obstacle is considered. As shown in **Figures 2A,B**, during the motion task execution, the minimum link-obstacle distance d<sup>m</sup> is always greater than 0.05 m. This substantiates that the presented point obstacle is successfully avoided by the obstacle avoidance method in Lee and Buss (2007). **Figure 2C** shows that the maximal Cartesian error is approximately 5.0 × 10−<sup>5</sup> m. However, the detailed results in **Figure 2C** show that the divergence phenomenon in the Cartesian error is present. Thus, the obstacle avoidance method in Lee and Buss (2007), although effective, may be less desirable and less applicable in robotics.

By contrast, **Figure 3** presents the simulation results synthesized by the proposed NTOA scheme (6) with k<sup>P</sup> = k<sup>I</sup> = 10 and κ = 1. As seen from **Figures 3A,B**, the minimum link-obstacle distance d<sup>m</sup> during the motion is always >0.05 m, which implies that obstacle avoidance is achieved successfully. In addition, as shown in **Figure 3C**, the maximal Cartesian error is <5.0 × 10−<sup>6</sup> m, indicating the efficacy of the proposed NTOA scheme (6) for the motion planning of PA10 robot manipulator. The comparison between **Figure 2C** and **Figure 3C** shows that the maximal Cartesian error via (6) is about 10 times smaller than the one via the obstacle avoidance

TABLE 2 | Maximal absolute value (MAV) of Cartesian error synthesized by the proposed NTOA scheme (6) with constant noise considered and with different kP and kI values used.


method in Lee and Buss (2007). Furthermore, the divergence phenomenon does not exist for the Cartesian error presented in **Figure 3C**. Thus, a prominent advantage of the proposed NTOA scheme (6) is that it guarantees a Cartesian error with no divergence (which agrees with the convergence results in Theorems 1 and 2). These comparative results indicate that the proposed NTOA scheme (6) is superior to the obstacle avoidance method in Lee and Buss (2007).

For further investigation, the proposed NTOA scheme (6) is tested by using different k<sup>P</sup> and k<sup>I</sup> values, and the related simulation results are presented in **Table 1**. As seen from **Table 1**, the maximal absolute value (MAV) of the Cartesian error synthesized by (6) is small enough (i.e., of order 10−<sup>7</sup> ∼ 10−<sup>6</sup> ), showing the efficacy on motion planning. In addition, it follows from **Table 1** that the MAV of Cartesian error decreases when the k<sup>P</sup> and k<sup>I</sup> values increase. Thus, the design parameters k<sup>P</sup> and k<sup>I</sup> play an important role in (6), and should be set as large as the robotics systems would permit (or selected appropriately large for simulation/experiment purposes).

In summary, the above results (i.e., **Figures 1**–**3** and **Table 1**) substantiate the efficacy and superiority of the proposed NTOA scheme (6), as compared with the obstacle avoidance method in Lee and Buss (2007).

window-shaped obstacle and noise is not considered.

#### 4.1.2. Constant Noise

The proposed NTOA scheme (6) is tested in the situation of the constant noise δ(t) = c = [0.10, 0.15, 0.20]<sup>T</sup> . In this situation, the noise-polluted NTOA scheme (7) is actually used, and the corresponding simulation results are shown in **Figure 4** and **Table 2**. Note that the obstacle avoidance method in Lee and Buss (2007) with constant noise considered is test as well. However, the computer simulation failed, which shows that such a method does not have a capability of suppressing noise and cannot handle this kind of noise (thereby leading to the failure of the task execution).

**Figure 4** shows the simulation results synthesized by the noise-polluted NTOA scheme (7) with k<sup>P</sup> = k<sup>I</sup> = 10<sup>3</sup> and κ = 1. As seen from **Figure 4**, the obstacle-avoidance purpose is achieved successfully via (7) in the sense that d<sup>m</sup> is always >0.05 m. In addition, the simulated end-effector trajectory is close to the desired circular path with a small Cartesian error. The existence of constant noise leads to the significant increase of the Cartesian error (from the zero initial value) in the transient phase, as shown in **Figure 4C**. Owing to the noise suppressing capability, (7) can handle this kind of noise. The resultant Cartesian error is convergent (which agrees with the result of Theorem 3), and is kept within the region of a small value (i.e., of order 10−<sup>4</sup> ). These results indicate that the noise-polluted NTOA scheme (7) is effective for robotic practical applications.

The noise-polluted NTOA scheme (7) is tested as well using different k<sup>P</sup> and k<sup>I</sup> values, and the related simulation results are presented in **Table 2**. As shown in **Table 2**, the MAV of Cartesian

error is small enough, meaning that the motion planning task is executed successfully via (7) (though constant noise exists). **Table 2** also indicates that the performance of (7) is improved effectively by increasing the k<sup>P</sup> and k<sup>I</sup> values, showing again the important role of k<sup>P</sup> and k<sup>I</sup> in the noise-polluted NTOA scheme (7).

In summary, the above simulation results substantiate the efficacy and superiority of the proposed NTOA scheme (6) [i.e., the performance of the noise-polluted NTOA scheme (7)] for motion planning of redundant robot manipulators in the presence of constant noise.

#### 4.1.3. Bounded Time-Varying Noise

The proposed NTOA scheme (6) is tested in the situation of the bounded time-varying noise δ(t) = [0.2 sin(t), 0.2 sin(2t), 0.2 sin(3t)]<sup>T</sup> . Similarly, the noise-polluted NTOA scheme (7) is actually used in this situation, and the corresponding simulation results are presented in **Figure 5**.

As shown in **Figure 5**, the minimum link-obstacle distance d<sup>m</sup> is always >0.05 m and the maximal Cartesian error is <3.0 × 10−<sup>4</sup> m. Thus, the purposes of obstacle avoidance and motion planning are both achieved successfully using the noise-polluted NTOA scheme (7). According to **Figure 5C**, the Cartesian error is bounded and is kept within (−3, 2) × 10−<sup>4</sup> m during the task execution. This coincides with the result of Theorem 4. The noise-polluted NTOA scheme (7) is tested using different k<sup>P</sup> and k<sup>I</sup> values for further investigation. Owing to the similarity of results, the related simulation results are omitted here. This result indicates that the MAV of the Cartesian error via (7) decreases as the k<sup>P</sup> and k<sup>I</sup> values increase.

In summary, these simulation results substantiate the efficacy and superiority of the proposed NTOA scheme (6) [or the excellent performance of the noise-polluted NTOA scheme (7)] for redundant robot manipulators in the presence of bounded time-varying noise.

### 4.2. Window-Shaped Obstacle Avoidance

In this example, the PA10 robot manipulator is simulated, in which there exists a window-shaped obstacle (Guo and Zhang, 2014). The proposed NTOA scheme (6) is applied to such a robot manipulator for its end-effector tracking a circular path. Similarly, the following three situations of noise are considered in the investigation of the proposed NTOA scheme (6):

 Zero noise: [0, 0, 0]<sup>T</sup> ,

 Constant noise: [0.1, 0.2, 0.3]<sup>T</sup>

 , Bounded time-varying noise: [0.2 cos(t), 0.2 cos(2t), 0.2 cos(3t)]<sup>T</sup> .

Without considering the existence of the window-shaped obstacle, **Figure 6** shows the simulation results synthesized by the general MVN scheme. As seen from **Figure 6**, the desired motion for the PA10 robot manipulator is achieved successfully, but the minimum link-obstacle distance d<sup>m</sup> is smaller than 0.05 m during [2.67, 3.96] s. This close distance (being <0.05 m) means that there exists a collision that may cause considerable damage to the PA10 robot manipulator and the window-shaped obstacle.

To avoid the window-shaped obstacle, the proposed NTOA scheme (6) [as well as the noise-polluted NTOA scheme (7)] is applied to the PA10 robot manipulator under the aforementioned three situations, and the simulation results are shown in **Figures 7**, **8**. Note that, for **Figure 7**, the left subfigures present the simulated motion trajectories of the robot manipulator, and the right subfigures present the corresponding profiles of the minimum link-obstacle distance dm. Evidently, as shown in **Figures 7**, **8**, all of the simulated end-effector trajectories are close to the desired circular path (with the Cartesian errors being small enough), and the minimum link-obstacle distance d<sup>m</sup> during the motion is always greater than 0.05 m. This indicates that the obstacle-avoidance and motion-planning purposes are both achieved successfully via (6) [or (7)], no matter whether noise exists or not. Besides, as seen from **Figure 8**, in each situation of noise, the change of the Cartesian error is similar to the change presented in **Figures 3C**, **4C**, **5C**, showing that these results coincide with the results of Theorems 1–4. Thus, being one of its prominent advantages, the proposed NTOA scheme (6) guarantees that the Cartesian error occurs without divergence, as evidenced by both the theoretical and simulation results.

In summary, these simulation results have substantiated again the efficacy and superiority of the proposed NTOA scheme (6) for motion planning of redundant robot manipulators in the presence of different kinds of noise.

### 5. DISCUSSION

For the proposed NTOA scheme (6), the feedback item kP(f(θ)− rd) and the integration item k<sup>I</sup> R t 0 (f(θ) − rd)dτ ) are incorporated into the scheme formulation. Such a scheme thus contains the proportional, integral, and derivative information of the desired end-effector path rd. In this sense, the proposed NTOA scheme (6) processes the characteristic of proportional-integralderivative (PID), thereby showing that (6) can be considered as a nonlinear PID controller for the obstacle avoidance of redundant robot manipulators. Because of the special characteristic, the proposed scheme (6) is robust against constant noise and bounded time-varying noise, and enables the effective obstacle avoidance of redundant robot manipulators even in the presence of noise. The efficacy of (6) has been analyzed and verified via the theoretical and simulation results in sections 3 and 4.

By summarizing the simulation results in section 4, the superiority of the proposed NTOA scheme (6) over the obstacle avoidance method in Lee and Buss (2007) is presented as follows.


In summary, the proposed NTOA scheme (6) is advantageous over the obstacle avoidance method in Lee and Buss (2007) because it guarantees nondivergence Cartesian error regardless of the absence or presence of noise. Given this characteristic, the proposed NTOA scheme (6) is superior to the obstacle avoidance method in Lee and Buss (2007) for redundant robot manipulators.

Besides, both the theoretical and simulation results in sections 3 and 4 have indicated that the design parameters k<sup>P</sup> and k<sup>I</sup> are important to ensure the precision of the Cartesian error for the proposed NTOA scheme (6). To a certain extent, such two parameters are similar to the PID parameters and can be used to enhance the noise suppression capability of the proposed NTOA scheme (6). In general, the values of k<sup>P</sup> and k<sup>I</sup> can be selected in accordance with the actual situations of noise and practical requirements of precision. Summarizing the theoretical analysis and simulation results shows that k<sup>P</sup> and k<sup>I</sup> should be set to sufficiently large values (e.g., k<sup>P</sup> = k<sup>I</sup> = 10<sup>3</sup> or larger) to ensure the satisfactory performance of the proposed NTOA scheme (6), particularly in the presence of noise.

### 6. CONCLUSION

In this paper, based on the neural dynamics design formula in Jin et al. (2016a,b, 2017b), the new NTOA scheme (6) is proposed and investigated for the motion planning of redundant robot manipulators in the presence of noise. This scheme, which is capable of suppressing constant and bounded time-varying noises, enables the robot manipulator to avoid obstacles successfully. Theoretical results are presented for the proposed NTOA scheme (6) [as well as the noise-polluted NTOA scheme (7)] to show its excellent performance in motion planning and its remarkable noise suppression capability. On the basis of the PA10 robot manipulator with point and windowshaped obstacles and different kinds of noise, simulation results are provided to further substantiate the efficacy and superiority of the proposed NTOA scheme (6).

### AUTHOR CONTRIBUTIONS

DG had the basic ideas and made lots of research on neural dynamics and its application to redundant robot manipulators. FX made detailed scheme simulations and summarized the research results in the manuscript. LY, ZN, and HS revised the manuscript duly and carefully. All of the authors discussed the results at length.

### FUNDING

This work is supported by the National Natural Science Foundation of China (with number 61603143 and 61572534),

### REFERENCES


the Natural Science Foundation of Fujian Province (with number 2016J01307), the Promotion Program for Young and Middle-aged Teacher in Science and Technology Research of Huaqiao University (with number ZQN-YX402 and ZQN-PY408), and also the Scientific Research Funds of Huaqiao University (with number 15BS410).

### ACKNOWLEDGMENTS

The authors would also like to thank the editors and reviewers sincerely for their time and effort spent in handling the paper, as well as the constructive comments for further improving the presentation and quality of this paper.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnbot. 2018.00051/full#supplementary-material


manipulators. Int. J. Syst. Sci. 47, 932–945. doi: 10.1080/00207721.2014. 909971


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Guo, Xu, Yan, Nie and Shao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Bio-Inspired Genetic Algorithms with Formalized Crossover Operators for Robotic Applications

Jie Zhang<sup>1</sup> \*, Man Kang<sup>1</sup> , Xiaojuan Li <sup>2</sup> and Geng-yang Liu<sup>1</sup>

*<sup>1</sup> College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China, <sup>2</sup> College of Information Engineering, Capital Normal University, Beijing, China*

Genetic algorithms are widely adopted to solve optimization problems in robotic applications. In such safety-critical systems, it is vitally important to formally prove the correctness when genetic algorithms are applied. This paper focuses on formal modeling of crossover operations that are one of most important operations in genetic algorithms. Specially, we for the first time formalize crossover operations with higher-order logic based on HOL4 that is easy to be deployed with its user-friendly programing environment. With correctness-guaranteed formalized crossover operations, we can safely apply them in robotic applications. We implement our technique to solve a path planning problem using a genetic algorithm with our formalized crossover operations, and the results show the effectiveness of our technique.

#### Edited by:

*Yangming Li, University of Washington, United States*

#### Reviewed by:

*Long Jin, Lanzhou University, China Duo Liu, Chongqing University, China Yi Wang, Shenzhen University, China Zili Shao, Hong Kong Polytechnic University, Hong Kong*

#### \*Correspondence:

*Jie Zhang jzhang@mail.buct.edu.cn*

Received: *01 August 2017* Accepted: *06 October 2017* Published: *24 October 2017*

#### Citation:

*Zhang J, Kang M, Li X and Liu G-y (2017) Bio-Inspired Genetic Algorithms with Formalized Crossover Operators for Robotic Applications. Front. Neurorobot. 11:56. doi: 10.3389/fnbot.2017.00056* Keywords: genetic algorithm, formalization, crossover operator, high order logic, HOL4

## INTRODUCTION

Genetic algorithms are widely adopted in robotic applications such as path planning (Hu and Yang, 2004; Taharwa et al., 2008; Achour and Chaalal, 2011; Liu et al., 2013; Sanfilippo et al., 2013; Gautam and Verma, 2014; Vicmudo et al., 2014). When genetic algorithms are applied in such safety-critical applications, it is extremely important to prove their correctness. Specially, crossover operators play a key role in searching for near-optimal solution in genetic algorithms. Therefore, it becomes an important issue for how to develop correctness-guaranteed formalized crossover operations in robotic applications (Zhou and Sun, 1999; Wang and Cao, 2002).

There have been studies to formalize crossover operations of genetic algorithms. In Uchibori and Endou (1999), completed the formalization of crossover operators. In Vidal et al. (2008), a mathematical abstraction of crossover operators is proposed to extend the applicability of formalized crossover operators in genetic algorithms. In Nawaz et al. (2013), the correctness of genetic algorithms with formalized crossover operators is verified. While the above studies lay the foundation for formalizing crossover operations of genetic algorithms, effective mechanisms and techniques are still urgently needed for developing correctness-guaranteed formalized crossover operations that can be easily deployed in genetic algorithms in practice.

In this paper, we for the first time develop correctness-guaranteed formalized crossover operations based on HOL4 (Higher-Order Logic 4) (HOL Project, 2017) that is easy to be deployed with its user-friendly programing environment. We first present a general structural model and construct the formal model of cross operators. Based on these, one-point crossover operator and multi-point crossover operator are then formalized and proved with HOL4. We conduct a case study by implementing the proposed technique in robotic applications to solve a path planning problem, in which a genetic algorithm with our formalized crossover operations has been developed, and the results show that our technique can be easily applied and effectively solve optimization problems with genetic algorithms.

The rest of paper is organized as follows. Section Manuscript Formatting presents background. Section Higher-order Logic Representation of Crossover Operators: Basic Elements describes the formal model of cross operations with HOL4. In sections Higher-order Logic Representation and Formal Verification of One-point Crossover Operators and Higher-order Logic Representation and Formal Verification of Multi-Point Crossover Operators, we formalize and prove one-point and multi-point crossover operators with HOL4, respectively. Section Discussion discusses the proposed technique. Section Evaluation evaluates the proposed work with a case study for implementing our technique to solve path planning in robotic applications. Finally, we conclude this paper in section Conclusion.

### MANUSCRIPT FORMATTING

### Population

In order to complete the formalization of crossover operators, we must formalize the population that is the base of the evolution of genetic algorithms and the workspace of crossover operators. According to the collective property of the population, a population is defined as the abstract set and is represented as ": bool list - > bool" in HOL4. We use the symbol D to represent the non-empty set of a population. In addition, in order to ensure that crossover operations can be carried out in the formalized population D to generate new chromosomes, population D also needs to meet the following two properties:


### Crossover Operations

A crossover operation is defined as the behavioral process in which offspring are produced by crossover operators. A crossover operation intercepts two parent chromosomes at the crossover point, and reconnects the dissected gene segments to create a new chromosome. **Figure 1** illustrate how a crossover operation works.

To implement the higher-order logic formalization of crossover operations, we can abstract the process shown in **Figure 1**. into three elements, namely, the operation object, the operation position and the basic operation. Based on this abstraction, **Figure 2** shows a structural model. In **Figure 2**, chromosomes are individuals in population D; chromosome p and chromosome q as operation objects that represent the two parent chromosomes; cross-term l denotes the operation position which is the set of crossover points; the basic operations consisting of TAKE, DROP and APPEND are the behavior operations used to complete gene exchange.

As shown in **Figure 2**, the operation objects and operation positions in the general model of crossover operations constitute the basic variables of the formal model, and the basic operations in the general structural model construct the behavior of the formal model. Moreover, the basic variables and the basic behavior operations will form the formal model of the crossover operation.

### HIGHER-ORDER LOGIC REPRESENTATION OF CROSSOVER OPERATORS: BASIC ELEMENTS

To realize the formalization of crossover operations, the prerequisite work is to use the higher-order logic to represent the basic elements of crossover operations. Therefore, the higherorder logic representation of the three basic elements in the above model and the proofs of their related properties are presented in this section.

### Higher-Order Logic Representation of Chromosomes

Since a chromosome is an arrangement of a limited number of genes, the data structure of chromosomes in HOL4 is defined as a list; the data type of elements in the list is defined as Boolean (: bool). Then a chromosome can be represented as a Boolean list (: bool list). Correspondingly, two parent chromosomes of an operation object can be represented by p and q respectively, p = p<sup>1</sup> . . . pn,q = q<sup>1</sup> . . . qn,p ∈ D,q ∈ D. Here p<sup>i</sup> (: bool) (1≤ i≤n) is the gene that constitutes the chromosome p (: bool list);q<sup>i</sup> (: bool) (1≤ i≤n) is the gene that forms the chromosome q (: bool list).

### Higher-Order Logic Representation of Cross-Term

The crossover position in crossover operations, called crossterm, is represented by a natural number. Thus, its data type is defined as natural number (: num) in HOL4. Since the crossover operators include one-point crossover and multi-point crossover, the number of crossover points may be one or more. Therefore, the data structure of cross-term in HOL4 is defined as natural number lists (: num list) and represented by l (: num list).

### Higher-Order Logic Representation of Basic Operations

As described above, the data structure of both chromosomes and cross-term are defined as lists. By analyzing the list theory base in HOL4, the operation functions, namely, TAKE, DROP and APPEND, exactly match the three basic operation functions in the general model. Therefore, TAKE is used to get the first n genes of chromosome p, abbreviated as p ↑ n; DROP is used to obtain the genes after the n-th position of chromosome p, abbreviated as q ↓ n; APPEND is utilized to connect the two chromosome fragments p<sup>1</sup> and q<sup>1</sup> to form a new chromosome, abbreviated as p<sup>1</sup> ++ q1.

The mathematical description of the three basic operations (TAKE, DROP and APPEND) is presented as follows.

For any p, q ∈ D, let n be the length of p, m the crossover point, k the length of q, where m, n, k ∈ N. The basic operations are defined as:

$$\begin{aligned} p \quad \text{TAKE} \quad m &= \begin{cases} (\not p\_1, \dots, \not p\_m) & \text{if} & m < n, \\ p & \text{if} & m \ge n. \end{cases} \\ \not p \quad \text{DROP} \quad m &= \begin{cases} (\not p\_{m+1}, \dots, \not p\_n) & \text{if} & m < n, \\ [] & \text{if} & m \ge n. \end{cases} \\ \not p \quad \text{APRED} \quad q &= (\not p\_1, \dots, \not p\_n, q\_1, \dots, q\_k). \end{aligned}$$

Here, [] denotes an empty list.

Based on the above definitions, the higher-order logic representations of the three basic operations in HOL4 can be expressed respectively as follows:

> val TAKE = [] |− (!l. TAKE 0 l = []) /\ !n x l. TAKE (SUC n) (x::l) = x::TAKE n l: thm > val DROP = [] |− (!l. DROP 0 l = l) /\ !n x l. DROP (SUC n) (x::l) = DROP n l: thm > val APPEND = [] |− (!l. [] ++ l = l) /\ !l1 l2 h. h::l1 ++ l2 = h::(l1 ++ l2): thm

### Formal Verification of Basic Operations

In HOL4 library, TAKE and DROP are used to manipulate the list, and they have two parameters, i.e., natural number and list. Function TAKE can cut the child list of list before the natural number, and function DROP can cut the child list of list after the natural number. In order to prove the properties of the formalized crossover operators, it is necessary to prove the properties of TAKE and DROP (Darmochwal and Nakamura, 1991; Kotowicz, 1993; Uchibori and Endou, 1999; Vidal et al., 2008; Nawaz et al., 2013). Since the existing properties of APPEND in HOL4 are sufficient, there is no need for more proofs. The basic properties of TAKE and DROP are classified as follows and their mathematical descriptions are given below.

Properties of TAKE: for any p, q ∈ D,m, n ∈ N


$$
\rho \uparrow 0 = \lceil \rceil \tag{2}
$$

$$\ast \colon ((p \uparrow m) \uparrow n) = (p \uparrow \text{MIN}(m, n)) \tag{3}$$

$$\ast \left( p \uparrow \text{MIN}(m, n) \right) = \left( \left( p \uparrow n \right) \uparrow m \right) \tag{4}$$

∗ ((p ↑ m) ↑ n) = ((p ↑ n) ↑ m) (5)

LENGTH p = LENGTH q ==> (6)

∗ LENGTH (p ↑ n) = LENGTH (q ↑ n)

$$\begin{aligned} \text{LENGTH } (p \uparrow n) &= \\ \text{MIN}(n, (\text{LENGTH } p \text{ })) \end{aligned} \tag{7}$$

$$\begin{aligned} \text{(}(m \text{<< = LENGTH } p) \text{)} \text{(}(m \text{<< = n}) \text{)} &==> \\ \text{(\* (m \text{<< = LENGTH } (p \uparrow n)) } \end{aligned} $$

(m<= LENGTH p ) ==>

$$\stackrel{\circ}{\ast}(m = \text{LENGTH } (p \uparrow m) )$$

$$\begin{aligned} \text{(LENGTH } p \text{ <<=n)} &== \\ \text{(} (p++q) \uparrow n &= p++\\ \end{aligned} \tag{10}$$

 $\ast \text{ ( $q \uparrow ($ n-\text{LENGTH } p\text{ }\text{)}\text{)})}$ 

$$\begin{aligned} \text{LENGTH ( $p \uparrow n$ ) } &= \\ \text{if} (n << \text{LENGTH } p \text{ )} \\ \text{then } n \end{aligned} \tag{11}$$

else (LENGTH p)

$$\text{(LENGTH } p \text{) } \prec = \newline n == \succ \text{ } p \uparrow \text{ } n = p \tag{12}$$

$$\begin{aligned} (n &:= \text{LENGTH } p \text{ }) == : \\ ((p ++ q) \uparrow n &= p \uparrow n) \end{aligned} \tag{13}$$

$$\begin{aligned} \text{(LENGTH } p \text{ $$

$$(\stackrel{\frown}{q} \uparrow (\stackrel{\frown}{n-} \text{LENGTH } p \text{ })))$$

$$\begin{aligned} \{ \text{LENGTH } p < n \} &== \\ \{ (p++q) \uparrow n &= p++\\ \{ q \uparrow (n-\text{LENGTH } p ) \} & \end{aligned} \tag{15}$$

$$\begin{aligned} (m &:= \text{LENGTH } p) \land (n &:= m) &== \\ (((p \uparrow m) \uparrow n) &= p \uparrow n) \end{aligned} \tag{16}$$

Properties of DROP: for any p, q ∈ D, m, n ∈ N

$$\*\left[\right] \downarrow \downarrow \ n = \left[\right] \tag{17}$$

$$
\mathfrak{p} \downarrow \mathfrak{0} = \mathfrak{p} \tag{18}
$$

$$p \downarrow 0 = q == \succ p = q \tag{19}$$

$$\ast \colon ((\not p \downarrow m) \downarrow n) = (\not p \downarrow (m+n)) \tag{20}$$

$$\ast \colon (p \downarrow (m+n)) = ((p \downarrow n) \downarrow m) \tag{21}$$

$$\*\left(\left(p\downarrow m\right)\downarrow n\right) = \left(\left(p\downarrow n\right)\downarrow m\right)\tag{22}$$

LENGTH p = LENGTH q ==>

$$\stackrel{\circ}{\ast} \text{LENGTH } (p \downarrow n) = \stackrel{\circ}{\text{LENGTH } (q \downarrow n)}$$

$$\text{LENGTH } (p \downarrow n) = (\text{LENGTH } p) - n \tag{24}$$

$$(\text{LENGTH } p \text{) } \prec = \newline n == \succ \text{ } p \downarrow \newline n = [] \newline \tag{25}$$

$$\begin{aligned} (n &:= \text{LENGTH } p \text{ }) == \\ ((p++q) \downarrow n &= (p \downarrow n) ++ q) \end{aligned} \tag{26}$$

$$(\text{LENGTH } p \prec = n) == >$$

$$(\text{(}p++q)\downarrow n = q\downarrow \text{(}n-\text{LENGTH }p\text{)})$$

$$\begin{aligned} ((n+m)\prec = \text{LENGTH } p) &== \\ (((p\downarrow m)\downarrow n) &= p\downarrow (n+m)) \end{aligned} \tag{28}$$

The relation between TAKE and DROP: for any p, q ∈ D,m, n ∈ N

$$\ast \ (p \uparrow n) \downarrow n = [] \tag{29}$$

$$\ast \text{ ( $p \downarrow n$ )} \uparrow m = \text{( $p \uparrow (m+n)$ )} \downarrow n \tag{30}$$

$$\ast \text{ ( $p \uparrow n$ )} \downarrow \text{ } m \text{ = ( $p \downarrow m$ )} \uparrow \text{ ( $n-m$ )}\tag{31}$$

In the above equations, the properties with ∗ are required to be proved in this paper, while these properties without ∗ have existed in HOL4 and need not be proved.

### Formal Modeling and Implementation of Crossover Operations with HOL4

As mentioned above, crossover operations are the process of generating offspring. In order to establish a formal model of crossover operation, we first construct the basic implementation flow of generating offspring based on the general structural model of crossover operation, as shown in **Figure 3**.

In **Figure 3**, p and q are two parent chromosomes; l is the cross-term that represents the crossover position; l = [s] indicates that there is only one crossover point s; chromosome p' is the offspring chromosome generated.

The basic implementation of crossover operation in **Figure 3** can only be used for one-point crossover operator. In order to apply the formalized crossover operation to other crossover operators, the crossover process is improved according to the characteristics of multi-point crossover operators.

In general, the process of multi-point crossover can be regarded as the repetition of one-point crossover. Therefore, when the number of crossover points are n (n > 1) in crossterm l, the operation objects of TAKE and DROP are the offspring chromosomes generated by n-1 rounds of crossover.

Let CROSSOVER represent a crossover operation. CROSSOVER crosses the chromosomes p and q in turns according to the crossover points in cross-term l. According to the features of the functional language, recursive methods can be used to achieve the repeated process between one-point crossover and multi-point crossover. **Figure 4** shows the implementation process of crossover operations.

As shown in **Figure 4**, CROSSOVER l p q is the offspring chromosome generated by the crossover operation with two parent chromosomes p and q. Similarly, CROSSOVER l q p is another offspring chromosome generated by the crossover operation with two parent chromosomes q and p. To complete the gene exchange, a crossover operation uses two basic operations, namely, TAKE and DROP. The operation object of

Frontiers in Neurorobotics | www.frontiersin.org

TAKE and DROP also contains the crossover operation itself, so the whole process contains two recursive lines. Because the two recursive lines are parallel, the method employed is called double recursion. According to the execution diagram of the double recursion, it can be observed that the recursion procedure is to reduce the size of the cross-term, while the regression process is to exchange genes at each crossover point in turn.

In **Figure 4**, the implementation procedure of the crossover operation can be viewed as a binary tree where the number of crossover points corresponds to the height of the binary tree. For the special case in which there is only one crossover point, the height of the full binary tree is one. Therefore, the crossover operation with the double recursion method, which can be used to construct one-point crossover and multi-point crossover, possesses generality. Moreover, the implementation process of this crossover operation can also be used to form other crossover operators such as uniform crossover operators and partially matched crossover operators.

According to the implementation process of the crossover operation, the mathematical description of the crossover operation is given as follows:

$$\text{CROSSOVER } l \text{ p } \mathbf{q} = \begin{cases} p & \text{if } l = []. \\ \text{((CROSSOVER } t \text{ p } q) \uparrow h) & \text{if } l = h \text{ : :t.} \\ ++ \text{((CROSSOVER } t \text{ q } p) \downarrow h) & \end{cases}$$

Based on the above mathematical description, the higher-order logic implementation of the crossover operation in HOL4 is given as follows:

> val CROSSOVER\_def = [] |− (!p q. CROSSOVER [] p q = p) /\ !h t p q. CROSSOVER (h::t) p q = TAKE h (CROSSOVER t p q) ++ DROP h (CROSSOVER t q p): thm

The higher-order logic description of the crossover operation is an important preliminary work for formalizing crossover operators. We further describe the one-point crossover operator and multi-point crossover operator using higher-order logic in HOL4 and complete the proofs of their relevant properties next.

### HIGHER-ORDER LOGIC REPRESENTATION AND FORMAL VERIFICATION OF ONE-POINT CROSSOVER OPERATORS

One-point crossover operator selects two chromosomes in population D as two parent chromosomes and one random crossover point, and then exchanges the chromosome segments at the crossover point to obtain two new offspring chromosomes.

Two parent chromosomes in population D are defined as follows:

$$\begin{array}{c} p = p\_1, p\_2, \dots, p\_n \\ q = q\_1, q\_2, \dots, q\_n \end{array}$$

p and q represent the two parent chromosomes; p<sup>i</sup> (1≤i≤n) and q<sup>i</sup> (1≤i≤n) express the genes that make up the chromosomes.

Choose an random intersection i(1≤i≤n),then generate two new offspring:

$$\begin{array}{l} p' = p\_1, \ldots, p\_i, q\_{i+1}, \ldots, q\_n \\ q' = q\_1, \ldots, q\_i, p\_{i+1}, \ldots, p\_n \end{array}$$

p ′ and q ′ denote the two offspring; p<sup>i</sup> (1≤i≤n) and q<sup>i</sup> (1≤i≤n) express the genes that make up the chromosomes.

### Formalization of One-Point Crossover Operator in HOL4

From the definition of the one-point crossover operator, it is known that the one-point crossover operator generates two chromosomes at the same time, while the crossover operation can only produce one offspring chromosome every time. Thus, the implementation of the one-point crossover operator needs two crossover operations. Since the two offspring are generated at the same time and their relation is parallel, two-tuples are used to indicate the relation between two offspring generated in the mathematical description of the one-point crossover operator as follows:

$$
\begin{array}{c}
\overline{\odot} \ n \ p \ q = \text{(CROSOVER [n] } \ p \ q, \\
\text{CROSOVER [n] } \ q \ p \text{)}.
\end{array}
$$

Symbol ⊙¯ represents an one-point crossover operator; CROSSOVER denotes the crossover operation; p and q are two parent chromosomes in population D; [n] is the crossover term with one crossover point n.

Based on the above mathematical descriptions, the higherorder logic description of the one-point crossover operator in HOL4 is as follows:

> val ONEPOINT\_CROSSOVER\_def = [] |− !n p q. ONEPOINT\_CROSSOVER n (p,q) = (CROSSOVER [n] p q, CROSSOVER [n] q p): thm

### Verification of One-Point Crossover Operator

In order to ensure the correctness of one-point crossover operator, we prove the four basic properties of the one-point crossover operator in HOL4.

Theorem 1: Given any p, q ∈ D and a random crossover point n, if LENGTH p = LENGTH q, then LENGTH (CROSSOVER [n] p q) = LENGTH p. The higher-order logic description is as follows:

> val OCROSSOVER\_LENGTH = [] |− !n p q.

p IN D /\q IN D /\ (LENGTH p = LENGTH q) ==> (LENGTH (CROSSOVER [n] p q) = LENGTH p): thm

Theorem 1 ensures that the one-point crossover does not change the length of chromosome.

Theorem 2: Given any p, q ∈ D, if LENGTH p = LENGTH q, 0 is the crossover point, then CROSSOVER [0] p q = q. The higher-order logic description is as follows:

> val OCROSSOVER\_ZERO = [] |− !p q. p IN D /\q IN D /\(LENGTH p = LENGTH q) ==> (CROSSOVER [0] p q = q): thm

Theorem 2 shows that when crossover point is 0, the offspring generated by one-point crossover operator are the same as parent chromosomes, but the order is exchanged, that is, the first offspring is the second parent and the second child is the first parent.

Theorem 3: Given any p, q ∈ D, if LENGTH p = LENGTH q, n is the crossover point andLENGTH p<n, then CROSSOVER [n] p q = p. The higher-order logic description is as follows:

> val OCROSSOVER\_TOO\_LONG = [] |− !n p q. p IN D /\q IN D /\ (LENGTH p = LENGTH q) /\ LENGTH p < n ==> (CROSSOVER [n] p q = p): thm

Theorem 3 guarantees that if the position of the crossing point is larger than the length of chromosome, then the offspring produced by the one-point crossover operator are the same as the parent chromosomes.

Theorem 4: Given any p, q ∈ D, if LENGTH p = LENGTH q, n is the crossover point and LENGTH p = n, then CROSSOVER [n] p q = p. The higher-order logic description is as follows:

> val OCROSSOVER\_EQ = [] |− !n p q. p IN D /\q IN D /\ (LENGTH p = LENGTH q) /\ (n = LENGTH p) ==> (CROSSOVER [n] p q = p): thm

Theorem 4 holds the property that if the position of the intersection is equal to the length of chromosome, the offspring generated by the one-point crossover operator are the same as the parents.

As mentioned above, Theorems 1-4 mainly reflect the relation between the positions of crossover points and the results produced by the one-point crossover operator. In addition, the formalization of the one-point crossover operator provides a good foundation for our analysis and design of formalization of the multi-point crossover operator.

### HIGHER-ORDER LOGIC REPRESENTATION AND FORMAL VERIFICATION OF MULTI-POINT CROSSOVER OPERATORS

With a multi-point crossover operator, two chromosomes in population D are selected as two parent chromosomes, and with n crossover points, we exchange the chromosome segments to eventually obtain two new offspring chromosomes.

Two parent chromosomes in population D are:

$$\begin{array}{l} p = p\_1, p\_2, \dots, p\_n \\ q = q\_1, q\_2, \dots, q\_n \end{array}$$

p and q represent the two parent chromosomes; p<sup>i</sup> (1≤i≤n) and q<sup>i</sup> (1≤i≤n) represent the genes that make up the chromosomes.

By randomly selecting n crossover points: i, j, k,. . . ., the offspring produced can be represented as follows:

$$\begin{array}{l} p' = p\_1, \ldots, p\_i, q\_{i+1}, \ldots, q\_j, p\_{j+1}, \ldots, p\_k, q\_{k+1} \ldots \\ q' = q\_1, \ldots, q\_i, p\_{i+1}, \ldots, p\_j, q\_{j+1}, \ldots, q\_k, p\_{k+1} \ldots \end{array}$$

Here, p ′ and q ′ denote the two offspring; p<sup>i</sup> (1≤i≤n).

### Formalization of Multi-Point Crossover Operator in HOL4

From the definition of the multi-point crossover operator, we can see that the multi-point crossover operator is similar to the one-point crossover operator. In both cases, progeny is generated in parallel. The difference is the number of crossover points. Therefore, the creation of multi-point crossover also needs two crossover operations. However, the cross-term denoted by l is an arrangement of multiple crossover points rather than only one point. When describing the multi-point crossover with mathematical methods, we still use two-tuples to represent the parallel relation between two offspring.

Therefore, the mathematical description of the multi-point crossover operator obtained is as follows:

$$
\overline{\otimes}\ l\ (p,q) = \text{(CROSSOVER }l\ p\ q\ ,\ \text{(CROSOVER }l\ q\ p).
$$

Symbol ⊗ represents multi-point crossover operator; CROSSOVER denotes crossover operation; p and q are two parent chromosomes in population D; l is the cross-term with multiple crossover points.

Based on the above mathematical description, a multi-point crossover operator in HOL4 can be denoted as follows:

> val MULTIPOINT\_CROSSOVER\_def = [] |− !l p q. MULTIPOINT\_CROSSOVER l (p,q) = (CROSSOVER l p q, CROSSOVER l q p): thm

### Verification of Multi-Point Crossover Operator

To ensure that the higher-order logic representation of a multipoint crossover operator is correct, its properties must be verified. In the following, Theorems 6 and 7 describe the relation between the crossover point and offspring generated by the multi-point crossover operator; Theorems 8-9 and Theorems 14-16 mainly show that the results produced by the multi-point crossover operator are independent of the arrangement of cross-term; Theorem 10 guarantees that the elimination of two identical elements in a cross-term does not affect the results obtained by the multi-point crossover operator.

Theorem 5: Given any p, q ∈ D, any crossterm l, if LENGTH p = LENGTH q, then LENGTH( CROSSOVER l p q) =LENGTH p. The higher-order logic description is as follows:

> val XCROSSOVER\_LENGTH = []|− !l p q. p IN D /\q IN D /\ (LENGTH p = LENGTH q) ==> (LENGTH (CROSSOVER l p q) = LENGTH p): thm

Theorem 5 ensures that the length of new chromosomes generated by the multi-point crossover is equal to the length of two parent chromosomes p and q.

Theorem 6: Given any p, q ∈ D, any cross-term l, if LENGTH p = LENGTH q, then CROSSOVER (0 : :l) p q = CROSSOVER l q p. The higher-order logic description is as follows:

> val XCROSSOVER\_ZERO\_APPEND = [] |− !p q l. p IN D /\q IN D ==> (CROSSOVER (0::l) p q = CROSSOVER l q p): thm

Theorem 6 shows that in the case of the same chromosomes p and q, adding a crossover point 0 at the beginning of the crossterm l does not change the progeny generated by the multi-point crossover.

Theorem 7: Given any p, q ∈ D, any cross-term l, if LENGTH p = LENGTH q and LENGTH p ≤ n, then CROSSOVER (n::l) p q = CROSSOVER l p q. The higher-order logic description is as follows:

> val XCROSSOVER\_TOO\_LENGTH = [] |− !p q n l. p IN D /\q IN D /\ (LENGTH p = LENGTH q) /\LENGTH p <= n ==> (CROSSOVER (n::l) p q = CROSSOVER l p q): thm

Theorem 7 guarantees that in the case of the same parent chromosomes p and q, when n is not less than the length of chromosome, adding a crossover point n at the beginning of the cross-term l does not change the progeny generated by multi-point crossover.

Theorem 8: Given any p, q ∈ D, any crossterm l, l1, l2, if LENGTH p = ccLENGTH q and CROSSOVER l1 p q = CROSSOVER l2 p q, then CROSSOVER (l++l1) p q = CROSSOVER (l++l2) p q . The higher-order logic description is as follows:

```
> val XCROSSOVER_EQ =
      []|− !l1 l2.
```
(!p q.p IN D /\q IN D /\ (LENGTH p = LENGTH q) /\ (CROSSOVER l1 p q = CROSSOVER l2 p q)) ==> !p q l. CROSSOVER (l ++ l1) p q = CROSSOVER (l ++ l2) p q: thm

Theorem 8 shows that in the case of the same parent chromosomes p and q, if the offspring generated by the multipoint crossover with cross-term l1 are equal to the offspring produced by the multi-point crossover with cross-term l2, adding cross-term l at the beginning of the cross-term l1 and l2 respectively does not change the equivalency of the progeny generated in the same way.

Theorem 9: Given any p, q ∈ D, any cross-term l, any m, n ∈ N, if LENGTH p = LENGTH q, then CROSSOVER (n : :(m : :l)) p q = CROSSOVER (m : :(n : :l)) p q. The higher-order logic description is as follows:

> val XCROSSOVER\_SWAP = [] |− !p q l m n. p IN D /\q IN D /\ (LENGTH p = LENGTH q) ==> (CROSSOVER (n::m::l) p q = CROSSOVER (m::n::l) p q): thm

Theorem 9 holds the property that in the case of the same parent chromosomes p and q, two crossover points with different order are added respectively at the beginning of cross-term l, then the progeny produced by the multi-point crossover with the new two cross-terms respectively are the same.

Theorem 10: Given any p, q ∈ D, any crossterm l, any n ∈ N, if LENGTH p = LENGTH q, then CROSSOVER (n : :(n : :l)) p q = CROSSOVER l p q. The higher-order logic description is as follows:

$$\begin{array}{l} \succ\text{val XCROSSOVER\\_SAME} = \\ \quad \text{[]} \mid - \mid p \neq l \mid n. \\ \quad p \text{ IN } D \land q \text{ IN } D \land\\ \quad \text{(LENGTH } p = \text{LENGTH } q) == > \\ \quad \text{(CROSSOVER (\$n::n::l) \$ p \neq q = \text{CROSSOVER } l \not\mid q\$):} \text{thm } \end{array}$$

Theorem 10 guarantees that in the case of the same parent chromosomes p and q, if we add two identical crossover points at the beginning of the cross-term l to obtain a new cross-term, the offspring generated by multi-point crossover operator with the new cross-term are the equal to those generated with crossterm l. In other words, the elimination of two identical elements in cross-term does not affect the results generated by multi-point crossover.

From Theorems 8–10, we can see that the results produced by the multi-point crossover operator used in genetic algorithms are related to the position of crossover points, and independent of the order of crossover points.

To demonstrate that the results generated by the multipoint crossover are not affected by the order of crossterm, the concept of strictly increasing list is used to verify the properties of the crossover operator. Strictly increasing list means that the elements in the list are in ascending order, and two identical elements are eliminated in the list. If cross-term l' is the strictly increasing list of crossterm l, the property that progeny generated by the multipoint crossover operator are not affected by the order of cross-term, can be expressed as: CROSSOVER l p q = CROSSOVER l ′ p q.

In order to get the strictly increasing list of any list in HOL4, it is necessary to define two predicates INSERT\_PL and CANON\_PL. Given a strictly increasing list l and a natural number n, when n is not in list 1, predicate INSERT\_PL can produce a new strictly increasing list with element n; otherwise it gets a new list that has the eliminated element n. The function of predicate CANON\_PL is to get the strictly increasing list l' of any given list l.

The higher-order logic presentations of predicate INSERT\_PL and predicate CANON\_PL are expressed as follows:

> val INSERT\_PL = [] |− (!n. INSERT\_PL n [] = [n]) /\ !n h t. INSERT\_PL n (h::t) = if n < h then n::h::t else if n = h then t else h::INSERT\_PL n t: thm > val CANON\_PL = [] |− (CANON\_PL [] = []) /\ !h t. CANON\_PL (h::t) = INSERT\_PL h (CANON\_PL t): thm

To ensure that the definitions of the two predicates are correct, we need to prove the properties of the strictly increasing list. The property of the strictly increasing list can be described as "any element in a strictly increasing list is smaller than the next element". Predicate INCREASE\_PRO is used to represent this property in HOL4, and its higher-order logic description is as follows:

> val INCREASE\_PRO = [] |- (INCREASE\_PRO [] <=> T) /\ !t1 h1. INCREASE\_PRO (h1::t1) <=> case t1 of [] => T |h2::t2 => h1 < h2 /\INCREASE\_PRO (h2::t2): thm

In addition, it is also required to prove the following properties of the strictly increasing list:

Theorem 11: Given any cross-term l, any m, n ∈ N, if INCREASE\_PRO(n : :l) and m<n, then INCREASE\_PRO(m : :l). The higher-order logic description is as follows:

$$\begin{array}{rcl} \text{>val LIST\\_INCREASE\\_ONE} & =\\ \text{[] } | - \text{!} \, | \, m \, n \, \text{INCREASE\\_PRO ( $n::l)} \land \\ m < n == \text{ } \text{INCREASE\\_PRO ($ m::l)} \text{:} \, \text{thm} \end{array}$$

Theorem 11 shows that if adding a natural number n at the beginning of list l possesses the strictly increasing property, then adding a natural number m that is smaller than n at the beginning of list l can still get a new strictly increasing list.

Theorem 12: Given any cross-term l, any n ∈ N, if INCREASE\_PRO l, then INSERT\_PL n l still meets INCREASE\_PRO(INSERT\_PL n l). The higher-order logic description is as follows:

> val LIST\_INCREASE\_INSERT = [] |- !l n. INCREASE\_PRO l ==> INCREASE\_PRO (INSERT\_PL n l): thm

Theorem 12 ensures that if l is a strictly increasing list and a natural number n is inserted into list 1 by predicate INSERT\_PL, then the new list obtained is still a strictly increasing list.

In order to prove the theorem 13, we need to give the following lemma:

Lemma 1: Given any cross-term l, any m, n ∈ N, if INCREASE\_PRO(m: :n : :l), then m<n and INCREASE\_PRO(n : :l). The higher-order logic description is as follows:

> val LIST\_INCREASE\_IMP = [] |− !l m n. INCREASE\_PRO (m::n::l) ==> m < n /\INCREASE\_PRO (n::l): thm

Lemma 1 shows that if (m : :n : :l) is a strictly increasing list, then (n : :l) is a strictly increasing list and m < n.

Theorem 13: Given any cross-term l, if INCREASE\_PRO l, then CANON\_PL l = l. The higher-order logic description is as follows:

> val LIST\_INCREASE\_CANON = [] |− !l. INCREASE\_PRO l ==> (CANON\_PL l = l): thm

Theorem 13 illustrates that if the arrangement of a list is strictly incremental, then the list is a strictly increasing list.

The proofs of Theorems 11–13 ensure the correctness of the strictly increasing list defined in HOL4. By using the definition of strictly increasing list, the following properties of the multi-point crossover operator can be further proved.

Theorem 14: Given any p, q ∈ D, any cross-term l. l ′ is the strictly increasing list of l, if LENGTH p = LENGTH q, then CROSSOVER l p q = CROSSOVERl ′ p q. The higher-order logic description is as follows:

> val CANON\_XCROSSOVER\_EQ = [] |− !p q l n. p IN D /\q IN D /\ (LENGTH p = LENGTH q) ==> (CROSSOVER (CANON l) p q = CROSSOVER l p q): thm

Theorem 14 guarantees that in the case of the same parent chromosomes p and q, the progeny generated by multi-point crossover with cross-term l are equal to the ones produced in the same way with the strictly increasing list of l. It is also more straightforward to illustrate that the results of the multi-point crossover are independent of the order of the elements in the cross-term.

Theorem 15: Given any p, q ∈ D, any cross-term l. l ′ is the strictly increasing list of l, any n ∈ N. if LENGTH p = LENGTH q, then CROSSOVER (INSERT\_PL n l′ ) p q = CROSSOVER (n : :l) p q. The higher-order logic description is as follows:

> val LINCREASE\_XCROSSOVER\_N = [] |− !p q l n. p IN D /\q IN D /\

(LENGTH p = LENGTH q) ==> (CROSSOVER (INSERT\_PL n l) p q = CROSSOVER (n::l) p q): thm

Theorem 15 shows that in the case of the same parent chromosomes p and q, if the crossover point n is inserted into cross-term l and l' respectively, where l' is the strictly increasing list of l, then the results obtained by the multi-point crossover under these two new cross-terms are the same.

Theorem 16: Given any p, q ∈ D, any cross-term l1, l2, if LENGTH p = LENGTH q and (CANON\_PL l1) = (CANON\_PL l2), then CROSSOVER l1 p q = CROSSOVER l2 p q. The higher-order logic description is as follows:

> val CANON\_XCROSSOVER\_DEQ = [] |− !p q l1 l2. p IN D /\q IN D /\ (LENGTH p = LENGTH q) /\ (CANON\_PL l1 = CANON\_PL l2) ==> (CROSSOVER l1 p q = CROSSOVER l2 p q): thm

Theorem 16 ensures that in the case of the same parent chromosomes p and q, if the strictly increasing list of different cross-terms are the same, then the offspring respectively produced by the multi-point crossover under the different crossterms are the same.

### DISCUSSION

As shown above, one-point and multi-point crossover operators are formalized and verified. The proposed technique is general and can applied in formalizing and verifying other genetic operators in genetic algorithms such as mutation. Mutation is another genetic operator that can preserve genetic diversity in such a way local minima caused by similar populations of chromosomes can be avoided. With mutation, one or more gene values in a chromosome can be changed from its initial state so a better solution may be achieved. To implement mutation, a common method is to generate a random variable for each bit in a chromosome sequence that is used to determine whether or not a particular bit will be amended. To realize the formalization and verification of mutation operations, first, we need to use the higher-order logic to represent basic elements, and then we can construct the formal modeling and perform verification with HOL4. This will be investigated in our future work.

### EVALUATION

### Experimental Setup

We conduct the experiment in Windows 7 with the specific tool HOL4, and the programing language is ML. In the experiment, robot path planning based on GA is described by ML, and the cross operation in algorithm is described by multi-point crossover mentioned in this paper. We proved the effectiveness of our formal model of crossover by running the algorithm in HOL4 successfully. In addition, we do not need to specify the inputs because of the advantage of theorem proving, and HOL4 will exhaust all the cases space, i.e., covering all inputs. The final output will show that robot can avoid collisions in any input situation.

### Case Study on Robotics

Genetic Algorithms have many advantages compared with traditional optimization methods. In this section, we present a case study on robotics, in which a genetic algorithm with the two-point crossover operator implemented by HOL4 is applied in robot path planning. We give the specific formal description and collision free verification.

The workspace of the robot is a 2-D environment. We assume that the location and size of obstacles are known, and the obstacles will stay the same during the movement of robot. The robot working space is modeled with grids following Cartesian coordinates. As shown in **Figure 5**, the lower left corner of the grid array is the coordinate origin, the right direction of the horizontal axis is the forward direction of x axis, and the up direction of the vertical axis is the forward direction of y axis. Each grid interval corresponds to a unit length on the coordinate axis, and any grid can be uniquely identified by (x, y). The length and width of a grid is defined as 10 units of distance, S represents the starting point of the robot, G represents the target point of the robot, and black grids are used to represent obstacles.

The moving path of the robot is represented by the chromosome. We defined the chromosome through a real list, where the subscript of the list represents the value of the coordinate x, and the gene is the value of the list, which represents the value of the coordinate y. The population is composed of a certain number of chromosomes, and its size is 20. In order to guarantee the global optimality of the genetic algorithm, the initial population is randomly generated.

The fitness function directly affects the computation efficiency and time of the genetic algorithm. In the path planning of the robot, the design of the fitness function needs to consider the length of the path and collision avoidance. Therefore, the fitness function is set to the sum of the path length and the obstacles' coordinates. It is represented as:

$$L = L1 + L2 = \sum\_{i=1}^{N} \sqrt{(\wp\_{i+1} - \wp\_i)^2 + 1} + A \sum\_{j=1}^{m} (\wp\_j + \wp\_j)^2$$

Here, L1 represents the distance between two adjacent coordinate points, and L2 represents the obstacle coordinates. When i = xj and yi = yj, A is 100; otherwise A is 0. N = 10 indicates that the space coordinate has 10 unit lengths,m represents the number of obstacles. In this way, the fitness function is very large when there are obstacles in the path. In order to simplify the calculation in HOL4, we modified the function as follows:

$$L' = \sum\_{i=1}^{N} (\wp\_{i+1} - \wp\_i)^2 + 10 + A \sum\_{j=1}^{m} (\wp\_j + \wp\_j)^2$$

In this function, L1 represents the sum of squares of the distance between two points. Although we modified the calculated way of

L1, it does not affect the comparative relationship between the length of the path of two chromosomes. Thus, the new fitness function can also be used to evaluate the optimal path in the path planning. According to the property of the shortest path, we can see that the lower the fitness value of the chromosome is, the better the path will become.

In this genetic algorithm, we use three basic genetic operators: selection, crossover, and mutation. In the selection operator, we use ranking selection, by which each individual in the population is ranked from low to high according to the fitness, the selected probability of the forward 80% individuals in the population is 6.25%, and the remaining 20% is 0. In the crossover operator, we use the two-point crossover operator, the crossover probability is 0.9, and the two intersections are generated randomly. In the mutation operator, we use the basic bit mutation, the mutation probability is 0.08, and the mutation position is selected randomly.

Next, we present how to define each operators using HOL4. The selection operator is defined as follows:

val Pi = FST (List.nth (QuickresultList, hd Random.rangelist(0,16) (1,Random.newgen())))

QuickresultList preserves the chromosome and its fitness value according to fitness value from low to high. hd Random.rangelist(0,16) (1,Random.newgen()) indicates the selected chromosome subscript with equal probability. FST deletes the selected fitness value of the chromosome and returns the selected chromosome to the chromosome Pi◦

The crossover operator is defined as follows:

fun TWOIPOINT\_CROSSOVER l (p,q) = (CROSSOVER l p q,CROSSOVER l q p)

TWOIPOINT\_CROSSOVER is the formalized crossover operator, CROSSOVER is the formalized crossover operation. With the two-point crossover, TWOIPOINT\_CROSSOVER l (p,q) will transfer a cross-term l to two crossover points.

The mutation operator is defined as follows:

fun BMUTATION (pih::pil) point = List.take ((pih::pil),point-1) @ ((Random.rangelist(1,11)(1,Random.newgen()) @ List.drop ((pih::pil),point)))

BMUTATION represents the mutation operator, pih::pil is the chromosomes to be mutated, point indicates the mutation point randomly produced. List.take ((pih::pil), point-1) obtains the genes before the mutation point, Random.rangelist(1,11)(1,Random.newgen()) can select the allele gene by equal probability at the mutation position, List.drop ((pih::pil),point) obtains the genes after the mutation point, @ is used to connect the gene segments obtained by the three functions after the mutation.

With the genetic algorithm, we can obtain the optimization collision free path. In order to ensure that the genetic algorithm can find the final path and meet the collision free conditions, we need verify the final path. In this paper, the array of coordinate points of the final path represented by the list. If one of the elements in the list is equal to the coordinate of the obstacle, the final path does not meet the property of collision free.

The verification result is shown as follow:

val BP = INTER ([(1,2),(2,4),(3,6),(4,7),(5,8),(6,8),(7,9),(8,9), (9,9),(10,10)], [(5,7),(7,8),(8,8),(9,8)]); > val BP = []: int list

Among them, INNER refers to the formal description of the property of collision free is shown below:

val BP = INTER (PATH, O);

It is used to determine whether the two lists have the same elements. If there is no such elements, it outputs the empty list; otherwise, it generates the new list of the same elements. PATH is the list of the final robot path generated by the genetic algorithm, PATH=[(x1,y1),. . . ,(x10,y10)], (xi,yi)(1≤i≤10) indicates the coordinates of the final path. O is the list of the Coordinates of the obstacles,O=[(xO1,yO1),. . . (xOm,yOm)], (xOi,yOi)(1≤i≤m) indicates the Coordinates of the obstacles, and m represents the number of the obstacles.

### REFERENCES


The BP is final result. It is an empty list, illustrating the optimal or near optimal path generated by the genetic algorithm has no common elements with the obstacles. Thus, the final path meets the collision free condition.

### CONCLUSION

In this paper, we formalized crossover operations with higher-order logic based on HOL4 that is easy to be deployed with its user-friendly programing environment. We implemented our technique to solve a path planning problem using a genetic algorithm with our formalized crossover operations, and the results show the effectiveness of our technique.

There are two directions for the future work. First, it is interesting to extend formalized crossover operations to other applications such as energy optimization for embedded systems (Wang et al., 2011) and non-volatile memory (Chen et al., 2016; Long et al., 2016; Wang et al., 2016; Liu et al., 2017), and construct a crossover operator library in HOL4. Moreover, we can further formalize genetic algorithms using formalized crossover operators. Based on this, the formalized genetic algorithm can be used to create the general tactics in HOL4, thus improving the automation level of the interactive theorem proving system.

### AUTHOR CONTRIBUTIONS

JZ: Substantial contributions to the conception and design of the work, drafting the work, final approval of the version to be published. MK: Substantial contributions to draft the work the acquisition, analysis, and interpretation of data for the work. XL: Revising the work critically for important intellectual content. GL: In ensuring that questions related to the accuracy of the work are appropriately investigated and resolved.

### FUNDING

We thank the support of the National Natural Science Foundation of China (grant numbers: 61572331, 61373034).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Zhang, Kang, Li and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Different-Level Simultaneous Minimization Scheme for Fault Tolerance of Redundant Manipulator Aided with Discrete-Time Recurrent Neural Network

#### Long Jin<sup>1</sup> , Bolin Liao<sup>2</sup> \*, Mei Liu<sup>1</sup> , Lin Xiao<sup>2</sup> , Dongsheng Guo<sup>3</sup> and Xiaogang Yan<sup>4</sup>

*<sup>1</sup> School of Information Science and Engineering, Lanzhou University, Lanzhou, China, <sup>2</sup> College of Information Science and Engineering, Jishou University, Jishou, China, <sup>3</sup> School of Information Science and Engineering, Huaqiao University, Xiamen, China, <sup>4</sup> Department of Computer Science, University of Otago, Dunedin, New Zealand*

#### Edited by:

*Shuai Li, Hong Kong Polytechnic University, Hong Kong*

#### Reviewed by:

*Zhijun Zhang, South China University of Technology, China Yinyan Zhang, Hong Kong Polytechnic University, Hong Kong Dechao Chen, Sun Yat-sen University, China Huanqing Wang, Carleton University, Canada*

> \*Correspondence: *Bolin Liao mulinliao8184@163.com*

Received: *15 May 2017* Accepted: *28 August 2017* Published: *11 September 2017*

#### Citation:

*Jin L, Liao B, Liu M, Xiao L, Guo D and Yan X (2017) Different-Level Simultaneous Minimization Scheme for Fault Tolerance of Redundant Manipulator Aided with Discrete-Time Recurrent Neural Network. Front. Neurorobot. 11:50. doi: 10.3389/fnbot.2017.00050* By incorporating the physical constraints in joint space, a different-level simultaneous minimization scheme, which takes both the robot kinematics and robot dynamics into account, is presented and investigated for fault-tolerant motion planning of redundant manipulator in this paper. The scheme is reformulated as a quadratic program (QP) with equality and bound constraints, which is then solved by a discrete-time recurrent neural network. Simulative verifications based on a six-link planar redundant robot manipulator substantiate the efficacy and accuracy of the presented acceleration fault-tolerant scheme, the resultant QP and the corresponding discrete-time recurrent neural network.

Keywords: redundant manipulator, different level, fault tolerance, physical constraint, discrete-time recurrent neural network

### 1. INTRODUCTION

In recent decades, robotics has drawn more and more attention in scientific areas and engineering applications. Many researches have been focused on this topic, and various kinds of robots have thus been developed and investigated (Roberts and Maciejewski, 1996; Sun et al., 2009; Zhang and Zhang, 2012, 2013; Li et al., 2014; Jin and Zhang, 2015; Jin et al., 2016; Jin and Li, 2017; Jin et al., 2017a,b; Zhang et al., 2017a,b). As a typical kind of robot, redundant manipulators have played a more and more important role in numerous fields of engineering applications (Roberts and Maciejewski, 1996; Zhang and Zhang, 2012, 2013; Jin and Zhang, 2015; Liao and Liu, 2015; Jin et al., 2016, 2017a,b; Jin and Li, 2017; Zhang et al., 2017a,b). For redundant manipulators, they can accomplish subtasks easily and dexterously and optimization of various performance criteria, since they possess more degrees of freedom (DOF) than the minimum amount required to accomplish a given end-effector primary task. Recent progresses have shown advantages of using adaptive neural networks for the control of nonlinear systems (Tang et al., 2014, 2017). For example, an adaptive control scheme was provided in Tang et al. (2014) for robot manipulator systems with unknown functions and dead-zone input by using adaptive neural networks, of which the parameters of the dead zone are assumed to be unknown but bounded. One of the fundamental issues in operating such a redundant manipulator is the inverse-kinematics problem (or termed, redundancy-resolution problem). Specifically, the joint trajectories need to be generated online based on the provided desired Cartesian trajectories of the endeffector. That is, if the manipulator control scheme is fault tolerant, the end-effector can fulfill the required task even if its joint fails. As one essential and challenging issue in inverse kinematics, it is important to tolerate joint failure online during the task execution. For example, in the remote applications such as space or sea exploration, repairing broken actuators is costly or even impossible and the probability of their failure is increased due to the hostile operating environment. To say the least, it may induce the damage to people and/or manipulator if a manipulator without a fault-tolerant scheme being equipped suddenly encounters a joint-lock situation during the execution. Thus, the fault-tolerant ability is an important or even indispensable design criterion for robotic systems.

More recently, the fault tolerance has attracted significant interest in the societies of academia and industry because of the increased demands in practical applications for safety and productivity as well as operating efficiency. Numerous researches on the topic of fault tolerance have thus been presented and investigated (Roberts and Maciejewski, 1996; Izumikawa et al., 2005; Siqueira and Terra, 2009; Li and Zhang, 2012). Authors in Siqueira and Terra (2009) described the fault occurrence for a three-link manipulator based on a Markovian jump model, which took into account all possible fault sequences in a threelink manipulator robot, and defined guidelines to control an n-link manipulator robot with several faults. Izumikawa et al. (2005) designed a controller of a fault-tolerant system with a signal observer for a strain gauge sensor fault. By switching from the controller for normal mode to the controller for unnormal mode, the stability and the control performance of such a system were maintained. Generally speaking, the existing methods for fault tolerance can be categorized as the passive type and the active type (Zhang and Jiang, 2008). The former one fixes and designs the corresponding controllers to be robust against the presumed faults, which needs neither fault detection nor controller reconfiguration, with limited capabilities for fault tolerance. The latter one reacts to the failures of the system actively by reconfiguring control actions to maintain the stability and acceptable performance of the entire system (Bustan et al., 2014).

A fault-tolerant scheme was presented by adding a matrixvector form equality constraint for the faulty joint, which took the limits of joint angle and joint velocity into consideration (Li and Zhang, 2012). However, this fault-tolerant scheme can not handle the joint-acceleration limits. More seriously, this scheme may introduce the undesirable discontinuity phenomenon in the velocity solution because it was investigated at the jointvelocity level. Thus, it is worthy to investigate the scheme for fault-tolerance of redundant robot manipulators at the jointacceleration level, which can effectively remedy the discontinuity phenomenon at the joint-velocity level and incorporates robot dynamic. However, up to now there is almost no study on the fault tolerance of redundant robot manipulators on the combination of different level. In other words, the study on the fault tolerance of redundant robot manipulators at the different level is still rare despite its appealingness.

In this paper, we make progress along this direction by presenting a different-level simultaneous minimization scheme, which takes both the robot kinematics and robot dynamics into account. The scheme is then reformulated as a quadratic program (QP) subject to equality and bound constraints. In order to solve such a QP problem, a discrete-time recurrent neural network is developed and applied to online solution of the QP problem. Simulative results based on a six-link planar robot manipulator further illustrate the efficacy and superiority of the proposed fault-tolerant scheme for fault tolerance of redundant robot manipulators.

### 2. PRELIMINARIES AND SCHEME FORMULATION

To lay a basis for further discussion, the relationship between the end-effector velocity r˙ ∈ R <sup>m</sup> and the joint velocity q˙ ∈ R n for redundant robot manipulators can be given hereinafter directly (Zhang and Zhang, 2012)

$$J(q)\dot{q} = \dot{r},\tag{1}$$

where J(q) ∈ R m×n is the Jacobian matrix of the end-effector with q being the joint-angle vector. By differentiating Equation (1) with respect to time t, the relationship between the end-effector acceleration r¨ and the joint acceleration q¨ is obtained as follows (Zhang and Zhang, 2012):

$$J(q)\ddot{q} = \ddot{r}\_\flat = \ddot{r} - \dot{J}(q)\dot{q},\tag{2}$$

where ˙J(q) is the time derivative of J(q). Note that, since the manipulator system is redundant (i.e., m < n), Equations (1) and (2) are under-determined, and generally admit an infinite number of feasible solutions in terms of inverse kinematics. This implies the ability to accommodate more functional constraints such as fault tolerance. For example, once a joint is stuck, other joints can take over its workload and move the end-effector to its goal via a fault-tolerant scheme. It is worth mentioning here that the fault-tolerant scheme investigated in Li and Zhang (2012) was based on Equation (1) (i.e., at the joint-velocity level), while the fault-tolerant scheme presented in the ensuing subsections is based on Equation (2) (i.e., at the joint-acceleration level).

For the online solution of Equation (2), the following QPoriented formulation can be used (Zhang and Zhang, 2012):

$$\text{minimize} \quad \ddot{q}^{\text{T}} \Lambda \ddot{q}/2 + c^{\text{T}} \ddot{q}, \tag{3}$$

$$\text{subject to} \quad J(q)\ddot{q} = \ddot{r}\_{\text{b}},\tag{4}$$

where coefficients 3 ∈ R n×n and c ∈ R n are defined accordingly for a specific redundancy-resolution scheme. In addition, superscript T denotes the transpose of a matrix or vector.

However, without appropriate remedied measures, when a manipulator suffers a joint failure, its performance would be significantly affected, and even worse, the manipulator fails to complete the prescribed path task. In safety-critical systems, the consequences of a minor fault in a system component can

be catastrophic. Therefore, the demands on reliability, safety and fault tolerance are generally high. It is necessary to take fault tolerance into consideration in the above inverse-kinematic scheme (i.e., Equations 3 and 4) in order to improve the reliability and availability while tracking a desirable path. Inspired by Roberts and Maciejewski (1996), at a time instant, if there are nf joints being locked (e.g., the ith, · · · , and jth joints with i, j ∈ [1, 2, · · · , n] and i 6= j), then we directly remove the corresponding joint-acceleration variables in the scheme. For example, if a failed joint i is locked, the corresponding relationship between the end-effector acceleration and the joint acceleration is obtained as

$$[\ulcorner \dot{J}(q) \urcorner] \urcorner \ddot{\ddot{q}} \urcorner = \ddot{r}\_{\flat} = \ddot{r} - [\ulcorner \dot{J}(q) \urcorner] \urcorner \dot{q} \urcorner],\tag{5}$$

where <sup>i</sup> J(q) = [j1, · · · , ji−1, ji+1, · · · , jn] and <sup>i</sup>˙J(q) = [ ˙j1, · · · , ˙ji−1, ˙ji+1, · · · , ˙jn] with j<sup>h</sup> and ˙j<sup>h</sup> denoting the hth column of J(q) and ˙J(q), respectively. In addition, <sup>i</sup> q˙ = [q˙1, · · · , q˙i−1, q˙i+1, · · · , q˙n] and <sup>i</sup> q¨ = [q¨1, · · · , q¨i−1, q¨i+1, · · · , q¨n]. The reduced Equation (5) then determines the kinematic properties of the degraded system.

By incorporating the joint physical constraints as well as the robot dynamic presented in Appendix (Supplementary Material), the different-level simultaneous minimization scheme for fault tolerance of robot manipulators is written as

$$\text{minimize} \qquad \alpha(\ddot{\theta}^{\text{T}}W\ddot{\theta}/2 + p^{\text{T}}\ddot{\theta}) + \beta \text{r}^{\text{T}}\mathfrak{r}/2,\tag{6}$$

$$\text{subject to} \quad J(\theta)\ddot{\theta} = \ddot{r}\_{\text{a}},\tag{7}$$

$$
\mathfrak{r} = H\ddot{\theta} + c\_{\mathfrak{r}}(\dot{\theta}, \theta) + g\_{\mathfrak{r}}(\theta), \tag{8}
$$

$$
\theta^- \lesssim \theta \lesssim \theta^+,\tag{9}
$$

$$
\theta^- \lessdot \theta \lesssim \theta^+,\tag{10}
$$

$$
\theta^- \lessdot \ddot{\theta} \lessdot \ddot{\theta}^+,\tag{11}
$$

$$
\mathfrak{r}^- \leqslant \mathfrak{r} \leqslant \mathfrak{r}^+,\tag{12}
$$

where α ∈ (0, 1) and β ∈ (0, 1) are the weighting factors with α+ β = 1 numerically; θ, θ˙, θ¨ and τ denote the n<sup>r</sup> dimensional jointangle, joint-velocity, joint-acceleration and joint-torque vectors, respectively; W = I ∈ R nr×n<sup>r</sup> , and J(θ) ∈ R m×n<sup>r</sup> , p ∈ R nr ; b = ¨r<sup>a</sup> + Kv(r˙<sup>d</sup> − J(θ)θ˙) + Kp(r<sup>d</sup> − f(θ)) ∈ R <sup>m</sup>; r¨<sup>a</sup> = ¨r − ˙J(θ)θ˙ with ˙J(θ) ∈ R <sup>m</sup>×n<sup>r</sup> and n<sup>r</sup> = n − n<sup>f</sup> [n<sup>f</sup> denotes the number of faulty joint(s)]. In addition, H denotes the n<sup>r</sup> × n<sup>r</sup> dimensional inertia matrix; c<sup>τ</sup> denotes the n<sup>r</sup> dimensional Coriolis/centrifugal force vector and g<sup>τ</sup> denotes the n<sup>r</sup> dimensional gravitational force vector. Besides, τ <sup>±</sup> = ±140 × 1<sup>v</sup> N·m. For simplicity and for example, α is set to be 0.995 (i.e., β = 0.005) in the ensuing simulations.

Remark 1: Fault detection is a fundamental, specialized and relatively independent part for fault tolerance, for which many methods have been proposed. These methods can be classified into two categories: model-based methods and databased methods (Yüksel and Sezgin, 2010). Model-based methods obtain the deviations signals between the model and the real system named as residuals to detect faults. Data-based methods are based on only processing input and output signals of the system to detect faults. In this paper, for focusing on the superior performance of the fault-tolerant scheme in faulty situation, it can be simply assumed that the fault detection/diagnosis system can always detect and diagnose an unexpected joint fault and immediately give the feedback to the fault-tolerant system. Once the fault-tolerant system receives such a feedback, it activates the reconfiguration mechanism and removes the corresponding joint-acceleration variables in the scheme.

Remark 2: Note that the model disturbance and computational round-off errors always exist in practical application. In order to improve the accuracy of the scheme, the feedback control needs to be incorporated into the forward kinematics equation. One effective approach is to add feedbacks of Cartesian position and velocity error, i.e., instead of using J(θ)θ¨ = ¨ra, Equation (7) can be replaced with

$$J(\theta)\ddot{\theta} = \ddot{r}\_{\text{a}} + K\_{\text{v}}(\dot{r}\_{\text{d}} - J(\theta)\dot{\theta}) + K\_{\text{p}}(r\_{\text{d}} - f(\theta)),$$

where K<sup>p</sup> and K<sup>v</sup> are positive-definite symmetric m × m gain matrices for position-error and velocity-error feedbacks, respectively. In the ensuing simulations and experiments, the diagonal elements of the gain matrices K<sup>p</sup> and K<sup>v</sup> are set as 10 and the off-diagonal elements are set as 0 for simplicity and clarity.

With the aid of conversion techniques given in Cheng et al. (1994), Cheng et al. (1995), and Park et al. (1998), the new bound constraint can thus be written as η <sup>−</sup> 6 θ¨ 6 η <sup>+</sup>, where the ith elements of η <sup>−</sup> and η <sup>+</sup> are defined respectively as

$$\begin{aligned} \eta\_i^- &= \max \{ \mathsf{y}\_{\mathsf{P}} (\theta\_i^- + \vartheta\_i - \theta\_i), \mathsf{y}\_{\mathsf{V}} (\dot{\theta}\_i^- - \dot{\theta}\_i), \ddot{\theta}\_i^- \}, \\ \eta\_i^+ &= \min \{ \mathsf{y}\_{\mathsf{P}} (\theta\_i^+ - \vartheta\_i - \theta\_i), \mathsf{y}\_{\mathsf{V}} (\dot{\theta}\_i^+ - \dot{\theta}\_i), \ddot{\theta}\_i^+ \}, \end{aligned}$$

where ϑ > 0 is a small constant vector used to scale the safety region. Besides, coefficients γ<sup>p</sup> > 0 and γ<sup>v</sup> > 0 determine the deceleration magnitude.

Based on the above analysis, the proposed scheme for physically-constrained redundant robot manipulators can be reformulated into the following standard QP in terms of θ¨:

$$\text{minimize} \qquad \alpha(\ddot{\theta}^{\text{T}}W\ddot{\theta}/2 + \mathbf{p}^{\text{T}}\ddot{\theta}) + \beta\mathbf{r}^{\text{T}}\mathbf{r}/2 \tag{13}$$

$$\text{subject to} \quad A\ddot{\theta} = b,\tag{14}$$

$$
\eta^- \lesssim \vartheta \lesssim \eta^+,\tag{15}
$$

where W = I ∈ R nr×n<sup>r</sup> , A = J(θ) ∈ R m×n<sup>r</sup> , b = ¨r<sup>a</sup> + Kv(r˙<sup>d</sup> − J(θ)θ˙) + Kp(r<sup>d</sup> − f(θ)) ∈ R <sup>m</sup>, τ = Hθ¨ + c<sup>τ</sup> (θ˙, θ) + g<sup>τ</sup> (θ), and p = (µ + ν)θ˙ + µν(θ − θ(0)) ∈ R <sup>n</sup><sup>r</sup> with µ > 0 and ν > 0. In addition, θ¨ and η <sup>±</sup> are defined the same as before.

Neural networks have been recognized as a powerful tool for real-time processing and successfully applied widely in various control systems (Wang et al., 2015, 2016). In particular, we use the following discrete-time recurrent neural network for solving online the QP problem (Xiao and Zhang, 2013; Zhang and Zhang, 2013).

$$\mu^{k+1} = \mu^k - \frac{\|e(\mu^k)\|\_2^2}{\|(M^\mathrm{T} + I)e(\mu^k)\|\_2^2} (M^\mathrm{T} + I)e(\mu^k), \tag{16}$$

where k · k<sup>2</sup> denotes the two-norm of a matrix; the decision variable vector u and its upper and lower bounds u <sup>±</sup> ∈ R <sup>N</sup> (with N = n<sup>r</sup> + m) are defined respectively as

$$\boldsymbol{u} = \begin{bmatrix} \ddot{\boldsymbol{\theta}} \\ \boldsymbol{\nu} \end{bmatrix}, \ \boldsymbol{u}^+ = \begin{bmatrix} \boldsymbol{\eta}^+ \\ \boldsymbol{\varpi} \boldsymbol{1}\_{\rm V} \end{bmatrix}, \ \boldsymbol{u}^- = \begin{bmatrix} \boldsymbol{\eta}^- \\ -\boldsymbol{\varpi} \boldsymbol{1}\_{\rm V} \end{bmatrix},$$

with y ∈ R <sup>m</sup> being the auxiliary decision vector (i.e., dual decision vector) defined corresponding to equality constraint (Equation 14), 1<sup>v</sup> = [1, · · · , 1]<sup>T</sup> denoting an appropriately dimensioned vector composed of ones, constant ̟ = 10<sup>10</sup> ∈ R being defined to replace +∞ numerically, and the augmented matrix M and vector z being defined respectively as

$$M = \begin{bmatrix} \alpha \, W + \beta H \, \, \, \, -A^T \\ A \, \, \, \, \, 0 \end{bmatrix} \in \mathbb{R}^{N \times N}, \,\, z = \begin{bmatrix} p + c\_t \\ -b \end{bmatrix} \in \mathbb{R}^N.$$

Besides, P(·):R <sup>N</sup> → is a piecewise-linear projection operator defined from space R <sup>N</sup> onto set , and the ith element of P(u) is defined as

$$[P\_{\Omega}(\boldsymbol{u})]\_{i} = \begin{cases} \boldsymbol{u}\_{i}^{-}, & \text{if } \boldsymbol{u}\_{i} \leqslant \boldsymbol{u}\_{i}^{-}, \\ \boldsymbol{u}\_{i}, & \text{if } \boldsymbol{u}\_{i}^{-} < \boldsymbol{u}\_{i} < \boldsymbol{u}\_{i}^{+}, \ \forall i \in \{1, 2, \cdots, N\}, \\ \boldsymbol{u}\_{i}^{+}, & \text{if } \boldsymbol{u}\_{i} \geqslant \boldsymbol{u}\_{i}^{+}, \end{cases}$$

### 3. SIMULATIVE RESULTS

In this paper, a six-link planar manipulator with motor-driven push-rod type joints is presented as a simulative platform to illustrate the effectiveness of the scheme. The hardware system of the six-link planar manipulator, which has six joints, is presented in Zhang and Zhang (2013). The physical parameters of the six-link planar manipulator are shown in **Table 1**, of which θ + i and θ − i denote respectively the upper and lower limits of the ith joint-angle vector and l<sup>i</sup> denotes the length of the ith link, i = 1, 2, · · · , 6. Besides, in the simulations and the ensuing experiment, the final error tolerance of ke(u k )k is set as 10−<sup>5</sup> for the discrete-time QP solver Equation (16) with the sampling gap being 0.01 s. The end-effector of the six-link planar redundant robot manipulator is expected to track a square-path with sidelength being 0.039 m. In addition, the duration of the pathtracking task is 20 s, ϑ = 0.1 rad, joint-velocity limits θ˙<sup>±</sup> = ±1.5 × 1<sup>v</sup> rad/s, joint-acceleration limits θ¨<sup>±</sup> = ±2 × 1<sup>v</sup> rad/s<sup>2</sup> and µ = ν = 4.

For comparison and for illustrating the efficacy of the different-level simultaneous minimization scheme (Equations 6–12) in the faulty situation, the simulation results


synthesized by scheme (Equations 6–12) with the first joint being faulty from on t = 15 s are shown in **Figure 1**. As observed from **Figure 1A**, the end-effector motion trajectory is close enough to the desired square path (i.e., with the robot dynamics taken into account, the tracking task is also completed), which substantiates the effectiveness of the different-level simultaneous minimization scheme (Equations 6–12) in the faulty situation. In addition, the tracking position error with the maximal position error being less than 4 × 10−<sup>6</sup> m shown in **Figure 1B** further shows the efficacy and accuracy of such a different-level simultaneous minimization scheme. Besides, in **Figure 1C**, for the first joint torque (i.e., τ<sup>1</sup> denoted by the blue lines), the solid lines and dashed lines respectively denote the joint-torque profiles corresponding to the no-fault situation and fault-tolerant situation with the first joint being faulty from on t = 15 s. As observed from **Figure 1C**, after t = 15 s, the value of the first joint torque becomes zero. With the first joint being faulty from on t = 15 s, the values of the joint velocities and joint accelerations are zero, which implies the efficacy of the different-level simultaneous minimization scheme (Equations 6–12) in the faulty situation. In summary, the above simulation results substantiate the efficacy and accuracy of the the different-level simultaneous minimization scheme (Equations 6–12), which takes both the robot kinematics and robot dynamics into account.

Remark 3: Note that, for a fault-tolerant task, it can be classified into the following two cases. (i) The equality constraint is always satisfied; e.g., once some joints are simultaneously faulty, the remainder joints can take over the workload and move the end-effector to its goal via a fault-tolerant scheme. (ii) The equality constraint can not be always satisfied; e.g., with some joints being faulty, the equality constraint does not hold at some time instants, and thus the path-tracking task can not be fulfilled in this situation. Theoretically speaking, the equality constraint should be satisfied all the time. However, strictly speaking, the equality constraint can not be satisfied exactly even for the first case. Specifically, the tracking position-error profiles synthesized by the different-level simultaneous minimization scheme (Equations 6–12) are numerically near zero but nonzero (i.e., 10−<sup>6</sup> ). That is because the simulation and computation are performed on a finite-arithmetic finite-memory digital computer. Then, the tracking position error may be inevitable between the desired path and actual trajectory, which is used to be the input of the feedback to track the task in a more accurate manner. To show clearly the second case (i.e., the equality constraint is not always satisfied), the corresponding motion trajectories and tracking errors with the first five joints being locked from on t = 15 s are visualized in **Figure 2**. Specifically, as seen from **Figure 2**, with the first five joints being faulty from on t = 15 s, the values of the corresponding joint velocities and joint accelerations are zero and the corresponding joint angles remain the same as θ(t = 15). To distinguish those two types of position error (i.e., the usual computational error, and the failure error due to the lack of feasible solution) as well as to keep the robotic system more reliable, a criterion can be added to the scheme. For example, the emergency brake of the system can be activated for the position error larger than 0.01 m with an increasing trend. As shown in **Figure 2D**, with the red dotted line denoting the

criterion, the robotic system can be stopped at time instant t ≈ 16 s to prevent the potential damage(s) to nearby people and/or robot arms.

It is worth pointing out that, although the investigations are based on the joint-lock situation, the efficacy of the proposed different-level simultaneous minimization scheme (Equations 6–12) as well as the resultant QP is not limited to the joint-lock situation. The joint-lock situation is just a representative of lots of joint faulty situations, such as the failure of one joint actuator. In addition, being the representative, the joint-lock situation extensively exists in types of joints (e.g., rotational joints and translational joints). For example, the joints may be locked with a greater probability when the robot works with the sludge. Thus, the proposed different-level simultaneous minimization scheme (Equations 6–12) is generally applicable, and the feasibility and efficacy of such a proposed scheme are not limited by the specific robot structure and failure mode.

### 4. CONCLUSIONS

In this paper, by incorporating the physical constraints in joint space, a different-level simultaneous minimization scheme, which takes both the robot kinematics and robot dynamics

into account, has been presented and investigated for faulttolerant motion planning of redundant manipulator in this paper. Then, the scheme has been reformulated as a quadratic program (QP) with equality and bound constraints, which has been solved by a discrete-time recurrent neural network. Simulative verifications based on a six-link planar redundant robot manipulator have substantiated the efficacy and accuracy of the presented acceleration fault-tolerant scheme, the resultant QP and the corresponding discrete-time recurrent neural network.

### AUTHOR CONTRIBUTIONS

LJ and BL presented the scheme. BL and ML designed experiments. LX and DG carried out experiments. XY analyzed experimental results. LJ and BL wrote the manuscript.

### REFERENCES


### FUNDING

This work is supported by the Hunan Natural Science Foundation of China (No. 2017JJ3257, and No. 2017JJ3258), by the Fundamental Research Funds for the Central Universities (No. lzujbky-2017-37), by the National Natural Science Foundation of China (No. 61563017, No. 61703189, No. 61503152, and 61662025), by the Research Foundation of Education Bureau of Hunan Province, China (No. 17B215, No. 17C1299, and No. 15B192), and by the Scientific Research Foundation of Jishou University (No. jsdxxcfxbskyxm201508).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnbot. 2017.00050/full#supplementary-material

Cheng, F. T., Sheu, R. J., and Chen, T. H. (1995). The improved compact QP method for resolving manipulator redundancy. IEEE Trans. Syst. Man Cybern. 25, 1521–1530. doi: 10.1109/21.467718


systems. IEEE Trans. Cybern. 45, 1977–1987. doi: 10.1109/TCYB.2014. 2363073


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer YZ and handling Editor declared their shared affiliation.

Copyright © 2017 Jin, Liao, Liu, Xiao, Guo and Yan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **A Velocity-Level Bi-Criteria Optimization Scheme for Coordinated Path Tracking of Dual Robot Manipulators Using Recurrent Neural Network**

*Lin Xiao<sup>1</sup> \*, Yongsheng Zhang<sup>1</sup> , Bolin Liao<sup>1</sup> , Zhijun Zhang<sup>2</sup> , Lei Ding<sup>1</sup> and Long Jin<sup>3</sup>*

*<sup>1</sup>College of Information Science and Engineering, Jishou University, Jishou, China, <sup>2</sup>School of Automation Science and Engineering, South China University of Technology, Guangzhou, China, <sup>3</sup>School of Information Science and Engineering, Lanzhou University, Lanzhou, China*

#### *Edited by:*

*Shuai Li, Hong Kong Polytechnic University, Hong Kong*

#### *Reviewed by:*

*Dechao Chen, Sun Yat-sen University, China Xiaotian Yu, The Chinese University of Hong Kong, Hong Kong Yonghua Yin, Imperial College, Pakistan Xiao Zhengli, Changchun University of Science and Technology, China*

> *\*Correspondence: Lin Xiao xiaolin860728@163.com*

*Received: 24 May 2017 Accepted: 17 August 2017 Published: 04 September 2017*

#### *Citation:*

*Xiao L, Zhang Y, Liao B, Zhang Z, Ding L and Jin L (2017) A Velocity-Level Bi-Criteria Optimization Scheme for Coordinated Path Tracking of Dual Robot Manipulators Using Recurrent Neural Network. Front. Neurorobot. 11:47. doi: 10.3389/fnbot.2017.00047* A dual-robot system is a robotic device composed of two robot arms. To eliminate the joint-angle drift and prevent the occurrence of high joint velocity, a velocity-level bi-criteria optimization scheme, which includes two criteria (i.e., the minimum velocity norm and the repetitive motion), is proposed and investigated for coordinated path tracking of dual robot manipulators. Specifically, to realize the coordinated path tracking of dual robot manipulators, two subschemes are first presented for the left and right robot manipulators. After that, such two subschemes are reformulated as two general quadratic programs (QPs), which can be formulated as one unified QP. A recurrent neural network (RNN) is thus presented to solve effectively the unified QP problem. At last, computer simulation results based on a dual three-link planar manipulator further validate the feasibility and the efficacy of the velocity-level optimization scheme for coordinated path tracking using the recurrent neural network.

**Keywords: dual robot manipulators, bi-criteria optimization scheme, recurrent neural network, quadratic program, repetitive motion**

## **1. INTRODUCTION**

Robot manipulators were widely investigated and applied to many fields (Jin et al., 2017; Zhang and Zhang, 2012; Xiao and Zhang, 2013, 2014a; Jin and Zhang, 2015; Zhang et al., 2015; Yamada et al., 2016), such as human–robot interaction, path tracking, industrial manufacturing, military, repetitive motion, and so on. Many researches have been focused on this topic, and various kinds of robot manipulators have been developed and investigated (Li et al., 2012, 2014, 2017; Xiao and Zhang, 2013; Jin and Zhang, 2015; Zhang et al., 2015). As far as we know, there are some manipulation tasks (including large, heavy, awkwardly sized payloads) that cannot be fulfilled by only a single robot manipulator. In contrast, dual robot manipulators can not only complete some common tasks but also can finish some complex and dangerous things that the single robot manipulator is usually hard to finish (Zhang and Li, 2017; Li et al., 2012, 2014; Jin and Zhang, 2015). In addition, dual robot manipulators have been successfully applied to various applications (Jin and Li, 2016; Zhang et al., 2013, 2015; Xiao and Zhang, 2014b; Jin and Zhang, 2015; Jin et al., 2016a), e.g., load transport, cooperative assembly, dextrous grasping, coordinate welding. Therefore, using dual robot manipulators to collectively conduct complicated tasks is becoming increasingly popular.

It is well known that inverse kinematics of robot manipulators (including dual manipulators) is a much more difficult problem than forward kinematics, but it is a fundamental issue in the field of robotics (also including dual robot manipulators). Generally speaking, there are two types of good methods for addressing the inverse kinematic problem. One is based on the pseudoinverse method that includes a homogeneous solution and a specific minimum-norm solution (Klein and Kee, 1989;Klein and Ahmed, 1995). However, the traditional pseudoinverse method needs to compute the inverse/pseudoinverse of matrices, which usually costs a lot of time. In addition, this method would lead to the joint angle drift when the end-effector completes a repetitive motion (Klein and Ahmed, 1995). The second method is based on optimization techniques, which treat performance criteria as objective functions (Jin and Li, 2016; Zhang et al., 2004; Guo and Zhang, 2012; Xiao and Zhang, 2013, 2014a, 2016). Among the existing schemes, single performance criterion is widely used to control the motion of manipulators at different joint levels, such as repetitive motion (Xiao and Zhang, 2013, 2014a), manipulability (Jin and Li, 2016), obstacle avoidance (Xiao and Zhang, 2016), minimum velocity norm (Guo and Zhang, 2012), and minimum torque norm (Zhang et al., 2004).

It is worth pointing out that single criterion optimization schemes cannot satisfy multiple requirements in practical applications, so dual-criteria optimization schemes are needed (Hou et al., 2010). Besides, considering the importance of the repetitive motion control for dual robot manipulators, it also requires an effective criterion for solving the joint-angle drift problem of dual robot manipulators in practical applications (Xiao and Zhang, 2013, 2014a; Zhang et al., 2013). To satisfy the above requirements, in this article, a novel bi-criteria optimization scheme is presented and investigated for coordinated path tracking of dual robot manipulators at the joint velocity level, of which the bi-criteria consist of the minimum velocity motion (MVN) and the repetitive motion (RM). Note that the proposed optimization scheme consists of two subschemes (corresponding to the left and right manipulators). Besides, such two subschemes can be rewritten as two general quadratic programs (QPs), which is further integrated into one QP formulation.

There are a lot of methods to solve the above QP problems, such as numerical algorithms, recurrent neural networks (RNN), and so on. Although the numerical algorithms can iterate good solutions, they are not suitable for real-time implementations due to their series characteristic and computational complexity. As an efficient computation tool, the neural network approach has several potential advantages in real-time applications (Li et al., 2013a,b; Li and Li, 2014; Xiao and Zhang, 2014c; Xiao, 2015, 2016a,b; Xiao and Lu, 2015; Jin et al., 2016b, 2017; Xiao and Liao, 2016), such as parallel processing, hardware implementation ability, and distributed storage. For example, a gradient-based neural network (GNN) has been widely used to solve various challenging mathematical problems (Zhang et al., 2009; Xiao and Zhang, 2011; Yi et al., 2011; Li et al., 2013c; Xiao, 2016c). Considering the advantages of this method, GNN is developed and applied for solving the proposed bi-criteria optimization scheme and the unified QP problem. Finally, on the basis of a dual three-link planar manipulator, we conduct circular path tracking simulations using such a GNN model and the proposed bi-criteria optimization scheme. The computer simulation results further verify the feasibility and effectiveness of the proposed scheme for coordinated path tracking of dual robot manipulators using the recurrent neural network.

### **2. PRELIMINARIES**

The forward kinematic equations of the robot manipulators at the position level and the velocity level can be expressed, respectively, as follows (Jin et al., 2017; Zhang and Zhang, 2012; Xiao and Zhang, 2013, 2014a; Jin and Zhang, 2015; Zhang et al., 2015):

$$r(\mathbf{t}) = f(\theta(\mathbf{t})),\tag{1}$$

$$
\dot{r}(t) = J(\theta)\dot{\theta}(t),\tag{2}
$$

where *θ*(*t*)*∈R n* and ˙*θ*(*t*) *∈ R n* denote the joint position vector and the joint velocity vector, respectively; *r*(*t*)*∈R m* and *r*˙(*t*) *∈ R m* denote the end-effector position vector and the end-effector velocity vector, respectively; Jacobian matrix *J*(*θ*) = *∂f*(*θ*(*t*))*/∂θ ∈ R m×n* ; and *f*(*·*) denotes a smooth non-linear function.

For example, for a three-link planar robot manipulator, we can readily get the forward-kinematic equation (the independent variable *t* is omitted for presentation convenience):

$$r = \begin{bmatrix} r\_\text{\chi} \\ r\_\text{\chi} \end{bmatrix} = \begin{bmatrix} l\_1c\_1 + l\_2c\_2 + l\_3c\_3 \\ l\_1s\_1 + l\_2s\_2 + l\_3s\_3 \end{bmatrix} = f(\theta),$$

where *θ* = [*θ*1, *θ*2, *θ*3] <sup>T</sup> *∈R* 3 , *r ∈R* 2 , *l*<sup>1</sup> denotes the length of the first link, *l*<sup>2</sup> denotes the length of the second link, and *l*<sup>3</sup> denotes the length of the third link. In addition, the variables depicted in the above are defined as

$$\begin{aligned} \mathcal{c}\_1 &= \cos(\theta\_1), \; s\_1 = \sin(\theta\_1), \\ \mathcal{c}\_2 &= \cos(\theta\_1 + \theta\_2), \; s\_2 = \sin(\theta\_1 + \theta\_2), \\ \mathcal{c}\_3 &= \cos(\theta\_1 + \theta\_2 + \theta\_3), \; s\_3 = \sin(\theta\_1 + \theta\_2 + \theta\_3). \end{aligned}$$

The Jacobian matrix of *f*(*·*) can be solved in this situation by differentiating (1):

$$J = \begin{bmatrix} -l\_1s\_1 - l\_2s\_2 - l\_3s\_3 & -l\_2s\_2 - l\_3s\_3 & -l\_3s\_3 \\ l\_1c\_1 + l\_2c\_2 + l\_3c\_3 & l\_2c\_2 + l\_3c\_3 & l\_3c\_3 \end{bmatrix} . \tag{3}$$

Note that, in this article, we are concerned with the dual robot arms. Without loss of generality, one is called the left manipulator and the other is called the right manipulator for convenience. Therefore, the variables of the left and right robot manipulators of dual arms are correspondingly marked by subscripts <sup>l</sup> and <sup>r</sup>. For example, variables *θ*<sup>l</sup> and *θ*<sup>r</sup> denote the joint position vectors of the left and right robot manipulators of dual arms, respectively. In Section 5, we set *l*<sup>1</sup> = *l*<sup>2</sup> = *l*<sup>3</sup> = 1 m.

### **3. SCHEME FORMULATION**

For simplicity, the bi-criteria scheme of one robot manipulator is firstly proposed. To integrate the optimization criteria of the minimum velocity norm (MVN) and the repetitive motion (RM), a bi-criteria optimization objective at the velocity level is designed as

$$\text{minimize} \qquad \|\dot{\theta}\_{\text{l/r}}\|\_{2}^{2}/2 + \|\dot{\theta}\_{\text{l/r}} + q\_{\text{l/r}}\|\_{2}^{2}/2,\tag{4}$$

where *q*l*/*<sup>r</sup> = *ϵ*(*θ*l*/*<sup>r</sup> *− θ*l*/*r(0)) with *ϵ >* 0. Besides, performance index *∥* ˙*θ*l*/*r*∥* 2 <sup>2</sup> can achieve the minimum velocity motion of robot manipulators, and performance index *∥* ˙*θ*l*/*<sup>r</sup> + *q*l*/*r*∥* 2 <sup>2</sup>*/*2 can complete the repetitive motion task at the joint velocity level.

For the left robot manipulator, considering the forward kinematics equation and the above bi-criteria optimization objective, the bi-criteria optimization scheme can be formulated as below:

$$\text{minimize} \qquad \|\dot{\theta}\_{\text{l}}\|\_{2}^{2}/2 + \|\dot{\theta}\_{\text{l}} + q\_{\text{l}}\|\_{2}^{2}/2,\tag{5}$$

$$\text{subject to} \quad J\_{\text{l}}(\theta)\dot{\theta}\_{\text{l}} = \dot{r}\_{\text{l}},\tag{6}$$

where ˙*θ*<sup>l</sup> , *q*<sup>l</sup> , *J*l(*θ*), and *r*˙<sup>l</sup> are defined the same as before, but belong to the variables of the left robot manipulator. Equation (5) uses the bi-criteria optimization objective (equation (4)); and equation (6) is the forward kinematics equation (2) of the left robot manipulator of dual arms.

For the right robot manipulator, the bi-criteria optimization scheme can be formulated as below in the same way:

$$\text{minimize} \qquad \|\dot{\theta}\_{\mathbf{r}}\|\_{2}^{2}/2 + \|\dot{\theta}\_{\mathbf{r}} + q\_{\mathbf{r}}\|\_{2}^{2}/2,\tag{7}$$

$$\text{subject to} \quad J\_{\mathbf{r}}(\theta)\dot{\theta}\_{\mathbf{r}} = \dot{r}\_{\mathbf{r}},\tag{8}$$

where ˙*θ*r, *q*r, *J*r(*θ*), and *r*˙<sup>r</sup> are defined the same as before, but belong to the variables of the right robot manipulator.

### **4. QP REFORMULATION AND UNIFICATION**

In this section, to obtain two standard QP formulations, the proposed subschemes are rewritten as two QPs, which can be unified into one QP problem.

(1) Conversion of MVN criterion: according to definition of two norms, minimizing *∥* ˙*θ*l*∥* 2 <sup>2</sup>*/*2 in the first term of equation (5) for the left robot manipulator is equivalent to

$$\text{minimize} \quad \frac{\dot{\theta}\_1^\mathrm{T} I \dot{\theta}\_1}{2},\tag{9}$$

where *I ∈R n×n* denotes an identity matrix.

Similarly, MVN criterion *∥* ˙*θ*r*∥* 2 <sup>2</sup>*/*2 in the first term of equation (7) for the right robot manipulator is equivalent to

$$\text{minimize} \quad \frac{\dot{\theta}\_{\text{r}}^{\text{T}} I \dot{\theta}\_{\text{r}}}{2}. \tag{10}$$

(2) Conversion of RM criterion: the RM criterion *∥* ˙*θ*<sup>l</sup> + *q*l*∥* 2 <sup>2</sup>*/*2 in the second term of equation (5) for the left robot manipulator is rewritten equivalently as

$$\text{minimize} \quad \frac{(\dot{\theta}\_{\text{l}} + q\_{\text{l}})^{\text{T}}(\dot{\theta}\_{\text{l}} + q\_{\text{l}})}{2},\tag{11}$$

which is further equivalent to the following form:

$$\text{minimize} \quad \frac{\dot{\theta}\_1^\mathrm{T} I \dot{\theta}\_1 + 2q\_1^\mathrm{T} \dot{\theta}\_1 + q\_1^\mathrm{T} q\_1}{2},\tag{12}$$

where *q* T <sup>l</sup> *q*<sup>l</sup> can be deemed as a constant with respect to optimization variable ˙*θ* and can be ignored during minimization. Thus, the RM criterion *∥* ˙*θ*<sup>l</sup> + *q*l*∥* 2 <sup>2</sup>*/*2 of the left robot manipulator is finally equivalent to the following form:

$$\text{minimize} \quad \frac{\dot{\theta}\_{\text{l}}^{\text{T}} I \dot{\theta}\_{\text{l}} + 2q\_{\text{l}}^{\text{T}} \dot{\theta}\_{\text{l}}}{2}. \tag{13}$$

Similarly, the RM criterion *∥* ˙*θ*<sup>r</sup> + *q*r*∥* 2 <sup>2</sup>*/*2 of the right robot manipulator can be equivalent to the following form:

$$\text{minimize} \quad \frac{\dot{\theta}\_{\text{r}}^{\text{T}} I \dot{\theta}\_{\text{r}} + 2q\_{\text{r}}^{\text{T}} \dot{\theta}\_{\text{r}}}{2}. \tag{14}$$

Thus, through the above conversion, the bi-criteria optimization subscheme for the left robot manipulator can be formulated as the following standard QP:

$$\mathbf{x} \text{minimize } \quad \mathbf{x}\_{\mathbf{l}}^{\mathrm{T}} \mathbf{Q}\_{\mathbf{l}} \mathbf{x}\_{\mathbf{l}} / 2 + q\_{\mathbf{l}}^{\mathrm{T}} \mathbf{x}\_{\mathbf{l}}, \tag{15}$$

$$\text{subject to} \quad A\_{\mathbb{I}} \mathfrak{x}\_{\mathbb{I}} = b\_{\mathbb{I}}, \tag{16}$$

where *x*<sup>l</sup> = ˙*θ*<sup>l</sup> *∈ R n* , *Q*<sup>1</sup> = 2*I ∈R n×n* , *q*<sup>l</sup> = *ϵ*(*θ*<sup>l</sup> *− θ*l(0)) *∈ R n* , *A*<sup>l</sup> = *J*l(*θ*)*∈R m×n* , and *b*<sup>l</sup> = *r*˙<sup>l</sup> .

Similarly, the bi-criteria optimization subscheme of the right robot manipulator is presented as

$$\text{minimize} \quad \mathbf{x}\_{\mathbf{r}}^{\mathrm{T}} \mathbf{Q}\_{\mathbf{r}} \mathbf{x}\_{\mathbf{r}}/2 + q\_{\mathbf{r}}^{\mathrm{T}} \mathbf{x}\_{\mathbf{r}}, \tag{17}$$

$$\text{subject to} \quad A\_{\mathbf{r}} x\_{\mathbf{r}} = b\_{\mathbf{r}}, \tag{18}$$

where *x*<sup>r</sup> = ˙*θ*<sup>r</sup> *∈ R n* , *Q*<sup>r</sup> = 2*I ∈R n×n* , *q*<sup>r</sup> = *ϵ*(*θ*<sup>r</sup> *− θ*r(0)) *∈ R n* , *A*<sup>r</sup> = *J*r(*θ*)*∈R m×n* , and *b*<sup>r</sup> = *r*˙r.

Finally, the presented two QPs for the left and right robot manipulators of two arms are unified into a new QP formulation, i.e.,

$$\underset{\dots}{\text{minimize}} \quad z^{\text{T}} \,\,\,\text{W}z/2 + \omega^{\text{T}}z,\tag{19}$$

$$\text{subject to} \quad \mathcal{C}z = d,\tag{20}$$

where coefficient matrices (or vectors) are defined as below:

$$\begin{aligned} z &= \begin{bmatrix} \chi\_{\mathbf{l}} \\ \chi\_{\mathbf{r}} \end{bmatrix} \in \mathbb{R}^{2^n}, \ W = \begin{bmatrix} \mathcal{Q}\_{\mathbf{l}} & \mathbf{0} \\ \mathbf{0} & \mathcal{Q}\_{\mathbf{r}} \end{bmatrix} \in \mathbb{R}^{2^{n \times 2n}}, \\\ \omega &= \begin{bmatrix} q\_{\mathbf{l}} \\ q\_{\mathbf{r}} \end{bmatrix} \in \mathbb{R}^{2^n}, \ C = \begin{bmatrix} J\_{\mathbf{l}}(\theta) & \mathbf{0} \\ \mathbf{0} & J\_{\mathbf{r}}(\theta) \end{bmatrix} \in \mathbb{R}^{2^{m \times 2n}}, \\\ d &= \begin{bmatrix} b\_{\mathbf{l}} \\ b\_{\mathbf{r}} \end{bmatrix} = \begin{bmatrix} \dot{r}\_{\mathbf{l}} \\ \dot{r}\_{\mathbf{r}} \end{bmatrix} \in \mathbb{R}^{2^{m}}. \end{aligned}$$

### **5. RECURRENT NEURAL NETWORK SOLVER**

Note that there are many methods to solve such a standard QP problem. The most common approach is to use a Lagrange multiplier and to minimize a cost function (Li et al., 2013c; Xiao,

manipulator, and **(D)** joint velocity *θ*˙

<sup>r</sup> profile of right manipulator.

2016a). Thus, for dynamic quadratic optimization (equations (19) and (20)), its related Lagrangian is presented as follows:

$$H(z, \lambda) = z^{\mathrm{T}} \mathcal{W}z / 2 + \omega^{\mathrm{T}}z + \lambda^{\mathrm{T}}(\mathrm{C}z - d),$$

where *λ∈R* 2*m* denotes the multiplier variable.

It is well known that solving the quadratic optimization (equations (19) and (20)) could be achieved by zeroing the following equations:

$$\begin{cases} \frac{\partial H(z,\lambda)}{\partial x} = Wz + \omega + \mathcal{C}^{\mathrm{T}}\lambda = \mathbf{0},\\ \frac{\partial H(z,\lambda)}{\partial \lambda(t)} = Cz - d = \mathbf{0}. \end{cases}$$

Let

$$\begin{aligned} G &= \begin{bmatrix} W & \mathbb{C}^{\mathrm{T}} \\ \mathbb{C} & \mathbf{0} \end{bmatrix} \in \mathbb{R}^{(2n+2m)\times(2n+2m)}, \boldsymbol{\upchi} = \begin{bmatrix} \boldsymbol{z} \\ \boldsymbol{\uplambda} \end{bmatrix} \in \mathbb{R}^{2n+2m}, \\ \boldsymbol{\upmu} &= \begin{bmatrix} -\boldsymbol{\omega} \\ \boldsymbol{d} \end{bmatrix} \in \mathbb{R}^{2n+2m}. \end{aligned}$$

The above linear equations can be further equivalent to the following:

$$G\mathcal{y} = \mathfrak{u}.\tag{21}$$

Note that there were a lot of methods to solve the above linear equation system (equation (21)). In this part, a gradient-based neural network (GNN) is presented and investigated for solving the proposed bi-criteria optimization scheme and the finally equivalent equation (21). By following the literature (Zhang et al., 2009; Xiao and Zhang, 2011; Yi et al., 2011; Li et al., 2013c; Xiao, 2016c), the design procedure of GNN is listed as below.

First, an non-negative scalar-based energy function Ω is defined as follows:

$$\Omega = \|G\mathfrak{y} - \mathfrak{u}\|\_2^2 / 2. \tag{22}$$

Second, the negative gradient of Ω can be solved as *−∂*Ω*/∂y* = *G* T (*Gy − u*).

Frontiers in Neurorobotics | www.frontiersin.org

left manipulator, and **(D)** velocity error *ε*˙<sup>r</sup> profile of right manipulator.

Finally, according to gradient neural network design formula *y*˙ = *−γ∂*Ω*/∂y*, the GNN model for dynamic inverse kinematics problem can be described as follows:

$$\dot{\mathbf{y}} = -\gamma \mathbf{G}^{\mathrm{T}} \left( \mathbf{G} \mathbf{y} - \boldsymbol{\mu} \right), \tag{23}$$

where *y ∈R* 2*n*+2*m* denotes the neural state of GNN model (equation (23)).

### **6. SIMULATIVE VERIFICATIONS**

In this part, the unified bi-criteria optimization scheme (equations (19) and (20)) is applied to a dual three-link planar manipulator and solved by the presented GNN model (equation (23)). In computer simulations, the end-effectors of the dual manipulators are expected to simultaneously track a circle. Without loss of generality, design parameters *ϵ* = 10 and *γ* = 10<sup>7</sup> ; the task execution time is 8 s, and the radius of the desired circle is 0.25 m. Besides, the joints of the left and right manipulators are expected to begin with the initial states *θ*l(0) = [3*π*/4, *−*2*π*/5, *−π*/4]<sup>T</sup> rad and *θ*r(0) = [*π*/3, 2*π*/5, *π*/4]<sup>T</sup> rad, respectively. The computer simulations are illustrated in **Figures 1**–**3**, which is solved by the proposed bi-criteria optimization scheme and the presented recurrent neural network.

Specifically, **Figure 1** shows the whole motion trajectories of the dual three-link planar manipulators when the end-effectors track the given circular path. As seen from **Figure 1A**, the circular path-tracking task is performed successfully by the dual three-link planar manipulators. In addition, from **Figure 1B**, we can see that the final state and the initial state of the dual three-link planar manipulators coincide with each other.

**Figure 2** shows the joint-variable (including joint angle and joint velocity) profiles during the task execution of the dual three-link planar manipulators. From this figure, we can conclude that the proposed bi-criteria optimization scheme [synthesized by GNN model (equation (23))] can not only solve the jointangle drift problem but also prevent the occurrence of high joint velocity in this path-tracking task. Specifically, after the endeffectors completing the circular-path tracking task, the final joint states of the left and right manipulators return to their initial states, which can be seen in **Figures 2A,B**. In addition, from **Figures 2C,D**, we can observe that the situation of the high joint velocity does not happen, and the final velocity of each joint for the dual three-link manipulators is equal to zero. It is worth pointing out that, if the final joint velocities is not equal to zero, the manipulator' joints will not stop immediately at the end of the task duration; and thus, the non-repetitive problem would happen. These results demonstrate and verify the effectiveness of such a bi-criteria optimization scheme synthesized by GNN model (equation (23)).

For further verifying the accuracy of the proposed bi-criteria optimization scheme and GNN model (equation (23)), **Figure 3** shows the corresponding position error *ε*(*t*): = *r*(*t*) *− f*(*θ*(*t*)) and the velocity error *ε*˙(*t*) of the left robot manipulator and the right robot manipulator, where *ε<sup>X</sup>* and *ε<sup>Y</sup>* denote, respectively, the X-axis and Y-axis components of *ε*(*t*). As observed from **Figures 3A,B**, the corresponding X-axis and Y-axis components of position errors for the left robot manipulator and the right robot manipulator are less than 2 *×* 10*<sup>−</sup>*<sup>5</sup> m. Besides, from **Figures 3C,D**, we can obtain that the X-axis and Y-axis components of velocity errors for the left robot manipulator and the right robot manipulator are less than 6 *×* 10*<sup>−</sup>*<sup>6</sup> m. These demonstrate that the given circular path tracking task is fulfilled well *via* the proposed velocity-level bi-criteria optimization scheme.

In summary, the end-effector tasks are performed very well by synthesizing the proposed velocity-level bi-criteria optimization scheme. The detailed results verifies the effectiveness and applicability of the proposed bi-criteria optimization scheme for coordinated path tracking of dual redundant robot manipulators using the recurrent neural network.

### **7. CONCLUSION**

In this article, a novel velocity-level bi-criteria optimization scheme (i.e., integrating minimum velocity norm and repetitive motion) has been proposed and investigated for complex motion planning of dual robot manipulators. Such a bi-criteria optimization scheme can not only prevent the occurrence of high joint-velocity but also remedy the joint angle drifts of dual redundant robot manipulators well. In addition, the proposed scheme guarantees the joint velocity equals zero at the end of path tracking motion. To do so, two subschemes have been presented for the left and right robot manipulators, which are reformulated as two general quadratic programs (QPs). Then, such two general QP problems have been further unified into one standard QP formulation. Simulative results based on the dual three-link robot manipulators have substantiated the efficacy and applicability of the proposed velocity-level bi-criteria optimization scheme. The future work may lie in the applications of the bi-criteria optimization scheme to real robot manipulators.

### **AUTHOR CONTRIBUTIONS**

LX: experiment preparation, data acquisition and processing, and publication writing; YZ: experiment preparation, data processing, and publication drafting; BL: experiment technology support and publication review; ZZ and LD: experiment preparation and publication review; LJ: experiment preparation, data acquisition, and publication review.

### **FUNDING**

This work was supported by the National Natural Science Foundation of China under grant 61503152, the Natural Science Foundation of Hunan Province, China under grants 2016JJ2101 and 2017JJ3258, the National Natural Science Foundation of China under grants 61563017, 61363073, 61662025, and 61561022, and the Research Foundation of Jishou University, China under grants 2017JSUJD031, 2015SYJG034, JGY201643, and JG201615.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Xiao, Zhang, Liao, Zhang, Ding and Jin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Robust Adaptive Synchronization of Ring Configured Uncertain Chaotic FitzHugh–Nagumo Neurons under Direction-Dependent Coupling**

*Muhammad Iqbal <sup>1</sup> , Muhammad Rehan<sup>2</sup> \* and Keum-Shik Hong<sup>3</sup> \**

*<sup>1</sup>Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad, Pakistan, <sup>2</sup>Department of Electrical Engineering, Pakistan Institute of Engineering and Applied Sciences (PIEAS), Islamabad, Pakistan, <sup>3</sup>Department of Cogno-Mechatronics Engineering, School of Mechanical Engineering, Pusan National University, Busan, South Korea*

This paper exploits the dynamical modeling, behavior analysis, and synchronization of a network of four different FitzHugh–Nagumo (FHN) neurons with unknown parameters linked in a ring configuration under direction-dependent coupling. The main purpose is to investigate a robust adaptive control law for the synchronization of uncertain and perturbed neurons, communicating in a medium of bidirectional coupling. The neurons are assumed to be different and interconnected in a ring structure. The strength of the gap junctions is taken to be different for each link in the network, owing to the interneuronal coupling medium properties. Robust adaptive control mechanism based on Lyapunov stability analysis is employed and theoretical criteria are derived to realize the synchronization of the network of four FHN neurons in a ring form with unknown parameters under direction-dependent coupling and disturbances. The proposed scheme for synchronization of dissimilar neurons, under external electrical stimuli, coupled in a ring communication topology, having all parameters unknown, and subject to directional coupling medium and perturbations, is addressed for the first time as per our knowledge. To demonstrate the efficacy of the proposed strategy, simulation results are provided.

#### *Edited by:*

*Muhammad Umer Khan, Air University, Pakistan*

#### *Reviewed by:*

*Huaguang Gu, Tongji University, China Rongyu Tang, Beijing Institute of Technology, China*

### *\*Correspondence:*

*Muhammad Rehan rehanqau@gmail.com; Keum-Shik Hong kshong@pusan.ac.kr*

*Received: 01 August 2017 Accepted: 02 February 2018 Published: 26 February 2018*

#### *Citation:*

*Iqbal M, Rehan M and Hong K-S (2018) Robust Adaptive Synchronization of Ring Configured Uncertain Chaotic FitzHugh–Nagumo Neurons under Direction-Dependent Coupling. Front. Neurorobot. 12:6. doi: 10.3389/fnbot.2018.00006* **Keywords: FitzHugh–Nagumo neuron, neuronal networks, ring configuration, coupling strengths, robust adaptive synchronization control**

## **INTRODUCTION**

The spurred efforts to get an insight of the complex and opaque interactions among the levels of various neuronal networks is a major aspiration in neuroscience, because it would be an incredible abet to explore the foundation of normal and pathological brain functioning (Buzsaki, 2006; Alvarellos-Gonzalez et al., 2012; Aqil et al., 2012b). For example, one would be able to unveil how a steering signal is generated for muscles from the brain or how neurons diminish during brain disorders like Parkinson's, Huntington's, and epilepsy (Deak et al., 2007; Di Garbo et al., 2007; Mejias and Torres, 2007; Limousin and Martinez-Torres, 2008; Jobst, 2010; and Ostrem and Starr, 2008). The brain's mechanisms of operations have their own realism in interconnection and signal transmission, which has enthused many researchers to investigate brain activity at multiple levels (Naseer and Hong, 2013; Hong and Nguyen, 2014; Santosa et al., 2014; Hong and Naseer, 2016; Hong and Santosa, 2016; Nguyen and Hong, 2016; Zafar and Hong, 2017), ranging from a single neuron to a network of neurons. Brain has a number of complex functions and activities in relation to cognitive purposes (Santosa et al., 2013; Hong et al., 2015, 2017; Naseer et al., 2016; Nguyen et al., 2016). These brain activities can be somehow measured using various modalities and sensors in order to identify the intension of a subject (Turnip et al., 2011; Khan et al., 2014; Hong and Khan, 2017). Therefore, in-depth research has been done on modeling, analysis, instrumentation, and control of external devices in the area of brain-computer interfaces (Khan and Hong, 2015, 2017; Kocaturk et al., 2015; Naseer and Hong, 2015; Ghafoor et al., 2017; Liu and Hong, 2017).

Neuronal networks have been a thought-provoking and imperative subject owing to the various potential real-world processes, estimation, control and robotic applications [see Ellacott et al. (1997) and references therein]. In a neuronal network, a large number of neurons are inter-connected in various fashions under multifarious coupling phenomena. Recently, the studies on the dynamical behavior of a single neuron, a collective behavior of coupled neurons, and synchronization among the neurons have been extensively investigated (Thompson et al., 1999; Hua and Smith, 2004; Zhang et al., 2006;Wu and Chen, 2008;Yu et al., 2013; Wang et al., 2015). Synchronization of neurons plays a key role in the transmission process of neuronal signals, and enables effective communications in the brain or to the muscles (Knoblauch and Palm, 2005; Wang et al., 2008a,b; Nguyen and Hong, 2011, 2013). The FitzHugh–Nagumo (FHN) system, a simplified model of the coupling effect of neurons, has been considered largely owing to the fact that it mimics the dynamical behavior of neurons and intricates neuronal networks under external electrical stimulation (Thompson et al., 1999).

Neuroscience enriched by numerous reports in the context of coupled FHN neurons has opened a new avenue of research during the past few years. The simplest model to mimic the dynamical properties of neuronal interactions (such as synchronization) consists of two coupled neurons (Wang et al., 2009; Zhen and Xu, 2010; Aqil et al., 2012a; Iqbal et al., 2015, 2017). A control and synchronization methodology was designed to investigate the coupled reaction–diffusion FHN systems in Ambrosio and Aziz-Alaoui (2012). Synchronization of two coupled neurons was carried out by employing an adaptive backstepping sliding mode control in Yu et al. (2012). A theoretical criterion was presented for the synchronization of uncertain chaotic coupled systems for a neural network *via* the sliding mode technique by Chen et al. (2009). Synchronization of two identical coupled FHN systems with known or unknown parameters has been studied *via* a nonlinear adaptive control based on the fuzzy logic scheme, neural networks, the uncertainty estimator, and the feedback linearization control (Wang et al., 2007, 2008a,b; Zhang et al., 2007; Che et al., 2009), respectively. Later, a robust adaptive control for synchronization of two coupled FHN neurons of unknown parameters has been developed. Moreover, some important results for the synchronization of three-coupled FHN neurons having slightly different unknown parameters and disturbances with respect to multiple communication pathways have been explored (Rehan and Hong, 2011; Rehan et al., 2011). For more related investigations, synchronization of two coupled FHN neurons with unknown and different parameters under direction-dependent coupling has been discussed in Iqbal et al. (2014).

To a certain extent, efforts have been dedicated to the study of the dynamics of the neuronal networks coupled in a ring fashion, specifically by exploiting the impact of time delays (Campbell et al., 2005; Xu, 2008; Song and Xu, 2012; Zhang, 2014; Wang et al., 2015; Mao and Wang, 2016; Yuan et al., 2016; Mao, 2017). A recent work by Zhou et al. (2009) extended the synchronization problem to a network of coupled FHN neurons and explored the impact of the gap junctions on the network. It was investigated that the influence of the gap junctions on the dynamical behavior of neurobiological networks is stronger than the coupled systems. In addition, interestingly, a network of the FHN neurons exhibits a more fascinating dynamically complicated behavior than two or three coupled FHN neurons.

Some interesting works on synchronization of neurons have been accomplished in the recent years by employing various complexities. For instance, the work of Lai et al. (2008) employed an adaptive control approach, which provided synchronization of FHN neurons under a sinusoidal electrical field. The approach, however, may not ensure asymptotic convergence of the synchronization error and additional parameters are required for achieving the adaptation. To attain the robust synchronization of FHN neurons, Wei et al. (2009) introduced an internal model control strategy for output synchronization between the neurons using a semi-global Lyapunov approach. For dealing with perturbations, sliding surface-based control schemes were developed by Che et al. (2011) and Yu et al. (2012) in the presence of resistive coupling between the neurons. A step further, model complexity along with the behavioral analyses and control approach for phase synchronization between neurons were studied in the recent study by Ma et al. (2017). Despite of these studies, several open problems and challenges include synchronization in multiple coupled neurons and coupling model complexities.

In the earlier works, the research was limited to the simple scenarios of two or three coupled FHN neuronal models, since such simple scenarios were easily addressable. But, the operational mechanisms in the brain cannot be described with simple systems owing to the complex interactions (coupling) existing among the large number of neurons. Consequently, in order to explore the dynamical behavior of real complex systems, it is indispensable and challenging to work on larger coupled networks instead of a simple model of coupled systems (or reduced networks). In addition, the coupling models between the neurons should also be addressed as much as possibly closer to the actual complex medium strengths. Moreover, controlling of behaviors of neurons can be accomplished *via* adaptive control approaches in order to develop intelligent methods of adaptation according to the dynamical circumstances (Oyama et al., 2016; Stewart et al., 2016; Aoi et al., 2017). In conclusion, considering a neuronal network with unknown parameters in which a large number of neurons are communicating under complex couplings, namely, directiondependent coupling, can lead to enhance the theoretical and numerical analysis of neuronal systems' complexity, which is a pretty challenging research task.

Motivated by the aforementioned rationale, the aim of this paper is to investigate the dynamical behavior and synchronization of a network of different FHN neurons with unknown parameters, linked in a ring configuration, under direction-dependent coupling mediums. The direction-dependent coupling has been employed due to direction-dependent behavior of the gap junctions. The gap junctions between neurons can either allow current in one or in both (but with different strengths) directions, giving rise to the so-called direction-dependent coupling between neurons, see Iqbal et al. (2014). A model of four different FHN neurons, coupled in a ring topology, under external disturbances is presented. The different strength of the gap junctions for each link in the network owing to the inter-neuronal coupling medium properties is considered. A robust adaptive control is designed to address the intricate problem of the synchronization in a network of neurons. Based on Lyapunov stability theory, conditions are derived analytically for the synchronization in a network of four different FHN neurons with unknown parameters in a ring configuration under direction-dependent coupling and disturbances. The developed robust adaptive control algorithm encounters the problem of dealing with different recovery variables. Unlike the synchronization approach, partial synchronization of neurons by Iqbal et al. (2014), the proposed scheme ensures the complete synchronization of neurons. To the best of our knowledge, the robust adaptive control mechanism for synchronization of different neurons with unknown parameters in the ring configuration under direction-dependent coupling and disturbances is addressed for the first time. Essentially, the outcome of this study can edify new ideas for understanding of the neuronal networks in context of multifaceted coupling phenomena. Compared with the existing works on synchronization of two or three neurons, our study considers a complex scenario for synchronizing four neurons in a ring configuration under direction-dependent coupling, parametric uncertainties, and perturbations. This study shows the possibility of a robust and adaptive control strategy for attaining the coherent behavior among neurons forming a complicated network under an external electrical stimulation. To end with, extensive numerical simulation results are drawn to elucidate the efficacy of the proposed method.

There are several differences in this study compared to the existing works. For instance, this study considers a ring configuration of multiple neurons rather than an interconnection of two neurons as in Wang et al. (2007), Zhang et al. (2007), Wang et al. (2008a,b), Che et al. (2009), Rehan and Hong (2011), Lai et al. (2008), Wei et al. (2009), Che et al. (2011), Yu et al. (2012), and Ma et al. (2017). In addition, the current flow between two neurons is considered as direction-dependent, compared to these models, for regarding bidirectional coupling formed by the gap junctions. Moreover, the models of neurons in our study have different parameters to examine synchronization of dissimilar neurons. Compared to synchronization study in Rehan et al. (2011) for three FHN neurons, we develop a control approach for robust adaptive synchronization and all the parameters are considered to be unknown and different. Moreover, we employ a more complex scenario of four neurons, bidirectional coupling, and ring configuration. In comparison to the recent neuronal synchronization study of Iqbal et al. (2015), there are three contributions in this work. First, we consider multiple neurons for developing a synchronization control approach owing to the presence of multiple coupled neuronal interactions in the brain; second, synchronization of both activation potentials and recovery variables has been achieved in the proposed study; third, the idea of bidirectional coupling between two neurons has been extended to a ring configuration of neurons.

The rest of the manuscript are organized as follows: Section "Results and Discussion" discusses the main results, containing the modeling of a network of different FHN neurons with unknown parameters linked in a ring configuration under direction-dependent coupling, the design of a robust adaptive control mechanism, synchronization in the network without disturbance, synchronization in the network with disturbance, and numerical simulation results. Section "Methods" includes the employed methods, namely, FHN model, Lyapunov stability analysis, and proof of the main results without and with disturbances. Section "Conclusion", finally, includes the study conclusions.

### **RESULTS AND DISCUSSION**

### **Ring Configured FHN Neurons under Direction-Dependent Coupling**

The ring configuration of four neurons coupled in a bidirectional medium is shown in **Figure 1**. Let N1 be the master neuron, and N2, N3, and N4 be the slave neurons. We employ control signals for the synchronization of the slave neurons with the master neuron. The purpose of this study is to model the neuronal behavior and to provide a synchronization control remedy for attaining the coherent behavior of the neurons. The proposed network model of ring configured four FHN neurons under direction-dependent coupling [by accounting the results of Iqbal et al. (2014) and Yuan et al. (2016)] is given by

$$\begin{split} \dot{\mathbf{x}}\_{1} &= \mathbf{x}\_{1}(\mathbf{x}\_{1} - 1)(1 - r\_{1}\mathbf{x}\_{1}) - \mathbf{y}\_{1} - \mathbf{g}\_{1} \left[ (\mathbf{x}\_{1} - \mathbf{x}\_{2}) + (\mathbf{x}\_{1} - \mathbf{x}\_{4}) \right] \\ &+ I\_{\text{ext},1} + d\_{\text{ext},1}, \end{split} \tag{1}$$

$$\begin{aligned} \dot{y}\_1 &= b\_1 \mathbf{x}\_1, \\ \dot{\mathbf{x}}\_2 &= \mathbf{x}\_2 (\mathbf{x}\_2 - 1)(1 - r\_2 \mathbf{x}\_2) - y\_2 - \mathbf{g}\_2 \left[ (\mathbf{x}\_2 - \mathbf{x}\_1) + (\mathbf{x}\_2 - \mathbf{x}\_3) \right] \\ &+ I\_{\text{ext},2} + d\_{\text{ext},2}, \\ \mathbf{x} &:= \mathbf{x} \end{aligned} \tag{2}$$

$$\begin{aligned} \dot{y}\_2 &= b\_2 \mathbf{x}\_2, \\ \dot{\mathbf{x}}\_3 &= \mathbf{x}\_3 (\mathbf{x}\_3 - 1)(1 - r\_3 \mathbf{x}\_3) - y\_3 - \mathbf{g}\_1 \left[ (\mathbf{x}\_3 - \mathbf{x}\_2) + (\mathbf{x}\_3 - \mathbf{x}\_4) \right] \\ &+ I\_{\text{ext}, 3} + d\_{\text{ext}, 3}, \end{aligned} \tag{3}$$

$$\begin{aligned} \dot{p}\_3 &= b\_3 \mathbf{x}\_3, \\ \dot{\mathbf{x}}\_4 &= \mathbf{x}\_4 (\mathbf{x}\_4 - 1)(1 - r\_4 \mathbf{x}\_4) - y\_4 - g\_4 \left[ (\mathbf{x}\_4 - \mathbf{x}\_3) + (\mathbf{x}\_4 - \mathbf{x}\_1) \right] \\ &+ I\_{\text{ext}, 4} + d\_{\text{ext}, 4}, \\ \dot{p}\_4 &= b\_4 \mathbf{x}\_4, \end{aligned} \tag{4}$$

where *x*<sup>1</sup> and *y*<sup>1</sup> are the model states of the master FHN neuron, namely, the activation potential and the recovery variable, respectively. The *x*<sup>2</sup> and *y*<sup>2</sup> represent the states of the first slave neuron, *x*<sup>3</sup> and *y*<sup>3</sup> correspond to the second slave neuron states, and *x*<sup>4</sup> and *y*<sup>4</sup> are the states for the fourth neuron. The parameters (*r*1, *r*2, *r*3, *r*4) and (*b*1, *b*2, *b*3, *b*4) are related with the neurons'

nonlinear parts and recovery variable dynamics, respectively. The terms*Iext,*1, *Iext,*2, *Iext,*3, and *Iext,*<sup>4</sup> represent the external stimulation currents, where *Iext,<sup>i</sup>* = (*A/*ω) cos(ω*t*)for*i* = 1*,* 2*,* 3*,* 4, ω = 2π*f*. Here, *f* denotes the frequency and *A* denotes the amplitude of stimulation current. The gap junctions' strengths for communication between neurons are represented by *g*1, *g*2, *g*3, and *g*4. Disturbances at neurons are denoted by *dext,*1, *dext,*2, *dext,*3, and *dext,*4.

Various models of coupled neurons were considered in the studies (Wang et al., 2007, 2008a,b; Zhang et al., 2007; Che et al., 2009; Chen et al., 2009; Rehan and Hong, 2011; Rehan et al., 2011; Ambrosio and Aziz-Alaoui, 2012; Aqil et al., 2012a; Yu et al., 2012). However, these studies considered simple neuronal models with direction-independent coupling. The work of Iqbal et al. (2014) introduced the direction-dependent coupling. However, the ring configuration of neurons and coupling between several neurons were lacking. It should be noted that the model parameters associated with the proposed network of FHN neurons in Eqs (1)–(4) are totally uncertain and different. In addition, the proposed systematic approach considering direction-dependent coupling, different parameters, disturbances to the network model, and ring topology, in contrast to the simple models offered in Wang et al. (2007, 2008a,b), Zhang et al. (2007), Che et al. (2009), Chen et al. (2009), Rehan and Hong (2011), Rehan et al. (2011), Ambrosio and Aziz-Alaoui (2012), Aqil et al. (2012a), Yu et al. (2012), and Iqbal et al. (2014), which empowers a more realistic and generalized model.

In order to explore the complex behavior of the network model of the ring configured with different four FHN neurons under direction-dependent coupling, we first set the model parameters as *r*<sup>1</sup> = 10, *r*<sup>2</sup> = 10.2, *r*<sup>3</sup> = 10.4, *r*<sup>4</sup> = 10.6, *b*<sup>1</sup> = 1, *b*<sup>2</sup> = 1.01, *b*<sup>3</sup> = 1.02, *b*<sup>4</sup> = 1.03, *g*<sup>1</sup> = 0.001, *g*<sup>2</sup> = 0.002, *g*<sup>3</sup> = 0.003, *g*<sup>4</sup> = 0.004, and *f* = 0.127. The disturbances are accounted as *dext,*<sup>1</sup> = 0*.*1 sin 12*t*, *dext,*<sup>2</sup> = 0*.*1 sin 20*t*, *dext,*<sup>3</sup> = 0*.*1 sin 25*t*, and *dext,*<sup>4</sup> = 0*.*1 sin 23*t*. The stimulation amplitude is selected as *A* = 0.01. **Figure 2** depicts the results for the network of different FHN neurons under direction-dependent coupling. The phase portraits of four FHN chaotic neurons are shown in **Figures 2A–D**. These phase portraits show that the neurons have oscillatory behaviors. **Figures 3** and **4** exhibit the nonsynchronous behavior of the network of four FHN neurons for activation potentials and recovery variables (to be explained later). The phase portrait in **Figure 2A** displays the chaotic behavior of first neuron. The second neuron's chaotic behavior can be observed in **Figure 2B**. The chaotic behaviors for third and fourth neurons can be deduced from **Figures 2C,D**, respectively. The Lyapunov exponent has been computed for all the four neurons in **Figures 2A–D** using the approach provided in Iqbal et al. (2014), which come out to be 0.120, 0.058, 0.371, and 0.097. In conclusion, **Figures 2**–**4** along with positive values of the Lyapunov exponent show that all of neurons in the network possess the chaotic behavior, as provided in **Figures 2A–D**, and are not synchronous, as indicated in **Figures 3** and **4**.

## **Adaptive Control Mechanism and Error Dynamics**

This section provides compact equations for the error dynamics, controller, and adaptation laws. This work offers an adaptive

control mechanism for the synchronization of ring configured four FHN neurons under direction-dependent coupling. Thus, model in Eqs (1)–(4) becomes

$$\begin{split} \dot{\mathbf{x}}\_{1} &= \mathbf{x}\_{1}(\mathbf{x}\_{1} - 1)(1 - r\_{1}\mathbf{x}\_{1}) - \mathbf{y}\_{1} - \mathbf{g}\_{1} \left[ (\mathbf{x}\_{1} - \mathbf{x}\_{2}) + (\mathbf{x}\_{1} - \mathbf{x}\_{4}) \right] \\ &+ I\_{\text{ext},1} + d\_{\text{ext},1}, \end{split} \tag{5}$$

$$\begin{aligned} \dot{\mathbf{y}}\_1 &= b\_1 \mathbf{x}\_1, \\ \dot{\mathbf{x}}\_2 &= \mathbf{x}\_2 (\mathbf{x}\_2 - 1)(1 - r\_2 \mathbf{x}\_2) - \mathbf{y}\_2 - \mathbf{g}\_2 \left[ (\mathbf{x}\_2 - \mathbf{x}\_1) + (\mathbf{x}\_2 - \mathbf{x}\_3) \right] \\ &+ I\_{\text{ext},2} + d\_{\text{ext},2} + \mu\_{\text{xl}}, \end{aligned} \tag{6}$$

$$\begin{aligned} \dot{y}\_2 &= b\_2 \mathbf{x}\_2 + u\_{y1}, \\ \dot{\mathbf{x}}\_3 &= \mathbf{x}\_3 (\mathbf{x}\_3 - 1)(1 - r\_3 \mathbf{x}\_3) - y\_3 - \mathbf{g}\_1 \left[ (\mathbf{x}\_3 - \mathbf{x}\_2) + (\mathbf{x}\_3 - \mathbf{x}\_4) \right] \\ &+ I\_{\text{ext},3} + d\_{\text{ext},3} + u\_{\text{x2}}, \end{aligned} \tag{7}$$

$$\begin{aligned} \dot{y}\_3 &= b\_3 \mathbf{x}\_3 + u\_{y2}, \\ \dot{\mathbf{x}}\_4 &= \mathbf{x}\_4 (\mathbf{x}\_4 - 1)(1 - r\_4 \mathbf{x}\_4) - y\_4 - g\_4 \left[ (\mathbf{x}\_4 - \mathbf{x}\_3) + (\mathbf{x}\_4 - \mathbf{x}\_1) \right] \\ &+ I\_{\text{ext},4} + d\_{\text{ext},4} + u\_{\text{x3}}, \\ \dot{y}\_4 &= b\_4 \mathbf{x}\_4 + u\_{y3}, \end{aligned} \tag{8}$$

where *ux*1, *ux*2, and *ux*<sup>3</sup> and *uy*1, *uy*2, and *uy*<sup>3</sup> are the control inputs. We address a complete synchronization problem for the network model of ring configured FHN neurons in the context of their activation potentials and recovery variables, in contrast to the study of Iqbal et al. (2014), which has demonstrated the synchronization of two FHN neurons for their activation potentials only. To derive the control laws, the synchronization errors can be written as

$$e\_{\mathbf{x}1} = \mathbf{x}\_1 - \mathbf{x}\_2,\\ e\_{\mathbf{x}2} = \mathbf{x}\_1 - \mathbf{x}\_3,\\ e\_{\mathbf{x}3} = \mathbf{x}\_1 - \mathbf{x}\_4,\tag{9}$$

$$e\_{\rm y1} = y\_1 - y\_2, e\_{\rm y2} = y\_1 - y\_3, e\_{\rm y3} = y\_1 - y\_4. \tag{10}$$

It is worth mentioning that all six synchronization errors in Eqs (9) and (10) are introduced for attaining the complete synchronization, compared to the existing method of Iqbal et al. (2014). **Figure 3A** demonstrates the nonsynchronous behavior of neurons in terms of activation potentials. The spikes in the activation potential errors for the neurons can be observed in the plots of **Figure 3B**. On the same basis, demonstration of nonidentical responses of the FHN neurons in the recovery variable states is provided in **Figure 4A**. The spikes in individual behaviors of synchronization errors in the recovery variables are provided in **Figure 4B**. These spikes in synchronization errors of activation potentials and recovery variables depict that the firing in neurons are not coherent at all. By employing Eqs (5)–(10), the synchronization error dynamics after lengthy algebra take the form

$$\begin{aligned} \dot{\mathbf{e}}\_{\mathbf{x}1} &= \Phi\_{\mathbf{l}}^{\ \ \ \ \ \mathbf{T}\_{\mathbf{l}}(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_{\mathbf{2}}) + F\_{\mathbf{l}}(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_{\mathbf{2}}) - \mathbf{e}\_{\mathbf{x}1} + d\_{\mathbf{x}1} - \mu\_{\mathbf{x}1}, \\ \dot{\mathbf{e}}\_{\mathbf{y}1} &= \Psi\_{\mathbf{l}}^{\ \ \ \ \mathbf{Y}\_{\mathbf{1}}} \mathbf{Y}\_{\mathbf{1}}(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_{\mathbf{2}}) - \mu\_{\mathbf{y}1}, \end{aligned} \tag{11}$$

$$\begin{aligned} \dot{e}\_{\mathbf{x}2} &= \Phi\_2 \, ^T \Gamma\_2(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_{\mathbf{3}}) + F\_2(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_{\mathbf{3}}) - e\_{\mathbf{x}2} + d\_{\mathbf{x}2} - u\_{\mathbf{x}2}, \\ \dot{e}\_{\mathbf{y}2} &= \Psi\_2 \, ^T \mathbf{Y}\_2(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_{\mathbf{3}}) - u\_{\mathbf{y}2}, \end{aligned} \tag{12}$$

$$\begin{aligned} \dot{e}\_{\mathbf{x}3} &= \Phi\_3 \, ^T \Gamma\_3(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_4) + F\_3(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_4) - e\_{\mathbf{x}3} + d\_{\mathbf{x}3} - u\_{\mathbf{x}3}, \\ \dot{e}\_{\mathbf{y}3} &= \Psi\_3 \, ^T \mathbf{Y}\_3(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_4) - u\_{\mathbf{y}3}. \end{aligned} \tag{13}$$

The whole derivation of the error dynamics and the relevant matrices can be seen in the Section "Methods". The proposed controllers for the ring configured FHN neurons are selected as

$$\begin{aligned} \mu\_{\mathbf{x}1} &= \hat{\boldsymbol{\Phi}}\_{1}^{T} \boldsymbol{\Gamma}\_{1}(\boldsymbol{x}\_{\mathrm{l}}, \boldsymbol{x}\_{2}) + F\_{1}(\boldsymbol{x}\_{\mathrm{l}}, \boldsymbol{x}\_{2}) + K\_{1} \boldsymbol{e}\_{\mathrm{x}1}, \\ \mu\_{\mathbf{y}1} &= \hat{\boldsymbol{\Psi}}\_{1}^{T} \boldsymbol{Y}\_{1}(\boldsymbol{x}\_{\mathrm{l}}, \boldsymbol{x}\_{2}), \end{aligned} \tag{14}$$

$$\begin{aligned} \mu\_{\mathbf{x}2} &= \hat{\Phi}\_2^T \Gamma\_2(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_3) + F\_2(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_3) + K\_2 e\_{\mathbf{x}2}, \\ \mu\_{\mathbf{y}2} &= \hat{\Psi}\_2^T \Gamma\_2(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_3), \end{aligned} \tag{15}$$

$$\begin{aligned} \mu\_{\mathbf{x}3} &= \hat{\Phi}\_3^T \Gamma\_3(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_4) + F\_3(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_4) + K\_3 \mathbf{e}\_{\mathbf{x}3}, \\ \mu\_{\mathbf{y}3} &= \hat{\Psi}\_3^T \mathbf{Y}\_3(\mathbf{x}\_{\mathbf{l}}, \mathbf{x}\_4). \end{aligned} \tag{16}$$

The selected adaptation laws are

$$\begin{aligned} \dot{\hat{\Phi}}\_1 &= p\_1 e\_{\ge 1} \Gamma\_1(\mathbb{x}\_{\mathbb{L}}, \mathbb{x}\_2) \, 1/q\_1, \\ \dot{\hat{\Psi}}\_1 &= l\_1 e\_{\mathbb{M}} \, \mathcal{Y}\_1(\mathbb{x}\_{\mathbb{L}}, \mathbb{x}\_2) \, 1/m\_1, \end{aligned} \tag{17}$$

$$
\dot{\hat{\Phi}}\_2 = p\_2 e\_{\ge 2} \Gamma\_2(\infty, \infty) 1/q\_2,\tag{18}
$$

$$
\dot{\hat{\Psi}}\_2 = l\_2 e\_{\flat^2} \mathbf{Y}\_2(\mathfrak{x}\_{\flat}, \mathfrak{x}\_{\flat}) 1/m\_2,
$$

$$\begin{aligned} \dot{\hat{\Phi}}\_3 &= \rho\_3 e\_{\ge 3} \Gamma\_3(\mathbb{x}\_1, \mathbb{x}\_4) 1/q\_3, \\ \dot{\hat{\Psi}}\_3 &= l\_3 e\_{\ge 3} \, \mathcal{Y}\_3(\mathbb{x}\_1, \mathbb{x}\_4) 1/m\_3, \end{aligned} \tag{19}$$

where the scalars sets (*p*1, *p*2, *p*3), (*q*1, *q*2, *q*3), (*l*1, *l*2, *l*3), and (*m*1, *m*2, *m*3) enclose positive scalars. In the next subsection, adaptive and robust adaptive synchronization control conditions are provided in the network of ring configured neurons.

### **Adaptive Synchronization**

Now, a theoretical condition is developed for the synchronization of ring configured neurons under direction-dependent coupling Eqs (5)–(8) by application of adaptive control mechanism in Eqs (14)–(16) with adaptation law in Eqs (17)–(19). The following assumption is taken to obtain the main results.

**Assumption 1**. The parameters in the network of four FHN neurons in Eqs (5)–(8) and couplings, given by (*r*1*,r*2*,r*3*,r*4*, b*1*, b*2*, b*3*, b*4*, g*1*, g*2*, g*3*, g*4), are unknown constants.

**Theorem 1**. *Consider a network model of ring configured four FHN neurons in Eqs (5)–(8) having synchronization error dynamics Eqs (11)–(13) satisfying Assumption 1 with zero disturbances. Adaptive control mechanism Eqs (14)–(16) and the adaptation law given by Eqs (17)–(19) selected through p*(*K*1+1) *>* 0*, p*(*K*2+1) *>* 0, and *p*(*K*<sup>3</sup> + 1) *>* 0 *will ensure synchronization of the network model of ring configured neurons in terms of activation potentials by guaranteeing the convergence of synchronization errors to zero. In addition, if the steady-state is attained in a finite amount of time, the convergence of* Φˆ *<sup>i</sup> to* Φˆ *∗ <sup>i</sup> and* Ψˆ *<sup>i</sup> to* Ψˆ *∗ i for all i* = *1, 2, 3,* *are ensured for constant steady-state vector values* Φˆ *∗ <sup>i</sup> and* Ψˆ *∗ i , validating* (Φ*<sup>i</sup> −* Φˆ *∗ i* ) *T* Γ*<sup>i</sup>* = 0 *and* (Ψ*<sup>i</sup> −* Ψˆ *∗ i* ) *T* Υ*<sup>i</sup>* = 0.

The proof of the main result of Theorem 1 can be viewed in the next section. The result is important from the synchronization of a network of neurons point of view. In contrast to Iqbal et al. (2014), the proposed strategy in Theorem 1 can be used for complete synchronization of a network of different FHN neurons with unknown parameters. In addition, we considered multiple neurons linked in a ring configuration under direction-dependent coupling. In contrast to the conventional results like Wang et al. (2007, 2008a,b), Zhang et al. (2007), Che et al. (2009), Chen et al. (2009), Rehan and Hong (2011), Rehan et al. (2011), Ambrosio and Aziz-Alaoui (2012), Aqil et al. (2012a), and Yu et al. (2012), several aspects like uncertainties, ring configuration, different neurons, several number of neurons, and direction-dependent coupling are incorporated to design a matter-of-fact control approach of Theorem 1. Adaptations are employed for the synchronization of four neurons for dealing with a large number of unknown parameters. Additionally, a realistic approach has been followed for the adaptive control by considering all four neurons of different dynamics. The conventional studies assume that the FHN neurons have the same dynamical aspects.

In comparison to the works in Wang et al. (2007), Zhang et al. (2007), Wang et al. (2008a,b), Che et al. (2009), Rehan and Hong (2011), Lai et al. (2008), Wei et al. (2009), Che et al. (2011), Yu et al. (2012), and Ma et al. (2017), the proposed synchronization approach in Theorem 1 considers multiple neurons, directional coupling, and ring configuration to develop an adaptive mechanism for synchronization. The work of Rehan et al. (2011) considered synchronization in three neurons with known parameters. Here in this study, we consider adaptation of the parameters, and adaptation laws are introduced to achieve coherent behaviors in neurons with unknown and dissimilar parameters of neurons. In addition, a different configuration and direction-dependent couplings are employed in the proposed method of Theorem 1. The approach of Iqbal et al. (2015) developed a strategy to achieve synchronization in activation potentials and proposed a method to deal with two neurons only. In this case, we also provide a mechanism for synchronization recovery variables as well and provide an extension to a ring of four neurons.

### **Robust Adaptive Synchronization with Disturbance**

In this subsection, a methodology for the synchronization in a network of different FHN neurons with unknown parameters linked in a ring configuration under direction-dependent coupling and disturbances is presented. In addition to Assumption 1, we take the following supposition.

**Assumption 2**. Assume that the inequalities, given by *∥dx*1*∥ ≤ dm*1, *∥dx*2*∥ ≤ dm*2, *∥dx*3*∥ ≤ dm*3, and *∥*Φ*i∥ ≤* Φ*mi, ∀i* = 1*,* 2*,* 3, hold.

**Theorem 2**. *Consider a network model of ring configured four FHN neurons in Eqs (5)–(8), having synchronization error dynamics in Eqs (11)–(13) satisfying Assumptions 1–2. Suppose* *the proposed adaptive control mechanism in Eqs (14)–(16) and the modified adaptation laws given by*

$$\begin{aligned} \dot{\hat{\Phi}}\_1 &= \left( p \varrho\_{\ge 1} \Gamma\_1 - k\_\varepsilon ||\varrho\_{\ge 1}|| \, \hat{\Phi}\_1 \right) / q \,, \\ \dot{\hat{\Psi}}\_1 &= l \varrho\_{\ge 1} \mathbf{Y}\_1 / m \,, \end{aligned} \tag{20}$$

$$
\dot{\hat{\Phi}}\_2 = \left( p e\_{\mathbf{x}2} \Gamma\_2 - k\_\varepsilon ||e\_{\mathbf{x}2}|| \, \hat{\Phi}\_2 \right) / q \,, \tag{21}
$$

$$\begin{aligned} \dot{\hat{\Psi}}\_2 &= \mathrm{le}\_{\mathcal{V}^2} \mathbf{Y}\_2 / m, \\ \dot{\hat{\Phi}}\_3 &= \left( \mathrm{pe}\_{\mathcal{X}^3} \Gamma\_3 - k\_\epsilon ||\mathbf{e}\_{\mathcal{X}^3}|| \, \hat{\Phi}\_3 \right) / q \,, \\ \dot{\hat{\Psi}}\_3 &= \mathrm{le}\_{\mathcal{Y}^3} \mathbf{Y}\_3 / m, \end{aligned} \tag{22}$$

*where k<sup>c</sup> is a scalar constant. If we take p*(*K*1+1) *>* 0*, p*(*K*<sup>2</sup> + 1) *>* 0, and *p*(*K*<sup>3</sup> + 1) *>* 0*, it ensures synchronization of the network model of the ring configured FHN neurons by guaranteeing the convergence of errors to the compact sets. The proposed robust adaptive control scheme will ensure uniformly ultimately bounded errors and parameter estimation errors* Φ*<sup>i</sup> −* Φˆ *<sup>i</sup>*.

A brief proof of the statement in Theorem 2 is presented in Section "Methods". It is notable that the result of Theorem 2 refines the strategy developed in Theorem 1 by considering the disturbances to modify the design approach and adaptation laws. In contrast to the method demonstrated in Iqbal et al. (2014), the approach adopted in Theorem 2 provides a complete synchronization in a network of different FHN neurons with disturbance under unknown parameters linked in a ring configuration under direction-dependent coupling. There are various differences in this work with Iqbal et al. (2014). For instance, the four main differences are as follows: (a) we investigate a ring configuration of neurons, (b) this study is based on a more complex scenario of four neurons than the simple case of two neurons, (c) the coupling is also complex in this work, and (d) the achievement of complete synchronization rather than partial one is emphasized. It should also be noted that the work on synchronization of neurons under direction-dependent coupling is lacking in the literature. It is worth mentioning that such robust adaptive synchronization of the perturbed ring configured neurons with different parameters and direction-dependent coupling is lacking in the existing literature, like Wang et al. (2007, 2008a,b), Zhang et al. (2007), Che et al. (2009), Chen et al. (2009), Rehan and Hong (2011), Rehan et al. (2011), Ambrosio and Aziz-Alaoui (2012), Aqil et al. (2012a), and Yu et al. (2012). The presented approach considered a large number of parameters unknown in the four neurons. In addition, a perturbation in each neuron has been incorporated to provide an advanced synchronization solution. To deal with these perturbations and uncertainties, both adaptation and robustness of control signals for the slow and fast variations, respectively, are addressed in addition to the directiondependent strength of the signals for any connection between neurons.

### **Simulation Results**

To validate the efficacy of the proposed adaptive control mechanism for synchronization in the network model of the ring configured different four FHN neurons under directiondependent coupling, we first select the model parameters as *r*<sup>1</sup> = 10,*r*<sup>2</sup> = 10.2,*r*<sup>3</sup> = 10.4,*r*<sup>4</sup> = 10.6, *b*<sup>1</sup> = 1, *b*<sup>2</sup> = 1.01, *b*<sup>3</sup> = 1.02, *b*<sup>4</sup> = 1.03, *g*<sup>1</sup> = 0.001, *g*<sup>2</sup> = 0.002, *g*<sup>3</sup> = 0.003, *g*<sup>4</sup> = 0.004, and *f* = 0.127. The disturbances are taken as *dext,*<sup>1</sup> = 0*.*1 sin 12*t*, *dext,*<sup>2</sup> = 0*.*1 sin 20*t*, *dext,*<sup>3</sup> = 0*.*1 sin 25*t*, and *dext,*<sup>4</sup> = 0*.*1 sin 23*t*. The stimulation amplitude is chosen as *A* = 0.01.

By application of Theorem 2, the parameters of controller and the adaptation law are obtained as *p* = *q* = *l* = *m* = 1. The control parameters are taken to be *k*<sup>c</sup> = 5, *K*<sup>1</sup> = 20, *K*<sup>1</sup> = 20.001, and *K*<sup>3</sup> = 20.002. It has been observed in **Figures 2**–**4** that the behaviors of the original FHN neurons without any control signal are not coherent. As discussed earlier, the activation potential errors and recovery variable errors in **Figures 3** and **4** do not have converging attributes. Rather, spikes are observed in the synchronization errors, leading to non-synchronous firings of the neurons.

Now we simulate the behavior of same neurons without and with the proposed robust adaptive control scheme of Theorem 2. The proposed control signal is applied at *t* = 400. Before this time, the behaviors of the neurons are not coherent and the synchronization errors have oscillatory responses. By means of the proposed robust adaptive control scheme, it is observed that the FHN neurons are synchronized under unknown parameters and external perturbations. **Figures 5** and **6** depict the synchronization errors for the different FHN neurons under direction-dependent coupling by using the proposed methodology. Before *t* = 400, the behaviors of the activation potential errors in **Figure 5** have spikes, showing non-synchronous firing in neurons. The same trend is also observed in the recovery variable synchronization errors in **Figure 6**. We activated the proposed robust adaptive controller of Theorem 2 at *t* = 400. By application of the controller, the synchronization errors for activation potentials and recovery variables converge to a region near zero, as shown in **Figures 5** and **6**. The convergence of synchronization errors is fast, showing the effectiveness of the proposed robust adaptive control scheme. Due to convergence of the synchronization errors in **Figures 5** and **6**, the spikes due to firing of the four neurons under bidirectional coupling become identical, validating the synchronization in both activation potentials and recovery variables. It is concluded that the results in **Figure 5** authenticate the efficacy of the proposed robust adaptive control mechanism in the context of synchronization of activation potentials. Moreover, **Figure 6** validates the effectiveness of the proposed mechanism for synchronization of recovery variables. As the synchronization errors converge in the neighborhood of zero, it is evident that synchronization of activation and recovery potentials is achieved *via* the proposed robust adaptive control scheme.

The adopted modeling and control methodologies are generalized in certain extent and simulation results presented herein represent a broader scenario of a network of FHN neurons. The methods presented in Theorems 1–2 are valid to a general form of FHN neurons. In addition, robustness against bounded disturbances has been guaranteed through Theorem 2. The results of Theorems 1 and 2 may not be limited to FHN systems of only four neurons. All in all, the proposed modeling and control methodology can be used for a more general form, synchronization in a network of

different FHN neurons of unknown parameters, coupled in ring configuration, and subject to direction-dependent coupling and disturbances.

### **MATERIALS AND METHODS**

### **FHN Model**

Neuron is the chief functional element in the brain. Its dynamical examination is important for the treatment of brain diseases. There are many neuronal models, such as Hindmarsh-Rose, Hodgkin and Huxley, and FitzHugh–Nagumo, etc. These models offer investigation of the dynamical behavior of a neuron and even synchronization in a network of neurons. FHN model is a famous one in terms of representing various neuronal behaviors, owing to its simple representation. Consider the FHN model for representing dynamical aspects of a neuron subjected to external electrical stimulation as in Thompson et al. (1999), given by

$$\begin{aligned} \frac{d\mathbf{x}}{dt} &= \mathbf{x}(\mathbf{x} - 1)(1 - r\mathbf{x}) - \mathbf{y} + I, \\ \frac{d\mathbf{y}}{dt} &= b\mathbf{x} + \nu \mathbf{y}, \end{aligned} \tag{23}$$

where *x* and *y* represent the activation potential and the recovery variable, respectively, *r* is a nonlinearity parameter in the model, parameters *b* and *v* are related to the recovery variable, and *I* = (*a/*ω) cos ω*t* shows the stimulation current. We employ this important neuronal model to study the synchronization in a network of different FHN neurons of unknown parameters coupled in ring configuration subject to direction-dependent coupling and disturbances. In this paper, coupled FHN models were simulated using the *S*-function in Matlab for nonlinear differential equations.

### **Lyapunov Stability Analysis**

The Lyapunov stability criterion is widely utilized to understand the stability and control of dynamical systems. In order to elaborate the Lyapunov stability method, consider a dynamical system, for example, *x*˙ = *f*(*t, x*), where *x ∈ R n* denotes the state vector for the dynamical system. Suppose there exists a positive definite Lyapunov function *V*(*x*) for all the values of vector *x ∈ R n* . If the derivative of the energy function *V*(*x*) along the dynamics of the system *x* = *f*(*t, x*) is negative definite, the state *x* will approach to zero, conferring to the Lyapunov stability theory (seeKhalil (1996) and references therein). *V*˙(*x*) *<* 0 means that the factitious energy

*V*(*x*) of the dynamical system is decreasing, leading to stability of the system.

### **Derivation of Error Dynamics**

By using Eqs (5)–(10), we obtain the error dynamics as follows:

$$\begin{aligned} \dot{e}\_{\mathbf{x}1} &= f\_1(\mathbf{x}\_1) - f\_2(\mathbf{x}\_2) - y\_1 + y\_2 - g\_1 \left[ (\mathbf{x}\_1 - \mathbf{x}\_2) + (\mathbf{x}\_1 - \mathbf{x}\_4) \right] \\ &+ g\_2 \left[ (\mathbf{x}\_2 - \mathbf{x}\_1) + (\mathbf{x}\_2 - \mathbf{x}\_3) \right] + d\_{\mathbf{x}1} - \mu\_{\mathbf{x}1}, \\ \dot{e}\_{\mathbf{y}1} &= b\_1 \mathbf{x}\_1 - b\_2 \mathbf{x}\_2 - \mu\_{\mathbf{y}1}, \end{aligned} \tag{24}$$

$$\begin{aligned} \dot{e}\_{\mathbf{x}2} &= f\_1(\mathbf{x}\_1) - f\_3(\mathbf{x}\_3) - \mathbf{y}\_1 + \mathbf{y}\_3 - \mathbf{g}\_1 \left[ (\mathbf{x}\_1 - \mathbf{x}\_2) + (\mathbf{x}\_1 - \mathbf{x}\_4) \right] \\ &+ \mathbf{g}\_3 \left[ (\mathbf{x}\_3 - \mathbf{x}\_2) + (\mathbf{x}\_3 - \mathbf{x}\_4) \right] + d\_{\mathbf{x}2} - \mu\_{\mathbf{x}2}, \\ \dot{e}\_{\mathbf{y}2} &= b\_1 \mathbf{x}\_1 - b\_3 \mathbf{x}\_3 - \mu\_{\mathbf{y}2}, \end{aligned} \tag{25}$$

$$\begin{aligned} \dot{e}\_{\mathbf{x}3} &= f\_1(\mathbf{x}\_1) - f\_4(\mathbf{x}\_4) - \chi\_1 + \chi\_4 - g\_1 \left[ (\mathbf{x}\_1 - \mathbf{x}\_2) + (\mathbf{x}\_1 - \mathbf{x}\_4) \right] \\ &+ g\_4 \left[ (\mathbf{x}\_4 - \mathbf{x}\_3) + (\mathbf{x}\_4 - \mathbf{x}\_1) \right] + d\_{\mathbf{x}3} - \mu\_{\mathbf{x}3}, \\ \dot{e}\_{\mathbf{y}3} &= b\_1 \mathbf{x}\_1 - b\_4 \mathbf{x}\_4 - \mu\_{\mathbf{y}3}. \end{aligned} \tag{26}$$

Note that*Iext*,1,*Iext*,2,*Iext*,3, and *Iext*,4 are the same in the present scenario, therefore, their effect is canceled out in the error dynamics. Let us define the functions and signals

$$\begin{aligned} f\_1(\mathbf{x}\_1) &= -r\_1 \mathbf{x}\_1^3 + r\_1 \mathbf{x}\_1^2 + \mathbf{x}\_1^2 - \mathbf{x}\_1, \\ f\_2(\mathbf{x}\_2) &= -r\_2 \mathbf{x}\_2^3 + r\_2 \mathbf{x}\_2^2 + \mathbf{x}\_2^2 - \mathbf{x}\_2, \\ f\_3(\mathbf{x}\_3) &= -r\_3 \mathbf{x}\_3^3 + r\_3 \mathbf{x}\_3^2 + \mathbf{x}\_3^2 - \mathbf{x}\_3, \\ f\_4(\mathbf{x}\_4) &= -r\_4 \mathbf{x}\_4^3 + r\_4 \mathbf{x}\_4^2 + \mathbf{x}\_4^2 - \mathbf{x}\_4, \\ d\_{\mathbf{x}1} &= d\_{\mathbf{ext},1} - d\_{\mathbf{ext},2}, \\ d\_{\mathbf{x}2} &= d\_{\mathbf{ext},1} - d\_{\mathbf{ext},3}, \\ d\_{\mathbf{x}3} &= d\_{\mathbf{ext},1} - d\_{\mathbf{ext},4}. \end{aligned} \tag{27}$$

As the recovery variable dynamics are dependent on the activation potential, the relations become

$$\begin{aligned} y\_1 &= b\_1 \int\_0^t \chi\_1 d\alpha + \wp\_1(0), \\ y\_2 &= b\_2 \int\_0^t \chi\_2 d\alpha + \wp\_2(0), \\ y\_3 &= b\_3 \int\_0^t \chi\_3 d\alpha + \wp\_3(0), \\ y\_4 &= b\_4 \int\_0^t \chi\_4 d\alpha + \wp\_4(0). \end{aligned} \tag{28}$$

Frontiers in Neurorobotics | www.frontiersin.org

Here *y*1(0), *y*2(0), *y*3(0), and *y*4(0) denote the unknown initial conditions for the recovery variable of four neurons. The relevant quantities in the error dynamics formulation are defined by

$$\begin{aligned} \Phi\_1^{\,T} &= \begin{bmatrix} r\_1 & r\_2 & b\_1 & b\_2 & y\_1(0) & y\_2(0) & y\_1 & y\_2 \end{bmatrix} \\ \Gamma\_1(\mathbf{x}\_1, \mathbf{x}\_2) &= \begin{bmatrix} -\mathbf{x}\_1^3 + \mathbf{x}\_1^2 & \mathbf{x}\_2^3 - \mathbf{x}\_2^2 & -\int\_0^t \mathbf{x}\_1 d\mathbf{x} & \int\_0^t \mathbf{x}\_2 d\mathbf{x} & -1 & 1 \end{bmatrix} \\ &- \begin{bmatrix} (\mathbf{x}\_1 - \mathbf{x}\_2) + (\mathbf{x}\_1 - \mathbf{x}\_4) \end{bmatrix} \begin{bmatrix} (\mathbf{x}\_2 - \mathbf{x}\_1) + (\mathbf{x}\_2 - \mathbf{x}\_3) \end{bmatrix}^T, \end{aligned} \tag{29}$$

$$\begin{aligned} \boldsymbol{\Psi}\_{1}^{\top} &= \begin{bmatrix} b\_{1} & b\_{2} \end{bmatrix}, [0, 1], \\ \boldsymbol{\Upsilon}\_{1}(\boldsymbol{\varkappa}\_{1}, \boldsymbol{\varkappa}\_{2}) &= \begin{bmatrix} \boldsymbol{\varkappa}\_{1} & -\boldsymbol{\varkappa}\_{2} \end{bmatrix}^{T}, \end{aligned} \tag{30}$$

$$\begin{aligned} \Phi\_2^{\,T} &= \begin{bmatrix} r\_1 & r\_3 & b\_1 & b\_3 & y\_1(0) & y\_3(0) & y\_1 & y\_3 \end{bmatrix}, \\ \Gamma\_2(\mathbf{x}\_1, \mathbf{x}\_3) &= \begin{bmatrix} -\mathbf{x}\_1^3 + \mathbf{x}\_1^2 & \mathbf{x}\_3^3 - \mathbf{x}\_3^2 & -\int\_0^t \mathbf{x}\_1 d\mathbf{x} & \int\_0^t \mathbf{x}\_3 d\mathbf{x} & -1 & 1, \end{bmatrix} \\ &- \begin{bmatrix} (\mathbf{x}\_1 - \mathbf{x}\_2) + (\mathbf{x}\_1 - \mathbf{x}\_4) \end{bmatrix} \begin{bmatrix} (\mathbf{x}\_3 - \mathbf{x}\_2) + (\mathbf{x}\_3 - \mathbf{x}\_4) \end{bmatrix}^T, \end{aligned} \tag{31}$$

$$\begin{aligned} \boldsymbol{\Psi}\_2^T &= \begin{bmatrix} b\_1 & b\_3 \end{bmatrix}, \\ \mathbf{Y}\_2(\boldsymbol{\varkappa}\_1, \boldsymbol{\varkappa}\_3) &= \begin{bmatrix} \boldsymbol{\varkappa}\_1 & -\boldsymbol{\varkappa}\_3 \end{bmatrix}^T, \end{aligned} \tag{32}$$

$$\begin{aligned} \Phi\_3^T &= \begin{bmatrix} r\_1 & r\_4 & b\_1 & b\_4 & \wp\_1(\mathbf{0}) & \wp\_4(\mathbf{0}) & \mathbf{g\_1} & \mathbf{g\_4} \end{bmatrix}, \\ \Gamma\_3(\mathbf{x\_1}, \mathbf{x\_4}) &= \begin{bmatrix} -\mathbf{x\_1^3} + \mathbf{x\_1^2} & \mathbf{x\_4^3} - \mathbf{x\_4^2} & -\int\_0^t \mathbf{x\_1} d\mathbf{x} & \int\_0^t \mathbf{x\_4} d\mathbf{x} & -1 & 1, \end{bmatrix} \end{aligned} \tag{31}$$

$$ -\begin{bmatrix} (\mathbf{x\_1} - \mathbf{x\_2}) + (\mathbf{x\_1} - \mathbf{x\_4}) \end{bmatrix} \begin{bmatrix} (\mathbf{x\_4} - \mathbf{x\_3}) + (\mathbf{x\_4} - \mathbf{x\_1}) \end{bmatrix}^T, \tag{32}$$

$$\begin{aligned} \boldsymbol{\Psi}\_{3}^{\top} &= \begin{bmatrix} b\_{1} & b\_{4} \end{bmatrix}, \\ \mathbf{Y}\_{3}(\boldsymbol{\varkappa}\_{1}, \boldsymbol{\varkappa}\_{4}) &= \begin{bmatrix} \boldsymbol{\varkappa}\_{1} & -\boldsymbol{\varkappa}\_{4} \end{bmatrix}^{T}, \end{aligned} \tag{34}$$

and

$$\begin{aligned} F\_1(\mathbf{x}\_1, \mathbf{x}\_2) &= \mathbf{x}\_1^2 - \mathbf{x}\_2^2, \\ F\_2(\mathbf{x}\_1, \mathbf{x}\_3) &= \mathbf{x}\_1^2 - \mathbf{x}\_3^2, \\ F\_3(\mathbf{x}\_1, \mathbf{x}\_4) &= \mathbf{x}\_1^2 - \mathbf{x}\_4^2. \end{aligned} \tag{35}$$

Employing Eqs (27)–(35) into Eqs (24)–(26), the error dynamics equations given by Eqs (11)–(13) are obtained in the Section "Results and Discussion".

### **Proof of Theorem 1**

The proof of Theorem 1 is provided using the same steps as in Iqbal et al. (2014). However, our scenario is more complex due to the ring configuration and multiple neurons. Incorporating Eqs (14)–(16) into Eqs (11)–(13), for *i* = 1, 2, 3 leads to the results

$$\begin{aligned} \dot{\mathbf{e}}\_{\text{xi}} &= \left(\Phi\_i - \hat{\Phi}\_i\right)^T \Gamma\_i - (\mathbf{K}\_i + 1)\mathbf{e}\_{\text{xi}} + d\_{\text{xi}}, \\ \dot{\mathbf{e}}\_{\text{yi}} &= \left(\Psi\_i - \hat{\Psi}\_i\right)^T \mathbf{Y}\_i. \end{aligned} \tag{36}$$

The considered Lyapunov function candidate is given by

$$\begin{split} &\mathcal{V}(\boldsymbol{e}\_{xi},\boldsymbol{e}\_{yi},(\Phi\_{i}-\hat{\Phi}\_{i}),(\boldsymbol{\Psi}\_{i}-\hat{\Psi}\_{i})) \\ &= (1/2) \sum\_{i=1}^{3} \left( p\boldsymbol{e}\_{xi}^{2} + q(\Phi\_{i}-\hat{\Phi}\_{i})^{T}(\Phi\_{i}-\hat{\Phi}\_{i}) \right) \\ &+ (1/2) \sum\_{i=1}^{3} \left( \boldsymbol{l}\boldsymbol{e}\_{yi}^{2} + m(\boldsymbol{\Psi}\_{i}-\hat{\Psi}\_{i})^{T}(\boldsymbol{\Psi}\_{i}-\hat{\Psi}\_{i}) \right), \end{split} \tag{37}$$

with *p >* 0, *q >* 0, *l >* 0, *m >* 0. On taking the time-derivative of Eq (37), using (Φ*<sup>i</sup> −* Φˆ *<sup>i</sup>*) *<sup>T</sup>* ˙Φ<sup>ˆ</sup> *<sup>i</sup>* <sup>=</sup> ˙Φˆ *i T* (Φ*<sup>i</sup> −* Φˆ *<sup>i</sup>*) and (Ψ*<sup>i</sup> −* Ψˆ *<sup>i</sup>*) *<sup>T</sup>* ˙Ψ<sup>ˆ</sup> *<sup>i</sup>* <sup>=</sup> ˙Ψ<sup>ˆ</sup> *<sup>i</sup>*(Ψ*<sup>i</sup> <sup>−</sup>* <sup>Ψ</sup><sup>ˆ</sup> *<sup>i</sup>*) *T* and, further, incorporating the error systems of Eq (36), we obtain

$$\begin{split} \dot{V}(e\_{\rm xi}, e\_{\rm pi}, (\Phi\_i - \hat{\Phi}\_i), (\Psi\_i - \hat{\Psi}\_i)) \\ = \sum\_{i=1}^3 \left( p e\_{\rm xi} (\Phi\_i - \hat{\Phi}\_i)^T \Gamma\_i - p (K\_i + 1) e\_{\rm xi} \right. \\ \left. - q (\Phi\_i - \hat{\Phi}\_i)^T \dot{\Phi}\_i + p e\_{\rm xi} d\_{\rm xi} + l e\_{\rm pi} (\Psi\_i - \hat{\Psi}\_i)^T \mathbf{Y}\_i \right. \\ \left. - m (\Psi\_i - \hat{\Psi}\_i)^T \dot{\Psi}\_i \right). \end{split} \tag{38}$$

Using the adaptation laws in Eqs (17)–(19) under zero disturbances, it yields

$$\dot{V}(\mathcal{e}\_{xi}, \mathcal{e}\_{yi}, (\Phi\_i - \hat{\Phi}\_i), (\Psi\_i - \hat{\Psi}\_i)) = -p \sum\_{i=1}^3 (K\_i + 1) \mathcal{e}\_{xi}^{\;2}. \tag{39}$$

As*V*˙(*exi, eyi,*(Φ*i−*Φˆ *<sup>i</sup>*)*,*(Ψ*i−*Ψˆ *<sup>i</sup>*)) *<* 0, we need*−p*(*Ki*+1)less than zero for *i* = 1*,* 2*,* 3. In the steady-state, the synchronization errors and their derivatives are zero. In addition, the behaviors of all four neurons will be the same. Therefore, we have ˙Φ<sup>ˆ</sup> *<sup>i</sup>* <sup>=</sup> 0 and ˙Ψ<sup>ˆ</sup> *<sup>i</sup>* <sup>=</sup> 0, which implies that <sup>Φ</sup><sup>ˆ</sup> *<sup>i</sup>* <sup>=</sup> <sup>Φ</sup><sup>ˆ</sup> *∗ <sup>i</sup>* and Ψˆ *<sup>i</sup>* = Ψˆ *∗ <sup>i</sup>* are satisfied in the steady-state, where Φˆ *∗ <sup>i</sup>* and Ψˆ *∗ <sup>i</sup>* are constants. As observed in Rehan and Hong (2011), Rehan et al. (2011), and Iqbal et al. (2014), we have (Φ*<sup>i</sup> −* Φˆ *∗ i* ) *T* Γ*<sup>i</sup>* = 0 and (Ψ*<sup>i</sup> −* Ψˆ *∗ i* ) *T* Υ*<sup>i</sup>* = 0.

### **Proof of Theorem 2**

The proof of Theorem 2 employs similar methods as in the results (Rehan and Hong, 2011; Rehan et al., 2011; Iqbal et al., 2014) for the proposed complex scenario. Using Eq (38) and the proposed adaptation law in Theorem 2, we have

$$\begin{split} \dot{V}(e\_{xi}, e\_{yi}, (\Phi\_{i} - \hat{\Phi}\_{i}), (\Psi\_{i} - \hat{\Psi}\_{i})) \\ = \sum\_{i=1}^{3} \left( -\rho(K\_{i} + 1)e\_{xi} \right. \\ \left. \left. - \left( \left. \Phi\_{i} - \Phi\_{i} \right| \right. \right. \right) \left. \left. \left. \Phi\_{i}k\_{\epsilon} \left|| e\_{xi}|| \right. \right| + \rho e\_{xi}d\_{xi} \right) . \right. \end{split} \tag{40}$$

It can be confirmed with *∥*Φ*i∥ ≤* Φ*mi* that Φˆ *<sup>i</sup> −* Φ*<sup>i</sup>* 2 *−* Φˆ *<sup>i</sup> −* Φ*<sup>i</sup>* Φ*mi ≤* (Φˆ *<sup>i</sup> −* Φ*i*) *T* Φˆ *<sup>i</sup>* from (Iqbal et al., 2014). It along with Assumption 2 implies

$$\begin{split} \dot{V}(e\_{xi}, e\_{yi}, (\Phi\_{i} - \hat{\Phi}\_{i}), (\Psi\_{i} - \hat{\Psi}\_{i})) \\ \leq & \sum\_{i=1}^{3} \left( -||e\_{xi}|| \left( p(K\_{i} + 1) \left||e\_{xi}|| + k\_{\epsilon} \left( \left||\hat{\Phi}\_{i} - \Phi\_{i} \right|| - \Phi\_{\text{mi}}/2 \right)^{2} \right)^{-1} \right) \\ -k\_{\epsilon} \Phi\_{\text{mi}}^{2} / 4 \right. \tag{41} \end{split} \tag{42}$$

Given that *p*(*Ki*+1)*>*0, Eq (41) implies that *V*˙(*exi, eyi,*(Φ*i−*Φˆ *<sup>i</sup>*), (Ψ*<sup>i</sup> −* Ψˆ *<sup>i</sup>*)) *<* 0 if the conditions in Eq (42) hold.

$$||\mathfrak{e}\_{\rm xi}|| > \frac{k\_{\rm c} \Phi\_{\rm mi}^2 / 4}{p(K\_i + 1)}, \left||\hat{\Phi}\_i - \Phi\_i\right|| > \frac{\Phi\_{\rm mi}}{2} + \sqrt{\frac{\Phi\_{\rm mi}^2}{4} + \frac{p d\_{\rm mi}}{k\_{\rm c}}},\tag{42}$$

for *i* = 1, 2, 3. Thus, the synchronization errors and estimation errors are uniformly ultimately bounded as seen in Zhang et al. (2007), Rehan and Hong (2011), Rehan et al. (2011), and Iqbal et al. (2014) and references therein. The guidelines provided in Zhang et al. (2007), Rehan and Hong (2011), Rehan et al. (2011), and Iqbal et al. (2014) and references therein for the selections of robust adaptive control parameters can be followed.

This study provides a step to increase complexity by increasing the number of neurons and considering their complex interactions, and it provides an approach to consider a generalized model for synchronization aspects. Prohibition of synchronization is also another research topic. Further works on blockage of the synchronization using control strategies can also be investigated.

### **CONCLUSIONS**

This paper addressed the controlled synchronization in a network of ring configured four different FHN neurons with unknown parameters under direction-dependent coupling and disturbances. The neurons and their interactions (i.e., coupling) in a ring topology network are considered to be different owing to the inter-neuronal coupling medium properties. Based on the Lyapunov stability criteria, adaptive control strategies were developed to deal with the complex problem of synchronization in a network

### **REFERENCES**


of four different FHN neurons. In addition, a robust adaptive control was also developed to ensure robustness against the external disturbances to attain the uniformly ultimately bounded synchronization errors. In contrast to various existing works, dissimilar neurons, unknown parameters, multiple neurons, ring topology of neurons, bidirectional communication in neurons and coherence in activation potentials, and recovery variables are incorporated in this study. The numerical simulation results verified the efficacy of the proposed control approaches.

### **AUTHOR CONTRIBUTIONS**

MI wrote the first draft of the manuscript. MR has initiated the idea and revised the manuscript. K-SH has corrected the manuscript and finalized the work. All the authors have approved the final manuscript.

### **FUNDING**

This work was supported by the National Research Foundation (NRF) of Korea under the Ministry of Science and ICT, Korea (grant no. NRF-2017R1A2A1A17069430 and NRF2017R1A4A1015627).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2018 Iqbal, Rehan and Hong. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# An Extended Passive Motion Paradigm for Human-Like Posture and Movement Planning in Redundant Manipulators

#### Paolo Tommasino<sup>1</sup> \* and Domenico Campolo<sup>2</sup> \*

<sup>1</sup> Laboratory of Neuromotor Physiology, Fondazione Santa Lucia, Rome, Italy, <sup>2</sup> Synergy Lab, Robotics Research Centre, School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore, Singapore

A major challenge in robotics and computational neuroscience is relative to the posture/movement problem in presence of kinematic redundancy. We recently addressed this issue using a principled approach which, in conjunction with nonlinear inverse optimization, allowed capturing postural strategies such as Donders' law. In this work, after presenting this general model specifying it as an extension of the Passive Motion Paradigm, we show how, once fitted to capture experimental postural strategies, the model is actually able to also predict movements. More specifically, the passive motion paradigm embeds two main intrinsic components: joint damping and joint stiffness. In previous work we showed that joint stiffness is responsible for static postures and, in this sense, its parameters are regressed to fit to experimental postural strategies. Here, we show how joint damping, in particular its anisotropy, directly affects task-space movements. Rather than using damping parameters to fit a posteriori task-space motions, we make the a priori hypothesis that damping is proportional to stiffness. This remarkably allows a postural-fitted model to also capture dynamic performance such as curvature and hysteresis of task-space trajectories during wrist pointing tasks, confirming and extending previous findings in literature.

#### Edited by:

Yangming Li, University of Washington, United States

#### Reviewed by:

Vittorio B. Lippi, Universitätsklinikum Freiburg, Germany Danying Hu, University of Washington, United States

#### \*Correspondence:

Paolo Tommasino p.tommasino@hsantalucia.it Domenico Campolo d.campolo@ntu.edu.sg

Received: 30 June 2017 Accepted: 17 November 2017 Published: 30 November 2017

#### Citation:

Tommasino P and Campolo D (2017) An Extended Passive Motion Paradigm for Human-Like Posture and Movement Planning in Redundant Manipulators. Front. Neurorobot. 11:65. doi: 10.3389/fnbot.2017.00065 Keywords: kinematic redundancy, postural synergies, Donders' law, posture, movement, pointing

## 1. INTRODUCTION

Recent trends in both industry and healthcare clearly show the need for robots to be able to cooperate and assist humans in specific tasks. In order to do so, not only our robots will need to be safe-by-design, incorporating for example compliant mechanisms (Haddadin et al., 2010; Vanderborght et al., 2013) and force/impedance control architectures (Ficuciello et al., 2015) (as opposed to the current rigid and position-controlled deployed in industry) but will also need to behave naturally. In other words, while working with a robot, human operators not only need to be safe at all times, but shall also feel comfortable. As an example, imagine a robotic assistant designed to hand-over tools to a human operator. It is quite important for the robot to assume natural postures, which carry non-verbal semantics very valuable to human operators (the same object can be passed in different ways, for different purposes). For this and other reasons, in the last decades, roboticists have started looking into human motor strategies as a source of inspiration for the formulation of bio-inspired postural/motion controllers (Khatib et al., 2004; Schaal and Schweighofer, 2005; Kim et al., 2011, 2013; Zanchettin et al., 2013).

Another characteristic of modern robots is that they are general-purpose (unlike, for example, a CNC machine) and often bear human-like functionalities, if not resemblance. In this paper we shall be mainly interested in robotic manipulators. Currently, many commercial robotic arms are made available (in single or bimanual configuration) with kinematic similarities to human arms (Smith and Rooks, 2006; Albu-Schaffer et al., 2008). One of the specific similarities lies in the degrees-of-freedom (DOF), typically 6–7 in current robotic manipulators, and the kinematic redundancy which comes with it when dealing with most of tasks. The coordination of redundant degrees-of-freedom is a central topic in both robotics and neuroscience and we are interested in two specific aspects: the redundancy problem (Bernstein, 1967) and the posture/movement problem (Ostry and Feldman, 2003). This issue was first addressed by the authors in a recent work (Tommasino and Campolo, 2017) where a principled approach was proposed to tackle these very issues with focus on capturing human-like postural strategies: static (or equilibrium) postures satisfying a given (static) task constraint. In this work, we shall specialize the computational model and extend our previous results to the problem of movement generation: given a desired task constraint, find human-like motions (and postures), both in task and joint space, that brings the current robot posture to the desired task-space target.

### The Robotics Approach to Kinematic Redundancy

Motions for robotic manipulators are typically planned in taskspace as it is much easier and intuitive to define a trajectory for a robotic end-effector than for its multiple (and often) redundant joints. For example, if we want a robotic manipulator to reach for a given object we can easily program the robotic gripper to follow a desired task-space path **x**<sup>d</sup> with a given task-space velocity **x**˙<sup>d</sup> rather than programming the trajectory of each individual joint. However, due to kinematic redundancy, mapping the desired task trajectory in joint space is challenging as infinite combinations of joints trajectories are possible for the same taskspace trajectory. This issue has a very long history in robotics and it has been tackled by roboticists, either at the kinematic or at the force level, with a local optimization approach and the use of weighted pseudo-inverses of the Jacobian matrix of the robot manipulator (Klein and Huang, 1983; Nenchev, 1989; English and Maciejewski, 2000). For instance, a simple way to map a desired task trajectory **x**˙<sup>d</sup> in joint space is: **q**˙<sup>d</sup> = J # <sup>W</sup>(**q**)**x**˙d, where J # <sup>W</sup> is any W-weighted generalized pseudo-inverse of the Jacobian matrix J. However, it was soon realized that such solution, although simple, very often results in non-holonomic, or non-repeatable joint trajectories, i.e., the robot equilibrium posture satisfying a given task-constraint is not unique but depends on the path that robot followed before reaching the desired task constraint (Klein and Huang, 1983; Mussa-Ivaldi and Hogan, 1991). This type of solution is problematic especially for cyclic task-space movements as non-repeatable joint motions can result in instability and/or violations of joint constraints. At the kinematic level, the problem of repeatability can be tackled by planning an additional joint trajectory (or null-space motion) **q**˙ 0 that does not interfere with the planned task-space motions **x**˙: **q**˙ = J # <sup>W</sup>(**q**)**x**˙ + NW(**q**)**q**˙ <sup>0</sup> where NW(**q**) is the null-space projector operator associated to the weighting matrix W (Klein and Huang, 1983; Nenchev, 1989; English and Maciejewski, 2000) 1 . While kinematic motion planning requires an execution level to track the desired trajectory in joint space (such as computed torque control or PD control (Murray et al., 1994), weighted pseudoinverses and null-space projectors can also be used to solve kinematic redundancy at the force/torque level: τ = J T (**q**)**F** + N T <sup>W</sup>∇**q**h(**q**), where <sup>τ</sup> is the commanded joint torque, **<sup>F</sup>** is a taskspace force fields (Mistry and Schaal, 2015) 2 that drives the robotic end-effector along desired task constraints and ∇**q**h(**q**) is the gradient of a real or virtual potential fields that is mapped in the null-space of the Jacobian transpose matrix to achieve repeatable joint motions.

In the last decade, task-space control has been extensively used in robotics to generate human-like and/or adaptive robot behavior (Schaal and Schweighofer, 2005) either at the tasksspace level, in terms of adaptive trajectories (Peters and Schaal, 2007; Degallier and Ijspeert, 2010) and Cartesian impedance (Calinon et al., 2013), then at the joint space level in terms of null-space control and weighting matrix W (Khatib et al., 2004; Nakanishi et al., 2008; Dietrich et al., 2015).

### Kinematic Constraints and Computational Approaches to Human Motor Control

In neuroscience is still debated whether the human brain adopts a hierarchical approach to plan and control movements and whether the brain plans and control task-space and null-space motions independently (Jordan and Wolpert, 1999; Mussa-Ivaldi et al., 2011; Mistry and Schaal, 2015). The experimental evidence that unconstrained planar reaching movements features straightline paths and bell-shaped velocity profiles led to the hypothesis that the human brain plans hand movement in task-space, by shifting the equilibrium position of the hand according to a minimum-jerk trajectory. This trajectory would then be tracked in joint space (hence at a lower level) by an impedance controller that exploits muscle visco-elasticity [see the Equilibrium-Point Hypothesis (EPH); Flash, 1987 for more details]. Later studies however, showed that in other experimental conditions hand movements were curved and models such as the minimumtorque change (Uno et al., 1989) and the minimum-variance (Harris and Wolpert, 1998) were able to capture these human movement features by solving an optimal control problem directly in joint space.

### Postural Synergies: Donders' law, Uncontrolled Manifold and the Leading Joint Hypothesis

Postures are somewhat static, possibly accounted for as equilibria of some potential field (Campolo et al., 2011), while movement

<sup>1</sup>Note that any positive-definite matrix can be used as weight and that in general W is configuration-dependent.

<sup>2</sup>The task-space force field is usually generated by combining a desired task-space trajectory with a task-space impedance.

is in apparent contrast with the very concept of equilibrium. The Posture/Movement problem stems out from the possible interference of postural control mechanisms with general motor strategies (Ostry and Feldman, 2003). In the last few decades, various approaches have been proposed in computational neuroscience as an attempt to reconcile posture and movement.

An extensive number of behavioral studies have shown that, at the joint-space level, during kinematically redundant tasks, humans adopt a stereotypical strategy that associates a unique and path-independent posture to a given task (Hepp, 1990; Haslwanter, 1995). This kinematic strategy is usually called Donders' law, since the Dutch ophthalmologist Donders showed (1847) that for any steady gazing direction (task), the human eye assumes a unique combination of elevation, azimuth, and torsion angles (posture). Donders-like strategies have also been found for pointing tasks involving the head (Ceylan et al., 2000; Crawford et al., 2003), the wrist (Campolo et al., 2009), the shoulder (Hore et al., 1992) and for pointing/reaching tasks involving the upper arm (Liebermann et al., 2006; Ewart et al., 2016).

It has been suggested that the brain implements Donders' Law as a flexible family of holonomic constraints (Medendorp et al., 2000; Crawford et al., 2003) to solve redundancy as well as to fulfill some optimality criteria that might vary in different experimental scenarios and physiological conditions (Ceylan et al., 2000; Medendorp et al., 2000; Wong, 2004).

From a computational perspective, Donders-like postural strategies, can be captured by solving a constrained optimization problem which returns the unique optimal posture that minimizes a given (posture-dependent) objective function while fulfilling a desired task-constraint (Cruse et al., 1990; De Sapio et al., 2006; Campolo et al., 2011). Because this type of postural models only computes static/equilibrium-configurations they are usually not suitable for planning movements. Transport models (Vetter et al., 2002) such as minimum-torque-change (Uno et al., 1989), minimum-work (Soechting et al., 1995), minimumvariance (Vetter et al., 2002), do provide a solution to the Posture/Movement problem but, in their original formulation are incompatible with Donders' law as they predict path-dependent equilibrium postures (Admiraal et al., 2004).

Kinematic constraints such as Donders' law, suggest that the brain may plan and control equilibrium postures directly in joint-space, by constraining redundant postures to a submanifold (Donders' surface) of the joint-space. Experimental studies involving redundant DOFs however, have also shown that motor variability is always higher along task-irrelevant directions (also known in human motor control as uncontrolled manifold) of the joint-space rather than along task-relevant directions (Latash et al., 2007). These results led to the Uncontrolled Manifold hypothesis (Scholz and Schöner, 1999) according to which the brain does not freeze redundant DOFs into a holonomic constraint (such as Donders' law) but instead uses redundant DOFs to push "bad motor variability" (i.e., directly affecting the task) along task-irrelevant directions of the joint space. In other words, according to the UCM hypothesis the brain would only stabilize elemental variables (such as joint rotations) that directly affect task performance while leaving task-irrelevant directions uncontrolled.

An alternative theory on how the brain may simplify the control of redundant DOFs is the Leading Joint Hypothesis (LJH) (Dounskaia, 2005). Central to the theory is the fact that link segments are coupled to each other by non-linear interaction torques so that motion in one joint unavoidably introduce motions to nearby joints, especially for fast speed movements. According to the LJH, the brain, depending on the specific task, organizes joints hierarchically: the "leading" joint, typically a proximal joint of the chain, is accelerated/decelerated as in a single joint movement, hence neglecting interaction torques and motions at the other joints. Subordinate joints instead, "monitor the interaction torque effect and create net torque that results in limb motion characteristics required by the task, including movement direction, accuracy, and so on." Although in line with intuition, the LJH, to the best of authors knowledge, does not really propose a computational framework.

### Optimal Feedback Control and Passive Motion Paradigm

The UCM and the LJH do provide theories of human motor control and mathematical frameworks to analyse human movements in terms of joint variability and leading/subordinate joints respectively. However, very little is known on how the brain may actually implement such motor strategies. Optimal feedback control (OFC) is probably one of the most accredited computational model of human motor control that can reproduce both average trajectories of human reaching movements and, to some extend, can predict the patterns of motor variability typical of the UCM hypothesis (Todorov and Jordan, 2002). Contrary to the robotics task-space control, in the OFC framework there is no distinction between planning and execution and task and joint space trajectories simply unfolds as the optimal controller adjusts feedback gains to suit the overall goals of the system. The OFC also predicts movement variability in line with the UCM hypothesis as deviations from the average trajectory are not correct by the controller if they do not affect task performance (minimum intervention principle).

Although the OFC framework has been very successful at modeling movement strategies typical of planar nonredundant point-to-point reaching movements (Scott, 2004), some computational studies have reported difficulties in solving optimal control problem in the presence of both kinematic redundancy and static forces (gravitational and/or elastic). This is because, to hold the body still at equilibrium (i.e., at the end of a movement), suitable boundary conditions must be specified so that the optimal muscle forces can counterbalance the static forces acting on the body (see Guigon et al., 2007 and references therein). Recently, this Posture/Movement problem has been tackled with the Separation Principle according to which the brain processes static (i.e., configuration-dependent) and dynamic (i.e., velocity-dependent) joint torques separately (Hollerbach and Flash, 1982; Atkeson and Hollerbach, 1985; Guigon et al., 2007). By combining the optimal control framework with the Separation Principle, Guigon and colleagues were able to implement human-like motor strategies in redundant manipulators (Guigon et al., 2007; Taïx et al., 2013). However, their approach, formulated at the joint-space level, results into path-dependent (i.e., dependent on the movement history and initial body configuration) terminal postures and therefore such an approach cannot predict kinematic synergies such as Donders' law.

An alternative theory to the OFC framework is the so-called Passive Motion Paradigm (PMP) (Mussa-Ivaldi et al., 1988; Mohan and Morasso, 2011; Morasso et al., 2015) PMP can be considered a computational generalization of the EPH, in that goals and kinematic constraints can be superimposed when viewed as force-fields. First proposed in the 80 s (Mussa-Ivaldi et al., 1988), the PMP has evolved over the years and has been proposed as a theory of human trajectory formation and as bio-inspired trajectory planner for redundant robots (Mohan and Morasso, 2007; Morasso et al., 2010; Mohan et al., 2011).

One of the major strengths of the PMP lies in its computational simplicity. While full details of the PMP are found in Mohan et al. (2011) and references therein, its basic features are illustrated in **Figure 1**. In the standard PMP model, a robotic manipulator is seen as a rigid structure (i.e., an arm-like kinematic chain) with "intrinsic" properties defined at the level of joint-space (e.g., joint angles q1, q2, q3) and "extrinsic" properties defined at the level of task-space (e.g., actual and desired endpoint postures **x** and **x**d, respectively). The redundancy problem is solved by postural mechanisms implemented via the action of an intrinsic impedance, for example in the form of purely viscous (mechanical dampers) or viscoelastic (dampers and springs) elements interconnected at joint level. On the other hand, movement is planned at task-level and implemented by an extrinsic impedance, **F** = K(**x**<sup>d</sup> − **x**) in **Figure 1**, acting as a generalized spring which continuously drives the end-effector (at some position **x**) toward the goal **x**<sup>d</sup> while the intrinsic impedance takes care of postures.

The standard PMP comes in two forms, with the only difference in terms of intrinsic impedance: one being purely viscous and the other being viscoelastic. In the first case, it can be easily shown (Tommasino and Campolo, 2017) that a purely viscous intrinsic impedance solves the posture/movement problem but is incompatible with Donders' law (as it does not yield repeatable postures). In the second case, the intrinsic viscoelastic impedance ensures unique postures (due to an elastic potential in joint-space) but does not solve the posture/movement problem. A simple way to see this is that the intrinsic springs "pull" the end-effector back to a rest position (**q** ∗ ), in contrast with the extrinsic spring which pulls the endeffector toward a target **x**d. To ensure task completion (i.e., **x** = **x**d) one should set the target at a different location, say **x**d′, so that the end-effector ends up being in equilibrium at the planned target **x**d. Computing **x**d′ is not trivial and somewhat blurs the separation between task and posture as **x**d′ depends on both the intrinsic and the extrinsic elastic potentials.

### An Extended Passive Motion Paradigm (λ0-PMP)

One way to prevent the interference between intrinsic and extrinsic elastic potentials is to block any effect of the intrinsic potential onto the task. Inspired by the Separation Principle (of static and dynamic torques) (Guigon et al., 2007), we recently proposed the λ0-PMP model (Tommasino and Campolo, 2017), an extension of the standard PMP. Experimental evidence shows that the human brain processes static (or configurationdependent) and dynamic (or velocity-dependent) force fields separately (Hollerbach and Flash, 1982; Atkeson and Hollerbach, 1985; Nishikawa et al., 1999; Kurtzer et al., 2005b). Because static forces such as gravitational or elastic fields are predominant during slow movements and are not affected by movement speed, the Separation Principle (Guigon et al., 2007) has been proposed as a simplifying control strategy for the brain to learn new movements (Nishikawa et al., 1999), to efficiently timescale arm trajectories (Hollerbach and Flash, 1982; Atkeson and Hollerbach, 1985) and to robustly cope with the effect of gravity in different environments (Kurtzer et al., 2005a).

In literature, the Separation Principle is typically applied at joint-space level (larger than task-space, dimension-wise, when dealing with redundant manipulators), assuming that static contributions (either due to gravity or to elastic fields) are perfectly compensated for by the brain (or by the robot controller) so that they can be removed from the dynamic equations of the limb under control (Guigon et al., 2007; Taïx et al., 2013). In our recent work (Tommasino and Campolo, 2017), we derived the λ0-PMP model by (i) re-framing the problem within the constrained minimization framework and

applied the Lagrange Multipliers method; (ii) noting that the Lagrange Multipliers λ define a task-space force field; (iii) applying the Separation Principle to λ and defining a static taskspace force field λ<sup>0</sup> (from which the name of the method). This task-space force λ0, also highlighted in **Figure 1** as an addition to the standard PMP, produces partial compensation of joint torques, blocking their effect only on the tasks space, leaving joint torques free to act in the null-space, and driving the posture toward minima of the potential without interfering with taskspace objectives. This captures the essence of postural synergies such as Donders' law which can now be seen as generated from a joint-space potential combined with a task-space force field.

### Scope of This Work and Contribution

With reference to **Figure 1B**, for the λ0-PMP, once the Jacobian matrix J is defined (given the geometry of the manipulator and of the task), the parameters which need to be determined are the intrinsic stiffness matrix K<sup>J</sup> ; the intrinsic damping matrix W (or, equivalently, the admittance W−<sup>1</sup> ); and the extrinsic stiffness matrix K. Although all these intrinsic and extrinsic parameters are required to plan motion, this work will focus on the role of intrinsic properties and in particular the intrinsic damping matrix W.

In previous works, we focused on postural strategies and showed how Donders' Law can be captured via an intrinsic elastic potential (Campolo et al., 2011; Tommasino and Campolo, 2016) and how nonlinear inverse optimization can be used to determine the coefficients of the intrinsic stiffness K J to fit experimental data (Tommasino and Campolo, 2017). In this work, we shift our focus on movement dynamics, which are primarily shaped by the damping matrix W. Rather than trying to use the coefficients of the matrix W as "extra degrees of freedom" to better fit experimental data a posteriori, we assume a priori that damping is proportional to stiffness, in line with experimental evidence (Tsuji et al., 1995; Perreault et al., 2004; Tee et al., 2004; Peaden and Charles, 2014). In other words, we hypothesize that the same biomechanical factors which determine the "shape" of K J (i.e., its eigenvectors and eigenvalues) also determine the "shape" of W.

With this hypothesis in place, the intrinsic stiffness still has an "indirect effect" as it shapes the intrinsic damping matrix W, whose dynamic effects are not blocked by λ0. We shall specifically show how this mechanism determines curvature of task-space trajectories during pointing tasks performed with the wrist, in line with the experimental evidence also reported in literature by Charles and Hogan (2010). Lastly, it should be noted that any stable task-space force field can be used as a movement planner and some possible choices have already been reported in Tommasino and Campolo (2017). In line with the PMP, in this work we assume that the task planner is a virtual elastic field driving the end-effector toward the desired target. A detailed comparison of different task-space planners will be reported in a separate work.

### 2. MATERIALS AND METHODS

This section presents the λ0-PMP, a novel extension of the Passive Motion Paradigm, and its specialization to wrist pointing tasks which will be later used in a comparative analysis with experimental data.

Although bearing remarkable similarities with the Passive Motion Paradigm, the theoretical derivation of the λ0-PMP follows a principled approach, described in detail in Tommasino and Campolo (2017). However, this similarity allows presenting our model as an extension of the PMP, facilitating readers already familiar with the Passive Motion Paradigm itself. To this end, **Figure 1B** highlights the differences between the two standard PMP models (here denoted as PMP1 and PMP2) and ours.

## λ0-PMP

One of the computational advantages of the PMP is its ability to solve the redundancy problem without explicit kinematic inversion and cost function computation (Mohan and Morasso, 2011). To see how this is accomplished, consider a redundant manipulator with forward kinematics

$$\mathfrak{x} = F\mathcal{K}(\mathfrak{q}) \tag{1}$$

where **x** ∈ R <sup>m</sup> is a given end-effector pose, **q** ∈ R n is a given manipulator configuration and the inequality m < n denotes kinematic redundancy. As an example, **Figure 1A** shows a planar human-like arm with a three-dimensional joint space (consisting of three rotational joints **q** = [q<sup>1</sup> q<sup>2</sup> q3] T ) and with a twodimensional task-space encoding of the actual hand position **x** and the desired hand position **x**d. Once the Forward Kinematics (FK) of the manipulator is defined in relation to a specific task, one can compute the task Jacobian, an n × m matrix which maps joint space velocities **q**˙ into task-space velocities **x**˙:

$$
\dot{\mathbf{x}} = \frac{\partial F\mathbf{K}}{\partial q}\dot{\mathbf{q}} := J(q)\dot{\mathbf{q}}\tag{2}
$$

The redundancy problem lies in the fact that, even with a fullranked Jacobian matrix, there might exist many (infinite) joint velocities which result in the same velocity at the end-effector **x**˙. This problem can be solved with a mechanical analogy, imagining that a mechanical manipulator, with negligible inertia and purely viscous (symmetric and positive-definite) joint impedance W producing viscous joint torques W**q**˙, is passively moved at the end-effector with an imposed velocity **x**˙. This action will produce a unique joint velocity

$$\dot{\boldsymbol{q}} = \boldsymbol{W}^{-1} \boldsymbol{J}^{T}(\boldsymbol{q}) \underbrace{\left(\boldsymbol{J}(\boldsymbol{q}) \boldsymbol{W}^{-1} \boldsymbol{J}^{T}(\boldsymbol{q})\right)^{-1}}\_{\boldsymbol{B}(\boldsymbol{q})} \dot{\boldsymbol{x}} \tag{3}$$

This type of redundancy solution is also known as a Wweighted generalized pseudo-inverse (Klein and Huang, 1983; Doty et al., 1993) but, rather than its derivation, here we want to emphasize its physical interpretation. The highlighted term B(**q**):= J(**q**)W−<sup>1</sup> J T (**q**) −1 represents the task-space damping, i.e., the damping force perceived at the end-effector (task-space) while imposing a task-space velocity **x**˙ and solely due to the joint-space damping W and its mapping via the Jacobian J(**q**).

To accomplish a task such as reaching for a target **x**d, a simple way is to "pull" the end-effector toward the target with

the action of an extrinsic spring K, producing a task-space force **F** = K(**x**d−**x**) on the hand always directed toward the target. The first and most basic form of PMP (Mohan and Morasso, 2011) corresponds to the thin-line loop [denoted PMP1 and including K, W−<sup>1</sup> , J(**q**) and J T (**q**)] in **Figure 1B**. This form of PMP can solve redundancy but is incompatible with Donders' law. More details can be found in Tommasino and Campolo (2017) but it is straightforward to see that, once on the target, i.e., **x** = **x**d, no force is produced by the extrinsic spring (**x** − **x**<sup>d</sup> = 0) and any posture **q** such that **x**<sup>d</sup> = FK(**q**) will therefore be maintained indefinitely.

To guarantee unique postures, an elastic scalar potential can be introduced in the joint-space, for example, via an intrinsic stiffness matrix K J at the joint-level. This will subject the manipulator to joint torques τ el : = K J (**q** − **q** ∗ ) which continuously drive the manipulator toward a given "rest" posture **q** ∗ (an equilibrium for the intrinsic elastic potential). The addition of elastic joint torques τ el leads to the second form of PMP, denoted as PMP2 and highlighted in **Figure 1B**. As shown in Tommasino and Campolo (2017), this solves redundancy and accounts for Donders's law but does not solve the posture/movement problem due to the contrasting effect of the extrinsic spring K, which pulls the end-effector toward the target **x**d, and intrinsic stiffness K <sup>J</sup> which pulls the whole manipulator toward the "rest" posture **q** ∗ . It is easy to show that, in general, **x**<sup>d</sup> is not an equilibrium for the system. If **x** = **x**d, then the extrinsic spring K will produce no force (**x**<sup>d</sup> − **x** = 0) and the effect of the intrinsic stiffness K <sup>J</sup> on the task will not be contrasted, moving the end-effector away from **x**d.

The unwanted interference due to intrinsic elastic torques τ el can be removed by adding the task-space force:

$$
\lambda\_0 := \left. B(q)J(q)\right|W^{-1}\mathfrak{r}\_{el} \tag{4}
$$

The task-force λ<sup>0</sup> is the last piece of the puzzle needed to complete the description of the λ0-PMP model shown in **Figure 1B**. For a complete theoretical derivation, the reader should refer to Tommasino and Campolo (2017), here we only wish to provide its physical intuition. As mentioned above, τ el is an elastic torque field responsible for Donders' law, in the sense that it constantly drives the manipulator toward "natural" or "comfortable" postures (Campolo et al., 2011). In doing so, however, it also interferes with the task completion (posture/movement problem). In order to block its effect only in the task-space (so, preserving Donders' law in joint-space) one could proceed as follows: the elastic torque τ el, if unblocked, would produce a joint velocity **q**˙ el <sup>=</sup> <sup>W</sup>−<sup>1</sup> τ el with a resulting task velocity **x**˙ el = J(**q**)**q**˙ el. The task-force <sup>λ</sup><sup>0</sup> can be seen as the force needed to contrast the (task-space) damping force B(**q**) **x**˙ el = B(**q**)J(**q**)W−<sup>1</sup> τ el. The novel concept of a task-space force λ<sup>0</sup> is very useful as it provides a force perspective which allows other force-control strategies to be simply superimposed onto our postural mechanisms. The extrinsic spring force **F** = K(**x**<sup>d</sup> − **x**) plays the role of Task Planner. In fact, as shown in Tommasino and Campolo (2017), other type of force-control strategies could be superimposed, such as a visco-elastic taskspace force field or an optimal force field planner minimizing the total task-space force moving the end-effector toward the desired target.

**Remark:** With reference to the diagram in **Figure 1B**, the task-space damping B(**q**) in Equation (3) transforms task-space velocities **x**˙ into task-space forces which balance out the effect of the extrinsic spring K, leading to the following task-space dynamic equation:

$$B(q)\dot{\mathbf{x}} = K(\mathbf{x}\_d - \mathbf{x})\tag{5}$$

Although the task-space damping is posture-dependent and more equations are needed to fully solve the dynamics, some remarkable properties can already be noted: (i) the taskdamping B(**q**) in Equation (3) directly depends on the intrinsic damping W which, therefore, directly affects task-space dynamics (Equation 5); (ii) the intrinsic stiffness K <sup>J</sup> does not appear in Equation (5) and therefore does not directly affect the task dynamics, however, it does it indirectly through postures adjustments in the null-space which affect the Jacobian and therefore B(**q**).

Although in this work we only consider elastic joint-space potentials, other potentials (for example due to gravity) and their gradient can be simply added in parallel to the elastic torque τ el in **Figure 1** and, the λ<sup>0</sup> would only block, in task-space, the effects of these posture-dependent torque fields but not velocitydependent (i.e., dynamic) torques, such as viscous effects due to W. The reader is referred to our previous work (Tommasino and Campolo, 2017) for this and other details. The role of the λ<sup>0</sup> also offers a force field perspective to the UCM and LJH hypothesis. By compensating only the task-space components of intrinsic potential fields, the null-space is left uncontrolled and motions along task-irrelevant directions are due exclusively to the passive dynamics of the limb under control.

### Application to Wrist Pointing Tasks

Building on previous experimental and computational studies (Campolo et al., 2009, 2010, 2011; Charles and Hogan, 2012; Formica et al., 2012; Tommasino and Campolo, 2017), we are specifically interested in capturing human-like motor strategies during pointing tasks performed with the wrist. To implement the model in **Figure 1B**, we shall first determine the Jacobian J from the forward kinematics; the intrinsic stiffness (K J ) matrix and the rest posture **q** ∗ ; the damping (W) matrix as well as the extrinsic stiffness matrix K.

#### Forward Kinematics

With reference to **Figure 2**, we assume that the wrist is used to point a virtual laser beam onto a point on a computer screen, i.e., a two-dimensional task-space of coordinates **x** = [x<sup>1</sup> x2] <sup>T</sup> ∈ R 2 . The human wrist is modeled as an ideal, three-dimensional gimbal comprising the following three orthogonal, rotational axes (from proximal to distal):


FIGURE 2 | Center-out task and kinematic spaces involved in wrist pointing. 3D wrist configurations can be equivalently expressed in terms of 3 × 3 rotation matrices R(**q**), rotation vectors **r** or joint rotations **q**. The forward kinematics **x** = FK(**q**) maps 3D wrist orientations onto the 2D task, expressed in screen coordinates **x**. Adapted from Campolo et al. (2011).

The joint-space is therefore three-dimensional and can be described via a joint vector **q** = [q PS q FE q RUD] <sup>T</sup> ∈ R <sup>3</sup> or, alternatively, via rotation vectors (Campolo et al., 2010, 2011). as shown in **Figure 2**.

The forward kinematics (Equation 1) for a 3DOF wrist at distance d = 1 m and initially pointing in the [1 0 0]<sup>T</sup> direction can be written as

$$
\begin{bmatrix} \chi\_1\\ \chi\_2 \end{bmatrix} = \underbrace{\begin{bmatrix} 0 & -d & 0\\ 0 & 0 & d \end{bmatrix}}\_{\text{secreen projection}} \cdot R(q) \cdot \begin{bmatrix} 1\\ 0\\ 0 \end{bmatrix} \tag{6}
$$

where R(**q**) represents the 3D hand orientation, computed as <sup>R</sup>(**q**) <sup>=</sup> exp(−b**e**x<sup>q</sup> PS) exp(b**e**z<sup>q</sup> FE) exp(b**e**y<sup>q</sup> RUD) where the exponential notation exp(b**<sup>e</sup>** <sup>θ</sup>) represents the rotation about an axis **e** by and angle θ (Murray et al., 1994). Further details are given in Campolo et al. (2011) and references therein. Once the forward kinematics is given, the Jacobian can be analytically computed based on its definition given in Equation (2).

#### Subject-Specific Intrinsic Stiffness K <sup>J</sup> and Rest Posture **q** ∗ from Experimental Data

For the 3DOF wrist, the intrinsic stiffness (as well as the damping) is represented by a 3 × 3 symmetric matrix. The rest posture **q** ∗ represents the posture (three joint angles) of minimum elastic energy. Using nonlinear inverse optimization (NIO) techniques (Tommasino and Campolo, 2016), a subjectspecific matrix K J and rest posture **q** ∗ can be directly derived from experimental data. As detailed in Tommasino and Campolo (2016), experimental data consisting of thousands of data points are fitted to a quadratic surface, typically used in literature to encode Donders' law. This can be seen as an extreme downsampling of experimental data and the resultant quadratic surface can be seen as an average Donders' surface. The reader is referred to Tommasino and Campolo (2016) for the detailed procedure based on nonlinear inverse optimization. One thing to highlight is that it is the relative ratio between eigenvalues of K <sup>J</sup> which determines a specific Donders' law, not the absolute values. For this reason, the trace of the matrix can be set to any arbitrary (positive) number. To be in line with biomechanical (passive) stiffness values found in literature (Peaden and Charles, 2014), we set this value to be trace(K J ) = 4 Nm/rad.

#### Damping W and Intrinsic Time Constant

While the intrinsic stiffness matrix is derived directly from a fitting process of experimental data, for the intrinsic damping matrix W, rather than trying to use the coefficients of the matrix W as "extra degrees of freedom" to better fit experimental data, we assume that damping is proportional to stiffness, in line with experimental evidence (Tsuji et al., 1995; Perreault et al., 2004; Tee et al., 2004; Peaden and Charles, 2014). In other words, we hypothesize that the same biomechanical factors which determine the "shape" (in terms of eigenvectors and eigenvalues) of K <sup>J</sup> will determine a similar "shape" for W. For this reason we set the damping to be proportional to the intrinsic stiffness

$$\boldsymbol{W} = \mathfrak{r}\_0 \boldsymbol{K}^{\boldsymbol{I}} \tag{7}$$

where τ<sup>0</sup> is a scalar (positive) value with the units of time, and can be therefore thought of as an intrinsic time constant. The reason is that, for a simple scalar, linear spring-damper system, the ratio between damping and stiffness determines exactly the time constant of the system.

#### Extrinsic Stiffness K and Task-Space Dynamics

The extrinsic stiffness K is responsible for the task-space dynamics together with the task-space damping B(**q**), as highlighted in Equation (5). However, B(**q**) is determined once J and W are given, as in Equation (3). In this work, we are considering very simple center-out tasks, as described below. As there is no a priory preferential direction in the task-space, we shall consider an isotropic extrinsic stiffness K, although other choices of task planners are possible (Tommasino and Campolo, 2017) and will be the focus of future works. When the extrinsic stiffness is constant, our computational model predicts trajectories that evolve in time according to a first order dynamic typical of a visco-elastic system with constant parameters. In other words, the desired target is reached with an exponential velocity profiles that decays to zero only after an infinite amount of time. This feature is clearly not bio-inspired as biological movements are characterized by bell-shaped velocity profiles. In the PMP, Morasso and colleagues have overcome this issue with the introduction of a time base generator, i.e., a time-dependent gain matrix that rescales end-effector velocities according to a minimum-jerk profiles (Morasso et al., 2010). However, in this work we pursue the approach proposed by Arimoto et al. (2005), as it could reproduce velocity profiles more similar to our experiments. More specifically, a time-varying extrinsic stiffness matrix:

$$K(t) = k \cdot \left(1 - e^{-\frac{t}{\overline{\tau}}} - \frac{t}{\overline{\tau}} e^{-\frac{t}{\overline{\tau}}}\right) \cdot \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix} \tag{8}$$

with the property of increasing its stiffness value from zero to k with an extrinsic time constant τ (notice that when pointing to a new target the time t is reset to zero), is used to avoid first order dynamics typical of visco-elastic systems with constant stiffness and damping parameters.

### Example: Anisotropic Damping and Curved Task-Space Trajectories

As an example, **Figure 3** shows the trajectories predicted via the λ0-PMP with an anisotropic intrinsic damping W. More specifically, every outbound and inbound movement was simulated for a duration T = 0.4 [s] , and with time-constants τ = τ<sup>0</sup> = 0.08 [s], i.e., one-fifth of T. The anisotropic damping W was set according to Equation (7) as:

$$W = \text{tr}\_0 K^I = 0.08 \begin{bmatrix} 0.5 & 0 & 0 \\ 0 & 1.5 & 0 \\ 0 & 0 & 2 \end{bmatrix} \frac{Nms}{rad} \tag{9}$$

hence proportional to an anisotropic intrinsic stiffness K J . The rest posture was set as **q** <sup>∗</sup> = [5 0 0]<sup>T</sup> [deg] and the scalar stiffness k = 22.5[ Nm rad ].

As shown in **Figure 3A** the anisotropic damping results into paths of different degree of curvature depending on the specific movement direction. Moreover, outbound and inbound movements follow different paths, especially along the (NW-SE) and (SW-NE) directions. **Figure 3B** shows the joint space trajectories predicted by the model and **Figure 3C** shows that task-space tangential velocity profiles are bell-shaped thanks to the time-varying stiffness K(t) (Arimoto et al., 2005).

### 3. COMPARATIVE ANALYSIS OF TASK-DYNAMICS: EXPERIMENTAL POINTING TASKS VS. DONDERS-FITTED λ0-PMP MODEL PREDICTIONS

In this section, we will compare the average experimental taskspace trajectories, as previously measured in Campolo et al. (2011) from human subjects during wrist pointing tasks, with those predicted via a Donders-fitted λ0-PMP model, i.e., a λ0- PMP model for which the only postural parameters are fitted to capture the Donders' law for a specific subject. The major limitation of our previous model (Campolo et al., 2011) is that it was limited to static postures, while our current λ0-PMP model can also generate movements.

The main hypothesis is that a Donders-fitted model λ0-PMP, i.e., fitted to only capture postural strategies, is also able to display path dynamics such as curved task-space trajectories as experimentally found by Charles and Hogan (2010). A major difference with their experimental paradigm is that their subjects only used FE and RUD movements, as PS movements were restrained, so it was not a redundant task. In our case, subjects are free to rotate the forearm about the PS axis, adding a degree of redundancy.

### Donders-Fitted λ0-PMP Model for Wrist-Pointing Tasks

Our λ0-PMP model in **Figure 1B** requires specification of a Jacobian [J(**q**)], a task-planner (K) as well as intrinsic damping (W) and intrinsic postural parameters (K J and **q** ∗ ). Once specialized to wrist-pointing tasks and ideally assuming a similar wrist structure for all subjects, the forward kinematics (Equation 6), and therefore the Jacobian J(**q**), will be the same for all subjects. On the other hand, we shall fit subject-specific postural parameters K J and **q** ∗ . Note that these parameters alone only capture Donders' law, i.e., they can identify optimal postures for giving pointing directions (Campolo et al., 2011; Tommasino and Campolo, 2016) but cannot tell where to point. The actual motion, in particular the geometry of task-space trajectories, will be shaped by the task dynamics (Equation 5). Here, rather than fitting every single movement a posteriori with a specific damping W, we make the a priori hypothesis that intrinsic damping is proportional to intrinsic stiffness, via an intrinsic time constant as in Equation (7).

As shown in Equation (5), task-space dynamics depend on K and on B(**q**) which, in turn, depends on the intrinsic damping W via Equation (3). For the task-planner, we shall assume an extrinsic K(t) as in Equation (8) hence isotropic and therefore not directly responsible for path curvatures. Both the intrinsic and extrinsic time constant, in Equations (7), (8) respectively, affect the speed of the simulated trajectory, in particular, the time required for the simulated wrist, to reach the target in "steadystate" (i.e., an equilibrium posture compatible with Donders' law) after the beginning of the movement. In general, movement speed can be target and subject specific, hence we set both time constants to be proportional to the average movement duration Tsj that subject s needs to point to the target j. More specifically,

we heuristically found that, by setting τ = τ<sup>0</sup> = Tsj/5 the model predicts both task-space and joint space dynamics that are compatible with the experimental ones (see results below). Similar to Equation (7), we used the time-constant τ<sup>0</sup> to tune the scalar stiffness k in Equation (8) as:

$$k = b\_{\max} / \mathfrak{r}\_0 \tag{10}$$

where bmax is the maximum eigenvalue of the matrix B(**q**<sup>0</sup> ) and **q**0 is the initial wrist configuration prior to the starting of the movement. Therefore, k was set on a subject-specific (because of B(**q**)) and movement-specific basis (because τ depends on the average time T that the subject needs to perform the movement).

We know that our model predicts curvatures in task-space, as shown in **Figure 3**. We shall now compare the average curvature displayed by a specific human subject with the curvature predicted by our λ0-PMP model, once Donders-fitted to a specific subject.

### Experimental Protocol

We asked six subjects to perform center-out pointing tasks toward nine targets on a computer screen. As shown in **Figure 2**, the nine targets consist of a central target and eight peripheral ones arranged over a circle (with a radius of 15◦ ) and oriented along the eight cardinal directions, i.e., North (N), NorthEast (NE), etc., which also define the naming convention).

Each subject performs 10 trials at self-paced speed. Each trial consists of eight outbound movements (from the center to a peripheral target) and 8 inbound movements (from a peripheral target to the center). In a trial, each peripheral target is visited only once and an outbound movement is always followed by an inbound movement. The order in which targets are visited is computed prior to the start of the trial as a pseudo-random permutation.

Throughout the experiment, subjects grasp a light-weight 3D printed handle mounting an inertial measurement unit (IMU) that records hand orientations R(**q**), with respect to the fixed reference frame, at 120 Hz. A computer monitor is used to display the center-out task to the subject. The visual feedback consists of the desired target position (a red circle) and the current location pointed at by the subject (a yellow circle). The current location is displayed at coordinates **x** = [x<sup>1</sup> x2] T computed as in Equation (1), is updated realtime via Equation (6), where R(**q**) is sensed by the IMU.

### Data Analysis

In this work, we are mainly interested in task-space dynamics, in particular the fact that trajectories during pointing tasks with the wrist appear more curved than in similar tasks performed with the arm (Charles and Hogan, 2010).

### Movement Start and End Times

To this end, we follow the same data analysis method proposed in Charles and Hogan (2010). Specifically, the recorded kinematic data is first filtered with smoothing splines (Dohrmann et al., 1988; Charles and Hogan, 2010) to ease numerical differentiation in estimating task-space velocity profiles. The starting and ending times of a movement are identified from the task-space tangential velocity profiles: the start of movement is set to occur at the time of the first data sample before the velocity peak with a value below 20% peak velocity. Similarly, the end of a movement is set to occur at the time of the first data sample after the velocity peak with a value below 20% the peak velocity. Movements featuring a path length and/or a duration beyond two interquartile from the (subject-specific) median were excluded from the analysis. To compute the average trajectory of a movement, the data was normalized with respect to the movement time and then linearly interpolated at 20 equally temporally spaced samples.

#### Path Curvature and Hysteresis

To assess path curvature in task-space, we follow the method proposed in Charles and Hogan (2010) to test whether task-space paths are on average curved and if so, whether outbound and inbound movements had different direction of curvatures. More specifically, since each movement occurs between two targets, we consider the straight-line connecting the two targets and directed from the initial target to the final one. This direction is used to determine whether the actual movement is on the left or on the right of the straight line, as depicted in **Figure 4**. In particular, we shall consider the whole area enclosed between the actual movement and the straight line and split this area into "right" area (AR), i.e., the enclosed area to the right of the straight line and "left" area (AL) as the enclosed area to the left of the straight line. Both left and right areas are defined non-negative and are normalized with respect to the square of the nominal target-totarget distance (( <sup>π</sup> 12 ) 2 [m<sup>2</sup> ]). For each movement, either inbound or outbound, we compute the following measures:


Finally, since an outbound movement is always followed by an inbound movement, we also consider the path hysteresis defined as Ahyst : = A OUT net + A IN net, i.e., the area enclosed in between outbound and inbound paths.

To assess the statistical significance of each measure, we use a t-test (with α = 0.05) to test the following hypotheses: (1) paths are curved (Asum 6= 0); (2) outbound paths have a preferred curvature direction (A OUT net 6= 0); (3) inbound paths have a preferred curvature direction (A IN net 6= 0); and (4) an outboundinbound sequence presents hysteresis (A OUT net 6= −A IN net).

### Results

For all subjects and for all movements we found Asum to be statistically different from zero suggesting that task-space paths executed with the wrist are not straight also in presence of redundancy (this was not the case in Charles and Hogan, 2010, where PS was locked) .

**Figure 5** shows the average outbound and inbound paths of the six subjects together with their standard deviations (shaded areas). Thick lines mark movements for which Anet was statistically different from zero (i.e., a preferred curvature direction) while stars mark segments with statistically significant hysteresis (i.e., outbound and inbound follow different paths). Superimposed with experimental trajectories, **Figure 5** also shows the task-space trajectories predicted via a Donders-fitted λ0-PMP model (dashed lines). The subject-specific postural parameters K J and **q** ∗ (estimated with method proposed in

Tommasino and Campolo (2016)), used to simulate the model for each subject, are shown in **Tables 1**–**2**.

**Figure 6** compares the experimental Anet (average and standard deviation) with the model predicted Anet. This comparison indicates whether the simulated paths have the same curvature direction and magnitude as the experimental ones. A t-test (p < 0.05) was used for each movement to assess if the average Anet was statistically different from the model predicted Anet (stars).

With reference to **Figure 5A**, subject 1 shows path hysteresis only for the (SW) target, while, the only statistically different Anet where found for inbound movements from the (W) and the (SW) target (red thick lines) that both veer to the right. With reference to **Figure 6A**, there is no statistical difference between the model and the experimental curvatures when pointing to and from the (N) and the (W), from (NW) and to (S) targets. The model is particularly accurate in capturing the average curvature of the (N) inbound, the (S) inbound and the (W) outbound and inbound. Overall, for this subject the model can only capture the curvature of 6 out of 16 movement direction (37%).

With reference to **Figure 5B**, subject 2 presents hysteresis for most of the targets, except for the (E), (SW) and (SE). This subject presents preferred curvature direction when pointing to and from the (W) target, with outbound and inbound both veering to the right. There is also a preference to veer to the right and to the left when performing outbound movements toward the (SW) and (S) targets, respectively. **Figure 6B** shows that, for all movements, the model predicts curvatures that are not statistically different from the experimental ones.

Similar analysis can be conducted for the remaining subjects. Here we limit ourselves to observe that for subject 3 there were no differences in terms of curvatures in 11 out of 16 movements (about 70% of movements). For subject 4

FIGURE 5 | Average task-space trajectories. Shaded areas show the standard deviations for both outbound and inbound movements. Thick lines are relative to paths with Anet statistically different from zero (i.e., a preferred curvature direction). The stars mark movement for which A IN net was statistically different from A OUT net , i.e., those movement that present hysteresis.

there were no differences between model and experimental curvatures in 10 out of 16 movements (62% of movements). For subject 5 there were no differences between model and experimental curvatures in 8 out of 16 movements (50% of movements) and for subject 6 in 9 out of 16 movements (56%).

TABLE 1 | Subject-specific intrinsic stiffness (K J ) parameters estimated via NIO from the average postural strategy.


Stiffness subscript correspond to: 1 = PS, 2 = FE, 3 = RUD, so that the stiffness K<sup>12</sup> corresponds to the stiffness along the PS-FE direction.

TABLE 2 | Subject-specific equilibrium postures (**q** ∗ ) estimated via NIO from the average postural strategy.


In summary, task-space curvature and hysteresis appear to be subject- and movement-specific and the model can capture most of this features for the majority of movements and subjects.

The experimental and simulated joint space trajectories are shown in **Figure 7** (only outbound and inbound movements from the (E) the (W) target are shown). All subjects show high variability when coordinating the PS rotation (red area), most likely because this is the joint that adds redundancy to the pointing task. The model can accurately reproduce the average FE (magenta color) and RUD (green) trajectories for most of the subjects and movements, while for PS rotations, there are larger errors between the average experimental trajectory and the model.

At the starting and ending times of each movement, i.e., when the wrist is stationary, the model predicted postures only depends on the estimated parameters K J and **q** ∗ . So, the larger the error between the model and the experimental posture, the less accurate is the estimate of K J and **q** ∗ . Because we are setting W proportional to K J , part of the errors between the model and the experimental trajectories may be due to the error between the real intrinsic subject stiffness and the one estimated from the data. In addition, the model does not take into account inertial and gravitational contributions. While the former has very little effect on wrist and forearm rotations (Peaden and Charles, 2014), gravity torques have been found to be non-negligible (Peaden and Charles, 2014).

**Figure 8** compares the experimental task-space tangential velocity profile and those predicted by the model for a representative subject. The time-varying extrinsic spring (Equation 8) reproduces bell-shaped velocity profiles similar to the experimental ones, although, task-space velocities predicted by the model tend to be larger than the experimental ones.

### 4. CONCLUSION

Motion planning and postural control in the presence of kinematic redundancy continue to be central topics in both neuroscience and robotics. For example, it is still debated why hand movements follows roughly straight-line paths in some experimental conditions while they are curved in others experimental settings. For decades, minimum principles (such as minimum-jerk, minimum variance, and so forth) and optimal control have been used as a tool to model and capture humanlike trajectories. Although successful in capturing some features of human movements, when formulated in joint space, these approaches are not only computational demanding but also fail to capture postural control mechanisms such as Donders' law. While it is still unclear how the brain solves redundancy (Mussa-Ivaldi et al., 2011), in robotics kinematic redundancy has been tackled with the task-space control framework that combines local optimization and W-weighted generalized pseudo-inverses. However, as robots start to look more anthropomorphic and to interact with humans, they also need to display natural and intuitive movements and posture. Hence, roboticists are looking at bio-inspired approaches to plan and control taskspace trajectories, null-space movements and equilibrium robot postures. We recently addressed the problem of postural control and trajectory planning by combining classical robotic motion planning (velocity resolution control) with neuroscientific evidence and theories of human motor control. We proposed a general and unifying force-field based posture and movement planner that was primarily tested in terms of human-like postural control (equilibrium postural strategies) (Tommasino and Campolo, 2017). In this work we extend our previous results by investigating the trajectory (both in task and joint space) predicted by a specific instance of our general computational framework: the λ0-PMP. More specifically, we focused on human motor strategies during redundant pointing tasks performed with wrist (and forearm) rotations. In a previous work, Charles and Hogan (2010) showed that when pointing with the wrist, taskspace paths are curved and in general, inbound and outbound movements follow different paths. In a successive work, they posited that such features of wrist rotations are due to an anisotropic joint stiffness matrix.

Here, we put forward the hypothesis that anisotropic intrinsic damping, rather than stiffness, is primarily responsible for curved task-space paths. The novel aspect of our approach is that our model was fitted to capture postural strategies and, with the sole hypothesis that intrinsic damping is proportional to stiffness (Equation 7), the model also exhibited curvatures and hysteresis in task-space performance remarkably similar to subject-specific average motions. More specifically, we found that (i) task-space paths are curved also in presence of kinematic redundancy, extending thus the work of Charles and Hogan (2010) where the PS axis was locked; (ii) curvature and hysteresis found in experimental trajectories, on a subject-specific and target-specific basis, are a possible consequence of postural constraints.

It should be noted that our computational framework is capable of generating human-like task-space trajectories from the only knowledge of the terminal target position. Hence, for

Frontiers in Neurorobotics | www.frontiersin.org

statistically different (p < 0.05) from the average experimental Anet.

FIGURE 7 | Joint space trajectories (PS in red, FE in magenta and RUD in green), predicted by the model (dashed lines) and measured experimentally: mean (continuous line) and standard deviations (color areas). The letters indicate the target sequence. For instance, the first movement is an outbound movement toward the target (E), the second movement is an inbound movement from target (E) to target (C) and so forth.

robotic applications, task-space trajectories must not be preprogrammed as in the classical task-space control approach, but are a direct consequence of the intrinsic and extrinsic impedance parameters (damping and stiffness) used in the model. For the pointing task implemented in this work there is no preferred task-space direction as subjects only receive the final target position as desired target. Therefore, in the model, for the taskspace planner we used an isotropic elastic attractor to push the simulated cursor on the desired target. This solution is not only simple but, when combined with suitable intrinsic impedance parameters also results in human-like wrist trajectories. However, as also discussed in Tommasino and Campolo (2017) any taskspace force field can in principle be used as task-planner, and therefore, for more complex robotic applications future works will explore the possibility of integrating dynamic movement primitives in our framework for the generation of adaptive and compliant skills (Calinon et al., 2013). In summary, in addition to the desired target location, our extended passive motion paradigm requires only the knowledge of: (i) an intrinsic stiffness matrix K J and an equilibrium posture **q**<sup>0</sup> that, combined with the λ<sup>0</sup> force field, allow the prediction of equilibrium (steady-state) wrist postures compatible with experimental (subject-specific) Donders' laws: (ii) the movement duration T, from which both the intrinsic and extrinsic time constants, of the joint-space damping and task-space stiffness respectively, can be set to reach the desired target in T seconds and with an equilibrium posture compatible with Donders' law.

There are of course many approximations and assumptions in our model which, as mentioned, is not meant to predict exact trajectories but rather capturing some basic features of humanlike motion. A major limitation is that the intrinsic stiffness K J is only a very simplified attempt to approximate the real, nonlinear, time-variant mechanical stiffness typically of human arm. This in turn affects not only the predicted postural strategies (i.e., wrist configuration at the beginning and ending of a movement) but also the predicted trajectories as the relationship between damping and stiffness is certainly more complex than the simple proportionality assumed in Equation (7). Furthermore, we only considered an isotropic task planner to investigate the effect of joint damping on path curvatures. However, future works need to compare how different and possibly anisotropic task planners (Tommasino and Campolo, 2017), when combined with anisotropic joint damping, predict subject-specific path's curvature.

A second limitation is that the λ0-PMP totally neglects feedback, as it is meant to address motion planning rather than execution. Our model is however useful at a planning stage, while feedback should be incorporated for movement execution.

As a third limitation, our model is to be considered as a first order postural and motor planner, in the sense that it does not take into account the inertial properties of human or robotic arms. This is a specific choice (in some cases an inertia might not even be available, e.g., in motor imagery scenarios) and the model could be extended to include inertial properties. In fact, the role that the manipulator intrinsic inertia would have is the same that the intrinsic damping has in our model. Such an approach would lead to models along the lines proposed by Khatib et al. (2009). Our approach however, is similar to Dietrich et al. (2015) where the manipulator joint stiffness (see Equation 7), compared to manipulator inertia, has been shown to be a more reliable weighting matrix for the calculation of W-weighted pseudoinverse and null-space projector operator.

In conclusion this work presents an extended version of the PMP that can deal with kinematic redundancy in compliance with Donders' law and solve the posture/movement problem. Just like the PMP, our model can find extensive use in planning human-like motions for humanoid robots and, at the same time, be able to capture natural postures in compliance with Donders' Law.

### AUTHOR CONTRIBUTIONS

PT and DC wrote and revised the manuscript, and conceived the work; PT performed simulations and data analysis.

### FUNDING

This work was partly supported by the A\*STAR Industrial Robotics Program, 2013–2016, and by the NMRC project BnB/0006b/2013, Ministry of Health, Singapore.

### REFERENCES


interaction. IEEE Trans. Robot. 31, 850–863. doi: 10.1109/TRO.2015.24 30053


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer DH and handling Editor declared their shared affiliation.

Copyright © 2017 Tommasino and Campolo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.