An Improved Fuzzy Brain Emotional Learning Model Network Controller for Humanoid Robots

The brain emotional learning (BEL) system was inspired by the biological amygdala-orbitofrontal model to mimic the high speed of the emotional learning mechanism in the mammalian brain, and it has been successfully applied in many real-world applications. Despite its success, such a system often suffers from slow convergence in online humanoid robotic control. This paper presents an improved fuzzy BEL model (iFBEL) neural network that integrates a fuzzy neural network (FNN) into a conventional BEL, in an effort to better support humanoid robots. In particular, the system inputs are passed into sensory and emotional channels that jointly produce the final outputs of the network. The non-linear approximation ability of the iFBEL is achieved by taking the BEL network as the emotional channel. The proposed iFBEL works with a robust controller to generate the hand and gait motions of a humanoid robot. The updating rules of the iFBEL-based controller comprise two parts: the updating rules of the sensory channel, which follow those of the conventional BEL model, and the updating rules of the FNN and the robust controller, which are derived from the Lyapunov function. Experiments on a three-joint robot manipulator and a six-joint biped robot demonstrated the superiority of the proposed system over a conventional proportional-integral-derivative controller and a fuzzy cerebellar model articulation controller, based on the more accurate and faster control performance of the proposed iFBEL.


INTRODUCTION
The control of uncertain nonlinear systems with multiple inputs and outputs often presents a great challenge, and robotic motion control is a typical case. Robots, especially humanoid robots, are widely used in domestic, medical and other industrial areas (Liu et al., 2015; Li et al., 2017; Wu et al., 2018; Zhou et al., 2018). A humanoid robot must accurately control its two manipulators and two legs in order to generate hand reaching/grasping motions and biped-leg walking gaits. Such crucial motion abilities allow humanoid robots to work in complicated, dangerous, and even poisonous environments with reduced labor costs, health implications, and other associated complications.
Sliding Mode Control (SMC) has proved to be an effective control method for uncertain nonlinear systems, especially for humanoid motion control. Once the state of the system reaches a sliding surface, the state will remain on that surface regardless of system uncertainties and external disturbances (Lin and Hsu, 2015). Yet control input chattering, usually caused by a combination of uncertainties from multiple pathways, is undesirable in humanoid robot systems when SMC is applied. Several studies have found that combining an artificial neural network with an SMC controller can enhance the non-linear approximation ability and thereby reduce the chattering effect (Boldbaatar and Lin, 2015).
A neural network with good non-linear learning abilities is therefore of great appeal to the SMC model. Note that an association between a stimulus and its emotional consequence in the amygdala of the mammalian brain was discovered by LeDoux (1992). The inspiration from the emotional consequence then led to the development of the brain emotional learning network (BEL) controller, which has a good nonlinear approximation capability. Such a neural network is comprised of a sensory neural network simulating the orbitofrontal cortex of the brain, and an emotional neural network representing the amygdala cortex (LeDoux, 1992; Lotfi and Akbarzadeht, 2013). The sensory neural network is responsible for the major output of the controller, while the emotional neural network has an indirect impact on the sensory neural network. Despite their effectiveness in uncertain non-linear control, most BEL networks suffer from slow learning convergence, which makes on-line control of the multiple joints of a humanoid robot difficult.
Fuzzy neural networks (FNN) are another popular choice for uncertain nonlinear control systems with reasonable non-linear approximation ability, due to their rapid learning convergence and simple structure, which is particularly favorable for on-line humanoid robotic control (Rubio, 2012, 2018; Aguilar-Iban et al., 2018; Rubio et al., 2018). A typical FNN integrates a fuzzy inference system and a neural network (Pan et al., 2016; Meda-Campana et al., 2018). The weights of the network are usually updated by taking only the output errors of the FNN as the learning assessment. To achieve better performance for uncertain nonlinear systems, the FNN must also consider the overall performance of the system when adjusting the control parameters, as reported in Zhao and Lin (2017). Therefore, combining the rapid convergence of the FNN with the nonlinear mapping capability of the BEL is a promising approach to controlling humanoid robots.
We believe that the chattering effect of the SMC model is a very challenging issue. Although many existing algorithms have been developed to deal with chattering, the artificial neural network still plays an important role in the control of uncertain nonlinear systems with multiple inputs and outputs. In addition, the FNN converges rapidly and the BEL can increase the network's nonlinear mapping capability. Therefore, we focused on a combined neural network to deal with the chattering problem. Based on these considerations, this paper proposes an improved fuzzy brain emotional learning model network (iFBEL) for a humanoid robot controller, in an effort to achieve better human-like control performance with the support of stronger nonlinear approximation capabilities. The proposed iFBEL is comprised of two components, one built from a conventional BEL and the other from an FNN; the resulting iFBEL thus enjoys the advantages of both subsystems. The iFBEL works with a robust controller to replace the ideal sliding mode controller for better system performance. To ensure convergence and robustness, the adaptive laws of the FNN and the robust controller are derived from the Lyapunov function. The iFBEL was validated and evaluated on a robot with a three-joint manipulator and a biped-leg system, although applications in other control fields can be readily identified. The experimental results demonstrate competitive performance of the proposed system in dynamic humanoid robotic control.
The remainder of this paper is organized as follows: Section 2 introduces a group of uncertain nonlinear systems controlled by a sliding mode controller. Section 3 reports the proposed improved fuzzy brain emotional learning model neural network. Section 4 describes the implementations of the network controller and the updating rules. Section 5 shows the experimental results and compares the performances with the conventional proportional-integral-derivative (PID) controller and the fuzzy cerebellar model articulation controller (FCMAC). Section 7 concludes the paper and points out future work.

HUMANOID ROBOT CONTROL BY SLIDING MODE CONTROLLER
In order to understand the proposed network-based control system and realize the importance of the proposed neural network, this section introduces a typical uncertain nonlinear system controlled by a sliding mode controller as the work's background.
A humanoid robot needs to control multiple joints. Without loss of generality, consider a class of nth-order uncertain nonlinear systems with m-dimensional input and output states, expressed in the following form:
$$x^{(n)}(t) = f(x(t)) + G(x(t))\,u(t) + d(t) \qquad (1)$$
where $u(t)$ is the control input, $f(x(t)) \in \Re^{m}$ is an unknown, but bounded, smooth nonlinear function, $G(x(t)) \in \Re^{m \times m}$ is an unknown, but bounded, gain matrix, and $d(t) = [d_1(t), d_2(t), \ldots, d_m(t)]^T \in \Re^{m}$ is an external bounded disturbance. The nominal model of such a nonlinear system can be defined as
$$x^{(n)}(t) = f_n(x(t)) + G_n\,u(t) \qquad (2)$$
where $f_n(x(t))$ is the nominal function of $f(x(t))$, and $G_n = \mathrm{diag}[g_{n1}, \ldots, g_{nm}] \in \Re^{m \times m}$ is the nominal function of $G(x(t))$, with $g_{ni}$ being nominal gain constants, for i = 1, 2, . . . , m. Assuming that $g_{ni} > 0$ so that $G_n^{-1}$ exists, Equation 1 can be represented as:
$$x^{(n)}(t) = f_n(x(t)) + G_n\,u(t) + l(x(t), t) \qquad (3)$$
where $l(x(t), t) = \Delta f(x(t)) + \Delta G(x(t))\,u(t) + d(t)$ collects the lumped uncertainties and external disturbances. Let a desired trajectory in $\Re^{m \times n}$ be given, which the state of the system is required to track; the tracking error vector $e(t)$ is the deviation between this desired trajectory and the system state. An ideal sliding surface $s(t)$ (Equation 4) is defined from $e(t)$ and the gain matrix $\bar{K} = [I, K] = [I \;\; \lambda_1 I \;\; \ldots \;\; \lambda_n I] \in \Re^{m \times (m+1)n}$. The coefficients $\lambda_j = [\lambda_{1j} \ldots \lambda_{nj}]^T \in \Re^{n}$ are chosen such that all roots of the equation $q^{n} + \lambda_1 q^{n-1} + \cdots + \lambda_{n-1} q + \lambda_n = 0$, in which q is the Laplace operator, lie in the open left half-plane. The time derivative of Equation 4 leads to Equation 5, in which $\dot{e}(t) = [e^{(n)}(t) \;\; e^{(n-1)}(t) \;\; \ldots \;\; \dot{e}(t)]$.
For the existence and reachability of this sliding surface, the control law of the system must satisfy the following inequality:
$$\frac{1}{2}\frac{d}{dt}\,s_i^{2}(t) \le -\sigma_i\,|s_i(t)| \qquad (6)$$
for $\sigma_i > 0$, i = 1, 2, . . . , m.
Substituting Equation 5 into Equation 6 yields Equation 7. If the dynamics and the lumped uncertainty of the system are known exactly, the ideal sliding mode controller $u_{ISMC}$ is designed as in Equation 8, in which sgn is the sign function and $G_n$ is a positive definite matrix. However, it is difficult to obtain the dynamical functions of most nonlinear systems, and the lumped uncertainty is always unmeasurable. Therefore, the ideal sliding mode controller is unobtainable.
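To make the role of the sign term concrete, the following minimal sketch simulates a sliding mode controller on a scalar second-order plant (m = 1, n = 2). It is an illustration under stated assumptions, not the controller used in this paper: the plant `x'' = u + d(t)`, the surface `s = e' + lam * e`, the gains `lam` and `sigma`, the reference `sin(t)` and the disturbance `sin(3t)` are all hypothetical choices. The control is an equivalent term plus a switching term, so the chattering discussed in the Introduction appears as repeated sign flips of the switching term.

```python
import math

def simulate_smc(lam=5.0, sigma=2.0, dt=1e-3, t_end=5.0):
    """Track x_d(t) = sin(t) with the toy plant x'' = u + d(t), where |d| <= 1 < sigma."""
    x, x_dot = 0.5, 0.0              # hypothetical initial state
    t = 0.0
    sign_flips, prev_sign = 0, 0
    for _ in range(int(round(t_end / dt))):
        xd, xd_dot, xd_ddot = math.sin(t), math.cos(t), -math.sin(t)
        e, e_dot = xd - x, xd_dot - x_dot
        s = e_dot + lam * e                        # sliding surface (assumed form)
        sgn = (s > 0) - (s < 0)                    # sign function
        u = xd_ddot + lam * e_dot + sigma * sgn    # equivalent + switching control
        d = math.sin(3.0 * t)                      # bounded disturbance, not used by the controller
        x_ddot = u + d
        # Forward-Euler integration of the plant.
        x += x_dot * dt
        x_dot += x_ddot * dt
        t += dt
        # Count sign changes of the switching term as a crude chattering measure.
        if sgn != 0:
            if prev_sign != 0 and sgn != prev_sign:
                sign_flips += 1
            prev_sign = sgn
    final_error = abs(math.sin(t) - x)
    return final_error, sign_flips

if __name__ == "__main__":
    err, flips = simulate_smc()
    print("final tracking error:", err)
    print("sign flips of the switching term:", flips)
```

With this choice, the surface dynamics reduce to `s' = -sigma * sgn(s) - d`, so the reaching condition holds whenever `sigma` exceeds the disturbance bound. The large number of sign flips is the chattering that the network-based controller is meant to reduce.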

THE PROPOSED IFBEL NETWORK
The configuration of the proposed iFBEL is depicted in Figure 1; it consists of a BEL and an FNN in addition to the input and output spaces. The outputs of this network are $u_i = b_i - g_i$ for i = 1, 2, . . . , m, in which $b_i$ are the outputs of the BEL and $g_i$ are the outputs of the FNN. The BEL network is comprised of the input space I, the association memory space $M_1$, the weight memory space V, and the sub-output space B. The FNN shares the same input space with the BEL, and it also includes the association memory space $M_2$, the receptive-field space R, the weight memory space W, and the sub-output space G. In particular, the FNN channel of the iFBEL also contains a set of fuzzy inference rules (Lee, 1990), represented as follows:
$$\text{If } p_1 \text{ is } \varphi_{1jk} \text{ and } p_2 \text{ is } \varphi_{2jk}, \ldots, \text{ and } p_m \text{ is } \varphi_{mjk}, \text{ then } g_{jk} = \omega_{jk}, \quad j = 1, \ldots, n_f,\; k = 1, \ldots, n_k \qquad (9)$$
where $n_f$ is the number of layers for each of the m input dimensions, each layer includes $n_k$ blocks, $n_l = n_f n_k$ is the number of fuzzy rules, $\varphi_{ijk}$ represents the fuzzy set for the ith input, jth layer and kth block, $\omega_{jk}$ is the output weight in the consequent part, and $g_{jk}$ is the rule's output. Each fuzzy set's membership function can be defined as a rectangular, triangular or any continuous bounded function, e.g., Gaussian or B-spline; the Gaussian function is adopted here so that the iFBEL is easy to implement and retains good non-linear approximation ability. The aforementioned "spaces" are detailed as follows:
1. Input Space I: $p = [p_1, p_2, \ldots, p_m]^T \in \Re^{m}$ is an input vector which is quantized into discrete regions (elements), where m is the number of input state variables. The number of elements, $n_e$, is termed the resolution. p is delivered to the BEL and the FNN simultaneously as their inputs.
2. Association Memory Spaces $M_1$ and $M_2$: Several elements are combined as a block; the number of blocks, $n_b$ and $n_f$ for the BEL and the FNN respectively, must be equal to or greater than two. The association memory space of the BEL has $n_a$ (= m × $n_b$) components, while that of the FNN has $n_c$ (= m × $n_f$) components. Every component is represented as a Gaussian basis function; let ϕ denote a component for the BEL and f for the FNN:
$$\phi_{ij} = \exp\left[-\frac{(p_i - y_{ij})^2}{z_{ij}^2}\right] \qquad (10)$$
where i = 1, 2, . . . , m, j = 1, 2, . . . , $n_b$, and $y_{ij}$ and $z_{ij}$ are the means and variances, respectively; and
$$f_{ijk} = \exp\left[-\frac{(p_i - c_{ijk})^2}{v_{ijk}^2}\right] \qquad (11)$$
where i = 1, 2, . . . , m, j = 1, 2, . . . , $n_f$, k = 1, 2, . . . , $n_k$, and $c_{ijk}$ and $v_{ijk}$ are the means and variances, respectively. The block matrix of the BEL is defined in Equation 12.
3. Receptive-field Space R for FNN: Every cell in this space is the product of the corresponding components of the association memory space $M_2$, which is defined as:
$$\varphi_{jk} = \prod_{i=1}^{m} f_{ijk} \qquad (13)$$
where j = 1, 2, . . . , $n_f$, and k = 1, 2, . . . , $n_k$. An example of the FNN with two input variables is shown in Figure 2, which has 4 layers ($n_f$ = 4) for every input variable and 2 blocks ($n_k$ = 2) for each layer. Here $n_l = n_f n_k$ is the number of receptive fields, such as Aa, Bb, . . . ; $\varphi_{jk}$ is associated with the jth layer and the kth block in the fuzzy rule as expressed in Equation 9. The block matrix of the FNN is defined in Equation 14.
4. Weight Memory Spaces V and W: $\nu_{ijk}$ is the weight of the ith output, jth input, and kth block of the BEL; and $\omega_{ijk}$ is the weight of the ith output, jth layer, kth block of the FNN.
5. Sub-output Spaces B and G: The ith output ($b_i$) and the output vector (b) of the BEL are the weighted sums of the association memory components with the weights in V; the ith output ($g_i$) and the output vector (g) of the FNN are the weighted sums of the receptive-field cells with the weights in W.
6. Output Space U: The output of the proposed iFBEL is the combination of the outputs of the BEL and the FNN, in which the BEL works as a primary controller and the FNN as an emotion controller: $u_i = b_i - g_i$, for i = 1, 2, . . . , m.
FIGURE 3 | Design of control system.
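The six spaces described above can be sketched as a single forward pass. This is a minimal illustration rather than the authors' implementation: the exact Gaussian form, the dictionary layout (`mean`, `width`, `V`, `W`) and the function names are assumptions introduced here, and each sub-output is taken to be a plain weighted sum. Only the final combination `u_i = b_i - g_i` is stated explicitly in the text.

```python
import math

def gaussian(p, mean, width):
    # Assumed Gaussian form; the text only states that Gaussian functions are used.
    return math.exp(-((p - mean) ** 2) / (width ** 2))

def ifbel_forward(p, bel, fnn):
    """Return (u, b, g) for an input vector p.

    bel["mean"][i][j], bel["width"][i][j]       : BEL components, i = input, j = block
    bel["V"][o]                                 : BEL weights of output o (flattened over i, j)
    fnn["mean"][i][j][k], fnn["width"][i][j][k] : FNN components, j = layer, k = block
    fnn["W"][o]                                 : FNN weights of output o (flattened over j, k)
    """
    m = len(p)
    # Association memory space M1 (BEL): one Gaussian per input and block.
    phi = [gaussian(p[i], bel["mean"][i][j], bel["width"][i][j])
           for i in range(m) for j in range(len(bel["mean"][i]))]
    # Sub-output space B: weighted sum of the BEL components.
    b = [sum(v * a for v, a in zip(row, phi)) for row in bel["V"]]

    # Receptive-field space R (FNN): product over the inputs for each (layer, block).
    n_f = len(fnn["mean"][0])
    n_k = len(fnn["mean"][0][0])
    fields = []
    for j in range(n_f):
        for k in range(n_k):
            cell = 1.0
            for i in range(m):
                cell *= gaussian(p[i], fnn["mean"][i][j][k], fnn["width"][i][j][k])
            fields.append(cell)
    # Sub-output space G: weighted sum of the receptive-field cells.
    g = [sum(w * r for w, r in zip(row, fields)) for row in fnn["W"]]

    # Output space U: u_i = b_i - g_i.
    u = [bi - gi for bi, gi in zip(b, g)]
    return u, b, g

if __name__ == "__main__":
    bel = {"mean": [[0.0, 1.0]], "width": [[1.0, 1.0]], "V": [[2.0, 0.0]]}
    fnn = {"mean": [[[0.0, 1.0]]], "width": [[[1.0, 1.0]]], "W": [[0.5, 0.0]]}
    print(ifbel_forward([0.0], bel, fnn))
```

Keeping the two channels as separate dictionaries mirrors the description above, in which the BEL and the FNN share only the input space and are combined at the output.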

IFBEL-BASED CONTROLLER
The proposed intelligent controller, consisting of a sliding surface, an iFBEL network, and a robust controller, is shown in Figure 3. The iFBEL network and the robust controller collaborate to imitate an ideal sliding mode controller. The updating rules of the BEL mechanism of the iFBEL network follow the brain emotional learning algorithm (Lin and Chung, 2015), and the adaptive laws of the FNN mechanism and the robust controller are derived from the Lyapunov function to ensure robust tracking performance.
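The updating rules are detailed as follows. Substituting Equation 8 into Equation 5 yields the sliding-surface dynamics $\dot{s}(t)$ of Equation 22. Assume that an optimal iFBEL, $u^{*}_{BFC}$, exists that approximates the ideal sliding mode controller, $u_{ISMC}$, and that $\epsilon$ is a minimum error vector; the weight matrices of $u^{*}_{BFC}$ are denoted $V^{*}$ and $W^{*}$ for the BEL and the FNN, respectively. Then, the output of the ideal sliding mode controller is given by Equation 23, where $u_{BEL}$ and $u_{FNN}$ are the outputs of the BEL and the FNN respectively, and $\Phi^{*}$ and $\hat{\Phi}$ are the optimal and the estimated receptive-field basis function vectors, respectively. The output of the proposed iFBEL controller is defined by Equation 24, where $u_{RC}$ is the output of the robust controller, and $\hat{V}$, $\hat{W}$, $\hat{\Phi}$ are the estimates of $V^{*}$, $W^{*}$, $\Phi^{*}$ respectively. Substituting Equations 23 and 24 into Equation 22 gives Equation 25. A partially linear form of the receptive-field basis function vector error $\tilde{\Phi}$ is obtained by Taylor series expansion (Equation 26) in terms of the partial-derivative matrices $\Phi_c$ and $\Phi_v$ with respect to the Gaussian means and variances. Rewriting Equation 26 with $\tilde{\Phi} = \Phi^{*} - \hat{\Phi}$ yields: $\Phi^{*} = \hat{\Phi} + \tilde{\Phi} = \hat{\Phi} + \Phi_c\tilde{c} + \Phi_v\tilde{v} + \beta$.
Substituting Equation 27 into Equation 25 yields Equation 28, where $\omega = W^{*T}\beta + \tilde{W}^{T}(\Phi_c\tilde{c} + \Phi_v\tilde{v}) + \epsilon$ is a combined error of the FNN, and $\tilde{V} = V^{*} - \hat{V} = [\tilde{\nu}_1, \tilde{\nu}_2, \ldots, \tilde{\nu}_m]^T \in \Re^{m \times m n_b}$ is an approximation error weight matrix of the BEL. Consider an $H_\infty$ tracking performance for the existence of $\omega$ and $\tilde{V}$ (Equation 29), where $\eta_W$, $\eta_c$, $\eta_v$ are diagonal positive constant learning-rate matrices, and $\lambda_i$ is an attenuation constant. Set the initial conditions of the system as $s(0) = 0$, $\tilde{W}(0) = 0$, $\tilde{c}(0) = 0$, $\tilde{v}(0) = 0$; then Equation 29 can be re-expressed as Equation 30. To approximate an ideal sliding mode controller, assume that the approximation errors between the proposed iFBEL and an ideal controller are bounded; in other words, $\omega$ and $\tilde{V}$ are square-integrable ($L_2$) over the finite horizons $[0, N_1]$ and $[0, N_2]$, where $N_1$ and $N_2$ are two large positive constants. If $\lambda = \infty$, the minimum error cannot achieve approximation attenuation. If $\lambda < \infty$, the system is stable under the updating rules of Equations 32 to 37. In the updating rule of the BEL, $\alpha$ is a learning-rate constant, and d consists of the input vector p and the output vector $u_{BFC}$ with the learning constants $\gamma$ and $\tau$. In the adaptive laws of the FNN and the robust controller, $R = \mathrm{diag}[\lambda_1, \lambda_2, \ldots, \lambda_m] \in \Re^{m \times m}$ is the diagonal matrix of the robust controller, which makes the proposed system converge together with the update rules for $\hat{W}$, $\hat{c}$ and $\hat{v}$; since $\lambda_i > 0$ for i = 1, 2, ..., m, R is a positive definite matrix.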
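The BEL part of the updating rules can be illustrated with a short sketch. It assumes the form that is common in the BEL literature, in which a weight grows in proportion to its activation and to the positive part of the gap between a reward-like signal `d_i` and the current BEL output `b_i`, with `d_i` built from the input and the controller output. The function name, the exact expression `d_i = gamma * p_i + tau * u_i`, and the default constants are hypothetical and are not taken from this paper.

```python
def bel_weight_update(v_row, phi, p_i, u_i, b_i, alpha=0.01, gamma=0.1, tau=0.1):
    """One assumed BEL-style update of the weights feeding output i.

    v_row : current weights of output i (one per association-memory component)
    phi   : activations of the association-memory components
    p_i   : ith input, u_i : ith controller output, b_i : ith BEL sub-output
    """
    d_i = gamma * p_i + tau * u_i            # reward-like signal (assumed form)
    gain = alpha * max(0.0, d_i - b_i)       # weights only grow while d_i exceeds b_i
    return [v + gain * a for v, a in zip(v_row, phi)]

if __name__ == "__main__":
    print(bel_weight_update([0.0, 0.0], [1.0, 0.5], p_i=1.0, u_i=1.0, b_i=0.0))
```

The `max(0, ...)` term makes this update monotonic, which is one reason the FNN and robust-controller parameters are adapted by separate, Lyapunov-derived laws.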

EXPERIMENTATION
To verify the effectiveness and efficacy of the proposed controller with the new iFBEL, it was applied to two typical humanoid robotic systems: a three-joint robot manipulator and a six-joint biped robot. A comparative study is also included in this section to evaluate the performance of the proposed controller in reference to two important control approaches, a PID controller and an SMC with a fuzzy cerebellar model articulation controller network (FCMAC) (Lin et al., 2009). PID control is a classic control method that linearly combines proportional, integral and derivative control. The FCMAC network converges rapidly, which makes it suitable for robotic control. The effectiveness of the FCMAC-based network controller has been demonstrated in many recent studies, such as Lin et al. (2016) and Zhao and Lin (2017). The experiments on both the three-joint robot manipulator and the six-joint biped robot were simulated in MATLAB R2016a. The development computer used an Intel Core i5-4200U CPU @ 2.30 GHz and ran Windows 10 Professional. The source code of the algorithm is available at the link given in footnote 1.
The parameters of the robust controller and of the iFBEL's Gaussian functions and weights are tuned by using Equations 32 to 37. The learning rate parameters and the iFBEL's network structure are set empirically.
The initial means of the Gaussian functions in the Association Memory Spaces were divided equally over [−1, 1] for the BEL, and over [−2, 2] for the FNN. The initial variances were set as σ ij = 0.1 for the BEL, and σ pq = 0.1 for the FNN, where i = p = 1, 2, 3, and j = q = 1, 2 . . . , 8. The weights of both the BEL and the FNN were initialized as zero and then automatically adjusted during the online training process. In addition, the learning rates were set as follows: η ω = 20, η m = 0.001, η v = 0.001, α = 0.01, b = 0.1, c = 0.1. The parameters of the PID controller in the comparison experiments were set as κ P = 15, κ I = 0.2, κ D = 0.5, where κ P , κ I and κ D are the coefficients of the proportional, integral and derivative terms. The FCMAC controller in the comparison experiments uses the same parameters as the FNN.
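The initialization described above can be sketched as follows for the BEL channel of the manipulator experiment (three inputs, eight blocks per input). The sketch assumes that "divided equally" means evenly spaced means with both interval endpoints included; the helper names and the dictionary layout are hypothetical.

```python
def evenly_spaced(lo, hi, count):
    # Evenly spaced values from lo to hi, endpoints included (assumed interpretation).
    step = (hi - lo) / (count - 1)
    return [lo + step * j for j in range(count)]

def init_channel(n_inputs, n_blocks, lo, hi, width, n_outputs):
    """Initial Gaussian means/widths and zero weights for one channel."""
    means = [evenly_spaced(lo, hi, n_blocks) for _ in range(n_inputs)]
    widths = [[width] * n_blocks for _ in range(n_inputs)]
    weights = [[0.0] * (n_inputs * n_blocks) for _ in range(n_outputs)]
    return {"mean": means, "width": widths, "V": weights}

if __name__ == "__main__":
    # BEL channel of the first experiment: means over [-1, 1], variance parameter 0.1.
    bel = init_channel(n_inputs=3, n_blocks=8, lo=-1.0, hi=1.0, width=0.1, n_outputs=3)
    print(bel["mean"][0])
```

The FNN channel would be initialized in the same manner over [−2, 2]; its weights then move away from zero only through the online adaptation.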
The simulated position responses and the tracking errors at ρ 1 = 1 are shown in Figure 5. To better distinguish these values for the three controllers, Figures 6, 7 show the magnified trajectory responses and tracking errors at t = 0 and t = 15. In Figure 6, the PID controller required 1.4 s, 1.3 s, and 0.05 s for Joints 1, 2, and 3 to converge, respectively, while the FCMAC required 1.2 s, 1.3 s, and 0.05 s for these joints; the proposed iFBEL controller needed only 1.1 s, 1.3 s, and 0.03 s, respectively. In addition, the iFBEL performed the best at t = 15 s.
The accumulated RMSE values at ρ 1 = 1 during the entire experiment are listed in Table 1, which also shows that the proposed iFBEL controller outperformed the others. However, the difference among the three controllers is insignificant. The FCMAC and the PID controllers also generated good control performances in this experiment, because the three-joint manipulator system is not very complicated. The accumulated RMSE values under ρ 1 = 1.5 and ρ 1 = 2 are listed in Tables 2, 3, respectively. As the disturbance increased, the errors of the three controllers also increased. Nevertheless, the iFBEL again achieved the best performance under both disturbance levels, which indicates that the proposed iFBEL can handle larger disturbances well.
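For reference, a per-joint RMSE over a whole run can be computed as in the sketch below. The text does not spell out how the accumulated RMSE is defined, so this is an assumption: the root of the mean squared tracking error over all samples, computed separately for each joint. The function name and data layout are hypothetical.

```python
import math

def rmse_per_joint(reference, actual):
    """reference[t][j] and actual[t][j] hold joint j at sample t."""
    n_samples = len(reference)
    n_joints = len(reference[0])
    result = []
    for j in range(n_joints):
        squared = sum((reference[t][j] - actual[t][j]) ** 2 for t in range(n_samples))
        result.append(math.sqrt(squared / n_samples))
    return result

if __name__ == "__main__":
    ref = [[0.0, 0.0], [0.0, 0.0]]
    act = [[3.0, 1.0], [4.0, 1.0]]
    print(rmse_per_joint(ref, act))
```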

The Biped Robot
The configuration of the six-link biped robot used in this second experiment is illustrated in Figure 8. The experiment reported in the last sub-section was mainly used to validate the proposed system, whereas the experiment reported in this sub-section was primarily used to evaluate the efficiency and efficacy of the proposed control system. The dynamic equation of the robot is expressed in terms of the following quantities: $q \in \Re^{6}$, $\dot{q} \in \Re^{6}$, $\ddot{q} \in \Re^{6}$ are the joint angle state vector, velocity vector and acceleration vector respectively; $M(q) \in \Re^{6 \times 6}$, $C(q, \dot{q}) \in \Re^{6 \times 6}$, $g(q) \in \Re^{6}$ are the inertia matrix, the Coriolis/centripetal matrix and the gravity vector respectively; and $u \in \Re^{6}$ is the output torque. More details for $M(q)$, $C(q, \dot{q})$, $g(q)$ and the nominal parameters of the biped robot can be found in Appendix 1.2. This experiment also considered the single-support phases of a gait cycle. The planning analysis and walking pattern generation are detailed in Appendix 1.3. The generated gait trajectory $q_d = [\theta_1, \theta_2, \ldots, \theta_6]^T$, $\dot{q}_d = 0$, $\ddot{q}_d = 0$ was set as the reference trajectory of the biped robot. The initial angles of each joint were given as $q = [0.37, 0.5, 0.75, -0.15, -0.56, 0.85]^T$, $\dot{q} = 0$, $\ddot{q} = 0$. The external disturbance used in this experiment was $\tau_d = \rho_2 \times [\exp(-0.1t)]_{6 \times 1}$, where ρ 2 = 1 is the amplification coefficient.
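The external disturbance used here is simple enough to state in code. The sketch below evaluates the disturbance vector for a given time and amplification coefficient; the function name is hypothetical, and how the vector enters the joint torques (for example, additively) is not restated here and would be an assumption.

```python
import math

def external_disturbance(t, rho=1.0, n_joints=6):
    # Every joint receives the same decaying term rho * exp(-0.1 * t).
    return [rho * math.exp(-0.1 * t)] * n_joints

if __name__ == "__main__":
    for rho in (1.0, 1.5, 2.0):   # amplification coefficients used in the experiments
        print(rho, external_disturbance(10.0, rho))
```

Because the term decays exponentially, the disturbance is largest at the start of the run, which is also when the network weights are still close to their zero initial values.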
The BEL and the FNN are configured in the same way as in the first experiment reported in section 5.1, but with different initializations. In particular, the initial means of the Gaussian functions in the Association Memory Spaces in this experiment were divided equally over [−1.4, 1.4] for the BEL, and over [−1.6, 1.6] for the FNN. The initial variances were set as σ ij = 0.01 for the BEL and σ pq = 0.5 for the FNN, where i = p = 1, 2 . . . , 6, and j = q = 1, 2 . . . , 8. The weights of both sub-systems were initialized to zero and then automatically adjusted during the online training stage. In this experiment, the learning rates were chosen as η ω = 0.01, η m = 0.001, η v = 0.001, α = 0.01, b = 0.05, and c = 0.01.
The parameters of the PID controller in the second experiment were set as: κ P = 8, κ I = 0.5, κ D = 1.3. The FCMAC controller in the second experiment also uses the same parameters as the FNN.
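As a point of comparison, a single-joint discrete PID baseline with these gains can be sketched as below. This is a generic textbook form, not the comparison implementation used in the experiments: the class name, the rectangular integration, the backward-difference derivative and the sample time `dt` are assumptions.

```python
class PID:
    """Discrete PID: u = kp * e + ki * integral(e) + kd * de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, error):
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0          # no derivative on the first sample
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

if __name__ == "__main__":
    # Gains of the second experiment; dt = 0.01 s is a hypothetical sample time.
    pid = PID(kp=8.0, ki=0.5, kd=1.3, dt=0.01)
    print(pid.step(1.0), pid.step(1.0))
```

For the six-joint biped, one such controller per joint would be driven by that joint's tracking error.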
The simulated position responses and the tracking errors at ρ 2 = 1 produced by the three controllers are illustrated in Figures 9, 10, with the performances of Joints 1, 2 and 3 illustrated in Figure 9 and those of Joints 4, 5, and 6 in Figure 10. The PID controller had a significant convergence delay and therefore showed the worst performance among the three controllers. It is difficult to distinguish the performances of the FCMAC and the iFBEL controllers from these figures, so the trajectories resulting from all the controllers in the range of [−1.4s, 1.4s] are magnified in Figures 11, 12 for easier inspection.
From Figures 11, 12, it is clear that the PID controller could not converge rapidly in any of the joints of the biped robot. The performances of the FCMAC and the iFBEL were very similar for all of the joints; both controllers rapidly reduced the tracking errors. The tracking error amplitudes of the FCMAC controller in Joints 1, 2, 3, and 6 were larger than those of the iFBEL controller, which indicates the superiority of the proposed iFBEL controller.
The accumulated RMSE values are listed in Table 4. It is clear from this table that the convergence time of the iFBEL controller was shorter than those of the PID and the FCMAC for each joint. In this case, the RMSE values also show that the proposed iFBEL controller achieved the best control performance among the three controllers compared in this study. The accumulated RMSE values at ρ 2 = 1.5 and ρ 2 = 2 are also given in Tables 5, 6, respectively. The iFBEL again achieved the best performance under both disturbance levels.

DISCUSSION
A humanoid robot usually consists of multiple joints and is subject to many unexpected disturbances; therefore, the controller of a humanoid robot must possess strong non-linear approximation ability to handle these complex situations.
Based on the results of the two simulations, the proposed iFBEL network demonstrated both rapid convergence and nonlinear mapping capability. In the two simulations, the iFBEL controller consistently reduced the errors fastest; in addition, it achieved the best performance under the different disturbance patterns. Therefore, the proposed network is suitable for the control of humanoid robots.
Although the performance of the iFBEL-based controller was better than those of the FCMAC and PID controllers, the iFBEL network's structure is more complicated than that of the FCMAC. To address this issue, we note that a recurrent mechanism can usually achieve good dynamic performance with a simple network structure. Therefore, in future work, we will improve our method by embedding a recurrent network inside the iFBEL controller.

CONCLUSION
This paper proposed a novel humanoid robot controller, which integrates components from a fuzzy neural network and a brain emotional learning model into a sliding mode controller for dynamic non-linear control. It has been theoretically proven that the proposed system is asymptotically stable, thus guaranteeing convergence. Experimental results and comparative studies further verified this, and demonstrated precise position tracking, more favorable stability, and better performance in reference to the results generated by the conventional PID controller and the FCMAC network controller.
This research can be further improved in several directions. The current iFBEL network does not include any recurrent mechanism, but such a mechanism can generally improve the dynamic performance of a network. Therefore, a future investigation will focus on the development of the recurrent feature to better support the iFBEL controller. In addition, the undesired chattering situation existing in the sliding surface has not been fully investigated; more efforts will focus on this issue. Furthermore, the proposed approach was only practically applied to the dynamic humanoid robot control in this work. It is worthwhile to apply the approach to a wider range of applications to fully discover its potential.

AUTHOR CONTRIBUTIONS
WF contributed to this work by developing the proposed method and preparing the experiments. FC contributed the implementation of the proposed method and writing the manuscript. C-ML conducted the statistical analysis of the experimental results. LY contributed the planning and analysis of the experiments and writing of the manuscript. CS contributed to the design of the proposed method. CZ contributed to the writing of the manuscript.