Robust Adaptive Recurrent Cerebellar Model Neural Network for Non-linear System Based on GPSO

This study presents a robust adaptive recurrent cerebellar model articulation controller (RARC) neural network for non-linear systems, tuned with the genetic particle swarm optimization (GPSO) algorithm. The RARC is used as the principal tracking controller, and a robust compensation controller is designed to recover the residual approximation error. In the RARC neural network, the steepest descent gradient method and a Lyapunov function are used to derive the adaptive laws for the system parameters. The learning rates play an important role in these adaptive laws and strongly affect the performance of the control system. In this paper, the genetic algorithm is combined with a mutation particle swarm optimization algorithm to search for the optimal learning rates of the RARC adaptation laws. Numerical simulations of an inverted pendulum system and a robot manipulator system are given to confirm the effectiveness and practicability of the GPSO-RARC-based control system. Compared with other control schemes, the proposed scheme is shown to be reliable, and it obtains the optimal learning rates and the minimum root mean square error for the non-linear systems considered.


INTRODUCTION
Strictly speaking, almost all practical control systems are non-linear, and there is always a difference between the mathematical model and the practical system. Moreover, the structure and parameters of practical systems are generally unknown or time-varying, and the disturbances acting on them are often random and unmeasurable. Neural networks offer a highly parallel structure, powerful learning ability, continuous non-linear function approximation ability, and fault tolerance, which has greatly promoted their application in non-linear system identification and control (Hunt et al., 1992). Recently, numerous studies have been published on neural network control theory and its engineering applications. For a class of uncertain non-linear systems in strict-feedback form, an adaptive neural network controller was designed using the dynamic surface control technique (Wang and Huang, 2005). For a class of uncertain multiple-input multiple-output (MIMO) non-linear systems with an unknown control coefficient matrix and input non-linearity, a variable structure control method combining an adaptive neural network controller with backstepping and Lyapunov synthesis was proposed (Chen et al., 2010). A control scheme that merges a radial basis function neural network, backstepping, and adaptive control was presented for non-linear systems with input and state delays (Zhu et al., 2008). For a class of second-order non-linear systems, a wavelet adaptive backstepping control system consisting of a neural backstepping controller and a robust controller was designed (Hsu et al., 2006).
Albus first proposed the cerebellar model articulation controller (CMAC) in 1975 (Albus, 1975). It imitates the learning structure of the cerebellum and is a local approximation neural network. The cerebellar model network not only has non-linear approximation, adaptive generalization, and associative memory abilities, but also converges quickly, and it has been widely used in non-linear real-time control systems (Guan et al., 2016). An efficient controller for robot manipulators was proposed based on the structure and local learning characteristics of the CMAC (Commuri et al., 1997). Considering the characteristics of non-linear uncertain models, an adaptive controller built on a supervisory system was presented, composed of a supervisory controller and an adaptive CMAC (Lin and Peng, 2004). Compared with fully connected neural networks, the CMAC neural network (NN) has strong structural advantages and is an effective control method for non-linear systems with unknown dynamics (Commuri and Lewis, 1995). For a class of multi-input multi-output uncertain non-linear systems, a self-organizing CMAC control system was proposed, which combines sliding mode control, compensation control, and the CMAC (Lin and Chen, 2009). In addition, a TSK fuzzy CMAC controller that integrates robust compensation and an adaptive law was proposed to improve the precision of position and speed control for robot manipulators (Guan et al., 2018).
A recurrent neural network (RNN) is an artificial neural network that takes sequence data as input, recurses along the direction of sequence evolution, and links all of its nodes in a chain to form a closed loop. It can therefore exhibit dynamic time-series behavior. Unlike feed-forward neural networks, RNNs have memory and share parameters. These properties make them extremely useful for speech recognition, language modeling, machine translation, and other fields (Sak et al., 2014). A complex fuzzy neural network system, a modified version of the fuzzy neural network, was used to identify and control non-linear dynamic systems (Zhong et al., 2017, 2018; Lam, 2018). A recurrent neuron has an internal feedback loop that can capture the dynamic response of the system and further simplify the network model (Lee and Teng, 2000). The Takagi-Sugeno-Kang fuzzy model was combined with a wavelet neural network to construct a recurrent wavelet fuzzy neural network for identifying and predicting the operation of non-linear dynamic systems (Lin and Chin, 2004). For non-linear uncertain systems, an adaptive recurrent CMAC with sliding mode control was proposed, and its performance was demonstrated on a car-following system and a chaotic system (Lin and Chen, 2006). For the motion control of a linear ultrasonic motor, an adaptive recurrent CMAC based on a variable optimal learning rate and the dynamic gradient descent method was studied (Peng and Lin, 2007).
The particle swarm optimization (PSO) algorithm is an evolutionary algorithm proposed by Kennedy and Eberhart in 1995 (Eberhart and Kennedy, 1995). It is derived from simulating the predation behavior of birds and is an evolutionary computation technique based on swarm intelligence. As a parallel evolutionary optimization algorithm, PSO can deal with a large number of non-linear, non-differentiable, non-continuous, and multi-peak optimization problems, and it has been widely used in engineering and science (Kennedy, 2011). In the 1970s, Professor J. H. Holland first developed the model of the genetic algorithm (GA) (Holland, 1973). It is an effective optimization method based on the principles of genetics and natural selection. Because it is simple, universal, robust, and suited to parallel processing, GA is also very popular in optimal scheduling, computer science, combinatorial optimization, and transportation problems (Holland, 1992). Past improvements to the genetic algorithm have generally addressed the problems of prematurity and convergence. An adaptive genetic algorithm with a dynamic fitness function for multi-objective problems in a dynamic environment was proposed to examine the performance of the algorithm (Bingul, 2007). A new kind of genetic algorithm combined with the concept of the horizontal set was proposed to control the "precocity" of the genetic algorithm (Qinghua et al., 2006). To improve the convergence of the genetic algorithm, an improved crossover operation was proposed, and new measures of population diversity and individual correlation were defined (Cai and Xia, 2006). However, research on the genetic algorithm has not fully considered the situation of the individuals in each generation, which does not match the growth and improvement of individuals during evolution. In the selection and crossover steps of the genetic algorithm, individuals enter the next generation directly, without being improved themselves. In nature, individuals have to grow and adapt to the environment in order to reproduce. In the PSO algorithm, by contrast, the particles are related to one another and imitate this natural behavior, so that each particle can improve and mature during the search (Peng et al., 2008).
There are usually two ways to select appropriate learning rates for the recurrent CMAC (the learning rates of the recurrent gain, the weight, the variance, and the mean). The first is to rely on human expert experience. However, the accuracy of this method is not high, and it is not suitable for complex and uncertain problems. The second is gradient learning (Song et al., 2008; Misra and Saha, 2010). In this paper, a robust adaptive recurrent cerebellar model neural network for non-linear systems based on the GPSO algorithm is investigated in order to avoid trial and error and to mitigate the local-optimum problem. In this system, the optimal learning rates of the controller are calculated by the GPSO algorithm; the adaptive recurrent cerebellar model articulation controller is used as the principal tracking controller; the robust compensation controller is designed to recover the residual approximation error; and the steepest descent gradient method and a Lyapunov function are used to derive the online adaptive laws, so that system stability can be guaranteed. Finally, the proposed GPSO-RARC-based control system is applied to an inverted pendulum system and a robot manipulator system to illustrate its effectiveness. Compared with the existing research reported in the literature, the contribution of this paper has three aspects: (1) the genetic algorithm and the mutation particle swarm optimization algorithm are combined to find the optimal learning rates of the adaptive laws of the robust adaptive recurrent cerebellar model articulation controller and to reduce the training time of the system; (2) the proposed control scheme ensures the stability of the entire system; and (3) the compensation controller can eliminate small disturbances and, when uncertainties are present, handles the lumped uncertainty.

The rest of the paper is structured as follows. After this introduction, the formulation of the non-linear control system is given in section Problem Formulation. In section GPSO-RARC, the GPSO-RARC control system is developed. Section Simulation Results provides the simulation results for the inverted pendulum system and the manipulator system. Finally, section Conclusion draws conclusions from the results.

PROBLEM FORMULATION
The nth-order non-linear system can be denoted as:

$$x^{(n)}(t) = f(x(t)) + g(x(t))\,u(t) + d(t) \tag{1}$$

or by the equivalent state-space form (2), in which $f(x(t)) \in \Re^{m}$ and $g(x(t)) \in \Re^{m \times m}$ are smooth non-linear uncertain functions that are assumed to be bounded, and $g(x(t))$ is assumed to be invertible; $u(t) \in \Re^{m}$ and $y(t) \in \Re^{m}$ are the control input and the system output, respectively; $\mathbf{x}(t) = [x^{T}(t), \dot{x}^{T}(t), \ldots, x^{(n-1)T}(t)]^{T} \in \Re^{mn}$ is the state vector of the system, which is assumed to be measurable; and $d(t) \in \Re^{m}$ is the external disturbance.

The purpose of the control system is to design a controller so that the state $x(t)$ tracks a given reference $x_d(t)$. The tracking error is denoted as $e(t) = x_d(t) - x(t) \in \Re^{m}$, and the tracking error vector of the control system is defined as:

$$E(t) = [e^{T}(t), \dot{e}^{T}(t), \ldots, e^{(n-1)T}(t)]^{T} \in \Re^{mn} \tag{3}$$

If the dynamics and external disturbances of the controlled object are known (i.e., the nominal functions $f(x(t))$, $g(x(t))$ and $d(t)$ are known exactly), the so-called feedback linearization method can be used for the control problem. In this way, an ideal controller can be developed as:

$$u^{*}(t) = g^{-1}(x(t))\left[x_d^{(n)}(t) - f(x(t)) - d(t) + k_1 e^{(n-1)}(t) + \cdots + k_n e(t)\right] \tag{4}$$

Substituting (4) into (1), the error dynamic equation is developed as:

$$e^{(n)}(t) + k_1 e^{(n-1)}(t) + \cdots + k_n e(t) = 0 \tag{5}$$

Suppose the gains $K = [k_n, \ldots, k_1]^{T}$ are chosen so that all roots of the polynomial $h(s) = s^{n} + k_1 s^{n-1} + \cdots + k_n$ lie strictly in the open left half of the complex plane. Then, for any initial conditions, the reference trajectory is tracked asymptotically, i.e., $\lim_{t \to \infty} \|E(t)\| = 0$.

However, the non-linear functions $f(x)$ and $g(x)$ are usually unknown, and the external disturbances are unknown and uncertain. In this case, the control law (4) cannot be implemented in practical applications. In order to make the system output $x(t)$ follow the given reference trajectory $x_d(t)$ effectively, a GPSO-RARC control system is developed in the following sections to achieve better control performance.
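The role of the ideal controller (4) can be illustrated with a minimal Python sketch for a scalar second-order plant whose f, g, and d are known exactly. The plant functions, the initial state, the step size, and the reference x_d(t) = sin t are illustrative assumptions and are not taken from this paper; only the structure of the control law follows the feedback linearization idea described above.

```python
import math

def simulate_ideal_controller(k1=3.0, k2=1.0, dt=1e-3, t_end=20.0):
    """Euler simulation of x'' = f + g*u + d under the feedback-linearizing law.

    The plant below is hypothetical and chosen only for illustration.
    Returns the absolute tracking error at the end of the run.
    """
    f = lambda x, x_dot: -math.sin(x) - 0.5 * x_dot   # assumed known drift
    g = lambda x, x_dot: 2.0 + math.cos(x)            # assumed known gain, never zero
    d = lambda t: 0.1 * math.sin(3.0 * t)             # assumed known disturbance

    x, x_dot, t = 0.5, 0.0, 0.0                       # start away from the reference
    for _ in range(int(t_end / dt)):
        # Reference x_d(t) = sin t and its first two derivatives
        r, r_dot, r_ddot = math.sin(t), math.cos(t), -math.sin(t)
        e, e_dot = r - x, r_dot - x_dot
        # Ideal law for n = 2: u* = g^{-1} [x_d'' - f - d + k1*e' + k2*e]
        u = (r_ddot - f(x, x_dot) - d(t) + k1 * e_dot + k2 * e) / g(x, x_dot)
        x_ddot = f(x, x_dot) + g(x, x_dot) * u + d(t)
        # Explicit Euler step
        x, x_dot = x + dt * x_dot, x_dot + dt * x_ddot
        t += dt
    return abs(math.sin(t) - x)

final_error = simulate_ideal_controller()
```

With k1 = 3 and k2 = 1 the polynomial s^2 + 3s + 1 has both roots in the open left half-plane, so the error decays; the small residual error in the sketch comes from the Euler discretization. As soon as f, g, or d is not known, this law cannot be evaluated, which is the motivation for the approximating controller.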

GPSO-RARC
The structure of GPSO-RARC control system consists of a sliding surface, a robust adaptive recurrent CMAC where its learning rates can be updated using the GPSO algorithm and a robust compensation controller. Figure 1 shows the block diagram of the RARC feedback control system.

Figure 2 shows an RCMAC model, in which T denotes a delay time. The architecture of the RCMAC includes the input space, the association memory space with recurrent weights, the receptive-field space, the weight memory space, and the output space. The following describes the propagation of signals in each space and the basic function of each space.
1) Input space C: the input is $c = [c_1, \cdots, c_i, \cdots, c_{n_i}]^{T} \in \Re^{n_i}$, where $c_i$ is the ith input in layer 1. Based on the specific control space, each input state variable $c_i$ can be quantized into discrete regions (called elements).
2) Association memory space (membership function) A: usually several elements are accumulated into one block, and the number of blocks $n_k$ is usually no less than two. A denotes an association memory space with $n_A$ ($n_A = n_i \times n_k$) components. In this space, each block uses a Gaussian function as its receptive-field basis function, described as:

$$\phi_{ik}(cr_i) = \exp\left[-\frac{(cr_i - m_{ik})^{2}}{v_{ik}^{2}}\right], \quad k = 1, 2, \ldots, n_k \tag{6}$$

in which $\phi_{ik}$ denotes the kth block of the ith input $cr_i$ with mean $m_{ik}$ and variance $v_{ik}$. In general, the input of this block can be described as:

$$cr_i(t) = c_i(t) + r_{ik}\,\phi_{ik}(t - T) \tag{7}$$

in which $r_{ik}$ represents the recurrent gain and $\phi_{ik}(t - T) = \phi_{ikT}$ denotes the value of $\phi_{ik}$ delayed by T. The input of this block therefore includes the memory term $\phi_{ikT}$, which stores previous information about the network and provides a dynamic mapping. This is the main difference between the RCMAC and the traditional CMAC. In the example of Figure 2, the variable $c_1$ is divided into blocks A and B, while the variable $c_2$ is divided into blocks a and b. Shifting each variable by one element yields different blocks. For example, in Figure 2B the blocks C and D for $c_1$ and the blocks c and d for $c_2$ are obtained by shifting by one element. In this space, each block has three adjustable parameters, namely $m_{ik}$, $v_{ik}$, and $r_{ik}$.
3) Receptive-field space (hypercube) H: regions composed of blocks (such as Aa and Bb) are called receptive fields. The kth multidimensional receptive-field function is described as follows:

$$h_k(cr, m_k, v_k) = \prod_{i=1}^{n_i} \phi_{ik}(cr_i) \tag{8}$$

for $i = 1, 2, \ldots, n_i$ and $k = 1, 2, \ldots, n_k$, in which $cr = [cr_1, cr_2, \cdots, cr_{n_i}]^{T} \in \Re^{n_i}$, $m_k = [m_{1k}, m_{2k}, \cdots, m_{n_i k}]^{T} \in \Re^{n_i}$ and $v_k = [v_{1k}, v_{2k}, \cdots, v_{n_i k}]^{T} \in \Re^{n_i}$. The multidimensional receptive-field functions can also be expressed in vector form as in (9), in which $h_{ik}$ is associated with the ith layer and kth block; the field is activated when the input falls in the kth receptive field. At the same time, one or more of the same weights are activated by nearby inputs, and the corresponding blocks produce similar outputs. This correlation, known as local generalization, is a very useful feature of the RCMAC.
4) Weight memory (RCMAC output weight) W: in this space, the parameter $w_{n_i n_k}$ is the weight that parameterizes the RCMAC mapping (it connects to $h_{ik}$), and the weights can be collected as:

$$w = [w_{11}, \ldots, w_{1 n_k}, w_{21}, \ldots, w_{2 n_k}, \ldots, w_{n_i 1}, \ldots, w_{n_i n_k}]^{T} \in \Re^{n_i n_k} = [w_1, \ldots, w_l, \ldots, w_{n_l}]^{T} \in \Re^{n_l} \tag{10}$$

in which $w_l$ is automatically adjusted from its initial value by the online algorithm.

5) Output space Y: the output of the RCMAC is the sum of the activated receptive fields, each multiplied by its corresponding weight, as expressed in (11); the output can also be written in vector form as in (12). Since the recurrent unit of the RCMAC contains the past value of the receptive-field basis function, the network combines dynamic characteristics with a simple structure. If the time delay T = 0, the system reduces to a conventional CMAC. Moreover, the RCMAC reduces to a recurrent neural network when each block carries only one element and each input space has only one layer. Therefore, the RCMAC is a generalization of the recurrent neural network, and it is faster to learn and to recall than the latter.
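The signal flow through the five spaces can be sketched in a few lines of Python. This is a minimal single-output illustration under stated assumptions, not the authors' implementation: the Gaussian block is taken as exp(-(cr - m)^2 / v^2), the delayed activations start at zero, each receptive field is the product of its block activations over the inputs, and the class name `RCMAC` and its argument names are hypothetical.

```python
import math

class RCMAC:
    """Minimal single-output recurrent CMAC forward pass (illustrative sketch)."""

    def __init__(self, means, widths, recurrent_gains, weights):
        # means[i][k], widths[i][k], recurrent_gains[i][k]: input i, block k
        # weights[k]: output weight attached to receptive field k
        self.m, self.v, self.r, self.w = means, widths, recurrent_gains, weights
        n_i, n_k = len(means), len(means[0])
        # phi_prev holds phi_ik(t - T); assumed zero before the first sample
        self.phi_prev = [[0.0] * n_k for _ in range(n_i)]

    def forward(self, c):
        n_i, n_k = len(self.m), len(self.m[0])
        phi = [[0.0] * n_k for _ in range(n_i)]
        for i in range(n_i):
            for k in range(n_k):
                # Recurrent input: current input plus gain times delayed activation
                cr = c[i] + self.r[i][k] * self.phi_prev[i][k]
                # Gaussian receptive-field basis function (assumed form)
                phi[i][k] = math.exp(-((cr - self.m[i][k]) ** 2) / self.v[i][k] ** 2)
        # Receptive field k: product of block activations over all inputs
        h = [math.prod(phi[i][k] for i in range(n_i)) for k in range(n_k)]
        self.phi_prev = phi  # remember activations for the next call
        # Output: weighted sum of the receptive-field activations
        return sum(w_k * h_k for w_k, h_k in zip(self.w, h))
```

Calling `forward` twice with the same input generally gives two different outputs, because the second call sees the stored activations of the first; setting every recurrent gain to zero removes this memory and leaves a static CMAC-like mapping.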

Adaptive Law for RCMAC Control System
The robust adaptive RCMAC control system, shown in Figure 1, includes an adaptive recurrent CMAC and a robust controller, and the output of the system is:

$$u = u_{ARCMAC} + u_R \tag{13}$$

in which $u_{ARCMAC}$ is the output of the developed adaptive RCMAC and $u_R$ is the output of the robust compensation controller. $u_{ARCMAC}$ is the main controller, which is used to approximate the ideal controller in formula (4). The parameters of the RCMAC are adjusted online by the adaptive laws. $u_R$ is the robust controller, which restrains the influence of the residual approximation error between the RCMAC and the ideal controller and guarantees the $L_2$ stability of the control system. A sliding surface $s(t)$ can be defined as follows:

$$s(t) = e^{(n-1)} + K_1 e^{(n-2)} + \cdots + K_{n-1} e + K_n \int_0^{t} e(\tau)\,d\tau \tag{14}$$

in which $s(t) = [s_1(t), s_2(t), \cdots, s_m(t)]^{T} \in \Re^{m}$. Taking the derivative of (14) gives

$$\dot{s}(t) = e^{(n)} + K_1 e^{(n-1)} + \cdots + K_n e \tag{15}$$

into which (1) and (13) are substituted. Then define $L = \frac{1}{2} s^{2}(t)$ as the cost function; its derivative is $\dot{L} = s(t)\dot{s}(t) \le 0$ (16), into which (13) is substituted. According to the steepest gradient descent algorithm, the RCMAC parameters $w_k$, $m_{ik}$, $v_{ik}$ and $r_{ik}$ are updated by the tuning laws (17)-(20), in which the learning rates $\eta_w$, $\eta_m$, $\eta_v$, and $\eta_r$ for $\dot{w}_k$, $\dot{m}_{ik}$, $\dot{v}_{ik}$, and $\dot{r}_{ik}$, respectively, are positive.
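For the second-order case (n = 2), the sliding surface (14) reduces to s = ė + K1 e + K2 ∫e dτ. The sketch below evaluates it sample by sample; the class name `SlidingSurface`, the rectangle-rule integration, and the sample time are assumptions made for illustration only.

```python
class SlidingSurface:
    """Integral sliding surface for n = 2: s = e' + k1*e + k2*integral(e)."""

    def __init__(self, k1, k2, dt):
        self.k1, self.k2, self.dt = k1, k2, dt
        self.integral = 0.0  # running approximation of the integral of e

    def update(self, e, e_dot):
        # Rectangle-rule accumulation of the error integral (assumed scheme)
        self.integral += e * self.dt
        return e_dot + self.k1 * e + self.k2 * self.integral

surface = SlidingSurface(k1=3.0, k2=1.0, dt=0.01)
s_first = surface.update(e=0.1, e_dot=0.0)
s_second = surface.update(e=0.1, e_dot=-0.2)
```

Differentiating this expression gives ṡ = ë + K1 ė + K2 e, so driving s to zero forces the tracking error to obey stable second-order dynamics whenever s^2 + K1 s + K2 has its roots in the open left half-plane.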

Genetic Particle Swarm Optimization (GPSO) Algorithm
The particle swarm optimization (PSO) algorithm is a swarm intelligence algorithm designed by simulating the hunting behavior of birds. PSO moves the individuals in the population toward good regions according to their adaptability to the environment. However, instead of using evolutionary operators, each individual is regarded as a volumeless particle that flies through the D-dimensional search space at a certain speed, which is adjusted dynamically according to its own flight experience and that of its companions. The ith particle is represented as $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$, and the best position (with the best fitness value) it has experienced is $P_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$, also known as $p_{best}$. The index of the best position experienced by all particles in the population is denoted by g, namely $P_g$, also known as $g_{best}$. The velocity of particle i is $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. In each generation, its dth dimension ($1 \le d \le D$) is changed according to the following equations:

$$v_{id} = \omega v_{id} + c_1 r_1 (p_{id} - x_{id}) + c_2 r_2 (p_{gd} - x_{id}) \tag{21}$$

$$x_{id} = x_{id} + v_{id} \tag{22}$$

where $c_1$ and $c_2$ are the learning factors, also called acceleration constants, $\omega$ is the inertia factor, and $r_1$ and $r_2$ are uniform random numbers in the range [0, 1]. The right side of formula (21) consists of three parts. The first is the inertia or momentum part, which reflects the tendency of the particle to maintain its previous speed. The second is the cognition part, which reflects the particle's memory of its own historical experience and represents its tendency to approach its own best position. The third is the social part, which reflects the shared group experience and the cooperation between particles. Particle swarm optimization is simple to calculate and converges quickly, but it lacks mutation ability and is easy to diverge.
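The three-part velocity update and the position update described above can be written as a short Python sketch. The function name `pso_step` and its in-place list representation are illustrative choices; the update itself is the standard inertia-weight PSO rule with inertia, cognition, and social terms.

```python
import random

def pso_step(positions, velocities, p_best, g_best, omega, c1, c2, rng=random):
    """Apply one PSO velocity and position update in place.

    positions, velocities, p_best: lists of per-particle lists (length D each)
    g_best: best position found by the whole swarm (length D)
    """
    for i in range(len(positions)):
        for d in range(len(positions[i])):
            r1, r2 = rng.random(), rng.random()  # uniform random numbers
            velocities[i][d] = (
                omega * velocities[i][d]                        # inertia part
                + c1 * r1 * (p_best[i][d] - positions[i][d])    # cognition part
                + c2 * r2 * (g_best[d] - positions[i][d])       # social part
            )
            positions[i][d] += velocities[i][d]
```

A particle that already sits at both its personal best and the global best with zero velocity does not move, which shows why plain PSO can stall once the swarm has collapsed onto one point.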
Genetic algorithm has strong global search ability and high efficiency, but it is prone to premature convergence and poor local search ability. Therefore, a GPSO algorithm is proposed in this paper, which integrates the crossover and mutation operations of the genetic algorithm into the optimization iteration process of particle swarm, and adopts adaptive crossover and adaptive mutation to enhance the ability of the population to jump out of the local optimal solution.
Firstly, the main parameters of PSO are improved in the GPSO algorithm. A linearly decreasing inertia weight $\omega$ is adopted, so that the algorithm has strong global optimization ability in the early stage of the search and performs a detailed local search in the late stage. The iterative formula is shown in equation (23):

$$\omega = \omega_{start} - (\omega_{start} - \omega_{end})\,\frac{k}{k_{max}} \tag{23}$$

where $\omega_{start}$ is the initial inertia weight, $\omega_{end}$ is the final inertia weight, $k$ is the current iteration number, and $k_{max}$ is the maximum number of iterations.
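A linearly decreasing inertia weight is a one-line schedule. In the sketch below the default values 0.9 and 0.4 are commonly used illustrative numbers and are assumptions, not values reported in this paper.

```python
def inertia_weight(k, k_max, w_start=0.9, w_end=0.4):
    """Linearly decrease the inertia weight from w_start (k = 0) to w_end (k = k_max)."""
    return w_start - (w_start - w_end) * k / k_max

# Weight at the start, the middle, and the end of a 100-iteration run
schedule = [inertia_weight(k, 100) for k in (0, 50, 100)]
```

A large weight early on keeps particles moving across the search space, and the smaller weight later lets them settle around promising regions.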
So that the algorithm has a strong global search ability in the early iterations and converges quickly to the global optimum in the later stage, the learning factors in this paper are varied by asymmetric linear variation, as shown in equations (24) and (25):

$$c_1 = c_{1s} + (c_{1e} - c_{1s})\,\frac{k}{k_{max}} \tag{24}$$

$$c_2 = c_{2s} + (c_{2e} - c_{2s})\,\frac{k}{k_{max}} \tag{25}$$

where $c_{1s}$, $c_{2s}$ and $c_{1e}$, $c_{2e}$ are the initial and final values of the learning factors $c_1$ and $c_2$, respectively.

Secondly, the crossover operation of GA is applied to PSO in the GPSO algorithm. Particles in the population are selected and randomly paired, and the paired particles are then crossed with the selected probability $p_c$. For crossed particles $x_i$ and $x_j$, the calculation is given by equations (26) and (27), where $\alpha_1$ and $\alpha_2$ are two random numbers in the interval [0, 1], and equations (26) and (27) represent the crossover of the positions and of the velocities of the paired particles, respectively.

Then, the mutation operation of GA is applied to PSO in the GPSO algorithm. The best position of each particle is mutated with the selected probability $p_m$. Let $p_i^{d}$ denote the dth-dimensional component of the individual best $p_i$; the mutation of $p_i^{d}$ is carried out by random perturbation. The variable $\beta$ used follows the normal distribution with mean 0 and variance 1, and the mutation formula is given in (28).
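The three operations in this step can be sketched as follows. The exact forms of equations (24)-(28) are not reproduced here, so everything below is an assumption chosen for illustration: a linear schedule for the learning factors, an arithmetic (convex-combination) crossover, and a multiplicative Gaussian perturbation with a hypothetical `scale` parameter for the mutation.

```python
import random

def learning_factor(k, k_max, c_start, c_end):
    """Linear variation of a learning factor from c_start to c_end (assumed form)."""
    return c_start + (c_end - c_start) * k / k_max

def arithmetic_crossover(a, b, alpha):
    """Arithmetic crossover of two vectors (positions or velocities).

    Assumed form: each child is a convex combination of the two parents.
    """
    child_a = [alpha * ai + (1 - alpha) * bi for ai, bi in zip(a, b)]
    child_b = [alpha * bi + (1 - alpha) * ai for ai, bi in zip(a, b)]
    return child_a, child_b

def mutate_best(p_best, p_m, scale=0.5, rng=random):
    """Perturb each component of a personal best with probability p_m.

    beta ~ N(0, 1); the multiplicative form and `scale` are hypothetical.
    """
    return [p * (1 + scale * rng.gauss(0, 1)) if rng.random() < p_m else p
            for p in p_best]
```

In a full GPSO loop, `arithmetic_crossover` would be called once with α1 on the positions and once with α2 on the velocities of each selected pair, and `mutate_best` would be applied to the personal bests before the next velocity update.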
The selection of the crossover probability $p_c$ and the mutation probability $p_m$ is one of the important factors affecting the optimization ability of the algorithm. If $p_c$ is too small, new individuals are generated slowly during the iteration. If $p_c$ is too large, good individuals that have already been generated in the population may be destroyed. If $p_m$ is too small, the ability of the mutation operation to generate new individuals is weakened, which is not conducive to maintaining the diversity of the population. If $p_m$ is too large, the algorithm behaves like a random search. Therefore, this paper proposes an adaptive crossover and adaptive mutation strategy in which $p_c$ and $p_m$ are adjusted automatically according to the evolutionary state of the population. The crossover and mutation probabilities are defined as shown in equations (29) and (30):

FIGURE 4 | Single inverted pendulum system.
where $p_{c1}$, $p_{c2}$, $p_{m1}$, and $p_{m2}$ are constants; $f'$ is the fitness value of the better of the two individuals undergoing crossover; $f$ is the fitness value of the particle undergoing mutation; and $f_{avg}$ is the current average fitness value of the entire population. The formulas show that individuals whose fitness value is lower than the population average have a relatively high probability of crossover and mutation, which ensures population diversity. At the same time, when $f_{max} - f_{avg}$ decreases, the individuals in the population tend to converge to a local optimal solution. In that case, the probability of individual crossover and mutation increases, which enhances the ability of the population to generate new individuals and urges them to jump out of the local optimal solution.
Finally, the fitness function $fitness = \sum_{i=1}^{m} e_i(t)^{2}$ is chosen as the cost function to evaluate the performance of the learning rates in the GPSO algorithm. The flowchart of GPSO is shown in Figure 3.
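The fitness function is a plain sum of squared tracking errors over the m output channels. The sketch below implements it together with a root mean square error helper; the RMSE definition is the standard one and is an assumption about how the reported RMSE values are computed, and both function names are illustrative.

```python
import math

def fitness(errors):
    """Sum of squared tracking errors, one entry per output channel."""
    return sum(e * e for e in errors)

def rmse(error_history):
    """Root mean square of a sequence of scalar tracking errors (standard definition)."""
    return math.sqrt(sum(e * e for e in error_history) / len(error_history))

cost = fitness([3.0, 4.0])          # 3^2 + 4^2
score = rmse([3.0, 4.0])            # sqrt((9 + 16) / 2)
```

Because the fitness is an error measure, a candidate set of learning rates is better when its fitness value is smaller.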

Robust Compensation Control
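An approximation error between the adaptive recurrent CMAC (ARCMAC) and the ideal controller is unavoidable, so the ideal controller can be formulated as the sum of the ARCMAC output and the approximation error:

$$u^{*} = u_{ARCMAC} + \varepsilon \tag{31}$$

Substituting (13) into (1) yields:

$$x^{(n)} = f(x) + g(x)\,(u_{ARCMAC} + u_R) + d(t) \tag{32}$$

Multiplying (4) by $g(x)$ and subtracting (32) yields:

$$\dot{s}(t) = e^{(n)} + K_1 e^{(n-1)} + \cdots + K_n e = g(x)\,(u^{*} - u_{ARCMAC} - u_R) = g(x)\,(\varepsilon - u_R) \tag{33}$$

The robust controller can reduce the influence of the approximation error between the ARCMAC and the ideal controller, thus achieving the $L_2$ tracking performance.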
Assuming that $\varepsilon(t)$ exists and is $L_2$-bounded, consider the specified $L_2$ tracking performance (Chen and Lee, 1996) given in (34), in which $r_i$ is a prescribed positive attenuation constant. The robust controller is designed as in (35), where $R = \mathrm{diag}(r_1, r_2, \cdots, r_m) \in \Re^{m \times m}$ and $I$ is the identity matrix. The following theorem can then be stated and proved.

Theorem I: For the nth-order MIMO non-linear systems described in (1), suppose the RARC control system is designed as in (13), in which $u_{ARCMAC}$ is given by (12) with the online parameter learning algorithms (17)-(20), and the robust controller is designed as in (35). Then the desired $L_2$ tracking performance in (34) can be achieved for the specified attenuation levels $r_i$, $i = 1, 2, \cdots, m$.

FIGURE 10 | Two-link robot manipulator's architecture.
Proof: The Lyapunov function is given in (36). Taking the derivative of the Lyapunov function and using (31), (33) and (35) gives (37). Assuming $\varepsilon_i(t) \in L_2[0, T]$, $\forall T \in [0, \infty)$, and integrating (37) from $t = 0$ to $t = T$ yields (38). Since $V(T) \ge 0$, inequality (39) can be derived from (38); using (36), this inequality is equivalent to (40), and the proof is completed. Moreover, in (24), if $\int_0^{T} \varepsilon_i^{2}(t)\,dt < \infty$ then $\int_0^{T} s_i^{2}(t)\,dt < \infty$ for all T, so the $L_2$ stability of the closed-loop system is guaranteed.

Single Inverted Pendulum System
A single inverted pendulum mounted on a cart is shown in Figure 4, and its dynamic equation is as follows (Mori et al., 1976):

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = f(x) + g(x)\,u \tag{41}$$

where

$$f(x) = \frac{g \sin x_1 - m l x_2^{2} \cos x_1 \sin x_1 / (m_c + m)}{l\left(4/3 - m \cos^{2} x_1 / (m_c + m)\right)}, \qquad g(x) = \frac{\cos x_1 / (m_c + m)}{l\left(4/3 - m \cos^{2} x_1 / (m_c + m)\right)}$$

$x_1$ and $x_2$ are the angle and angular velocity of the pendulum, respectively, $g = 9.8\ \mathrm{m/s^2}$ is the acceleration of gravity, $m_c = 1\ \mathrm{kg}$ is the mass of the cart, $m$ is the mass of the pendulum, $l = 0.5\ \mathrm{m}$ is half the length of the pendulum, and $u$ is the control input.
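The two non-linear functions of the pendulum model translate directly into code. In the sketch below the cart mass, half-length, and gravity defaults follow the values stated above, while the pendulum mass `m` is left as a required argument because its value is not given here; the function name `pendulum_f_g` and the value m = 0.1 used in the usage line are hypothetical.

```python
import math

def pendulum_f_g(x1, x2, m, m_c=1.0, l=0.5, grav=9.8):
    """Return (f(x), g(x)) of the cart-mounted inverted pendulum model.

    x1: pendulum angle [rad], x2: angular velocity [rad/s]
    m: pendulum mass [kg] (must be supplied by the caller)
    """
    total = m_c + m
    # Common denominator l * (4/3 - m cos^2(x1) / (m_c + m))
    denom = l * (4.0 / 3.0 - m * math.cos(x1) ** 2 / total)
    f = (grav * math.sin(x1)
         - m * l * x2 ** 2 * math.cos(x1) * math.sin(x1) / total) / denom
    g = (math.cos(x1) / total) / denom
    return f, g

# Usage with a hypothetical pendulum mass of 0.1 kg, at the upright position
f0, g0 = pendulum_f_g(0.0, 0.0, m=0.1)
```

At the upright equilibrium the drift term f vanishes and only the input gain g remains, so any deviation has to be corrected by the control input u.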
The tracking reference signal is $x_d(t) = 0.1\sin(t)$, the initial conditions of the system are set as $x(0) = \pi/60$ and $\dot{x}(0) = 0$, the robust compensation gain is $R = 0.05$, and the gains are $k_1 = 3$ and $k_2 = 1$, so that the inputs are $s(t) = 3e + \dot{e} + \int_0^{t} e\,dt$ and $\dot{s}(t) = \ddot{e} + 3\dot{e} + e$. With practical application in mind, the off-line training time of the three optimization algorithms is set to 2 s, and the learning rates obtained by off-line training are then taken as the initial values of the learning rate parameters of the system controller. For comparison, the original RCMAC control system, the RARC control system based on the GA algorithm, the RARC control system based on the PSO algorithm, and the RARC control system based on the GPSO algorithm are applied to this single inverted pendulum system. The simulation results are shown in Figures 5-9. The state responses $x(t)$ and $\dot{x}(t)$ of the normal RCMAC and of the RARC based on PSO (RARC_PSO) are shown in Figures 5, 6, respectively, and the state responses of the RARC based on GA (RARC_GA) and the RARC based on GPSO (RARC_GPSO) are plotted in Figures 7, 8, respectively. Moreover, the tracking errors of the various algorithms are depicted in Figure 9, and the root mean square errors (RMSE) are presented in Table 1. The simulation results show that the proposed RARC_GPSO control system achieves favorable control of the single inverted pendulum system and better tracking performance than the others, especially in tracking the angular speed.

Two Link Robot Manipulator System
The controlled object is an n-joint robot manipulator, and its non-linear dynamic equation is (Lewis et al., 2003):

$$M(x)\ddot{x} + V(x, \dot{x})\dot{x} + G(x) + F(\dot{x}) + \tau_d(t) = \tau(t) \tag{42}$$

where $M(x) \in R^{n \times n}$ is the inertia matrix, which is symmetric and positive definite; $V(x, \dot{x}) \in R^{n \times n}$ represents the centrifugal and Coriolis forces; $\dot{M}(x) - 2V(x, \dot{x})$ is a skew-symmetric matrix; $G(x) \in R^{n}$ is the gravity term; $F(\dot{x}) \in R^{n}$ is the friction term; $\tau_d(t) \in R^{n}$ is an unknown external disturbance; $\tau(t) \in R^{n}$ is the joint torque vector applied by the actuators; and $x \in R^{n}$ is the vector of joint variables.
A two-joint system, shown in Figure 10, is presented as follows; a similar design process can be extended to any n-joint system. The system parameters of the two-joint manipulator are:

$$M(x) = \begin{bmatrix} l_2^{2} m_2 + l_1^{2}(m_1 + m_2) + 2 l_1 l_2 m_2 \cos(x_2) & l_2^{2} m_2 + l_1 l_2 m_2 \cos(x_2) \\ l_2^{2} m_2 + l_1 l_2 m_2 \cos(x_2) & l_2^{2} m_2 \end{bmatrix} \tag{43}$$

$$V(x, \dot{x}) = \begin{bmatrix} -l_1 l_2 m_2 \dot{x}_2 \sin(x_2) & -l_1 l_2 m_2 (\dot{x}_1 + \dot{x}_2) \sin(x_2) \\ m_2 l_1 l_2 \dot{x}_1 \sin(x_2) & 0 \end{bmatrix} \tag{44}$$

$$G(x) = \begin{bmatrix} (m_1 + m_2) l_1 g \cos(x_1) + l_2 m_2 g \cos(x_1 + x_2) \\ m_2 l_2 g \cos(x_1 + x_2) \end{bmatrix} \tag{45}$$

where $x_1$, $x_2$, $m_1$, $m_2$ and $l_1$, $l_2$ are the angles, masses, and lengths of joints 1 and 2, respectively, and $g$ is the acceleration of gravity. In addition, the dynamics of the manipulator include the non-linear viscous and dynamic friction term $F(\dot{x})$ and the unknown disturbance $\tau_d$. The initial state of the system is $[x_{1d}, \dot{x}_{1d}, x_{2d}, \dot{x}_{2d}]^{T} = [0.09, 0, -0.09, 0]^{T}$, and the expected trajectories are $x_{1d}(t) = \sin t$ and $x_{2d}(t) = \sin t$. The robust compensation gain is $R = 0.5\,I$, with $k_1 = 3$ and $k_2 = 1$.

To further demonstrate the superiority and robustness of the RARC_GPSO control system, three other neural network control schemes (normal RCMAC NN, RARC_PSO NN, and RARC_GA NN) are used to compare the position and velocity tracking of the manipulator's joints, as shown in Figures 11-13. The control performance of the robust adaptive RCMAC control system based on GPSO for the two-joint manipulator is shown in Figure 14. The performance of RARC_GPSO is clearly better than that of the other three control methods in both velocity tracking and position tracking, and its error converges fastest. Table 2 lists the RMSE of the four controllers designed above, which reconfirms that the robust adaptive RARC_GPSO outperforms the others in robot manipulator control.

CONCLUSION
A robust adaptive RCMAC control system has been proposed for non-linear MIMO systems in this paper. The main contribution of this study is a GPSO-based RCMAC with adaptive laws for updating its parameters, in which the learning rates are optimized by the GPSO algorithm. The control system includes an ARCMAC, which is developed to approximate the ideal controller, and a robust controller, which is designed to compensate for the difference between the ARCMAC and the ideal controller. In this design, the optimal learning rates and the adaptive learning algorithm for the controller parameters are derived, and the $L_2$ stability of the system is proved with a Lyapunov function. Furthermore, the simulation results confirm the effectiveness of the control system. The GPSO-based RCMAC has dynamic characteristics because it uses the past value of the receptive-field basis function in the associative memory space, so it performs well in general motion control and trajectory tracking. If the control scheme is applied to classification, the results are not very good, mainly because the samples in a classification task are not necessarily linked to one another. Future research will draw on the framework of fuzzy theory for the control of non-linear systems (Zhong et al., 2019) and will continue to improve the algorithm to make it more general.