Collision-Free Compliance Control for Redundant Manipulators: An Optimization Case

Force control of manipulators can enhance compliance and execution capabilities, and has become a key issue in robotic control. However, it is challenging for redundant manipulators, especially when there is a risk of collision. In this paper, we propose a collision-free compliance control strategy based on recurrent neural networks (RNNs). Inspired by impedance control, the position-force control task is rebuilt as a reference command of task-space velocities; by combining kinematic properties, the compliance controller is then described as an equality constraint at the joint-velocity level. For collision avoidance, both the robot and the obstacles are approximately described as two sets of key points, and the distances between those points are used to scale the feasible workspace. To save unnecessary energy consumption while reducing the impact of possible collisions, the secondary task is chosen to minimize joint velocities. An RNN with provable convergence is then established to solve the constrained optimization problem in real time. Numerical results validate the effectiveness of the proposed controller.


INTRODUCTION
Industry 4.0 is becoming a label of modern industry, combining traditional manufacturing with an increasingly technological world. As an important executor, the robot manipulator must be more flexible and intelligent to satisfy production requirements that are more personalized and customized (Gonzalez et al., 2018). Among various kinds of robot manipulators, redundant manipulators have become an important branch of robotics due to their flexibility (Zhang, 2015). This flexibility enables robots to fulfill more complicated tasks and has made redundancy a hot topic in the field of robotic control.
With the increasing popularity of robot manipulators, traditional position-control-based applications (such as welding and painting) can hardly meet the needs of complex production tasks (He et al., 2015). For instance, pure position control usually ignores the interaction between the robot and the workpieces, which may lead to high safety risks, since excessive system stiffness can produce unpredictable responses (Cai and Xiang, 2018). Therefore, to enhance the execution ability of the system, precise control of the contact force is required to ensure compliance with the external environment. Accordingly, a series of control methods have been proposed, depending on the robot structure and control signals. By imitating the flexible joints and muscles of animals, compliance units such as series elastic actuators (SEA) and variable stiffness actuators are introduced into robots. In Pan et al. (2018b), a compliance controller is designed for SEA-based systems, and a modified command-filtered back-stepping control strategy (CFBC) based on an adaptive mechanism is proposed to overcome the discontinuous friction and the complexity problem of traditional back-stepping methods. By adjusting the compliance of joint angles, precise control of torque output is realized. As to the interaction between the robot and workpieces, Hogan proposes the basic idea of impedance control, in which the robot and the environment usually behave as an impedance and an admittance, respectively (Hogan, 1985). Generally speaking, the contact force and the relative movement of the robot and workpieces can be described as a combination of mass-spring-damper systems. Therefore, the contact force can be controlled indirectly by designing motion commands.
Another representative approach is hybrid position-force control. The controller is usually designed in the torque loop of the joint space, in which both the contact forces and the movement of the robot are modeled based on dynamic analysis. The controller can then be described as a combination of control efforts that achieve position and force control, respectively (Raibert and Craig, 1981). Similar research can be found in Khatib (1987), Pan et al. (2018a, 2019), and Zhao et al. (2018a,b).
During operation, since manipulators are usually required to keep in touch with the workpieces, the robot may collide with the environment. Besides, the workspace of a robot is also limited (Khatib, 1986). For example, in a production line with multiple manipulators, each robot is located at a fixed position; to avoid interference, the robot's workspace is limited by hardware (fences, barriers, etc.) or software constraints (pre-planned space). In situations such as human-machine collaboration, the robot must not collide with humans. Therefore, it is crucial to avoid obstacles during operation. In existing reports, the desired trajectory is generally obtained by off-line programming, which is limited by programming efficiency. To realize obstacle avoidance in real time, artificial potential field methods are widely used. The basic idea is that the target acts as an attractive pole while the obstacle exerts repulsion on the robot, so that the robot is driven to converge to the target without colliding with obstacles (Wang et al., 2018). In Csiszar et al. (2011), a modified method is proposed that describes the obstacles by different geometrical forms; both theoretical derivation and experimental tests validate the method. Considering the local minimum problem that may be caused by multi-link structures, Badawy (2016) introduces two minima to construct the potential field, such that a dual attraction between links enables faster maneuvers compared with traditional methods. Other improvements to the artificial potential field method can be found in Tsai et al. (2001) and Tsuji et al. (2002).
A series of pseudo-inverse methods are constructed for redundant manipulators in Sciavicco and Siciliano (1988), in which the control effort consists of a minimum-norm particular solution and homogeneous solutions, and collisions can be avoided by calculating an escape velocity as the homogeneous solution. By considering the limited workspace, obstacle avoidance can be described in the form of inequalities, which opens a new way toward real-time collision avoidance. In Zhang and Wang (2004), the robot is regarded as the sum of several links, and the distance between the robot and an obstacle is obtained by calculating distances between points and links. Guo and Zhang (2012) then improve the method by modifying the obstacle avoidance MVN scheme, and simulation results show that the modified control strategy can effectively suppress the discontinuity of angular velocities.
As for the compliance control problem of a robot, the control efforts should be designed according to the desired commands and system characteristics. That is to say, the robot must follow an equality constraint that achieves compliance control while, at the same time, inequality constraints are enforced to avoid obstacles. The control problem therefore involves several constraints, including both equality and inequality constraints. Using the idea of constrained optimization, a control problem with multiple constraints can be handled well. Recently, the application of recurrent neural networks to robotic control has been studied extensively and has shown great efficiency in real-time processing (Wang et al., 2015; Jin et al., 2017; Xu et al., 2019a). In that literature, analysis in dual space and a convex projection are introduced to handle inequality constraints.
Taking advantage of parallel computing, neural networks are used to solve constrained optimization problems in real time. In Li et al. (2017) and Yang et al. (2018b), controllers are established at the joint velocity/acceleration level to solve the kinematic tracking problem for robot manipulators. In Xu et al. (2019b), the tracking problem with model uncertainties is considered, and an adaptive RNN-based controller is proposed for a 6-DOF Jaco 2 robot. Discussions on multiple-robot systems, parallel manipulators, and time-delay systems using RNNs can be found in Zhang et al. (2018), Li et al. (2019), and Xu et al. (2019b).
Based on the previous observations, we propose an RNN-based collision-free compliance control strategy for redundant manipulators. The remainder of this paper is organized as follows. In section 2, the control objective, including position-force control as well as collision avoidance, is pointed out and then rewritten as a QP problem. In section 3, the RNN-based controller is proposed, and the stability of the system is analyzed. In section 4, a number of numerical experiments on a 4-DOF redundant manipulator, including model uncertainties and a narrow workspace, are carried out to further verify the proposed control strategy. Section 5 concludes the paper. The contributions of this paper are summarized below:
• To the best of the authors' knowledge, there is little research on compliance control using recurrent neural networks; the study in this paper is of significance in enriching the theoretical framework of RNNs.
• The proposed controller is capable of handling compliance control as well as avoiding obstacles in real time, which is valuable in industrial applications; besides, physical constraints are also guaranteed.
• Compared with traditional neural-network-based controllers used in robotics, not only control errors but also model information is considered; therefore, the proposed RNN has a simple structure, and global convergence can be ensured.

Robot Kinematics and Impedance Control
Without loss of generality, we consider serial robot manipulators with redundant DOFs, and the joints are assumed to be rotational. Let θ ∈ R^n be the vector of joint angles; the description of the end-effector in Cartesian space is:
x = f(θ), (1)
where x ∈ R^m is the coordinate of the end-effector. At the velocity level, the forward kinematic model can be formulated as:
ẋ = J(θ)θ̇, (2)
in which J(θ) = ∂x/∂θ is the Jacobian matrix. For redundant manipulators, J ∈ R^{m×n}, rank(J) < n.
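As a concrete illustration of the kinematic model x = f(θ) and its Jacobian, the following minimal sketch builds both for a planar serial arm with revolute joints. The link lengths in `LENGTHS` are assumed values chosen only for illustration (they are not the D-H parameters of the robot used later), and the function names are hypothetical.

```python
import numpy as np

def forward_kinematics(theta, lengths):
    """End-effector position x = f(theta) of a planar serial arm."""
    phi = np.cumsum(theta)  # absolute orientation of each link
    return np.array([np.sum(lengths * np.cos(phi)),
                     np.sum(lengths * np.sin(phi))])

def jacobian(theta, lengths):
    """Analytic Jacobian J(theta) = dx/dtheta, shape (2, n)."""
    phi = np.cumsum(theta)
    n = len(theta)
    J = np.zeros((2, n))
    for i in range(n):
        # Joint i rotates every link from i to the tip.
        J[0, i] = -np.sum(lengths[i:] * np.sin(phi[i:]))
        J[1, i] = np.sum(lengths[i:] * np.cos(phi[i:]))
    return J

LENGTHS = np.array([0.3, 0.3, 0.3, 0.3])  # assumed link lengths in meters
theta = np.array([1.57, -0.628, -0.524, -0.524])
J = jacobian(theta, LENGTHS)
print(J.shape, np.linalg.matrix_rank(J))
```

With n = 4 joints and a planar task space (m = 2), the Jacobian has more columns than rows, which is the redundancy exploited in the rest of the paper.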
In industrial applications, position control based operation mode has many limitations: due to the lack of compliance, pure kinematic control methods may cause unexpected consequences, especially when the robot is in contact with the environment. To enhance the compliance and achieve precise control of contact force, according to impedance control technology, the interaction between robot and environment can be described as a damper-spring system, as shown in Figure 1 (Senoo et al., 2017).
where K_p and K_d are interaction coefficients, and Δx = x − x_d is the difference between the actual response x and the desired trajectory x_d. The basic idea of impedance control methods is shown in Equation (2.1). By referring to Equations (2) and (3), an expression for the end-effector velocity ẋ is obtained as Equation (4). When the real values of K_p and K_d are known, the contact force F can be regulated by adjusting the velocity ẋ of the end-effector according to Equation (4).
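The idea of regulating force through velocity can be sketched numerically. The snippet below assumes a scalar spring-damper interaction of the form F = K_p Δx + K_d Δẋ; this specific form, the function names, and the numbers are assumptions for illustration and may differ from the paper's Equations (3) and (4).

```python
def contact_force(x, x_dot, x_d, x_d_dot, K_p, K_d):
    """Assumed spring-damper interaction: F = K_p*(x - x_d) + K_d*(x_dot - x_d_dot)."""
    return K_p * (x - x_d) + K_d * (x_dot - x_d_dot)

def velocity_command(F_target, x, x_d, x_d_dot, K_p, K_d):
    """Velocity that yields F_target under the assumed model (model solved for x_dot)."""
    return x_d_dot + (F_target - K_p * (x - x_d)) / K_d

K_p, K_d = 4000.0, 80.0              # illustrative coefficients
x, x_d, x_d_dot = 0.502, 0.500, 0.01  # illustrative scalar state
v = velocity_command(20.0, x, x_d, x_d_dot, K_p, K_d)
print(v, contact_force(x, v, x_d, x_d_dot, K_p, K_d))
```

Substituting the commanded velocity back into the interaction model returns the target force, which is the sense in which force control becomes a velocity-level tracking task.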

Obstacle Avoidance Scheme
In the process of robot force control, there is a risk of collision, as the robot may come into contact with workpieces. Besides, robot manipulators usually work in a limited workspace restricted by fences, which are used to isolate robots from humans or other robots. This problem can be even more acute in tasks that require the collaboration of multiple robots. Therefore, the obstacle avoidance problem must be taken into consideration. When no collision happens, the distance between the robot and the obstacles stays positive. The robot and the obstacles are described as two separate sets, namely S_A = {A_1, ···, A_a} and S_B = {B_1, ···, B_b}, whose elements are points on the robot and the obstacles, respectively. Then the necessary and sufficient condition for obstacle avoidance is that the intersection of S_A and S_B is an empty set. That is to say, for any point pair A_i on the robot and B_j on the obstacle, the distance between A_i and B_j is always positive, i.e., ||A_i B_j||_2^2 > 0 for all i = 1, ···, a, j = 1, ···, b, where ||·||_2^2 is the Euclidean norm of the vector A_i B_j. For the sake of safety, let d > 0 be a proper value describing the minimum distance between the robot and the obstacles; collisions can then be avoided by ensuring ||A_i B_j||_2^2 ≥ d.
Remark 1. In fact, both S_A and S_B consist of infinitely many points. However, by evenly selecting representative points from the robot links and obstacles, S_A and S_B can be simplified significantly. Besides, the safety distance d can be appropriately increased. Although this treatment sacrifices some workspace of the robot (because a larger d is used, the inequality ||A_i B_j||_2^2 ≥ d excludes some areas in which collisions would not happen), the sacrifice is worthwhile: the number of inequality constraints is greatly reduced, which simplifies constraint description and solution.
In real applications, the key points of the robot manipulator are easy to select. Cylindrical envelopes are usually used to describe the robot links; the key points can then be selected uniformly on the axes of the cylinders, and the distance between adjacent points can be set equal to the radius of the cylinder. As for obstacles with irregular shapes, the key points can be selected based on image processing techniques, such as edge detection and erosion.
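The key-point description can be sketched as follows. The helper names, link geometry, obstacle points, and safety distance are hypothetical values for illustration; the sketch places points on a link axis with spacing no larger than the envelope radius and then checks all point-to-point distances against the safety distance.

```python
import numpy as np

def link_key_points(p_start, p_end, radius):
    """Key points on a link axis, spaced by at most `radius`."""
    p_start = np.asarray(p_start, dtype=float)
    p_end = np.asarray(p_end, dtype=float)
    length = np.linalg.norm(p_end - p_start)
    n_seg = max(1, int(np.ceil(length / radius)))
    s = np.linspace(0.0, 1.0, n_seg + 1)
    return p_start + s[:, None] * (p_end - p_start)

def pairwise_distances(A, B):
    """Matrix D with D[i, j] = Euclidean distance between A[i] and B[j]."""
    diff = A[:, None, :] - B[None, :, :]
    return np.linalg.norm(diff, axis=-1)

A = link_key_points([0.0, 0.0], [0.3, 0.0], radius=0.05)  # points on one link
B = np.array([[0.15, 0.2], [0.4, 0.1]])                   # assumed obstacle key points
D = pairwise_distances(A, B)
d_safe = 0.1                                              # assumed safety distance
print(D.shape, bool(np.all(D >= d_safe)))
```

The number of inequality constraints equals the number of entries of `D`, which is why a coarser sampling combined with a larger safety distance reduces the computational load.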

Problem Reformulation in QP Type
From the above description, the purpose of this paper is to build a collision-free force controller for redundant manipulators that achieves precise force control along a predefined trajectory. A redundant manipulator has redundant DOFs, which can be used to enhance the flexibility of the robot. When the robot gets close to the obstacles, it must avoid them without affecting the contact force and tracking errors. On the other hand, when there is no risk of collision, the robot may work in an economical way: by minimizing the joint velocities, energy consumption can be reduced effectively. Therefore, by defining an objective function as ||θ̇||_2^2, the control objective can be summarized as: where ||θ̇||_2^2 is the Euclidean norm of θ̇. It is noteworthy that in actual industrial applications, the robot is also limited by its own physical structure. For instance, the joint angles are limited to a fixed range, and the upper/lower bounds of joint velocities are also constrained due to actuator saturation. By combining Equation (4), the control objective is rewritten as: with θ^−, θ^+, θ̇^−, θ̇^+ being the lower/upper bounds of joint angles and velocities, respectively. However, the optimization problem is described at different levels, i.e., the joint speed level and the joint angle level, which makes it challenging to solve Equation (6) directly. Therefore, we rewrite this formula at the velocity level. As to the key points A_i on the robot, let x_Ai be the coordinate of A_i in Cartesian space; both x_Ai and ẋ_Ai are available:
x_Ai = f_Ai(θ), (7a)
ẋ_Ai = J_Ai(θ)θ̇, (7b)
where f_Ai(·) is the forward kinematics of point A_i, and J_Ai is the corresponding Jacobian matrix from A_i to joint space. Let us consider the following equality: in which k is a positive constant. It is obvious from the equilibrium point of Equation (8) that (5d) can be readily guaranteed.
Taking the time-derivative of ||A_i B_j||_2^2 yields: where the unit vector from B_j to A_i is defined as (A_i − B_j)^T/||A_i B_j||_2^2, and Ḃ_j is the velocity of key point B_j on the obstacles. By Equations (9) and (6c), the inequality description of the obstacle avoidance strategy is given in Equation (10).
Remark 2. In this part, we have shown the basic idea of the obstacle avoidance scheme at the velocity level, whose equilibrium point is described in Equation (8). Note that the right-hand side of Equation (8) is only a common form to realize obstacle avoidance; generally speaking, the right-hand side of Equation (8) can take other forms. From Equation (10), the value of the response velocity to avoid obstacles is related to two parts: the first part is the difference between the actual and the safety distance, and the other part depends on the movement of the obstacles.
As for the physical constraints of joint angles, according to the escape velocity method, inequalities (6d) and (6e) can be uniformly described as max(α(θ^− − θ), θ̇^−) ≤ θ̇ ≤ min(θ̇^+, α(θ^+ − θ)). So far, the position-force control problem together with the obstacle avoidance strategy at the velocity level is as below: where (11c) is a rewritten inequality considering (6d) and (6e) based on the escape velocity scheme, J_o ∈ R^{ab×n} is the concatenated form of J_Ai considering all pairs between A_i and B_j, and B is the cascading form of the inequality description (10) for all point pairs A_i B_j; i.e., if (11d) holds, obstacle avoidance is achieved. Notably, a larger number of key points helps to describe the obstacle more clearly, but it also increases the computational burden, since the number of inequality constraints increases. Therefore, the spacing of the key points on the obstacle can be selected to be similar to that on the manipulator.
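To make the velocity-level avoidance condition concrete, the sketch below assumes that the distance D between A_i and B_j must satisfy dD/dt ≥ −k(D − d), with dD/dt = u^T(J_Ai θ̇ − Ḃ_j) and u the unit vector from B_j to A_i. This form, the helper name `avoidance_row`, and the numbers are assumptions made for illustration; the paper's Equations (8) to (10) may differ in detail.

```python
import numpy as np

def avoidance_row(A, B, J_A, B_dot, k, d):
    """Return (row, bound) such that row @ theta_dot <= bound encodes
    dD/dt >= -k * (D - d) for one key-point pair (assumed form)."""
    diff = A - B
    dist = np.linalg.norm(diff)
    u = diff / dist                      # unit vector from B to A
    row = -u @ J_A                       # coefficient of theta_dot
    bound = k * (dist - d) - u @ B_dot   # distance margin and obstacle motion
    return row, bound

# Toy case: the key point sits exactly at the safety distance from a static
# obstacle, and J_A is the identity so joint velocity equals point velocity.
A = np.array([0.1, 0.0])
B = np.array([0.0, 0.0])
row, bound = avoidance_row(A, B, np.eye(2), np.zeros(2), k=10.0, d=0.1)

toward = np.array([-1.0, 0.0])   # moves the point toward the obstacle
away = np.array([1.0, 0.0])      # moves the point away from the obstacle
print(row @ toward <= bound, row @ away <= bound)
```

The bound contains the two contributions described in Remark 2: the margin between the actual and the safety distance, and the motion of the obstacle.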
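The unified bound max(α(θ^− − θ), θ̇^−) ≤ θ̇ ≤ min(θ̇^+, α(θ^+ − θ)) translates directly into code. In this minimal sketch the joint limits are assumed values, α = 8 is the gain used in the simulations, and the projection onto the resulting box is an element-wise clip.

```python
import numpy as np

def velocity_bounds(theta, theta_min, theta_max, dtheta_min, dtheta_max, alpha):
    """Joint-velocity box that also keeps joint angles inside their limits."""
    lower = np.maximum(alpha * (theta_min - theta), dtheta_min)
    upper = np.minimum(dtheta_max, alpha * (theta_max - theta))
    return lower, upper

def project(v, lower, upper):
    """Projection onto the box [lower, upper] (element-wise clipping)."""
    return np.clip(v, lower, upper)

theta = np.array([0.0, 1.9])   # second joint is close to its upper limit
lower, upper = velocity_bounds(theta,
                               theta_min=-2.0, theta_max=2.0,   # assumed limits (rad)
                               dtheta_min=-1.0, dtheta_max=1.0, # assumed limits (rad/s)
                               alpha=8.0)
print(lower, upper, project(np.array([2.0, 2.0]), lower, upper))
```

As a joint approaches its angle limit, the term α(θ^+ − θ) shrinks the admissible velocity toward zero, so the angle constraint is enforced purely at the velocity level.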

RNN BASED CONTROLLER DESIGN
In section 2, we transformed the compliance control and obstacle avoidance problem into a constrained optimization problem. However, the QP problem described in Equation (11) contains both equality and inequality constraints; moreover, both Equations (11b) and (11d) are nonlinear, so the problem is difficult to solve directly, especially in real-time industrial applications. Exploiting its parallel computation ability, an RNN is established to solve Equation (11) online, and the stability of the closed-loop system is also discussed.

RNN Design
For the QP problem (Equation 11), the analytical solution can hardly be obtained. We therefore define a Lagrange function as: where λ_1 and λ_2 are the dual state variables. According to the Karush-Kuhn-Tucker (KKT) conditions, the inherent solution of Equation (11) satisfies: where P_Ω(x) = argmin_{y∈Ω} ||y − x|| is a projection operator of θ̇ onto the convex set Ω, and Ω = {θ̇ ∈ R^n | max(α(θ^− − θ), θ̇^−) ≤ θ̇ ≤ min(θ̇^+, α(θ^+ − θ))}. In Equation (13c), the operation function (·)^+ is defined as a mapping to the non-negative space. Equation (13c) can be rewritten as: When J_o θ̇ ≤ B, the inequality (Equation 11d) holds, and λ_2 stays zero. Instead, if the inequality reaches a critical state, λ_2 becomes positive to ensure J_o θ̇ = B. In order to obtain the inherent solution in real time, a recurrent neural network is built as follows: with ε being a positive constant scaling the convergence of Equation (15). The proposed RNN-based algorithm is shown in Algorithm 1. Based on the escape velocity method, the convex set of joint speeds can be obtained from the positive constant α and the physical constraints θ^−, θ^+, θ̇^−, θ̇^+. After initializing the state variables λ_1 and λ_2, the reference velocity is obtained from the desired command and the actual responses according to Equation (4). Then the output of the RNN (which is also the control command) is calculated based on Equation (15a); at the same time, both λ_1 and λ_2 are updated according to Equations (15b) and (15c).
In real applications, the nonlinear system can hardly be approximated completely. Therefore, the approximation error is inevitable, which would influence the performance of the proposed controller. However, the approximation error is a small value of higher order, and its influence can be suppressed by the negative feedback scheme in the outer loop, as shown in Equation (4).
Algorithm 1: Collision-free position-force controller based on RNN.
Input: Positive control gains α, ε, and interaction coefficients K_p, K_d. Initial states q̇(0) = 0, q(0); desired path x_d(t), ẋ_d(t) and operation force F_d(t); task duration T_e; feedback of the end-effector's coordinate x(t) and contact force F; joint angles θ; Jacobian matrix J(θ); information of the obstacles B_j and Ḃ_j, j = 1, ···, b. Location of key points A_i, i = 1, ···, a on the robot, and the corresponding Jacobian matrices J_Ai. Physical limitations θ^−, θ^+, θ̇^−, θ̇^+. Safety distance d.
Output: To achieve position-force control without colliding with obstacles.
1. Initialize the state variables λ_1 and λ_2.
Repeat
  Calculate the reference velocity according to Equation (4).
  Calculate the output θ̇ of the RNN using Equation (15a).
  Update λ_1 by λ̇_1 using Equation (15b).
  Update λ_2 by λ̇_2 using Equation (15c).
Until (t > T_e)
Remark 3. The output dynamics of the proposed RNN are given in Equation (15a), in which the projection operator P_Ω(·) plays an important role in handling the physical constraints (Equation 11c). The updating of θ̇ depends on three parts: the first part, −θ̇/||θ̇||_2^2, is used to optimize the objective function ||θ̇||_2^2; the second term, J^T λ_1, guarantees the equality constraint (Equation 11b) by adjusting the dual state variable λ_1 according to Equation (15b); and the last term, −J_o^T λ_2, ensures the inequality constraint (Equation 11d). The RNN consists of three kinds of nodes, namely θ̇, λ_1, and λ_2, with the number of neurons being n + ab + m.
It is noteworthy that the proposed controller is based on model information such as J, J_o, and K_p, which helps to reduce the computational cost. For the constrained optimization problem (Equation 11), the main challenge is to solve it in real time, since the parameters in the constraints (Equations 11b, 11d) are time varying. From Equation (15), the control effort is obtained from its updating law, which is based on historical data and model information; i.e., it is no longer necessary to solve Equation (11) from scratch at every step, and the computational cost is thus reduced. In the following section, we also show the convergence of the RNN-based controller.
In this paper, we are mainly concerned with the obstacle avoidance problem in force control tasks. Note that the force control is mainly based on impedance control theory, which is similar to existing methods in Huang et al. (2019) and Zhang and Xia (2019). The main challenge of the proposed control scheme lies in the limited sampling ability of the cameras used to capture the obstacles. To handle measurement noise or disturbances, a larger safety distance d can be introduced to ensure the performance of obstacle avoidance.
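The flavor of such a network can be illustrated with a generic projection-type primal-dual dynamic, integrated by explicit Euler steps, for the frozen-time problem min ½||u||² subject to Ju = b, Au ≤ c, and box bounds on u. This is a sketch of the general technique, not a transcription of Equation (15): the dynamics, the names `rnn_step` and `solve_qp_dynamics`, and the toy problem data are assumptions; only ε = 0.02 is taken from the simulation settings.

```python
import numpy as np

def rnn_step(u, lam1, lam2, J, b, A, c, lower, upper, eps, dt):
    """One explicit Euler step of a projection-type primal-dual network."""
    # Primal state: relax toward the projection of J^T lam1 - A^T lam2 onto the box.
    du = (-u + np.clip(J.T @ lam1 - A.T @ lam2, lower, upper)) / eps
    # Equality multiplier: integrates the residual b - J u.
    dlam1 = (b - J @ u) / eps
    # Inequality multiplier: stays at zero while A u < c, grows when violated.
    dlam2 = (-lam2 + np.maximum(lam2 + A @ u - c, 0.0)) / eps
    return u + dt * du, lam1 + dt * dlam1, lam2 + dt * dlam2

def solve_qp_dynamics(J, b, A, c, lower, upper, eps=0.02, dt=1e-3, steps=5000):
    """Integrate the network from zero initial states for a fixed number of steps."""
    u = np.zeros(J.shape[1])
    lam1 = np.zeros(J.shape[0])
    lam2 = np.zeros(A.shape[0])
    for _ in range(steps):
        u, lam1, lam2 = rnn_step(u, lam1, lam2, J, b, A, c, lower, upper, eps, dt)
    return u, lam1, lam2

# Toy problem: minimize ||u||^2 / 2 with u1 + u2 = 1 and u1 <= 0.2.
J = np.array([[1.0, 1.0]])
b = np.array([1.0])
A = np.array([[1.0, 0.0]])
c = np.array([0.2])
u, lam1, lam2 = solve_qp_dynamics(J, b, A, c, lower=-5.0, upper=5.0)
print(u, lam1, lam2)
```

In a controller, the matrices would be refreshed at each sampling instant and the states carried over rather than reset, which is why no optimization has to be solved from scratch at every step.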

Stability Analysis
Lemma 1 (Convergence for a class of neural networks) (Gao, 2003): A dynamic neural network is said to converge to its equilibrium point if it satisfies: where κ > 0 and ϱ > 0 are constant parameters, and P_S = argmin_{y∈S} ||y − x|| is a projection operator onto the closed set S.
Definition 1: For a given continuously differentiable function F(·) with gradient ∇F, if ∇F + ∇F^T is positive semi-definite, F(·) is called a monotone function.
About the stability of the closed-loop system, we offer the following theorem.
Theorem 1: Given the collision-free position-force controller based on a recurrent neural network, the RNN will converge to the inherent solution (optimal solution) of Equation (11), and the stability of the closed-loop system is also ensured.
Proof: Define a vector ξ as ξ = [θ̇; λ_1; λ_2] ∈ R^{n+m+ab}. According to Equation (15), the time derivative of ξ satisfies: in which ε > 0, and F(ξ) = [F_1(ξ), F_2(ξ), F_3(ξ)]^T, where: By calculating the gradient of F(ξ), we have: It is obvious that ∇F(ξ) + ∇F(ξ)^T is positive semi-definite, so, according to Definition 1, F(ξ) is a monotone function. From the description of (17), the projection operator P_S can be formulated as P_S = [P_Ω; P_R; (·)^+], in which P_Ω is defined in (13a), P_R can be regarded as a projection operator of λ_1 onto R^m, with the upper and lower bounds being ±∞, and (·)^+ is a special projection operator onto the closed set R^{ab}_+. Therefore, P_S is a projection operator onto the closed set [Ω; R^m; R^{ab}_+]. Based on Lemma 1, the proposed neural network (15) is stable and globally converges to the optimal solution of (11).
Note that the equality constraint (11b) describes the impedance controller, whose convergence can be found in Na et al. (2015). Similarly, the enforcement of the inequality constraint enables obstacle avoidance during the whole process. The proof is completed.
Remark 4. Note that the original impedance controller described in (11b) is similar to traditional methods in Yang et al. (2018a). The main contribution of the proposed controller is that it realizes not only force control but also obstacle avoidance; besides, the control strategy is capable of handling inequality constraints, including limits on joint angles and velocities.
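The monotonicity condition of Definition 1 can be checked numerically. The sketch below assumes a mapping of the generic primal-dual form F(ξ) = [θ̇ − J^T λ_1 + J_o^T λ_2; Jθ̇ − b; −J_o θ̇ + B] with J and J_o frozen at one instant; this structure and the random test matrices are assumptions for illustration and are not copied from the paper's expression for F.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 4, 2, 3                      # joints, task dimension, number of inequalities
J = rng.standard_normal((m, n))        # stand-in for the task Jacobian
Jo = rng.standard_normal((p, n))       # stand-in for the stacked obstacle rows

# Gradient of the assumed mapping F with respect to xi = [theta_dot; lam1; lam2].
grad_F = np.block([
    [np.eye(n), -J.T, Jo.T],
    [J, np.zeros((m, m)), np.zeros((m, p))],
    [-Jo, np.zeros((p, m)), np.zeros((p, p))],
])

sym = grad_F + grad_F.T                # off-diagonal blocks cancel
eigs = np.linalg.eigvalsh(sym)
print(bool(eigs.min() >= -1e-9), int(np.sum(np.abs(eigs) < 1e-9)))
```

The coupling blocks are skew-symmetric and cancel in ∇F + ∇F^T, leaving only the contribution of the quadratic objective, so the symmetric part has no negative eigenvalues.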

NUMERICAL RESULTS
In this part, we carry out a series of numerical simulations on a planar 4-DOF robot, aiming at verifying the validity of the proposed control scheme. Firstly, a pure force control experiment is done to show the effectiveness of the force controller, and then the control scheme is further verified by examining the system response after introduction of obstacles. Then we check the control performance in more general situations, including model uncertainties and multiple obstacles.

Simulation Settings
First of all, the planar robot used in the simulation is shown in Figure 2. The D-H parameters are also listed in Figure 2B. Note that in force control tasks, the end-effector is required to keep in touch with workpieces, which makes it necessary to distinguish between necessary contact and unnecessary collisions. The proposed controller handles this problem by selecting the key points properly: the end-effector is not considered a key point, so that it can make contact with the obstacles (or the external environment). In order to avoid obstacles, the set of key points of the robot is defined as A_1, ···, A_7, in which A_1, A_3, A_5, and A_7 are located at the centers of the links, and A_2, A_4, and A_6 are located at J2, J3, and J4, as shown in Figure 2A.

Force Control Without Obstacles
First of all, an ideal case in which there are no obstacles in the workspace is considered, and the parameters K_d and K_p are assumed to be known. The robot is required to apply a constant contact force on a given plane. The contact force is set to 20 N, and its direction is aligned with the y-axis of the tool coordinate system. In this example, the y-axis of the tool coordinate system is [1, −1]^T in the base coordinate system. The pre-defined path on the contact plane is x_d = [0.4 + 0.1cos(0.5t), 0.5 + 0.1cos(0.5t)]. The initial state of the robot system is set as θ_0 = [1.57, −0.628, −0.524, −0.524]^T rad, θ̇_0 = [0, 0, 0, 0]^T rad/s. The control gains of the proposed RNN controller are α = 8 and ε = 0.02. Numerical results are shown in Figure 3. The tracking error along the contact plane is given in Figure 3B; the transient lasts about 1 s. At the beginning, since the end-effector is not in contact with the surface, the contact force stays zero until 0.5 s. As the end-effector approaches the surface, the contact force converges to 20 N, showing the convergence of both positional and force errors. The Euclidean norm of the joint velocities (which are also the output of the established RNN) is shown in Figure 3D; ||θ̇|| changes periodically, with the same period as the desired trajectory. The time history of the end-effector's motion trajectory and the corresponding joint configurations are shown in Figure 3A, in which the red arrow indicates the direction of the contact force and the blue arrow shows the direction of the end-effector's free motion. All in all, the proposed controller achieves precise position-force control.
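The geometry of this task can be checked with a short sketch: the desired path moves along the direction [1, 1], which is orthogonal to the force direction [1, −1], so position tracking and force regulation act along complementary axes. The path and force direction come from the settings above; the function names and the error-splitting helper are hypothetical.

```python
import numpy as np

def desired_path(t):
    """Pre-defined path x_d(t) on the contact plane."""
    c = 0.1 * np.cos(0.5 * t)
    return np.array([0.4 + c, 0.5 + c])

def desired_velocity(t):
    """Time derivative of desired_path."""
    s = -0.05 * np.sin(0.5 * t)
    return np.array([s, s])

n_force = np.array([1.0, -1.0]) / np.sqrt(2.0)  # normalized contact-force direction
t_free = np.array([1.0, 1.0]) / np.sqrt(2.0)    # free-motion direction along the plane

def split_error(x, t):
    """Decompose the position error into free-motion and force-direction parts."""
    e = x - desired_path(t)
    return e @ t_free, e @ n_force

print(desired_path(0.0), desired_velocity(1.0) @ n_force)
```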

Force Control With a Single Obstacle
In this section, a stick obstacle is introduced into the workspace, which is defined as x = −0.05 m. The initial states and expected values of x d , F d are the same as section 4.2.
Remark 5. In Equation (10), we have shown the basic idea of calculating the distance between the robot and obstacles: by abstracting key points from the robot and obstacles, the distance between the robot and the obstacle can be described approximately as a set of point-to-point distances. In this example, the distance can be obtained in a simpler way. However, the obstacle avoidance strategy is essentially consistent with Equation (10).
Simulation results are given in Figures 4, 5. The output of the RNN is shown in Figure 4E: when the simulation begins, θ̇ reaches its maximum value, driving the end-effector toward the desired path. The robot then slows down quickly (after t ≈ 0.5 s) and moves smoothly; as a result, the position error converges to 0 and, simultaneously, the contact force converges to 20 N. Notably, at t = 1.2 s, the key point A_2 of the robot gets close to the obstacle, as shown in Figure 4F. Based on the obstacle avoidance strategy (Equation 15c), the state variable λ_2(2) becomes positive, and the output of the RNN then varies with λ_2 (Figure 5B). Correspondingly, an error of about 1 × 10^−3 m occurs in the positional tracking, and likewise in the contact force (the force error is about 2 N). However, the RNN converges to the new equilibrium point (since the equilibrium point changes when the inequality constraint becomes active), and both e_x and e_f converge to 0. Comparing Figures 3A and 4A shows that, after the obstacle is introduced, the robot is capable of adjusting its joint configuration to avoid it. The distances from the key points A_1 to A_7 to the obstacle are shown in Figure 4D; a minimum value of about 0.01 m is ensured during the whole process. Using the impedance model, the force control problem is transformed into a kinematic control problem by modifying the reference speed (Equation 4). Consequently, the resulting trajectory x_r together with x_d is shown in Figures 5D,E. As an important index in the proposed control scheme, the norm of joint speed ||θ̇||_2^2 should be as small as possible. Therefore, we introduce a comparative simulation, in which the solution is obtained from the pseudo-inverse of the Jacobian matrix and physical limitations are not considered. Comparative curves of the objective functions are shown in Figure 5F.
The RNN-based controller can optimize the objective function. Notably, a difference appears at about t = 1.2 to 5 s, which is mainly caused by obstacle avoidance (which is not considered in the JMPI-based method). Since the RNN output θ̇ is used to approximate the reference speed b_0, the approximation error ||Jθ̇ − b_0||_2^2 is shown in Figure 5C, demonstrating the effectiveness of the established RNN.
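For reference, the pseudo-inverse baseline used in the comparison can be sketched as below. It returns the minimum-norm joint velocity that reproduces a reference task velocity and ignores joint limits and obstacles; the Jacobian and reference velocity are toy values, and `jmpi_velocity` is a hypothetical name.

```python
import numpy as np

def jmpi_velocity(J, b0):
    """Minimum-norm joint velocity with J @ dq = b0 (no limits, no obstacles)."""
    return np.linalg.pinv(J) @ b0

J = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])   # toy 2x4 Jacobian
b0 = np.array([0.2, -0.1])             # toy reference task velocity
dq = jmpi_velocity(J, b0)

# Any null-space motion keeps the task velocity but increases the norm.
z = np.array([1.0, 0.0, -1.0, 0.0])    # satisfies J @ z = 0
dq_alt = dq + 0.3 * z
print(dq, np.linalg.norm(dq), np.linalg.norm(dq_alt))
```

Because obstacle avoidance requires exactly such null-space motion, a constrained solution is expected to have a larger velocity norm than this baseline whenever an avoidance constraint is active.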

Force Control With Uncertain Parameters
In this example, we check the performance of the proposed control scheme in the presence of model uncertainties. As in the previous simulations, the initial states of the robot are θ_0 = [1.57, −0.628, −0.524, −0.524]^T rad, θ̇_0 = [0, 0, 0, 0]^T rad/s. In real implementations, the interaction model is usually unknown, and the nominal values of K_d and K_p are not accurate. Without loss of generality, we select the nominal values of K_d and K_p as K̂_d = 80, K̂_p = 4000, respectively. In order to handle model uncertainties in the interaction coefficients, an extra node is introduced into (15). Then the modified RNN can be formulated as: where the positive coefficient K_in scaling the updating rate is defined as K_in = diag(500, 20). Simulation results are shown in Figures 6, 7.
Although the exact values of K_d and K_p are unknown, the closed-loop system is still stable, as shown by the convergence of the tracking error e_x and the contact force F in Figures 6A,B. The time histories of the joint angles and joint velocities are shown in Figures 6C,D, in which the boundedness of joint angles and velocities is guaranteed. The observed interaction coefficients K̂_d and K̂_p are shown in Figure 6E, indicating that both K̂_d and K̂_p converge to their real values. Figure 7A shows the distances between the key points and the obstacle; all key points keep a safe distance from the obstacle (the closest key point is A_2). The Euclidean norm of b_0 − Jθ̇ is illustrated in Figure 7C; although a fluctuation occurs at about t = 1.5 s, the proposed controller handles the model uncertainties. The impedance-model-based reference trajectory and the original desired trajectory are shown in Figures 7D,E. Although x_r and x_d are different, the tracking error e_x along the direction of free motion and the force error e_F converge to zero, as shown in Figures 6A,B. The objective function ||θ̇||_2^2 to be optimized is given in Figure 7F. The convergence of the established RNN is shown in Figure 7C: despite the uncertain parameters, using the adaptive updating law, the established RNN is capable of learning the optimal solution. The spikes are mainly caused by the change of λ_2 when the obstacle avoidance scheme is activated.

Manipulation in Narrow Space
In this part, we discuss a more general motion-force control task in which the workspace is a limited narrow space. The robot is bounded by two parallel lines, namely, y 1 = 0.15 m and y 2 = −0.15 m. Considering the safety distance, all key points except A 8 must satisfy the workspace constraint −0.14 ≤ y ≤ 0.14 m. The initial joint angles are set to be θ 0 =  m, and the tracking error converges to zero, with a transient of about 0.5 s. Simultaneously, the contact force converges to 20 N. In Figure 9A, the minimum distances from the key points to y 1 and y 2 are represented by the blue and red curves, respectively. The tracking trajectory and the corresponding joint configurations are shown in Figure 8A. During t = 1 to 1.5 s and t = 6 to 13 s, point A 2 gets close to y 1 , and during t = 4 to 7 s, A 4 is close to y 2 . Notably, fluctuations appear in the positional and force errors at t = 1 s and t = 4 s (i.e., when A 2 and A 4 approach the bounds), respectively. Similar to the previous simulations, the reference trajectories are given in Figures 7C,D, and the objective functions are shown in Figure 7E. Using the proposed RNN controller, the robot can realize both position and force control in a limited narrow space.
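The band constraint on the key points can be checked directly from the joint angles. The sketch below assumes a planar four-link arm whose eight key points are the midpoint and distal end of each link, so that the last point is the end-effector. This layout, the link lengths of 0.3 m, the test configuration, and the function names are illustrative assumptions. Only the wall positions ±0.15 m and the safe band ±0.14 m are taken from the text.

```python
import numpy as np

Y_UPPER, Y_LOWER = 0.15, -0.15   # walls y1 and y2 (m), from the text
SAFE = 0.14                      # safe band half-width (m), from the text

def key_points(theta, lengths):
    """Midpoint and distal end of each link, ordered from base to tip."""
    cum = np.cumsum(theta)                                   # absolute link angles
    steps = np.stack([lengths * np.cos(cum),
                      lengths * np.sin(cum)], axis=1)        # link vectors
    ends = np.cumsum(steps, axis=0)                          # distal ends
    mids = ends - 0.5 * steps                                # link midpoints
    pts = np.empty((2 * len(theta), 2))
    pts[0::2] = mids
    pts[1::2] = ends
    return pts

def wall_distances(points):
    """Signed distances of every key point to the upper and lower wall."""
    y = points[:, 1]
    return Y_UPPER - y, y - Y_LOWER

def inside_safe_band(points):
    """True if every key point except the last one lies in the safe band."""
    y = points[:-1, 1]
    return bool(np.all((y >= -SAFE) & (y <= SAFE)))

lengths = np.full(4, 0.3)                          # m, assumed
theta = np.array([0.2, -0.4, 0.4, -0.4])           # rad, assumed zigzag pose
pts = key_points(theta, lengths)
d_up, d_low = wall_distances(pts)

print("min distance to y1:", d_up.min())
print("min distance to y2:", d_low.min())
print("inside safe band  :", inside_safe_band(pts))
```

The minima of the two distance arrays are the kind of quantity plotted as the blue and red curves; in a velocity-level controller, the same distances would be used to bound the admissible key-point velocities toward each wall.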

Comparisons
In this part, the proposed control scheme is compared with existing methods to show the advantages of the RNN-based strategy. The comparisons are shown in Table 1. In Guo and Zhang (2012), an RNN-based controller is designed for redundant manipulators, and both obstacle avoidance and physical constraints are considered. However, that controller only addresses the kinematic control problem. In Nanayakkara et al. (2001) and Csiszar et al. (2011), force control and obstacle avoidance are both taken into account, but the physical constraints are ignored. Xu et al. (2019a) develop an adaptive admittance control strategy that is capable of dealing with force control under model uncertainties, physical constraints, and real-time optimization. Notably, the proposed strategy focuses on real-time obstacle avoidance in force control tasks, and the controller is capable of ensuring the boundedness of joint angles and velocities. In addition, the simulations indicate that the controller can also minimize the norm of the joint velocities.

CONCLUSIONS
In this paper, a novel collision-free compliance controller is constructed based on quadratic programming (QP) and neural networks. Unlike existing methods, the control problem is described from an optimization perspective, and compliance control and collision avoidance are formulated as equality or inequality constraints. Physical constraints such as limits on joint angles and velocities are also taken into consideration. It is worth pointing out that this is the first RNN-based compliance control method that considers the collision avoidance problem in real time, and it also shows great potential in handling physical limitations. Simple numerical simulations in MATLAB are carried out to verify the effectiveness of the proposed controller. In the future, we will test the control framework with different impedance models in physically realistic simulation environments, and then consider machine vision technology and the system delay problem on physical experimental platforms.

DATA AVAILABILITY
All datasets analyzed for this study are included in the manuscript and the supplementary files.