
Edited by: Shuai Li, Swansea University, United Kingdom

Reviewed by: Tomas Kulvicius, University of Göttingen, Germany; Luca Patanè, University of Catania, Italy

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Autonomous trajectory and torque profile synthesis through modulation and generalization requires a database of motion with accompanying dynamics, which is typically difficult and time-consuming to obtain. Inspired by adaptive control strategies, this paper presents a novel method for learning and synthesizing Periodic Compliant Movement Primitives (P-CMPs). P-CMPs combine periodic trajectories encoded as Periodic Dynamic Movement Primitives (P-DMPs) with accompanying task-specific Periodic Torque Primitives (P-TPs). The state-of-the-art approach requires learning TPs for each variation of the task, e.g., modulation of frequency. In contrast, in this paper we propose a novel P-TPs framework, which is both frequency- and phase-dependent. As a result, the executed P-CMPs can be easily modulated, and consequently the learning rate can be improved. Moreover, both the kinematic and the dynamic profiles are parameterized, thus enabling the representation of skills using corresponding parameters. The proposed framework was evaluated on two robot systems, i.e., Kuka LWR-4 and Franka Emika Panda. The evaluation on a Kuka LWR-4 robot performing a swinging motion and on a Franka Emika Panda robot performing an exercise for elbow rehabilitation shows fast P-CMPs acquisition and accurate and compliant motion in real-world scenarios.

Programming by demonstration (PbD) is a typical approach for transferring skills to robots by mirroring human actions (Billard et al.,

Knowing the exact dynamical model is crucial to achieving compliant robot behavior, which is needed when robots operate in an unstructured environment. Hence, exact dynamical models of both the robot and the task make it possible to either adjust the controller feedback gains to obtain the desired compliance or to prescribe the desired dynamic behavior (Buchli et al.,

Learning of task-specific dynamics was proposed in Deniša et al. (

The main contribution of this paper is a two-layered system that combines Phase-synchronized Adaptive Fourier Series (P-AFO) with Periodic Compliant Movement Primitives (P-CMPs). The P-AFO is an incremental improvement of AFO proposed in Petrič et al. (

This paper is organized as follows. In the next section, we describe related work detailing the topics of learning of robot torque profiles and their modulation and generalization. In section 3 we describe the main contributions of this paper, i.e., unambiguous phase synchronization (P-AFO), periodic torque primitives (P-TPs), and the integration of feedback error learning. Results of experimental evaluation on a Kuka LWR-4 robot arm learning to perform a swinging task and evaluation on Franka Emika Panda robot learning to rehabilitate the elbow by a stretching task are presented in section 4. A discussion concludes the paper in section 5.

For accurate and compliant execution of tasks, the task-space dynamics is required (Del Prete and Mansard,

Feed-forward torque was also exploited when joint torque measurements were available. For example, in Calandra et al. (

Other approaches for torque learning not directly related to CMPs were also proposed. Gaussian process regression for on-line learning of the dynamical model was proposed in Nguyen-Tuong and Peters (

Trajectory modulation and generalization is a wide topic that can be considered from different domains of application. Most methods for modulation and generalization have focused on the kinematic trajectory, and only a few have dealt with dynamics. The ability to modulate and generalize both kinematic and dynamic parameters is particularly important for the P-CMPs framework proposed in this paper. The kinematic part of P-CMPs is encoded with P-DMPs, which already allow a certain degree of modulation and generalization. In Gams et al. (

The DMPs are not the only trajectory representation method or even the only dynamical systems used for modulation and generalization. However, because our proposed approach in this paper is composed also of DMPs, other possible alternatives are only briefly listed below. The task-specific Gaussian Mixture Models (TP-GMM) were proposed by Khansari-Zadeh and Billard (

The generalization of both kinematic trajectories and torque profiles has been reported with the aforementioned CMPs in Deniša et al. (

The inspiration for the P-CMPs multi-layered framework has been taken from the two-layered imitation system reported in Gams et al. (

The multi-layered structure of the control system based on P-CMPs. The input _{f}. Note that the system can work in parallel for an arbitrary number of dimensions.

Periodic Compliant Movement Primitives (P-CMPs)

Here Ω and ϕ are the desired motion frequency and phase, respectively. p̈_{d}(Ω, ϕ), ṗ_{d}(Ω, ϕ), and p_{d}(Ω, ϕ) are the desired acceleration, velocity, and position trajectories, respectively, encoded within P-DMPs. τ_{f}(Ω, ϕ) are the corresponding joint torques encoded in P-TPs.

Similar to the discrete CMPs, the two-stage process is used to obtain the P-CMPs. First, the kinematic motion trajectories are obtained typically by imitation learning (Gams et al.,

The adaptive phase oscillator with the adaptive Fourier series was originally proposed in Petrič et al. (

Here Ω is the estimated motion frequency, κ is the coupling strength, ϕ is the corresponding phase and ϵ is governed by

where

where _{i} and _{j} are updated as in Petrič et al. (

where η is the parameter update rate. By skipping the first parameter of the sinusoidal part of the Fourier series, i.e.,
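The effect of skipping the first sine coefficient can be illustrated with a minimal Python sketch. It assumes the commonly used form of an adaptive oscillator with a Fourier-series feedback loop, in which phase and frequency are driven by the coupling term −κ e sin ϕ and the coefficients are adapted with rate η. The class name, the Euler integration, and the default values are hypothetical choices made for illustration only. With skip_first_sine set to True, the first sine coefficient stays at zero, so the fundamental component can only be represented by a cosine term, which presumably is what removes the ambiguity of the extracted phase.

```python
import math


class PhaseSyncOscillator:
    """Illustrative adaptive oscillator with Fourier-series feedback (assumed form)."""

    def __init__(self, n_harmonics=5, omega0=1.0, kappa=20.0, eta=2.0,
                 skip_first_sine=True):
        self.omega = omega0          # estimated frequency (rad/s)
        self.phi = 0.0               # estimated phase (rad)
        self.kappa = kappa           # coupling strength
        self.eta = eta               # coefficient update rate
        self.skip_first_sine = skip_first_sine
        self.alpha = [0.0] * (n_harmonics + 1)  # cosine coefficients (index 0 = offset)
        self.beta = [0.0] * (n_harmonics + 1)   # sine coefficients

    def estimate(self):
        """Fourier-series approximation of the input at the current phase."""
        return sum(a * math.cos(c * self.phi) + b * math.sin(c * self.phi)
                   for c, (a, b) in enumerate(zip(self.alpha, self.beta)))

    def step(self, y, dt):
        """Advance the oscillator by one Euler step for input sample y."""
        e = y - self.estimate()      # error between input and its approximation
        phi = self.phi
        for c in range(len(self.alpha)):
            self.alpha[c] += dt * self.eta * math.cos(c * phi) * e
            if c == 1 and self.skip_first_sine:
                continue             # first sine coefficient is never adapted
            self.beta[c] += dt * self.eta * math.sin(c * phi) * e
        coupling = -self.kappa * e * math.sin(phi)
        self.phi = (phi + dt * (self.omega + coupling)) % (2.0 * math.pi)
        self.omega += dt * coupling
        return e


if __name__ == "__main__":
    osc = PhaseSyncOscillator(n_harmonics=3, omega0=5.0)
    dt = 0.001
    for k in range(20000):
        osc.step(math.sin(2.0 * math.pi * k * dt), dt)
    print("estimated frequency:", osc.omega, "phase:", osc.phi)
```

Setting skip_first_sine to False gives the unmodified oscillator under the same assumptions, which makes it easy to compare the extracted phase for different initial conditions.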

The second layer ensures the proper waveform of the kinematic trajectories. It is encoded by P-DMPs, which are anchored to the phase signal ϕ of the adaptive oscillator as in Petrič et al. (

where α_{z} and β_{z} are positive constants, which guarantee that the system monotonically converges,

Here

where the kernel centers are distributed over the phase. Typically they are spread equally between 0 and 2π.
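A short Python sketch may help to make the role of the phase-anchored kernels concrete. It assumes the standard periodic DMP formulation, i.e., kernels of the form exp(h (cos(ϕ − c) − 1)), a normalized weighted sum as the forcing term, and a second-order system scaled by the frequency Ω. The function names, the default gains, and the Euler step are hypothetical and chosen only for illustration.

```python
import math


def kernel_centers(n):
    """Centers spread equally over one period [0, 2*pi)."""
    return [2.0 * math.pi * i / n for i in range(n)]


def periodic_kernels(phi, centers, width):
    """Periodic kernel activations; each kernel peaks (value 1) at its center."""
    return [math.exp(width * (math.cos(phi - c) - 1.0)) for c in centers]


def forcing_term(phi, weights, centers, width, amplitude=1.0):
    """Normalized weighted sum of kernel activations at phase phi."""
    psi = periodic_kernels(phi, centers, width)
    return amplitude * sum(w * p for w, p in zip(weights, psi)) / sum(psi)


def pdmp_step(y, z, phi, omega, weights, centers, width, goal=0.0,
              alpha_z=8.0, beta_z=2.0, dt=0.001):
    """One Euler step of the assumed periodic DMP second-order system."""
    f = forcing_term(phi, weights, centers, width)
    dz = omega * (alpha_z * (beta_z * (goal - y) - z) + f)
    dy = omega * z
    return y + dt * dy, z + dt * dz


if __name__ == "__main__":
    centers = kernel_centers(25)
    weights = [math.sin(c) for c in centers]  # arbitrary example weights
    y, z = 0.0, 0.0
    omega, dt = 2.0 * math.pi, 0.001
    for k in range(2000):
        phi = (omega * k * dt) % (2.0 * math.pi)
        y, z = pdmp_step(y, z, phi, omega, weights, centers, 25.0, dt=dt)
    print("position after two periods:", y)
```

Because the kernels depend only on the phase, changing Ω speeds up or slows down the same waveform without changing the weights.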

To learn the shape of the trajectory, different methods have been proposed. When data is available upfront, a batch regression can be used as in Ude et al. (

Here, given the triplet of desired position, velocity, and acceleration (y_{d}, ẏ_{d}, ÿ_{d}), the weights w_{i} of the kernel functions ψ_{i} are updated using the following recursive least-squares method.

The regression typically starts with _{i} = 0 and _{i} = 0. Note that _{i} is the inverse covariance. λ is the forgetting factor.
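The recursive update can be sketched in Python as follows. The sketch assumes the locally weighted recursive least-squares form commonly used with periodic DMPs, with one scalar P per kernel and forgetting factor λ. The function name, the amplitude argument r, and the numerical guard are hypothetical. The example initializes P to 1, which is an assumption made here because, in this particular update form, a zero P would never change the weights.

```python
import math


def rls_update(weights, P, psi, f_target, r=1.0, lam=0.999):
    """In-place recursive least-squares update of all kernel weights.

    weights, P : per-kernel weight and inverse-covariance-like term
    psi        : kernel activations at the current phase
    f_target   : target value of the forcing term at the current phase
    lam        : forgetting factor (1.0 means no forgetting)
    """
    for i, psi_i in enumerate(psi):
        if psi_i < 1e-12:
            continue  # guard against division by a vanishing activation
        P[i] = (P[i] - (P[i] ** 2 * r ** 2) / (lam / psi_i + P[i] * r ** 2)) / lam
        err = f_target - weights[i] * r
        weights[i] += psi_i * P[i] * r * err


if __name__ == "__main__":
    n = 25
    width = 2.5 * n
    centers = [2.0 * math.pi * i / n for i in range(n)]
    w, P = [0.0] * n, [1.0] * n   # assumed initial values
    steps = 500
    for k in range(5 * steps):    # five periods of a sine-shaped target
        phi = 2.0 * math.pi * (k % steps) / steps
        psi = [math.exp(width * (math.cos(phi - c) - 1.0)) for c in centers]
        rls_update(w, P, psi, math.sin(phi))
    phi = 1.0
    psi = [math.exp(width * (math.cos(phi - c) - 1.0)) for c in centers]
    approx = sum(wi * p for wi, p in zip(w, psi)) / sum(psi)
    print("target:", math.sin(phi), "approximation:", approx)
```

Each sample only touches kernels that are active at the current phase, which is what makes the update suitable for on-line use.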

Essentially, the combination of P-AFO and P-DMP ensures robustness against perturbations and allows frequency modulation of the trajectory. Frequency modulation is especially crucial when performing human-robot cooperative tasks.

The third layer encodes the corresponding torque trajectories τ_{f}(Ω, ϕ) and is denoted by P-TPs. Note that torques are task-specific, which means they depend on the dynamic properties of the task, including the execution speed, e.g., frequency. Therefore, we propose that P-TPs τ_{f}(Ω, ϕ) are both phase (ϕ) and frequency (Ω) dependent. They are governed by

where ν is a

Here, ^{ϕ} are the width of the kernel and ^{Ω} are the kernels width and
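One possible reading of a torque primitive that depends on both phase and frequency is a weight matrix indexed by phase kernels and frequency kernels. The Python sketch below illustrates this idea under explicit assumptions: periodic kernels over the phase, Gaussian kernels over the frequency, and a normalized double sum. The exact kernel shapes, the function names, and the parameter names are hypothetical and may differ from the formulation used in the paper.

```python
import math


def phase_kernels(phi, centers, width):
    """Periodic kernel activations over the phase (assumed shape)."""
    return [math.exp(width * (math.cos(phi - c) - 1.0)) for c in centers]


def frequency_kernels(omega, centers, width):
    """Gaussian kernel activations over the frequency (assumed shape)."""
    return [math.exp(-width * (omega - c) ** 2) for c in centers]


def torque_primitive(omega, phi, W, phi_centers, omega_centers, h_phi, h_omega):
    """Feedforward torque as a normalized double sum over the weight matrix W.

    W[i][j] is the weight of phase kernel i and frequency kernel j.
    Returns 0.0 if no kernel is active (e.g., frequency far from all centers).
    """
    psi_phi = phase_kernels(phi, phi_centers, h_phi)
    psi_om = frequency_kernels(omega, omega_centers, h_omega)
    num, den = 0.0, 0.0
    for i, pp in enumerate(psi_phi):
        for j, po in enumerate(psi_om):
            activation = pp * po
            num += W[i][j] * activation
            den += activation
    return num / den if den > 0.0 else 0.0


if __name__ == "__main__":
    phi_centers = [2.0 * math.pi * i / 8 for i in range(8)]
    omega_centers = [1.0, 2.0, 3.0]
    # Example weights: torque amplitude grows with the frequency kernel index.
    W = [[(j + 1) * math.sin(c) for j in range(3)] for c in phi_centers]
    for om in (1.0, 2.0, 3.0):
        tau = torque_primitive(om, math.pi / 2, W, phi_centers, omega_centers, 20.0, 5.0)
        print("omega =", om, "torque =", tau)
```

With such a structure, a change of Ω during execution blends between neighboring frequency kernels instead of requiring a separately learned primitive per frequency.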

The P-TPs are learned on-line while executing the encoded DMP motion with low-gain impedance control using the following law

Here, p_{d}, ṗ_{d}, and p̈_{d} are the desired position, velocity, and acceleration, and the gains with subscripts p, d, and i are constants selected to ensure that the robot behaves compliantly, i.e., they are set to match the low-impedance control requirements.

To learn task-specific torque profiles, we used the feedback error learning approach (Nakanishi and Schaal,

where ι is a positive constant determining the rate of learning. Note that stability analysis was given in Nakanishi and Schaal (
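The principle of feedback error learning can be illustrated with a toy Python simulation. The sketch uses a one-degree-of-freedom damped mass instead of a robot, a low-gain PD feedback term, and a phase-dependent feedforward torque whose weights are adapted in proportion to the feedback torque with rate ι. The plant, the gains, the single fixed frequency, and all names are hypothetical simplifications for illustration and do not reproduce the experimental setup.

```python
import math


def run_fel(periods=20, dt=0.002, n_kernels=15, iota=10.0,
            kp=4.0, kd=3.0, mass=1.0, damping=0.5, omega=math.pi):
    """Simulate feedback error learning and return the RMS tracking error per period."""
    centers = [2.0 * math.pi * i / n_kernels for i in range(n_kernels)]
    width = 2.5 * n_kernels
    w = [0.0] * n_kernels                 # feedforward torque weights
    p, dp = 0.0, 0.0                      # plant position and velocity
    steps_per_period = int(round(2.0 * math.pi / (omega * dt)))
    rms = []
    for _ in range(periods):
        sq = 0.0
        for k in range(steps_per_period):
            phi = omega * k * dt
            p_d, dp_d = math.sin(phi), omega * math.cos(phi)
            psi = [math.exp(width * (math.cos(phi - c) - 1.0)) for c in centers]
            total = sum(psi)
            tau_ff = sum(wi * s for wi, s in zip(w, psi)) / total
            tau_fb = kp * (p_d - p) + kd * (dp_d - dp)   # low-gain feedback
            # Feedback error learning: the feedback torque is the learning signal.
            for i, s in enumerate(psi):
                w[i] += dt * iota * (s / total) * tau_fb
            # Toy plant: mass * ddp + damping * dp = total torque.
            ddp = (tau_fb + tau_ff - damping * dp) / mass
            dp += dt * ddp
            p += dt * dp
            sq += (p_d - p) ** 2
        rms.append(math.sqrt(sq / steps_per_period))
    return rms


if __name__ == "__main__":
    errors = run_fel()
    print("RMS error, first period:", errors[0])
    print("RMS error, last period:", errors[-1])
```

As the feedforward term takes over, the feedback torque shrinks, so the feedback gains can remain low without sacrificing tracking accuracy.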

Because the torques are updated on-line, the task performance, i.e., tracking accuracy, improves over time even if the feedback gains are low. The main idea of the proposed P-CMPs framework is to ensure the nominal behavior of the robot for the given periodic task even if compliant robot control, i.e., low feedback gains, is used. In this way, we can ensure both good tracking accuracy and compliant behavior. This improves safety for robots working in an unstructured environment or with humans.

In this section we describe the simulations used to compare the P-AFO phase and frequency synchronization performance with the original AFO (Petrič et al.,

In this numerical simulation example, we compare the phase and frequency synchronization abilities of the original AFO system with the proposed P-AFO system. Note that in both cases the adaptation is done without any signal processing, since the entire process of frequency and phase synchronization is completely embedded in the dynamics of the oscillator. In the following example, we used the following parameters for both AFO and P-AFO: κ = 20, μ = 2, _{i}(0) = _{j}(0) = 0.5. The input

Frequency and phase adaptation results are illustrated in

Typical convergence of the AFO and P-AFO systems driven by a sinusoidal periodic signal. The top plot shows the comparison between the input signal and the approximation of the system. The middle plot shows the phase synchronization and the bottom plot shows the frequency adaptation.

_{i} and _{j}. The results show that the phase synchronization of the original AFO with respect to the input signal is not repeatable. Note that if we change the initial parameters or the start of the input signal, the phase shift between the input signal and the extracted phase of AFO will be different. Extracting the exact phase of the input signal is crucial for the P-CMPs. In the middle plot of

Typical convergence of the AFO and P-AFO systems driven by a periodic signal with different initial conditions. The top plots show the comparison between the input signal and the approximation of the system, and the middle plots show the phase synchronization.

To illustrate the ability to learn the internal dynamical model, we implemented the P-CMPs approach on a real Kuka LWR-4 robot. In this example, the goal was to learn the corresponding dynamical model in P-TPs using the approach proposed in section 3. The kinematic trajectory for this task was predefined for all 7 degrees of freedom and is shown on the left-hand side of

Learning of internal dynamical models for different motion frequencies on the 7-degrees-of-freedom Kuka LWR-4 robot. The left plot shows the desired kinematic motion as a function of the phase parameter ϕ, and the right plot shows the sum of squared motion tracking errors during the learning process.

With the proposed P-CMPs system, the tracking error decreases rapidly, which indicates that the internal dynamical model is learned quickly and successfully. In the left plot in

Top and middle plots show example joint and torque trajectories, respectively (Ω = 2π example). The bottom plot shows the sum of squared motion tracking errors during the learning process.

The kinematic motion improvements and the evolution of the corresponding internal dynamical models, i.e., torque profiles, are shown for the Ω = 2π example in

In

Difference between the AFO and P-AFO systems, both used with P-CMPs. The top plots show the desired and actual joint movements when using previously learned P-CMPs from the example in

In the last example, the proposed P-CMPs method was demonstrated on a task where the robot was holding a human hand model with the simulated elbow joint as shown in

Experimental setup for physically simulated human elbow stretching tasks.

Instead, we can use the proposed P-CMPs approach to learn appropriate task-specific torques for a given kinematic trajectory. This task could be performed with the original CMPs system combined with statistical generalization. However, this would be less effective, since it would require learning the CMPs at specific frequencies to build the database. In contrast, the proposed P-CMPs framework allows learning at an arbitrary frequency, as the frequency dependence is built into the P-TPs system. Working with a compliantly controlled robot, i.e., one with low feedback gains, that can still track trajectories accurately also makes the system safer for the environment, operator, and user.

To show the P-CMPs performance, the kinematic motion for elbow stretching was defined by using kinesthetic teaching (Deniša et al.,

Results of elbow stretching example. The top plot shows the desired motion frequency. The second plot shows the sum of square tracking errors. The third plot shows the relationship between current and final weight matrix. Bottom plots show the P-TPs weight matrix values for one joint at a certain time during the learning process.

The sum of squared tracking errors shows that the proposed approach can significantly improve the kinematic tracking. We can also see that performing a single sweep through the frequency space already significantly reduces the tracking error. As seen in the third plot and bottom plots in

We presented a new P-CMPs framework consisting of a novel P-AFO frequency and phase synchronization system, periodic DMPs, and a novel P-TPs system encoding task-specific primitives. The proposed P-CMPs system uses feedforward torque signals which are associated with corresponding kinematic motions. We showed that the novel approach is able to unambiguously extract not only the frequency but also the phase from an arbitrary signal, which allows anchoring the P-TPs to the P-DMPs trajectories. Furthermore, the novel extension of the P-TPs system also makes P-TPs frequency-dependent, which enables smooth frequency modulation of the P-CMPs. Integrating the feedback error learning concept in P-CMPs also improves the usability of the system. Our results indicate that the system was able to synchronize the kinematic and dynamic signals, enabling compliant behavior while maintaining high tracking accuracy, without the need for developing mathematical dynamical models of the robot or the task.

The proposed P-CMPs framework is an improvement over the previous CMPs framework, enabling better learning performance and smooth frequency modulation of periodic tasks.

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

TP contributed to the design, execution, and drafting of this work.

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.