A PSO–ML-LSTM-based IMU state estimation approach for manipulator teleoperation

Zhou, Renyi; Li, Yuanchong; Zhang, Aimin; Zhang, Tie; Guan, Yisheng; Zhao, Zhijia; Chen, Shouyan

doi:10.3389/frobt.2025.1638853

ORIGINAL RESEARCH article

Front. Robot. AI, 16 September 2025

Sec. Human-Robot Interaction

Volume 12 - 2025 | https://doi.org/10.3389/frobt.2025.1638853

This article is part of the Research TopicWearables for Human-Robot Interaction & CollaborationView all 4 articles

A PSO–ML-LSTM-based IMU state estimation approach for manipulator teleoperation

Renyi Zhou^1,2

Yuanchong Li³

Aimin Zhang²

Tie Zhang⁴

Yisheng Guan¹

Zhijia Zhao³

Shouyan Chen³*

¹School of Electro-mechanical Engineering, Guangdong University of Technology, Guangzhou, China
²Advanced Technology Department, GAC R&D Center, Guangzhou Automobile Group Co. Ltd., Guangzhou, China
³School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou, China
⁴School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou, China

Manipulator teleoperation can liberate humans from hazardous tasks. Signal noise caused by environmental disturbances and the devices’ inherent characteristics may limit the teleoperation performance. This paper proposes an approach for inertial measurement unit (IMU) state estimation based on particle swarm optimization (PSO) and modulated long short-term memory (ML-LSTM) neural networks to mitigate the impact of IMU cumulative error on the robot teleoperation performance. A motion mapping model for the human arm and a seven-degree-of-freedom (7-DOF) robotic arm are first established based on global configuration parameters and a hybrid mapping method. This model is used to describe the impact of IMU cumulative error on the robot teleoperation performance. Subsequently, the IMU pose state estimation model is constructed using PSO and ML-LSTM neural networks. The initial data of multiple IMUs and handling handles are used for training the estimation model. Finally, comparative experiments are conducted to verify the performance of the proposed state estimation model. The results demonstrate that the PSO–ML-LSTM algorithm can effectively eliminate the impact of IMU cumulative errors on robot teleoperation.

1 Introduction

There are many environments and situations where robots are expected to replace or assist humans at the sites (Darvish et al., 2023; Liu et al., 2025). However, due to the limitations of AI techniques, fully autonomous solutions are still far from being able to generate natural and appropriate operational behaviors. Consequently, robot teleoperation is considered a reasonable solution for tasks in extreme environments, which can relieve human operators from potential hazards (Li S. et al., 2024). Various efforts have been made to deploy human senses, actions, and presence in remote locations. During the teleoperation, the robot is required to imitate human actions to perform highly dexterous tasks under limited information exchange. The priority of the robotic teleoperation system is to measure the kinematic and dynamic information of the human and transfer it to the robot’s movements for teleoperation. One of the key challenges lies in establishing precise motion mapping between the human operator and the robot.

To provide references for robotic motion, scholars have explored different technologies to measure human motion, including wearable inertial measurement unit (IMU)-based motion estimation (Penco et al., 2019; Škulj et al., 2021), vision-based motion capture (Tsitos and Dagioglou, 2022), exoskeleton-based motion measurement (Cheng et al., 2024), and EMG- and EEG-based motion intention estimation (Chen et al., 2023; Li H. et al., 2024). The performance of the vision-based capture approach is impacted by the occlusion and low portability of the setup, while the EMG- and EEG-based approaches have high requirements for the detection environment and equipment (Wang et al., 2023). In contrast, the IMU-based approach can estimate human motion without occlusion-related problems, which is more suitable for use in the field and other unstructured scenarios. However, gyroscopic drift of IMU tends to cause cumulative errors, which will gradually accumulate and amplify over time.

Scholars have attempted to improve IMU state estimation accuracy by deploying Kalman filtering and machine learning methods (Luo et al., 2025). Zhang et al. (2022) proposed a quadrotor state estimation method based on deep neural networks and a multi-sensor data fusion model. The IMU’s kinematic characteristics, the robot’s dynamic properties, and uncertainty representations are learned by training a cascaded network on real-world quadrotor flight data, the information of which is fused into a two-stage extended Kalman filter (EKF) framework for better estimation. Hosseinyalamdary (2018) proposed a combination of deep learning and Kalman filters for modeling to eliminate the system state estimation errors caused by IMU errors. Han et al. (2019) proposed a deep VIO algorithm that combines vision and IMU and uses a self-supervised end-to-end strategy to estimate system state. Kim et al. (2021) used one-dimensional convolutional neural networks to predict the desired velocity from the raw acceleration data to improve the accuracy of individual IMU state estimations. Luo et al. (2024) proposed a Kalman filtering and modulated long short-term memory (ML-LSTM)-based approach to estimate the state of the vehicle system. Xu et al. (2022) proposed a full-state estimation algorithm based on the error-state extended Kalman filter (ESEKF) framework, which can enable simultaneous state estimation and external calibration (POS–IMU and IMU–IMU), handheld platforms, quadrotor unmanned aerial vehicles (UAVs), and ground vehicles.

Furthermore, due to the structural differences between the human body and the robot, teleoperation control of robots requires consideration of the motion mapping between the human body and the robot. It can be categorized into motion mapping for the upper limbs, lower limbs, and the whole body. Upper-limb motion mapping typically involves mapping the Cartesian space movements of human limbs to the corresponding values of the robot’s limbs and then considering the robot’s constraints to solve the inverse kinematics problem by minimizing a cost function. A common approach is to establish motion mapping between the human wrist and the robot’s end-effector, which is known as configuration space retargeting (Wang et al., 2021). Zhao et al. (2023) proposed that the lower-priority elbow motion should be considered in the mapping process, which is important for delicate operations in constrained spaces. In reference to the above study, an IMU state estimation approach based on particle swarm optimization (PSO) and an ML-LSTM network is proposed to estimate the IMU error for robot teleoperation, while the human wrist and elbow motions are measured and imitated.

This study focuses on the human-robot posture mapping problem caused by multi-IMU drift errors in a remote sensing system. Unlike normal IMU drift issues, this problem requires establishing the spatial relationships among multiple IMUs on the human arm through kinematic modeling, leveraging their invariant spatial constraints to correct drift errors. The key contribution lies in proposing a lightweight, PSO-ML-LSTM-based online IMU calibration model that accounts for computational and temporal costs in data training. By fitting the model using early-stage IMU data (where drift is minimal), it enables real-time correction of subsequent drift-affected data. This method is used to establish the state and observation models for the teleoperation motion mapping, thereby enabling online calibration of the IMUs. The remainder of this study is structured as follows. Section II establishes the teleoperation mapping model and describes the background of the problem. Section III introduces the working principles of the algorithm. Section IV verifies the performance of the algorithm through comparative experiments. Section V summarizes the innovative methods proposed in this paper.

2 Robot teleoperation model and problem description

2.1 Robot teleoperation system

The robot teleoperation system is based on two IMUs and an operating handle. The operating handle provides the position and orientation of the human hand, while the two IMUs, worn on the upper arm and forearm of the human body, are used to detect the motion state of the human arm. The data from the IMUs and operating handle are transmitted via Bluetooth to the STM32 for data acquisition and processing. Subsequently, the data are sent through a Wi-Fi module to the rk3588 processor for robot inverse kinematics and arm angle calculations, generating the trajectory of a seven-degree-of-freedom (7-DOF) robot. The system composition and data transmission process are illustrated in Figure 1.

Figure 1

A collage showing a person with sensors labeled IMU attached to the arm, a 7DOF robotic arm, and electronic boards labeled STM32 and rk3588 with arrows indicating data flow between the components.

Figure 1. Robot teleoperation system.

2.2 Robot model

This paper adopts incremental position mapping and absolute attitude mapping. The kinematic model of the robot and the human body is established before determining the human–robot motion mapping. The 7-DOF robot structure is shown in Figure 2 and the robot DH parameters are provided in Table 1.

Figure 2

Illustration of a complex mechanical structure with interconnected cylindrical components and gears, resembling parts of a robotic or industrial arm. The design includes visible gears and likely indicates a system of joints and couplings.

Figure 2. Robot configuration.

Table 1

Table 1. Seven-DOF robot DH parameters.

The robot kinematics model can be obtained by:

{}_{7}^{0}{T_{r} (θ_{1}, θ_{2}, θ_{3}, θ_{4}, θ_{5}, θ_{6}, θ_{7})} = [\begin{array}{c} r_{11} & r_{12} & \begin{array}{c} r_{13} & p_{x} \end{array} \\ r_{21} & r_{22} & \begin{array}{c} r_{23} & p_{y} \end{array} \\ \begin{array}{c} r_{31} \\ 0 \end{array} & \begin{array}{c} r_{32} \\ 0 \end{array} & \begin{array}{c} \begin{array}{c} r_{33} & p_{z} \end{array} \\ \begin{array}{c} 0 & 1 \end{array} \end{array} \end{array}] = [\begin{array}{c} {}_{7}^{0}R_{r} & {}_{7}^{0}P \\ 0 & 1 \end{array}] . (1)

Here, ${}_{7}^{0}P = [p_{r, x}, p_{r, y}, p_{r, z}]$ is the robot end-effector position.

To solve the problem of multiple solutions in the inverse solution of a redundant DOF manipulator, a unique solution can be obtained by introducing the global configuration parameter GCk. According to Faria et al. (2018), the spatial orientation of robotic arms is directly affected by GCk of the shoulder, elbow, and wrist.

To address the global and local self-motion manifolds, two supplementary parameters are incorporated into the redundant robot inverse kinematics calculation, namely, global configuration (GC) and arm angle $ψ$ . The global configuration is used to specify the branch of the inverse kinematics solutions for the global configuration manifold. The arm angle $ψ$ indicates the elbow position in the redundancy circle, as shown in Figure 3. The global configuration parameter $G C_{k}$ is divided into three variables that represent the sign of the joint angle coordinates for the shoulder joint ( $G C_{2}$ ), the elbow joint ( $G C_{4}$ ), and the wrist joint ( $G C_{6}$ ). $G C_{k}$ is given as follows:

G C_{k} = \{\begin{array}{l} 1, if θ_{k} \geq 0 \\ - 1, if θ_{k} < 0 . \end{array}, \forall k \in \{2, 4, 6\} . (2)

Figure 3

Diagram illustrating a reference plane and an arm plane connected by points $P_S$, $P_E$, and $P_W$. An IMU sensor is located between these planes, indicating movement direction. An angular measurement, labeled as $\psi$, is present between the planes.

Figure 3. Representation of the arm angle $ψ$ as the angle between the robot arm plane and the reference plane.

In this paper, when the operator employs the right arm for teleoperation, the values are set as GC2 = 1, GC4 = 1, and GC6 = −1; conversely, when the left arm is utilized for teleoperation, the values are assigned as GC2 = −1, GC4 = −1, and GC6 = 1. Subsequently, a unique inverse solution can be derived based on the pose $T_{s, t_{1}}$ , the arm angle $ψ$ , and the global configuration parameter GCk. Since θ_4 is not affected by the arm angle $ψ$ , other angles can be obtained using the following formulas.

\begin{align} θ_{1} & = atan 2 (G C_{2} [a_{s 22} \sin ψ + b_{s 22} \cos ψ + c_{s 22}], \\ G C_{2} [a_{s 12} \sin ψ + b_{s 12} \cos ψ + c_{s 12}]), \end{align}

θ_{2} = G C_{2} \arccos (a_{s 32} \sin ψ + b_{s 32} \cos ψ + c_{s 32}),

\begin{align} θ_{3} & = atan 2 (G C_{2} [- a_{s 33} \sin ψ - b_{s 33} \cos ψ - c_{s 33}], \\ G C_{2} [- a_{s 31} \sin ψ - b_{s 31} \cos ψ - c_{s 31}]), \end{align}

\begin{align} θ_{5} & = atan 2 (G C_{6} [a_{w 23} \sin ψ + b_{w 23} \cos ψ + c_{w 23}], \\ G C_{6} [a_{w 13} \sin ψ + b_{w 13} \cos ψ + c_{w 13}]), \end{align}

θ_{6} = G C_{6} \arccos (a_{w 33} \sin ψ + b_{w 33} \cos ψ + c_{w 33}),

\begin{align} θ_{7} & = atan 2 (G C_{6} [a_{w 32} \sin ψ + b_{w 32} \cos ψ + c_{w 32}], \\ G C_{6} [- a_{w 31} \sin ψ - b_{w 31} \cos ψ - c_{w 31}]) . \end{align} (3)

2.3 Virtual human arm model

To construct a virtual human arm model, the coordinate system on the human arm is established as shown in Figure 1. Coordinate systems {1} and {2} are on the human shoulder joint, while coordinate system {3} is on the human elbow joint, and coordinate system {4} is established on the wrist. The DH parameterof virtual huma arm is shown in Table 2.

Table 2

Table 2. Human arm DH parameters.

Here, $a_{h, 2}$ is the length of the upper arm and $a_{h, 3}$ is the length of the lower arm. According to the homogeneous transformation formula, the elbow position can be derived as follows:

{}_{3}^{0}P_{h} = [\begin{array}{c} a_{h, 2} c_{h, 2} c_{h, 1} \\ a_{h, 2} c_{h, 2} s_{h, 1} \\ a_{h, 2} s_{h, 2} \end{array}] . (4)

The wrist position is

{}_{4}^{0}P_{h} = [\begin{array}{c} c_{h, 1} (a_{h, 3} c_{h, 23} + a_{h, 2} c_{h, 2}) \\ s_{h, 1} (a_{h, 3} c_{h, 23} + a_{h, 2} c_{h, 2}) \\ a_{h, 3} s_{h, 23} + a_{h, 2} s_{h, 2} \end{array}] . (5)

2.4 Master–slave motion relationship mapping

This paper employs a hybrid mapping approach, where position is mapped incrementally while orientation uses absolute mapping. The robot model and virtual human arm model are utilized to map the posture of both the human arm and wrist.

The positions of human wrist and robot can be obtained according to Equations 1, 5. The positions of the human wrist and robot end-effector are mapped incrementally. Assuming that the initial calibration position is $P_{m, t_{0}} = P_{h, t_{0}}$ and $P_{s, t_{0}} = {}_{7}^{0}P_{r, t_{0}}$ , the positions in relation to the master and slave ends can be expressed as follows:

P_{h, t_{1}} = P_{h, t_{0}} + ∆ P_{h, t_{1}},

P_{r, t_{1}} = P_{r, t_{0}} + ∆ P_{r, t_{1}}, (6)

∆ P_{r, t_{1}} = K ∆ P_{h, t_{1}} .

Here, $K$ is the mapping scale. The attitude of the human wrist and the robot end-effector are mapped absolutely:

∆ R_{r, t_{1}} = ∆ R_{h, t_{1}} . (7)

In order to ensure that the robot arm can adjust the attitude of the intermediate joint according to the requirements of the operator to achieve obstacle avoidance, the intermediate joint adopts absolute attitude mapping:

T_{h, t_{1}} = [P_{h, t_{1}}, ∆ R_{h, t_{1}}],

T_{r, t_{1}} = [K ∆ P_{r, t_{1}}, ∆ R_{r, t_{1}}] . (8)

According to Equations 6–8, the relationship mapping can be established. For ease of calculation, the arm angle $ψ$ in this paper is defined as the angle between the arm and reference planes, where the arm plane is considered parallel to the axis $X_{1}$ and $X_{2}$ of the IMU on the upper and lower arms. To figure out the arm plane, the Euler angles of the two IMUs are converted into direction vectors. By defining the IMU Euler angles as $[α_{1}, β_{1}, γ_{1}]$ and $[α_{2}, β_{2}, γ_{2}]$ , the normal vector S of the arm plane can be obtained by the cross product of $X_{1}$ and $X_{2}$ .

S = X_{1} \times X_{2} = [\begin{array}{c} c γ_{1} * c β_{1} \\ s γ_{1} * c β_{1} \\ s β_{1} \end{array}] \times [\begin{array}{c} c γ_{2} * c β_{2} \\ s γ_{2} * c β_{2} \\ s β_{2} \end{array}] . (9)

Defining the reference plane as being perpendicular to the horizontal plane, the arm angle $ψ$ can be obtained by dotting the normal vector $S$ with the horizontal normal vector $H$ .

ψ = a r c o s (S \cdot H) . (10)

The inverse kinematics can be obtained by substituting Equation 10 into Equation 2.

2.5 Problem description

When there is magnetic field interference in the operating environment, the magnetometer cannot be used to calibrate the IMU online, and the IMU cumulative error $[e_{1} (t), e_{2} (t)]$ will appear. In this case, the normal vector of the arm plane is $S^{'} (e_{1} (t), e_{2} (t))$ , and the arm angle error is

ψ^{'} (t) = a r \cos (S^{'} (t) \cdot H) . (11)

To observe the drift problem, an IMU with magnetometer malfunction is bound with a high-precision IMU with magnetometer. A random arm motion is conducted to observe the drift of the low-cost IMU, while the high-precision IMU is regarded as a reference standard. The Figure 4 shows the IMU yaw angle without (blue line) and with the magnetometer after 20 min of various random motions. The final cumulative error is approximately 70°. It is worth noting that to perform online calibration of the IMU on the arm, accurate data are required as a reference point. Since the handle has its own vision-based calibration function, this paper assumes that the position data of the handle are accurate.

Figure 4

Line graph showing two data sets labeled

Figure 4. IMU cumulative error after 20 min.

3 PSO–ML-LSTM-based IMU state estimation

In the case of magnetic field disturbance, the IMU cannot be self-calibrated by a magnetometer, and the cumulative error caused by drift will occur in the IMU. Therefore, it is necessary to estimate the IMU state online. This section introduces the PSO–ML-LSTM-based IMU state estimation approach. The whole process is shown in Figure 5.

Figure 5

Flowchart of a robot motion control system. IMU data undergoes Kalman filtering, feature selection, and modulation LSTM for IMU estimation. Master-Slave motion mapping leads to robot motion. Previous IMU data is used for Random Forest feature selection and parameter optimization.

Figure 5. Process of PSO–ML-LSTM-based IMU state estimation.

3.1 Feature selection based on random forest

Considering the computation and time costs for online training, random forest-based feature selection is conducted to exclude unimportant feature data. Random forest-based feature selection combines the powerful representation capabilities of neural networks with the stability and nonlinear processing capabilities of random forest models. During the generation of each decision tree, the data subset and feature subset are randomly selected to enhance the robustness of the model and reduce overfitting. The input data for feature selection include the position and attitude of the operating handle, the attitude of IMUs, and the angular velocity and acceleration of IMUs. The feature selection results indicate that the position of the operating handle and the pitch and roll angle of IMUs have a significant impact on the IMUs’ yaw angle estimation.

3.2 Kalman filtering for data processing

In this paper, Kalman filtering is used to reduce IMU data bias and noise. To achieve this, the angular velocity measured by the gyroscope is used to predict the attitude and simultaneously model the slow change of zero offset:

x_{k} = A x_{k - 1} + B u_{k} + w_{k} . (12)

Here, A is the state transition matrix, $u_{k}$ is the angular velocity measured by the gyroscope, and $w_{k}$ is data noise. The process is started with the initiating states $x_{0}$ and $P_{0}$ . The prediction state ${\hat{x}}_{k}^{-}$ and error covariance $P_{k}^{-}$ are calculated as follows:

{\hat{x}}_{k}^{-} = A {\hat{x}}_{k - 1} + B u_{k},

P_{k}^{-} = A P_{k - 1} {\tilde{A}}^{T} + Q . (13)

The Kalman gain $K_{k}$ and state are updated according to

K_{k} = P_{k}^{-} H^{T} {(H P_{k}^{-} H^{T} + R)}^{- 1},

{\hat{x}}_{k} = {\hat{x}}_{k}^{-} + K_{k} (Z_{k} - H {\hat{x}}_{k}^{-}),

P_{k} = P_{k}^{-} - K_{k} H P_{k}^{-} . (14)

Here, $Q$ is the process noise covariance matrix, and $R$ is the measurement noise covariance matrix.

3.3 PSO–ML-LSTM-based IMU state estimation

The ML-LSTM neural network introduces the modulation gate into traditional LSTM to evaluate the importance of historical information at different times. It can improve the traditional memory mode and take the memory summation average value of each data segment as the standard memory, thus improving the model’s ability to recognize key information and the model’s ability to interpret. The structure of the ML-LSTM neural network is shown in Figure 6.

Figure 6

Diagram of an LSTM cell, showing how information flows through input, output, and forget gates. It includes operations like sigmoid and tanh functions, with connections illustrating dependencies. Gates and processes are color-coded.

Figure 6. Structure of ML-LSTM.

The forget gate of ML-LSTM is used to determine what information can pass through the memory unit and generate an $f_{k}$ according to the output value $h_{k - 1}$ at the last moment and the current input value $x_{k}$ .

f_{k} = σ (W_{f} \cdot [h_{k - 1}, x_{k}] + b_{f}) . (15)

Here, $W_{f}$ is the weight, $b_{f}$ is the offset, and $σ$ is the activation function sigmoid. To generate updated information, the values of $i_{k}$ and the new candidate ${\tilde{C}}_{k}$ will be calculated and added to the memory unit as a candidate value generated by the current layer.

i_{k} = σ (W_{i} \cdot [h_{k - 1}, x_{k}] + b_{i}),

{\tilde{C}}_{k} = \tanh (W_{C} \cdot [h_{k - 1}, x_{k}] + b_{C}) . (16)

The memory cells are updated by

{C_{k}}^{'} = f_{k} \cdot C_{k - 1} + i_{k} \cdot {\tilde{C}}_{k} . (17)

The modulation gate sums the memory information and calculates the mean E.

C_{k} = E = \sum_{1}^{m} \frac{{C_{k, m}}^{'}}{m} . (18)

Here, m is the number of data segments. The model output $h_{k}$ can be obtained as follows:

h_{k} = σ (W_{o} \cdot [h_{k - 1}, x_{k}] + b_{o}) \cdot \tanh (C_{k}) . (19)

Then, the PSO algorithm is run offline using previous data to optimize the neuron number $n_{n n}$ and learning rate $l_{r}$ of ML-LSTM. The initial position and speed of the particles are first randomly generated. The fitness evaluation of each particle solution is conducted. The historical best position ( $p_{i d}^{t}$ ) and the global best position ( $p_{g d}^{t}$ ) for each particle are then recorded. The speed and position of each particle according to $p_{i d}^{t}$ and $p_{g d}^{t}$ are updated to calculate the neuron number and learning rate according to the following equation.

v_{i d}^{t + 1} = ω v_{i d}^{t 1} + c_{1} r_{1} (p_{i d}^{t} - x_{i d}^{t}) + c_{2} r_{2} (p_{g d}^{t} - x_{g d}^{t}),

x_{i d}^{t + 1} = x_{i d}^{t} + v_{i d}^{t + 1} . (20)

Here, $v_{i d}^{t + 1}$ and $x_{i d}^{t + 1}$ can be represented by $v_{l r}$ and $v_{n n}$ and $l_{r}$ and $n_{n n}$ . The above process will be repeated until the number of iterations is reached or an optimal solution is found. The PSO flow chart is shown in Figure 7.

Figure 7

Flowchart depicting a process starting with data preprocessing. It initializes LSTM parameters and PSO population, calculates PSO fitness, and updates positions. It checks if requirements are met, repeating or moving to predicting yaw angle using an optimized LSTM model. After checking prediction error requirements, it predicts angle and IMU data, finally outputting the global optimal value and ending once the prediction is completed.

Figure 7. Process of PSO.

3.4 Evaluation index

To evaluate the performance of the IMU state estimation model, three evaluation indexes are used, including the root mean square error (RMSE), mean absolute error (MAE), mean bias error (MBE), and R-squared $R^{2}$ .

M A E (X, h) = \frac{1}{m} \sum_{i = 1}^{m} |h (x_{i}) - y_{i}|, (21)

R M S E (X, h) = \sqrt{\frac{1}{m} \sum_{i = 1}^{m} {(h (x_{i}) - y_{i})}^{2}}, (22)

MBE = \frac{1}{n} \sum_{i = 1}^{n} (y_{i} - {\hat{y}}_{i}) (23)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(h (x_{i}) - y_{i})}^{2}}{\sum_{i = 1}^{n} {(h ({\bar{x}}_{i}) - y_{i})}^{2}} (24)

4 Experiment and discussion

In this paper, the IMU estimation model is implemented online using LibTorch, which is integrated into the rk3588 processor. The dataset is split into training and validation sets with a ratio of 2:8. The random forest is first used for feature selection. and the particle swarm optimization is used to optimize the number of neurons and the learning rate of the ML-LSTM model before online estimation to reduce data requirement for model training and prediction. The search bounds for the number of neurons are set to [50, 200], while the learning rate is bounded within [1e-5, 1e-1]. The PSO hyperparameters are configured as follows: population size = 10, maximum iterations = 20, decay factor = 0.5, contraction–expansion coefficient = 1.0, and random seed = 42. The Adam optimizer is used for network training, where the training batch size is 64 and the number of iterations is 1,000. To confirm the validity of the proposed method, we compared it with other models, including Gaussian process regression (GPR) based on the radial basis kernel function (Ouyang et al., 2023), BP neural network based on the Levenberg–Marquardt algorithm (Hua et al., 2023), and neural network based on PSO–LSTM (Xiao et al., 2024). In order to observe the performance of the estimation models, the IMU data of 30 min without and with magnetometer calibration are compared with the IMU data estimated by the model mentioned above.

4.1 Levenberg–Marquardt-BP-based IMU estimation

The Levenberg–Marquardt-BP neural network can accelerate the training process and is suitable for complex nonlinear regression problems. The goal of this algorithm is to adjust the network weight by minimizing the error function. The algorithm implementation process is as follows: 1) the gradient of the loss function is calculated. 2) The weights are updated according to the Levenberg–Marquardt algorithm. 3) Iterations are repeated until convergence or the maximum number of iterations is reached. The model calculation results are given below, where the forecast results are colored yellow, the drift data are colored blue, and the correct data are colored green. The performance of LM-BP-based IMU estimation is shown in Figure 8.

Figure 8

Figure 8. Performance of LM-BP-based IMU estimation.

4.2 K-means-RBF-GPR-based IMU estimation

Gaussian process regression is a powerful non-parametric Bayesian regression method. It uses kernel functions to capture complex relationships of data by mapping the input space to a high-dimensional feature space. The cluster center of K-means (KM) is selected as the input feature point. The similarity between these feature points is calculated using the RBF kernel function. The model parameters are optimized by maximizing the posterior probability or minimizing the negative log-likelihood. The performance of KM-RBF-GPR-based IMU estimation is shown in Figure 9.

Figure 9

Figure 9. Performance of K-means-RBF-GPR-based IMU estimation.

4.3 PSO–ML-LSTM-based IMU estimation

Two experiments are conducted, including the PSO–LSTM- and the PSO–ML-LSTM-based IMU estimations. The performances of both models are shown in Figures 10, 11.

Figure 10

Line graph showing test set prediction for imu1_Euler_z. The x-axis represents the sample number, and the y-axis represents imu1_Euler_z values. Three lines are plotted: Actual in blue, Predicted in orange, and pico_imu_Euler_z in green. Evaluation metrics include RMSE of 9.8345, MAE of 6.9113, MBE of -3.8944, and R-squared of 0.8032.

Figure 10. Performance of PSO–LSTM-based IMU estimation.

Figure 11

Figure 11. Performance of PSO–ML-LSTM-based IMU estimation.

The results of the experiments are shown in Table 3. Although KM-RBF-GPR slightly outperforms LM-BP in terms of RMSE and MAE, it exhibits a larger bias (MBE) and relatively lower goodness of fit. The performance of PSO–LSTM is superior to that of KM-RBF-GPR, particularly showing an improvement in the goodness of fit (R²), but it still cannot surpass that of PSO–ML-LSTM. The RMSE of PSO–ML-LSTM is 8.3116, which is 1.9 units lower than that of LM-BP and 1.5 units lower than the RMSE of PSO–LSTM. The MAE and R² of PSO–ML-LSTM are 6.1617 and 0.8594, indicating that PSO–ML-LSTM has no significant systematic bias and fits the data well. The average cumulative error of PSO–ML-LSTM is approximately 2.8°, which is significantly better than that of the other three models.

Table 3

Table 3. Performance comparison.

4.4 Ablation study

The ablation studies are conducted to evaluate the effects of the Kalman filter and random-forest feature selection. The results are shown in the Figures 12, 13. Compared with the result shown in figure, the results shown in Figure 8 indicate that the use of random-forest feature selection and Kalman filter can achieve a better performance of RMSE, MAE, MBE and R².

Figure 12

Figure 12. Performance of PSO–ML-LSTM without random forest feature selection.

Figure 13

Figure 13. The performance of PSO-ML-LSTM without KF.

4.5 Robot remote control experiment

A manipulator teleoperation experiment was conducted. By comparing the arm angles of the manipulator before and after IMU error compensation, the effectiveness of the proposed method was verified. The manipulator used was the 7-DOF manipulator described in the previous section, whose structure is shown in Figure 14. Figure 15 shows the arm angle conditions after 20 min, including 1) the manipulator’s arm angle without compensation (green curve), 2) the arm angle of the human operator (yellow curve), and 3) the manipulator’s arm angle using the proposed method (red curve). Without IMU error correction, the deviation between the manipulator’s arm angle and the operator’s arm angle is approximately 30°, and the deviation exhibits a nonlinear variation. After applying the proposed approach, the arm angle deviation is significantly reduced. The maximum deviation decreased from 30° to 10°, and the variation trends followed similar patterns.

Figure 14

A robotic arm with exposed wiring and joints is positioned against a dark background. The arm features multiple articulated segments and visible cables, suggesting it is either under construction or undergoing repairs.

Figure 14. Seven-DOF manipulator.

Figure 15

Line graph comparing actual, predicted, and real arm angles in degrees over time. Solid green, dashed purple, and dotted orange lines represent actual, predicted, and real angles, respectively, with fluctuating trends across the graph.

Figure 15. Robot arm angle under remote control.

Figure 16 shows the robot end-effector position, including 1) the actual end-effector position without compensation (green curve), 2) the operator wrist position (blue curve), and 3) the predicted position of the end-effector using the proposed approach (purple curve). Without IMU error correction, the position deviation between the end-effector position and the operator’s wrist is approximately 0.25 m, and the angle of the end-effector differs significantly from that of the operator’s wrist. After applying the proposed approach, the position deviation is reduced to 0.13 m, and the end-effector’s angle closely matches the operator’s wrist.

Figure 16

Three graphs compare trajectories over ten seconds. The first graph in green shows the actual position on the XY plane. The second graph in purple displays the predicted position on the same plane. The third graph in blue illustrates the operator wrist trajectory on the XZ plane. Each graph has its respective legend to differentiate the positions.

Figure 16. Robot end-effector position under remote control.

5 Conclusion

This paper proposes a PSO-based modulated LSTM for online estimation of the IMU state in robot teleoperation. Experimental results show that the proposed method can effectively estimate the true attitude of the IMU, thus reducing the cumulative error of the IMU and the absolute error in teleoperation. However, the proposed method still has some limitations. For example, there is room for improvement in the RMSE and cumulative error.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

RZ: Writing – original draft, Writing – review and editing. YL: Data curation, Investigation, Writing – review and editing, Writing – original draft. AZ: Funding acquisition, Resources, Visualization, Writing – original draft, Writing – review and editing. TZ: Methodology, Supervision, Validation, Writing – original draft, Writing – review and editing. YG: Formal analysis, Funding acquisition, Resources, Supervision, Validation, Writing – original draft, Writing – review and editing. ZZ: Writing – review and editing. SC: Conceptualization, Methodology, Project administration, Software, Supervision, Writing – original draft, Writing – review and editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This study was funded by the Research Project of the Education Bureau of Guangzhou City (No.2024312411) and the National Natural Science Foundation of China (No.12172095).

Conflict of interest

Authors RZ and AZ were employed by Guangzhou Automobile Group Co. Ltd.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Chen, P., Li, Z., Togo, S., Yokoi, H., and Jiang, Y. (2023). A layered sEMG–FMG hybrid sensor for hand motion recognition from forearm muscle activities. IEEE Trans. Human-Mach. Syst. 53 (5), 935–944. doi:10.1109/THMS.2023.3287594CrossRef Full Text | Google Scholar

Cheng, C., Dai, W., Wu, T., Chen, X., Wu, M., Yu, J., et al. (2024). Efficient and precise homo-hetero teleoperation based on an optimized upper limb exoskeleton. IEEE/ASME Trans. Mechatron., 1–13. doi:10.1109/TMECH.2024.3479873CrossRef Full Text | Google Scholar

Darvish, K., Penco, L., Ramos, J., Cisneros, R., Pratt, J., Yoshida, E., et al. (2023). Teleoperation of humanoid robots: a survey. IEEE Trans. Robot. 39 (3), 1706–1727. doi:10.1109/TRO.2023.3236952CrossRef Full Text | Google Scholar

Faria, C., Ferreira, F., Erlhagen, W., Monteiro, S., and Bicho, E. (2018). Position-based kinematics for 7-DoF serial manipulators with global configuration control, joint limit and singularity avoidance. Mech. Mach. Theory 121, 317–334. doi:10.1016/j.mechmachtheory.2017.10.025CrossRef Full Text | Google Scholar

Han, L., Lin, Y., Du, G., and Lian, S. (2019). “DeepVIO: self-Supervised deep learning of monocular visual inertial odometry using 3D geometric constraints,” in 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 03-08 November 2019 (IEEE), 6906–6913.CrossRef Full Text | Google Scholar

Hosseinyalamdary, S. (2018). Deep kalman filter: simultaneous multi-sensor integration and modelling; A GNSS/IMU case study. Sensors 18 (5), 1316. doi:10.3390/s18051316PubMed Abstract | CrossRef Full Text | Google Scholar

Hua, D., Liu, X., Du, H., Królczyk, G. M., Li, W., and Li, Z. (2023). Positioning a magnetically controlled capsule robot based on double-layer symmetric sensor array. IEEE Trans. Instrum. Meas. 72, 1–12. doi:10.1109/TIM.2023.3300426PubMed Abstract | CrossRef Full Text | Google Scholar

Kim, S., Lee, J., Yoon, J., Ko, S., and Kim, J. (2021). Robust methods for estimating the orientation and position of IMU and MARG sensors. Electron. Lett. 57 (21), 816–818. doi:10.1049/ell2.12263CrossRef Full Text | Google Scholar

Li, S., Hendrich, N., Liang, H., Ruppel, P., Zhang, C., and Zhang, J. (2024). A dexterous hand-arm teleoperation system based on hand pose estimation and active vision. IEEE Trans. Cybern. 54 (3), 1417–1428. doi:10.1109/TCYB.2022.3207290PubMed Abstract | CrossRef Full Text | Google Scholar

Li, H., Bi, L., Li, X., and Gan, H. (2024). Robust predictive control for EEG-based brain–robot teleoperation. IEEE Trans. Intell. Transp. Syst. 25 (8), 9130–9140. doi:10.1109/TITS.2024.3359216CrossRef Full Text | Google Scholar

Liu, J., Pang, Z., Li, Z., Wen, G., Su, Z., He, J., et al. (2025). An origami-wheeled robot with variable width and enhanced sand walking versatility. Thin-Walled Struct. 206 (part A), 112645. doi:10.1016/j.tws.2024.112645CrossRef Full Text | Google Scholar

Luo, J., Wu, K., Wang, Y., Wang, T., Zhang, G., and Liu, Y. (2024). An improved UKF for IMU state estimation based on modulation LSTM neural network. IEEE Trans. Intell. Transp. Syst. 25 (9), 10702–10711. doi:10.1109/TITS.2024.3368040CrossRef Full Text | Google Scholar

Luo, J., Liu, S., Si, W., and Zeng, C. (2025). Enhancing human–robot collaboration: supernumerary robotic limbs for object balance. IEEE Trans. Syst. Man. Cybern. 55 (2), 1334–1347. doi:10.1109/TSMC.2024.3501389CrossRef Full Text | Google Scholar

Ouyang, Z., Zou, Z., and Zou, L. (2023). Adaptive hybrid-kernel function based gaussian process regression for nonparametric modeling of ship maneuvering motion. Ocean. Eng. 268, 113373. doi:10.1016/j.oceaneng.2022.113373CrossRef Full Text | Google Scholar

Penco, L., Scianca, N., Modugno, V., Lanari, L., Oriolo, G., and Ivaldi, S. (2019). A multimode teleoperation framework for humanoid loco-manipulation: an application for the iCub robot. IEEE Robot. Autom. Mag. 26 (4), 73–82. doi:10.1109/MRA.2019.2941245CrossRef Full Text | Google Scholar

Škulj, G., Vrabič, R., and Podržaj, P. (2021). A wearable IMU system for flexible teleoperation of a collaborative industrial robot. Sensors 21 (17), 5871. doi:10.3390/s21175871PubMed Abstract | CrossRef Full Text | Google Scholar

Tsitos, A., and Dagioglou, M. (2022). “Handling vision noise through robot motion control in a real-time teleoperation system,” in 2022 30th Mediterranean Conference on Control and Automation (MED), Vouliagmeni, Greece, 28 June 2022 - 01 July 2022 (IEEE), 624–629.CrossRef Full Text | Google Scholar

Wang, S., Murphy, K., Kenney, D., and Ramos, J. (2021). “A comparison between joint space and task space mappings for dynamic teleoperation of an anthropomorphic robotic arm in reaction tests,” in 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May 2021 - 05 June 2021 (IEEE), 2846–2852.CrossRef Full Text | Google Scholar

Wang, F., Chen, F., Dong, Y., Yong, Q., Yang, X., Zheng, L., et al. (2023). Whole-body teleoperation control of dual-arm robot using sensor fusion. Biomimetics 8 (8), 591. doi:10.3390/biomimetics8080591PubMed Abstract | CrossRef Full Text | Google Scholar

Xiao, W., Fu, Z., Wang, S., and Chen, X. (2024). Joint torque prediction of industrial robots based on PSO-LSTM deep learning. Ind. Robot. 51 (3), 501–510. doi:10.1108/IR-08-2023-0191CrossRef Full Text | Google Scholar

Xu, W., He, D., Cai, Y., and Zhang, F. (2022). Robots’ state estimation and observability analysis based on statistical motion models. IEEE Trans. Contr. Syst. Technol. 30 (5), 2030–2045. doi:10.1109/TCST.2021.3133080CrossRef Full Text | Google Scholar

Zhang, K., Jiang, C., Li, J., Yang, S., Ma, T., Xu, C., et al. (2022). DIDO: deep inertial quadrotor dynamical odometry. IEEE Robot. Autom. Lett. 7 (4), 9083–9090. doi:10.1109/LRA.2022.3189168CrossRef Full Text | Google Scholar

Zhao, J., Wang, X., Xie, B., and Zhang, Z. (2023). Human-robot kinematics mapping method based on dynamic equivalent points. Ind. Robot. 50 (2), 219–233. doi:10.1108/IR-02-2022-0056CrossRef Full Text | Google Scholar

Keywords: state estimation, manipulator teleoperation, particle swarm optimization–modulated long short-term memory, master–slave mapping, cumulative errors

Citation: Zhou R, Li Y, Zhang A, Zhang T, Guan Y, Zhao Z and Chen S (2025) A PSO–ML-LSTM-based IMU state estimation approach for manipulator teleoperation. Front. Robot. AI 12:1638853. doi: 10.3389/frobt.2025.1638853

Received: 31 May 2025; Accepted: 30 July 2025;
Published: 16 September 2025.

Edited by:

Gustavo J. G. Lahr, Hospital Israelita Albert Einstein, Brazil

Reviewed by:

Xin Zhang, University of Portsmouth, United Kingdom
Hamidreza Raei, Italian Institute of Technology (IIT), Italy

Copyright © 2025 Zhou, Li, Zhang, Zhang, Guan, Zhao and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Shouyan Chen, bWF4Y3N5QGd6aHUuZWR1LmNu

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.