Bipedal Walking of Underwater Soft Robot Based on Data-Driven Model Inspired by Octopus

Wu, Qiuxuan; Wu, Yan; Yang, Xiaochen; Zhang, Botao; Wang, Jian; Chepinskiy, Sergey A; Zhilenkov, Anton A

doi:10.3389/frobt.2022.815435

ORIGINAL RESEARCH article

Front. Robot. AI, 20 April 2022
Sec. Soft Robotics
Volume 9 - 2022 | https://doi.org/10.3389/frobt.2022.815435

Qiuxuan Wu^1,2,3*

Yan Wu¹

Xiaochen Yang^1,2

Botao Zhang^1,2

Jian Wang^2,4

Sergey A Chepinskiy⁴

Anton A Zhilenkov³

¹Institute of Electrical Engineering, School of Automation, Hangzhou Dianzi University, Hangzhou, China
²HDU-ITMO Joint Institute, Hangzhou Dianzi University, Hangzhou, China
³Institute of Hydrodynamics and Control Processes, Saint-Petersburg State Marine Technical University, Saint Petersburg, Russia
⁴Faculty of Control Systems and Robotics, ITMO University, Saint Petersburg, Russia

Soft organisms in nature have long been a source of inspiration for the design of soft arms. This paper draws inspiration from the octopus’s tentacle and aims at a soft robot that moves flexibly in three-dimensional space. Based on the characteristics of an octopus’s tentacle, a cable-driven soft arm that can move flexibly in three-dimensional space is designed and fabricated. A data-driven model is established with the TensorFlow framework and is trained using a deep reinforcement learning strategy to realize posture control of a single soft arm. Finally, two trained soft arms are assembled into an octopus-inspired biped walking robot, which can go forward and turn around. Experimental analysis shows that the robot can achieve an average speed of 7.78 cm/s, and the maximum instantaneous speed can reach 12.8 cm/s.

Introduction

With the development of ocean exploration and application, interactive operations in the marine environment, such as monitoring, biological sampling, seafloor landform and resource surveys, fishing for marine organisms and valuables, and maintaining underwater devices, can easily damage the target object and the environment (Santina et al., 2018; Sinatra et al., 2019). Most autonomous robots with motion and operation functions use rigid, light-weight manipulators or claws for underwater transportation, and they are mainly suited to coarse operations (Satja et al., 2018). Existing technology cannot cope with the vast and harsh environments that need monitoring and sampling the most. Compared with rigid and multi-joint robots, soft robots have continuous flexible deformation and manipulation capabilities, which are closer to biological softness. Given this, soft robots are used to help address the challenges posed by abyssal and wave-dominated environments (Aracri et al., 2021).

The appearance and movement characteristics of soft organisms in nature have always been a source of inspiration for the design of sophisticated soft arms (Nesher et al., 2020). When scientists studied the octopus’s tentacle, it was found that their taper angles range from 3° to 13.5°. A soft arm with a smaller taper angle can be bent into a larger curved shape, which can grasp lightweight items with a higher curvature more easily. A soft arm with a larger taper angle has a relatively smaller degree of bending and can grasp heavier and larger items with a lower curvature more easily (Xie et al., 2020).

The driving modes of soft manipulators are divided into cable drive, fibre-reinforced actuators, fluid-elastic pneumatic drive, variable-stiffness pneumatic drive and intelligent biomimetic materials (Lu et al., 2020). Cable drive is easy to implement and control, can transmit power over a long distance, and has small inertia, so it is used to design the soft arm in this paper. Two cable-driven layouts are commonly used, the 4-cable drive and the 3-cable drive, both of which can drive the soft arm in three dimensions. The driving cables of the former are symmetrically distributed in the soft arm at 90° intervals, and those of the latter at 120° intervals. Because the 4-cable drive has an extra driving cable, its bearing capacity is theoretically greater and its control more accurate. However, compared with the 3-cable drive, the 4-cable drive needs an additional servomotor and an extra cable in the soft arm, which takes up more space.

The traditional modeling methods of the soft arm are analytical and use the physical parameters of the soft arm. The most used method is the piecewise constant curvature method proposed by Ian Walker, which is simple and practical and has a wide range of applications (Lafmejani et al., 2020). To improve modeling accuracy, many scholars have also proposed more complex methods, such as the variable curvature method and the Cosserat rod method. A unified Cosserat-based formulation, derived through a coupled approach that comprises a model of the structural dynamics of the cephalopod-like elastic bladder and a model of the pulsed-jet thrust production, was presented and successfully tested on a robotic artefact developed by its authors (Renda et al., 2015; Renda et al., 2018). A novel generation of macroscale underwater propellers was designed, together with a Cosserat-based model that accurately describes and predicts the kinematic and propulsive capabilities of the proposed solution (Armanini et al., 2021). However, because of the large amount of calculation and the number of parameters that need to be identified, these methods have not delivered significant performance improvement, so they are not widely used (Kim et al., 2021). With traditional methods it is difficult to build accurate models because of internal nonlinear interference, and the resulting models lack robustness and portability between different prototypes. Therefore, researchers turned their attention to machine learning to build and control the soft arm model, since machine learning algorithms can handle nonlinear problems in many fields. A neural network was first used to learn the forward kinematic model in order to solve the control problem of the cable-driven soft arm (Giorelli et al., 2013).

A model-free control method based on reinforcement learning was proposed and implemented on a multi-segment soft arm in a two-dimensional plane (You et al., 2017; Jiang et al., 2020). The prototype experiment verified the effectiveness and robustness of the control strategy, and a simulation method was designed to accelerate the training process. An octopus-inspired robot that combines swimming and 4-leg crawling locomotion was designed, and a least-squares-based method coupled with a genetic-algorithm-based method was employed for the two phases, respectively (Giorgio-Serchi et al., 2017). A systematic method for soft robot underwater locomotion was developed and verified, using a controller based on deep reinforcement learning as a framework to create control inputs. However, it still did not expand the working space to three dimensions (Li et al., 2021).

In this paper, combined with the appearance and movement characteristics of the octopus’s tentacle, a more streamlined 3-cable-driven soft arm is designed and fabricated, and the soft arm model is established through a data-driven modeling method. At the same time, the control method of deep reinforcement learning is extended to three-dimensional space to realize straight walking, left turns and right turns of the biped robot.

Design and Fabrication of the Soft Arm

Imitating the movement mechanism and structural characteristics of octopus tentacles, a soft arm was designed as shown in Figure 1A. Its distal radius is 15 mm, its proximal radius is 5 mm and its length is 200 mm. The taper angle of the soft arm is 5.71°. It can bend in a way similar to an octopus tentacle, with a small degree of bending at the proximal end and a large degree of bending at the distal end.
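
The stated taper angle can be checked from the stated dimensions. The sketch below assumes that "taper angle" means the full cone angle of a linearly tapered arm, i.e. twice the arctangent of the radius difference over the length; this definition is inferred from the numbers and is not given explicitly in the text. The function name is illustrative.

```python
import math

def taper_angle_deg(r_large_mm, r_small_mm, length_mm):
    """Full cone angle (degrees) of a linearly tapered arm.

    Assumed definition: 2 * atan((r_large - r_small) / length).
    """
    half_angle = math.atan((r_large_mm - r_small_mm) / length_mm)
    return math.degrees(2.0 * half_angle)

# Single soft arm: end radii 15 mm and 5 mm, length 200 mm.
single_arm_taper = taper_angle_deg(15.0, 5.0, 200.0)
print(f"taper angle: {single_arm_taper:.2f} deg")
```

Under this assumption the result agrees with the stated 5.71° to within rounding. If the 0.15 m legs of the biped robot are assumed to keep the same end radii, the same formula gives a value close to the 7.63° reported for them.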

FIGURE 1. Design and fabrication of the soft arm. (A) Structure design of a cable-driven soft arm (B) The mold of the soft arm (C) The mold for bracket (D) The assembly of the soft arm (E) The link relationship of the driving cable.

The mold of the soft arm is designed as shown in Figure 1B. The inner diameters of the two ends and the length of the mold match the dimensions of the soft arm given above. The cables are laid in the two 3D-printed molds, silica gel is poured, and the parts are demolded after curing and assembled. As shown in Figures 1A,C, a platform is designed to fix the servomotors and the soft arm, and the three servomotors are symmetrically distributed at 120° with respect to the center. The assembled soft arm is shown in Figure 1D, and ten marking points are marked equidistantly on the central axis.

The three driving cables are controlled by three waterproof servomotors, one per cable. Setting the pulse width $x$ of the servomotor input signal selects its rotation angle. The servomotor type is JX6621, the pulse period is 20 ms, and the rotation range is 180°. The relationship between the servomotor parameter $x$ ($0.5\,\mathrm{ms}$ to $2.5\,\mathrm{ms}$) and the rotation angle of the servomotor $θ$ ($0°$ to $180°$, expressed in radians in Eq. 1) is as follows:

θ = - \frac{π}{2} x + \frac{5 π}{4} . (1)

According to Figure 1E, the relationship between the pulling length of the cable $Δ L$ and the rotation angle $θ$ of the servomotor can be obtained through the law of cosines:

Δ L = l_{1} - l_{0}, (2)

l_{1} = \sqrt{r^{2} + (r + l_{0})^{2} - 2 r (r + l_{0}) \cos θ}, (3)

where $r = 16\,\mathrm{mm}$ is the radius of the turntable driven by the servomotor, $l_{0} = 15\,\mathrm{mm}$ is the distance from the point where the cable is fixed to the servomotor to the cable hole at the distal end of the corresponding soft arm when the cable is not pulled, and $l_{1}$ is that distance when the cable is pulled.

Combining the above three formulas, the servomotor parameter $x$ can be converted into the pulling length of the cable $Δ L$ (Figure 2). When the parameter of the servomotor changes from $2.5\,\mathrm{ms}$ to $0.5\,\mathrm{ms}$, the pulling length of the cable ranges from $0\,\mathrm{mm}$ to $32\,\mathrm{mm}$.
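
The conversion from pulse width to cable pull can be written as a short function. The following is a minimal sketch that chains Eqs 1 to 3 with the stated values of $r$ and $l_{0}$; the function and constant names are illustrative and are not taken from the authors' code.

```python
import math

R_MM = 16.0   # turntable radius r
L0_MM = 15.0  # cable distance l0 when the cable is not pulled

def servo_angle_rad(pulse_ms):
    """Eq. (1): pulse width x in ms (0.5 to 2.5) -> rotation angle in radians."""
    return -math.pi / 2.0 * pulse_ms + 5.0 * math.pi / 4.0

def pull_length_mm(pulse_ms, r=R_MM, l0=L0_MM):
    """Eqs (2) and (3): pulled cable length from the law of cosines."""
    theta = servo_angle_rad(pulse_ms)
    l1 = math.sqrt(r ** 2 + (r + l0) ** 2 - 2.0 * r * (r + l0) * math.cos(theta))
    return l1 - l0

for x in (2.5, 1.5, 0.5):
    print(f"x = {x} ms -> pull = {pull_length_mm(x):.2f} mm")
```

At the two ends of the pulse range the sketch reproduces the stated limits: no pull at 2.5 ms and a pull of 32 mm, equal to $2r$, at 0.5 ms.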

FIGURE 2. The relationship between the parameter of the servomotor and the pulling length of the cable.

Modeling and Control of the Soft Arm

Modeling of the Single Soft Arm Based on Data-Driven Model

The most important part of training the neural network is collecting valid data. In this paper, the data are collected in the real world with a camera. The advantage of this method is that it is easier to acquire high-quality position data $(p_x, p_y, p_z)$ of the points on the soft arm. Three-dimensional modeling of the soft arm requires depth information, but an ordinary monocular camera can only obtain two-dimensional image information. Therefore, a binocular camera is used. The data-collecting hardware consists of a binocular camera, a calibration board, a notebook computer, and a supporting structure. After calibration, the binocular camera photographs the soft arm, and the SGBM algorithm performs binocular matching to generate a depth map. The three-dimensional coordinates of the marker points can then be extracted, as shown in Figure 3.
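
To illustrate how a marker's pixel position and disparity turn into three-dimensional coordinates, the sketch below applies the standard pinhole relation for a rectified stereo pair, Z = f·B/d. It does not reproduce the SGBM matching step, and the focal length, baseline and principal point in the example are hypothetical values, not the calibration results of this study.

```python
def pixel_to_camera_xyz(u, v, disparity_px, focal_px, baseline_mm, cx, cy):
    """Back-project a pixel with known disparity into camera coordinates (mm).

    Assumes a rectified stereo pair with focal length focal_px (pixels),
    baseline baseline_mm and principal point (cx, cy).
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_mm / disparity_px   # depth from disparity
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return x, y, z

# Hypothetical marker at pixel (400, 240) with a disparity of 40 px.
print(pixel_to_camera_xyz(400, 240, 40.0, focal_px=800.0,
                          baseline_mm=60.0, cx=320.0, cy=240.0))
```

Applying this to each of the marked points in a depth map would give the $(p_x, p_y, p_z)$ values used as training targets.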

FIGURE 3. Image processing of the posture of the soft arm.

A data-driven model is built with the Keras library for Python. Because there are three cables, the parameters of servomotor 1, servomotor 2 and servomotor 3 are set as the inputs $x_1, x_2, x_3$ of the designed neural network. The hyper-redundancy of the octopus and the lack of limitation by the number of skeletal joints make the representation of information in body coordinates unrealistic (Nesher et al., 2020), so another method is needed to represent the posture information of the soft arm. Because the soft arm does not have clear joints and the body deforms continuously during movement, the movement of a point on the soft arm at a certain time and the movement in a short interval near that point can be regarded as approximately the same. Therefore, a series of discrete points on the soft manipulator are selected as the control object. The experimental results also show that this selection meets the purpose of modeling. To facilitate the calculation, when the constant curvature kinematics model is sought, the manipulator is divided into $n$ constant curvature sections, and the feedback point is set at the cross section of each constant curvature section (Ni et al., 2017). Accordingly, pushpins are used to mark five feature points at equal distances on the vertical center line of the soft arm to feed back the degree of bending. Based on the above preprocessing, the output layer of the neural network is composed of 15 neurons ($p_{x1}, p_{y1}, p_{z1}, \dots, p_{x5}, p_{y5}, p_{z5}$), which are, in order, the three-dimensional coordinates of the five feature points.

To enable the network to fully learn the model features, the number of hidden layers and the number of neurons are gradually increased. The accuracy of the network reaches a performance bottleneck at seven layers with 64 neurons in each layer and then saturates. The hidden layers of the neural network are therefore designed as shown in Figure 4.

FIGURE 4. Structure design of the neural network of the soft arm.

In this paper, Tanh function is selected as the activation function, the mean square error function is used as the training loss function, and Adam is used as the optimizer to train the above-mentioned deep neural network.
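
The shape of this network can be sketched without any deep learning library. The following standard-library example builds a multi-layer perceptron with 3 inputs, seven hidden layers of 64 Tanh units and 15 outputs, and evaluates the mean square error. It is an illustration only: the weights are random, the linear output layer and the initialization are assumptions, and the authors' actual model is a Keras network trained with Adam.

```python
import math
import random

LAYER_SIZES = [3] + [64] * 7 + [15]  # inputs, seven hidden layers, outputs

def init_mlp(layer_sizes, seed=0):
    """Create random weights and zero biases for each layer (assumed init)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def forward(layers, x):
    """Forward pass: Tanh on hidden layers, linear output (assumption)."""
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        x = z if i == len(layers) - 1 else [math.tanh(v) for v in z]
    return x

def mse(pred, target):
    """Mean square error used as the training loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

model = init_mlp(LAYER_SIZES)
posture = forward(model, [1.5, 1.5, 1.5])  # three servomotor parameters in ms
print(len(posture))
```

The 15 outputs correspond to the three coordinates of the five feature points; training would adjust the weights to minimize `mse` against the measured coordinates.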

First, the servomotors are driven through the reachable space of the entire soft arm at a constant interval; the binocular camera captures the bending shape of the soft arm each time, and the shape corresponding to each action is saved. Then, the morphological characteristics of the soft arm are extracted and the usable image data are screened. Finally, the pulling lengths of the cables are used as input and the morphological feature data of the soft arm as output to train the data-driven model, which yields the simulation environment of the soft arm. To train the neural network by stochastic gradient descent (SGD), a data set of images of the soft arm is constructed; 85% of the images are randomly selected as the training set, and the remaining 15% are used as the test set. This method solves the problem of difficult modeling of the soft arm and establishes a multi-layer perceptron model of the soft arm from the data collected in the experiment.
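
The random 85%/15% partition can be sketched as follows. This is a minimal standard-library illustration; the function name, the fixed seed and the use of integer indices in place of image records are assumptions made for the example.

```python
import random

def train_test_split(samples, train_fraction=0.85, seed=0):
    """Shuffle the samples and split them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Example with 200 placeholder sample indices.
train_set, test_set = train_test_split(range(200))
print(len(train_set), len(test_set))
```

Every sample ends up in exactly one of the two sets, so the test accuracy is measured on postures the network has not seen during training.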

To reduce the data dimension and speed up training, the data set here refers to the pose information of the soft arm when a single driving cable is pulled alone to produce different degrees of bending. As shown in Figure 5, each driving cable is pulled individually, and the three resulting bending directions, 120° apart from each other, form the motion primitives.

FIGURE 5. Simplified description of the data set.

As shown in Figure 6, the trajectory map (red) before training is slightly disorganized, and it cannot be clearly seen that the bending directions are 120° from each other when pulling one cable. The trajectory map (blue) after training effectively eliminates abnormal pose information, and more intuitively shows the movement trajectory characteristics of the soft arm under the 3-cable drive.

FIGURE 6. Comparison of the trajectory of the soft arm before and after training.

As is shown in Figure 7, the experimental results of the space soft arm (that is, the soft arm in the air) verify the validity of the established inverse kinematics model and the rationality of the soft arm positioning control method based on this model.

FIGURE 7. Accuracy of train set and test set of the trajectory of the soft arm.

The same experiments were also carried out under water. For the space soft arm and the underwater soft arm, the drive cable was pulled with the same step to complete two sets of experiments. The experimental results show that the data-driven model is also suitable for the underwater movement of the soft arm, but the underwater soft arm takes longer to complete the bending action than the space soft arm. When the same servomotor control parameters are executed, the static bending degree of the underwater soft arm after the action is completed is greater than that of the space soft arm. When the drive cable is pulled with the same step, the data distribution is dense at both ends and sparse in the middle, which is synchronized with the change in the pulling length of the cable.

The Single-Arm Control of Three-Dimensional Deep Reinforcement Learning

The idea of reinforcement learning (RL) comes from zoology theory and conditioning theory. It is a kind of bionic algorithm that people get through the study of animal learning. RL relies on exploratory learning to give robots the ability to learn adaptively and can solve the problems of complex design process, and lack of robustness and autonomy in traditional control algorithms.

In this paper, Deep Q-Network (DQN) in RL is used as the soft arm control algorithm. DQN is the combination of Q-Learning and neural network, turning the Q table of Q-Learning into Q-Network. The use of deep neural network to approximate the Q table enables Q-Learning not only to process continuous state spaces, but also to have a certain generalization ability, which effectively enhances the application range of traditional Q-Learning (Zhang et al., 2015). Q-learning is a dynamic programming method based on value iteration. The function follows the following update formula:

Q_{t} (s_{t}, a_{t}) \leftarrow Q_{t} (s_{t}, a_{t}) + α \left( r_{t} + γ \max_{a} Q_{t + 1} (s_{t + 1}, a) - Q_{t} (s_{t}, a_{t}) \right), (4)

where $Q_{t} (s_{t}, a_{t})$ is the Q function, $r_{t}$ is the reward, $α$ is the learning rate, and $γ$ is the attenuation coefficient (discount factor).
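
The update rule in Eq. 4 can be illustrated with a tabular example. The sketch below applies one update to a small dictionary-based Q table; in DQN the table is replaced by a neural network, so this only shows the arithmetic of the rule. The state and action names and the numerical values are hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one Q-learning update (Eq. 4) in place and return the new value."""
    best_next = max(q[next_state].values())          # max over a of Q(s', a)
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]

# Hypothetical two-state, two-action table.
q_table = {
    "s0": {"pull": 0.0, "release": 0.0},
    "s1": {"pull": 1.0, "release": 2.0},
}
new_value = q_update(q_table, "s0", "pull", reward=1.0,
                     next_state="s1", alpha=0.5, gamma=0.9)
print(new_value)
```

With these numbers the target is 1.0 + 0.9 × 2.0 = 2.8, and a learning rate of 0.5 moves the old value of 0 halfway toward it.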

In this paper, the training parameters are shown in Supplementary Table S2. The posture control of the soft arm is a process of continuous exploration. To improve the training effect of deep reinforcement learning, it is necessary to design an appropriate reward function (Supplementary Table S3) to adjust the control strategy, together with an end condition for the current training round. When the posture error of the soft arm is smaller than the set threshold, or the number of steps exceeds the set maximum, the round is stopped.
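
The end condition of a round can be sketched as below. The posture error is taken here as the mean Euclidean distance between matching feature points, which is an assumption; the actual error measure, threshold and step limit are defined in the Supplementary Material and are not reproduced here.

```python
import math

def posture_error(current, target):
    """Mean Euclidean distance between matching 3D feature points (assumed metric)."""
    return sum(math.dist(c, t) for c, t in zip(current, target)) / len(current)

def round_finished(current, target, step, threshold, max_steps):
    """Stop when the posture is close enough or the step budget is exceeded."""
    return posture_error(current, target) < threshold or step > max_steps

# Hypothetical postures with two feature points each (coordinates in mm).
current = [(0.0, 0.0, 0.0), (0.0, 0.0, 40.0)]
target = [(0.0, 0.0, 0.0), (3.0, 4.0, 40.0)]
print(posture_error(current, target))
```

In this example only the second point deviates, by 5 mm, so the mean error over the two points is 2.5 mm.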

The proposed DQN training algorithm is based on the ε-greedy strategy. In each training round, a set of initial poses and a set of end poses are randomly selected from the data set. At the beginning, random actions are selected with a greater probability so that the DQN explores its surroundings. Late in training, the optimal control action is selected with a greater probability. This helps the agent jump out of local optima and find the global optimum. The training process of one round is shown in Figure 8: the line connecting the orange marking points is the target posture of the soft arm model, and the line connecting the blue marking points is the current posture. The current posture keeps getting closer to the target posture through training.
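
The shift from exploration to exploitation described above can be sketched with a decaying exploration probability. The linear schedule and its start and end values are assumptions made for illustration; the schedule actually used is part of the training parameters in the Supplementary Material.

```python
import random

def epsilon_at(episode, n_episodes, eps_start=1.0, eps_end=0.05):
    """Linearly decay the exploration probability over training (assumed schedule)."""
    frac = min(episode / max(n_episodes - 1, 1), 1.0)
    return eps_start + frac * (eps_end - eps_start)

def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Early rounds explore almost always; late rounds mostly exploit.
print(epsilon_at(0, 100), epsilon_at(99, 100))
print(select_action([0.1, 0.9, 0.3], epsilon=0.0))
```

With `epsilon=0.0` the choice is purely greedy and returns the index of the largest Q value.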

FIGURE 8. The training process of the soft arm in a certain round.

The rate of success in the simulation training process is shown in Figure 9. As the number of training rounds increases, the rate of success fluctuates and eventually stabilizes at 71%. Therefore, the effectiveness of this DQN-based reinforcement learning method in controlling the posture of the soft arm has been verified in the simulation environment.

FIGURE 9. The rate of success in the training process of the soft arm.

Motion Coordination and Gait Design of a Bipedal Walking Soft Robot Bio-Inspired by Octopus

The Design of the Octopus-Inspired Soft Robot

Among different forms of locomotion, friction-based gaits are among the slowest, whereas running, jumping, and flying are among the fastest. Walking and swimming are intermediate between these two extremes (Calisti et al., 2017). Underwater legged locomotion was studied by means of a robotic octopus-inspired prototype and its associated model (Calisti et al., 2015). The mass of that robot is 0.755 kg and the length of its legs is 0.3 m, and it achieves an average speed of 4.2 cm/s.

In this paper, a biped walking (or running) soft robot is designed as shown in Figure 10A. The height of the robot is 0.2 m and its mass is 0.4 kg. The length of the legs is 0.15 m and their taper angle is $7.63°$. The taper angle of the soft arms that compose the soft robot is larger than that of the single soft arm in the previous experiment, so that they support the robot platform better. Two identical balloons are mounted in symmetrical positions on the robot platform to balance the robot. In the underwater bipedal walking experiment, the robot needs to be balanced in the vertical direction, where it is subject to its own gravity and to the buoyancy provided by these balloons, each with a radius of 3.5 cm.
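
The vertical force balance can be estimated from the stated mass and balloon radius. The sketch below assumes spherical, fully submerged balloons of negligible mass in fresh water (1000 kg/m³) and ignores the buoyancy of the robot body itself; these are simplifying assumptions, not measured values.

```python
import math

RHO_WATER = 1000.0  # kg/m^3, fresh water (assumed)
G = 9.81            # m/s^2

def sphere_volume_m3(radius_m):
    """Volume of a sphere of the given radius."""
    return 4.0 / 3.0 * math.pi * radius_m ** 3

def balloon_buoyancy_n(radius_m, count):
    """Archimedes buoyancy of `count` fully submerged spherical balloons."""
    return RHO_WATER * G * sphere_volume_m3(radius_m) * count

weight_n = 0.4 * G                     # robot mass of 0.4 kg
lift_n = balloon_buoyancy_n(0.035, 2)  # two balloons, radius 3.5 cm
print(f"weight {weight_n:.2f} N, balloon lift {lift_n:.2f} N")
```

Under these assumptions the two balloons offset most of the robot's weight, and the remainder is carried by the buoyancy of the body and by the legs in contact with the ground.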

FIGURE 10. (A) The assembly of the octopus-inspired soft robot (B) Distribution of double-arm drive cables, tension direction and bending direction.

The bipedal walking method of an octopus is different from the common crawling method. The bipedal walking process is shown in Figure 11. RU means lifting the right leg, RD means putting down the right leg, LU means lifting the left leg, LD means putting down the left leg, SS means single support, and DS means double support (Wu et al., 2021).

FIGURE 11. Bipedal locomotion stride for Amphioctopus marginatus Octopus

Inspired by the biped walking gait of Amphioctopus marginatus Octopus (Figure 11), the walking gait of the biped robot is planned. The walking gait relies on the effective control of the six drive cables in the two arms. The distribution position of these cables is shown in Figure 10B. Three drive cables labeled 1, 2 and 3 control one arm, and the other three drive cables labeled 4, 5 and 6 control the other arm. The pulling direction is all perpendicular to the outside, and the arrow is the bending direction of the soft arm when the corresponding drive cable is pulled. At the same time, the bending direction of the No. 1 drive cable is designated as the front side, that is, the forward direction.

Taking a single arm as an example (Figure 12), under the same parameter conditions, pulling No. 2 and No. 3 at the same time produces the same degree of bending as pulling No. 1 only, but in the opposite direction. When the drive cables are pulled by 32 mm, the bending angle is 25° in both cases, and the bending directions differ by 180°.
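
This symmetry follows from the 120° cable layout. As a first-order illustration, each cable can be treated as a unit vector in its bending direction and equal pulls can be summed as vectors; this linear superposition is an idealization introduced here, not the data-driven model of the arm.

```python
import math

# Bending direction of each cable in the arm's cross section, 120 deg apart.
CABLE_DIRECTIONS_DEG = {1: 0.0, 2: 120.0, 3: 240.0}

def resultant(pulls):
    """Vector sum of cable pulls.

    pulls maps cable number to relative pull; returns (magnitude, direction_deg).
    """
    x = sum(p * math.cos(math.radians(CABLE_DIRECTIONS_DEG[c]))
            for c, p in pulls.items())
    y = sum(p * math.sin(math.radians(CABLE_DIRECTIONS_DEG[c]))
            for c, p in pulls.items())
    return math.hypot(x, y), math.degrees(math.atan2(y, x)) % 360.0

print(resultant({1: 1.0}))          # cable 1 alone
print(resultant({2: 1.0, 3: 1.0}))  # cables 2 and 3 together
```

In this idealization the resultant of cables 2 and 3 has the same magnitude as cable 1 alone and points 180° away from it, which matches the observation in Figure 12.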

FIGURE 12. Bending comparison of single arm. (A) Pull the No. 1 drive cable by 32 mm (B) Pull the No. 2 and 3 drive cables by 32 mm respectively.

In the same way, for the biped robot, the effect of pulling No. 5 and No. 6 at the same time is the same as the effect of pulling No. 1 only, and the direction is also the same.

The Gait Design of Straight Walking of the Octopus-Inspired Soft Robot

The original state is that the robot’s legs are slightly bent forward, and both legs are in contact with the bottom of the water tank.

The three individual actions of a single arm are ordered as follows.

Action 1: Stretch the No.1 drive cable and loosen the No.2 and No.3 drive cables to make the soft arm bend forward.

Action 2: Loosen the No.1 drive cable and keep the No.2 and No.3 drive cables as in the last action so that the soft arm contacts the ground.

Action 3: Stretch the No.2 and No.3 drive cables and loosen the No.1 drive cable to make the soft arm bend back.

According to the labels of the six cables in Figure 10B, the motion cycle of the robot’s straight walking is ordered into four sequences as follows.

Action 1: The soft arm controlled by the No.1∼3 drive cables (the No.1 soft arm) bends forward and leaves the ground, and the soft arm controlled by the No.4∼6 drive cables (the No.2 soft arm) relaxes and keeps in contact with the ground.

Action 2: The No.1 soft arm keeps the bending of the previous stage, and the No.2 soft arm bends backward and pushes the ground to generate forward thrust, which pushes the soft robot forward by half a step.

Action 3: The No.1 soft arm relaxes and returns to contact with the ground, and the No.2 soft arm bends forward and leaves the ground.

Action 4: The No.1 soft arm bends backward and pushes the ground to generate forward thrust, which pushes the robot forward to complete a step, and the No.2 soft arm keeps the state of the previous stage.

When the robot goes forward, the drive status of six cables is shown in Supplementary Table S4 briefly to implement the above actions. “F” represents stretching the cable to make one soft arm bend forward, “B” represents stretching the cable to make one soft arm bend back, and “O” represents the original state or last state.
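
The four-stage cycle can be written as a small command table. The sketch below is reconstructed from the prose description of the actions only; it does not reproduce Supplementary Table S4. It simplifies each cable to pulled or released, uses illustrative command letters (F forward, B backward, R relax, H hold), and assumes from the text that cable 1 bends the first arm forward while cables 5 and 6 bend the second arm forward.

```python
RELAXED = (False, False, False)

# Pull patterns (True = cable pulled) that bend each arm forward.
FORWARD_PATTERN = {
    "arm1": (True, False, False),  # cables 1, 2, 3: cable 1 bends forward
    "arm2": (False, True, True),   # cables 4, 5, 6: cables 5 and 6 bend forward
}

# One straight-walking cycle as (arm 1 command, arm 2 command) per action.
STRAIGHT_GAIT = [("F", "R"), ("H", "B"), ("R", "F"), ("B", "H")]

def arm_pattern(arm, command, previous):
    """Translate an arm-level command into pull states of its three cables."""
    forward = FORWARD_PATTERN[arm]
    if command == "F":
        return forward
    if command == "B":
        return tuple(not pulled for pulled in forward)
    if command == "R":
        return RELAXED
    return previous  # "H": keep the state of the previous stage

def cable_schedule(gait=STRAIGHT_GAIT):
    """Return the pull state of cables 1 to 6 for each action of the cycle."""
    state1, state2 = RELAXED, RELAXED
    schedule = []
    for cmd1, cmd2 in gait:
        state1 = arm_pattern("arm1", cmd1, state1)
        state2 = arm_pattern("arm2", cmd2, state2)
        schedule.append(state1 + state2)
    return schedule

for step, cables in enumerate(cable_schedule(), start=1):
    print(f"Action {step}: {cables}")
```

Repeating the schedule produces continuous walking; a real controller would send graded pulse widths to the six servomotors rather than on/off states.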

The robot was placed in a water tank with a length of 80 cm, a width of 45 cm, and a depth of 45 cm, so that its body was completely submerged in water, and its soft arms were kept in contact with the flat ground under the water. According to the 6-cable control commands corresponding to the robot’s straight walking gait planned above, the underwater bipedal walking experiment was carried out (Figure 13).

FIGURE 13. Frame-by-frame analysis of the video of the straight walking on the flat ground.

Using the 6-cable driving control commands for the straight walking gait planned in Supplementary Table S4, the motion period was adjusted in steps of 0.1 s between 0.3 and 1 s, and ten walking experiments were completed at each period. The time and distance were recorded in turn to calculate the walking speed of the robot at the corresponding motion period. The motion period was then converted into motion frequency. According to the $3σ$ rule, the average speed after excluding abnormal values was taken as the average walking speed at each motion frequency (Supplementary Table S5). As shown in Figure 14, the motion velocity of the biped robot is approximately proportional to the frequency.
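
The averaging step can be sketched as follows. This is a single-pass 3σ filter based on the sample standard deviation, written with the standard library; whether the original analysis used this exact variant is not stated, and the speed values in the example are hypothetical.

```python
import statistics

def mean_after_3sigma(values):
    """Average the values that lie within three sample standard deviations."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    kept = [v for v in values if abs(v - mu) <= 3.0 * sigma]
    return statistics.mean(kept), kept

def period_to_frequency_hz(period_s):
    """Convert a motion period in seconds to a motion frequency in hertz."""
    return 1.0 / period_s

# Hypothetical speeds in cm/s: nineteen consistent runs and one outlier.
speeds = [7.8] * 19 + [50.0]
average, kept = mean_after_3sigma(speeds)
print(round(average, 2), len(kept), period_to_frequency_hz(0.5))
```

Note that in a single pass with the sample standard deviation, no point can deviate by more than (n - 1)/√n standard deviations, about 2.85 for n = 10, so this variant only removes points in larger samples.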

FIGURE 14. The relationship between motion velocity and the frequency.

Based on the image information captured by the camera, a machine vision algorithm is used to extract the straight walking trajectory of the center of mass of the robot in an underwater flat environment, as shown in Figure 15. The forward direction of the robot is the positive direction of the X-axis. Overall, the robot has a good straight walking gait, and the movement is relatively stable. In the water tank with a length of 0.8 m, the center of mass of the robot went through a total of 11 up-and-down movements during straight-line walking, that is, 5.5 motion cycles. The average peak-to-peak value of the fluctuation of the robot’s center of mass is 0.65 cm, and the average forward distance of each motion cycle is 14.5 cm.

FIGURE 15. The straight walking trajectory of the center of mass of the robot on the flat ground.

Figure 16 shows the velocity in the horizontal and vertical directions during straight walking on the underwater flat ground. The movement speed fluctuates regularly.

FIGURE 16. The velocity of the robot in horizontal and vertical directions when straight walking on the underwater flat ground.

In each motion cycle, the maximum instantaneous speed of the robot in the x-axis is generated at the highest point of the motion trajectory. The robot can achieve an average speed of 7.78 cm/s, and the maximum instantaneous speed can reach 12.8 cm/s.
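
Average and instantaneous speeds of this kind can be obtained from a sampled center-of-mass trajectory by finite differences. The sketch below shows the calculation on hypothetical samples; the time stamps and positions are made up for illustration and are not the measured trajectory.

```python
def finite_difference_velocity(times_s, positions_cm):
    """Velocity between consecutive samples, in cm/s."""
    samples = list(zip(times_s, positions_cm))
    return [(p1 - p0) / (t1 - t0)
            for (t0, p0), (t1, p1) in zip(samples, samples[1:])]

def average_speed(times_s, positions_cm):
    """Net displacement divided by elapsed time, in cm/s."""
    return (positions_cm[-1] - positions_cm[0]) / (times_s[-1] - times_s[0])

# Hypothetical x positions of the center of mass sampled once per second.
t = [0.0, 1.0, 2.0, 3.0]
x = [0.0, 5.0, 15.0, 18.0]
v = finite_difference_velocity(t, x)
print(v, max(v), average_speed(t, x))
```

The maximum of the per-interval velocities plays the role of the maximum instantaneous speed, while the net displacement over the elapsed time gives the average speed.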

To test the locomotion ability of the bionic octopus biped walking robot in a more complex environment that is closer to the real seabed, and to assess the robustness of its walking action underwater, a layer of sand 1∼2 cm thick was laid at the bottom of the original water tank. Using the 6-cable drive control commands for the straight walking gait planned in the previous section, with the motion period of a single soft arm set to 0.3 s, the underwater bipedal walking experiment was carried out in the sandy environment. Figure 17 shows the running state of the robot over one motion period. Over many experiments, the average speed of the robot in the underwater sand environment is 5.3 cm/s.

FIGURE 17. Frame-by-frame analysis of the video of the straight walking on the sandy ground.

According to the captured motion image information of the robot, the straight walking trajectory of the center of mass of the robot in the underwater sandy environment is extracted, as shown in Figure 18. Although the sandy environment has a certain impact on the robot’s bipedal straight walking, the fluctuation of the motion state is in general still relatively stable. In the underwater sandy environment with a length of 0.8 m, the center of mass of the robot went through a total of 16 up-and-down movements during straight walking, that is, eight motion cycles. Affected by the uneven height of the sandy ground, the average peak-to-peak value of the fluctuation of the robot’s center of mass increases to 0.95 cm. Affected by the resistance of the sandy ground, the average forward distance of the robot in each motion period is reduced to 10 cm compared to the flat environment. Despite some resistance and slippage, the motion of the robot is stable overall on the sandy ground. The fluctuation of the motion trajectory is caused by the unevenly laid sand.
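
The reported cycle counts and stride lengths can be cross-checked against the 0.8 m tank length. The sketch below assumes that one motion cycle contains two up-and-down movements of the center of mass, as implied by the counts given for the flat and sandy ground.

```python
def distance_covered_cm(up_down_count, stride_cm):
    """Distance walked, given two up-and-down movements per motion cycle."""
    cycles = up_down_count / 2.0
    return cycles * stride_cm

flat_cm = distance_covered_cm(11, 14.5)  # flat ground: 5.5 cycles of 14.5 cm
sand_cm = distance_covered_cm(16, 10.0)  # sandy ground: 8 cycles of 10 cm
print(flat_cm, sand_cm)
```

Both products come out at about 80 cm, which is consistent with the length of the water tank.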

FIGURE 18. The straight walking trajectory of the center of mass of the robot on the sandy ground.

Summary and Outlook

According to the motion characteristics of the octopus’s tentacle, a data-driven model relating the servomotor control parameters to the three-dimensional posture of the soft arm is established based on the TensorFlow framework. The DQN strategy in deep reinforcement learning is used to train the model to control the actual posture of the soft arm. This modeling and control method is applied to the octopus-inspired biped robot, and the walking gait of the robot is designed. Observation and analysis of multiple underwater biped walking experiments in the water tank confirm the rationality of the gait design. The bipedal octopus walking robot achieves an average speed of 7.78 cm/s when walking straight, and the maximum instantaneous speed can reach 12.8 cm/s. At the same time, it can also turn around quickly and stably.

Compared with other underwater robots, the main advantages of the robot in this paper follow from the comparison below:

1) The crawling mechanism, manipulating arm and swimming mechanism of the POSEIDRONE robot (Calisti et al., 2015) are independent and decoupled from each other, which reduces the difficulty but increases the complexity.

2) Another bipedal walking robot (Portilla et al., 2019) is hydraulically driven, and the SLIP model of land bipedal walking is extended and applied to underwater bipedal walking control.

3) The SILVER and SILVER2 robots have a legged rigid structure that has only two degrees of freedom.

Compared with the above three kinds of legged robots, the robot in this paper uses bionic octopus flexible arms for both manipulation and bipedal walking. Compared with a rigid structure, it conforms more closely to the environment, has less impact on it, and is more environmentally friendly.

In the future, cameras and IMU modules could be added to the platform to give the bionic octopus robot underwater target recognition and autonomous navigation functions. It could then be applied in more work scenarios and deliver greater value. In addition, the robot’s robustness to impact disturbances needs to be improved. For this, we will draw on the neural-network-enhanced control system of Liu et al. (2021), which stabilizes a three-dimensional simulated biped model of a human wearing an exoskeleton and is robust to impact disturbances.

Data Availability Statement

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

Author Contributions

QW coordinated the research project and conceived the presented idea. YW designed the system, analyzed the data, and wrote the paper in collaboration with XY. BZ and JW supervised the analysis and provided advice. SC and AZ revised the paper.

Funding

This material was based upon the work supported by Key Projects of Science and Technology Plan of Zhejiang Province (2019C04018).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frobt.2022.815435/full#supplementary-material.

References

Aracri, S., Giorgio-Serchi, F., Suaria, G., Sayed, M. E., Nemitz, M. P., Mahon, S., et al. (2021). Soft Robots for Ocean Exploration and Offshore Operations: A Perspective. Soft Robotics 8 (6), 625–639. doi:10.1089/soro.2020.0011

Armanini, C., Farman, M., Calisti, M., Giorgio-Serchi, F., Stefanini, C., and Renda, F. (2021). “Flagellate Underwater Robotics at Macroscale: Design, Modeling, and Characterization,” in IEEE Transactions on Robotics (IEEE), 1–17. doi:10.1109/TRO.2021.3094051

Calisti, M., Corucci, F., Arienti, A., and Laschi, C. (2015). Dynamics of Underwater Legged Locomotion: Modeling and Experiments on an Octopus-Inspired Robot. Bioinspir. Biomim. 10 (4), 046012. doi:10.1088/1748-3190/10/4/046012

Calisti, M., Picardi, G., and Laschi, C. (2017). Fundamentals of Soft Robot Locomotion. J. R. Soc. Interf. 14 (130), 20170101. doi:10.1098/rsif.2017.0101

Giorelli, M., Renda, F., Ferri, G., and Laschi, C. (2013). “A Feed-Forward Neural Network Learning the Inverse Kinetics of a Soft cable-driven Manipulator Moving in Three-Dimensional Space,” in 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE), 5033–5039. doi:10.1109/iros.2013.6697084

Giorgio-Serchi, F., Arienti, A., Corucci, F., Giorelli, M., and Laschi, C. (2017). Hybrid Parameter Identification of a Multi-Modal Underwater Soft Robot. Bioinspir. Biomim. 12 (2), 025007. doi:10.1088/1748-3190/aa5ccc

Jiang, H., Wang, Z., Jin, Y., Li, P., Gan, Y., Lin, S., et al. (2020). Design, Control, and Applications of a Soft Robotic Arm. England: International Journal of Robotics Research, 04047.

Kim, D., Kim, S.-H., Kim, T., Kang, B. B., Lee, M., Park, W., et al. (2021). Review of Machine Learning Methods in Soft Robotics. Plos One 16 (2), e0246102. doi:10.1371/journal.pone.0246102

Lafmejani, A. S., Doroudchi, A., Farivarnejad, H., He, X., Aukes, D., Peet, M. M., et al. (2020). Kinematic Modeling and Trajectory Tracking Control of an Octopus-Inspired Hyper-Redundant Robot. IEEE Robot. Autom. Lett. 5 (2), 3460–3467. doi:10.1109/LRA.2020.2976328

Li, G., Shintake, J., and Hayashibe, M. (2021). “Deep Reinforcement Learning Framework for Underwater Locomotion of Soft Robot,” in 2021 IEEE International Conference on Robotics and Automation (ICRA) (IEEE), 12033–12039. doi:10.1109/icra48506.2021.9561145

Liu, C., Audu, M. L., Triolo, R. J., and Quinn, R. D. (2021). Neural Networks Trained via Reinforcement Learning Stabilize Walking of a Three-Dimensional Biped Model with Exoskeleton Applications. Front. Robot. AI 8, 8. doi:10.3389/frobt.2021.710999

Lu, Z., Li, W., and Zhang, L. (2020). Research Development of Soft Manipulator: A Review. Adv. Mech. Eng. 12 (8), 168781402095009. doi:10.1177/1687814020950094

Nesher, N., Levy, G., Zullo, L., and Hochner, B. (2020). Octopus Motor Control. Switzerland AG: Springer, Nature, 1–26. doi:10.1093/acrefore/9780190264086.013.283

Ni, H., Wang, H., and Chen, W. (2017). Real-time Obstacle Avoidance and Position Control for a Soft Robot Based on its Redundant freedom. Robot 39 (03), 265–271. doi:10.13973/j.cnki.robot.2017.0265

Portilla, G., Saltarén, R., Espinosa, F. M. D., Barroso, A. R., Cely, J., and Yakrangi, O. (2019). Dynamic Walking of a Legged Robot in Underwater Environments. Sensors 19 (16), 3588. doi:10.3390/s19163588

Renda, F., Giorgio-Serchi, F., Boyer, F., Laschi, C., Dias, J., and Seneviratne, L. (2018). A Unified Multi-Soft-Body Dynamic Model for Underwater Soft Robots. Int. J. Robotics Res. 37 (6), 648–666. doi:10.1177/0278364918769992

Renda, F., Serchi, F. G., Boyer, F., and Laschi, C. (2015). Structural Dynamics of a Pulsed-Jet Propulsion System for Underwater Soft Robots. Int. J. Adv. Robotic Syst. 12, 68. doi:10.5772/60143

Santina, C. D., Katzschmann, R. K., Bicchi, A., and Rus, D. (2018). “Dynamic Control of Soft Robots Interacting with the Environment,” in 2018 IEEE International Conference on Soft Robotics (IEEE), 46–53. doi:10.1109/robosoft.2018.8404895

Sinatra, N. R., Teeple, C. B., Vogt, D. M., Parker, K. K., Gruber, D. F., and Wood, R. J. (2019). Ultragentle Manipulation of Delicate Structures Using a Soft Robotic Gripper. Sci. Robot. 4 (33), eaax5425. doi:10.1126/scirobotics.aax5425

Sivčev, S., Coleman, J., Omerdić, E., Dooly, G., and Toal, D. (2018). Underwater Manipulators: A Review. Ocean Eng. 163, 431–450. doi:10.1016/j.oceaneng.2018.06.018

Wu, Q., Yang, X., Wu, Y., Zhou, Z., Wang, J., Zhang, B., et al. (2021). A Novel Underwater Bipedal Walking Soft Robot Bio-Inspired by the Coconut octopus. Bioinspir. Biomim. 16 (4), 046007. doi:10.1088/1748-3190/abf6b9

Xie, Z., Domel, A. G., An, N., Green, C., Gong, Z., Wang, T., et al. (2020). Octopus Arm-Inspired Tapered Soft Actuators with Suckers for Improved Grasping. Soft Robotics 7 (5), 639–648. doi:10.1089/soro.2019.0082

You, X., Zhang, Y., Chen, X., Liu, X., Wang, Z., Jiang, H., et al. (2017). “Model-free Control for Soft Manipulators Based on Reinforcement Learning,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE), 2909–2915. doi:10.1109/iros.2017.8206123

Zhang, F., Leitner, J., Milford, M., Upcroft, B., and Corke, P. (2015). Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control. New York, NY: Cornell University.

Keywords: Octopus’s tentacle, soft arm, cable drive, data-driven model, deep reinforcement learning, bipedal coordinated walking

Citation: Wu Q, Wu Y, Yang X, Zhang B, Wang J, Chepinskiy SA and Zhilenkov AA (2022) Bipedal Walking of Underwater Soft Robot Based on Data-Driven Model Inspired by Octopus. Front. Robot. AI 9:815435. doi: 10.3389/frobt.2022.815435

Received: 15 November 2021; Accepted: 24 March 2022;
Published: 20 April 2022.

Edited by:

Ning Tan, Sun Yat-sen University, China

Reviewed by:

Francesco Giorgio-Serchi, University of Edinburgh, United Kingdom
Aiman Omer, Waseda University, Japan

Copyright © 2022 Wu, Wu, Yang, Zhang, Wang, Chepinskiy and Zhilenkov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Qiuxuan Wu, wuqx@hdu.edu.cn
