ORIGINAL RESEARCH article

Front. Neurorobot., 03 February 2023
This article is part of the Research Topic: The Roles of Self-organization and Sensory Adaptation for Locomotion in Animals and Robots

Bio-inspired neural networks for decision-making mechanisms and neuromodulation for motor control in a differential robot

  • 1Department of Engineering, Universidad Autónoma de Occidente, Cali, Colombia
  • 2Department of Mathematics, Universidad Autónoma de Occidente, Cali, Colombia
  • 3Department of Physics, Universidad Autónoma de Occidente, Cali, Colombia

The aim of this work is to propose bio-inspired neural networks for decision-making mechanisms and modulation of motor control of an automaton. In this work, we have adapted and applied cortical synaptic circuits, such as short-term memory circuits, winner-take-all (WTA) class competitive neural networks, modulation neural networks, and nonlinear oscillation circuits, in order to make the automaton able to avoid obstacles and explore simulated and real environments. The performance achieved by using biologically inspired neural networks to solve the task at hand is similar to that of several works mentioned in the specialized literature. Furthermore, this work contributed to bridging the fields of computational neuroscience and robotics.

1. Introduction

Decision-making is a process that allows animals to increase their chances of survival. It includes, for example, knowing when to flee from a threat, avoiding spoiled food, or performing attacking or breeding behaviors. Understanding how decision-making mechanisms work within the cerebral cortex and modeling their output behaviors has been a focus of research in neuroscience. Hurtado-López et al. (2017) and Hurtado-López and Ramirez-Moreno (2019) describe a neural network model that mimics social behavior in mice involving breeding and attack interactions. As mentioned in Hikosaka et al. (2018), the basal ganglia control body movements and are also involved in behavioral changes in animals. Héricé et al. (2016) propose a neural network model of the basal ganglia based on spiking neurons; the model allows second-level decision-making to be performed as in primates. In other experiments, Drosophila flies were placed in a flight simulator containing green and blue colored regions. If a fly stood on the blue regions, it received a heat punishment. The results gave evidence that these insects can adjust their flight behavior based on visual color information. Wei et al. (2017) propose a model based on Spiking Neural Networks (SNN) and postsynaptic plasticity that describes mathematically both the learning process and the decision-making behavior of Drosophila based on the visual information it receives.

Providing robotic navigation systems with the capacities mentioned above is of great interest in order to enhance their efficiency and autonomy. Zhao et al. (2020) developed an SNN model that allows describing the experimental behavior in Drosophila and implementing it in a UAV (Unmanned Aerial Vehicle). The results show that with the proposed model the UAV learns to make decisions quickly from the available visual information similar to the experiment. The closest approach to ours is made by Pardo-Cabrera et al. (2022), in which a bio-inspired navigation and exploration system for a robotic hexapod is developed. In this work, a network of social behaviors in mice, proposed in Hurtado-López et al. (2017) and Hurtado-López and Ramirez-Moreno (2019) is modified to perform homing, exploration, and approaching behaviors in robots. We propose a decision network to perform exploration in robots, and present as a novelty the implementation of a network inspired by the basal ganglia proposed by Ramirez-Moreno and Sejnowski (2012) to moderate the decision taken by the main network, reducing the reactivity of the system and providing greater safety in the navigation of the mobile platform.

Using bio-inspired neural networks allows us to transfer numerous animal and human behaviors and kinetics to autonomous robots. For instance, fast learning mechanisms for continuous adaptation, or flexible plasticity in sensory pathways, can generate stable self-organized locomotion that copes with failures and adapts to different walking conditions in robots. In addition, bio-inspired networks can work in combination with distributed neural central pattern generators (CPGs), proprioceptive sensory adaptation, and body-environment interaction to achieve adaptive and flexible interlimb coordination for walking robots, as mentioned in Miguel-Blanco and Manoonpong (2020).

Additionally, frequency adaptation is useful for controlling the locomotion of an automaton. Previous results show that integrating motor pattern mechanisms and adaptation in a CPG-RBF network leads to locomotion control of a hexapod robot in a complex environment. This kind of frequency adaptation not only significantly reduces energy use but is also comparable to the biological behaviors observed in animal locomotion (Thor et al., 2021).

As we have seen in previous works, there are numerous architectures of bio-inspired neural networks for the motor control of robots and automata. In this work, a novel bio-inspired neural network was designed for the control of the right and left actuators of a differential robot. For the latter, a reciprocal lateral inhibition circuit was used, which projects periodic cyclic signals and generates antagonistic nonlinear oscillations. The neuronal activity of these synaptic circuits with reciprocal lateral inhibition is typical of the motor control systems of periodic tasks such as breathing, swimming, or walking in vertebrates, among others.

Other approaches make neat use of CPGs to control a sprawling quadruped robot (Suzuki et al., 2021), contributing decentralized control with cross-coupled sensory feedback to shape body-limb coordination. This differs from previous CPG-based research, which relies on inter-oscillator couplings or gait patterns based on geometric mechanics.

Ngamkajornwiwat et al. (2020) propose an online self-adaptive locomotion control technique based on the integration of a modular neural locomotion control (MNLC) and an artificial hormone mechanism (AHM) for a walking hexapod robot. Their contribution allows robot control without requiring the robot's kinematics, an environmental model, or exteroceptive sensors. The technique relies only on a correlation between a predicted foot contact signal and the incoming foot contact signal from proprioceptive sensors. Steering and velocity regulation of the robot are thereby achieved.

More recent research introduces two further concepts for developing bio-inspired neural networks for the motor control of an automaton: the Self-Organizing Map (SOM) and the Spiking Neural Network (SNN). Zahra et al. (2022) integrate both architectures. The SNN forms a motor cortex-like differential map that transforms motor plans from task space into joint-space motor commands, and the SOM forms a static map that correlates the joint spaces of the robot and a teaching agent. Together they allow a robotic arm to learn from human actions, that is, to learn by imitation.

Spiking Neural Network models are based on the temporal sequences of action potential firing. Usually, the leaky integrate-and-fire (LIF) neuron model is used in these networks. This model involves biophysical properties of the neuron such as membrane capacitance, conductance, and resting potential.

Current research shows wide implementations of SNN in mobile automata's navigation tasks. In Cao et al. (2015) a three-layer SNN-based controller is designed and implemented for target tracking of a mobile robot. Environmental information and target information are provided by CCD cameras, encoders, and ultrasonic sensors. The authors implemented a learning strategy based on Hebb's rule to modify the synaptic weights in the connections of the neural network (NN) in charge of the tracking task. Besides, the synaptic weights from a NN specialized in obstacle avoidance are defined by the designer and do not change in time. This strategy seeks to have more relevance in the obstacle avoidance task than that in the tracking task.

Shim and Li (2017), Lobov et al. (2020), and Liu et al. (2022) addressed the use of SNN for the motor control of mobile robots. Liu et al. (2022) proposed a biological autonomous learning algorithm based on reward-modulated spike-timing-dependent plasticity (STDP). With it, an automaton can improve its decision-making in obstacle avoidance after a few sessions of trial and error in new environments, which provides robustness to the exploration task. Addressing the cognitive and perception functions instrumented in automata behaviors, Macktoobian and Khataminejad (2016) built a high-level cognitive behavior into a reactive agent, a Braitenberg vehicle (BV). Low-level perception is obtained by an SNN-based curved trajectory detection (CTD) model, with which the motion of an agent in the environment is detected. The vehicle's control for producing the desired behaviors from this perception is designed by an engineering method; approaching and fleeing behaviors are obtained.

Neurons respond to stimuli by generating action potentials. To describe the state of a neuron, the mean firing rate (MFR) of these action potentials can be taken. The dynamics of NNs based on MFR models can give a more immediate understanding of the expected behavior of the neuron than SNN models. Architectures based on SNN found in the literature have solved decision-making and motor control tasks. As mentioned in Suzuki et al. (2021), Thor et al. (2021), and Pardo-Cabrera et al. (2022), MFR models have been implemented to solve these tasks as well. The literature reviewed in this work shows that MFR models that address both decision-making and motor control for mobile automata navigation have not been explored widely. The present work proposes the design and implementation of a bio-inspired navigation framework for an automaton using the mathematical MFR neural model described by Wilson and Cowan (1972).

SNN models have an advantage in single-event-based learning based on Hebb's rule. In our work, the MFR model assimilates this advantage, which we have achieved through the meta-control network. Single-event-based learning enables modifying the performance of the automaton by setting the network's parameters in a single trial. In this work, however, the network's parameters are not modified, and for this reason the mechanism cannot be considered a learning mechanism; the improvement in performance is obtained through the meta-control network. This network allows the automaton's behavior to adapt to significant environment changes (pop-up novelties) and dynamic obstacles by properly modulating the velocity applied to the robot's wheels. For the proposed network, the experimental results show an improvement in the obstacle avoidance task when the meta-control network is involved. The literature consulted shows a prevalence of SNN models over mean firing rate models, at the cost of a certain mathematical simplicity. As already mentioned, the advantage of our model is the combination of such mathematical simplicity with the formulation of single-event-based learning.

2. Materials and methods

This section will explain the implemented bio-inspired neural circuits, the neural network design itself, the adaptation stage, and the signal processing.

2.1. Bio-inspired neural network design

As seen in Figure 1, the information perceived from the environment is captured by a LiDAR sensor and processed by the Signal processing block (Section 2.5). This block produces the signals Ar and Al, and S1 and S2. The Ar and Al signals convey information about the presence or absence of obstacles in the right and left areas, respectively. These signals enter the Short-term memory circuits block (Section 2.2.1), which extends their information in time. The projections from this block enter the Memory linear chain block (Section 2.2.2), which retains and increases their intensity. In the Comparison Circuit block (Section 2.2.3), the projections from the Memory linear chain block are compared, which promotes a faster decision by the Competitive Neural Network (WTA) block (Section 2.2.4). In the Competitive Neural Network (WTA) block, the projections of the previous block compete, and a motion decision is obtained among rightward, leftward, and forward. The Adaptation Stage block (Section 2.2.5) detects a tendency among the motions executed in a time interval and thus adapts the parameters of the Non-linear oscillation generating circuit block (Section 2.2.7), which produces the signals for the automaton's motor execution. Finally, the Meta-control circuit block (Section 2.2.6) modulates the rightward, leftward, and forward movements, improving performance in situations where a novelty is prioritized over a previously weighted decision. The Meta-control circuit block is fed by the S1 signal, which carries the information about any new obstacle, and by the S2 signal, which is obtained from S1 by forced complementarity. Forced complementarity is understood as a decremental response to an incremental stimulus, obtained by subtracting the stimulus from a threshold.

Figure 1. Architecture of the bio-inspired network for the exploration behavior.

The neuron model used in this work takes inspiration from the basic negative feedback loop described by Wilson and Cowan (1972), in which connections with arrow endings represent excitatory projections and connections with circled endings represent inhibitory projections.

The response (R) of a neuron to a single stimulus (P) is described by the differential equation in Equation (1) (Wilson and Cowan, 1972), where τ is the time constant.

$$\frac{dR}{dt} = \frac{1}{\tau}\left(-R + \Psi(M, P, \sigma)\right) \tag{1}$$

Ψ(M, P, σ) is the Naka-Rushton activation function (Wilson and Cowan, 1972), and is implemented as a mathematical approximation of these responses. M is the maximum firing rate for a very intense stimulus and σ, called the half-saturation constant, determines the value at which Ψ(M, P, σ) reaches half of its maximum. The mathematical representation is given in Equation (2).

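$$\Psi(M, P, \sigma) = \begin{cases} \dfrac{M P^{2}}{\sigma^{2} + P^{2}} & \text{for } P \geq 0 \\ 0 & \text{for } P < 0 \end{cases} \tag{2}$$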
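
As an illustration of Equations (1) and (2), the following minimal sketch integrates a single unit with the forward Euler method. The numerical values (M = 100, σ = 50, τ = 20 ms, a 1 ms step) are placeholders chosen for the example, not the values of Table 1, and the function names are hypothetical.

```python
def naka_rushton(M, P, sigma):
    """Naka-Rushton activation of Equation (2); zero for negative input."""
    if P < 0:
        return 0.0
    return M * P**2 / (sigma**2 + P**2)


def simulate_response(P, M=100.0, sigma=50.0, tau=0.02, dt=0.001, steps=500):
    """Forward-Euler integration of Equation (1), starting from R = 0.

    All default parameter values are illustrative assumptions.
    """
    R = 0.0
    trace = []
    for _ in range(steps):
        R += (dt / tau) * (-R + naka_rushton(M, P, sigma))
        trace.append(R)
    return trace


if __name__ == "__main__":
    trace = simulate_response(P=50.0)
    print(f"response after {len(trace)} steps: {trace[-1]:.3f}")
```

With a constant stimulus, R relaxes exponentially toward Ψ(M, P, σ); when P equals σ, that steady state is M/2, which is the meaning of the half-saturation constant.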

2.2. Cortical synaptic circuits

The bio-inspired neural networks implemented in this work are based on cortical synaptic circuits (Ramirez-Moreno and Hurtado-Lopez, 2014). These circuits are observed in the cerebral cortex, subcortical nuclei, and in the spinal cord.

Next, we present the cortical synaptic circuits that provide good performance in our robot framework. The differential equations of these circuits have the structure shown in Equation (1), and the parameters used are summarized in Table 1. These parameters were obtained heuristically; some of them were taken from Guerrero-Criollo et al. (2022).

Table 1. Parameters.

2.2.1. Short-term memory circuits. Recurrent excitatory

Recurrent excitatory circuits within the Central Nervous System (CNS) allow short-term retention of information. The function of the adapting interneurons is to execute a delayed control task over the main units.

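$$\frac{dZ_1}{dt} = \frac{1}{\tau_1}\left(-Z_1 + \Psi(A,\ \omega A_r + aZ_3,\ B + cZ_2)\right) \tag{3}$$
$$\frac{dZ_2}{dt} = \frac{1}{\tau_2}\left(-Z_2 + bZ_1\right) \tag{4}$$
$$\frac{dZ_3}{dt} = \frac{1}{\tau_1}\left(-Z_3 + \Psi(A,\ aZ_1,\ B + cZ_4)\right) \tag{5}$$
$$\frac{dZ_4}{dt} = \frac{1}{\tau_2}\left(-Z_4 + bZ_3\right) \tag{6}$$
$$\frac{dY_1}{dt} = \frac{1}{\tau_1}\left(-Y_1 + \Psi(A,\ \omega A_l + aY_3,\ B + cY_2)\right) \tag{7}$$
$$\frac{dY_2}{dt} = \frac{1}{\tau_2}\left(-Y_2 + bY_1\right) \tag{8}$$
$$\frac{dY_3}{dt} = \frac{1}{\tau_1}\left(-Y_3 + \Psi(A,\ aY_1,\ B + cY_4)\right) \tag{9}$$
$$\frac{dY_4}{dt} = \frac{1}{\tau_2}\left(-Y_4 + bY_3\right) \tag{10}$$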

The purpose of this short-term memory circuit is to extend in time the information captured from the environment, Ar and Al. For this purpose, each signal enters through a single channel to the recurrent excitation circuit, where the main neurons Z1 and Y1, respectively, process the signals. Input signals are maintained by the delayed feedback of neurons Z2 and Y2, as shown in Figure 2.

Figure 2. Short-term memory circuit. Recurrent excitation. Two circuits similar to the one shown above process the signals Ar and Al, respectively. The four processing units shown correspond to the units named in Equations (3)–(6) and (7)–(10).

These circuits replicate those found in the CNS associated with the short-term memory mechanism that, for instance, serves to recall a phone number for immediate use. Without them, the automaton, like its biological counterpart, would not be able to remember previously processed information; in this case, it would not be able to remember whether there was an obstacle before or not.
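
The recurrent excitation circuit of Equations (3)–(6) can be sketched as below for the Ar channel; the Al channel of Equations (7)–(10) has the same form. All parameter values are placeholders assumed for illustration (Table 1 is not reproduced here), and the function names are hypothetical. Whether the trace outlasts the stimulus, and for how long, depends on those values.

```python
def naka_rushton(M, P, sigma):
    """Naka-Rushton activation of Equation (2)."""
    if P < 0:
        return 0.0
    return M * P**2 / (sigma**2 + P**2)


def short_term_memory(stimulus, A=100.0, B=30.0, a=1.0, b=1.0, c=0.5,
                      omega=1.0, tau1=0.02, tau2=0.2, dt=0.001):
    """Forward-Euler integration of Equations (3)-(6).

    `stimulus` is a sequence of Ar samples, one per time step.
    Z1 and Z3 are the main units; Z2 and Z4 are the slower adapting
    interneurons that raise the half-saturation term of their main unit.
    Default parameter values are illustrative assumptions.
    """
    Z1 = Z2 = Z3 = Z4 = 0.0
    trace = []
    for Ar in stimulus:
        dZ1 = (-Z1 + naka_rushton(A, omega * Ar + a * Z3, B + c * Z2)) / tau1
        dZ2 = (-Z2 + b * Z1) / tau2
        dZ3 = (-Z3 + naka_rushton(A, a * Z1, B + c * Z4)) / tau1
        dZ4 = (-Z4 + b * Z3) / tau2
        Z1, Z2 = Z1 + dt * dZ1, Z2 + dt * dZ2
        Z3, Z4 = Z3 + dt * dZ3, Z4 + dt * dZ4
        trace.append((Z1, Z2, Z3, Z4))
    return trace


if __name__ == "__main__":
    # 200 ms pulse followed by 300 ms of silence (1 ms steps).
    pulse = [50.0] * 200 + [0.0] * 300
    trace = short_term_memory(pulse)
    print("Z1 at pulse end:", round(trace[199][0], 2))
    print("Z1 at the end  :", round(trace[-1][0], 2))
```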

2.2.2. Memory linear chain

Projections from recurrent excitatory circuits (Figure 2) enter a memory linear chain of five neurons (Figure 3).

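$$\frac{dM_1}{dt} = \frac{1}{\tau}\left(-M_1 + \Psi(A,\ Z_1,\ C)\right) \tag{11}$$
$$\frac{dM_i}{dt} = \frac{1}{\tau}\left(-M_i + \Psi(A,\ M_{(i-1)},\ C)\right),\quad i = 2, 3, \ldots, 5 \tag{12}$$
$$\frac{dA_M}{dt} = \frac{1}{\tau_\epsilon}\left(-A_M + \Psi\left(A,\ \sum_{i=1}^{5} M_i,\ C\right)\right) \tag{13}$$
$$\frac{dN_1}{dt} = \frac{1}{\tau}\left(-N_1 + \Psi(A,\ Y_1,\ C)\right) \tag{14}$$
$$\frac{dN_i}{dt} = \frac{1}{\tau}\left(-N_i + \Psi(A,\ N_{(i-1)},\ C)\right),\quad i = 2, 3, \ldots, 5 \tag{15}$$
$$\frac{dA_N}{dt} = \frac{1}{\tau_\epsilon}\left(-A_N + \Psi\left(A,\ \sum_{i=1}^{5} N_i,\ C\right)\right) \tag{16}$$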

Implementation of the memory linear chain above seeks to retain and enhance the intensity of the projections. Each memory unit is separated from each other by delay units (Δt) and their projections are accumulated in the accumulation neurons AM (Equation 13) and AN (Equation 16), respectively, for each processing channel, Figure 3.

Figure 3. Memory linear chain. Two circuits similar to the one shown above process the output of the short-term memory circuit (Figure 2). The five processing units shown correspond to the units named in Equations (11), (12) and (14), (15).

Putting together the memory linear chain and the short-term memory circuits, it is possible to obtain an approximation to working memory. And thus, like its biological counterpart, the automaton could retain the previously processed information.
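
A minimal sketch of the memory linear chain and its accumulation neuron, Equations (11)–(13), follows. The parameter values are placeholders assumed for the example, and the explicit delay units (Δt) mentioned in the text are not modeled: here the first-order lag of each unit is the only delay, which is a simplifying assumption.

```python
def naka_rushton(M, P, sigma):
    """Naka-Rushton activation of Equation (2)."""
    if P < 0:
        return 0.0
    return M * P**2 / (sigma**2 + P**2)


def memory_chain(z1_trace, A=100.0, C=40.0, tau=0.02, tau_eps=0.05,
                 dt=0.001, n_units=5):
    """Forward-Euler integration of Equations (11)-(13).

    `z1_trace` holds one Z1 sample per time step. Returns the trace of
    the accumulation neuron AM. Default values are illustrative assumptions.
    """
    M = [0.0] * n_units
    AM = 0.0
    out = []
    for z1 in z1_trace:
        # Unit 1 is driven by Z1; unit i is driven by unit i-1.
        inputs = [z1] + M[:-1]
        dM = [(-m + naka_rushton(A, p, C)) / tau for m, p in zip(M, inputs)]
        dAM = (-AM + naka_rushton(A, sum(M), C)) / tau_eps
        M = [m + dt * d for m, d in zip(M, dM)]
        AM += dt * dAM
        out.append(AM)
    return out


if __name__ == "__main__":
    am = memory_chain([80.0] * 1000)
    print("AM after 1 s of constant input:", round(am[-1], 2))
```

Because the accumulation neuron receives the sum of the five chain units, its input is larger than any single unit's activity, which is one way to read the statement that the chain enhances the intensity of the projections.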

2.2.3. Comparison circuit

Units U1 and U2 in Figure 4B act as comparison neurons. These units receive the reciprocal excitatory and inhibitory projections from the accumulation neurons AM and AN, see Figure 4.

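$$\frac{dU_1}{dt} = \frac{1}{\tau_\epsilon}\left(-U_1 + \frac{\max\left(0,\ A(A_M - A_N)\right)}{D + A_M - A_N}\right) \tag{17}$$
$$\frac{dU_2}{dt} = \frac{1}{\tau_\epsilon}\left(-U_2 + \frac{\max\left(0,\ A(A_N - A_M)\right)}{D + A_N - A_M}\right) \tag{18}$$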

Projections from accumulation neurons AM and AN are compared in order to promote a faster and clearer decision by the WTA (winner-take-all) decision circuit. Units U1 and U2 from Equations (17) and (18) produce the Ou and Od signals, which, together with a tonic activity Oe, feed the WTA circuit. This makes it possible to determine which neuron (AM or AN) has an advantage in decision-making.
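
The comparison of Equations (17) and (18) can be sketched as follows. The values of A, D, and the time constant are placeholders assumed for illustration. The drive is returned as zero whenever the difference is not positive; this matches the max(0, ·) numerator and also sidesteps the 0/0 case that the written formula would produce when the difference equals −D, which is an implementation choice of this sketch.

```python
def comparison_drive(x, y, A=100.0, D=10.0):
    """Steady-state target of Equations (17)-(18) for inputs x and y."""
    diff = x - y
    if diff <= 0:
        return 0.0  # numerator max(0, A*diff) is zero
    return A * diff / (D + diff)


def compare(AM, AN, A=100.0, D=10.0, tau_eps=0.05, dt=0.001, steps=1000):
    """Forward-Euler integration of U1 and U2 for constant AM and AN.

    Default parameter values are illustrative assumptions.
    """
    U1 = U2 = 0.0
    for _ in range(steps):
        U1 += (dt / tau_eps) * (-U1 + comparison_drive(AM, AN, A, D))
        U2 += (dt / tau_eps) * (-U2 + comparison_drive(AN, AM, A, D))
    return U1, U2


if __name__ == "__main__":
    U1, U2 = compare(AM=60.0, AN=20.0)
    print(f"U1 = {U1:.2f}, U2 = {U2:.2f}")
```

Only the unit on the side of the larger accumulation neuron becomes active, so the WTA circuit receives a one-sided input even when AM and AN are both large.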

Figure 4. Architecture of the bio-inspired network for the exploration behavior. (A) Architecture of the bio-inspired network for the exploration behavior. (B) comparison circuit.

These circuits are based on a circuit present in the orbito-frontal cortex in primates. Without them, the automaton, like its biological counterpart, would not be able to make decisions; in this case, it would not be able to decide between the two processing channels (AM or AN).

2.2.4. Competitive neural networks. Decision circuit

Winner-take-all (WTA) decision circuits are neural networks that identify and choose the strongest input stimulus presented to the neural network. The decision-making mechanisms presented are inspired by those found in the Central Nervous System in primates.

In mammals and complex animals, we observe behaviors that come from decisions made in the face of different options, decisions that bring short- or long-term gains or benefits. For example, searching for food in the presence of predators, fleeing or fighting in threatening situations, or mating competition. The same applies to a mobile automaton that makes decisions based on stimuli taken from the environment, for instance, the slope of a terrain, its humidity, radioactivity, and viscosity, among others (Guerrero-Criollo et al., 2022).

Ramirez-Moreno and Hurtado-Lopez (2014) proposed a basic neural network that chooses an option between two alternatives presented. In this work, that network was modified in order to make a selection among three alternatives, see Figure 5.

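$$\frac{dO_1}{dt} = \frac{1}{\tau}\left(-O_1 + \Psi(\Omega A,\ U_1 - a(O_2 + O_3),\ B)\right) \tag{19}$$
$$\frac{dO_2}{dt} = \frac{1}{\tau}\left(-O_2 + \Psi(A,\ U_2 - a(O_1 + O_3),\ B)\right) \tag{20}$$
$$\frac{dO_3}{dt} = \frac{1}{\tau}\left(-O_3 + \Psi(A,\ O_e - a(O_1 + O_2),\ B)\right) \tag{21}$$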

Our aim is to make units O1, O2, and O3 compete and reach a decision among leftward, rightward, and forward movements, see Figure 5. The signals O1, O2, and O3 are passed individually to the nonlinear oscillation generator circuits and in this way generate the motor control of the mobile automaton.
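
The three-way competition of Equations (19)–(21) can be sketched with forward Euler integration as below. The parameter values (including the inhibition weight a = 3 and Ω = 1) are placeholders assumed for illustration, not the values of Table 1; with them, the unit that receives the strongest input suppresses the other two.

```python
def naka_rushton(M, P, sigma):
    """Naka-Rushton activation of Equation (2)."""
    if P < 0:
        return 0.0
    return M * P**2 / (sigma**2 + P**2)


def winner_take_all(U1, U2, Oe, A=100.0, B=30.0, a=3.0, Omega=1.0,
                    tau=0.02, dt=0.001, steps=500):
    """Forward-Euler integration of Equations (19)-(21) for constant inputs.

    Each unit is excited by its own input and inhibited by the summed
    activity of the other two. Default values are illustrative assumptions.
    """
    O1 = O2 = O3 = 0.0
    for _ in range(steps):
        dO1 = (-O1 + naka_rushton(Omega * A, U1 - a * (O2 + O3), B)) / tau
        dO2 = (-O2 + naka_rushton(A, U2 - a * (O1 + O3), B)) / tau
        dO3 = (-O3 + naka_rushton(A, Oe - a * (O1 + O2), B)) / tau
        O1, O2, O3 = O1 + dt * dO1, O2 + dt * dO2, O3 + dt * dO3
    return O1, O2, O3


if __name__ == "__main__":
    O1, O2, O3 = winner_take_all(U1=80.0, U2=40.0, Oe=20.0)
    print(f"O1 = {O1:.2f}, O2 = {O2:.4f}, O3 = {O3:.4f}")
```

In this sketch the tonic input Oe wins only when both comparison outputs are weak, which is consistent with the forward motion being the default choice when neither side is favored.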

Figure 5. Competitive neural network. Class winner-take-all (WTA).

2.2.5. Adaptation stage

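Memory linear chains with delay units (Δt) feed the accumulation neurons AP and AQ, respectively. Equations (22)–(24) give the chain driven by O1, which feeds AP; a second chain of the same form processes O2 and feeds AQ (Figure 6).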

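$$\frac{dP_1}{dt} = \frac{1}{\tau}\left(-P_1 + \Psi(A,\ \alpha O_1,\ C)\right) \tag{22}$$
$$\frac{dP_i}{dt} = \frac{1}{\tau}\left(-P_i + \Psi(A,\ P_{(i-1)},\ C)\right),\quad i = 2, 3, \ldots, 5 \tag{23}$$
$$\frac{dA_P}{dt} = \frac{1}{\tau}\left(-A_P + \Psi\left(A,\ \sum_{i=1}^{5} P_i,\ C\right)\right) \tag{24}$$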

Our aim is to capture the tendency of the behaviors over a period of time and thus adapt the parameters of the nonlinear oscillation generator circuits; see Figure 6.

Figure 6. Memory linear chain for the adaptation stage. Two circuits similar to that shown above, process the propagation of O1 and O2 in the meta-decision circuit (Figure 5). The five processing units shown correspond to the units named in Equations (22)–(24).

Similar to the Memory linear chain (Section 2.2.2), this circuit oversees retaining the previously processed information in a time interval, and without it, the automaton would not be able to adapt its parameters in order to prioritize one of its three behaviors.

2.2.6. Meta-control circuit

The CNS has neuromodulators in charge of transforming or changing the result of a primary operation under a pop-up novelty. In this work for the processing of this novelty, the modulation neural network described in Ramirez-Moreno and Sejnowski (2012) was implemented. Unit G5 in Figure 7 projects an inhibition signal to the motor behaviors. The modulation of these behaviors allows a better performance that mimics the reactions of complex animals in situations where a novelty is prioritized before a previously weighted decision.

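$$\frac{dX_1}{dt} = \frac{1}{\tau_4}\left(-X_1 + \Psi(A,\ \omega S_1 + aX_3,\ B + cX_2)\right) \tag{25}$$
$$\frac{dX_2}{dt} = \frac{1}{\tau_5}\left(-X_2 + bX_1\right) \tag{26}$$
$$\frac{dX_3}{dt} = \frac{1}{\tau_4}\left(-X_3 + \Psi(A,\ aX_1,\ B + cX_4)\right) \tag{27}$$
$$\frac{dX_4}{dt} = \frac{1}{\tau_5}\left(-X_4 + bX_3\right) \tag{28}$$
$$\frac{dG_1}{dt} = \frac{1}{\tau_6}\left(-G_1 + \Psi(A_{cn},\ X_1 - \phi G_4 + G_6,\ N_{cn})\right) \tag{29}$$
$$\frac{dG_2}{dt} = \frac{1}{\tau_6}\left(-G_2 + \Psi(A_{cn},\ S_2 - \theta G_3 + G_7,\ N_{cn})\right) \tag{30}$$
$$\frac{dG_3}{dt} = \frac{1}{\tau_6}\left(-G_3 + \Psi(B_{cn},\ G_1,\ M_{cn})\right) \tag{31}$$
$$\frac{dG_4}{dt} = \frac{1}{\tau_6}\left(-G_4 + \Psi(B_{cn},\ G_2,\ M_{cn})\right) \tag{32}$$
$$\frac{dG_5}{dt} = \frac{1}{\tau_6}\left(-G_5 + \frac{A_{cn}}{1 + e^{-\Theta(gG_1 - hG_2)}}\right) \tag{33}$$
$$\frac{dG_6}{dt} = \frac{1}{\tau_6}\left(-G_6 + \Psi(A_{cn},\ G_1,\ N_{cn})\right) \tag{34}$$
$$\frac{dG_7}{dt} = \frac{1}{\tau_6}\left(-G_7 + \Psi(A_{cn},\ G_2,\ N_{cn})\right) \tag{35}$$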

To feed the meta-control circuit, first, the information of the novelty S1 is extended for a longer time by entering a short-term memory circuit. By forced complementarity, the signal S2 is obtained. This signal together with X1 enters the control network that modulates the appropriate behaviors in the nonlinear oscillation-generating circuits, whose outputs are shown in Figure 13.

Figure 7. Meta-control circuit.

These circuits take inspiration from studies of the amygdala and cingulate cortex via the Pontine Tegmental Nucleus (PTN) (Ramirez-Moreno and Sejnowski, 2012). Without them, the automaton, like its biological counterpart, would not be able to differentiate between extreme risk and security.
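
To make the role of the output unit G5 concrete, the sketch below evaluates the steady state of Equation (33), that is, the value G5 relaxes to when G1 and G2 are held constant. The values of Acn, Θ, g, and h are placeholders assumed for the example, and the function name is hypothetical.

```python
import math


def g5_steady_state(G1, G2, A_cn=1.0, theta=0.1, g=1.0, h=1.0):
    """Steady state of Equation (33): a logistic function of g*G1 - h*G2.

    Default parameter values are illustrative assumptions.
    """
    return A_cn / (1.0 + math.exp(-theta * (g * G1 - h * G2)))


if __name__ == "__main__":
    for G1, G2 in [(10.0, 80.0), (50.0, 50.0), (80.0, 10.0)]:
        print(f"G1={G1:5.1f}  G2={G2:5.1f}  ->  G5*={g5_steady_state(G1, G2):.4f}")
```

G5 therefore rises toward Acn when the novelty pathway (G1, driven by X1) dominates and falls toward zero when the complementary pathway (G2, driven by S2) dominates, so the inhibition sent to the motor circuits is graded rather than all-or-none.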

2.2.7. Non-linear oscillation circuits. Lateral inhibition

In CNS, the non-linear oscillation circuits are responsible for generating repetitive oscillating signals, presented in the execution of motor actions such as breathing, eating, and swimming, among others. In this work, the magnitude of these oscillations will be the speed level applied to each actuator of the differential robot.

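$$\frac{dL_n}{dt} = \frac{1}{\tau_7}\left(-L_n + \Psi(A - \rho G,\ \lambda,\ B + L_{(n+2)})\right) \tag{36}$$
$$\frac{dL_i}{dt} = \frac{1}{\tau_8}\left(-L_i + \beta L_{(i-2)}\right) \tag{37}$$
$$\lambda = K_j f(O_1 + A_P - A_Q) - dL_{(2-(n-1))} + (-1)^{n+1}\psi A_P \tag{38}$$
$$n \in \{1, 2\},\quad i \in \{3, 4\},\quad j \in \{1, 2\} \tag{39}$$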

In Figure 8A, the O1 projection gives the order to generate the oscillations that perform a left-turn behavior. This projection feeds a parameter adaptation stage; note the order of the connections coming from AP and AQ, units that carry the result of the adaptation stage for left and right turning. The left-turn adaptation (AP) makes an excitatory connection, in contrast to the inhibitory connection of the right-turn adaptation (AQ). Our aim is to prolong this behavior over time. The projections of this stage continue to the nonlinear oscillation generator circuit. In this circuit, a left-turn adaptation unit is added again with the intention of increasing the difference between the widths of the oscillations and generating a torque that changes the orientation of the robot. The mathematical representation is given in Equations (36)–(39).

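$$\frac{dR_n}{dt} = \frac{1}{\tau_7}\left(-R_n + \Psi(A - \rho G,\ \lambda,\ B + R_{(n+2)})\right) \tag{40}$$
$$\frac{dR_i}{dt} = \frac{1}{\tau_8}\left(-R_i + \beta R_{(i-2)}\right) \tag{41}$$
$$\lambda = K_j f(O_2 + A_Q - A_P) - dR_{(2-(n-1))} + (-1)^{n}\psi A_Q \tag{42}$$
$$n \in \{1, 2\},\quad i \in \{3, 4\},\quad j \in \{2, 1\} \tag{43}$$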

For the generation of the right-turn oscillations (Figure 8B), the same structure and principle are used. However, care must again be taken with the connections of the adaptation units. In this case, the right-turn adaptation (AQ) has excitatory connections, and the left-turn adaptation (AP) has inhibitory connections. Likewise, the right-turn adaptation unit is added to the oscillation generator circuit to generate the difference in the width of the oscillations, in this case with the opposite sign to that of the left turn, so that the robot rotates in the opposite direction. The mathematical representation is given in Equations (40)–(43).

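$$\frac{dF_n}{dt} = \frac{1}{\tau_7}\left(-F_n + \Psi(A - \rho G,\ K_{(2j+1)}\lambda,\ B + F_{(n+2)})\right) \tag{44}$$
$$\frac{dF_i}{dt} = \frac{1}{\tau_8}\left(-F_i + \beta F_{(i-2)}\right) \tag{45}$$
$$\lambda = f(O_3 - A_P - A_Q) - dF_{(2-(n-1))} \tag{46}$$
$$n \in \{1, 2\},\quad i \in \{3, 4\},\quad j \in \{0, 1\} \tag{47}$$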

Finally, for the generation of oscillations corresponding to the forward motion (Figure 8C), both adaptations (AP and AQ) project inhibitory connections, taking into account that in the established design the forward motion is expected to be less predominant. The mathematical representation is given in Equations (44)–(47).
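
The left-turn circuit of Equations (36)–(39) can be sketched as below; the right-turn and forward circuits follow the same pattern with their own drive λ. Several readings are assumptions of this sketch and are not confirmed by the text: f is treated as a scalar gain, K_j is indexed with j = n, G is taken as a constant modulation level (for instance the G5 output of the meta-control circuit), and all numerical values are placeholders rather than the values of Table 1. Whether sustained antagonistic oscillations appear depends on those values.

```python
def naka_rushton(M, P, sigma):
    """Naka-Rushton activation of Equation (2)."""
    if P < 0:
        return 0.0
    return M * P**2 / (sigma**2 + P**2)


def left_turn_circuit(O1, AP, AQ, G=0.0, A=100.0, B=30.0, rho=0.5, beta=1.0,
                      d=1.0, f=1.0, psi=0.5, K=(1.0, 1.0),
                      tau7=0.02, tau8=0.2, dt=0.001, steps=2000):
    """Forward-Euler integration of Equations (36)-(39) for constant inputs.

    L[0], L[1] are the main units L1, L2; L[2], L[3] are the adapting
    units L3, L4. Defaults are illustrative assumptions; A - rho*G must
    stay non-negative for the outputs to remain valid firing rates.
    """
    L = [1.0, 0.0, 0.0, 0.0]  # small initial asymmetry between L1 and L2
    trace = []
    for _ in range(steps):
        dL = [0.0] * 4
        for n in (1, 2):
            other = L[2 - n]  # L_(2-(n-1)): L2 when n=1, L1 when n=2
            lam = (K[n - 1] * f * (O1 + AP - AQ) - d * other
                   + (-1) ** (n + 1) * psi * AP)
            sigma = B + L[n + 1]  # L_(n+2): the unit's own adapting partner
            dL[n - 1] = (-L[n - 1] + naka_rushton(A - rho * G, lam, sigma)) / tau7
        for i in (3, 4):
            dL[i - 1] = (-L[i - 1] + beta * L[i - 3]) / tau8  # driven by L_(i-2)
        L = [x + dt * dx for x, dx in zip(L, dL)]
        trace.append((L[0], L[1]))
    return trace


if __name__ == "__main__":
    trace = left_turn_circuit(O1=80.0, AP=10.0, AQ=0.0)
    print("final (L1, L2):", tuple(round(v, 2) for v in trace[-1]))
```

The term (−1)^(n+1) ψ AP adds to the drive of L1 and subtracts from the drive of L2, which is how the adaptation unit widens the difference between the two outputs.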

Figure 8. Non-linear oscillation generator circuit. (A) Non-linear oscillation circuit for turning left. (B) Non-linear oscillation generator circuit for turning right. (C) Non-linear oscillation generator circuit for forward motion.

2.3. Software configuration

For the virtual implementation, we used the Burger robot from the TurtleBot3 open-source libraries (Open Source Robotics Foundation, 2020). The simulation was performed in the Gazebo simulator (Foundation, 2014). The middleware used was ROS (Robotics, 2021).

2.4. Hardware configuration

A TurtleBot3 Burger platform was used as the mobile automaton. This robot is configured with a 360-degree LDS-01 LiDAR sensor, a Raspberry Pi 3 Model B board for processing, and an OpenCR board for hardware control. The wheel actuators are Dynamixel XL430-W250 motors. The whole system is powered by a 3-cell LiPo battery of 11.1 V and 2.2 Ah. The robot dimensions are shown in Figure 9.

Figure 9. Turtlebot3 Burger model dimensions. Taken from Robotis (2022).

The robotic platform was configured with ROS Kinetic middleware installed on a Raspbian Buster operating system. The processing of the bio-inspired exploration system was tested both on the embedded board and on an external computing unit, the latter configured with ROS Noetic, Ubuntu 20.04, an 8th-generation Intel Core i7 processor, and 16 GB of RAM. The communication between the embedded board and the external computing unit was done via WiFi.

2.5. Signal processing

Considering that the objective of the terrestrial navigation platform is to perform an obstacle avoidance exploration behavior, the information captured by the LiDAR sensor was used to generate the input signals to the bio-inspired network. This information was processed as shown in Figure 10. Only the frontal information provided by the sensor was considered, and it was divided into two areas, Ar (0°–90°) and Al (90°–180°). A safety area of 0.5 m radius was defined: points within the angular range of the Ar area are classified into points inside the safety area (Pri) and points outside the safety area (Pro), and likewise for the Pli and Plo points of Al. Points inside the safety area are penalized with a value of −1, while points outside the safety area are assigned a value of +1. The values assigned to the areas Al and Ar are thus defined as shown in Equations (48) and (49). This processing is intended to determine in which direction (right or left) obstacles are closer to the robot, so that the robot heads toward the clearest area.

Figure 10. Signal processing. Take l for left and r for right. Take i for inside and o for outside.

As mentioned in Section 2.2.6, the aim of incorporating a basal ganglia-inspired meta-control network is to mediate the decisions made by the main network. It is proposed that the meta-control network will act on decisions where the robot's environment changes dramatically, for instance, when there is the presence of dynamic objects. To detect this, it is proposed to keep a record of the result obtained from the areas at instant t − 1 and compare it with those obtained at instant t. If a difference greater than a threshold ϵ exists, a value of 100 will be given to the signal S1 of the meta-control network in Equation (50) (Section 2.2.6).

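$$A_r = \sum_{k=0°}^{90°} \left(Pro_k + Pri_k\right);\quad Pro_k = +1,\quad Pri_k = -1 \tag{48}$$
$$A_l = \sum_{k=90°}^{180°} \left(Plo_k + Pli_k\right);\quad Pli_k = -1,\quad Plo_k = +1 \tag{49}$$
$$S_1 = \begin{cases} 100, & \dfrac{\left|A_{r_{t-1}} - A_{r_t}\right|}{A_{r_{t-1}}} > \epsilon \ \text{ and } \ \dfrac{\left|A_{l_{t-1}} - A_{l_t}\right|}{A_{l_{t-1}}} > \epsilon \\ 0, & \text{otherwise} \end{cases} \tag{50}$$
$$S_2 = \gamma - S_1 \tag{51}$$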
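
The signal processing of Equations (48)–(51) can be sketched as below for a scan given as a list of 360 range readings indexed by degree. Several details are assumptions of this sketch: the half-open index ranges 0–89 and 90–179, the treatment of a reading exactly on the 0.5 m boundary as inside the safety area, the handling of a zero denominator in Equation (50) as "no change", and the placeholder values of ϵ and γ. Invalid LiDAR readings are not filtered here.

```python
def area_score(ranges, start_deg, end_deg, safety_radius=0.5):
    """Equations (48)-(49): +1 per point outside the safety area, -1 per point inside."""
    return sum(1 if ranges[k] > safety_radius else -1
               for k in range(start_deg, end_deg))


def novelty_signal(prev, cur, eps=0.5, high=100.0):
    """Equation (50): S1 is `high` only if both areas changed by more than eps.

    `prev` and `cur` are (Ar, Al) pairs at instants t-1 and t.
    """
    def rel_change(p, c):
        if p == 0:
            return 0.0  # assumption: Equation (50) is undefined for a zero denominator
        return abs(p - c) / p

    (ar_prev, al_prev), (ar, al) = prev, cur
    changed = rel_change(ar_prev, ar) > eps and rel_change(al_prev, al) > eps
    return high if changed else 0.0


def forced_complement(S1, gamma=100.0):
    """Equation (51): S2 decreases as S1 increases."""
    return gamma - S1


if __name__ == "__main__":
    free = [1.0] * 360                       # nothing within the safety radius
    blocked = [0.3] * 30 + [1.0] * 330       # obstacle over the first 30 degrees
    prev = (area_score(free, 0, 90), area_score(free, 90, 180))
    cur = (area_score(blocked, 0, 90), area_score(blocked, 90, 180))
    S1 = novelty_signal(prev, cur)
    print("Ar, Al before:", prev, " after:", cur, " S1:", S1,
          " S2:", forced_complement(S1))
```

In the usage example only the Ar area changes, so S1 stays at 0: Equation (50) requires a large relative change in both areas before the meta-control network is triggered.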

3. Results

In this section, the results obtained from both the simulation part and its implementation in the TurtleBot3 Burger robot are presented. The performance of the automaton in the exploration task and in the obstacle avoidance task was measured.

3.1. Simulation

3.1.1. Exploration task

To evaluate the performance of the exploration behavior, as well as its obstacle avoidance task, controlled by the bio-inspired neural network, the exploration environments proposed in Yan et al. (2015) were adapted and simulated in Gazebo. In this work, the environments have a maximum exploration area of 4 m². The maze walls are rigid and fully reflective surfaces, and the corridor width is at least 3 times the outside diameter of the robot. These mazes are called loop, cross, zigzag, and traditional maze. In Figures 11A–D, the navigation in an established way is evaluated. In Figures 11E–H, the environments are simulated until a collision or a deadlock situation takes place. The simulation results for these environments are presented in Figure 11.

Figure 11. Simulation environment results. Figures on the left side show the Gazebo simulation environment without the meta-control circuit. The right side images show the trajectory made by the robot in the exploration behavior with the meta-control circuit. (A, B) Loop. (C, D) Zigzag. (E, F) Cross. (G, H) Traditional maze. In (B, D, F, H), the meta-control network yields better performance, achieving a longer trajectory in (F, H).

Figure 11 illustrates the performance of the automaton without the meta-control network (left column, Figures 11A, C, E, G) and with the meta-control network (right column, Figures 11B, D, F, H). The meta-control network modulates the rightward and leftward behaviors seen in the left column, allowing better performance in the right column along the same path.

In the results shown in Figure 11, the mazes have a total area of 4.0 × 4.0 m, with walls 1.0 m high and corridors 0.50 m wide. The environments in Figures 11A–D have 0.15 m wide walls, and the environments in Figures 11E–H have 0.05 m wide walls. The LiDAR sensor has a 360° field of view with a reading range of 0.12–3.50 m. With the safety area defined on the robot (Section 2.5), this field of view is reduced to 180° and the range to 0.12–0.50 m. In environments such as the cross or the traditional maze, wider corridors would cause the automaton to decide late between its three behaviors at intersections, owing to the change in its current view, and to collide with the outside corners while taking a wide-open curve. The average speed of 0.04 m/s at which the automaton travels does not favor such decision-making. The opposite is true for the loop and zigzag environments, where the automaton only decides between one of its behaviors.
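The reduction of the LiDAR readings to the safety area can be sketched as a simple filter. This is a minimal illustration, not the processing defined in Section 2.5: it assumes one reading per degree, with index 0 pointing straight ahead and angles increasing counter-clockwise, and it assumes that the 180° sector is centered on the forward direction. The function name `safety_zone_points` is hypothetical.

```python
def safety_zone_points(ranges, fov_deg=180.0, r_min=0.12, r_max=0.50):
    """Keep the readings that fall inside the frontal safety area.

    ranges: list of distances in meters, evenly spread over 360 degrees,
            index 0 = straight ahead (assumed convention).
    Returns a list of (angle_deg, range_m) with angles in (-180, 180].
    """
    n = len(ranges)
    half_fov = fov_deg / 2.0
    kept = []
    for i, r in enumerate(ranges):
        angle = i * 360.0 / n
        if angle > 180.0:
            angle -= 360.0  # map to (-180, 180] so the sector is symmetric
        # Comparisons with inf or NaN readings are False, so they are dropped.
        if abs(angle) <= half_fov and r_min <= r <= r_max:
            kept.append((angle, r))
    return kept


scan = [1.0] * 360   # everything far away by default
scan[0] = 0.30       # obstacle straight ahead
scan[45] = 0.40      # obstacle to the front-left
scan[180] = 0.20     # obstacle behind the robot (outside the sector)
scan[315] = 0.05     # closer than the minimum valid range
print(safety_zone_points(scan))  # [(0.0, 0.3), (45.0, 0.4)]
```

Only the readings that are both inside the frontal sector and within 0.12–0.50 m survive, which is why far walls in wide corridors do not influence the decision until the robot is close to them.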

3.2. Implementation

The bio-inspired neural network with neuromodulation designed in this work was deployed on the TurtleBot3 Burger automaton in order to measure its performance.

3.2.1. Exploration task

To evaluate the performance of the automaton in a real environment, hand-made mazes were built, as shown in Figure 12. Each environment has an area of at least 1.0 m² and at most 2.0 m², except for the simple maze, which was built freely.

Figure 12. Implementation environment results. The path followed by the automaton is drawn with red lines in each type of environment. Green circles are initial positions and blue circles are final positions. (A) Loop. (B) Cross. (C, D) Traditional maze part 1 and part 2, respectively. (E) Zigzag. (F) Simple maze. The automaton completed the (A, E, F) environments successfully. In the (B) environment, the automaton started in the middle of the cross and ended up circling around the environment. In the (C, D) environments, the automaton's trajectory finishes at its starting point.

Figure 13 shows the signals obtained in the physical implementation of a zigzag environment. Figure 13A illustrates the information obtained from the real environment and its processing over time. The top left image shows the processing of the LiDAR points inside a corridor of the zigzag maze: blue points lie inside the safety zone and red points lie outside it. The S1, Ar, and Al curves are the signals described in Section 2.5; they were sampled at intervals of 360 ms. Notice that S1 fires when the values of Ar and Al change sufficiently from one instant to the next. For instance, near sample 99 the Ar signal changes from 90 to 50 and Al from 30 to 0, so S1 jumps from 0 to 100. The automaton's trajectory seen in Figure 12E is a result of processing the Ar and Al signals. The largest values of Ar and Al appear when the robot executes turns. The projection of the meta-control network is shown in Figure 13B. Approaching sample 230 of the S1, Ar, and Al signals, there were more obstacles inside the left area, so the robot had to turn to the right. Figure 13C shows the wheels' motor action at this time; this signal was sampled at intervals of 1.0 ms. The blue signal corresponds to the left wheel and the orange to the right. The blue oscillations are wider than the orange ones, so the left wheel spins more than the right wheel and the right turn is made. When S1 fires, the modulation is applied to the wheels' motor action, which reduces the amplitude of the oscillating signals. This reduces the automaton's velocity, which gives it time to make a better decision.
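The link between wheel speeds and turning direction follows from standard differential-drive kinematics, sketched below. This is an illustrative example and not the motor controller of this work: it treats each wheel as having a single mean angular speed rather than an oscillating signal, and the wheel radius and wheel separation are assumed nominal TurtleBot3 Burger values. The function names are hypothetical.

```python
WHEEL_RADIUS_M = 0.033      # assumed nominal wheel radius
WHEEL_SEPARATION_M = 0.160  # assumed nominal distance between the wheels


def body_velocity(omega_left, omega_right,
                  radius=WHEEL_RADIUS_M, separation=WHEEL_SEPARATION_M):
    """Map wheel angular speeds (rad/s) to forward speed (m/s) and yaw rate (rad/s)."""
    v = radius * (omega_right + omega_left) / 2.0
    # Positive yaw rate is counter-clockwise, i.e., a left turn.
    w = radius * (omega_right - omega_left) / separation
    return v, w


def turn_direction(omega_left, omega_right):
    """Name the turn implied by a pair of wheel speeds."""
    _, w = body_velocity(omega_left, omega_right)
    if w < 0:
        return "right"
    if w > 0:
        return "left"
    return "straight"


# The left wheel spins faster than the right wheel, so the robot turns right.
print(turn_direction(1.5, 1.0))  # right
```

Scaling both wheel speeds by the same factor, as a reduction in oscillation amplitude would do in this simplified picture, lowers the forward speed without changing the turn direction.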

Figure 13. Implementation of the zigzag environment. (A) Real signals obtained from the environment and their processing over time. The top left image shows the processing of the LiDAR points inside a corridor of the zigzag environment. Blue dots are the points inside the safe area and red dots are the points outside it. The S1, Ar, and Al curves are the input signals for the bio-inspired network (Section 2.5); they were sampled at intervals of 360 ms. The automaton's trajectory seen in Figure 12E is a result of processing the Ar and Al signals. The largest values of Ar and Al appear when the robot executes turns. (B) Meta-control circuit projection G. (C) Motor control signals of the mobile automaton. This signal was sampled at intervals of 1.0 ms. Blue and orange signals correspond to the left wheel and the right wheel, respectively.

3.2.2. Meta-control circuit test

The performance of the neuromodulation network was tested by placing an obstacle (a box) in the automaton's field of view. The automaton automatically avoids the obstacle and continues exploring (see Supplementary Videos 6, 8).

3.3. Metrics

To quantify the performance of the exploration in the different environments, the following metrics are proposed:

• Covered distance (Td): Distance covered by the robot, in meters.

• Elapsed time (Tt): Time spent, in seconds.

• Average speed reached (Tv): Quotient of Td and Tt.

• Exploration area (Ea): Percentage of the total environment area covered by the robot.

The metric values obtained for each simulated environment are presented in Table 2A.
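The four metrics can be computed from a logged trajectory, as in the minimal Python sketch below. The input format (timestamped poses) and the way coverage is measured (the fraction of free grid cells that contain at least one pose) are assumptions made for this example; the text does not specify how Ea was obtained. The function name `exploration_metrics` is hypothetical.

```python
import math


def exploration_metrics(poses, cell_size, free_cells):
    """Compute Td, Tt, Tv, and Ea from a logged trajectory.

    poses:      list of (t, x, y) samples, t in seconds, x and y in meters.
    cell_size:  side of a square grid cell in meters (assumed discretization).
    free_cells: set of (i, j) grid indices that the robot could reach.
    """
    # Td: length of the polyline through consecutive positions.
    td = sum(math.hypot(x1 - x0, y1 - y0)
             for (_, x0, y0), (_, x1, y1) in zip(poses, poses[1:]))
    # Tt: time between the first and the last sample.
    tt = poses[-1][0] - poses[0][0]
    # Tv: quotient of Td and Tt.
    tv = td / tt if tt > 0 else 0.0
    # Ea: percentage of free cells visited at least once.
    visited = {(math.floor(x / cell_size), math.floor(y / cell_size))
               for _, x, y in poses}
    ea = 100.0 * len(visited & free_cells) / len(free_cells)
    return {"Td": td, "Tt": tt, "Tv": tv, "Ea": ea}


poses = [(0.0, 0.25, 0.25), (10.0, 0.75, 0.25), (20.0, 0.75, 0.75)]
free = {(0, 0), (1, 0), (1, 1), (0, 1)}
print(exploration_metrics(poses, cell_size=0.5, free_cells=free))
```

In this toy run the robot covers 1.0 m in 20 s and visits three of the four free cells, so Ea is 75%.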

Table 2. Metric values for simulation environments in Figure 11.

To compare the automaton's trajectory quantitatively with the optimum trajectory, we added point-to-point metrics. The automaton's trajectory was evaluated against the optimum trajectory, defined as the path that stays in the middle of the corridors. In this comparison, the RMSE, mean error, standard deviation of the error, minimum error, and maximum error were computed for each axis. The error metrics for each environment in Figure 11 are shown in Tables 2B, C.
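A minimal Python sketch of these per-axis error metrics is given below. It assumes that the automaton's trajectory and the optimum trajectory have already been paired point by point, that the error is the absolute difference along one axis, and that the standard deviation is the population form; none of these choices is stated in the text. The function name `axis_error_metrics` is hypothetical.

```python
import math
import statistics


def axis_error_metrics(actual, reference):
    """Point-to-point error statistics along one axis (x or y).

    actual, reference: equal-length sequences of coordinates in meters.
    """
    if len(actual) != len(reference):
        raise ValueError("trajectories must be paired point by point")
    errors = [abs(a - r) for a, r in zip(actual, reference)]
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / len(errors)),
        "mean": statistics.fmean(errors),
        "std": statistics.pstdev(errors),  # population standard deviation
        "min": min(errors),
        "max": max(errors),
    }


# Toy x-coordinates: the robot drifts 0.5 m from the corridor center twice.
x_robot = [0.0, 1.0, 2.0, 3.0]
x_center = [0.0, 1.5, 2.0, 2.5]
print(axis_error_metrics(x_robot, x_center))
```

The same function would be called once with the x-coordinates and once with the y-coordinates to obtain one set of values per axis.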

4. Discussion and conclusion

The framework proposed in this work faces strong difficulties when navigating much more complex mazes (see Figures 11B, D). The automaton performs very well in environments like those seen in Figures 11A, C. That difficulty is linked to the analysis of the environment information. Restricting the analysis to a specific area delayed decision-making when an object appeared suddenly in front of the robot in open environments. In the short term, this problem could be solved by increasing the safety area; nevertheless, this could affect the performance in reduced-space environments such as those shown in Figures 11A, C. The sensed area could be penalized with negatives. For this reason, future work will address the development and implementation of a bio-inspired strategy that dynamically adjusts the robot's safety area depending on the environment (wide or narrow areas).

The discussion above is also supported by the information in Table 2A, which shows the good performance of the adapted cortical synaptic circuits (Section 2.2) in exploring unstructured environments in their entirety (Ea). In addition, the error metrics in Tables 2B, C show the performance of the automaton with and without the meta-control network. The results show that performance is better with this network. On average, the TurtleBot3 Burger's navigation speed was approximately 0.04 m/s. Compared with Miguel-Blanco and Manoonpong (2020), our exploration system is slow, similar to the one developed by Pardo-Cabrera et al. (2022).

A first approximation of the motor control of a mobile automaton was proposed in Guerrero-Criollo et al. (2022). In that work, the input signals were simulated rather than captured by a robust system, and the meta-control network, which is responsible for detecting novelties, was absent. In this work, we implement both the sensing part of the system, which measures environmental data for the inputs, and the meta-control network. The bio-inspired network was implemented on the TurtleBot3 Burger embedded system. The design, simulation, and implementation of a bio-inspired neural network allow a differential robot to perform a safe exploration. An exploration task is defined as the behavior of traversing a terrain indefinitely while avoiding obstacles. Here, a framework is proposed to extract information from a LiDAR sensor and generate the input signals to the neural network online. Additionally, a modulatory or meta-control network inspired by the basal ganglia is implemented. This network modulates the exploration behavior of the robot by progressively reducing its speed when drastic changes occur in the robot's environment within a safety area of 0.5 m radius (see Figure 13D). As the robot advances through the mazes, this network detects novelties with greater priority, enabling it to avoid obstacles much more effectively. This was done to make the bio-inspired exploration system more robust against dynamic objects and to reduce the reactivity of decision-making, thus improving the autonomy of the navigation system.

Frequently, autonomous navigation tasks require either that an operator first walks through the environment in which the robot will operate while carrying a registration system, or that the operator teleoperates the robot while the area is being registered. Both situations present difficulties, the most obvious of which is that the robot's workflow depends on an operator. Some scenarios, such as environmental disaster zones or mines, can put the operator's safety at risk. In addition, poor connectivity in these areas can make remote operation of the robot problematic. For these reasons, we consider that the proposed work can have a significant impact on exploration systems and on the identification of unknown environments for ground platforms.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding authors.

Author contributions

RG-C, JC-L, and DR-M contributed to the writing of the manuscript. RG-C and JC-L proposed the architecture of the bio-inspired neural network, ran the simulations, and obtained the results. JH-L and DR-M proposed the neuromodulation network. All authors contributed to the article and approved the submitted version.

Acknowledgments

The authors are grateful to the Universidad Autónoma de Occidente, the Motor Neurocontrol Research Group, and the Robotics and Autonomous Systems research group.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnbot.2023.1078074/full#supplementary-material

References

Cao, Z., Cheng, L., Zhou, C., Gu, N., Wang, X., and Tan, M. (2015). Spiking neural network-based target tracking control for autonomous mobile robots. Neural Comput. Appl. 26, 1839–1847. doi: 10.1007/s00521-015-1848-5

Foundation, O. S. R. (2014). Gazebo. Available online at: http://gazebosim.org/

Guerrero-Criollo, R. J., Castaño-López, J. A., Díaz-Cuchala, R. E., David Rozo-Giraldo, Y., and Ramirez-Moreno, D. F. (2022). “Design and simulation of a bio-inspired neural network for the motor control of a mobile automaton,” in 2022 IEEE Colombian Conference on Applications of Computational Intelligence (ColCACI) (Cali: IEEE), 1–6.

Héricé, C., Khalil, R., Moftah, M., Boraud, T., Guthrie, M., and Garenne, A. (2016). Decision making under uncertainty in a spiking neural network model of the basal ganglia. J. Integr. Neurosci. 15, 515–53. doi: 10.1142/S021963521650028X

Hikosaka, O., Ghazizadeh, A., Griggs, W., and Amita, H. (2018). Parallel basal ganglia circuits for decision making. J. Neural Transm. 125, 515–529. doi: 10.1007/s00702-017-1691-1

Hurtado-López, J., and Ramirez-Moreno, D. F. (2019). Dynamics of a neural circuit that mediates social and nonsocial behaviors. Int. J. Bifurcat. Chaos 29, 1950138. doi: 10.1142/S0218127419501384

Hurtado-López, J., Ramirez-Moreno, D. F., and Sejnowski, T. J. (2017). Decision-making neural circuits mediating social behaviors. J. Comput. Neurosci. 43, 127–142. doi: 10.1007/s10827-017-0654-8

Liu, J., Hua, Y., Yang, R., Luo, Y., Lu, H., Wang, Y., et al. (2022). Bio-inspired autonomous learning algorithm with application to mobile robot obstacle avoidance. Front. Neurosci. 16, 905596. doi: 10.3389/fnins.2022.905596

Lobov, S. A., Mikhaylov, A. N., Shamshin, M., Makarov, V. A., and Kazantsev, V. B. (2020). Spatial properties of stdp in a self-learning spiking neural network enable controlling a mobile robot. Front. Neurosci. 14, 88. doi: 10.3389/fnins.2020.00088

Macktoobian, M., and Khataminejad, A. T. (2016). On the formal development of behavioral reactive agents: a systematic braitenberg-vehicle approach. arXiv [Preprint]. arXiv:1612.03979. doi: 10.48550/arXiv.1612.03979

Miguel-Blanco, A., and Manoonpong, P. (2020). General distributed neural control and sensory adaptation for self-organized locomotion and fast adaptation to damage of walking robots. Front. Neural Circ. 14, 46. doi: 10.3389/fncir.2020.00046

Ngamkajornwiwat, P., Homchanthanakul, J., Teerakittikul, P., and Manoonpong, P. (2020). Bio-inspired adaptive locomotion control system for online adaptation of a walking robot on complex terrains. IEEE Access 8, 91587–91602. doi: 10.1109/ACCESS.2020.2992794

Open Source Robotics Foundation, I. (2020). Turtlebot3. Available online at: http://wiki.ros.org/Robots/TurtleBot

Pardo-Cabrera, J., Rivero-Ortega, J. D., Hurtado-López, J., and Ramírez-Moreno, D. F. (2022). Bio-inspired navigation and exploration system for a hexapod robotic platform. Eng. Res. Express 4, 025019. doi: 10.1088/2631-8695/ac6bde

Ramirez-Moreno, D., and Hurtado-Lopez, J. (2014). Modelamiento Y Simulación De Circuitos Sinápticos Sensoriomotores: Introducción a la Neurobiología Computacional. Cali: Universidad Autónoma de Occidente.

Ramirez-Moreno, D. F., and Sejnowski, T. J. (2012). A computational model for the modulation of the prepulse inhibition of the acoustic startle reflex. Biol. Cybern. 106, 169–176. doi: 10.1007/s00422-012-0485-7

Robotics, O. (2021). ROS (Robot Operating System). Available online at: https://www.ros.org/

Robotis (2022). Turtlebot3 specifications. Available online at: https://emanual.robotis.com/docs/en/platform/turtlebot3/features/

Shim, M. S., and Li, P. (2017). “Biologically inspired reinforcement learning for mobile robot collision avoidance,” in 2017 International Joint Conference on Neural Networks (IJCNN), 3098–3105. doi: 10.1109/IJCNN.2017.7966242. Available online at: https://ieeexplore.ieee.org/document/7966242

Suzuki, S., Kano, T., Ijspeert, A. J., and Ishiguro, A. (2021). Sprawling quadruped robot driven by decentralized control with cross-coupled sensory feedback between legs and trunk. Front. Neurorobot. 14, 607455. doi: 10.3389/fnbot.2020.607455

Thor, M., Strohmer, B., and Manoonpong, P. (2021). Locomotion control with frequency and motor pattern adaptations. Front. Neural Circ. 15, 743888. doi: 10.3389/fncir.2021.743888

Wei, H., Bu, Y., and Dai, D. (2017). A decision-making model based on a spiking neural circuit and synaptic plasticity. Cogn. Neurodyn. 11, 415–431. doi: 10.1007/s11571-017-9436-2

Wilson, H. R., and Cowan, J. D. (1972). Excitatory and inhibitory interactions in localized populations of model neurons. Biophys. J. 12, 1–24. doi: 10.1016/S0006-3495(72)86068-5

Yan, Z., Fabresse, L., Laval, J., and Bouraqadi, N. (2015). “Metrics for performance benchmarking of multi-robot exploration,” in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (Hamburg: IEEE), 3407–3414.

Zahra, O., Tolu, S., Zhou, P., Duan, A., and Navarro-Alarcon, D. (2022). A bio-inspired mechanism for learning robot motion from mirrored human demonstrations. Front. Neurorobot. 16, 826410. doi: 10.3389/fnbot.2022.826410

Zhao, F., Zeng, Y., Guo, A., Su, H., and Xu, B. (2020). A neural algorithm for drosophila linear and nonlinear decision-making. Sci. Rep. 10, 1–16. doi: 10.1038/s41598-020-75628-y

Keywords: bio-inspired neural network, neuromodulation network, adaptation stage, signal processing, differential robot, exploration behavior, automaton

Citation: Guerrero-Criollo RJ, Castaño-López JA, Hurtado-López J and Ramirez-Moreno DF (2023) Bio-inspired neural networks for decision-making mechanisms and neuromodulation for motor control in a differential robot. Front. Neurorobot. 17:1078074. doi: 10.3389/fnbot.2023.1078074

Received: 24 October 2022; Accepted: 17 January 2023;
Published: 03 February 2023.

Edited by:

Bulcsú Sándor, Babeş-Bolyai University, Romania

Reviewed by:

Yuji Kawai, Osaka University, Japan
Luca Patanè, University of Messina, Italy
Bruno Lara, Universidad Autónoma del Estado de Morelos, Mexico

Copyright © 2023 Guerrero-Criollo, Castaño-López, Hurtado-López and Ramirez-Moreno. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Roberto Jose Guerrero-Criollo; Jason Alejandro Castaño-López

These authors have contributed equally to this work and share first authorship

ORCID: Roberto Jose Guerrero-Criollo orcid.org/0000-0003-4760-4410
Jason Alejandro Castaño-López orcid.org/0000-0001-5874-3614
Julián Hurtado-López orcid.org/0000-0002-3773-0598
David Fernando Ramirez-Moreno orcid.org/0000-0003-2372-3554
