Autonomous robotic exploration with simultaneous environment and traversability models learning

In this study, we address generalized autonomous mobile robot exploration of unknown environments where a robotic agent learns a traversability model and builds a spatial model of the environment. The agent can benefit from the model learned online in distinguishing which terrains are easy to traverse and which should be avoided. The proposed solution enables the learning of multiple traversability models, each associated with a particular locomotion gait, that is, a walking pattern of a multi-legged walking robot. We propose to address the simultaneous learning of the environment and traversability models by a decoupled approach: navigation waypoints are generated using the current spatial and traversability models to gain the information necessary to improve the particular model during the robot’s motion in the environment. From the set of possible waypoints, the decision on where to navigate next is made by solving the generalized traveling salesman problem, which allows taking into account a planning horizon longer than a single myopic decision. The proposed approach has been verified in simulated scenarios and experimental deployments with a real hexapod walking robot with two locomotion gaits suitable for different terrains. The achieved results indicate that the proposed method exploits the online learned traversability models and further supports the selection of the most appropriate locomotion gait for the particular terrain types.


SUPPLEMENTARY APPENDIX

S1 TERRAIN CLUSTER EROSION AND DILATION
In practice, it is not desirable to place cost exploration goals at the boundaries of terrain classes because, in such areas, a real robot with imprecise path following might fail to traverse the correct terrain, and the descriptors in such areas might be distant from the prototype $t_a(T)$. Besides, it might not be possible to acquire enough samples to learn the traversal cost on a small terrain area of a particular class. Hence, after assigning the terrain classes to cells, we erode cells that border a different (or already eroded) terrain class by reassigning them to the eroded non-class terrain $\emptyset$. With $T^{-}$ and $T^{--}$ denoting the class assignments before and after an erosion step, respectively, the erosion process is repeated $n^{\text{steps}}_{\text{erode}}$ times. As a result of the erosion, some cells are assigned the eroded non-class $\emptyset$ with no prototype to use. Hence, when assigning cost predictions for path planning, we first dilate the terrain classes by selecting the most common class in the vicinity of each cell, where the vicinity is given by $8\mathrm{nb}^{n^{\text{size}}_{\text{dilate}}}$, the $n^{\text{size}}_{\text{dilate}}$-times repeated neighborhood function $8\mathrm{nb}$. With $T^{+}$ and $T^{++}$ denoting the class assignments before and after a dilation step, respectively, the dilation process is repeated $n^{\text{steps}}_{\text{dilate}}$ times.
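The erosion and dilation steps described above can be sketched on a small grid of class labels. This is a minimal illustration and not the authors' implementation: the grid-of-lists representation, the function names, the use of `None` for the eroded non-class, the handling of the map border (cells outside the grid are ignored), and the choice to reassign only eroded cells during dilation are all assumptions.

```python
from collections import Counter

ERODED = None  # assumed stand-in for the eroded non-class terrain


def neighbors8(grid, r, c, size=1):
    """Yield labels of cells within `size` repeated 8-neighborhoods of (r, c)."""
    rows, cols = len(grid), len(grid[0])
    for dr in range(-size, size + 1):
        for dc in range(-size, size + 1):
            rr, cc = r + dr, c + dc
            # Skip the cell itself and anything outside the grid (assumption).
            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                yield grid[rr][cc]


def erode(grid, n_steps=1):
    """Mark cells bordering a different (or already eroded) class as ERODED."""
    for _ in range(n_steps):
        before = grid
        grid = [
            [
                ERODED
                if any(nb != before[r][c] for nb in neighbors8(before, r, c))
                else before[r][c]
                for c in range(len(before[0]))
            ]
            for r in range(len(before))
        ]
    return grid


def dilate(grid, n_steps=1, size=1):
    """Give each ERODED cell the most common class found in its vicinity."""
    for _ in range(n_steps):
        before = grid
        grid = [row[:] for row in before]
        for r in range(len(before)):
            for c in range(len(before[0])):
                if before[r][c] is ERODED:
                    votes = Counter(
                        nb for nb in neighbors8(before, r, c, size)
                        if nb is not ERODED
                    )
                    if votes:  # stay eroded when no class is in reach
                        grid[r][c] = votes.most_common(1)[0][0]
    return grid


# Two terrain classes side by side; the shared boundary is eroded, then refilled.
terrain = [["A", "A", "A", "B", "B", "B"] for _ in range(4)]
eroded = erode(terrain)            # first row: ['A', 'A', None, None, 'B', 'B']
restored = dilate(eroded, size=1)  # eroded cells take the nearest majority class
```

Ties between equally common classes are broken here by the order in which the classes are first encountered, which is a property of `Counter.most_common` and not something the text specifies.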

GAUSSIAN PROCESS REGRESSION
Assuming a function $f(x)$ that is observed with noise, the Gaussian Process (GP) is defined as a distribution over functions, $f(x) \sim \mathcal{GP}(m(x), K(x, x'))$, where $m(x)$ is the mean function and $K(x, x')$ is the covariance function. Given the training data $X$, the GP regressor's predictions for the query $X_*$ are computed from the covariance function evaluated on the training inputs, $K(X, X)$.
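For reference, the standard textbook form of GP regression can be written as below. This is a sketch and not necessarily the authors' exact notation: the observation vector $y$, the noise variance $\sigma_n^2$, the predictive mean $\mu$ and covariance $\Sigma$, and the zero prior mean used in the predictive equations are assumptions introduced here.

```latex
\begin{align}
y &= f(x) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma_n^2), \\
f(x) &\sim \mathcal{GP}\big(m(x), K(x, x')\big), \\
\mu(X_*) &= K(X_*, X)\,\big[K(X, X) + \sigma_n^2 I\big]^{-1} y, \\
\Sigma(X_*) &= K(X_*, X_*) - K(X_*, X)\,\big[K(X, X) + \sigma_n^2 I\big]^{-1} K(X, X_*).
\end{align}
```

With a non-zero prior mean, $y$ is replaced by $y - m(X)$ and $m(X_*)$ is added to $\mu(X_*)$.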

INCREMENTAL GROWING NEURAL GAS
The Incremental Growing Neural Gas (IGNG) is a soft-computing clustering approach proposed by Prudent and Ennaji (2005). The approach builds on the Growing Neural Gas (GNG) (Fritzke, 1994), which adapts a graph topology to continually provided measurements. However, unlike the GNG, which is enlarged after a fixed number of measurement adaptation steps, the IGNG is only grown when adapting to a value x that is out of the bounds of the current structure.
Algorithm 1: Incremental Growing Neural Gas: Adaptation
Input: $\Omega$, the IGNG structure with terrain classes $T$; $x$, the adapted measurement for the terrain descriptor $t_a$.
Output: $\Omega$, the IGNG structure for the terrain classes $T$.

The IGNG adaptation is summarized in Alg. 1, and it operates as follows. The algorithm keeps a graph of neurons (graph vertices) and their connections (graph edges) and keeps an age value for each neuron and connection. When adapting to a new measurement $x$, the algorithm first finds the closest neuron $\omega_1$ and the second closest neuron $\omega_2$. If the graph is empty or the closest neuron is too far, with $\|x - \omega_1\| > \sigma^{\mathrm{IGNG}}$, a new embryo neuron $\omega_{\mathrm{new}}$ with the age $a(\omega_{\mathrm{new}}) = 1$ is inserted at $x$. If $\omega_1$ is close enough, but the second closest $\omega_2$ is not, or there is only one neuron in the graph, a new neuron is also inserted at $x$. In that case, an edge between the new neuron and $\omega_1$ is also inserted into the graph with the age $a((\omega_1, \omega_{\mathrm{new}})) = 0$.
If both $\omega_1$ and $\omega_2$ are close enough, $\omega_1$ and all of its neighbors (neurons with an existing connection to $\omega_1$) are warped towards $x$ by the rates $\varepsilon^{\mathrm{IGNG}}_{1}$ and $\varepsilon^{\mathrm{IGNG}}_{\mathrm{nb}}$, respectively. Then, if there is already a connection between $\omega_1$ and $\omega_2$, its age is set to 0. Otherwise, the connection is created. Next, the ages of all neighbors $a(\omega_{\mathrm{nb}})$ of $\omega_1$ and their respective connections $a((\omega_1, \omega_{\mathrm{nb}}))$ are incremented by one.
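The three cases of the adaptation step described above can be sketched as follows. This is an illustrative reading of the prose and not the authors' code: the dictionary-based graph, the function and parameter names (`igng_adapt`, `sigma`, `eps_1`, `eps_nb`), the integer neuron identifiers, and the initial age 0 of a newly created connection between the two closest neurons are assumptions.

```python
import math


def igng_adapt(neurons, neuron_age, conn_age, x, sigma, eps_1, eps_nb):
    """Adapt the graph to one measurement x (all containers are modified in place).

    neurons:    dict id -> position (list of floats)
    neuron_age: dict id -> age
    conn_age:   dict frozenset({id_a, id_b}) -> age
    """
    # Rank neurons by distance to x; the first two are the closest ones.
    ranked = sorted(neurons, key=lambda n: math.dist(neurons[n], x))
    w1 = ranked[0] if ranked else None
    w2 = ranked[1] if len(ranked) > 1 else None

    def insert_embryo():
        new_id = max(neurons, default=-1) + 1
        neurons[new_id] = list(x)
        neuron_age[new_id] = 1  # embryo age as stated in the text
        return new_id

    def neighbors_of(n):
        return [next(iter(e - {n})) for e in conn_age if n in e]

    # Case 1: empty graph or closest neuron too far -> isolated embryo at x.
    if w1 is None or math.dist(neurons[w1], x) > sigma:
        insert_embryo()
        return
    # Case 2: only the closest neuron is near -> embryo at x linked to it.
    if w2 is None or math.dist(neurons[w2], x) > sigma:
        conn_age[frozenset((w1, insert_embryo()))] = 0
        return
    # Case 3: both are near -> warp the winner and its current neighbors.
    for n, eps in [(w1, eps_1)] + [(n, eps_nb) for n in neighbors_of(w1)]:
        neurons[n] = [p + eps * (xi - p) for p, xi in zip(neurons[n], x)]
    # Create the connection between the two closest neurons or reset its age.
    conn_age[frozenset((w1, w2))] = 0
    # Age all neighbors of the winner and their connections to it,
    # in the same order as the text (after the reset above).
    for n in neighbors_of(w1):
        neuron_age[n] += 1
        conn_age[frozenset((w1, n))] += 1


# Hypothetical usage with 2D descriptors and an insertion threshold of 1.0.
neurons, neuron_age, conn_age = {}, {}, {}
for measurement in ([0.0, 0.0], [0.5, 0.0], [0.2, 0.0], [3.0, 3.0]):
    igng_adapt(neurons, neuron_age, conn_age, measurement, 1.0, 0.5, 0.1)
```

Keeping connections as `frozenset` keys makes the edge lookup independent of the order of its endpoints, which keeps the sketch short at the cost of a linear scan when collecting neighbors.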
After adapting to the measurement, the graph is pruned to remove old connections and isolated neurons. In general, it is desirable for neurons to be old (since measurements were often observed near them) and for connections to be young (since measurements were recently observed along the edge). First, we identify neurons that are mature, with $a(\omega) \geq a^{\mathrm{IGNG}}_{\mathrm{mature}}$. Then, connections that are too old, with $a((\omega, \omega')) > a^{\mathrm{IGNG}}_{\mathrm{max}}$, are removed from the graph. If this leaves mature neurons isolated, these are also removed.
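The pruning step can be sketched in the same spirit. The function name `igng_prune`, the dictionary layout, and the parameter names `a_mature` and `a_max` are assumptions; as a simplification, the sketch removes every mature neuron that has no connection after the edge removal, without tracking whether the removal itself caused the isolation.

```python
def igng_prune(neurons, neuron_age, conn_age, a_mature, a_max):
    """Drop over-aged connections, then mature neurons left without any connection.

    neurons:    dict id -> position
    neuron_age: dict id -> age
    conn_age:   dict frozenset({id_a, id_b}) -> age
    """
    # Mature neurons are identified before any connection is removed.
    mature = {n for n in neurons if neuron_age[n] >= a_mature}
    # Remove connections that are too old.
    for edge in [e for e, age in conn_age.items() if age > a_max]:
        del conn_age[edge]
    # Neurons that still take part in at least one connection.
    connected = set().union(*conn_age)
    # Isolated mature neurons are removed; isolated embryos are kept.
    for n in mature - connected:
        del neurons[n]
        del neuron_age[n]


# Hypothetical example: the old edge (1, 2) is dropped, which isolates the
# mature neuron 2; the isolated embryo neuron 3 is kept.
neurons = {0: [0.0], 1: [1.0], 2: [2.0], 3: [5.0]}
neuron_age = {0: 5, 1: 5, 2: 5, 3: 1}
conn_age = {frozenset((0, 1)): 2, frozenset((1, 2)): 9}
igng_prune(neurons, neuron_age, conn_age, a_mature=3, a_max=8)
```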