# Multi-robot searching with sparse binary cues and limited space perception

^{1}UMR 7503, Laboratoire Lorrain de Recherche en Informatique et ses Applications, Centre National de la Recherche Scientifique, Vandoeuvre-lès-Nancy, France; ^{2}School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an, China; ^{3}Physics of Biological Systems, Institut Pasteur, Paris, France; ^{4}UMR 3525, Centre National de la Recherche Scientifique, Paris, France

In this paper, we consider the problem of searching for a source that releases particles in a turbulent medium with searchers having binary sensors and limited space perception. To this aim, we extend an information-theoretic strategy, namely Mapless, to multiple searchers and demonstrate its efficiency both in simulation and robotic experiments. The search time is found to decay as *1/n* for *n* cooperative robots as compared to $1/\sqrt{n}$ for independent robots, so that significant gains in the search time are obtained with a small number of robots, e.g., *n* = 3. Search efficiency results from pooling sensory information between robots to improve individual decision-making (three detections on average per searcher were sufficient to reach the source) while preserving each robot's robustness to various errors during the search. The method is robust to odometry errors and is thus relevant to robots searching in low-visibility conditions, e.g., firefighter robots exploring smoky environments.

## 1. Introduction

Searching for a source releasing particles in the environment (e.g., toxic or explosive materials, pollutants, heat) is particularly challenging given that the chemical transport over long distances is dominated by turbulence (Csanady, 1973; Shraiman and Siggia, 2000). The sensory landscape is thus very heterogeneous in concentration and discontinuous in time, and consists of sporadically located patches traveling with the air flow. The probability of encountering one of these patches decays exponentially with distance from the source. In such turbulent conditions, odor detections become intermittent and no measurement gradient points toward the source (Csanady, 1973; Humphrey and Haj-Hariri, 2012; Celani et al., 2014). Methods based on a measurement gradient like extremum seeking (Zhang et al., 2007; Cochran and Krstic, 2009) are inappropriate in this context because the searcher has to rely on intermittent binary cues (hits with odor patches) rather than continuous sampling of concentration values.

Insects can be very efficient at solving this problem. One example is provided by male moths guided by pheromonal cues and searching for mates located hundreds of meters, possibly kilometers, away (Baker et al., 1985; Murlis et al., 1992; Mafra-Neto and Carde, 1994; Vickers, 2000). Another exceptional search behavior is that of *Melanophila* beetles, which detect and track forest fires from infrared (Schmitz et al., 1997) and olfactory (Schutz et al., 1999) cues because their larvae can develop only in freshly burnt wood (Didier, 2010; Schmitz and Bousack, 2012). Artificial robots with searching abilities similar to those of these insects are expected to be very useful in many applications (Gelenbe et al., 1997), e.g., to assist human firefighters in detecting gas leaks and exploring buildings on fire. Models of search processes are therefore important not only to biology but also to applications in robotics. As a search scheme intended to deal with uncertain and dynamic environments, Infotaxis has been shown to produce trajectories similar to those of animals, e.g., moths attracted by a sexual pheromone (Vergassola et al., 2007) or nematodes foraging for food (Calhoun et al., 2014).

Infotaxis is a probabilistic search method based on information theory that relies on a grid map of the environment (Vergassola et al., 2007). The posterior probability for the source position is calculated over the entire map and the searcher moves in the direction that minimizes the entropy of the distribution. Rather than searching for the source position, the searcher moves to increase information on the position of the source (Barbieri et al., 2011; Atanasov et al., 2015). Furthermore, during the Greedy decision process, Infotaxis slightly favors exploration over exploitation of information. Infotaxis has been successfully applied to robotic searches (Martin-Moraud and Martinez, 2010), a prerequisite being that the robot has full access to its position in the environment. Yet, for robots engaged in search missions, space perception can be limited. Consider a firefighter robot searching for a fire indoors. As revealed by experiments in this paper, the presence of smoke prevents the use of cameras and laser range finders for localization. In such low-visibility conditions, Infotaxis is not easily applicable as the robot is unable to correct its odometry errors from external cues. Yet, adaptation of Infotaxis to such conditions is not excluded. Another approach, introduced in Masson (2013) as Mapless, allows searching in complex, varying environments with limited space perception, possibly corrupted or incomplete information, and limited memory. Mapless is based on a standardized projection of the probability map of the source location to remove space perception and on the evaluation of a free energy, whose minimization along the path gives direction to the searcher. Free-energy minimization allows reinforcement of the maximum likelihood decision.

Hereafter, following a procedure similar to the one shown in Masson et al. (2009), we extend Mapless to multiple searchers (swarm Mapless). Whereas decision-making is performed individually by each searcher, the probability of the source location and hence the free energy are jointly estimated by the swarm. The main difference with related works is that the information metric is approximated analytically in swarm Mapless rather than estimated from a grid map or by using (computationally expensive) Monte Carlo sampling techniques as in Cortez et al. (2009), Barbieri et al. (2011), Dames and Kumar (2013), and Atanasov et al. (2015). We present here a successful solution with a real robotic system [search for a heat source in a turbulent medium as in Martin-Moraud and Martinez (2010) and Masson (2013)]. This framework is employed as a testbed for a complete and rigorous evaluation of Mapless and swarm Mapless under real conditions. The paper is organized as follows. Infotaxis, Mapless, and swarm Mapless are detailed in the Section “Materials and Methods.” The performance, in terms of effectiveness and robustness, is assessed both in simulations and robotic experiments in the Section “Results.” Our work is discussed in the final section.

## 2. Materials and Methods

### 2.1. Infotaxis

Infotaxis was introduced in Vergassola et al. (2007) for searching in complex environments with sparse detections. It is built around two core components: Bayesian inference of the position of the source based on the detection history, and Greedy decision-making based on entropy minimization. The former depends on the modeling of the local environment. An efficient approximation describes the propagation of the cues in the turbulent environment by the advection–diffusion equation (shown in the Section “Appendix” for the sake of completeness). The properties of the medium are encoded by a rate function $R\left(\overrightarrow{r}|\overrightarrow{{r}_{0}}\right)$ with $\overrightarrow{{r}_{0}}$ the position of the source and $\overrightarrow{r}$ the position of the searcher. Note that a correlation length λ is associated with $R\left(\overrightarrow{r}|\overrightarrow{{r}_{0}}\right)$ and can be interpreted as the mean distance traveled by the particles before they vanish. The detection process is approximated by a Poisson process, leading to a probability of *k* detections during time δ*t* given by ${\rho}_{k}=\frac{{\left(R\left(\overrightarrow{r}|\overrightarrow{{r}_{0}}\right)\delta t\right)}^{k}\exp\left(-R\left(\overrightarrow{r}|\overrightarrow{{r}_{0}}\right)\delta t\right)}{k!}$. After following a path ${\Theta}_{t}$, the posterior distribution of the position of the source at time *t* reads:

$${P}_{t}\left(\overrightarrow{{r}_{0}}|{\Theta}_{t}\right)=\frac{1}{{Z}_{t}}\exp\left(-{\int}_{0}^{t}R\left(\overrightarrow{{r}_{t'}}|\overrightarrow{{r}_{0}}\right)dt'\right){\prod}_{i=1}^{H}R\left(\overrightarrow{{r}_{i}}|\overrightarrow{{r}_{0}}\right)\delta t \qquad (1)$$

with *H* the total number of detections experienced in ${\Theta}_{t}$ and ${Z}_{t}=\int \exp\left(-{\int}_{0}^{t}R\left(\overrightarrow{{r}_{t'}}|\overrightarrow{{r}_{0}}\right)dt'\right){\prod}_{i=1}^{H}R\left(\overrightarrow{{r}_{i}}|\overrightarrow{{r}_{0}}\right)\delta t\,d\overrightarrow{{r}_{0}}$ the normalization constant. The detection process being approximated as Markovian, the update of the posterior distribution ${P}_{t+dt}\left(\overrightarrow{{r}_{0}}|{\Theta}_{t}\right)$ is directly obtained from ${P}_{t}\left(\overrightarrow{{r}_{0}}|{\Theta}_{t}\right)$ by multiplying with the probability of detection or no-detection experienced during δ*t*.
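The Bayesian update described above can be sketched on a discrete grid. This is a minimal illustration under stated assumptions, not the authors' implementation: the exponentially decaying `rate` function is a placeholder standing in for the rate $R$ derived in the Appendix, and the names `rate` and `update_posterior` as well as the 50 × 50 grid are hypothetical.

```python
import math
import numpy as np

LAMBDA = 20.0  # correlation length in arbitrary units (value used in the simulations)

def rate(searcher_xy, xs, ys, lam=LAMBDA):
    """Placeholder detection rate R(r | r0) for every candidate source cell.

    Assumption: a plain exponential decay with distance; the paper's actual
    rate comes from the advection-diffusion model of its Appendix.
    """
    d = np.hypot(xs - searcher_xy[0], ys - searcher_xy[1])
    return np.exp(-d / lam)

def update_posterior(posterior, searcher_xy, k, dt, xs, ys):
    """Multiply the posterior by the Poisson likelihood of k detections, then renormalize."""
    mean_hits = rate(searcher_xy, xs, ys) * dt
    likelihood = mean_hits**k * np.exp(-mean_hits) / math.factorial(k)
    new_posterior = posterior * likelihood
    return new_posterior / new_posterior.sum()

# 50 x 50 grid of candidate source positions with a uniform prior
ys, xs = np.mgrid[0:50, 0:50].astype(float)
prior = np.full(xs.shape, 1.0 / xs.size)

no_hit = update_posterior(prior, (25.0, 25.0), k=0, dt=1.0, xs=xs, ys=ys)
one_hit = update_posterior(prior, (25.0, 25.0), k=1, dt=1.0, xs=xs, ys=ys)
```

With this placeholder rate, a step without detection lowers the probability around the searcher, whereas a detection raises it there.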

Moving toward the most probable source location, i.e., a maximum likelihood or maximum *a posteriori* strategy, systematically fails far from the source because of the misrepresentation of the environment by ${P}_{t}\left({\overrightarrow{r}}_{0}|{\Theta}_{t}\right)$. Infotaxis instead searches for information about the position of the source rather than directly trying to reach it. Upon moving to a neighboring position, ${\overrightarrow{r}}_{t+dt}$, the searcher minimizes the expected variation of entropy of ${P}_{t}\left({\overrightarrow{r}}_{0}|{\Theta}_{t}\right)$

$$\overline{\Delta {S}_{t}}\left({\overrightarrow{r}}_{t}\to {\overrightarrow{r}}_{t+dt}\right)=-{P}_{t}\left({\overrightarrow{r}}_{t+dt}|{\Theta}_{t}\right){S}_{t}+\left[1-{P}_{t}\left({\overrightarrow{r}}_{t+dt}|{\Theta}_{t}\right)\right]{\sum}_{k}{\rho}_{k}\left({\overrightarrow{r}}_{t+dt}\right)\Delta {S}_{t}^{k} \qquad (2)$$

where ${S}_{t}=-\int {P}_{t}\left({\overrightarrow{r}}_{0}|{\Theta}_{t}\right)\log {P}_{t}\left({\overrightarrow{r}}_{0}|{\Theta}_{t}\right)d{\overrightarrow{r}}_{0}$ is the entropy of the posterior field computed at time *t* and $\Delta {S}_{t}^{k}$ is the entropy variation after *k* detections at the new position. The first term encodes the probability of finding the source and promotes maximum likelihood decision, and the second, which encodes the probability of not finding the source in ${\overrightarrow{r}}_{t+dt}$, promotes exploration of the environment. In the rest of the paper, the summation will be reduced to zero and one detection, as the probability of having more detections during δ*t* is usually extremely low.
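The Greedy decision rule can be illustrated with a small one-dimensional example. This sketch makes several assumptions that are not in the paper: the world is a line of 21 cells, the rate is a placeholder exponential decay, the detection probabilities are computed from the posterior-averaged rate, and the sum is truncated at one detection. The function names are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution, ignoring empty cells."""
    q = p[p > 0]
    return float(-(q * np.log(q)).sum())

def expected_entropy_change(posterior, rates_at_target, target_idx, dt):
    """Expected entropy variation for a move to cell `target_idx`.

    rates_at_target[c] is the detection rate at the target if the source is in cell c.
    """
    s_now = entropy(posterior)
    p_found = posterior[target_idx]
    # Source not at the target: remove that cell and renormalize
    not_found = posterior.copy()
    not_found[target_idx] = 0.0
    not_found /= not_found.sum()
    # Expected number of hits at the target, averaged over the posterior
    mean_hits = float((not_found * rates_at_target).sum()) * dt
    rho0 = np.exp(-mean_hits)
    rho1 = mean_hits * np.exp(-mean_hits)
    # Posterior after zero or one detection at the target
    post0 = not_found * np.exp(-rates_at_target * dt)
    post0 /= post0.sum()
    post1 = not_found * rates_at_target * dt * np.exp(-rates_at_target * dt)
    post1 /= post1.sum()
    ds0 = entropy(post0) - s_now
    ds1 = entropy(post1) - s_now
    return float(-p_found * s_now + (1.0 - p_found) * (rho0 * ds0 + rho1 * ds1))

cells = np.arange(21, dtype=float)
posterior = np.full(21, 1.0 / 21)
scores = {}
for target in (9, 10, 11):  # move left, stay, move right from cell 10
    rates = np.exp(-np.abs(cells - target) / 5.0)  # placeholder rate, lambda = 5
    scores[target] = expected_entropy_change(posterior, rates, target, dt=1.0)
best_move = min(scores, key=scores.get)
```

The searcher would then take `best_move`, the candidate with the lowest expected entropy variation.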

### 2.2. Mapless

A prerequisite for the evaluation of equation (2) based on the probability map [equation (1)] is that the agent perceives space, i.e., the agent is able to (i) build a spatial map of the environment, (ii) locate itself on the map, and (iii) go purposefully to predefined locations. These three tasks have been extensively studied in robotics and are known under the term SLAM for *simultaneous localization and mapping* (Thrun et al., 2005). In the case of fire searching, however, precise localization of the robot and map building operations would be difficult to achieve because infrared light and smoke particles emitted from burning objects prevent the use of cameras and laser range finders. In such low-visibility conditions, it is safer to conduct the search based on a coarse estimation of the robot position.

Mapless was introduced in Masson (2013) as a method for searching with limited space perception, for handling unreliable cues, and for actively controlling the exploration/exploitation balance. To remove space perception, the posterior distribution ${P}_{t}\left({\overrightarrow{r}}_{0}|{\Theta}_{t}\right)$ is projected into a standardized form. The posterior distribution, which is later not used directly by the agent to decide its direction, reads

where ${\overrightarrow{r}}_{G}$ is the damped center of mass of the detections, the ${\overrightarrow{r}}_{j}$ are the positions of the agent (as perceived by the agent) when there was no detection, ${\lambda}_{G}$ is the scale of the Gaussian approximating the detection term, ${\lambda}_{u}$ is the scale of the Gaussian approximating the non-detection term, and ${Z}_{t}^{M}$ is a normalization constant. This projection is based on the separation of the detection and non-detection terms, approximating the former by its main component and the latter by a mean field approximation [Supplementary in Masson (2013)]. This projection allows an essential component of the posterior distribution of the source position to be encoded: the local decrease of the probability around the visited locations where no detections have been experienced. Whereas ${N}_{t}$ is the total number of visited positions in ${\Theta}_{t}$, only the last ${N}_{M}$ positions are recalled to compute the posterior. This prevents storing indefinitely unreliable cues or positioning errors. The parameters ${\lambda}_{u}$ and ${\lambda}_{G}$ are related to the correlation length λ of the source but are not necessarily the same, as the non-detection term is made of a larger number of events and is less localized than the detection term.

Instead of the entropy in Infotaxis (Vergassola et al., 2007), a free-energy formulation is used in Mapless. The free energy is written as ${F}_{t}={W}_{t}+T{S}_{t}$ with *T* an internal (temperature) parameter that controls the balance between the entropy ${S}_{t}=-\int {P}_{t}^{M}\left({\overrightarrow{r}}_{0}|{\Theta}_{t}\right)\log {P}_{t}^{M}\left({\overrightarrow{r}}_{0}|{\Theta}_{t}\right)d{\overrightarrow{r}}_{0}$ and the “working energy” ${W}_{t}={\int}_{A}{P}_{t}^{M}\left({\overrightarrow{r}}_{0}|{\Theta}_{t}\right)d{\overrightarrow{r}}_{0}$, where the integration domain *A* is defined as $|{\overrightarrow{r}}_{0}-{\overrightarrow{r}}_{G}|\le {\lambda}_{G}/2$ (Masson, 2013). Note that free energy has been previously used as a principle for linking action to perception (Friston et al., 2010, 2011) and that various functionals can be used for the “work term.” In Mapless, the free-energy formulation allows an active control between exploration and exploitation through the internal temperature *T*; see Masson (2013) for the details. When the agent moves from position ${\overrightarrow{r}}_{t}$ at time *t* to a neighboring position, ${\overrightarrow{r}}_{t+dt}$, the expected variation in the free energy reads

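$$\overline{\Delta {F}_{t}}\left({\overrightarrow{r}}_{t}\to {\overrightarrow{r}}_{t+dt}\right)={P}_{t}^{M}\left({\overrightarrow{r}}_{t+dt}|{\Theta}_{t}\right)\left[1-{F}_{t}\right]+\left[1-{P}_{t}^{M}\left({\overrightarrow{r}}_{t+dt}|{\Theta}_{t}\right)\right]\left[{\rho}_{0}\Delta {F}_{t}^{0}+{\rho}_{1}\Delta {F}_{t}^{1}\right] \qquad (4)$$

where the summation has been limited to zero and one detection. The first and second terms on the right-hand side correspond to finding and not finding the source at the new position, respectively. If the source is found at the next step, the free energy *F*_{t + dt} becomes one. If the source is not found, the agent may or may not detect, leading to different variations in the free energy, namely $\Delta {F}_{t}^{1}$ and $\Delta {F}_{t}^{0}$.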
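To make the two ingredients of the free energy concrete, the sketch below evaluates $F = W + TS$ numerically for a posterior sampled on a grid. This is only an illustration: Mapless evaluates the free energy analytically without building such a map, and the function name, the grid, and the use of a discrete (cell-wise) entropy are assumptions.

```python
import numpy as np

def free_energy(posterior, xs, ys, r_g, lambda_g, temperature=1.0):
    """F = W + T*S for a normalized posterior sampled on a grid (numerical sketch)."""
    p = posterior[posterior > 0]
    s = float(-(p * np.log(p)).sum())  # discrete entropy S
    # Working energy W: probability mass within lambda_g / 2 of the center r_g
    in_a = np.hypot(xs - r_g[0], ys - r_g[1]) <= lambda_g / 2.0
    w = float(posterior[in_a].sum())
    return w + temperature * s

ys, xs = np.mgrid[0:41, 0:41].astype(float)
r_g = (20.0, 20.0)

found = np.zeros(xs.shape)
found[20, 20] = 1.0                               # all mass on one cell
uniform = np.full(xs.shape, 1.0 / xs.size)        # no information at all

f_found = free_energy(found, xs, ys, r_g, lambda_g=10.0)
f_uniform = free_energy(uniform, xs, ys, r_g, lambda_g=10.0)
```

When all the probability mass sits on a single cell inside *A*, the entropy vanishes and *W* = 1, so the free energy equals one, in line with the statement that the free energy becomes one when the source is found.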

An important characteristic of approximating the posterior ${P}_{t}$ [equation (1)] by ${P}_{t}^{M}$ [equation (3)] is that the free energy ${F}_{t}$ can be computed analytically without computing the approximated posterior distribution ${P}_{t}^{M}\left({\overrightarrow{r}}_{0}|{\Theta}_{t}\right)$. All terms involved in the computation of equation (4) are described in Masson (2013). Thus, unlike Infotaxis, Mapless does not require the searcher to build a probability map and locate itself precisely. Efficient searches far from the source and with significant odometry errors are demonstrated in Masson (2013).

### 2.3. Swarm Mapless

Interest in swarms of agents stems from the expected increase in efficiency when multiple agents perform a task. As multiple agents can explore an environment more efficiently as a group than as individuals (Berdahl et al., 2013), we propose here an extension of Mapless to collective search. The best performing strategy would be full collaboration between the agents; that is, the free energy is computed from the shared observations and decision-making is obtained by evaluating the effects of moving the whole swarm. Yet, the number of possible actions for *n* agents on a square 2D grid is ${5}^{n}$, so that performing full collaboration in real time is difficult in practice when *n* > 3. An alternative approach is that the agents share the information gathered along their paths (i.e., detection and non-detection events) but decision-making is performed individually. Namely, the *s*-th agent chooses the move ${\overrightarrow{r}}_{t}^{s}\to {\overrightarrow{r}}_{t+dt}^{s}$ that minimizes

the expected variation of the free energy [as in equation (4)], evaluated from the history shared by the swarm, where ${\Theta}_{t}=\left\{{\Theta}_{t}^{1},{\Theta}_{t}^{2},\dots ,{\Theta}_{t}^{s},\dots \right\}$ denotes the search history for the whole swarm while ${\Theta}_{t}^{s}$ is the self-generated path of the *s*-th agent. It is worth remembering that the robots share their own measurements of their paths and detection history; they thus share paths with odometry errors and possibly anomalous detections. Yet, as will be shown, Mapless and swarm Mapless are robust to these errors. This strategy is referred to as swarm Mapless in the following.
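The computational argument for individual decision-making can be made explicit with a short count. The sketch assumes five moves per agent (four neighbors plus staying still) and that, with individual decisions, each agent evaluates only its own five moves; the function names are hypothetical.

```python
def joint_actions(n_agents, moves_per_agent=5):
    """Number of joint moves to evaluate when the whole swarm decides together."""
    return moves_per_agent ** n_agents

def individual_evaluations(n_agents, moves_per_agent=5):
    """Number of moves evaluated when each agent decides on its own."""
    return moves_per_agent * n_agents

for n in (1, 2, 3, 5):
    print(n, joint_actions(n), individual_evaluations(n))
```

For three agents this gives 125 joint moves against 15 individual evaluations per time step, and for five agents 3125 against 25.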

## 3. Results

### 3.1. Swarm Mapless in Simulation: Three Searchers Are Sufficient

Before considering robotic implementations, we first assess the performance of swarm Mapless using numerical simulations in terms of effectiveness (search time) and robustness (with respect to changes in environmental conditions).

The dependency of the search time ${t}_{s}$ on the number of searchers *n* is shown in Figure 1A for swarm Mapless as compared to independent and fully collaborative searchers (see Materials and Methods). The simulations were performed in C with the parameters given in the figure caption. In all cases, the data are well fitted by a power law ${t}_{s}\propto {n}^{\beta}$. The exponent β is −0.5 for independent searchers and −0.9 for swarm Mapless, so that ${t}_{s}$ decays more rapidly (approximately ${t}_{s}\propto \frac{1}{n}$) when the searchers cooperate than when they are independent (${t}_{s}\propto \frac{1}{\sqrt{n}}$). Interestingly, both values of β are consistent with Masson et al. (2009) for swarms of infotactic searchers.

**Figure 1. Efficiency of swarm Mapless in simulation**. **(A)** Dependency of the search time ${t}_{s}$ on the number of Mapless searchers *n*. Simulations were performed in C on Ubuntu Linux (2.2 GHz). Sensory information is binary (detection/no detection). The environmental parameters used in the simulations are (in arbitrary units) the emission rate at the source *J* = 1, diffusivity *D* = 1, lifetime of particles τ = 400, and wind speed *V* = 0, leading to a correlation length λ = 20. The other Mapless parameters to compute the free energy are the scaling factors ${\lambda}_{u}$ = 0.5λ and ${\lambda}_{G}$ = λ for the detection and non-detection events, the number ${N}_{M}$ = 750 of visited locations stored in memory, and the internal temperature *T* = 1. Blue, red, and black plots are in log–log scale for independent searchers, swarm Mapless, and fully collaborative searchers, respectively. Points are means ± SEM estimated over 2000 simulations. Solid lines represent power law fits ${t}_{s}\propto {n}^{\beta}$. The exponent β is −0.9 for swarm Mapless, −0.95 for fully collaborative searchers, and −0.5 for independent searchers, so that the search time decays approximately as *1/n* for swarm Mapless and $1/\sqrt{n}$ for independent searchers. **(B–D)** Examples of swarm Mapless trajectory with three searchers. The source is located at (21, 41). The agent starting points are (16, 20), (21, 21), and (26, 21). **(E)** Dependency of the search time ${t}_{s}$ on the number of independent random walkers *n*. Same conditions as in **(A)**. **(F)** Example of search path with three independent random walkers. Same conditions as in **(B)**.
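A power-law exponent such as β is commonly estimated by a straight-line fit in log–log coordinates. The sketch below shows this on synthetic, noise-free data generated with β = −0.9; the data, the prefactor of 1000 steps, and the function name are illustrative assumptions, and the paper's own fitting procedure is not specified here.

```python
import numpy as np

def fit_power_law(n, t):
    """Fit t = a * n**beta by least squares on log(t) versus log(n)."""
    beta, log_a = np.polyfit(np.log(n), np.log(t), 1)
    return float(np.exp(log_a)), float(beta)

n = np.array([1, 2, 3, 4, 5], dtype=float)
t_synthetic = 1000.0 * n ** -0.9  # synthetic search times, not the paper's measurements

a, beta = fit_power_law(n, t_synthetic)
```

On real data, the means would carry uncertainties (SEM), so a weighted fit could be preferable to this unweighted one.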

We also note that the gain resulting from full collaboration between the agents is marginal as compared to swarm Mapless with individual decision-making (β = −0.95 for full collaboration vs. −0.9 for swarm Mapless). The gain with fully collaborative searchers was more significant in Infotaxis (Masson et al., 2009) than in Mapless. This is a consequence of the reduced representation of the environment in Mapless: full collaboration between searchers does not much improve Greedy decision processes based on limited information. Yet, the implementation cost of full collaboration is much higher. Due to the power law, the percentage decrease in the search time for swarm Mapless is 70% from *n* = 1 to 3 searchers and 20% from *n* = 3 to 5, so that swarm Mapless yields substantial gains in the search time even with a limited number of searchers (*n* = 3).

Some examples of swarm Mapless trajectories obtained with three searchers are depicted in Figures 1B–D. We note that some agents initially follow a direction opposite to that of the source. These incorrect paths are not surprising given the uncertain belief resulting from the lack of detections at the beginning of the search. Yet, the direction toward the source emerges as information from odor detections is gathered over time. On average, we found that only three detections per searcher are sufficient to reach the source in the search configuration displayed here. It is worth noting that even if the detections are rare (characteristic of searches in dilute or desert-like conditions), they are nevertheless crucial to the search process. To assess their importance, we performed complementary simulations with random walkers (see Figures 1E,F). The search time of random walkers also exhibits a power law decay with *n*. Yet, it is many orders of magnitude higher than for swarm Mapless. It is also worth noting that swarm Mapless, Mapless, and obviously Infotaxis exploit the non-detections to explore the search space. Swarm Mapless gains a lot of efficiency from the various parts of the search space where no detections occurred. From the observations above, it is therefore sufficient for experimental purposes to consider a swarm of three searchers to improve effectiveness.

It has been shown that Mapless is resistant to incorrect modeling of the environment (Masson, 2013). Is this robustness preserved when the swarm accumulates information on multiple locations at the same time? To answer this question, we tested swarm Mapless under varying conditions, i.e., isotropic diffusivity *D* in the range 0.4–1.6 au in Figure 2A, lifetime of particles τ in the range 100–800 au in Figure 2B, and ${\lambda}_{G}/\lambda$ and ${\lambda}_{u}/\lambda$ in the range 0.1–1 in Figure 2C. In each condition, we observe that the variability (as given by the SD of the search time) decreases with the number of searchers (see Figure 2D). Moreover, the gain in robustness is more pronounced for *n* ≤ 3 than for *n* > 3.

**Figure 2. Robustness of swarm Mapless in simulation (*n* = number of searchers)**. **(A)** Swarm Mapless considers an isotropic diffusivity *D* that is different from the one of the source (*D* = 1 au). Points represent the search time ${t}_{s}$ as means ± SEM estimated over 2000 simulations. Different colors are associated with the number of searchers as indicated in the figure. Within- and between-group differences are significant (Kruskal–Wallis test). Asterisks indicate significant differences (\**p* < 0.05, \*\*\**p* < 0.001). **(B)** Swarm Mapless considers a lifetime of particles τ that is different from the one of the source (τ = 400 au). Same conditions as in **(A)**. **(C)** Swarm Mapless considers scaling factors ${\lambda}_{u}$ and ${\lambda}_{G}$ for detection and non-detection events that are different from the correlation length λ of the source. **(D)** SD of the search time, SD(${t}_{s}$), versus the number of Mapless searchers *n* under the three conditions: mismatch in diffusivity *D* [as in **(A)**], mismatch in lifetime of particles τ [as in **(B)**], and mismatch in correlation length λ [as in **(C)**].

### 3.2. Swarm Mapless in Robotic Experiments: Resistance to Odometry Errors

Promising results were achieved with Swarm Mapless in simulation (Figures 1 and 2). Nevertheless, experimental implementations are necessary to ensure that swarm Mapless can be used in real turbulent environments and can handle the numerous errors encountered. We present hereafter a successful solution for implementing swarm Mapless within a robotic system, and we assess its performance in the real environment. All experiments were performed with Khepera III robots (K-Team SA, Switzerland) and several modules: Korebot II (embedded ARM processor running Linux 2.6 at 600 MHz), KoreIOLE (acquisition board with 12 analog inputs in the 0–5 V range with 5 mV resolution), and KoreWifi (board allowing Wifi communication with the robot).

As a proof of concept for fire searching, we consider the search for a heat source with robots equipped with temperature sensors. If the heat source is set to only a few degrees above room temperature, the setup is also valid to model olfactory cue searches as in Masson (2013). The environmental conditions inside a building on fire can rapidly deteriorate, making visual navigation difficult in the presence of smoke. To assess whether precise localization can still be obtained in low-visibility conditions, we equipped a robot with a rangefinder module (Figure 3A). The rangefinder sensor is a LIDAR (URG-04LX, Hokuyo) that determines the distance to objects from the time-of-flight of a rotating laser. In smoky conditions, the LIDAR is not able to detect the boundaries of the test apparatus (Figure 3B). Instead, the LIDAR returns the distance to the bottom of the smoke layer, so that the measured distance decreases with the smoke density (from moderate to heavy in Figure 3B). In agreement with Pascoal et al. (2008) and Starr and Lattimer (2014), this result indicates that a LIDAR would not be capable of providing accurate range finding information in smoky environments.

**Figure 3. Limited space perception with smoke**. **(A)** LIDAR sensor (±120° detection range) mounted on a Khepera robot. **(B)** Experiments with the robot placed in a closed chamber (dashed rectangle). Black and colored points are measurements obtained during 100 scans without and with smoke, respectively. Artificial smoke is produced by a smoke machine (VDL400SM, Velleman). Higher levels of smoke (from moderate to heavy) are obtained by running the machine for longer time periods.

An alternative method for robot localization is to use odometry, which is path integration of the robot velocity sensed from its wheels. To assess the localization error, we compared robot trajectories obtained from the odometry tracking module of the Khepera III Toolbox (http://en.wikibooks.org/wiki/Khepera_III_Toolbox) to the ground truth provided by a motion capture device (Figure 4A). An example trajectory is shown in Figure 4B. The systematic error, resulting from discrete-time integration and/or incorrect parameters in robot kinematics, appeared to be small. Yet, we noticed the occurrence of large non-systematic errors due to wheel slippage during the re-orientation phases of the robot (Figure 4B). Although similar re-orientation phases are used in the experiments below due to step-like movements, we tested swarm Mapless without correcting for odometry errors. Experimental Mapless searches shown in Masson (2013) were very resistant to strong odometry errors, yet accumulating information from multiple searchers also means accumulating errors from all searchers. Thus, it is important to examine the effect of odometry errors in the context of swarm Mapless experimental searches. In some ways, this allows us to assess the robustness of the algorithm.

**Figure 4. Odometry errors**. **(A)** Motion capture device used to characterize the odometry error in our robot. Six infrared cameras (Qualisys Oqus 7, 12 MP/300 Hz) allow robot tracking with millimeter precision. **(B)** Typical example of systematic and non-systematic errors. The trajectory in black is estimated from integration of the robot velocity sensed from its wheels. The trajectory in red is the ground truth measured by the motion capture device **(A)**. The systematic error resulting from discrete-time integration and/or incorrect parameters in robot kinematics is small. The non-systematic error resulting from wheel slippage (here occurring during the re-orientation phases of the robot) is large.
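How a slipping turn corrupts the odometry estimate can be sketched with standard differential-drive dead reckoning. This is not the Khepera III Toolbox code: the wheel base of 0.09 m, the 10% loss of rotation during the turn, and the function names are hypothetical values chosen for illustration only.

```python
import math

def integrate_odometry(pose, v_left, v_right, wheel_base, dt):
    """One Euler step of differential-drive dead reckoning from wheel speeds (m/s)."""
    x, y, theta = pose
    v = 0.5 * (v_left + v_right)              # forward speed
    omega = (v_right - v_left) / wheel_base   # turning rate (rad/s)
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

def run_step_pattern(turn_efficiency, wheel_base=0.09, dt=0.01):
    """Drive 0.2 m straight, turn 90 degrees in place, drive 0.2 m straight.

    turn_efficiency < 1 mimics wheel slippage: the wheels spin as commanded,
    but the body rotates less than the wheel sensors suggest.
    """
    pose = (0.0, 0.0, 0.0)
    turn_speed = 0.5 * wheel_base * math.radians(90.0)  # wheel speed for 90 deg/s
    segments = [(0.10, 0.10, 2.0, 1.0),
                (-turn_speed, turn_speed, 1.0, turn_efficiency),
                (0.10, 0.10, 2.0, 1.0)]
    for v_l, v_r, duration, eff in segments:
        for _ in range(round(duration / dt)):
            pose = integrate_odometry(pose, v_l * eff, v_r * eff, wheel_base, dt)
    return pose

estimated = run_step_pattern(turn_efficiency=1.0)  # what the odometry believes
actual = run_step_pattern(turn_efficiency=0.9)     # hypothetical slipping robot
position_error = math.hypot(estimated[0] - actual[0], estimated[1] - actual[1])
```

In this toy case, a single slipping turn already displaces the estimate by about 3 cm after one 0.2 m step, and since the heading error persists, the position error keeps growing over subsequent steps.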

In swarm Mapless experiments, we consider the search for a heat source with three robots equipped with temperature sensors (Figure 5A). The temperature signal was amplified and filtered with a custom-made board previously designed for biological signals (Martinez et al., 2014). The search was performed in an arena 6 m long by 4 m wide, resulting in a grid-based model of the environment of 30 × 20 steps (Figure 5B). At every step, each robot chooses the best strategy in terms of free-energy minimization among the five possible actions, i.e., making a move to one of the four neighboring steps or staying still. Linear and angular speeds were set to 10 cm/s and 90°/s as they offer a good compromise between minimizing the errors in the step-like movements and being fast enough (each individual step, including translation and rotation, is performed in ≈3 s). To allow searching with obstacles (e.g., the boundaries delimiting the search space) and prevent the robots from running into each other, we added to swarm Mapless a Braitenberg avoidance scheme based on the readings of the Khepera proximity sensors. An example of collective search with three robots is shown in Figure 6.
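The ≈3 s per step quoted above can be recovered from the arena dimensions and the speed settings. The short calculation below assumes that a typical step combines one grid-cell translation with a 90° re-orientation; this decomposition and the helper name are assumptions.

```python
ARENA_M = (6.0, 4.0)    # arena length and width in meters
GRID_STEPS = (30, 20)   # grid cells along each dimension
LINEAR_SPEED = 0.10     # m/s
ANGULAR_SPEED = 90.0    # deg/s

step_length = ARENA_M[0] / GRID_STEPS[0]      # 0.2 m per grid step
translation_time = step_length / LINEAR_SPEED  # time to cross one cell
rotation_time = 90.0 / ANGULAR_SPEED           # assumed quarter turn per step
step_time = translation_time + rotation_time

def steps_to_minutes(n_steps):
    """Convert a number of search steps into minutes under the assumptions above."""
    return n_steps * step_time / 60.0
```

Under these assumptions, a step takes about 3 s (2 s of translation plus 1 s of rotation), so that 100 steps correspond to roughly 5 min of search.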

**Figure 5. Robotic experiments (proof of concept for fire searching)**. **(A)** Temperature sensor (Thermocouple probe TKA01-5 type K, T.M.Electronics) mounted on a Khepera robot. Preprocessing (amplification ×5000, sampling frequency 1 kHz) is performed via a custom-made board. **(B)** The search space is 6 m long by 4 m wide, resulting in a grid-based model of the environment of 30 × 20 steps. In order to obtain statistically comparable results, all trials reported hereafter are done with the three robots initially located at *(x, y)* = (5, 6), (10, 6), and (15, 6) and the heat source (S) at (10,24). The heat source had an internal fan with 90° oscillation producing a wind oriented downward with fluctuations around the *y* axis, as indicated by the arrows.

**Figure 6. Swarm searching with three robots**. Snapshot of the collective search at particular steps. At each step, the source S is at location (10,24) and wind blows as indicated by the arrow. At step 0, the three robots start from locations (5,6), (10,6), and (15,6). At step 51, robot #2 found the source.

The heat source had an internal fan with 90° oscillation aimed at increasing wind fluctuations and thereby the turbulence level. This dispersion model was also used in Masson (2013). The air conditioning was turned off while other instruments and furniture in the room were placed as usual. This setup led to a complex temperature pattern, and the heat source was sufficiently hot for the robots to detect local temperature variations at several meters from the source. Figure 7 provides two examples of the signal measured by the robot while moving straight toward the heat source (Figure 7A) and without the source (Figure 7B). Detection events are triggered each time the temperature signal exceeds an adaptive threshold (see figure caption for details). The statistics of detections obtained by repeating the experiment 12 times are shown in Figures 7C,D with and without the source. With the heat source (Figure 7C), the detection rate decays exponentially with the source distance, in good agreement with the expression of $R\left(\overrightarrow{r}|\overrightarrow{{r}_{0}}\right)$ derived in the Section “Appendix” with a correlation length of λ = 20 au. Without the source (Figure 7D), the false alarm rate is low (≈1 false positive every 12 s) and independent of the source distance.

**Figure 7. Detection events with and without source**. **(A,B)** Temporal evolution of the measured temperature as a function of the source distance when the robot moves straight toward the source location. The blue curve represents the local variation of the temperature; that is, the difference between the current temperature and a running average calculated over a 10-s sliding window. Red dots correspond to detection events triggered each time the variation in temperature exceeds 15 digits. **(C,D)** Histogram of the number of detections with respect to the source distance *d* with and without the heat source (*n* = 12 trials in each condition). The dashed curve in **(C)** corresponds to a fit with the detection rate $R\left(\overrightarrow{r}|\overrightarrow{{r}_{0}}\right)$ derived in the Section “Appendix” with a correlation length of λ = 20 au. The dashed line in **(D)** corresponds to a mean false alarm rate of 0.08 detection/s.
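The adaptive-threshold detection described in the caption can be sketched as follows. The trailing (causal) running average, the rising-edge rule that turns one warm patch into a single event, and the function name are assumptions; the example uses a synthetic signal sampled at 100 Hz for compactness, whereas the robot board samples at 1 kHz.

```python
import numpy as np

def detect_events(signal, fs_hz, window_s=10.0, threshold=15.0):
    """Return sample indices where (signal - running mean) rises above the threshold."""
    win = int(window_s * fs_hz)
    csum = np.cumsum(np.insert(signal.astype(float), 0, 0.0))
    idx = np.arange(len(signal))
    start = np.maximum(0, idx - win + 1)
    # Mean over the trailing window ending at each sample (shorter at the start)
    running_mean = (csum[idx + 1] - csum[start]) / (idx - start + 1)
    above = (signal - running_mean) > threshold
    # Keep only rising edges so that one patch yields one detection event
    rising = above & ~np.concatenate(([False], above[:-1]))
    return np.flatnonzero(rising)

fs = 100.0
signal = np.full(3000, 500.0)   # flat baseline of 500 digits for 30 s
signal[2000:2050] += 40.0       # synthetic warm patch lasting 0.5 s at t = 20 s
events = detect_events(signal, fs)
```

Because the threshold is applied to the deviation from a running average, slow drifts of the ambient temperature do not trigger detections, whereas brief warm patches do.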

To test the effectiveness of swarm Mapless in a real environment, we repeated experiments until 20 successful runs were obtained in each condition. A trial is counted as successful when one of the robots in the swarm reaches the source within a time limit set at 700, 600, and 500 steps for *n* = 1, 2, and 3 robots, respectively. Above this time limit, the robots are considered to be lost. The total number of trials with (1, 2, 3) robots was (21, 22, 24) and (21, 21, 23) for collaborative and independent robots, respectively. The success rate is high and comparable to what has been previously obtained with one robot (Masson, 2013). More interestingly, the power law dependency of the search time obtained in robotic experiments is similar to the one in simulation (Figure 8A). Thus, the search time also decays as *1/n* for swarm Mapless with *n* robots as compared to $1/\sqrt{n}$ for independent robots. The *1/n* decay of swarm Mapless leads to significant gains in the search time. As an example, the mean duration of the search is ≈20 min with one robot as compared to ≈7 min with three robots. An example of swarm Mapless trajectory is shown in Figure 8B. It is worth noting that the paths of the three robots were reconstructed from an external video camera and not from the odometry of the robots. The reason is that, during the search, the robots have enough time to accumulate odometry errors, so their estimated trajectories no longer match their actual paths. Nevertheless, the efficiency of the search confirms that swarm Mapless is resistant to odometry errors. Among the useful properties of swarm Mapless, the capability to handle erroneous information is an important one, allowing for efficient applications in real environments.

**Figure 8. Efficiency of swarm Mapless in robotic experiments**. **(A)** Dependency of the search time (number of steps) on the number of searchers in log–log scale for robotic experiments (circles) and simulations (points and power law fits from Figure 1A). Results are means ± SEM estimated over 20 trials for robotic experiments (same parameters as in Figure 1A). Red and blue plots are for swarm Mapless and independent searchers, respectively. **(B)** Example of swarm Mapless trajectory with three robots.
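The two scaling laws can be compared numerically with a short sketch. The helper `fit_power_law` and the idealized data below are illustrative assumptions: the data are generated from the exact laws *1/n* and $1/\sqrt{n}$ starting from the ≈20 min single-robot duration quoted in the text, and are not the measured search times.

```python
import numpy as np

def fit_power_law(n_robots, search_time):
    """Least-squares fit of T = T1 * n**(-alpha) in log-log space.

    Returns the prefactor T1 and the decay exponent alpha.
    """
    slope, intercept = np.polyfit(np.log(n_robots), np.log(search_time), 1)
    return float(np.exp(intercept)), float(-slope)

n = np.array([1, 2, 3])
t1_min = 20.0  # approximate single-robot search duration quoted in the text

cooperative = t1_min / n            # idealized 1/n scaling (swarm Mapless)
independent = t1_min / np.sqrt(n)   # idealized 1/sqrt(n) scaling (independent robots)

_, alpha_coop = fit_power_law(n, cooperative)
_, alpha_ind = fit_power_law(n, independent)

print(f"cooperative, n=3: {cooperative[2]:.1f} min, exponent {alpha_coop:.2f}")
print(f"independent, n=3: {independent[2]:.1f} min, exponent {alpha_ind:.2f}")
```

Under the *1/n* law, three robots would need about 20/3 ≈ 6.7 min, consistent with the ≈7 min reported, whereas three independent robots would need about 20/√3 ≈ 11.5 min under the $1/\sqrt{n}$ law.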

## 4. Discussion

In the case of diffusion (the signal is maximum at the source location and decays with distance from the source), search methods based on a measurement gradient (Ogren et al., 2004; Zhang et al., 2007; Cochran and Krstic, 2009) are guaranteed to converge to the source location. Multiple searchers can similarly be used to locate the plume front in the case of advection (Li et al., 2014). These methods are applicable only in the presence of a homogeneous signal field for which the computation of a measurement gradient is feasible. Here, we addressed the more challenging problem of searching in a turbulent medium. In this context, even if a local gradient could be measured, its direction would not point toward the source; the searcher therefore has to rely on intermittent binary cues.

To this aim, we considered an information-theoretic method (Mapless) and its extension to multiple searchers (swarm Mapless). The search strategy is motivated by the fact that the expected search time is bounded by the Shannon entropy of the probability distribution for the source location (Vergassola et al., 2007). The reduction of entropy in the estimated distribution is thus a necessary (although not sufficient) condition for effective searching. No formal mathematical proof of convergence exists for Mapless and swarm Mapless. Yet, simulations in Masson (2013) show exponentially tailed distributions for the search time, ensuring that the average search time is not driven by the tail dynamics. Furthermore, there is a non-zero probability of having no detection during the initial spiraling exploratory behavior, so not all searches are guaranteed to find the source. Here, we provided statistical measures of the search time based on more than 10^{5} simulations (Figure 1A) and 10^{2} robotic experiments (Figure 8A).
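To make the role of entropy concrete, the sketch below computes the Shannon entropy of a discretized probability distribution for the source location. It is a generic illustration only: the function `shannon_entropy` and the 100-cell toy distributions are hypothetical and do not reproduce the internal representation used by Mapless.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (in nats) of a discrete distribution.

    The input is normalized first, and zero-probability cells are skipped
    because p * log(p) tends to 0 as p tends to 0.
    """
    p = np.asarray(p, dtype=float).ravel()
    p = p / p.sum()
    nonzero = p[p > 0]
    return float(-(nonzero * np.log(nonzero)).sum())

# Toy example with 100 candidate source locations.
uniform = np.full(100, 1.0 / 100)                 # no information about the source
peaked = np.array([0.97] + [0.03 / 99] * 99)      # source almost localized

h_uniform = shannon_entropy(uniform)
h_peaked = shannon_entropy(peaked)
```

The uniform distribution has the maximal entropy log(100) ≈ 4.6 nats, while the peaked one has a much lower entropy; a search that gathers informative detections moves the estimated distribution from the first regime toward the second.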

The power law dependency of the search time on the number of searchers revealed significant gains even with a small number of robots (e.g., *n* = 3). The search time was found to decay as *1/n* for swarm Mapless with *n* robots as compared to $1/\sqrt{n}$ for independent Mapless robots. Search efficiency results from pooling sensory information between robots to improve individual decision-making (three detections on average per searcher were sufficient to reach the source). In our experiments, loss of efficiency due to collisions was not a problem, in part because of the small number of robots exploring a relatively large search space and also because the robots tend to repel each other when their distance is smaller than the correlation length of the source, a behavior also observed in infotactic searches (Masson et al., 2009). Yet, it is worth noting that the repellent effect between robots is much weaker than for swarm Infotaxis. This is a consequence of the simplified representation of the environment.

Search methods based on an information gradient, e.g., Atanasov et al. (2015), are related to our work. They are guaranteed to converge to a local maximum of the mutual information (and thereby to the source location, provided it corresponds to the local maximum). Yet, it is difficult to judge their efficiency from the literature for two reasons. First, no theoretical estimate or upper bound on the search time is given; for example, the searcher may spend a lot of time far from the source where the information gradient is very small. Second, no evaluation was conducted under real turbulent conditions; only simulations with a homogeneous signal field were performed in Atanasov et al. (2015). When considering robotic implementation, we also note that calculating an information metric analytically (as in swarm Mapless) is computationally more efficient than estimating it via particle filters.

Future work will concentrate on comparing the performance of swarm Mapless (in terms of search time and computational complexity) to related approaches on real robotic problems, including obstacle-cluttered environments. Another promising line of research is the generalization of swarm Mapless to cope with multiple sources, as done for example in Masson et al. (2009) for Infotaxis and in Masson (2013) for Mapless.

## Author Contributions

JM designed the algorithm; SZ, DM, and JM designed research; SZ, DM, and JM performed research; SZ and DM performed the experiments; DM and JM wrote the paper.

## Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Acknowledgments

This work was funded by the state program Investissements d’Avenir managed by ANR (grant ANR-10-BINF-05 Pherotaxis). SZ acknowledges support from the National Natural Science Foundation of China under grants 61472325, 51209174, and 51311130137.

## References

Atanasov, N., Le Ny, J., and Pappas, G. (2015). Distributed algorithms for stochastic source seeking with mobile robot networks. *J. Dyn. Syst. Meas. Control* 137, 031004. doi: 10.1115/1.4027892

Baker, T., Willis, M., Haynes, K., and Phelan, P. (1985). A pulsed cloud of sex pheromone elicits upwind flight in male moths. *Physiol. Entomol.* 10, 257–265. doi:10.1111/j.1365-3032.1985.tb00045.x

Barbieri, C., Cocco, S., and Monasson, R. (2011). On the trajectories and performance of infotaxis, an information based Greedy search algorithm. *Europhys. Lett.* 94, 20005. doi:10.1209/0295-5075/94/20005

Berdahl, A., Torney, C., Ioannou, C., Faria, J., and Couzin, I. (2013). Emergent sensing of complex environments by mobile animal groups. *Science* 339, 574–576. doi:10.1126/science.1225883

Calhoun, A., Chalasani, S., and Sharpee, T. (2014). Maximally informative foraging by *Caenorhabditis elegans*. *eLife* 3, e04220. doi:10.7554/eLife.04220

Csanady, G. T. (1973). *Turbulent Diffusion in the Environment*. Dordrecht: D. Reidel Publishing Company.

Celani, A., Villermaux, E., and Vergassola, M. (2014). Odor landscapes in turbulent environments. *Phys. Rev. X* 4:041015. doi:10.1103/PhysRevX.4.041015

Cochran, J., and Krstic, M. (2009). Nonholonomic source seeking with tuning of angular velocity. *IEEE Trans. Automat. Contr.* 54, 717–731. doi:10.1109/TAC.2009.2014927

Cortez, A., Tanner, H., and Lumia, R. (2009). Distributed robotic radiation mapping. *Exp. Robot.* 54, 147–156. doi:10.1007/978-3-642-00196-3_17

Dames, P., and Kumar, V. (2013). “Cooperative multi-target localization with noisy sensors,” in *IEEE International Conference on Robotics and Automation (ICRA)* (Karlsruhe).

Friston, K., Daunizeau, J., and Kilner, J. (2010). Action and behavior: a free-energy formulation. *Biol. Cybern.* 102, 227–260. doi:10.1007/s00422-010-0364-z

Friston, K., Mattout, J., and Kilner, J. (2011). Action understanding and active inference. *Biol. Cybern.* 104, 137–160. doi:10.1007/s00422-011-0424-z

Gelenbe, E., Schmajuk, N., Staddon, J., and Rief, J. (1997). Autonomous search by robots and animals: a survey. *Rob. Auton. Syst.* 22, 23–34. doi:10.1016/S0921-8890(97)00014-6

Humphrey, J.-A., and Haj-Hariri, H. (2012). “Stagnation point flow analysis of odorant detection by permeable moth antennae,” in *Frontiers in Sensing*. eds F. G. Barth, J. A. C. Humphrey, and M. V. Srinivasan (Wien: Springer-Verlag), 171–192. doi:10.1007/978-3-211-99749-9_12

Li, S., Guo, Y., and Bingham, B. (2014). “Multi-robot cooperative control for monitoring and tracking dynamic plumes,” in *IEEE International Conference on Robotics and Automation (ICRA)* (Hong Kong).

Mafra-Neto, A., and Carde, R. T. (1994). Fine-scale structure of pheromone plumes modulated upwind orientation of flying moths. *Nature* 369, 142–144. doi:10.1038/369142a0

Martinez, D., Arhidi, L., Demondion, E., Masson, J.-B., and Lucas, P. (2014). Using insect electroantennogram sensors on autonomous robots for olfactory searches. *J. Vis. Exp.* 90, e51704. doi:10.3791/51704

Martin-Moraud, E., and Martinez, D. (2010). Effectiveness and robustness of robot infotaxis for searching in dilute conditions. *Front. Neurorobot.* 4:1. doi:10.3389/fnbot.2010.00001

Masson, J.-B. (2013). Olfactory searches with limited space perception. *Proc. Natl. Acad. Sci. U.S.A.* 110, 11261–11266. doi:10.1073/pnas.1221091110

Masson, J.-B., Bailly-Bechet, M., and Vergassola, M. (2009). Chasing information to search in random environments. *J. Phys. A Math. Theor.* 42:434009. doi:10.1088/1751-8113/42/43/434009

Murlis, J., Elkinton, J. S., and Carde, R. T. (1992). Odor plumes and how insects use them. *Annu. Rev. Entomol.* 37, 479–503. doi:10.1146/annurev.en.37.010192.002445

Ogren, P., Fiorelli, E., and Leonard, N. (2004). Cooperative control of mobile sensor networks: adaptive gradient climbing in a distributed environment. *IEEE Trans. Automat. Contr.* 49, 1292–1302. doi:10.1109/TAC.2004.832203

Pascoal, J., Marques, L., and de Almeida, A. (2008). “Assessment of laser range finders in risky environments,” in *IEEE/RSJ International Conference on, Intelligent Robots and Systems, IROS 2008* (IEEE), 3533–3538.

Schmitz, H., Bleckmann, H., and Murtz, M. (1997). Infrared detection in a beetle. *Nature* 386, 773–774. doi:10.1038/386773a0

Schmitz, H., and Bousack, H. (2012). Modelling a historic oil-tank fire allows an estimation of the sensitivity of the infrared receptors in pyrophilous *Melanophila* beetles. *PLoS ONE* 7:e37627. doi:10.1371/journal.pone.0037627

Schutz, S., Weissbecker, B., Hummel, H. E., Apel, K.-H., Schmitz, H., and Bleckmann, H. (1999). Insect antenna as a smoke detector. *Nature* 398, 298–299. doi:10.1038/18585

Shraiman, B. I., and Siggia, E. D. (2000). Scalar turbulence. *Nature* 405, 639–646. doi:10.1038/35015000

Smoluchowski, M. (1917). Versuch einer mathematischen Theorie der Koagulationskinetik kolloider Lösungen. *Z. Phys. Chem.* 92, 129–168.

Starr, J., and Lattimer, B. (2014). Evaluation of navigation sensors in fire smoke environments. *Fire Technol.* 50, 1459–1481. doi:10.3390/s101210953

Thrun, S., Burgard, W., and Fox, D. (2005). *Probabilistic Robotics*. Cambridge, MA: The MIT Press.

Vergassola, M., Villermaux, E., and Shraiman, B. I. (2007). Infotaxis as a strategy for searching without gradients. *Nature* 445, 406–409. doi:10.1038/nature05464

Vickers, N. (2000). Mechanisms of animal navigation in odor plumes. *Biol. Bull.* 198, 203–212. doi:10.2307/1542524

Zhang, C., Arnold, D., Ghods, N., Siranosian, A., and Krstic, M. (2007). Source seeking with non-holonomic unicycle without position measurement and with tuning of forward velocity. *Syst. Contr. Lett.* 56, 245–252. doi:10.1016/j.sysconle.2006.10.014

## Appendix

### A.1. Detection Rate Function in a Simplified Turbulent Medium

We consider a source located at $\overrightarrow{{r}_{0}}=({x}_{0},{y}_{0})$ and emitting “particles” at a rate *J*. The particles propagate in the environment with diffusivity *D*, have a mean lifetime τ, and are advected by a mean current or wind *V* (the wind blows in the −*y* direction). The rate function $R\left(\overrightarrow{r},\overrightarrow{{r}_{0}}\right)$ models how particles are detected at location $\overrightarrow{r}=\left(x,y\right)$ given the source at $\overrightarrow{{r}_{0}}$. It is obtained by solving the advection–diffusion equation

$$0=D{\nabla}^{2}C(\overrightarrow{r})+V\frac{\partial C(\overrightarrow{r})}{\partial y}-\frac{C(\overrightarrow{r})}{\tau}+J\,\delta\left(\overrightarrow{r}-\overrightarrow{{r}_{0}}\right)\tag{A1}$$

where $C(\overrightarrow{r})$ is the local concentration of particles at $\overrightarrow{r}$ and δ is the Dirac delta function. In the three-dimensional case, the solution to equation A1 reads:

$$C(\overrightarrow{r})=\frac{J}{4\pi Dr}\exp\left(-\frac{V\left(y-{y}_{0}\right)}{2D}\right)\exp\left(-\frac{r}{\lambda}\right)\tag{A2}$$

where $r=|\overrightarrow{r}-\overrightarrow{{r}_{0}}|$ is the distance from the source and $\lambda=\sqrt{D\tau/\left(1+\frac{{V}^{2}\tau}{4D}\right)}$ is the correlation length, which can be interpreted as the mean distance traveled by the particles before they vanish. A similar expression is obtained in the 2D case (Vergassola et al., 2007). Considering that particles are detected with a spherical sensor of radius *a*, the detection rate follows Smoluchowski’s expression (Smoluchowski, 1917)

$$R\left(\overrightarrow{r},\overrightarrow{{r}_{0}}\right)=4\pi DaC(\overrightarrow{r})=\frac{aJ}{r}\exp\left(-\frac{V\left(y-{y}_{0}\right)}{2D}\right)\exp\left(-\frac{r}{\lambda}\right)\tag{A3}$$
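The detection rate can be evaluated numerically with the short sketch below. It assumes the standard three-dimensional form of Vergassola et al. (2007), $R=(aJ/r)\exp(-V(y-{y}_{0})/2D)\exp(-r/\lambda)$, evaluated for positions in the (*x*, *y*) plane; the function names and the parameter values in the example are hypothetical and are not those of the experiments.

```python
import math

def correlation_length(D, tau, V):
    """Correlation length lambda = sqrt(D*tau / (1 + V**2 * tau / (4*D)))."""
    return math.sqrt(D * tau / (1.0 + V ** 2 * tau / (4.0 * D)))

def detection_rate(r, r0, J, D, tau, V, a):
    """Mean detection rate at position r = (x, y) for a source at r0 = (x0, y0).

    Assumes the 3D Smoluchowski form, with the wind blowing in the -y direction.
    J: emission rate, D: diffusivity, tau: particle lifetime,
    V: mean wind speed, a: sensor radius.
    """
    dx, dy = r[0] - r0[0], r[1] - r0[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        raise ValueError("the rate diverges at the source location")
    lam = correlation_length(D, tau, V)
    return (a * J / dist) * math.exp(-V * dy / (2.0 * D)) * math.exp(-dist / lam)

# Illustrative (arbitrary) parameters: compare points downwind and upwind.
params = dict(J=1.0, D=1.0, tau=100.0, V=1.0, a=1.0)
source = (0.0, 0.0)
rate_downwind = detection_rate((0.0, -5.0), source, **params)
rate_upwind = detection_rate((0.0, 5.0), source, **params)
```

Since the wind carries particles toward −*y*, the rate is larger downwind of the source (*y* < *y*₀) than at the same distance upwind, and in both directions it decays exponentially with distance over a scale set by λ.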

Keywords: search and rescue, multi-robot systems, swarm robotics, fire searching, firefighter robot

Citation: Zhang S, Martinez D and Masson J-B (2015) Multi-robot searching with sparse binary cues and limited space perception. *Front. Robot. AI* 2:12. doi: 10.3389/frobt.2015.00012

Received: 12 January 2015; Accepted: 05 May 2015; Published: 26 May 2015

Edited by:

M. Ani Hsieh, Drexel University, USA

Reviewed by:

Konstantinos Karydis, University of Delaware, USA

Roberto Tron, University of Pennsylvania, USA

Copyright: © 2015 Zhang, Martinez and Masson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Dominique Martinez, UMR 7503, Laboratoire Lorrain de Recherche en Informatique et ses Applications, Centre National de la Recherche Scientifique, Vandoeuvre-lès-Nancy, France, dominique.martinez@loria.fr

^{†}Senior authors of the paper.