<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Comput. Neurosci.</journal-id>
<journal-title>Frontiers in Computational Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Comput. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-5188</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fncom.2021.629380</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Identification of Exploration and Exploitation Balance in the Silkmoth Olfactory Search Behavior by Information-Theoretic Modeling</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Hernandez-Reyes</surname> <given-names>Cesar A.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1080476/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Fukushima</surname> <given-names>Shumpei</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Shigaki</surname> <given-names>Shunsuke</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1019152/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Kurabayashi</surname> <given-names>Daisuke</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1193918/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Sakurai</surname> <given-names>Takeshi</given-names></name>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/91758/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Kanzaki</surname> <given-names>Ryohei</given-names></name>
<xref ref-type="aff" rid="aff5"><sup>5</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1193/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Sezutsu</surname> <given-names>Hideki</given-names></name>
<xref ref-type="aff" rid="aff6"><sup>6</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/855616/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Systems and Control Engineering, Tokyo Institute of Technology</institution>, <addr-line>Tokyo</addr-line>, <country>Japan</country></aff>
<aff id="aff2"><sup>2</sup><institution>MHPS Ltd.</institution>, <addr-line>Takasago</addr-line>, <country>Japan</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Systems Innovation, Osaka University</institution>, <addr-line>Osaka</addr-line>, <country>Japan</country></aff>
<aff id="aff4"><sup>4</sup><institution>Department of Agricultural Innovation for Sustainability, Tokyo University of Agriculture</institution>, <addr-line>Atsugi</addr-line>, <country>Japan</country></aff>
<aff id="aff5"><sup>5</sup><institution>Research Center for Advanced Science and Technology, The University of Tokyo</institution>, <addr-line>Tokyo</addr-line>, <country>Japan</country></aff>
<aff id="aff6"><sup>6</sup><institution>Transgenic Silkworm Research Unit, National Agriculture and Food Research Organization</institution>, <addr-line>Tsukuba</addr-line>, <country>Japan</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Dai Owaki, Tohoku University, Japan</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Hitoshi Aonuma, Hokkaido University, Japan; Dominique Martinez, UMR7503 Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), France</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Cesar A. Hernandez-Reyes <email>hernandez.cesar&#x00040;irs.sc.e.titech.ac.jp</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>01</day>
<month>02</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>15</volume>
<elocation-id>629380</elocation-id>
<history>
<date date-type="received">
<day>14</day>
<month>11</month>
<year>2020</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>01</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2021 Hernandez-Reyes, Fukushima, Shigaki, Kurabayashi, Sakurai, Kanzaki and Sezutsu.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Hernandez-Reyes, Fukushima, Shigaki, Kurabayashi, Sakurai, Kanzaki and Sezutsu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>Insects search for and find odor sources as part of their basic behaviors, such as looking for food or a mate. This has motivated research to describe how they achieve such behavior in turbulent odor plumes with a small number of neurons. Among different insects, the silk moth has been studied owing to its clear motor response to olfactory input. In past studies, the &#x0201C;programmed behavior&#x0201D; of the silk moth has been modeled as the average duration of a sequence of maneuvers based on the duration of periods without odor hits. However, this model does not fully represent the fine variations in its behavior. In this study, we used silk moth olfactory search trajectories from an experimental virtual reality device. We achieved an accurate input by using optogenetic silk moths that react to blue light. We then modeled such trajectories as a probabilistic learning agent with a belief of possible source locations. We found that maneuvers mismatching the programmed behavior are related to a larger entropy decrease, that is, they are more likely to increase the certainty of the belief. This implies that silk moths include some stochasticity in their search policy to balance the exploration and exploitation of olfactory information by matching or mismatching the programmed behavior model. We believe that this information-theoretic representation of insect behavior is important for the future implementation of olfactory searches in artificial agents such as robots.</p></abstract>
<kwd-group>
<kwd><italic>Bombyx mori</italic></kwd>
<kwd>infotaxis</kwd>
<kwd>olfaction</kwd>
<kwd>ethology</kwd>
<kwd>adaptive-behavior</kwd>
<kwd>exploration-exploitation</kwd>
</kwd-group>
<contract-num rid="cn002">JP19H02104</contract-num>
<contract-num rid="cn002">JP19H04930</contract-num>
<contract-num rid="cn002">JP19K14943</contract-num>
<contract-sponsor id="cn001">Consejo Nacional de Ciencia y Tecnolog&#x000ED;a<named-content content-type="fundref-id">10.13039/501100003141</named-content></contract-sponsor>
<contract-sponsor id="cn002">Japan Society for the Promotion of Science<named-content content-type="fundref-id">10.13039/501100001691</named-content></contract-sponsor>
<counts>
<fig-count count="8"/>
<table-count count="2"/>
<equation-count count="16"/>
<ref-count count="26"/>
<page-count count="12"/>
<word-count count="7319"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Odor source localization is a search problem that requires fast decision-making based on sporadic and stochastic detection of chemical particles. Despite the challenge of turbulent and dilute plumes that often have a complex spatio-temporal structure (Mafra-Neto and Card&#x000E9;, <xref ref-type="bibr" rid="B9">1994</xref>; Celani et al., <xref ref-type="bibr" rid="B4">2014</xref>), insects such as the fruit fly (van Breugel and Dickinson, <xref ref-type="bibr" rid="B23">2014</xref>) and various species of moths (Vickers, <xref ref-type="bibr" rid="B25">2005</xref>) rely on olfactory searches to conduct essential behaviors such as searching for food or potential mates. The high performance that insects show on such a complex search problem despite their simple brain motivates researchers to further analyze and understand the decision processes that these insects execute when conducting olfactory searches (Baker et al., <xref ref-type="bibr" rid="B2">2018</xref>).</p>
<p>With this motivation, our research group has analyzed the olfactory behavior of the male silk moth <italic>Bombyx mori</italic> (Lepidoptera: Bombycidae). Despite having wings, this insect is unable to fly. Its body is on average 30 mm long and 10 mm wide, and it has two antennae of approximately 6 mm in length on its head. This insect has been widely employed to analyze olfactory behavior because it exhibits a single, clear behavior: it walks only when it detects the pheromone bombykol released by its female counterpart (Obara, <xref ref-type="bibr" rid="B11">1979</xref>). Such behavior consists of a series of maneuvers called &#x0201C;surge,&#x0201D; &#x0201C;zigzag,&#x0201D; and &#x0201C;loop.&#x0201D; This sequence of maneuvers has been approximated by a mean-response model denoted as &#x0201C;programmed behavior&#x0201D; (Kanzaki et al., <xref ref-type="bibr" rid="B7">1992</xref>).</p>
<p>Based on the mean durations of the surge, zigzag, and loop maneuvers, the programmed behavior has been algorithmically defined as follows: first, immediately after a pheromone stimulus, the moth advances straight ahead in a <italic>surge</italic> motion. Then, in the absence of further pheromone detections, the moth moves in a <italic>zigzag</italic> pattern, trying to detect the pheromone again. Finally, if the pheromone remains undetected, the moth transitions into a <italic>loop</italic> motion until the next detection. A diagram of the programmed behavior is shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. Because the silk moth is motionless by default and only elicits its programmed behavior after the first pheromone hit, this search strategy has been labeled as &#x0201C;reactive&#x0201D; by Voges et al. (<xref ref-type="bibr" rid="B26">2014</xref>). Despite the simplicity of this sequential pattern, the male silk moth locates females with remarkable efficiency.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>(A)</bold> Specimen of a male silk moth pictured next to a ruler in mm. <bold>(B)</bold> Conceptual diagram of the &#x0201C;programmed behavior&#x0201D; model of the male silk moth behavior.</p></caption>
<graphic xlink:href="fncom-15-629380-g0001.tif"/>
</fig>
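<p>As an illustration only, the surge, zigzag, and loop sequence described above can be sketched as a function of the time elapsed since the last pheromone hit. The function name and the two duration parameters are hypothetical placeholders introduced for this example; they are not the mean durations reported for the silk moth.</p>
<preformat preformat-type="code"><![CDATA[
```python
def programmed_maneuver(time_since_hit, surge_duration=0.5, zigzag_duration=1.2):
    """Return the maneuver predicted by a mean-response model.

    time_since_hit: seconds since the last pheromone hit, or None if the
        moth has not been stimulated yet.
    surge_duration, zigzag_duration: hypothetical placeholder durations (s),
        not measured values.
    """
    if time_since_hit is None:
        return "stop"      # motionless until the first hit
    if time_since_hit < surge_duration:
        return "surge"     # straight advance right after a hit
    if time_since_hit < surge_duration + zigzag_duration:
        return "zigzag"    # no new hit: zigzag turns
    return "loop"          # still no hit: loop until the next detection
```
]]></preformat>
<p>Each new hit resets the elapsed time to zero, which restarts the sequence from the surge.</p>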
<p>However, this model does not reflect how the motions of the moth vary in response to fine spatio-temporal fluctuations of the odor plume and individual differences among specimens. In previous studies, such variability was investigated by identifying maneuver transitions with machine learning (Shigaki et al., <xref ref-type="bibr" rid="B17">2018b</xref>) and fuzzy logic (Shigaki et al., <xref ref-type="bibr" rid="B18">2019b</xref>). Although these studies succeeded in identifying deviations from the programmed behavior, they relied on electrophysiological signals obtained by implanting electrodes in the wing muscles or brain of the silk moth. Electrode implantation is technically challenging and risks degrading the tissues of the moth. Therefore, an analysis method that allows modeling adaptive olfactory behavior from non-intrusive experimental measurements is necessary.</p>
<p>To identify adaptive olfactory behavior, recent studies have used the information-theoretic framework of infotaxis, which was first proposed by Vergassola et al. (<xref ref-type="bibr" rid="B24">2007</xref>). A recent study by Pang et al. (<xref ref-type="bibr" rid="B13">2018</xref>) investigated the features of odor encounters that modulate the intensity of upwind turns in the fruit fly <italic>Drosophila melanogaster</italic> and the mosquito <italic>Aedes aegypti</italic>. The authors found through simulations that, compared to a <italic>centerline inferring</italic> odor source search algorithm, infotaxis produced trajectories that were more similar to those of the actual animals, in the sense that they exhibit weaker upwind turns later in a sequence of odor encounters. Similarly, Calhoun et al. (<xref ref-type="bibr" rid="B3">2014</xref>) demonstrated the possibility of using infotaxis to model the multi-stage foraging behavior of the nematode <italic>Caenorhabditis elegans</italic>. In their paper, the authors showed that infotaxis-like search strategies, which minimize the entropy of the probability distribution of odor source locations, reflect both the &#x0201C;local&#x0201D; and &#x0201C;global&#x0201D; stages of the <italic>C. elegans</italic> foraging behavior.</p>
<p>In this paper, we investigate the potential causes of variability in the behavioral maneuvers of the silk moth <italic>B. mori</italic> by using a non-invasive experimental method and an infotaxis-based model similar to those described in recent studies. We measured the silk moth trajectories and input stimuli data with a tether, a two-dimensional treadmill, and a virtual odor plume. To ensure accurate and reproducible stimuli, we used optogenetic silk moths that react to impulses of blue light in the same way as to pheromone particles. We modeled the trajectories and stimuli measurements as infotaxis agents and found that maneuvers that mismatch the <italic>programmed behavior</italic> model correspond to higher expected information rewards regarding the location of the source. In summary, we believe that this paper demonstrates the possibility of using non-invasive experimental measurements and infotaxis-based modeling to identify variability in the olfactory behaviors of male silk moths.</p>
<p>This paper is structured as follows: section 2 states the research questions of this paper. Section 3 describes the usage of optogenetic silk moths, the experimental virtual reality system to measure their behavior, and how to model it as infotaxis agents. Section 4 shows the results of the behavior measurement experiments and calculations of the information entropy of infotaxis-modeled silk moths. Section 5 discusses the contributions of this study and possible future areas of research.</p>
</sec>
<sec id="s2">
<title>2. Problem Statement</title>
<p>In this paper, we look for possible causes of adaptive mechanisms in the olfactory behavior of the silk moth, which are not represented in the programmed behavior model. Specifically, we investigate the following two hypotheses:</p>
<list list-type="bullet">
<list-item><p>Deviations from the programmed behavior are motivated by higher information gains.</p></list-item>
<list-item><p>A probabilistic framework such as infotaxis can explain how the male silk moth balances exploration and exploitation of olfactory information.</p></list-item>
</list>
<p>To test the first hypothesis, we need to measure the behavior of the silk moth in an olfactory environment that can be accurately reproduced in each experimental run. Therefore, in this paper we utilize a &#x0201C;virtual reality&#x0201D; behavioral measurement system in which we can subject moths to virtual odor plumes and measure their motor response to odor stimuli. However, such a system faces the challenge of an accurate stimulation of the moth antennae. In other words, stimulating the antennae with gaseous pheromone particles results in uncertain stimulation because such particles diffuse in the air; hence, they do not produce stimuli with the same intensity or duration each time. To overcome this, we employed genetically modified silkmoths that elicit their normal olfactory behavior response when subjected to a blue light stimulus at their antennae; thus, we can present reproducible olfactory inputs.</p>
<p>To test the second hypothesis, we modeled the trajectories of silk moths as those of an agent that minimizes the information entropy of its probabilistic belief of the location of an odor source. Such a maximally informative agent is based on the infotaxis algorithm (Vergassola et al., <xref ref-type="bibr" rid="B24">2007</xref>). We related the decrease in entropy of the infotaxis-modeled moth to the time steps in which the moth behavior matched or mismatched the programmed behavior model. Finally, we determined whether infotaxis can explain the exploration-exploitation strategy of the silk moth behavior by evaluating the distribution of entropy reductions by either matching or mismatching behaviors.</p>
</sec>
<sec sec-type="materials and methods" id="s3">
<title>3. Materials and Methods</title>
<p>Here, we describe our methodology for conducting olfactory search experiments with optogenetic male silk moths and a non-invasive behavior measurement system. We also describe the method we used to represent the silk moth trajectories as those of an infotaxis agent. The silk moth experiments in this study were examined and approved by the Tokyo Institute of Technology Gene Recombination Experiments Safety Management Committee.</p>
<sec>
<title>3.1. Virtual Reality System for Measurement of Moth Behavior</title>
<p>We conducted non-intrusive behavioral measurements on tethered male silk moths. Although similar systems to measure the olfactory behavior of insects have been used in the past (Shigaki et al., <xref ref-type="bibr" rid="B15">2018a</xref>, <xref ref-type="bibr" rid="B18">2019b</xref>), in this study we ensure that odor stimuli are accurately presented by using optogenetic silk moths. Using genetically modified specimens that react to blue light stimuli in the same way as normal specimens react to the pheromone bombykol allowed us to present stimuli accurately and reproducibly. By contrast, gaseous pheromones diffuse in the air; therefore, not all stimuli present the same amount of pheromone molecules to the antennae of the moth. Furthermore, with gaseous stimuli, the response of the antennae is measured using an electroantennogram (EAG), which is technically challenging, subject to electrical noise, and may damage the antennae. Our non-invasive behavior measurement system for the silk moth is shown in <xref ref-type="fig" rid="F2">Figure 2</xref>, and fulfills the following purposes:</p>
<list list-type="bullet">
<list-item><p>Measuring the pose (<italic>x</italic>, <italic>y</italic>, &#x003B8;) of the moths.</p></list-item>
<list-item><p>Accurately presenting light stimuli to the antennae of the moth.</p></list-item>
<list-item><p>Subjecting moths to a virtual odor plume to which we can alter the emission rate, wind speed, and other parameters.</p></list-item>
</list>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p><bold>(A)</bold> A diagram of the behavioral measurement system used in our experiments. <bold>(B)</bold> An actual ChR2 moth used in the measurement system with optic fibers pointed at its antennae to present blue light stimuli. <bold>(C)</bold> The dimensions of the virtual environment to which we subjected the moths and their initial position.</p></caption>
<graphic xlink:href="fncom-15-629380-g0002.tif"/>
</fig>
<p>To measure the pose of the silk moth, we fixed its back to a thin aluminum rod (&#x000D8; 2 mm; length 150 mm) with glue (G17 Bond, Konishi K.K., Osaka, Japan) and placed it on a polystyrene sphere (&#x000D8; 60 mm), which served as a two-dimensional treadmill. When the moth walked, the sphere moved in response because it was being levitated by the flow of wind from a small fan (FW1251-1051C2ALARX, ARX, Wanchai, Hong Kong). The movements of the sphere were detected using two optical sensors, such as those found in a computer mouse (ADNS-5030, Avago Technologies, California, USA), at a sampling rate of 20 Hz. They were then translated into translational and rotational movements of the moth, that is, the pose.</p>
<p>We developed a virtual representation of an odor plume by modeling the dispersion of white smoke in a wind tunnel. First, we recorded videos of the dispersion of smoke. We then calculated the statistics of the position and intensity of the pixels in the smoke video. Based on these statistics, we programmed a random process that generates virtual circular puffs that match the intensity of the real smoke puffs in the video and travel through the same positions. An example of a virtual plume is shown in <xref ref-type="fig" rid="F2">Figure 2A</xref>. In addition to the virtual representation of the odor plume, we also programmed a virtual representation of a silk moth. As in the real world, the virtual moth reacts to the virtual plume and travels toward its source. By using a virtual odor plume environment, we can tune parameters such as wind speed, emission rate, and particle lifetime. Tuning such parameters is particularly useful in infotaxis-based behavior modeling because it allows for faster testing of various plume structures and higher reproducibility compared with real plume experiments. In summary, the following process describes the operation of our experimental device:</p>
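<p>The conversion from treadmill readings to a pose can be illustrated with a simple dead-reckoning sketch. It assumes that the raw sensor counts have already been converted into a forward displacement and a change of heading for each sample; this conversion, the function name, and the units are assumptions made for the example and do not describe the calibration of the actual device.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

def integrate_pose(steps, x=0.0, y=0.0, theta=0.0):
    """Accumulate per-sample motion into a list of poses (x, y, theta).

    steps: iterable of (forward, dtheta) pairs, one per sample, where
        forward is the displacement along the current heading and dtheta
        is the heading change in radians (assumed pre-converted values).
    """
    poses = [(x, y, theta)]
    for forward, dtheta in steps:
        theta += dtheta                  # apply the rotation first
        x += forward * math.cos(theta)   # then advance along the new heading
        y += forward * math.sin(theta)
        poses.append((x, y, theta))
    return poses
```
]]></preformat>
<p>At a sampling rate of 20 Hz, each element of <italic>steps</italic> would correspond to 50 ms of walking.</p>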
<list list-type="order">
<list-item><p>The moth in the virtual world encounters a puff of pheromone.</p></list-item>
<list-item><p>Blue light is shown to the real moth depending on which antenna of the virtual moth reacted.</p></list-item>
<list-item><p>The real moth moves after receiving the stimulus.</p></list-item>
<list-item><p>The movement of the real moth is sent to the virtual world.</p></list-item>
<list-item><p>The virtual moth reflects the movement of the real moth.</p></list-item>
<list-item><p>The loop is repeated until either the moth reaches the virtual source or until a predetermined time limit is passed.</p></list-item>
</list>
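<p>The six steps above form a closed loop, which can be sketched as follows. The four callables are hypothetical stand-ins for the plume simulation, the LED driver, the treadmill readout, and the stop condition; their names and signatures are assumptions for this illustration and not the interface of the actual device.</p>
<preformat preformat-type="code"><![CDATA[
```python
def run_trial(detect_hit, stimulate, read_motion, reached_source, max_steps):
    """Run one closed-loop trial and return a log of each iteration.

    detect_hit(pose) -> (left, right): whether each virtual antenna met a puff.
    stimulate(left, right): present blue light to the matching real antenna.
    read_motion(pose) -> new pose after mirroring the real moth's movement.
    reached_source(pose) -> True when the virtual moth is at the source.
    max_steps: time limit expressed in loop iterations.
    """
    pose = (0.0, 0.0, 0.0)
    log = []
    for step in range(max_steps):
        left, right = detect_hit(pose)   # step 1: virtual moth meets a puff
        stimulate(left, right)           # step 2: light to the real antennae
        pose = read_motion(pose)         # steps 3 to 5: real motion is mirrored
        log.append((step, pose, left, right))
        if reached_source(pose):         # step 6: stop at the source
            break
    return log
```
]]></preformat>
<p>The resulting log pairs every pose with the stimulus that preceded it, which is the information an infotaxis-based analysis needs at each time step.</p>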
</sec>
<sec>
<title>3.2. Use of Optogenetic Moths for Accurate Antennae Stimulation</title>
<p>The presentation of accurate stimuli is important for the applied infotaxis-based analysis because both the update of the probability distribution of the source position and the calculation of the expected entropy decrease are directly affected by whether the agent experiences a hit or not at a given time step. In addition, reproducible odor stimuli are an overall useful property for an olfactory behavior measurement system because their duration and frequency can be finely tuned. Both properties have been reported to directly influence the olfactory behavior of moths (Celani et al., <xref ref-type="bibr" rid="B4">2014</xref>) and other animals (Ache et al., <xref ref-type="bibr" rid="B1">2016</xref>). To present olfactory stimuli to the moth, previous studies have presented pheromones from glass tubes placed directly in front of the antennae of the moth. However, the amount of pheromone particles that effectively reach the antenna varies owing to their gaseous nature.</p>
<p>To ensure that each stimulus has the same intensity and is accurately sensed by the antennae, we utilized genetically modified moths. These BmOR1-GAL4/UAS-ChR2 silk moths (ChR2 hereinafter) express channelrhodopsin-2 in their olfactory receptor neurons. As a result, they execute their olfactory search behavior when their antennae encounter blue light, rather than pheromone particles. This property has been used in previous studies to ensure that all stimuli are reproducible with the same intensity and duration (Shigaki et al., <xref ref-type="bibr" rid="B17">2018b</xref>, <xref ref-type="bibr" rid="B16">2019a</xref>). To activate channelrhodopsin-2, i.e., blue light sensitivity in these moths, we injected all-trans retinal (ATR) into their abdomen on the day before the experiments because insects do not intrinsically possess ATR. All behavior measurement experiments were conducted from 9:00 to 17:00 to reduce circadian effects (Tomioka et al., <xref ref-type="bibr" rid="B22">1993</xref>). It is reported that the brain serotonin level increases in the daytime and that serotonin enhances pheromone sensitivity in the silk moth (Gatellier et al., <xref ref-type="bibr" rid="B5">2004</xref>).</p>
<p>We generated stimuli for the ChR2 silk moths with LEDs (LBW5AP-JYKY-35-Z; Osram Opto Semiconductors), which produced blue light with a 470 nm wavelength and a light intensity of more than 1.6 mW/mm<sup>2</sup>. Such values of wavelength and light intensity have been reported to reliably produce olfactory search responses in ChR2 moths (Tabuchi et al., <xref ref-type="bibr" rid="B20">2013</xref>). On each LED, we attached optical fibers of 3 mm in diameter to ensure that blue light was directed only to each antenna, as seen in <xref ref-type="fig" rid="F2">Figure 2B</xref>. In addition, moths are unable to make yaw turns because their back is glued to an aluminum rod. The only rotation they are able to make is at their neck (see <xref ref-type="supplementary-material" rid="SM1">Supplementary Video</xref>). However, this neck rotation is very small and does not decrease the sensitivity of the antennae or the amount of stimulation they receive.</p>
</sec>
<sec>
<title>3.3. Modeling the Silk Moth as an Infotaxis Agent</title>
<p>Infotaxis was first proposed by Vergassola et al. (<xref ref-type="bibr" rid="B24">2007</xref>) as an odor source search algorithm for turbulent environments. In this algorithm, a point-mass agent is located at a position <bold>r</bold> and searches for an odor source by iteratively reducing its uncertainty about the distribution of possible source locations <bold>r</bold><sub><italic>src</italic></sub>. The agent has knowledge of its trajectory, <inline-formula><mml:math id="M20"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, which contains its sequence of positions as well as the odor &#x0201C;hits&#x0201D; it has experienced throughout the search. The agent also maintains a probability map <italic>P</italic>(<bold>r</bold><sub><italic>src</italic></sub>|<inline-formula><mml:math id="M21"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>) or &#x0201C;belief&#x0201D; (Thrun et al., <xref ref-type="bibr" rid="B21">2005</xref>) about the location of the source. This belief spans all possible locations of the source <bold>r</bold><sub><italic>src</italic></sub>, which, in both the original infotaxis study and the present paper, consist of a two-dimensional lattice of discrete locations. The uncertainty of the belief <italic>P</italic>(<bold>r</bold><sub><italic>src</italic></sub>|<inline-formula><mml:math id="M32"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>) is quantified by Shannon&#x00027;s entropy as in Equation (1):</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>S</mml:mi><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo class="qopname">ln</mml:mo><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
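<p>As a minimal numerical illustration of Equation (1), the entropy of a belief stored as a two-dimensional grid can be computed as below. The nested-list representation and the function name are assumptions made for this example; cells with zero probability are skipped, following the usual convention that they contribute nothing to the sum.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math

def belief_entropy(belief):
    """Shannon entropy (in nats) of a normalized 2-D belief grid."""
    return -sum(p * math.log(p) for row in belief for p in row if p > 0.0)

# A uniform belief over four cells is maximally uncertain,
# whereas a belief concentrated on one cell has zero entropy.
uniform = [[0.25, 0.25], [0.25, 0.25]]
peaked = [[1.0, 0.0], [0.0, 0.0]]
```
]]></preformat>
<p>For the uniform grid the entropy equals ln 4, the maximum for four cells, which corresponds to the situation at the start of a search.</p>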
<p>The goal of infotaxis is to minimize the entropy of the belief <italic>P</italic>(<bold>r</bold><sub><italic>src</italic></sub>|<inline-formula><mml:math id="M24"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>); therefore, at every time step, the agent calculates the expected change of entropy by moving from its current position <bold>r</bold><sub><italic>t</italic></sub> to a future position <bold>r</bold>&#x02032; as defined in Equation (2).</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M2"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:mo>&#x00394;</mml:mo><mml:mi>S</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x021A6;</mml:mo><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0002A;</mml:mo></mml:mrow></mml:msup><mml:mo>&#x00394;</mml:mo><mml:msup><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0002A;</mml:mo></mml:mrow></mml:msup><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msup><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0002A;</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x00394;</mml:mo><mml:mi>S</mml:mi></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>p</italic><sup>&#x0002A;</sup> is the probability of finding the source at <bold>r</bold>&#x02032;, and &#x00394;<italic>S</italic><sup>&#x0002A;</sup> and &#x00394;<italic>S</italic> are the changes in entropy if the source is found or not found at <bold>r</bold>&#x02032;, respectively. The agent then executes the move <inline-formula><mml:math id="M25"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x021A6;</mml:mo><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> with the most negative value of <italic>E</italic>[&#x00394;<italic>S</italic>], in other words, the move that causes the greatest reduction of uncertainty in the agent&#x00027;s probability map of the possible source locations. <xref ref-type="fig" rid="F3">Figures 3A,B</xref> show conceptual representations of the agent&#x00027;s belief as well as the effect of odor detections on that belief. Detailed derivations of the infotaxis formulae are presented in <xref ref-type="app" rid="A1">Appendix A</xref> of this paper.</p>
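<p>The move selection described by Equation (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the function names, the two candidate moves, and all numerical values are hypothetical, and the sketch assumes that finding the source removes all uncertainty, so that &#x00394;<italic>S</italic><sup>&#x0002A;</sup> equals the negative of the current entropy.</p>
<preformat>
```python
import math


def shannon_entropy(belief):
    """Shannon entropy (in nats) of a discrete belief over grid cells."""
    return -sum(p * math.log(p) for p in belief if p > 0.0)


def expected_entropy_change(p_found, delta_s_found, delta_s_not_found):
    """Equation (2): E[dS] = p* dS* + (1 - p*) dS."""
    return p_found * delta_s_found + (1.0 - p_found) * delta_s_not_found


def best_move(candidates):
    """Return the name of the move with the most negative E[dS].

    `candidates` maps a move name to (p*, dS*, dS); the names and numbers
    used below are hypothetical.
    """
    return min(candidates, key=lambda m: expected_entropy_change(*candidates[m]))


# Uniform belief over 4 cells: the entropy is maximal and equals log(4).
uniform = [0.25] * 4
s_now = shannon_entropy(uniform)

# Illustrative values only. Assumption: finding the source removes all
# uncertainty, so dS* = -S.
moves = {
    "forward": (0.10, -s_now, -0.05),
    "rotate": (0.02, -s_now, -0.01),
}
chosen = best_move(moves)
```
</preformat>
<p>With these illustrative numbers the forward move has the more negative expected entropy change and is therefore selected, which mirrors the situation drawn in Figure 3D.</p>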
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p><bold>(A)</bold> An agent (blue dot) at the start of an infotaxis search. Each cell of the map has the same probability of being the odor source; thus, the entropy is maximal. <bold>(B)</bold> An agent that has narrowed down the probability distribution of the source location to an area near the actual source (star symbol). In this case, the information entropy of the belief is low. <bold>(C)</bold> How a silkmoth is modeled as a point-mass agent for infotaxis calculations. In this illustrative example, only the green area will react to pheromone particles owing to the &#x0201C;wingflap effect&#x0201D; i.e., when cos(&#x003C0; &#x02212; &#x003B8; &#x0002B; &#x003B8;<sub><italic>src</italic></sub>) &#x0003E; 0. <bold>(D)</bold> The adaptation of the infotaxis navigation policy to a silkmoth. In this case, moving forward from position <bold>r</bold><sub><italic>t</italic></sub> to <bold>r</bold>&#x02032; yields more expected entropy decrease than rotating. Please note that a more negative value is more desirable because it would narrow down the possible locations where the odor source is located.</p></caption>
<graphic xlink:href="fncom-15-629380-g0003.tif"/>
</fig>
<p>We modeled the body of the silk moth as a point agent with a radius of 10 mm (half of its average body length). We reduced the three degrees of freedom of the moth to (<italic>x</italic>, <italic>y</italic>) coordinates because an infotaxis agent moves in a two-dimensional grid and ignores orientation. Furthermore, we counted as odor hits only those that occurred when the moth was facing upwind, that is, when cos(&#x003C0; &#x02212; &#x003B8; &#x0002B; &#x003B8;<sub><italic>src</italic></sub>) &#x0003E; 0 (see <xref ref-type="fig" rid="F3">Figure 3C</xref>), where &#x003B8; and &#x003B8;<sub><italic>src</italic></sub> are the heading angle of the moth and the angle of the plume&#x00027;s centerline, respectively. We used this capture region because real moths limit the odor hits to those coming from the front by flapping their wings (Loudon and Koehl, <xref ref-type="bibr" rid="B8">2000</xref>).</p>
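<p>The capture condition above reduces to a single test on the heading. The following sketch is only an illustration: the function name is hypothetical, and the examples assume a plume centerline angle of &#x003B8;<sub><italic>src</italic></sub> = 0, which is not stated in the text.</p>
<preformat>
```python
import math


def counts_as_odor_hit(theta, theta_src):
    """True when cos(pi - theta + theta_src) > 0, the capture condition in the text.

    theta: heading of the moth in radians.
    theta_src: angle of the plume centerline in radians.
    """
    return math.cos(math.pi - theta + theta_src) > 0.0


# Assumption for illustration: the plume centerline lies along theta_src = 0.
facing_upwind = counts_as_odor_hit(math.pi, 0.0)      # heading toward -x
facing_downwind = counts_as_odor_hit(0.0, 0.0)        # heading toward +x
```
</preformat>
<p>Under this assumption a heading of &#x003C0; counts detections, whereas a heading of 0 does not.</p>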
</sec>
<sec>
<title>3.4. Classification of Variability in the Moth Behavior</title>
<p>We determined whether the behavior of the silk moth matches the definition of the programmed behavior (Kanzaki et al., <xref ref-type="bibr" rid="B7">1992</xref>) by comparing it to the definition of Minegishi et al. (<xref ref-type="bibr" rid="B10">2012</xref>). Accordingly, we classified the maneuvers of the silk moth by simply considering the time elapsed since the last odor hit, which we call &#x0201C;blank duration&#x0201D; &#x003C4;<sub><italic>b</italic></sub> as in Celani et al. (<xref ref-type="bibr" rid="B4">2014</xref>). We also classified maneuvers according to both &#x003C4;<sub><italic>b</italic></sub> and the moth&#x00027;s linear and angular velocities (<italic>v</italic> and &#x003C9;, respectively) based on Minegishi et al. (<xref ref-type="bibr" rid="B10">2012</xref>). We denote the first and second classification as &#x0201C;temporal&#x0201D; and &#x0201C;kinematic,&#x0201D; respectively. <xref ref-type="table" rid="T1">Table 1</xref> shows a comparison of both schemes used to classify maneuvers and <xref ref-type="fig" rid="F4">Figure 4</xref> shows the result of using each scheme. The blank duration threshold of 500 ms in the &#x0201C;temporal&#x0201D; classification of <xref ref-type="table" rid="T1">Table 1</xref> was selected because this is the average duration of surge motions after an odor hit as reported in Kanzaki et al. (<xref ref-type="bibr" rid="B7">1992</xref>). Throughout all olfactory search experiments, we classified the moth maneuvers by both schemes and labeled the state of the moth at each time step as &#x0201C;matching&#x0201D; if it matches the criteria of both schemes and &#x0201C;mismatching&#x0201D; if it only matches the &#x0201C;kinematic&#x0201D; criteria.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Definitions for the maneuvers of silk moths when classified by either a temporal or a kinematic state.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="left"><bold>Temporal</bold></th>
<th valign="top" align="left"><bold>Kinematic</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Surge</td>
<td valign="top" align="left">&#x003C4;<sub><italic>b</italic></sub> &#x02264; 500 ms</td>
<td valign="top" align="left">&#x003C4;<sub><italic>b</italic></sub> &#x02264; 500 ms and <italic>v</italic> &#x0003E; 0, or</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="left">&#x003C4;<sub><italic>b</italic></sub> &#x0003E; 200 ms and |&#x003C9;| &#x0003C; 5 deg/s</td>
</tr>
<tr>
<td valign="top" align="left">Rotate</td>
<td valign="top" align="left">&#x003C4;<sub><italic>b</italic></sub> &#x0003E; 500 ms</td>
<td valign="top" align="left">&#x003C4;<sub><italic>b</italic></sub> &#x0003E; 500 ms and |&#x003C9;| &#x0003E; 0</td>
</tr>
<tr>
<td valign="top" align="left">Stop</td>
<td valign="top" align="left">Otherwise</td>
<td valign="top" align="left">Otherwise</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Classification of moth actions by <bold>(A)</bold> the kinematic criteria and <bold>(B)</bold> the temporal criteria.</p></caption>
<graphic xlink:href="fncom-15-629380-g0004.tif"/>
</fig>
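<p>The temporal scheme and the matching/mismatching labeling can be sketched as follows. This is a hedged illustration and not the code used in this study: the kinematic labels are taken as given inputs from a separate velocity-based classifier, the &#x0201C;stop&#x0201D; state is omitted, a time step is assumed to be &#x0201C;matching&#x0201D; when both schemes assign the same maneuver, and all sample values are hypothetical.</p>
<preformat>
```python
from collections import Counter

BLANK_THRESHOLD_MS = 500.0  # threshold of the temporal scheme in Table 1


def temporal_label(tau_b_ms):
    """Temporal scheme: surge up to 500 ms after the last hit, rotate afterwards."""
    return "rotate" if tau_b_ms > BLANK_THRESHOLD_MS else "surge"


def match_label(temporal, kinematic):
    """Assumption: a time step is matching when both schemes agree."""
    return "matching" if temporal == kinematic else "mismatching"


def normalized_counts(tau_b_series_ms, kinematic_labels):
    """Fraction of time steps in each (kinematic, temporal) combination."""
    pairs = [(k, temporal_label(t)) for t, k in zip(tau_b_series_ms, kinematic_labels)]
    total = len(pairs)
    return {pair: n / total for pair, n in Counter(pairs).items()}


# Hypothetical samples: blank durations (ms) and kinematic labels assumed
# to come from a separate velocity-based classifier.
tau_b = [50.0, 300.0, 450.0, 800.0]
kinematic = ["surge", "rotate", "surge", "rotate"]
table = normalized_counts(tau_b, kinematic)
states = [match_label(temporal_label(t), k) for t, k in zip(tau_b, kinematic)]
```
</preformat>
<p>The dictionary returned by <monospace>normalized_counts</monospace> is keyed by (kinematic, temporal) pairs, so a time step labeled Rotate by the kinematic scheme and Surge by the temporal scheme appears as a mismatching entry.</p>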
<p>To determine whether &#x0201C;mismatching&#x0201D; behaviors are motivated by higher information gains, we analyzed the entropy change &#x00394;<italic>S</italic> and the expected entropy change <italic>E</italic>[&#x00394;<italic>S</italic>] as functions of the rate of odor hits and the cumulative odor hits experienced by moths over a search. We are particularly interested in these variables because recent studies identified that they influence the decision process of olfactory behaviors (Celani et al., <xref ref-type="bibr" rid="B4">2014</xref>; Pang et al., <xref ref-type="bibr" rid="B13">2018</xref>). We also evaluated whether the distribution of &#x00394;<italic>S</italic> differs between &#x0201C;matching&#x0201D; and &#x0201C;mismatching&#x0201D; behaviors with a two-sample Kolmogorov-Smirnov test and by comparing their histograms. In addition, we calculated the cumulative distribution function (CDF) of &#x00394;<italic>S</italic> and <italic>E</italic>[&#x00394;<italic>S</italic>] to specifically determine whether &#x0201C;mismatching&#x0201D; behaviors have a higher probability of obtaining larger negative values of those variables, that is, greater information gains. Finally, we calculated the root mean squared error (RMSE) between the values of &#x00394;<italic>S</italic> and <italic>E</italic>[&#x00394;<italic>S</italic>] to determine which type of behavior is more similar to infotaxis with respect to the rate of odor hits and the cumulative sum of hits, which are our variables of interest. The following section presents the results of the calculations of &#x00394;<italic>S</italic> and <italic>E</italic>[&#x00394;<italic>S</italic>] regarding hit rate and cumulative hits, the histograms and CDFs, and the RMSE of &#x0201C;matching&#x0201D; and &#x0201C;mismatching&#x0201D; behaviors.</p>
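<p>The comparison of the two distributions can be illustrated with an empirical cumulative distribution function and the two-sample Kolmogorov-Smirnov statistic. The sketch below uses only the Python standard library; the function names and the sample values are hypothetical and are not data from this study.</p>
<preformat>
```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function x -> fraction of `sample` that is at most x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|."""
    f_a, f_b = ecdf(sample_a), ecdf(sample_b)
    # Both ECDFs are step functions, so the maximum gap occurs at a sample point.
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(f_a(x) - f_b(x)) for x in points)


# Hypothetical entropy changes for the two groups (not data from the study).
matching = [-0.01, -0.02, 0.00, 0.01]
mismatching = [-0.10, -0.08, -0.02, -0.01]
d = ks_statistic(matching, mismatching)
```
</preformat>
<p>A larger value of the empirical CDF at a given negative &#x00394;<italic>S</italic> means that a larger share of the samples produced at least that much entropy reduction. In practice, a library routine such as <monospace>scipy.stats.ks_2samp</monospace> can be used to obtain the p-value as well.</p>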
</sec>
</sec>
<sec sec-type="results" id="s4">
<title>4. Results</title>
<p>Here, we present the results of the VR odor source search experiments using optogenetic silkmoths. First, we present the trajectories of the moths as well as their information entropy. We then show the statistics of the matching versus mismatching states, followed by the relationship between those two states and the expected decrease in information entropy for each.</p>
<sec>
<title>4.1. VR Olfactory Search Experiments</title>
<p>We subjected ChR2 silkmoths to olfactory search experiments. We conducted 20 trials in which the moth searched for a pheromone source in a 350 mm long by 200 mm wide virtual environment where the wind was blowing in the positive <italic>x</italic>-direction at a mean speed of 0.1 m/s. The initial position of the moth in the virtual environment was (<italic>x</italic>, <italic>y</italic>, &#x003B8;) = (180, 0, &#x02212;&#x003C0;/6), where &#x003B8; is in radians. Moths searched for a source located at (<italic>x</italic>, <italic>y</italic>) = (0, 0), and a trial was successful when the moth entered a radius of 35 mm around the source within a time limit of 180 s. The mean &#x000B1; std. dev. of the time required to reach the source was 73.92 &#x000B1; 46.5 s. <xref ref-type="fig" rid="F5">Figure 5A</xref> shows the information entropy for the experiments where moths found the pheromone source. The solid line represents the average value, the shaded range represents the standard deviation, and the gray lines show the value for each trial. <xref ref-type="fig" rid="F5">Figure 5B</xref> shows the moth trajectories of these successful trials. The color gradient represents the value of the information entropy. <xref ref-type="table" rid="T2">Table 2</xref> shows the statistics of the matching and mismatching moth states. Surge (temporal) and Rotate (kinematic) represent the proportion of time in which the silk moths exhibited a mismatching state over the entire duration of the search experiments. In total, we conducted 20 experiments with 10 specimens. Of these, 12 trials from six specimens found the odor source within the time limit, a success rate of 60.0%. We considered only the data from the successful trials for the classification of matching and mismatching behaviors.</p>
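<p>The success criterion and the success rate reported above can be expressed as a short check. This sketch is illustrative only: the function names and the straight-line trajectory are hypothetical, positions are assumed to be in millimeters and sampled at 20 Hz, and the boundary of the 35 mm radius is assumed to count as arrival.</p>
<preformat>
```python
import math

SOURCE_XY = (0.0, 0.0)    # source position in mm, from the text
GOAL_RADIUS_MM = 35.0     # radius around the source that counts as arrival
TIME_LIMIT_S = 180.0      # time limit per trial
SAMPLE_RATE_HZ = 20.0     # assumed sampling frequency of the trajectory


def time_to_source(trajectory):
    """Return the arrival time in seconds, or None if the trial failed.

    `trajectory` is a list of (x, y) positions in mm sampled at 20 Hz.
    """
    for step, (x, y) in enumerate(trajectory):
        t = step / SAMPLE_RATE_HZ
        if t > TIME_LIMIT_S:
            break
        if GOAL_RADIUS_MM >= math.hypot(x - SOURCE_XY[0], y - SOURCE_XY[1]):
            return t
    return None


def success_rate(arrival_times):
    """Fraction of trials whose arrival time is not None."""
    return sum(t is not None for t in arrival_times) / len(arrival_times)


# Hypothetical straight-line trajectory from the start position (180, 0),
# for illustration only.
path = [(180.0 - 10.0 * i, 0.0) for i in range(20)]
arrival = time_to_source(path)
rate = success_rate([1.0] * 12 + [None] * 8)  # 12 successes out of 20 trials
```
</preformat>
<p>Twelve successful trials out of 20 give the 60.0% success rate stated in the text.</p>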
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p><bold>(A)</bold> Information entropy of infotaxis-modeled silkmoths. Gray lines represent each of the 12 runs that found the odor source. The blue line represents the average entropy. <bold>(B)</bold> Trajectories of the successful experimental runs. The star symbol represents the pheromone source.</p></caption>
<graphic xlink:href="fncom-15-629380-g0005.tif"/>
</fig>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Normalized counts of each maneuver taken by the moths.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center" style="border-bottom: thin solid #000000;" colspan="2"><bold>Temporal</bold></th>
</tr>
<tr>
<th valign="top" align="left"><bold>Kinematic</bold></th>
<th valign="top" align="center"><bold>Surge</bold></th>
<th valign="top" align="center"><bold>Rotate</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Surge</td>
<td valign="top" align="center">0.1597 &#x000B1; 0.07</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Rotate</td>
<td valign="top" align="center"><sup>&#x0002A;&#x0002A;</sup>0.1939 &#x000B1; 0.11</td>
<td valign="top" align="center">0.6464 &#x000B1; 0.19</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The &#x0201C;kinematic&#x0201D; classification scheme is based on the linear and angular velocities of the moths. The &#x0201C;temporal&#x0201D; scheme is based on the time since the last odor hit. The values with the asterisks indicate &#x0201C;mismatching&#x0201D; behaviors</italic>.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>4.2. Relationship Between Behavior Variability and Information Gains</title>
<p>We investigated whether there is a relationship between mismatching maneuvers and a higher expected decrease in entropy <inline-formula><mml:math id="M26"><mml:mi>E</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mo>&#x00394;</mml:mo><mml:mi>S</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x021A6;</mml:mo><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. <xref ref-type="fig" rid="F6">Figure 6</xref> shows the actual rewards &#x00394;<italic>S</italic> and expected rewards <italic>E</italic>[&#x00394;<italic>S</italic>] of the matching and mismatching behaviors. As can be seen in <xref ref-type="fig" rid="F6">Figures 6A,C</xref>, matching and mismatching behaviors generate large decreases in entropy at low and high hit rates, respectively. In addition, the matching behaviors generated penalties (entropy increases) at high numbers of accumulated hits. Please note that entropy is non-monotonic (Hajieghrary et al., <xref ref-type="bibr" rid="B6">2016</xref>; Rodr&#x000ED;guez et al., <xref ref-type="bibr" rid="B14">2017</xref>) and can increase in detection-to-non-detection sequences because the agent&#x00027;s belief is narrowed by the detection but broadens again at the non-detection. <xref ref-type="fig" rid="F6">Figures 6B,D</xref> show that the expected rewards are greater at low and high hit rates for mismatching and matching behaviors, respectively. <xref ref-type="fig" rid="F7">Figures 7A,C</xref> show histograms of the actual and expected rewards, respectively.
A Kolmogorov-Smirnov test confirmed that the distributions of the matching and mismatching states differ significantly (<italic>p</italic> &#x0003C; 0.01).</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p><bold>(A,C)</bold> The actual rewards obtained by either matching or mismatching behavior. <bold>(B,D)</bold> The expected rewards. Blue hue indicates more entropy decrease, that is, greater information rewards. Red hue indicates the opposite. In this figure, &#x00394;<italic>S</italic><sub><italic>t</italic></sub> indicates the actual entropy change, in other words, <italic>S</italic>(<bold>r</bold><sub><italic>t</italic>&#x0002B;1</sub>) &#x02212; <italic>S</italic>(<bold>r</bold><sub><italic>t</italic></sub>). <italic>E</italic>[&#x00394;<italic>S</italic>] indicates the expected entropy change for all possible actions (i.e., moving from <bold>r</bold><sub><bold>t</bold></sub> to <bold>r</bold>&#x02032;).</p></caption>
<graphic xlink:href="fncom-15-629380-g0006.tif"/>
</fig>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p><bold>(A,C)</bold> Histograms of actual and expected rewards, respectively. <bold>(B,D)</bold> Cumulative distribution functions of the actual and expected rewards, respectively.</p></caption>
<graphic xlink:href="fncom-15-629380-g0007.tif"/>
</fig>
<p><xref ref-type="fig" rid="F7">Figure 7B</xref> shows the cumulative distribution function of the actual rewards &#x00394;<italic>S</italic> for matching and mismatching behaviors. As shown in the figure, mismatching behaviors have a higher probability of greater entropy reductions (particularly values of approximately 10<sup>&#x02212;4</sup> and 10<sup>&#x02212;1</sup>). Mismatching behaviors also have a higher probability of a larger expected decrease in entropy (values of approximately &#x02212;4 &#x000D7; 10<sup>&#x02212;3</sup>) as shown in <xref ref-type="fig" rid="F7">Figure 7D</xref>. <xref ref-type="fig" rid="F8">Figures 8A,B</xref> show the root mean squared error between the actual reward &#x00394;<italic>S</italic> and the expected reward <italic>E</italic>[&#x00394;<italic>S</italic>] against the cumulative odor hits and the hit rate, respectively. The error was calculated as shown in Equation (3), where <italic>N</italic> is 20 because the sampling frequency of the behavioral measurement system is 20 Hz.</p>
<disp-formula id="E3"><label>(3)</label><mml:math id="M5"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>R</mml:mi><mml:mi>M</mml:mi><mml:mi>S</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:mfrac><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:msqrt></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Root mean squared error (RMSE) between actual and expected rewards. Lower values indicate that the expected reward calculated by Equation (2) matches the actual rewards &#x00394;<italic>S</italic>. <bold>(A)</bold> RMSE against the accumulated odor hits of the agent over time. <bold>(B)</bold> RMSE against hit rates, which are the average number of odor hits per second.</p></caption>
<graphic xlink:href="fncom-15-629380-g0008.tif"/>
</fig>
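<p>Equation (3) can be written as a short function. The sketch below is a minimal illustration with hypothetical values: it pairs each expected reward <italic>E</italic>[&#x00394;<italic>S</italic><sub><italic>i</italic></sub>] with the actual reward one step later, &#x00394;<italic>S</italic><sub><italic>i</italic>&#x0002B;1</sub>, and uses <italic>N</italic> = 2 instead of 20 for brevity.</p>
<preformat>
```python
import math


def rmse(actual, expected):
    """Equation (3): RMSE between dS_{i+1} and E[dS_i] for i = 1..N.

    `actual` holds dS values and `expected` holds E[dS] values on the same
    time grid; the actual series is shifted by one step, so `actual` needs
    one more sample than the N expected values used.
    """
    n = len(expected)
    if len(actual) != n + 1:
        raise ValueError("actual must have exactly one more sample than expected")
    squared = [(actual[i + 1] - expected[i]) ** 2 for i in range(n)]
    return math.sqrt(sum(squared) / n)


# Hypothetical values with N = 2 for brevity (the study uses N = 20).
actual = [0.0, -0.03, -0.01]
expected = [-0.02, -0.02]
error = rmse(actual, expected)
```
</preformat>
<p>A value close to zero indicates that the expected reward computed by Equation (2) closely tracks the entropy change that was actually obtained.</p>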
</sec>
</sec>
<sec sec-type="discussion" id="s5">
<title>5. Discussion</title>
<p>In this study, we investigated the possible causes of variability in the programmed behavior model of the male silk moth. Specifically, we asked whether such variability leads to higher information gains; in other words, if it minimizes the information entropy of the probability distribution of the moth regarding the location of an odor source. We also investigated whether the probabilistic framework of infotaxis can explain how the male silk moth selects maneuvers to balance the exploration and exploitation of the expected rewards.</p>
<sec>
<title>5.1. Relationship Between Behavioral Variability and Information Rewards</title>
<p>In a recent study, Shigaki et al. (<xref ref-type="bibr" rid="B18">2019b</xref>) simultaneously measured the odor search behavior of male silkmoths and the neural activity from their lateral accessory lobe (LAL). The LAL generates motor commands in response to odor stimuli. That study found that silkmoths are less likely to &#x0201C;surge&#x0201D; (move forward) as the frequency of odor hits increases. In terms of infotaxis, this can be interpreted as moths preferring rotations (exploration) because, at high odor encounter rates, the expected decrease in entropy is smaller than at low rates. Our results show that matching and mismatching behaviors generate rewards at high and low hit rates, respectively (<xref ref-type="fig" rid="F6">Figure 6</xref>). This leads us to believe that at high hit rates, silk moths prefer reactive or more exploitative behaviors, and at low rates, they prefer more stochastic or explorative behaviors such as rotations instead of straight forward moves. Furthermore, this tendency was observed in all specimens that reached the odor source.</p>
<p>An interesting interpretation of these results can also be made from the viewpoint of reinforcement learning (RL). In this field, an agent learns to behave according to an optimal policy with the highest expected accumulated reward over a time horizon. Nonetheless, many RL algorithms face the exploration and exploitation dilemma, in which greedily selecting the actions with the highest reward can lead to suboptimal policies that are stuck in local maxima. A common way to avoid this is to add stochasticity to the selection of actions, thus balancing exploration and exploitation, using methods such as &#x003F5;-greedy algorithms (Sutton and Barto, <xref ref-type="bibr" rid="B19">2018</xref>). An analogy can be made to the behavior of the silkmoth in the sense that some randomness in the selection of the &#x0201C;surge&#x0201D; maneuver leads to higher information gains and possibly a better odor source search performance. This can be seen in <xref ref-type="fig" rid="F7">Figures 7B,D</xref>, where the probability of obtaining better rewards is higher for the mismatching behaviors.</p>
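<p>The &#x003F5;-greedy rule mentioned above can be sketched as follows. This is a generic textbook-style illustration and not a model fitted to the moth data: the action names and reward values are hypothetical, and the reward of an action is assumed to be the negative of its expected entropy change.</p>
<preformat>
```python
import random


def epsilon_greedy(rewards, epsilon, rng):
    """Pick a random action with probability epsilon, else the best one.

    `rewards` maps an action name to its estimated reward; here the reward
    is taken to be -E[dS], so a larger value means a larger expected
    entropy decrease (an assumption made for this sketch).
    """
    actions = sorted(rewards)
    if epsilon > rng.random():
        return rng.choice(actions)             # explore
    return max(actions, key=rewards.get)       # exploit


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rewards = {"surge": 0.18, "rotate": 0.04}  # hypothetical values of -E[dS]
greedy = epsilon_greedy(rewards, 0.0, rng)
picks = [epsilon_greedy(rewards, 0.2, rng) for _ in range(1000)]
surge_fraction = picks.count("surge") / len(picks)
```
</preformat>
<p>With &#x003F5; = 0 the rule is purely greedy, whereas with &#x003F5; = 0.2 the lower-valued action is still selected in roughly one of every ten steps, which is the kind of occasional deviation that the analogy refers to.</p>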
</sec>
<sec>
<title>5.2. Exploration and Exploitation in Silk Moth Behavior</title>
<p>We found that maneuvers that deviate from the programmed behavior model correspond to a larger expected decrease in entropy, that is, a higher expected reward in the terminology of reinforcement learning. Therefore, we demonstrated the capability of the infotaxis strategy to quantitatively express maneuvers that deviate from the programmed behavior as explorative and those that match it as exploitative.</p>
<p>Another interesting point to note is the relationship between matching and mismatching behaviors and the root mean squared error (RMSE) of the real vs. expected rewards. As shown in <xref ref-type="fig" rid="F8">Figure 8A</xref>, the error decreases as odor hits accumulate. This is relatively intuitive because more detections narrow down the belief of the source location. However, the RMSE between real and expected rewards is larger at high hit rates. Furthermore, the matching behaviors have a lower error than the mismatching behaviors. One possible interpretation is that matching behaviors are more exploitative and thus more similar to the greedy infotaxis policy, whereas the mismatching behaviors are more explorative and hence differ from the expected reward of the infotaxis strategy.</p>
<p>We believe that being able to represent animal olfactory behavior through a method such as infotaxis is an important contribution to the fields of ethology and robotics because having a representation of the decision process of animals in terms of probabilistic beliefs and expected rewards facilitates the algorithmic implementation of these processes in robots. Furthermore, it allows for the refinement of these decision processes using tools such as machine and reinforcement learning.</p>
</sec>
</sec>
<sec sec-type="conclusions" id="s6">
<title>6. Conclusion</title>
<p>In this study, we measured the behavior of moths using a virtual reality system that presents accurate and reproducible odor stimuli by using blue light and optogenetic moths. We then took trajectories from these measurements and modeled them as an infotaxis (Vergassola et al., <xref ref-type="bibr" rid="B24">2007</xref>) strategy. We used infotaxis-based modeling to determine whether variability in the silkmoth behavior is related to higher gains in information regarding the probability distribution of the source location. We found that variations have a higher probability of obtaining larger information gains than &#x0201C;programmed behaviors&#x0201D; (i.e., reactive, exploitative behaviors). This suggests that silkmoths incorporate some stochasticity into their behavior to balance the exploration and exploitation of information gains. Future studies should develop ways to extract decision-making mechanisms from free-running silkmoths. In this study, we used tethered moths walking on a treadmill, and, although such a device imposes minimal disturbances on the moth behavior, we believe it is necessary to study whether models from free-running experiments differ from those in this study. It would also be useful to develop an olfactory search algorithm based on the silkmoth exploration/exploitation mechanisms elucidated in this paper and then implement such an algorithm on a robot to test whether the search performance is improved compared with either the programmed behavior or the infotaxis strategy.</p>
</sec>
<sec sec-type="data-availability-statement" id="s7">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="sec" rid="s9">Supplementary Files</xref>; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s8">
<title>Author Contributions</title>
<p>CH-R, SF, SS, and DK contributed to the conception and design of the study. SF conducted the virtual reality silk moth experiments. CH-R performed the numerical analyses and wrote the manuscript. TS, RK, and HS provided genetically modified silk moths. All authors read and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>SF was employed by MHPS Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<sec sec-type="supplementary-material" id="s9">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fncom.2021.629380/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fncom.2021.629380/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Video_1.MP4" id="SM1" mimetype="video/mp4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ache</surname> <given-names>B. W.</given-names></name> <name><surname>Hein</surname> <given-names>A. M.</given-names></name> <name><surname>Bobkov</surname> <given-names>Y. V.</given-names></name> <name><surname>Principe</surname> <given-names>J. C.</given-names></name></person-group> (<year>2016</year>). <article-title>Smelling time: a neural basis for olfactory scene analysis</article-title>. <source>Trends Neurosci</source>. <volume>39</volume>, <fpage>649</fpage>&#x02013;<lpage>655</lpage>. <pub-id pub-id-type="doi">10.1016/j.tins.2016.08.002</pub-id><pub-id pub-id-type="pmid">27594700</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baker</surname> <given-names>K. L.</given-names></name> <name><surname>Dickinson</surname> <given-names>M.</given-names></name> <name><surname>Findley</surname> <given-names>T. M.</given-names></name> <name><surname>Gire</surname> <given-names>D. H.</given-names></name> <name><surname>Louis</surname> <given-names>M.</given-names></name> <name><surname>Suver</surname> <given-names>M. P.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Algorithms for olfactory search across species</article-title>. <source>J. Neurosci</source>. <volume>38</volume>, <fpage>9383</fpage>&#x02013;<lpage>9389</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.1668-18.2018</pub-id><pub-id pub-id-type="pmid">30381430</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Calhoun</surname> <given-names>A. J.</given-names></name> <name><surname>Chalasani</surname> <given-names>S. H.</given-names></name> <name><surname>Sharpee</surname> <given-names>T. O.</given-names></name></person-group> (<year>2014</year>). <article-title>Maximally informative foraging by <italic>Caenorhabditis elegans</italic></article-title>. <source>Elife</source> <volume>3</volume>:<fpage>e04220</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.04220</pub-id><pub-id pub-id-type="pmid">25490069</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Celani</surname> <given-names>A.</given-names></name> <name><surname>Villermaux</surname> <given-names>E.</given-names></name> <name><surname>Vergassola</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>Odor landscapes in turbulent environments</article-title>. <source>Phys. Rev. X</source> <volume>4</volume>:<fpage>041015</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevX.4.041015</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gatellier</surname> <given-names>L.</given-names></name> <name><surname>Nagao</surname> <given-names>T.</given-names></name> <name><surname>Kanzaki</surname> <given-names>R.</given-names></name></person-group> (<year>2004</year>). <article-title>Serotonin modifies the sensitivity of the male silkmoth to pheromone</article-title>. <source>J. Exp. Biol</source>. <volume>207</volume>, <fpage>2487</fpage>&#x02013;<lpage>2496</lpage>. <pub-id pub-id-type="doi">10.1242/jeb.01035</pub-id><pub-id pub-id-type="pmid">15184520</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hajieghrary</surname> <given-names>H.</given-names></name> <name><surname>Hsieh</surname> <given-names>M. A.</given-names></name> <name><surname>Schwartz</surname> <given-names>I. B.</given-names></name></person-group> (<year>2016</year>). <article-title>Multi-agent search for source localization in a turbulent medium</article-title>. <source>Phys. Lett. A</source> <volume>380</volume>, <fpage>1698</fpage>&#x02013;<lpage>1705</lpage>. <pub-id pub-id-type="doi">10.1016/j.physleta.2016.03.013</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kanzaki</surname> <given-names>R.</given-names></name> <name><surname>Sugi</surname> <given-names>N.</given-names></name> <name><surname>Shibuya</surname> <given-names>T.</given-names></name></person-group> (<year>1992</year>). <article-title>Self-generated zigzag turning of <italic>Bombyx mori</italic> males during pheromone-mediated upwind walking (physiology)</article-title>. <source>Zool. Sci</source>. <volume>9</volume>, <fpage>515</fpage>&#x02013;<lpage>527</lpage>.</citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Loudon</surname> <given-names>C.</given-names></name> <name><surname>Koehl</surname> <given-names>M.</given-names></name></person-group> (<year>2000</year>). <article-title>Sniffing by a silkworm moth: wing fanning enhances air penetration through and pheromone interception by antennae</article-title>. <source>J. Exp. Biol</source>. <volume>203</volume>, <fpage>2977</fpage>&#x02013;<lpage>2990</lpage>.<pub-id pub-id-type="pmid">10976034</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mafra-Neto</surname> <given-names>A.</given-names></name> <name><surname>Card&#x000E9;</surname> <given-names>R. T.</given-names></name></person-group> (<year>1994</year>). <article-title>Fine-scale structure of pheromone plumes modulates upwind orientation of flying moths</article-title>. <source>Nature</source> <volume>369</volume>:<fpage>142</fpage>. <pub-id pub-id-type="doi">10.1038/369142a0</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Minegishi</surname> <given-names>R.</given-names></name> <name><surname>Takashima</surname> <given-names>A.</given-names></name> <name><surname>Kurabayashi</surname> <given-names>D.</given-names></name> <name><surname>Kanzaki</surname> <given-names>R.</given-names></name></person-group> (<year>2012</year>). <article-title>Construction of a brain-machine hybrid system to evaluate adaptability of an insect</article-title>. <source>Robot. Auton. Syst</source>. <volume>60</volume>, <fpage>692</fpage>&#x02013;<lpage>699</lpage>. <pub-id pub-id-type="doi">10.1016/j.robot.2011.06.012</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Obara</surname> <given-names>Y.</given-names></name></person-group> (<year>1979</year>). <article-title><italic>Bombyx mori</italic> mating dance: an essential in locating the female</article-title>. <source>Appl. Entomol. Zool</source>. <volume>14</volume>, <fpage>130</fpage>&#x02013;<lpage>132</lpage>. <pub-id pub-id-type="doi">10.1303/aez.14.130</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Pang</surname> <given-names>R.</given-names></name></person-group> (<year>2018</year>). <source>Infotaxis Summary</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://nbviewer.jupyter.org/github/rkp8000/infotaxis/blob/master/test_infotaxis.ipynb">https://nbviewer.jupyter.org/github/rkp8000/infotaxis/blob/master/test_infotaxis.ipynb</ext-link></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pang</surname> <given-names>R.</given-names></name> <name><surname>van Breugel</surname> <given-names>F.</given-names></name> <name><surname>Dickinson</surname> <given-names>M.</given-names></name> <name><surname>Riffell</surname> <given-names>J. A.</given-names></name> <name><surname>Fairhall</surname> <given-names>A.</given-names></name></person-group> (<year>2018</year>). <article-title>History dependence in insect flight decisions during odor tracking</article-title>. <source>PLoS Comput. Biol</source>. <volume>14</volume>:<fpage>e1005969</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1005969</pub-id><pub-id pub-id-type="pmid">29432454</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rodr&#x000ED;guez</surname> <given-names>J. D.</given-names></name> <name><surname>G&#x000F3;mez-Ullate</surname> <given-names>D.</given-names></name> <name><surname>Mej&#x000ED;a-Monasterio</surname> <given-names>C.</given-names></name></person-group> (<year>2017</year>). <article-title>On the performance of blind-infotaxis under inaccurate modeling of the environment</article-title>. <source>Eur. Phys. J. Spec. Top</source>. <volume>226</volume>, <fpage>2407</fpage>&#x02013;<lpage>2420</lpage>. <pub-id pub-id-type="doi">10.1140/epjst/e2017-70067-1</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shigaki</surname> <given-names>S.</given-names></name> <name><surname>Fikri</surname> <given-names>M. R.</given-names></name> <name><surname>Hernandez Reyes</surname> <given-names>C.</given-names></name> <name><surname>Sakurai</surname> <given-names>T.</given-names></name> <name><surname>Ando</surname> <given-names>N.</given-names></name> <name><surname>Kurabayashi</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2018a</year>). <article-title>Animal-in-the-loop system to investigate adaptive behavior</article-title>. <source>Adv. Robot</source>. <volume>32</volume>, <fpage>945</fpage>&#x02013;<lpage>953</lpage>. <pub-id pub-id-type="doi">10.1080/01691864.2018.1511473</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shigaki</surname> <given-names>S.</given-names></name> <name><surname>Haigo</surname> <given-names>S.</given-names></name> <name><surname>Reyes</surname> <given-names>C. H.</given-names></name> <name><surname>Sakurai</surname> <given-names>T.</given-names></name> <name><surname>Kanzaki</surname> <given-names>R.</given-names></name> <name><surname>Kurabayashi</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2019a</year>). <article-title>Analysis of the role of wind information for efficient chemical plume tracing based on optogenetic silkworm moth behavior</article-title>. <source>Bioinspir. Biomimet</source>. <volume>14</volume>:<fpage>046006</fpage>. <pub-id pub-id-type="doi">10.1088/1748-3190/ab1d34</pub-id><pub-id pub-id-type="pmid">31026859</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shigaki</surname> <given-names>S.</given-names></name> <name><surname>Sakurai</surname> <given-names>T.</given-names></name> <name><surname>Ando</surname> <given-names>N.</given-names></name> <name><surname>Kurabayashi</surname> <given-names>D.</given-names></name> <name><surname>Kanzaki</surname> <given-names>R.</given-names></name></person-group> (<year>2018b</year>). <article-title>Time-varying moth-inspired algorithm for chemical plume tracing in turbulent environment</article-title>. <source>IEEE Robot. Autom. Lett</source>. <volume>3</volume>, <fpage>76</fpage>&#x02013;<lpage>83</lpage>. <pub-id pub-id-type="doi">10.1109/LRA.2017.2730361</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shigaki</surname> <given-names>S.</given-names></name> <name><surname>Shiota</surname> <given-names>Y.</given-names></name> <name><surname>Kurabayashi</surname> <given-names>D.</given-names></name> <name><surname>Kanzaki</surname> <given-names>R.</given-names></name></person-group> (<year>2019b</year>). <article-title>Modeling of adaptive chemical plume tracing algorithm of insect using fuzzy inference</article-title>. <source>IEEE Trans. Fuzzy Syst</source>. <volume>28</volume>, <fpage>72</fpage>&#x02013;<lpage>84</lpage>. <pub-id pub-id-type="doi">10.1109/TFUZZ.2019.2915187</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sutton</surname> <given-names>R. S.</given-names></name> <name><surname>Barto</surname> <given-names>A. G.</given-names></name></person-group> (<year>2018</year>). <source>Reinforcement Learning: An Introduction</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT press</publisher-name>.</citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tabuchi</surname> <given-names>M.</given-names></name> <name><surname>Sakurai</surname> <given-names>T.</given-names></name> <name><surname>Mitsuno</surname> <given-names>H.</given-names></name> <name><surname>Namiki</surname> <given-names>S.</given-names></name> <name><surname>Minegishi</surname> <given-names>R.</given-names></name> <name><surname>Shiotsuki</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Pheromone responsiveness threshold depends on temporal integration by antennal lobe projection neurons</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>2013</volume>:<fpage>201313707</fpage>. <pub-id pub-id-type="doi">10.1073/pnas.1313707110</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thrun</surname> <given-names>S.</given-names></name> <name><surname>Burgard</surname> <given-names>W.</given-names></name> <name><surname>Fox</surname> <given-names>D.</given-names></name></person-group> (<year>2005</year>). <source>Probabilistic Robotics</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT Press</publisher-name>.</citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tomioka</surname> <given-names>K.</given-names></name> <name><surname>Ikeda</surname> <given-names>M.</given-names></name> <name><surname>Nagao</surname> <given-names>T.</given-names></name> <name><surname>Tamotsu</surname> <given-names>S.</given-names></name></person-group> (<year>1993</year>). <article-title>Involvement of serotonin in the circadian rhythm of an insect visual system</article-title>. <source>Naturwissenschaften</source> <volume>80</volume>, <fpage>137</fpage>&#x02013;<lpage>139</lpage>. <pub-id pub-id-type="doi">10.1007/BF01131019</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>van Breugel</surname> <given-names>F.</given-names></name> <name><surname>Dickinson</surname> <given-names>M. H.</given-names></name></person-group> (<year>2014</year>). <article-title>Plume-tracking behavior of flying <italic>Drosophila</italic> emerges from a set of distinct sensory-motor reflexes</article-title>. <source>Curr. Biol</source>. <volume>24</volume>, <fpage>274</fpage>&#x02013;<lpage>286</lpage>. <pub-id pub-id-type="doi">10.1016/j.cub.2013.12.023</pub-id></citation>
</ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vergassola</surname> <given-names>M.</given-names></name> <name><surname>Villermaux</surname> <given-names>E.</given-names></name> <name><surname>Shraiman</surname> <given-names>B. I.</given-names></name></person-group> (<year>2007</year>). <article-title>&#x0201C;Infotaxis&#x0201D; as a strategy for searching without gradients</article-title>. <source>Nature</source> <volume>445</volume>:<fpage>406</fpage>. <pub-id pub-id-type="doi">10.1038/nature05464</pub-id></citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vickers</surname> <given-names>N. J.</given-names></name></person-group> (<year>2005</year>). <article-title>Winging it: moth flight behavior and responses of olfactory neurons are shaped by pheromone plume dynamics</article-title>. <source>Chem. Senses</source> <volume>31</volume>, <fpage>155</fpage>&#x02013;<lpage>166</lpage>. <pub-id pub-id-type="doi">10.1093/chemse/bjj011</pub-id><pub-id pub-id-type="pmid">16339269</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Voges</surname> <given-names>N.</given-names></name> <name><surname>Chaffiol</surname> <given-names>A.</given-names></name> <name><surname>Lucas</surname> <given-names>P.</given-names></name> <name><surname>Martinez</surname> <given-names>D.</given-names></name></person-group> (<year>2014</year>). <article-title>Reactive searching and infotaxis in odor source localization</article-title>. <source>PLoS Comput. Biol</source>. <volume>10</volume>:<fpage>e1003861</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1003861</pub-id><pub-id pub-id-type="pmid">25330317</pub-id></citation></ref>
</ref-list>
<app-group>
<app id="A1">
<title>Appendix A: Infotaxis strategy</title>
<p>Herein, we provide a more detailed explanation of the derivation of the infotaxis formulae. We based this explanation on the work of Pang (<xref ref-type="bibr" rid="B12">2018</xref>) and the original infotaxis strategy developed by Vergassola et al. (<xref ref-type="bibr" rid="B24">2007</xref>). The agent&#x00027;s belief in the source location <italic>P</italic>(<bold>r</bold><sub><italic>src</italic></sub>|<inline-formula><mml:math id="M27"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>) can be written using Bayes&#x00027; theorem, as indicated in Equation (A1).</p>
<disp-formula id="E4"><label>(A1)</label><mml:math id="M6"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>&#x0221D;</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>P</italic>(<inline-formula><mml:math id="M28"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>|<bold>r</bold><sub><italic>src</italic></sub>) is the likelihood of the source position and <italic>P</italic>(<bold>r</bold><sub><italic>src</italic></sub>) is the prior distribution of the source. Infotaxis assumes that odor hits and misses are independent of one another, so the likelihood of the source position factorizes over time steps and takes the following form:</p>
<disp-formula id="E5"><label>(A2)</label><mml:math id="M7"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x0220F;</mml:mo></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>h</italic>(<bold>r</bold><sub><italic>t</italic></sub>) is 1 if the agent detects an odor hit at time <italic>t</italic> and 0 if it detects a miss. The infotaxis strategy considers that the number of hits follows a Poisson distribution; hence, the probability of a hit or miss becomes the following:</p>
<disp-formula id="E6"><label>(A3)</label><mml:math id="M8"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo class="qopname">exp</mml:mo><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x00394;</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E7"><label>(A4)</label><mml:math id="M9"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0003E;</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>R</italic>(<bold>r</bold><sub><italic>t</italic></sub>|<bold>r</bold><sub><italic>src</italic></sub>)&#x00394;<italic>t</italic> is the mean number of hits the agent expects at <bold>r</bold><sub><italic>t</italic></sub> during a time period &#x00394;<italic>t</italic>, given a source position <bold>r</bold><sub><italic>src</italic></sub>. The <xref ref-type="supplementary-material" rid="SM1">Supplementary Material</xref> of the original paper on infotaxis (Vergassola et al., <xref ref-type="bibr" rid="B24">2007</xref>) derives the hit rate <italic>R</italic> from the advection-diffusion equation of a turbulent plume and defines it as follows:</p>
<disp-formula id="E8"><label>(A5)</label><mml:math id="M10"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mo class="qopname">ln</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mo>&#x003BB;</mml:mo></mml:mrow><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:mfrac></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mfrac><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mi>D</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:msup><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo 
stretchy="false">|</mml:mo></mml:mrow><mml:mrow><mml:mo>&#x003BB;</mml:mo></mml:mrow></mml:mfrac></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E9"><label>(A6)</label><mml:math id="M11"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mo>&#x003BB;</mml:mo><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mfrac><mml:mrow><mml:mi>D</mml:mi><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:msup><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mi>&#x003C4;</mml:mi><mml:mo>/</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>4</mml:mn><mml:mi>D</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:msqrt></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>E</italic> is the emission rate of odor particles, which have an effective diffusivity <italic>D</italic> and a finite lifetime &#x003C4; and are advected by a wind with mean velocity <italic>V</italic> that blows in the positive <italic>x</italic>-direction. <italic>K</italic><sub>0</sub> is the modified Bessel function of the second kind of order zero, and &#x003B1; is the radius of a round-shaped agent. In our calculations of the information entropy of the silkmoth, we used the following parameters in Equations (A5) and (A6): &#x003B1; = 10 mm, <italic>E</italic> = 1, &#x003C4; = 6.3 s, <italic>D</italic> = 0.012, and <italic>V</italic> = 0.1 m/s, the last chosen to match the wind speed in the moth experiments. The range of possible values for the source location was a 1730 &#x000D7; 770 lattice, i.e., the size of each cell was 0.26 mm. For Equations (A3) and (A4), we set &#x00394;<italic>t</italic> to 50 ms, which is the same as the sampling period of the treadmill described in section 3.1. At each time step, the belief over the source position <italic>P</italic>(<bold>r</bold><sub><italic>src</italic></sub>|<inline-formula><mml:math id="M29"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>) can be recursively updated, up to a normalization constant, as follows:</p>
<disp-formula id="E10"><label>(A7)</label><mml:math id="M12"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
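<p>As an illustration only, Equations (A3)&#x02013;(A7) can be sketched in Python as shown below. This is a minimal sketch and not the code used in this study. The function names, the coarse 20 &#x000D7; 10 lattice with 20 mm cells, the SI units assumed for <italic>D</italic>, the numerical evaluation of <italic>K</italic><sub>0</sub>, and the clamping of the source distance to the agent radius are all assumptions introduced here for the example.</p>
<preformat>
```python
import math


def bessel_k0(x, upper=10.0, steps=1000):
    """K0 from the integral K0(x) = int_0^inf exp(-x cosh t) dt (trapezoid rule).

    Assumption: truncating at t = upper is adequate for x above roughly 0.01.
    """
    h = upper / steps
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(upper)))
    for i in range(1, steps):
        total += math.exp(-x * math.cosh(i * h))
    return total * h


def correlation_length(D, tau, V):
    """Equation (A6)."""
    return math.sqrt(D * tau / (1.0 + V * V * tau / (4.0 * D)))


def hit_rate(r, r_src, E, a, D, tau, V):
    """Equation (A5), with the sign convention copied as printed."""
    lam = correlation_length(D, tau, V)
    # Assumption: clamp the distance to the agent radius, because K0 diverges at 0.
    dist = max(math.hypot(r[0] - r_src[0], r[1] - r_src[1]), a)
    drift = math.exp((r_src[0] - r[0]) * V / (2.0 * D))
    return E / math.log(lam / a) * drift * bessel_k0(dist / lam)


def update_belief(belief, r, hit, dt, **params):
    """One recursive Bayesian update, Equation (A7), followed by normalization.

    belief maps each candidate source position (x, y) to its probability.
    """
    posterior = {}
    for r_src, prior_p in belief.items():
        p_miss = math.exp(-hit_rate(r, r_src, **params) * dt)  # Equation (A3)
        likelihood = (1.0 - p_miss) if hit else p_miss         # Equation (A4)
        posterior[r_src] = likelihood * prior_p
    norm = sum(posterior.values())
    return {r_src: p / norm for r_src, p in posterior.items()}


# Parameter values quoted in the text; metres and seconds are assumed for all of them.
PARAMS = dict(E=1.0, a=0.010, D=0.012, tau=6.3, V=0.1)
CELL = 0.02  # hypothetical cell size, far coarser than the 0.26 mm lattice in the text
cells = [(i * CELL, j * CELL) for i in range(20) for j in range(10)]
prior = {c: 1.0 / len(cells) for c in cells}  # uniform prior over source cells
agent = cells[105]  # cell with indices i = 10, j = 5

after_miss = update_belief(prior, agent, hit=False, dt=0.05, **PARAMS)
after_hit = update_belief(prior, agent, hit=True, dt=0.05, **PARAMS)
print("belief at agent cell after a miss:", after_miss[agent])
print("belief at agent cell after a hit:", after_hit[agent])
```
</preformat>
<p>In this sketch, starting from a uniform prior, a miss lowers the belief assigned to the cell occupied by the agent, whereas a hit raises it, because the expected hit rate is largest for a source located at the agent&#x00027;s own position.</p>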
<p>At each time step, the agent considers five possible actions: moving forward, backward, left, right, or waiting. For each possible action, it calculates the probability <italic>p</italic><sup>&#x0002A;</sup> that the action will result in finding the source:</p>
<disp-formula id="E11"><label>(A8)</label><mml:math id="M13"><mml:mrow><mml:msup><mml:mi>p</mml:mi><mml:mo>&#x0002A;</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mstyle displaystyle='true'><mml:munder><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mstyle mathvariant='bold'><mml:mtext>r</mml:mtext></mml:mstyle><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mstyle mathvariant='bold'><mml:mtext>r</mml:mtext></mml:mstyle><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mstyle><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mo>|</mml:mo> <mml:mrow><mml:msup><mml:mstyle mathvariant='bold'><mml:mtext>r</mml:mtext></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msup><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mstyle mathvariant='bold'><mml:mtext>r</mml:mtext></mml:mstyle><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow> <mml:mo>|</mml:mo></mml:mrow><mml:mo>&#x02248;</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>Consequently, the probability of not finding the source is 1 &#x02212; <italic>p</italic><sup>&#x0002A;</sup>. If the source is found, the entropy of the belief becomes zero, that is, &#x00394;<italic>S</italic><sup>&#x0002A;</sup> = (0 &#x02212; <italic>S</italic><sub><italic>t</italic></sub>) = &#x02212;<italic>S</italic><sub><italic>t</italic></sub>. To balance exploration and exploitation, the agent also considers the case in which it does not find the source after taking an action. In such a case, it would sample from the environment either a miss with probability <italic>p</italic><sub><italic>m</italic></sub> or a hit with probability <italic>p</italic><sub><italic>h</italic></sub> = 1 &#x02212; <italic>p</italic><sub><italic>m</italic></sub>. The probability of sampling a miss is the miss probability averaged over the possible source locations, weighted by the current belief:</p>
<disp-formula id="E12"><label>(A9)</label><mml:math id="M14"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>h</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo 
stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <bold>r</bold>&#x02032; is the future position of the agent after taking an action. The agent also estimates how its source position belief, as well as its entropy, would change after moving. The change in entropy after sampling a miss or a hit at <bold>r</bold>&#x02032; would be the following:</p>
<disp-formula id="E13"><label>(A10)</label><mml:math id="M15"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo class="qopname">ln</mml:mo><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo 
stretchy="true">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E14"><label>(A11)</label><mml:math id="M16"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo class="qopname">ln</mml:mo><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>r</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi mathcolor="black" mathvariant="-tex-caligraphic">T</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo 
stretchy="true">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
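<p>As an illustration of Equations (A9)&#x02013;(A11), the following minimal Python sketch evaluates the miss and hit probabilities and the corresponding entropy changes for one candidate position on a discretized belief. It is a sketch under stated assumptions, not the implementation used in this study: the function names, the four-cell belief, and the miss likelihoods are hypothetical, and the sketch assumes that the posterior belief after an observation is the product of the observation likelihood and the current belief, normalized to sum to one.</p>
<preformat><![CDATA[
```python
import math


def entropy(belief):
    """Shannon entropy in nats; cells with zero probability contribute nothing."""
    return -sum(p * math.log(p) for p in belief if p > 0.0)


def observation_outcomes(belief, miss_likelihood):
    """Return (p_m, dS_m, p_h, dS_h) for one candidate position r'.

    belief[i]          : current belief P(r_src = i | T_t), summing to 1
    miss_likelihood[i] : assumed P(h(r') = 0 | r_src = i), hypothetical values
    """
    s_t = entropy(belief)
    # Joint probability of each observation and each source location.
    joint_miss = [q * b for q, b in zip(miss_likelihood, belief)]
    joint_hit = [(1.0 - q) * b for q, b in zip(miss_likelihood, belief)]
    # Marginal probabilities of a miss (as in Equation A9) and of a hit.
    p_m = sum(joint_miss)
    p_h = sum(joint_hit)
    # Entropy of the normalized posterior minus the current entropy
    # (as in Equations A10 and A11); an impossible outcome changes nothing.
    ds_m = (entropy([j / p_m for j in joint_miss]) - s_t) if p_m > 0.0 else 0.0
    ds_h = (entropy([j / p_h for j in joint_hit]) - s_t) if p_h > 0.0 else 0.0
    return p_m, ds_m, p_h, ds_h


# Hypothetical example: uniform belief over four candidate source cells.
belief = [0.25, 0.25, 0.25, 0.25]
miss_likelihood = [0.2, 0.6, 0.9, 0.9]
p_m, ds_m, p_h, ds_h = observation_outcomes(belief, miss_likelihood)
print("p_m =", p_m, "p_h =", p_h)
print("dS_m =", ds_m, "dS_h =", ds_h)
```
]]></preformat>
<p>Because the example starts from a uniform belief, both observations can only sharpen it, so both entropy changes are negative in this case.</p>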
<p>Overall, the agent calculates the expected change in entropy for a move from <bold>r</bold><sub><italic>t</italic></sub> to <bold>r</bold>&#x02032; as follows:</p>
<disp-formula id="E15"><label>(A12)</label><mml:math id="M17"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:mo>&#x00394;</mml:mo><mml:mi>S</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x021A6;</mml:mo><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0002A;</mml:mo></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msup><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0002A;</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>&#x00394;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>h</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where the first and second terms of the sum are the expected changes in entropy if the source is found or not found at <bold>r</bold>&#x02032;, respectively. Finally, the agent chooses the action <italic>a</italic> with the largest expected decrease in entropy (see <xref ref-type="fig" rid="F3">Figure 3D</xref>) as:</p>
<disp-formula id="E16"><label>(A13)</label><mml:math id="M18"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>a</mml:mi><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mo class="qopname">arg</mml:mo><mml:mo class="qopname">min</mml:mo></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:mo>&#x00394;</mml:mo><mml:mi>S</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x021A6;</mml:mo><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mtext>r</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
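<p>The combination of Equations (A12) and (A13) can be sketched in Python as follows. This is a minimal illustration rather than the code used in this study: the function names, the action labels, and all numerical values are hypothetical, and the per-action quantities (the probability of finding the source, the miss and hit probabilities, and their entropy changes) are assumed to have been computed beforehand.</p>
<preformat><![CDATA[
```python
def expected_entropy_change(p_found, s_t, p_m, ds_m, p_h, ds_h):
    """Expected entropy change for one candidate move (as in Equation A12).

    p_found : probability of finding the source at r' (p* in the text)
    s_t     : entropy of the current belief
    p_m, p_h, ds_m, ds_h : miss/hit probabilities and their entropy changes
    """
    found_term = p_found * (0.0 - s_t)
    not_found_term = (1.0 - p_found) * (p_m * ds_m + p_h * ds_h)
    return found_term + not_found_term


def choose_action(candidates, s_t):
    """Pick the action minimizing the expected entropy change (as in Equation A13).

    candidates maps an action label to (p_found, p_m, ds_m, p_h, ds_h).
    Ties are resolved in favor of the first action in iteration order.
    """
    scores = {
        action: expected_entropy_change(v[0], s_t, v[1], v[2], v[3], v[4])
        for action, v in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical values for two candidate actions.
s_t = 1.2
candidates = {
    "forward": (0.10, 0.7, -0.05, 0.3, -0.40),
    "stay": (0.00, 0.8, -0.02, 0.2, -0.30),
}
best, scores = choose_action(candidates, s_t)
print(best, scores)
```
]]></preformat>
<p>With these hypothetical numbers, moving forward is preferred because it combines a nonzero chance of finding the source (exploitation) with a more informative expected observation (exploration).</p>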
<p>After making a move, the agent encounters either a miss or a hit from the odor plume and updates the probability distribution of the source location accordingly. The agent then repeats this procedure until it finds the source.</p>
</app>
</app-group>
<fn-group>
<fn fn-type="financial-disclosure"><p><bold>Funding.</bold> CH-R acknowledges funding from Instituto de Innovaci&#x000F3;n y Transferencia de Tecnolog&#x000ED;a (I2T2) and Consejo Nacional de Ciencia y Tecnolog&#x000ED;a (CONACYT). This work was also partially supported by JSPS KAKENHI under Grants JP19H02104, JP19H04930, and JP19K14943.</p>
</fn>
</fn-group>
</back>
</article>
