<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Hum. Neurosci.</journal-id>
<journal-title>Frontiers in Human Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Hum. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-5161</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnhum.2019.00386</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Human Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>The Dynamics of Attention Shifts Among Concurrent Speech in a Naturalistic Multi-speaker Virtual Environment</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Shavit-Cohen</surname> <given-names>Keren</given-names></name>
<uri xlink:href="https://loop.frontiersin.org/people/731666/overview"/>

</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Zion Golumbic</surname> <given-names>Elana</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/132738/overview"/>

</contrib>
</contrib-group>
<aff id="aff1"><institution>The Gonda Multidisciplinary Brain Research Center, Bar Ilan University</institution>, <addr-line>Ramat Gan</addr-line>, <country>Israel</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Pietro Cipresso, Italian Auxological Institute (IRCCS), Italy</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Kristina C. Backer, University of California, Merced, United States; Carlo Sestieri, Universit&#x000E0; degli Studi G. d&#x02019;Annunzio Chieti e Pescara, Italy</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Elana Zion Golumbic <email>elana.zion-golumbic&#x00040;biu.ac.il</email></corresp>
<fn fn-type="other" id="fn001"><p>Specialty section: This article was submitted to Speech and Language, a section of the journal Frontiers in Human Neuroscience</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>08</day>
<month>11</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="collection">
<year>2019</year>
</pub-date>
<volume>13</volume>
<elocation-id>386</elocation-id>
<history>
<date date-type="received">
<day>02</day>
<month>05</month>
<year>2019</year>
</date>
<date date-type="accepted">
<day>16</day>
<month>10</month>
<year>2019</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2019 Shavit-Cohen and Zion Golumbic.</copyright-statement>
<copyright-year>2019</copyright-year>
<copyright-holder>Shavit-Cohen and Zion Golumbic</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract><p>Focusing attention on one speaker against a background of other irrelevant speech can be a challenging feat. A longstanding question in attention research is whether and how frequently individuals shift their attention towards task-irrelevant speech, arguably leading to occasional detection of words in a so-called unattended message. However, this has been difficult to gauge empirically, particularly when participants attend to continuous natural speech, due to the lack of appropriate metrics for detecting shifts in internal attention. Here we introduce a new experimental platform for studying the dynamic deployment of attention among concurrent speakers, utilizing a unique combination of Virtual Reality (VR) and Eye-Tracking technology. We created a Virtual Caf&#x000E9; in which participants sit across from and attend to the narrative of a target speaker. We manipulated the number and location of distractor speakers by placing additional characters throughout the Virtual Caf&#x000E9;. By monitoring participants&#x02019; eye-gaze dynamics, we studied the patterns of overt attention-shifts among concurrent speakers as well as the consequences of these shifts on speech comprehension. Our results reveal important individual differences in the gaze-patterns displayed during selective attention to speech. While some participants stayed fixated on the target speaker throughout the entire experiment, approximately 30% of participants frequently shifted their gaze toward distractor speakers or other locations in the environment, regardless of the severity of audiovisual distraction. Critically, performing frequent gaze-shifts negatively impacted the comprehension of target speech, and participants made more mistakes when looking away from the target speaker. 
We also found that gaze-shifts occurred primarily during gaps in the acoustic input, suggesting that momentary reductions in acoustic masking prompt attention-shifts between competing speakers, in line with &#x0201C;glimpsing&#x0201D; theories of processing speech in noise. These results open a new window into understanding the dynamics of attention as they wax and wane over time, and the different listening patterns employed for dealing with the influx of sensory input in multisensory environments. Moreover, the novel approach developed here for tracking the locus of momentary attention in a naturalistic virtual-reality environment holds high promise for extending the study of human behavior and cognition and bridging the gap between the laboratory and real-life.</p></abstract>
<kwd-group>
<kwd>speech processing</kwd>
<kwd>auditory attention</kwd>
<kwd>eye-tracking</kwd>
<kwd>virtual reality</kwd>
<kwd>cocktail party effect</kwd>
<kwd>distractibility</kwd>
</kwd-group>
<contract-sponsor id="cn001">Israel Science Foundation<named-content content-type="fundref-id">10.13039/501100003977</named-content></contract-sponsor>
<contract-sponsor id="cn002">Commission europ&#x000E9;enne Office Europ&#x000E9;en de Lutte Antifraude<named-content content-type="fundref-id">10.13039/100013286</named-content></contract-sponsor>
<contract-sponsor id="cn003">United States&#x02014;Israel Binational Science Foundation<named-content content-type="fundref-id">10.13039/100006221</named-content></contract-sponsor>
<counts>
<fig-count count="5"/>
<table-count count="0"/>
<equation-count count="1"/>
<ref-count count="94"/>
<page-count count="12"/>
<word-count count="9018"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="introduction" id="s1">
<title>Introduction</title>
<p>Focusing attention on one speaker in a noisy environment can be challenging, particularly against a background of other irrelevant speech (McDermott, <xref ref-type="bibr" rid="B53">2009</xref>). Despite the difficulty of this task, comprehension of an attended speaker is generally good and the content of distractor speech is rarely recalled explicitly (Cherry, <xref ref-type="bibr" rid="B14">1953</xref>; Lachter et al., <xref ref-type="bibr" rid="B47">2004</xref>). Preferential encoding of attended speech in multi-speaker contexts is also mirrored by enhanced neural responses to attended vs. distractor speech (Ding and Simon, <xref ref-type="bibr" rid="B22">2012b</xref>; Mesgarani and Chang, <xref ref-type="bibr" rid="B54">2012</xref>; Zion Golumbic et al., <xref ref-type="bibr" rid="B93">2013b</xref>; O&#x02019;Sullivan et al., <xref ref-type="bibr" rid="B58">2015</xref>). However, there are also indications that distractor speech is processed, at least to some degree. Examples of this include the Irrelevant Stimulus Effect, in which distractor words exert a priming effect on an attended task (Treisman, <xref ref-type="bibr" rid="B82">1964</xref>; Neely and LeCompte, <xref ref-type="bibr" rid="B57">1999</xref>; Beaman et al., <xref ref-type="bibr" rid="B7">2007</xref>), as well as occasional explicit detection of salient words in distractor streams (Cherry, <xref ref-type="bibr" rid="B14">1953</xref>; Wood and Cowan, <xref ref-type="bibr" rid="B90">1995</xref>; R&#x000F6;er et al., <xref ref-type="bibr" rid="B69">2017</xref>; Parmentier et al., <xref ref-type="bibr" rid="B60">2018</xref>). These effects highlight a key theoretical tension regarding how processing resources are allocated among competing speech inputs. 
Whereas Late-Selection models of attention posit that attended and distractor speech can be fully processed, allowing for explicit detection of words in so-called unattended speech (Deutsch and Deutsch, <xref ref-type="bibr" rid="B20">1963</xref>; Duncan, <xref ref-type="bibr" rid="B23">1980</xref>; Parmentier et al., <xref ref-type="bibr" rid="B60">2018</xref>), Limited-Resources models hold that inherent processing bottlenecks prevent full linguistic analysis of concurrent speech (Broadbent, <xref ref-type="bibr" rid="B11">1958</xref>; Lachter et al., <xref ref-type="bibr" rid="B47">2004</xref>; Lavie et al., <xref ref-type="bibr" rid="B48">2004</xref>; Raveh and Lavie, <xref ref-type="bibr" rid="B64">2015</xref>). The latter perspective reconciles indications of occasional processing of distractor speech by attributing them to rapid shifts of attention toward the distractor stream (Conway et al., <xref ref-type="bibr" rid="B15">2001</xref>; Escera et al., <xref ref-type="bibr" rid="B25">2003</xref>; Lachter et al., <xref ref-type="bibr" rid="B47">2004</xref>). Yet, despite the parsimonious appeal of this explanation, to date there is little empirical evidence supporting and characterizing the psychological reality of attention-switches among concurrent speakers.</p>
<p>Establishing whether and when rapid shifts of attention towards distractor stimuli occur is operationally challenging, since it concerns individuals&#x02019; internal state, to which researchers do not have direct access. Existing metrics for detecting shifts of attention among concurrent speech primarily rely on indirect measures, such as prolongation of reaction times on an attended task (Beaman et al., <xref ref-type="bibr" rid="B7">2007</xref>) or subjective reports (Wood and Cowan, <xref ref-type="bibr" rid="B90">1995</xref>). Given these limitations, current understanding of the dynamics of attention over time, and of the nature and consequences of rapid attention-shifts among concurrent speech, remains extremely limited. Nonetheless, gaining insight into the dynamics of internal attention-shifts is critical for understanding how attention operates in naturalistic multi-speaker settings.</p>
<p>Here, we introduce a new experimental platform for studying the dynamic deployment of attention among concurrent speakers. We utilize Virtual Reality (VR) technology to simulate a naturalistic audio-visual multi-speaker environment, and track participants&#x02019; gaze-position within the Virtual Scene as a marker for the locus of overt attention and as a means for detecting attention-shifts among concurrent speakers. Participants experienced sitting in a &#x0201C;Virtual Caf&#x000E9;&#x0201D; across from a partner (avatar; animated target speaker) and were required to focus attention exclusively on this speaker. Additional distracting speakers were placed at surrounding tables, with their number and location manipulated across conditions. Continuous tracking of gaze-location allowed us to characterize whether participants stayed focused on the target speaker as instructed or whether and how often they performed overt glimpses around the environment and toward distractor speakers. Critically, we tested whether shifting one&#x02019;s gaze around the environment and away from the target speaker impacted comprehension of target speech. We further tested whether gaze-shifts are associated with salient acoustic changes in the environment, such as onsets in distractor speech that can potentially grab attention exogenously (Wood and Cowan, <xref ref-type="bibr" rid="B90">1995</xref>) or brief pauses that create momentary unmasking of competing sounds (Lavie et al., <xref ref-type="bibr" rid="B48">2004</xref>; Cooke, <xref ref-type="bibr" rid="B16">2006</xref>).</p>
<p>Gaze-shifts are often used as a proxy for attention shifts in natural vision (Anderson et al., <xref ref-type="bibr" rid="B1">2015</xref>; Schomaker et al., <xref ref-type="bibr" rid="B71">2017</xref>; Walker et al., <xref ref-type="bibr" rid="B86">2017</xref>); however, this measure has not been utilized extensively in dynamic contexts (Marius&#x02019;t Hart et al., <xref ref-type="bibr" rid="B81">2009</xref>; Foulsham et al., <xref ref-type="bibr" rid="B30">2011</xref>). This novel approach enabled us to characterize the nature of momentary attention-shifts in ecological multi-speaker listening conditions, as well as individual differences, gaining insight into the factors contributing to dynamic attention shifting and its consequences on speech comprehension.</p>
</sec>
<sec sec-type="materials and methods" id="s2">
<title>Materials and Methods</title>
<sec id="s2-1">
<title>Participants</title>
<p>Twenty-six adults participated in this study (ages 18&#x02013;32, median 24; 18 female, three left-handed), all fluent in Hebrew, with self-reported normal hearing and no history of psychiatric or neurological disorders. Signed informed consent was obtained from each participant prior to the experiment, in accordance with the guidelines of the Institutional Ethics Committee at Bar-Ilan University. Participants were paid for participation or received class credit.</p>
</sec>
<sec id="s2-2">
<title>Apparatus</title>
<p>Participants were seated comfortably in an acoustically shielded room and viewed a 3D VR scene of a caf&#x000E9; through a head-mounted device (Oculus Rift Development Kit 2). The device was custom-fitted with an embedded eye-tracker (SMI, Teltow, Germany; 60 Hz monocular sampling rate) for continuous monitoring of participants&#x02019; eye-gaze position. Audio was presented through high-quality headphones (Sennheiser HD 280 Pro).</p>
</sec>
<sec id="s2-3">
<title>Stimuli</title>
<p>Avatar characters were selected from the Mixamo platform (Adobe Systems, San Jose, CA, USA). Soundtracks for the avatars&#x02019; speech were 35&#x02013;50 s long segments of natural Hebrew speech taken from podcasts and short stories<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref>. Avatars&#x02019; mouth and articulation movements were synced to the audio to create a realistic audio-visual experience of speech (LipSync Pro, Rogo Digital, England). Scene animation and experiment programming were controlled using an open-source VR engine (Unity Software<xref ref-type="fn" rid="fn0002"><sup>2</sup></xref>). Speech loudness levels (RMS) were equated for all stimuli, in 10-s long bins (to avoid biases due to fluctuations in the speech time-course). Audio was further manipulated within Unity using a 3D sound algorithm, so that it was perceived as originating from the spatial location of the speaking avatar, with overall loudness decreasing logarithmically with distance from the listener. Participants&#x02019; head movements were not restricted, and both the graphic display and 3D sound were adapted on-line in accordance with head-position, maintaining a spatially-coherent audio-visual experience.</p>
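The binned loudness-equalization step can be sketched in Python. This is a minimal illustration, not the authors' code: the 10-s bin length follows the description above, but `target_rms` is an arbitrary illustrative level, since the absolute level used in the experiment is not reported.

```python
import numpy as np

def equate_rms_in_bins(audio, sr, bin_s=10.0, target_rms=0.05):
    """Scale each bin of a mono signal to a common RMS level.

    `target_rms` is an illustrative value; the paper equates RMS across
    stimuli in 10-s bins but does not report the absolute level used.
    """
    out = audio.astype(float).copy()
    bin_len = int(bin_s * sr)
    for start in range(0, len(out), bin_len):
        seg = out[start:start + bin_len]
        rms = np.sqrt(np.mean(seg ** 2))
        if rms > 0:  # leave silent bins untouched
            out[start:start + bin_len] = seg * (target_rms / rms)
    return out
```

Equalizing within bins rather than over the whole track keeps slow loudness drifts in the narrative from biasing the overall level match between speakers.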
</sec>
<sec id="s2-4">
<title>Experiment Design</title>
<p>In the Virtual Caf&#x000E9; setting, participants experienced sitting at a caf&#x000E9; table facing a partner (animated speaking avatar) telling a personal narrative. They were told to focus attention exclusively on the speech of their partner (target speaker) and to subsequently answer four multiple-choice comprehension questions about the narrative (e.g., &#x0201C;What computer operating system was mentioned?&#x0201D;). Answers to the comprehension questions were evenly distributed throughout the narrative, and were pre-screened in a pilot study to ensure accuracy rates between 80% and 95% in a single-speaker condition. The time-period containing the answer to each question was recorded and used in subsequent analysis of performance as a function of gaze-shift behaviors (see below). Additional pairs of distracting speakers (avatars) were placed at surrounding tables, and we systematically manipulated the number and location of distractors in four conditions: No Distraction (NoD), Left Distractors (LD), Right Distractors (RD), Right and Left Distractors (RLD; <xref ref-type="fig" rid="F1">Figure 1</xref>). Each condition consisted of five trials (&#x0007E;4 min per condition) and was presented in random order, which was different for each participant. The identity and voice of the main speaker were kept constant throughout the experiment, with different narratives in each trial, while the avatars and narratives serving as distractors varied from trial to trial. The allocation of each narrative to the condition was counter-balanced across participants, to avoid material-specific biases. Before starting the experiment itself, participants were given time to look around and familiarize themselves with the Caf&#x000E9; environment and the characters in it. During this familiarization stage, no audio was presented and participants terminated it when they were ready. 
They also completed two training-trials, in the NoD and RLD conditions, to familiarize them with the stimuli and task, as well as with the type of comprehension questions asked. This familiarization and training period lasted approximately 3 min.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Manipulation of distraction in the Virtual Caf&#x000E9;. Participants are instructed to attend to the narrative of the target speaker facing them. The number and location of distractor speakers was manipulated in four conditions: only the target-speaker presented and No Distractors (NoD), a single distractor-pair sitting to the left (LD) or right (RD) of the target speaker, and two distractor-pairs sitting to the right and the left of the target speaker (RLD). Top-Left: demonstration of a participant experiencing the Virtual Caf&#x000E9; (written informed consent was obtained from the participant for publication of this photograph).</p></caption>
<graphic xlink:href="fnhum-13-00386-g0001.tif"/>
</fig>
</sec>
<sec id="s2-5">
<title>Analysis of Eye-Gaze Dynamics</title>
<p>Analysis of eye-gaze data was performed in Matlab (Mathworks, Natick, MA, USA) using functions from the FieldTrip toolbox<xref ref-type="fn" rid="fn0003"><sup>3</sup></xref> as well as custom-written scripts. Eye-gaze position in virtual-space coordinates (x, y, z) was monitored continuously throughout the experiment. Periods surrounding eye-blinks were removed from the data (250 ms around each blink). Clean data from each trial were analyzed as follows.</p>
<p>First, we mapped gaze-positions onto specific avatars/locations in the 3D virtual scene. For data reduction, we used a spatial clustering algorithm (k-means) to combine gaze data-points associated with similar locations in space. Next, each spatial cluster was associated with the closest avatar, by calculating the Euclidean distance between the center of the cluster and the center of each avatar presented in that condition. If two or more clusters were associated with looking at the same avatar, they were combined. Similarly, clusters associated with the members of the distractor avatar-pairs (left or right distractors) were combined. If a cluster did not fall within a particular distance-threshold from any of the avatars, it was associated with looking at &#x0201C;The Environment.&#x0201D; This resulted in a maximum of four clusters capturing the different possible gaze locations in each trial: (1) Target Speaker; (2) Left Distractors (when relevant); (3) Right Distractors (when relevant); and (4) Rest of the Environment. The appropriateness of cluster-to-avatar association and distance-threshold selection was verified through visual inspection.</p>
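The cluster-to-avatar mapping can be sketched as follows. The paper used Matlab's k-means; here a minimal Lloyd's-style k-means in Python stands in for it, and `dist_thresh` is an illustrative value, since the actual threshold was chosen and verified by visual inspection.

```python
import numpy as np

def kmeans(points, k, iters=20):
    """Minimal k-means with farthest-point initialization (an illustrative
    stand-in for the Matlab k-means used in the paper)."""
    centroids = [points[0]]
    for _ in range(k - 1):  # seed each new centroid at the farthest point
        d = np.min(np.linalg.norm(points[:, None] - np.array(centroids)[None],
                                  axis=2), axis=1)
        centroids.append(points[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):  # standard Lloyd iterations
        labels = np.linalg.norm(points[:, None] - centroids[None],
                                axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

def map_clusters_to_avatars(centroids, avatar_centers, dist_thresh=1.0):
    """Associate each cluster centroid with the nearest avatar, or with
    'environment' if no avatar lies within `dist_thresh` (illustrative;
    the paper verified its threshold by visual inspection)."""
    names = list(avatar_centers)
    positions = np.array([avatar_centers[n] for n in names])
    assigned = []
    for c in centroids:
        d = np.linalg.norm(positions - c, axis=1)
        assigned.append(names[d.argmin()] if d.min() <= dist_thresh
                        else 'environment')
    return assigned
```

Clusters mapped to the same avatar (or to both members of a distractor pair) would then be merged, as described above.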
<p>Based on the clustered data, we quantified the percent of time that participants spent gazing at each location (<italic>Percent Gaze Time</italic>) in each trial, and detected the times of <italic>Gaze-Shifts</italic> from one cluster to another. Gaze-shifts lasting less than 250 ms were considered artifacts and removed from the analysis, as they are physiologically implausible (Bompas and Sumner, <xref ref-type="bibr" rid="B8">2009</xref>; Gilchrist, <xref ref-type="bibr" rid="B32">2011</xref>). The number of Gaze-Shifts, as well as the Percent Gaze Time spent at each of the four locations&#x02014;Target Speaker, Left Distractors, Right Distractors and Environment&#x02014;were averaged across trials, within condition. Since conditions differed in the type and number of distractors, comparisons across conditions focused mainly on metrics pertaining to gazing at/away-from the target speaker.</p>
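A sketch of the shift-detection step, assuming one cluster label per eye-tracker sample at 60 Hz. The 250 ms minimum-dwell criterion follows the text; treating sub-criterion dwells as artifacts to be merged into their neighbors is an implementation assumption.

```python
def detect_gaze_shifts(labels, fs=60, min_dur_ms=250):
    """Find transitions between gaze clusters, discarding dwells shorter
    than `min_dur_ms` (treated as artifacts, per the 250 ms criterion).

    `labels` holds one cluster label per sample. Returns a list of
    (sample_index, new_label) for each retained shift.
    """
    min_samples = int(round(min_dur_ms / 1000 * fs))
    # Run-length encode the label sequence
    runs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((start, i - start, labels[start]))
            start = i
    # Drop runs shorter than the minimum dwell (assumed merged into
    # neighbors), then report transitions between the remaining runs
    kept = [r for r in runs if r[1] >= min_samples]
    return [(r[0], r[2]) for prev, r in zip(kept, kept[1:]) if prev[2] != r[2]]
```

For example, a 5-sample (~83 ms) excursion to a distractor between two long fixations on the target is discarded, and no shift is counted there.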
<p>Mixed linear regression models were used in all analyses to fit the data and test for effects of Condition on gaze patterns (both Percent Gaze-Time Away and Gaze-Shifts), as well as possible correlations with speech comprehension accuracy measures. These analyses were conducted in R (R Development Core Team, <xref ref-type="bibr" rid="B100">2012</xref>) and we report statistical results derived using both regular linear (lme4 package for R; Bates et al., <xref ref-type="bibr" rid="B6">2015</xref>) and robust estimation approaches (robustlmm package for R; Koller, <xref ref-type="bibr" rid="B46">2016</xref>), to control for possible contamination by outliers. The advantage of mixed-effects models is that they account for variability between subjects and correlations within the data, as well as possible differences in trial numbers across conditions (Baayen et al., <xref ref-type="bibr" rid="B4">2008</xref>), which makes them particularly suitable for the type of data collected here.</p>
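The structure of these models (a fixed effect of Condition, by-subject random intercepts) can be sketched in Python with `statsmodels` as a rough analog of the `lme4` call. This is an assumption for illustration only: the authors fit their models in R, the data below are simulated stand-ins, and `robustlmm` has no direct statsmodels equivalent.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data: 26 subjects x 4 conditions x 5 trials,
# with a by-subject random intercept and a small RLD effect
rng = np.random.default_rng(0)
rows = []
for subj in range(26):
    subj_offset = rng.normal(0, 1)
    for cond in ['NoD', 'LD', 'RD', 'RLD']:
        for trial in range(5):
            shifts = (2.5 + subj_offset
                      + (0.8 if cond == 'RLD' else 0.0)
                      + rng.normal(0, 1))
            rows.append({'subject': subj, 'condition': cond,
                         'shifts': shifts})
df = pd.DataFrame(rows)

# Gaze-Shifts ~ Condition with by-subject random intercepts,
# NoD as the reference level (mirroring the contrasts reported here)
model = smf.mixedlm("shifts ~ C(condition, Treatment('NoD'))",
                    df, groups=df['subject'])
result = model.fit()
print(result.summary())
```

Each distraction condition is thereby contrasted against NoD while between-subject variability is absorbed by the random intercepts.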
</sec>
<sec id="s2-6">
<title>Analysis of Speech Acoustics Relative to Gaze-Shifts</title>
<p>A key question is what prompts overt gaze-shifts away from the target speakers, and specifically whether they are driven by changes in the acoustic input or if they should be considered more internally-driven. Two acoustic factors that have been suggested as inviting attention-shifts among concurrent speech are: (a) onsets/loudness increases in distractor speech that can potentially grab attention exogenously (Wood and Cowan, <xref ref-type="bibr" rid="B90">1995</xref>); and (b) brief pauses that create momentary unmasking of competing sounds (Lavie et al., <xref ref-type="bibr" rid="B48">2004</xref>; Cooke, <xref ref-type="bibr" rid="B16">2006</xref>). To test whether one or both of these factors account for the occurrence of gaze-shifts away from the target speaker in the current data, we performed a gaze-shift time-locked analysis of the speech-acoustics of target speech (in all conditions) and distractor speech (in the LD, RD and RLD conditions).</p>
<p>To this end, we first calculated the temporal envelope of the speech presented in each trial using a windowed RMS (30 ms smoothing). The envelopes were segmented relative to the times where gaze-shifts <italic>away from the target speaker</italic> occurred in that particular trial (&#x02212;400 to +200 ms around each shift). Given that the initiation-time for executing saccades is &#x0007E;200 ms (Gilchrist, <xref ref-type="bibr" rid="B32">2011</xref>), the time-window of interest for looking at possible influences of the acoustics on gaze-shifts is prior to that, i.e., 400&#x02013;200 ms prior to the gaze-shift itself.</p>
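These two steps (a windowed-RMS envelope, then segmentation from -400 to +200 ms around each shift) can be sketched as follows. The paper's analysis was done in Matlab; the edge handling here (dropping shifts too close to trial boundaries) is an implementation assumption.

```python
import numpy as np

def rms_envelope(audio, sr, win_ms=30):
    """Temporal envelope via a sliding-window RMS (30 ms smoothing)."""
    win = max(1, int(win_ms / 1000 * sr))
    power = np.convolve(audio.astype(float) ** 2,
                        np.ones(win) / win, mode='same')
    return np.sqrt(power)

def epochs_around_shifts(envelope, sr, shift_times_s, pre_s=0.4, post_s=0.2):
    """Cut envelope segments from -400 to +200 ms around each gaze-shift,
    skipping shifts too close to the trial edges (an assumption)."""
    pre, post = int(pre_s * sr), int(post_s * sr)
    segments = []
    for t in shift_times_s:
        i = int(t * sr)
        if i - pre >= 0 and i + post <= len(envelope):
            segments.append(envelope[i - pre:i + post])
    return np.array(segments)
```

The resulting segments can then be averaged within condition, with the window preceding the ~200 ms saccade-initiation lag carrying the acoustics that could have driven the shift.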
<p>Since the number of gaze-shifts varied substantially across participants, we averaged the gaze-shift-locked envelope-segments across all trials and participants, within condition. The resulting average acoustic-loudness waveform in each condition was compared to a distribution of non-gaze-locked loudness levels, generated through a permutation procedure as follows: the same acoustic envelopes were segmented randomly into an equal number of segments as the number of gaze-shifts in each condition (sampled across participants with the same proportion as the real data). These were averaged, producing a non-gaze-locked average waveform. This procedure was repeated 1,000 times and the real gaze-shift locked waveform was compared to the distribution of non-gaze-locked waveforms. We identified time-points where the loudness level fell above or below the top/bottom 5% tile of the non-gaze-locked distribution, signifying that the speech acoustics were particularly quiet or loud relative (relative to the rest of the presented speech stimuli). 
We also quantified the signal-to-noise ratio (SNR) between the time-resolved spectrograms of target and distractor speech surrounding gaze-shifts, according to: <inline-formula><mml:math id="M1"><mml:mrow><mml:mi>S</mml:mi><mml:mi>N</mml:mi><mml:mi>R</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mi>log</mml:mi><mml:mo>&#x02061;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mtext>target</mml:mtext></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mtext>distractor</mml:mtext></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mfrac><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:math></inline-formula>, with <italic>P(f,t)</italic> depicting the power at frequency <italic>f</italic> at time <italic>t</italic>. This was calculated for target-distractor combinations surrounding each gaze-shift, and averaged across shifts and trials.</p>
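The permutation procedure and the SNR computation can be sketched as follows. This is a simplified single-envelope version (the real analysis pooled segments across participants and trials), and the frame and FFT sizes are illustrative choices, not values from the paper.

```python
import numpy as np

def permutation_bounds(envelope, n_shifts, seg_len, n_perm=1000,
                       alpha=0.05, seed=0):
    """Null distribution of segment-averaged loudness from randomly placed
    (non-gaze-locked) segments; returns per-timepoint bottom/top `alpha`
    bounds for comparison with the real gaze-shift-locked average."""
    rng = np.random.default_rng(seed)
    null = np.empty((n_perm, seg_len))
    for p in range(n_perm):
        starts = rng.integers(0, len(envelope) - seg_len + 1, n_shifts)
        null[p] = np.mean([envelope[s:s + seg_len] for s in starts], axis=0)
    return np.quantile(null, alpha, axis=0), np.quantile(null, 1 - alpha, axis=0)

def spectrogram_power(x, nfft=256, hop=128):
    """Power spectrogram from Hann-windowed frames (illustrative sizes)."""
    frames = np.array([x[i:i + nfft]
                       for i in range(0, len(x) - nfft + 1, hop)])
    return np.abs(np.fft.rfft(frames * np.hanning(nfft), axis=1)) ** 2

def log_snr(target, distractor, nfft=256, hop=128, eps=1e-12):
    """SNR(f, t) = log(P_target(f, t) / P_distractor(f, t))."""
    p_t = spectrogram_power(target, nfft, hop)
    p_d = spectrogram_power(distractor, nfft, hop)
    return np.log((p_t + eps) / (p_d + eps))
```

Time-points where the real gaze-locked waveform leaves the bounds returned by `permutation_bounds` correspond to the significantly quiet or loud periods described above.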
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<sec id="s3-1">
<title>Gaze-Patterns and Speech Comprehension</title>
<p>On average, participants spent &#x0007E;7.6% of each trial (&#x0007E;3 s of a 40-s-long trial) looking at locations other than the target speaker, and performed an average of 2.5 gaze-shifts per trial. <xref ref-type="fig" rid="F2">Figure 2A</xref> shows the distribution of eye-gaze location in two example trials taken from different participants, demonstrating that sometimes gaze was fixated on the target-speaker throughout the entire trial, and sometimes shifted occasionally towards the distractors. The distribution of Gaze-shifts was relatively uniform over the course of the entire experiment (<xref ref-type="fig" rid="F2">Figure 2B</xref>, left). Twenty-three percent of gaze-shifts were performed near the onset of the trial; the remaining gaze-shifts occurred uniformly throughout the entire trial (<xref ref-type="fig" rid="F2">Figure 2B</xref>, right).</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Characterization of gaze-shift patterns. <bold>(A)</bold> Illustration of the variability in gaze-patterns across individuals. The figure depicts all gaze data points in a specific trial in the RLD condition for two example participants. While the participant shown in the left panel remained focused exclusively on the target speaker throughout the trial (blue dots), the participant in the right panel spent a substantial portion of the trial looking at the distractor speakers on both the left (green) and the right (magenta). <bold>(B)</bold> Left: distribution of all gaze-shifts across the duration of the experiment, collapsed across participants. Gaze-shifts occurred throughout the experiment and were not more prevalent at the beginning/end. Right: distribution of gaze-shifts over the course of a trial, collapsed across all trials and participants. Twenty-three percent of gaze-shifts occurred during the first 5 s of each trial, and the remainder could occur with similar probability throughout the entire trial.</p></caption>
<graphic xlink:href="fnhum-13-00386-g0002.tif"/>
</fig>
<p><xref ref-type="fig" rid="F3">Figures 3A,B</xref> show how the average <italic>Gaze Time</italic> <italic>Away</italic> from the target speaker (i.e., time spent looking at distractor avatars or other locations in the Environment) and the number of <italic>Gaze-Shifts</italic> away from the target speaker varied across the four conditions. To test whether gaze patterns (number of <italic>Gaze-Shifts</italic> and/or proportion <italic>Gaze-Time</italic> <italic>Away</italic>) differed across conditions, we estimated each of them separately using a linear mixed-effects model with the factor Condition as a fixed effect (Gaze-Shifts &#x0007E; Condition and Gaze-Time &#x0007E; Condition), where each of the three distraction conditions (RD, LD and RLD) was compared to the NoD condition. By-subject intercepts were included as random effects. No significant effects of Condition were found on <italic>Gaze-Time</italic>; however, participants performed significantly more <italic>Gaze-Shifts</italic> in the RLD condition relative to the NoD condition (lmer: <italic>&#x003B2;</italic> = 0.8, <italic>t</italic> = 2.5, <italic>p</italic> = 0.01; robustlmm: <italic>&#x003B2;</italic> = 0.54, <italic>t</italic> = 2.5).</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Summary of gaze-shift patterns and behavioral outcomes across conditions. <bold>(A,B)</bold> Proportion of Gaze-Time and Number of Gaze-Shifts Away from target speaker, per trial and across conditions. Results within each condition are broken down by gaze-location (Right Distractors, Left Distractors or Environment in blank, left and right diagonals, respectively). There was no significant difference between conditions in the total Gaze-time away from the target speaker or number of gaze-shifts. Significantly more Gaze-Shifts were performed in the RLD condition relative to the NoD condition. No other contrasts were significant. <bold>(C)</bold> Mean accuracy on comprehension questions, across condition. Difference between conditions was not significant. <bold>(D,E)</bold> Analysis of Accuracy as a function of Gaze-Shift Patterns, at the whole trial level. Trials where participants spent a larger proportion of the time looking away from the target-speaker were associated with lower accuracy rates. No significant correlation was found between accuracy rates and the number of Gaze-Shifts performed. <bold>(F)</bold> Analysis of Accuracy on single question as a function of Gaze-Shift Patterns. Mistake rates were significantly higher if participants were looking away from the target speaker vs. fixating on the target speaker during the time-window when the information critical for answering the question was delivered. Error bars indicate Standard Error of the Mean (SEM). *<italic>p</italic> &#x0003C; 0.05.</p></caption>
<graphic xlink:href="fnhum-13-00386-g0003.tif"/>
</fig>
<p>Of critical interest is whether the presence of distractors and gaze-shifts towards them impacted behavioral outcomes of speech comprehension. Accuracy on the multiple-choice comprehension questions of the target speaker was relatively good in all conditions (mean accuracy 82% &#x000B1; 3; <xref ref-type="fig" rid="F3">Figure 3C</xref>). A mixed linear model estimating Accuracy &#x0007E; Condition did not reveal any significant differences in Accuracy between conditions (lmer: all <italic>t</italic>&#x02019;s &#x0003C; 0.199, <italic>p</italic> &#x0003E; 0.6; robustlmm: all <italic>t</italic>&#x02019;s &#x0003C; 0.05). However, adding Percent Gaze-Time as a second fixed effect to the Accuracy &#x0007E; Condition model improved the model significantly (<italic>&#x003C7;</italic><sup>2</sup> = 9.14, <italic>p</italic> &#x0003C; 10<sup>&#x02212;3</sup>), with Percent Gaze-Time showing a significant correlation with Accuracy (lmer: <italic>&#x003B2;</italic> = &#x02212;0.19, <italic>t</italic> = &#x02212;3.13, <italic>p</italic> = 0.001; robustlmm: <italic>&#x003B2;</italic> = &#x02212;0.23, <italic>t</italic> = &#x02212;3.77; <xref ref-type="fig" rid="F3">Figure 3D</xref>). Adding Number of Shifts to the Accuracy &#x0007E; Condition model, however, did not significantly improve the fit (likelihood ratio test <italic>&#x003C7;</italic><sup>2</sup> = 2.4, <italic>p</italic> &#x0003E; 0.1; <xref ref-type="fig" rid="F3">Figure 3E</xref>), suggesting that the number of gaze-shifts performed <italic>per se</italic> did not affect speech comprehension.</p>
<p>To further assess the link between performance on the comprehension questions and gaze-shifts, we tested whether participants were more likely to make mistakes on specific questions if they happened to be looking away from the target speaker when the information critical for answering that question was delivered. Mistake rates were slightly lower when participants were fixating on the target speaker at the time the critical information was delivered (16% miss-rate) than when they were looking away (18% miss-rate). To evaluate this effect statistically, we fit a linear mixed model to the accuracy results on individual questions, testing whether accuracy was predicted by the presence of a gaze-shift when the answer was given, as well as by condition [Accuracy &#x0007E; Shift (yes/no) + Condition as fixed effects], with by-subject intercepts included as random effects. This analysis demonstrated a small yet significant effect of the presence of a gaze-shift during the period when the answer was given (lmer <italic>&#x003B2;</italic> = &#x02212;0.05, <italic>t</italic> = &#x02212;2.16, <italic>p</italic> &#x0003C; 0.04; robustlmm <italic>t</italic> = &#x02212;3; <xref ref-type="fig" rid="F3">Figure 3F</xref>); however, there was no significant effect of Condition (all <italic>t</italic>&#x02019;s &#x0003C; 0.5).</p>
</sec>
<sec id="s3-2">
<title>Individual Differences in Gaze Patterns</title>
<p>When looking at gaze-patterns across participants, we noted substantial variability in the number of gaze-shifts performed and in the percent of time spent gazing away from the target speaker. As illustrated in <xref ref-type="fig" rid="F2">Figures 2A</xref>, <xref ref-type="fig" rid="F4">4</xref>, some participants stayed completely focused on the main speaker throughout the entire experiment, whereas others spent a substantial portion of each trial gazing around the environment (<italic>range across participants</italic>: 0&#x02013;18 gaze-shifts per trial, on average; 0&#x02013;34.52% of trial time spent looking away from the target speaker, on average). This motivated further inspection of gaze-shift behavior at the individual level. Specifically, we tested whether an individual&#x02019;s tendency to perform many or few gaze-shifts away from the target was stable across conditions. We calculated Cronbach&#x02019;s &#x003B1; between conditions and found high internal consistency across conditions both in the number of gaze-shifts performed and in the percent of gaze-time away from the target speaker (&#x003B1; = 0.889 and &#x003B1; = 0.832, respectively). This was further demonstrated by strong positive correlations between the percent of time spent gazing away from the target speaker in the No Distraction condition vs. each of the Distraction conditions (lmer: all <italic>r</italic>&#x02019;s &#x0003E; 0.5; robustlmm: all <italic>r</italic>&#x02019;s &#x0003E; 0.6), as well as in the number of gaze-shifts (lmer and robustlmm: all <italic>r</italic>&#x02019;s &#x0003E; 0.5; <xref ref-type="fig" rid="F4">Figures 4C,D</xref>). This pattern suggests that individuals have characteristic tendencies to either stay focused or gaze around the scene, above and beyond the specific sensory attributes or degree of distraction in a particular scenario.</p>
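The consistency measure used above, Cronbach's &#x003B1; across conditions, can be computed directly from a participants &#x000D7; conditions matrix of per-participant means. A minimal sketch (the function name and the example matrices are illustrative, not the study's code):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a participants x conditions matrix of scores
    (e.g., each participant's mean number of gaze-shifts per condition)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of conditions
    item_var = scores.var(axis=0, ddof=1).sum()  # summed per-condition variance
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of participant totals
    return k / (k - 1) * (1 - item_var / total_var)
```

Values approaching 1 (as with the &#x003B1; = 0.889 and 0.832 reported above) indicate that participants who gaze-shift often in one condition also do so in the others.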
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Individual gaze-shift patterns. <bold>(A,B)</bold> Proportion of time spent gazing away from the target speaker (left) and average number of gaze-shifts per trial (right) in the NoD condition <bold>(A)</bold> and the RLD conditions <bold>(B)</bold>, across individual participants. In both cases, participant order is sorted by the NoD condition (top panels). Scatter plots on the left indicate the relationship between the number of gaze-shifts and the proportion gaze-time away, across all participants in each condition. <bold>(C,D)</bold> Scatter plots depicting the relationship between the proportion of time spent gazing away from the target speaker <bold>(C)</bold> and average number of gaze-shifts per trial <bold>(D)</bold>, in the two extreme conditions: NoD vs. RLD. Correlations were significant in both cases (<italic>r</italic> &#x0003E; 0.5).</p></caption>
<graphic xlink:href="fnhum-13-00386-g0004.tif"/>
</fig>
</sec>
<sec id="s3-3">
<title>Gaze-Locked Analysis of Speech Acoustics</title>
<p>Lastly, we tested whether there was any relationship between the timing of gaze-shifts and the local speech acoustics. To this end, we performed a gaze-shift-locked analysis of the envelope of the target or distractor speech (when present). Analysis of the distractor speech envelope included only eye-gaze shifts <italic>toward</italic> <italic>that</italic> <italic>distractor</italic> (i.e., excluding shifts to other places in the environment). <xref ref-type="fig" rid="F5">Figure 5</xref> shows the average time-course of the target and distractor speech envelopes relative to the onset of a gaze-shift. For both the target speech (top row) and the distractor speech (bottom row), gaze-shifts appear to have been preceded by a brief period of silence (within the lower 5th percentile; red shading) between 200 and 300 ms prior to the shift.</p>
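The gaze-shift-locked analysis described above (averaging the speech envelope around shift onsets and comparing it to percentile bounds built from randomly placed, non-gaze-locked segments) can be sketched roughly as follows. The envelope array, onset indices, window sizes, and permutation count are hypothetical stand-ins, not the study's actual parameters.

```python
import numpy as np

def gaze_locked_envelope(env, shift_idx, pre, post, n_perm=1000, rng=None):
    """Average the speech envelope around gaze-shift onsets and build a
    permutation null from randomly placed (non-gaze-locked) onsets."""
    if rng is None:
        rng = np.random.default_rng(0)
    win = np.arange(-pre, post)  # sample offsets relative to shift onset

    def avg_at(onsets):
        return np.stack([env[i + win] for i in onsets]).mean(axis=0)

    observed = avg_at(np.asarray(shift_idx))
    valid = np.arange(pre, len(env) - post)  # onsets with a full window
    null = np.stack([avg_at(rng.choice(valid, size=len(shift_idx)))
                     for _ in range(n_perm)])
    lo, hi = np.percentile(null, [5, 95], axis=0)  # 5th/95th percentile bounds
    return observed, lo, hi
```

Time-points where the observed gaze-locked envelope falls below `lo` (or above `hi`) correspond to the shaded regions in Figure 5A, i.e., sound levels outside the lower/upper 5th percentile of the non-gaze-locked distribution.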
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Gaze-shift locked analysis of speech acoustics. <bold>(A)</bold> Average time-course of the target (top) and distractor (bottom) speech envelopes relative to gaze shift onset (<italic>t</italic> = 0). Horizontal dotted gray lines depict the top and bottom 5th percentiles of loudness values generated through the permutation procedure over non-gaze-locked acoustic segments. The shaded red areas indicate time-periods where the speech sound-level fell within the lower/upper 5th percentile of the distribution, respectively. <bold>(B)</bold> Spectrograms depicting the signal-to-noise ratio (SNR) between the target and distractor speaker(s), surrounding the onset of a gaze-shift, in the single and two-distractor conditions. A reduction in SNR is seen in a 200 ms pre-shift time window, primarily in the higher &#x0201C;unvoiced&#x0201D; portion of the spectrogram (4&#x02013;8 kHz).</p></caption>
<graphic xlink:href="fnhum-13-00386-g0005.tif"/>
</fig>
<p>Frequency-resolved analysis of the SNR between target and distractor speech similarly indicates low SNR in the period preceding gaze-shifts. A reduction in SNR prior to gaze-shifts was primarily evident in the 3&#x02013;8 kHz range (sometimes considered the &#x0201C;unvoiced&#x0201D; part of the speech spectrum; Atal and Hanauer, <xref ref-type="bibr" rid="B2">1971</xref>), whereas SNR in the lower part of the spectrum (0&#x02013;2 kHz) was near 1 dB both before and after gaze-shifts. Although SNR does not take into account the overall loudness-level of each speaker but only the ratio between the speakers, the observed SNR modulation is consistent with momentary periods of silence/drops in the volume of <underline>both</underline> concurrent speakers.</p>
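A frequency-resolved target-to-distractor SNR of the kind shown in Figure 5B can be approximated from power spectrograms of the two audio streams. A minimal sketch, with the band limits, sampling rate, and function name chosen for illustration rather than taken from the study's processing pipeline:

```python
import numpy as np
from scipy import signal

def band_snr_db(target, distractor, fs, fmin, fmax, nperseg=256):
    """Time course of target-to-distractor SNR (in dB) within one
    frequency band, computed from power spectrograms of the two streams."""
    f, t, s_t = signal.spectrogram(target, fs=fs, nperseg=nperseg)
    _, _, s_d = signal.spectrogram(distractor, fs=fs, nperseg=nperseg)
    band = (f >= fmin) & (f <= fmax)
    p_target = s_t[band].mean(axis=0)    # mean band power per time bin
    p_distract = s_d[band].mean(axis=0)
    return t, 10 * np.log10(p_target / p_distract)
```

Running this separately on a low band (e.g., 0&#x02013;2 kHz) and a high band (e.g., 3&#x02013;8 kHz), time-locked to gaze-shift onsets, would reproduce the kind of band-specific SNR comparison described above.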
<p>This pattern is in line with an acoustic release-from-masking account, suggesting that gaze-shifts are prompted by momentary gaps in the speech, particularly when gaps in concurrent speech coincide temporally (as seen here in the Single and Two Distractor conditions). Conversely, the suggestion that attention-shifts are a product of exogenous capture by salient events in distracting speech does not seem to be supported by the current data, since the acoustics of the distractor speech that participants shifted their gaze towards did not seem to contain periods of consistently loud acoustics. We did, however, find increases in loudness of the target speech acoustics near gaze-shift onset (within the top 5th percentile; red shading between &#x02212;100 and +50 ms).</p>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>The current study is a first attempt to characterize how individuals deploy overt attention in naturalistic audiovisual settings laden with rich and competing stimuli. By monitoring eye-gaze dynamics in our Virtual Caf&#x000E9;, we studied the patterns of gaze-shifts and their consequences for speech comprehension. Interestingly, we found that the presence and number of competing speakers in the environment did not, on average, affect the amount of time spent looking at the target speaker, nor did it impair comprehension of the target speaker, although participants did perform slightly more gaze-shifts away in the two-distractor RLD condition. This demonstrates an overall resilience of the attention and speech-processing systems in overcoming the acoustic load posed by distractors in naturalistic audio-visual conditions. This ability is of utmost ecological value, and likely benefits both from the availability of visual and spatial cues (Freyman et al., <xref ref-type="bibr" rid="B31">2004</xref>; Zion Golumbic et al., <xref ref-type="bibr" rid="B92">2013a</xref>) and from the use of semantic context to maintain comprehension despite possible reductions in speech intelligibility (Simpson and Cooke, <xref ref-type="bibr" rid="B75">2005</xref>; Vergauwe et al., <xref ref-type="bibr" rid="B84">2010</xref>; Ding and Simon, <xref ref-type="bibr" rid="B21">2012a</xref>; Calandruccio et al., <xref ref-type="bibr" rid="B12">2018</xref>). At the same time, our results also suggest that the ability to maintain attention on the designated speaker under these conditions is highly individualized. Participants displayed characteristic patterns of either staying focused on the target speaker or overtly sampling other locations in the environment, regardless of the degree of sensory distraction. Critically, the amount of time that individuals spent looking around the environment and away from the target speaker was negatively correlated with speech comprehension, directly linking overt attention to speech comprehension. We also found that gaze-shifts away from the target speaker occurred primarily following gaps in the acoustic input, suggesting that momentary reductions in acoustic masking can prompt attention-shifts between competing speakers, in line with &#x0201C;glimpsing&#x0201D; theories of processing speech in noise. These results open a new window into understanding the dynamics of attention as they wax and wane over time, and the listening patterns exhibited by individuals for dealing with the influx of sensory input in complex naturalistic environments.</p>
<sec id="s4-1">
<title>Is Attention Stationary?</title>
<p>An underlying assumption of many experimental studies is that participants allocate attention solely to task-relevant stimuli, and that attention remains stationary over time. However, this assumption is probably unwarranted (Weissman et al., <xref ref-type="bibr" rid="B88">2006</xref>; Esterman et al., <xref ref-type="bibr" rid="B26">2013</xref>), since sustaining attention over long periods of time is extremely taxing (Schweizer and Moosbrugger, <xref ref-type="bibr" rid="B73">2004</xref>; Warm et al., <xref ref-type="bibr" rid="B87">2008</xref>; Avisar and Shalev, <xref ref-type="bibr" rid="B3">2011</xref>), and individuals spend a large proportion of the time mind-wandering or &#x0201C;off-task&#x0201D; (Killingsworth and Gilbert, <xref ref-type="bibr" rid="B43">2010</xref>; Boudewyn and Carter, <xref ref-type="bibr" rid="B9">2018</xref>; but see Seli et al., <xref ref-type="bibr" rid="B74">2018</xref>). Yet, empirically studying the frequency and characteristics of attention shifts is operationally difficult, since it pertains to participants&#x02019; internal state, to which experimenters do not have direct access. The use of eye-gaze position as a continuous metric for the locus of momentary overt attention in a dynamic scene in the current study contributes to this endeavor.</p>
<p>Here we found that, indeed, in many participants eye-gaze was not maintained on the target speaker throughout the entire trial. Roughly 30% of participants spent over 10% of each trial looking at places in the environment other than the to-be-attended speaker, across all conditions. Interestingly, this proportion is similar to that reported in previous studies for the prevalence of detecting one&#x02019;s own name in a so-called unattended message (Cherry, <xref ref-type="bibr" rid="B14">1953</xref>; Wood and Cowan, <xref ref-type="bibr" rid="B90">1995</xref>), an effect attributed by some to rapid attention shifts (Lachter et al., <xref ref-type="bibr" rid="B47">2004</xref>; Beaman et al., <xref ref-type="bibr" rid="B7">2007</xref>; Lin and Yeh, <xref ref-type="bibr" rid="B50">2014</xref>). Although in the current study we did not test whether these participants also gleaned more information from distractors&#x02019; speech, we did find that comprehension of the target speaker was reduced as a function of the time spent looking away from the target speaker. Participants were also more likely to miss information from the target speech during gaze-shifts away, yielding slightly higher mistake-rates. These results emphasize the dynamic nature of attention and attention-shifts, and demonstrate that brief overt attention-shifts can negatively impact speech processing in ecological multi-speaker and multisensory contexts.</p>
<p>They also highlight the importance of studying individual differences in attentional control. In the current study, we did not collect additional personal data from participants that might have shed light on the source of the observed variability in gaze-patterns across individuals. However, based on previous literature, individual differences may stem from factors such as susceptibility to distraction (Ellermeier and Zimmer, <xref ref-type="bibr" rid="B24">1997</xref>; Cowan et al., <xref ref-type="bibr" rid="B17">2005</xref>; Avisar and Shalev, <xref ref-type="bibr" rid="B3">2011</xref>; Bourel-Ponchel et al., <xref ref-type="bibr" rid="B10">2011</xref>; Forster and Lavie, <xref ref-type="bibr" rid="B29">2014</xref>; Hughes, <xref ref-type="bibr" rid="B38">2014</xref>), working memory capacity (Conway et al., <xref ref-type="bibr" rid="B15">2001</xref>; Kane and Engle, <xref ref-type="bibr" rid="B40">2002</xref>; Tsuchida et al., <xref ref-type="bibr" rid="B83">2012</xref>; S&#x000F6;rqvist et al., <xref ref-type="bibr" rid="B77">2013</xref>; Hughes, <xref ref-type="bibr" rid="B38">2014</xref>; Naveh-Benjamin et al., <xref ref-type="bibr" rid="B56">2014</xref>; Wiemers and Redick, <xref ref-type="bibr" rid="B89">2018</xref>) or personality traits (Rauthmann et al., <xref ref-type="bibr" rid="B63">2012</xref>; Risko et al., <xref ref-type="bibr" rid="B66">2012</xref>; Baranes et al., <xref ref-type="bibr" rid="B5">2015</xref>; Hoppe et al., <xref ref-type="bibr" rid="B37">2018</xref>). Additional dedicated research is needed to resolve the source of the individual differences observed here.</p>
</sec>
<sec id="s4-2">
<title>Is Eye-Gaze a Good Measure for Attention-Shifts Among Concurrent Speech?</title>
<p>One may ask to what extent the current results fully capture the prevalence of attention-shifts, since it is known that these can also occur covertly (Posner, <xref ref-type="bibr" rid="B62">1980</xref>; Petersen and Posner, <xref ref-type="bibr" rid="B61">2012</xref>). This is a valid concern; the current results should therefore be taken as representing a <italic>lower bound</italic> on the frequency of attention-shifts, which are probably more prevalent than observed here. This motivates the future development of complementary methods for quantifying covert shifts of attention among concurrent speech, given the current absence of reliable metrics.</p>
<p>Another concern that may be raised with regard to the current results is that individuals may maintain attention to the target speaker even while looking elsewhere, and hence the gaze-shifts measured here might not reflect true shifts of attention. Although in principle this could be possible, previous research shows that this is probably not the default mode of listening under natural audiovisual conditions. Rather, a wealth of studies demonstrates a tight link between gaze-shifts and attention-shifts (Chelazzi et al., <xref ref-type="bibr" rid="B13">1995</xref>; Deubel and Schneider, <xref ref-type="bibr" rid="B19">1996</xref>; Grosbras et al., <xref ref-type="bibr" rid="B36">2005</xref>; Szinte et al., <xref ref-type="bibr" rid="B80">2018</xref>), and gaze is widely utilized experimentally as a proxy for the locus of visuospatial attention (Gredeb&#x000E4;ck et al., <xref ref-type="bibr" rid="B35">2009</xref>; Linse et al., <xref ref-type="bibr" rid="B51">2017</xref>). In multi-speaker contexts, it has been shown that participants tend to move their eyes towards the location of attended speech sounds (Gopher and Kahneman, <xref ref-type="bibr" rid="B34">1971</xref>; Gopher, <xref ref-type="bibr" rid="B33">1973</xref>). Similarly, looking towards the location of distractor speech significantly reduces intelligibility and memory for attended speech and increases intrusions from distractor speech (Reisberg et al., <xref ref-type="bibr" rid="B65">1981</xref>; Spence et al., <xref ref-type="bibr" rid="B78">2000</xref>; Yi et al., <xref ref-type="bibr" rid="B91">2013</xref>). This is in line with the current finding of a negative correlation between the time spent looking away from the target speaker and speech comprehension, and of higher mistake-rates during gaze-shifts, which further link overt gaze to selective attention to speech. 
Studies on audiovisual speech processing further indicate that looking at the talking face increases speech intelligibility and neural selectivity for attended speech (Sumby and Pollack, <xref ref-type="bibr" rid="B79">1954</xref>; Zion Golumbic et al., <xref ref-type="bibr" rid="B92">2013a</xref>; Lou et al., <xref ref-type="bibr" rid="B52">2014</xref>; Crosse et al., <xref ref-type="bibr" rid="B18">2016</xref>; Park et al., <xref ref-type="bibr" rid="B59">2016</xref>), even when the video is not informative about the content of speech (Kim and Davis, <xref ref-type="bibr" rid="B44">2003</xref>; Schwartz et al., <xref ref-type="bibr" rid="B72">2004</xref>), and eye-gaze is particularly utilized for focusing attention to speech under adverse listening conditions (Yi et al., <xref ref-type="bibr" rid="B91">2013</xref>). Taken together, the current findings support the interpretation that gaze-shifts reflect shifts in attention away from the target speaker, in line with the limited-resources perspective of attention (Lavie et al., <xref ref-type="bibr" rid="B48">2004</xref>; Esterman et al., <xref ref-type="bibr" rid="B27">2014</xref>), making eye-gaze a useful and reliable metric for studying the dynamics of attention to naturalistic audio-visual speech. Interestingly, this metric has recently been capitalized on for use in assistive listening devices, utilizing eye-gaze direction to indicate the direction of a listener&#x02019;s attention (Favre-Felix et al., <xref ref-type="bibr" rid="B28">2017</xref>; Kidd, <xref ref-type="bibr" rid="B42">2017</xref>). That said, gaze-position is likely only one of several factors determining successful speech comprehension in multi-speaker environments (e.g., SNR level, audio-visual congruency, engagement in content etc.), as suggested by the significant yet still moderate effect-sizes found here.</p>
</sec>
<sec id="s4-3">
<title>Listening Between the Gaps&#x02014;What Prompts Attention Shifts Among Concurrent Speech?</title>
<p>Besides characterizing the prevalence and behavioral consequences of attention-shifts in audio-visual multi-talker contexts, it is also critical to understand what prompts these shifts. Here we tested whether there are aspects of the scene acoustics that can be associated with attention-shifts away from the target speaker. We specifically tested two hypotheses: (1) that attention is captured exogenously by highly salient sensory events in distracting speech (Wood and Cowan, <xref ref-type="bibr" rid="B90">1995</xref>; Itti and Koch, <xref ref-type="bibr" rid="B39">2000</xref>; Kayser et al., <xref ref-type="bibr" rid="B41">2005</xref>); and (2) that attention-shifts occur during brief pauses in speech acoustics that momentarily unmask the competing sounds (Lavie et al., <xref ref-type="bibr" rid="B48">2004</xref>; Cooke, <xref ref-type="bibr" rid="B16">2006</xref>).</p>
<p>Regarding the first hypothesis, the current data suggest that distractor saliency is not a primary factor in prompting gaze-shifts. Since gaze-shifts were just as prevalent in the NoD condition as in conditions that contained distractors, and since no consistent increase in distractor loudness was observed near gaze-shifts, we conclude that the gaze-shifts performed by participants do not necessarily reflect exogenous attentional capture by distractor saliency. This is in line with previous studies suggesting that sensory saliency is less effective in drawing exogenous attention in dynamic scenarios relative to the stationary contexts typically used in laboratory experiments (Smith et al., <xref ref-type="bibr" rid="B76">2013</xref>).</p>
<p>Rather, our current results seem to support the latter hypothesis, that attention-shifts are prompted by momentary acoustic release-from-masking. We found that gaze-shifts occurred most consistently &#x0007E;200&#x02013;250 ms after instances of low acoustic intensity in both target and distractor sounds and of low SNR. This time-scale is on par with the initiation time of saccades (Gilchrist, <xref ref-type="bibr" rid="B32">2011</xref>), and suggests that momentary reductions in masking provide an opportunity for the system to shift attention between speakers. This pattern fits with accounts of comprehension of speech-in-noise suggesting that listeners utilize brief periods of unmasking or low SNR to glean and piece together information for deciphering speech content (&#x0201C;acoustic glimpsing&#x0201D;; Cooke, <xref ref-type="bibr" rid="B16">2006</xref>; Li and Loizou, <xref ref-type="bibr" rid="B49">2007</xref>; Vestergaard et al., <xref ref-type="bibr" rid="B85">2011</xref>; Rosen et al., <xref ref-type="bibr" rid="B70">2013</xref>). Although this acoustic-glimpsing framework is often used to describe how listeners maintain intelligibility of target speech in noise, it has not been extensively applied to studying <italic>shifts</italic> of attention among concurrent speech. The current results suggest that brief gaps in the audio, or periods of low SNR, may serve as triggers for momentary attention-shifts, which can manifest overtly (as demonstrated here) and perhaps also covertly. Interestingly, a previous study found that eye-blinks also tend to occur more often around pauses when viewing and listening to audio-visual speech (Nakano and Kitazawa, <xref ref-type="bibr" rid="B55">2010</xref>), pointing to a possible link between acoustic glimpsing and a reset of the oculomotor system, creating optimal conditions for momentary attention-shifts.</p>
</sec>
</sec>
<sec sec-type="conclusion" id="s5">
<title>Conclusion</title>
<p>There is growing understanding that, in order to truly understand the human cognitive system, it must be studied in contexts relevant to real-life behavior, and that tightly constrained artificial laboratory paradigms do not always generalize to real life (Kingstone et al., <xref ref-type="bibr" rid="B45">2008</xref>; Marius&#x02019;t Hart et al., <xref ref-type="bibr" rid="B81">2009</xref>; Foulsham et al., <xref ref-type="bibr" rid="B30">2011</xref>; Risko et al., <xref ref-type="bibr" rid="B67">2016</xref>; Rochais et al., <xref ref-type="bibr" rid="B68">2017</xref>; Hoppe et al., <xref ref-type="bibr" rid="B37">2018</xref>). The current study represents an attempt to bridge this gap between the laboratory and real life, by studying how individuals spontaneously deploy overt attention in a naturalistic virtual-reality environment. Using this approach, the current study highlights the characteristics of, and individual differences in, selective attention to speech under naturalistic listening conditions. This pioneering work opens up new horizons for studying how attention operates in real life and understanding the factors contributing to success, as well as the difficulties, in paying attention to speech in noisy environments.</p>
</sec>
<sec id="s6">
<title>Data Availability Statement</title>
<p>The datasets generated for this study are available on request to the corresponding author.</p>
</sec>
<sec id="s7">
<title>Ethics Statement</title>
<p>The study was approved by the Institutional Ethics Committee at Bar-Ilan University, and the research was conducted according to the guidelines of the committee. Signed informed consent was obtained from each participant prior to the experiment.</p>
</sec>
<sec id="s8">
<title>Author Contributions</title>
<p>EZG designed the study, oversaw data collection and analysis. KS-C collected and analyzed the data. Both authors wrote the article.</p>
</sec>
<sec id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This work was supported by the Israel Science Foundation I-Core Center for Excellence 51/11, and by the United States&#x02013;Israel Binational Science Foundation grant &#x00023;2015385.</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Anderson</surname> <given-names>N. C.</given-names></name> <name><surname>Ort</surname> <given-names>E.</given-names></name> <name><surname>Kruijne</surname> <given-names>W.</given-names></name> <name><surname>Meeter</surname> <given-names>M.</given-names></name> <name><surname>Donk</surname> <given-names>M.</given-names></name></person-group> (<year>2015</year>). <article-title>It depends on <italic>when</italic> you look at it: salience influences eye movements in natural scene viewing and search early in time</article-title>. <source>J. Vis.</source> <volume>15</volume>:<fpage>9</fpage>. <pub-id pub-id-type="doi">10.1167/15.5.9</pub-id><pub-id pub-id-type="pmid">26067527</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Atal</surname> <given-names>B. S.</given-names></name> <name><surname>Hanauer</surname> <given-names>S. L.</given-names></name></person-group> (<year>1971</year>). <article-title>Speech analysis and synthesis by linear prediction of the speech wave</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>50</volume>, <fpage>637</fpage>&#x02013;<lpage>655</lpage>. <pub-id pub-id-type="doi">10.1121/1.1912679</pub-id><pub-id pub-id-type="pmid">4106390</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Avisar</surname> <given-names>A.</given-names></name> <name><surname>Shalev</surname> <given-names>L.</given-names></name></person-group> (<year>2011</year>). <article-title>Sustained attention and behavioral characteristics associated with ADHD in adults</article-title>. <source>Appl. Neuropsychol.</source> <volume>18</volume>, <fpage>107</fpage>&#x02013;<lpage>116</lpage>. <pub-id pub-id-type="doi">10.1080/09084282.2010.547777</pub-id><pub-id pub-id-type="pmid">21660762</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baayen</surname> <given-names>R. H.</given-names></name> <name><surname>Davidson</surname> <given-names>D. J.</given-names></name> <name><surname>Bates</surname> <given-names>D. M.</given-names></name></person-group> (<year>2008</year>). <article-title>Mixed-effects modeling with crossed random effects for subjects and items</article-title>. <source>J. Mem. Lang.</source> <volume>59</volume>, <fpage>390</fpage>&#x02013;<lpage>412</lpage>. <pub-id pub-id-type="doi">10.1016/j.jml.2007.12.005</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baranes</surname> <given-names>A.</given-names></name> <name><surname>Oudeyer</surname> <given-names>P.-Y.</given-names></name> <name><surname>Gottlieb</surname> <given-names>J.</given-names></name></person-group> (<year>2015</year>). <article-title>Eye movements reveal epistemic curiosity in human observers</article-title>. <source>Vision Res.</source> <volume>117</volume>, <fpage>81</fpage>&#x02013;<lpage>90</lpage>. <pub-id pub-id-type="doi">10.1016/j.visres.2015.10.009</pub-id><pub-id pub-id-type="pmid">26518743</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bates</surname> <given-names>D.</given-names></name> <name><surname>Maechler</surname> <given-names>M.</given-names></name> <name><surname>Bolker</surname> <given-names>B.</given-names></name> <name><surname>Walker</surname> <given-names>S.</given-names></name></person-group> (<year>2015</year>). <article-title>Fitting linear mixed-effects models using lme4</article-title>. <source>J. Stat. Softw.</source> <volume>67</volume>, <fpage>1</fpage>&#x02013;<lpage>48</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v067.i01</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Beaman</surname> <given-names>C. P.</given-names></name> <name><surname>Bridges</surname> <given-names>A. M.</given-names></name> <name><surname>Scott</surname> <given-names>S. K.</given-names></name></person-group> (<year>2007</year>). <article-title>From dichotic listening to the irrelevant sound effect: a behavioural and neuroimaging analysis of the processing of unattended speech</article-title>. <source>Cortex</source> <volume>43</volume>, <fpage>124</fpage>&#x02013;<lpage>134</lpage>. <pub-id pub-id-type="doi">10.1016/s0010-9452(08)70450-7</pub-id><pub-id pub-id-type="pmid">17334212</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bompas</surname> <given-names>A.</given-names></name> <name><surname>Sumner</surname> <given-names>P.</given-names></name></person-group> (<year>2009</year>). <article-title>Temporal dynamics of saccadic distraction</article-title>. <source>J. Vis.</source> <volume>9</volume>:<fpage>17</fpage>. <pub-id pub-id-type="doi">10.1167/9.9.17</pub-id><pub-id pub-id-type="pmid">19761350</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Boudewyn</surname> <given-names>M. A.</given-names></name> <name><surname>Carter</surname> <given-names>C. S.</given-names></name></person-group> (<year>2018</year>). <article-title>I must have missed that: &#x003B1;-band oscillations track attention to spoken language</article-title>. <source>Neuropsychologia</source> <volume>117</volume>, <fpage>148</fpage>&#x02013;<lpage>155</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuropsychologia.2018.05.024</pub-id><pub-id pub-id-type="pmid">29842859</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bourel-Ponchel</surname> <given-names>E.</given-names></name> <name><surname>Quern&#x000E9;</surname> <given-names>L.</given-names></name> <name><surname>Le Moing</surname> <given-names>A. G.</given-names></name> <name><surname>Deligni&#x000E8;res</surname> <given-names>A.</given-names></name> <name><surname>de Broca</surname> <given-names>A.</given-names></name> <name><surname>Berquin</surname> <given-names>P.</given-names></name></person-group> (<year>2011</year>). <article-title>Maturation of response time and attentional control in ADHD: evidence from an attentional capture paradigm</article-title>. <source>Eur. J. Paediatr. Neurol.</source> <volume>15</volume>, <fpage>123</fpage>&#x02013;<lpage>130</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejpn.2010.08.008</pub-id><pub-id pub-id-type="pmid">21185754</pub-id></citation></ref>
<ref id="B11"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Broadbent</surname> <given-names>D. E.</given-names></name></person-group> (<year>1958</year>). &#x0201C;<article-title>Selective listening to speech</article-title>,&#x0201D; in <source>Perception and Communication</source>, ed. <person-group person-group-type="editor"><name><surname>Broadbent</surname> <given-names>D. E.</given-names></name></person-group> (<publisher-loc>Elmsford, NY</publisher-loc>: <publisher-name>Pergamon Press</publisher-name>), <fpage>11</fpage>&#x02013;<lpage>35</lpage>.</citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Calandruccio</surname> <given-names>L.</given-names></name> <name><surname>Buss</surname> <given-names>E.</given-names></name> <name><surname>Bencheck</surname> <given-names>P.</given-names></name> <name><surname>Jett</surname> <given-names>B.</given-names></name></person-group> (<year>2018</year>). <article-title>Does the semantic content or syntactic regularity of masker speech affect speech-on-speech recognition?</article-title> <source>J. Acoust. Soc. Am.</source> <volume>144</volume>, <fpage>3289</fpage>&#x02013;<lpage>3302</lpage>. <pub-id pub-id-type="doi">10.1121/1.5081679</pub-id><pub-id pub-id-type="pmid">30599661</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chelazzi</surname> <given-names>L.</given-names></name> <name><surname>Biscaldi</surname> <given-names>M.</given-names></name> <name><surname>Corbetta</surname> <given-names>M.</given-names></name> <name><surname>Peru</surname> <given-names>A.</given-names></name> <name><surname>Tassinari</surname> <given-names>G.</given-names></name> <name><surname>Berlucchi</surname> <given-names>G.</given-names></name></person-group> (<year>1995</year>). <article-title>Oculomotor activity and visual spatial attention</article-title>. <source>Behav. Brain Res.</source> <volume>71</volume>, <fpage>81</fpage>&#x02013;<lpage>88</lpage>. <pub-id pub-id-type="doi">10.1016/0166-4328(95)00134-4</pub-id><pub-id pub-id-type="pmid">8747176</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cherry</surname> <given-names>E. C.</given-names></name></person-group> (<year>1953</year>). <article-title>Some experiments on the recognition of speech, with one and with two ears</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>25</volume>, <fpage>975</fpage>&#x02013;<lpage>979</lpage>. <pub-id pub-id-type="doi">10.1121/1.1907229</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Conway</surname> <given-names>A. R. A.</given-names></name> <name><surname>Cowan</surname> <given-names>N.</given-names></name> <name><surname>Bunting</surname> <given-names>M. F.</given-names></name></person-group> (<year>2001</year>). <article-title>The cocktail party phenomenon revisited: the importance of working memory capacity</article-title>. <source>Psychon. Bull. Rev.</source> <volume>8</volume>, <fpage>331</fpage>&#x02013;<lpage>335</lpage>. <pub-id pub-id-type="doi">10.3758/bf03196169</pub-id><pub-id pub-id-type="pmid">11495122</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cooke</surname> <given-names>M.</given-names></name></person-group> (<year>2006</year>). <article-title>A glimpsing model of speech perception in noise</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>119</volume>, <fpage>1562</fpage>&#x02013;<lpage>1573</lpage>. <pub-id pub-id-type="doi">10.1121/1.2166600</pub-id><pub-id pub-id-type="pmid">16583901</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cowan</surname> <given-names>N.</given-names></name> <name><surname>Elliott</surname> <given-names>E. M.</given-names></name> <name><surname>Scott Saults</surname> <given-names>J.</given-names></name> <name><surname>Morey</surname> <given-names>C. C.</given-names></name> <name><surname>Mattox</surname> <given-names>S.</given-names></name> <name><surname>Hismjatullina</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2005</year>). <article-title>On the capacity of attention: its estimation and its role in working memory and cognitive aptitudes</article-title>. <source>Cogn. Psychol.</source> <volume>51</volume>, <fpage>42</fpage>&#x02013;<lpage>100</lpage>. <pub-id pub-id-type="doi">10.1016/j.cogpsych.2004.12.001</pub-id><pub-id pub-id-type="pmid">16039935</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Crosse</surname> <given-names>M. J.</given-names></name> <name><surname>Di Liberto</surname> <given-names>G. M.</given-names></name> <name><surname>Lalor</surname> <given-names>E. C.</given-names></name></person-group> (<year>2016</year>). <article-title>Eye can hear clearly now: inverse effectiveness in natural audiovisual speech processing relies on long-term crossmodal temporal integration</article-title>. <source>J. Neurosci.</source> <volume>36</volume>, <fpage>9888</fpage>&#x02013;<lpage>9895</lpage>. <pub-id pub-id-type="doi">10.1523/jneurosci.1396-16.2016</pub-id><pub-id pub-id-type="pmid">27656026</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Deubel</surname> <given-names>H.</given-names></name> <name><surname>Schneider</surname> <given-names>W. X.</given-names></name></person-group> (<year>1996</year>). <article-title>Saccade target selection and object recognition: evidence for a common attentional mechanism</article-title>. <source>Vision Res.</source> <volume>36</volume>, <fpage>1827</fpage>&#x02013;<lpage>1837</lpage>. <pub-id pub-id-type="doi">10.1016/0042-6989(95)00294-4</pub-id><pub-id pub-id-type="pmid">8759451</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Deutsch</surname> <given-names>J. A.</given-names></name> <name><surname>Deutsch</surname> <given-names>D.</given-names></name></person-group> (<year>1963</year>). <article-title>Attention: some theoretical considerations</article-title>. <source>Psychol. Rev.</source> <volume>70</volume>, <fpage>80</fpage>&#x02013;<lpage>90</lpage>. <pub-id pub-id-type="doi">10.1037/h0039515</pub-id><pub-id pub-id-type="pmid">14027390</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ding</surname> <given-names>N.</given-names></name> <name><surname>Simon</surname> <given-names>J. Z.</given-names></name></person-group> (<year>2012a</year>). <article-title>Emergence of neural encoding of auditory objects while listening to competing speakers</article-title>. <source>Proc. Natl. Acad. Sci. U S A</source> <volume>109</volume>, <fpage>11854</fpage>&#x02013;<lpage>11859</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1205381109</pub-id><pub-id pub-id-type="pmid">22753470</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ding</surname> <given-names>N.</given-names></name> <name><surname>Simon</surname> <given-names>J. Z.</given-names></name></person-group> (<year>2012b</year>). <article-title>Neural coding of continuous speech in auditory cortex during monaural and dichotic listening</article-title>. <source>J. Neurophysiol.</source> <volume>107</volume>, <fpage>78</fpage>&#x02013;<lpage>89</lpage>. <pub-id pub-id-type="doi">10.1152/jn.00297.2011</pub-id><pub-id pub-id-type="pmid">21975452</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Duncan</surname> <given-names>J.</given-names></name></person-group> (<year>1980</year>). <article-title>The locus of interference in the perception of simultaneous stimuli</article-title>. <source>Psychol. Rev.</source> <volume>87</volume>, <fpage>272</fpage>&#x02013;<lpage>300</lpage>. <pub-id pub-id-type="doi">10.1037/0033-295x.87.3.272</pub-id><pub-id pub-id-type="pmid">7384344</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ellermeier</surname> <given-names>W.</given-names></name> <name><surname>Zimmer</surname> <given-names>K.</given-names></name></person-group> (<year>1997</year>). <article-title>Individual differences in susceptibility to the &#x0201C;irrelevant speech effect&#x0201D;</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>102</volume>, <fpage>2191</fpage>&#x02013;<lpage>2199</lpage>. <pub-id pub-id-type="doi">10.1121/1.419596</pub-id><pub-id pub-id-type="pmid">9348677</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Escera</surname> <given-names>C.</given-names></name> <name><surname>Yago</surname> <given-names>E.</given-names></name> <name><surname>Corral</surname> <given-names>M.-J.</given-names></name> <name><surname>Corbera</surname> <given-names>S.</given-names></name> <name><surname>Nu&#x000F1;ez</surname> <given-names>M. I.</given-names></name></person-group> (<year>2003</year>). <article-title>Attention capture by auditory significant stimuli: semantic analysis follows attention switching</article-title>. <source>Eur. J. Neurosci.</source> <volume>18</volume>, <fpage>2408</fpage>&#x02013;<lpage>2412</lpage>. <pub-id pub-id-type="doi">10.1046/j.1460-9568.2003.02937.x</pub-id><pub-id pub-id-type="pmid">14622204</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Esterman</surname> <given-names>M.</given-names></name> <name><surname>Noonan</surname> <given-names>S. K.</given-names></name> <name><surname>Rosenberg</surname> <given-names>M.</given-names></name> <name><surname>Degutis</surname> <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>In the zone or zoning out? Tracking behavioral and neural fluctuations during sustained attention</article-title>. <source>Cereb. Cortex</source> <volume>23</volume>, <fpage>2712</fpage>&#x02013;<lpage>2723</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhs261</pub-id><pub-id pub-id-type="pmid">22941724</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Esterman</surname> <given-names>M.</given-names></name> <name><surname>Rosenberg</surname> <given-names>M. D.</given-names></name> <name><surname>Noonan</surname> <given-names>S. K.</given-names></name></person-group> (<year>2014</year>). <article-title>Intrinsic fluctuations in sustained attention and distractor processing</article-title>. <source>J. Neurosci.</source> <volume>34</volume>, <fpage>1724</fpage>&#x02013;<lpage>1730</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.2658-13.2014</pub-id><pub-id pub-id-type="pmid">24478354</pub-id></citation></ref>
<ref id="B28"><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Favre-Felix</surname> <given-names>A.</given-names></name> <name><surname>Graversen</surname> <given-names>C.</given-names></name> <name><surname>Dau</surname> <given-names>T.</given-names></name> <name><surname>Lunner</surname> <given-names>T.</given-names></name></person-group> (<year>2017</year>). &#x0201C;<article-title>Real-time estimation of eye gaze by in-ear electrodes</article-title>,&#x0201D; in <source>Proceedings of the 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)</source> (<conf-loc>Seogwipo</conf-loc>: <conf-name>IEEE</conf-name>), <fpage>4086</fpage>&#x02013;<lpage>4089</lpage>.</citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Forster</surname> <given-names>S.</given-names></name> <name><surname>Lavie</surname> <given-names>N.</given-names></name></person-group> (<year>2014</year>). <article-title>Distracted by your mind? Individual differences in distractibility predict mind wandering</article-title>. <source>J. Exp. Psychol. Learn. Mem. Cogn.</source> <volume>40</volume>, <fpage>251</fpage>&#x02013;<lpage>260</lpage>. <pub-id pub-id-type="doi">10.1037/a0034108</pub-id><pub-id pub-id-type="pmid">23957365</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Foulsham</surname> <given-names>T.</given-names></name> <name><surname>Walker</surname> <given-names>E.</given-names></name> <name><surname>Kingstone</surname> <given-names>A.</given-names></name></person-group> (<year>2011</year>). <article-title>The where, what and when of gaze allocation in the lab and the natural environment</article-title>. <source>Vision Res.</source> <volume>51</volume>, <fpage>1920</fpage>&#x02013;<lpage>1931</lpage>. <pub-id pub-id-type="doi">10.1016/j.visres.2011.07.002</pub-id><pub-id pub-id-type="pmid">21784095</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Freyman</surname> <given-names>R. L.</given-names></name> <name><surname>Balakrishnan</surname> <given-names>U.</given-names></name> <name><surname>Helfer</surname> <given-names>K. S.</given-names></name></person-group> (<year>2004</year>). <article-title>Effect of number of masking talkers and auditory priming on informational masking in speech recognition</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>115</volume>, <fpage>2246</fpage>&#x02013;<lpage>2256</lpage>. <pub-id pub-id-type="doi">10.1121/1.1689343</pub-id><pub-id pub-id-type="pmid">15139635</pub-id></citation></ref>
<ref id="B32"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Gilchrist</surname> <given-names>I. D.</given-names></name></person-group> (<year>2011</year>). &#x0201C;<article-title>Saccades</article-title>,&#x0201D; in <source>The Oxford Handbook of Eye Movements</source>, eds <person-group person-group-type="editor"><name><surname>Liversedge</surname> <given-names>S. P.</given-names></name> <name><surname>Gilchrist</surname> <given-names>I. D.</given-names></name> <name><surname>Everling</surname> <given-names>S.</given-names></name></person-group> (<publisher-loc>Oxford, UK</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>), <fpage>85</fpage>&#x02013;<lpage>94</lpage>.</citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gopher</surname> <given-names>D.</given-names></name></person-group> (<year>1973</year>). <article-title>Eye-movement patterns in selective listening tasks of focused attention</article-title>. <source>Percept. Psychophys.</source> <volume>14</volume>, <fpage>259</fpage>&#x02013;<lpage>264</lpage>. <pub-id pub-id-type="doi">10.3758/bf03212387</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gopher</surname> <given-names>D.</given-names></name> <name><surname>Kahneman</surname> <given-names>D.</given-names></name></person-group> (<year>1971</year>). <article-title>Individual differences in attention and the prediction of flight criteria</article-title>. <source>Percept. Mot. Skills</source> <volume>33</volume>, <fpage>1335</fpage>&#x02013;<lpage>1342</lpage>. <pub-id pub-id-type="doi">10.2466/pms.1971.33.3f.1335</pub-id><pub-id pub-id-type="pmid">5160058</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gredeb&#x000E4;ck</surname> <given-names>G.</given-names></name> <name><surname>Johnson</surname> <given-names>S.</given-names></name> <name><surname>von Hofsten</surname> <given-names>C.</given-names></name></person-group> (<year>2009</year>). <article-title>Eye tracking in infancy research</article-title>. <source>Dev. Neuropsychol.</source> <volume>35</volume>, <fpage>1</fpage>&#x02013;<lpage>19</lpage>. <pub-id pub-id-type="doi">10.1080/87565640903325758</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grosbras</surname> <given-names>M. H.</given-names></name> <name><surname>Laird</surname> <given-names>A. R.</given-names></name> <name><surname>Paus</surname> <given-names>T.</given-names></name></person-group> (<year>2005</year>). <article-title>Cortical regions involved in eye movements, shifts of attention and gaze perception</article-title>. <source>Hum. Brain Mapp.</source> <volume>25</volume>, <fpage>140</fpage>&#x02013;<lpage>154</lpage>. <pub-id pub-id-type="doi">10.1002/hbm.20145</pub-id><pub-id pub-id-type="pmid">15846814</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoppe</surname> <given-names>S.</given-names></name> <name><surname>Loetscher</surname> <given-names>T.</given-names></name> <name><surname>Morey</surname> <given-names>S. A.</given-names></name> <name><surname>Bulling</surname> <given-names>A.</given-names></name></person-group> (<year>2018</year>). <article-title>Eye movements during everyday behavior predict personality traits</article-title>. <source>Front. Hum. Neurosci.</source> <volume>12</volume>:<fpage>105</fpage>. <pub-id pub-id-type="doi">10.3389/fnhum.2018.00105</pub-id><pub-id pub-id-type="pmid">29713270</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hughes</surname> <given-names>R. W.</given-names></name></person-group> (<year>2014</year>). <article-title>Auditory distraction: a duplex-mechanism account</article-title>. <source>Psych J.</source> <volume>3</volume>, <fpage>30</fpage>&#x02013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1002/pchj.44</pub-id><pub-id pub-id-type="pmid">26271638</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Itti</surname> <given-names>L.</given-names></name> <name><surname>Koch</surname> <given-names>C.</given-names></name></person-group> (<year>2000</year>). <article-title>A saliency-based search mechanism for overt and covert shifts of visual attention</article-title>. <source>Vision Res.</source> <volume>40</volume>, <fpage>1489</fpage>&#x02013;<lpage>1506</lpage>. <pub-id pub-id-type="doi">10.1016/s0042-6989(99)00163-7</pub-id><pub-id pub-id-type="pmid">10788654</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kane</surname> <given-names>M. J.</given-names></name> <name><surname>Engle</surname> <given-names>R. W.</given-names></name></person-group> (<year>2002</year>). <article-title>The role of prefrontal cortex in working-memory capacity, executive attention and general fluid intelligence: an individual-differences perspective</article-title>. <source>Psychon. Bull. Rev.</source> <volume>9</volume>, <fpage>637</fpage>&#x02013;<lpage>671</lpage>. <pub-id pub-id-type="doi">10.3758/bf03196323</pub-id><pub-id pub-id-type="pmid">12613671</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kayser</surname> <given-names>C.</given-names></name> <name><surname>Petkov</surname> <given-names>C. I.</given-names></name> <name><surname>Lippert</surname> <given-names>M.</given-names></name> <name><surname>Logothetis</surname> <given-names>N. K.</given-names></name></person-group> (<year>2005</year>). <article-title>Mechanisms for allocating auditory attention: an auditory saliency map</article-title>. <source>Curr. Biol.</source> <volume>15</volume>, <fpage>1943</fpage>&#x02013;<lpage>1947</lpage>. <pub-id pub-id-type="doi">10.1016/j.cub.2005.09.040</pub-id><pub-id pub-id-type="pmid">16271872</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kidd</surname> <given-names>G.</given-names><suffix>Jr.</suffix></name></person-group> (<year>2017</year>). <article-title>Enhancing auditory selective attention using a visually guided hearing aid</article-title>. <source>J. Speech Lang. Hear. Res.</source> <volume>60</volume>, <fpage>3027</fpage>&#x02013;<lpage>3038</lpage>. <pub-id pub-id-type="doi">10.1044/2017_JSLHR-H-17-0071</pub-id><pub-id pub-id-type="pmid">29049603</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Killingsworth</surname> <given-names>M. A.</given-names></name> <name><surname>Gilbert</surname> <given-names>D. T.</given-names></name></person-group> (<year>2010</year>). <article-title>A wandering mind is an unhappy mind</article-title>. <source>Science</source> <volume>330</volume>:<fpage>932</fpage>. <pub-id pub-id-type="doi">10.1126/science.1192439</pub-id><pub-id pub-id-type="pmid">21071660</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>J.</given-names></name> <name><surname>Davis</surname> <given-names>C.</given-names></name></person-group> (<year>2003</year>). <article-title>Hearing foreign voices: does knowing what is said affect visual-masked-speech detection?</article-title> <source>Perception</source> <volume>32</volume>, <fpage>111</fpage>&#x02013;<lpage>120</lpage>. <pub-id pub-id-type="doi">10.1068/p3466</pub-id><pub-id pub-id-type="pmid">12613790</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kingstone</surname> <given-names>A.</given-names></name> <name><surname>Smilek</surname> <given-names>D.</given-names></name> <name><surname>Eastwood</surname> <given-names>J. D.</given-names></name></person-group> (<year>2008</year>). <article-title>Cognitive ethology: a new approach for studying human cognition</article-title>. <source>Br. J. Psychol.</source> <volume>99</volume>, <fpage>317</fpage>&#x02013;<lpage>340</lpage>. <pub-id pub-id-type="doi">10.1348/000712607X251243</pub-id><pub-id pub-id-type="pmid">17977481</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koller</surname> <given-names>M.</given-names></name></person-group> (<year>2016</year>). <article-title>robustlmm: an R package for robust estimation of linear mixed-effects models</article-title>. <source>J. Stat. Softw.</source> <volume>75</volume>, <fpage>1</fpage>&#x02013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v075.i06</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lachter</surname> <given-names>J.</given-names></name> <name><surname>Forster</surname> <given-names>K. I.</given-names></name> <name><surname>Ruthruff</surname> <given-names>E.</given-names></name></person-group> (<year>2004</year>). <article-title>Forty-five years after broadbent (1958): still no identification without attention</article-title>. <source>Psychol. Rev.</source> <volume>111</volume>, <fpage>880</fpage>&#x02013;<lpage>913</lpage>. <pub-id pub-id-type="doi">10.1037/0033-295X.111.4.880</pub-id><pub-id pub-id-type="pmid">15482066</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lavie</surname> <given-names>N.</given-names></name> <name><surname>Hirst</surname> <given-names>A.</given-names></name> <name><surname>de Fockert</surname> <given-names>J. W.</given-names></name> <name><surname>Viding</surname> <given-names>E.</given-names></name></person-group> (<year>2004</year>). <article-title>Load theory of selective attention and cognitive control</article-title>. <source>J. Exp. Psychol. Gen.</source> <volume>133</volume>, <fpage>339</fpage>&#x02013;<lpage>354</lpage>. <pub-id pub-id-type="doi">10.1037/0096-3445.133.3.339</pub-id><pub-id pub-id-type="pmid">15355143</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>N.</given-names></name> <name><surname>Loizou</surname> <given-names>P. C.</given-names></name></person-group> (<year>2007</year>). <article-title>Factors influencing glimpsing of speech in noise</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>122</volume>, <fpage>1165</fpage>&#x02013;<lpage>1172</lpage>. <pub-id pub-id-type="doi">10.1121/1.2749454</pub-id><pub-id pub-id-type="pmid">17672662</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>S.-H.</given-names></name> <name><surname>Yeh</surname> <given-names>Y.-Y.</given-names></name></person-group> (<year>2014</year>). <article-title>Attentional load and the consciousness of one&#x02019;s own name</article-title>. <source>Conscious. Cogn.</source> <volume>26</volume>, <fpage>197</fpage>&#x02013;<lpage>203</lpage>. <pub-id pub-id-type="doi">10.1016/j.concog.2014.03.008</pub-id><pub-id pub-id-type="pmid">24762974</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Linse</surname> <given-names>K.</given-names></name> <name><surname>R&#x000FC;ger</surname> <given-names>W.</given-names></name> <name><surname>Joos</surname> <given-names>M.</given-names></name> <name><surname>Schmitz-Peiffer</surname> <given-names>H.</given-names></name> <name><surname>Storch</surname> <given-names>A.</given-names></name> <name><surname>Hermann</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>Eye-tracking-based assessment suggests preserved well-being in locked-in patients</article-title>. <source>Ann. Neurol.</source> <volume>81</volume>, <fpage>310</fpage>&#x02013;<lpage>315</lpage>. <pub-id pub-id-type="doi">10.1002/ana.24871</pub-id><pub-id pub-id-type="pmid">28074605</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lou</surname> <given-names>Y.</given-names></name> <name><surname>Yoon</surname> <given-names>J. W.</given-names></name> <name><surname>Huh</surname> <given-names>H.</given-names></name></person-group> (<year>2014</year>). <article-title>Modeling of shear ductile fracture considering a changeable cut-off value for stress triaxiality</article-title>. <source>Int. J. Plast.</source> <volume>54</volume>, <fpage>56</fpage>&#x02013;<lpage>80</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijplas.2013.08.006</pub-id></citation></ref>
<ref id="B81"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>&#x02019;t Hart</surname> <given-names>B. M.</given-names></name> <name><surname>Vockeroth</surname> <given-names>J.</given-names></name> <name><surname>Schumann</surname> <given-names>F.</given-names></name> <name><surname>Bartl</surname> <given-names>K.</given-names></name> <name><surname>Schneider</surname> <given-names>E.</given-names></name> <name><surname>K&#x000F6;nig</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>Gaze allocation in natural stimuli: comparing free exploration to head-fixed viewing conditions</article-title>. <source>Vis. Cogn.</source> <volume>17</volume>, <fpage>1132</fpage>&#x02013;<lpage>1158</lpage>. <pub-id pub-id-type="doi">10.1080/13506280902812304</pub-id></citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McDermott</surname> <given-names>J. H.</given-names></name></person-group> (<year>2009</year>). <article-title>The cocktail party problem</article-title>. <source>Curr. Biol.</source> <volume>19</volume>, <fpage>R1024</fpage>&#x02013;<lpage>R1027</lpage>. <pub-id pub-id-type="doi">10.1016/j.cub.2009.09.005</pub-id><pub-id pub-id-type="pmid">19948136</pub-id></citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mesgarani</surname> <given-names>N.</given-names></name> <name><surname>Chang</surname> <given-names>E. F.</given-names></name></person-group> (<year>2012</year>). <article-title>Selective cortical representation of attended speaker in multi-talker speech perception</article-title>. <source>Nature</source> <volume>485</volume>, <fpage>233</fpage>&#x02013;<lpage>236</lpage>. <pub-id pub-id-type="doi">10.1038/nature11020</pub-id><pub-id pub-id-type="pmid">22522927</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nakano</surname> <given-names>T.</given-names></name> <name><surname>Kitazawa</surname> <given-names>S.</given-names></name></person-group> (<year>2010</year>). <article-title>Eyeblink entrainment at breakpoints of speech</article-title>. <source>Exp. Brain Res.</source> <volume>205</volume>, <fpage>577</fpage>&#x02013;<lpage>581</lpage>. <pub-id pub-id-type="doi">10.1007/s00221-010-2387-z</pub-id><pub-id pub-id-type="pmid">20700731</pub-id></citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Naveh-Benjamin</surname> <given-names>M.</given-names></name> <name><surname>Kilb</surname> <given-names>A.</given-names></name> <name><surname>Maddox</surname> <given-names>G. B.</given-names></name> <name><surname>Thomas</surname> <given-names>J.</given-names></name> <name><surname>Fine</surname> <given-names>H. C.</given-names></name> <name><surname>Chen</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Older adults do not notice their names: a new twist to a classic attention task</article-title>. <source>J. Exp. Psychol. Learn. Mem. Cogn.</source> <volume>40</volume>, <fpage>1540</fpage>&#x02013;<lpage>1550</lpage>. <pub-id pub-id-type="doi">10.1037/xlm0000020</pub-id><pub-id pub-id-type="pmid">24820668</pub-id></citation></ref>
<ref id="B57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Neely</surname> <given-names>C.</given-names></name> <name><surname>LeCompte</surname> <given-names>D.</given-names></name></person-group> (<year>1999</year>). <article-title>The importance of semantic similarity to the irrelevant speech effect</article-title>. <source>Mem. Cognit.</source> <volume>27</volume>, <fpage>37</fpage>&#x02013;<lpage>44</lpage>. <pub-id pub-id-type="doi">10.3758/bf03201211</pub-id><pub-id pub-id-type="pmid">10087854</pub-id></citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>O&#x02019;Sullivan</surname> <given-names>J. A.</given-names></name> <name><surname>Power</surname> <given-names>A. J.</given-names></name> <name><surname>Mesgarani</surname> <given-names>N.</given-names></name> <name><surname>Rajaram</surname> <given-names>S.</given-names></name> <name><surname>Foxe</surname> <given-names>J. J.</given-names></name> <name><surname>Shinn-Cunningham</surname> <given-names>B. G.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>Attentional selection in a cocktail party environment can be decoded from single-trial EEG</article-title>. <source>Cereb. Cortex</source> <volume>25</volume>, <fpage>1697</fpage>&#x02013;<lpage>1706</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bht355</pub-id><pub-id pub-id-type="pmid">24429136</pub-id></citation></ref>
<ref id="B59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Park</surname> <given-names>H.</given-names></name> <name><surname>Kayser</surname> <given-names>C.</given-names></name> <name><surname>Thut</surname> <given-names>G.</given-names></name> <name><surname>Gross</surname> <given-names>J.</given-names></name></person-group> (<year>2016</year>). <article-title>Lip movements entrain the observers&#x02019; low-frequency brain oscillations to facilitate speech intelligibility</article-title>. <source>Elife</source> <volume>5</volume>:<fpage>e14521</fpage>. <pub-id pub-id-type="doi">10.7554/elife.14521</pub-id><pub-id pub-id-type="pmid">27146891</pub-id></citation></ref>
<ref id="B60"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Parmentier</surname> <given-names>F. B. R.</given-names></name> <name><surname>Pacheco-Unguetti</surname> <given-names>A. P.</given-names></name> <name><surname>Valero</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). <article-title>Food words distract the hungry: evidence of involuntary semantic processing of task-irrelevant but biologically-relevant unexpected auditory words</article-title>. <source>PLoS One</source> <volume>13</volume>:<fpage>e0190644</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0190644</pub-id><pub-id pub-id-type="pmid">29300763</pub-id></citation></ref>
<ref id="B61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Petersen</surname> <given-names>S. E.</given-names></name> <name><surname>Posner</surname> <given-names>M. I.</given-names></name></person-group> (<year>2012</year>). <article-title>The attention system of the human brain: 20 years after</article-title>. <source>Annu. Rev. Neurosci.</source> <volume>35</volume>, <fpage>73</fpage>&#x02013;<lpage>89</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-neuro-062111-150525</pub-id><pub-id pub-id-type="pmid">22524787</pub-id></citation></ref>
<ref id="B62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Posner</surname> <given-names>M. I.</given-names></name></person-group> (<year>1980</year>). <article-title>Orienting of attention</article-title>. <source>Q. J. Exp. Psychol.</source> <volume>32</volume>, <fpage>3</fpage>&#x02013;<lpage>25</lpage>. <pub-id pub-id-type="doi">10.1080/00335558008248231</pub-id><pub-id pub-id-type="pmid">7367577</pub-id></citation></ref>
<ref id="B63"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rauthmann</surname> <given-names>J. F.</given-names></name> <name><surname>Seubert</surname> <given-names>C. T.</given-names></name> <name><surname>Sachse</surname> <given-names>P.</given-names></name> <name><surname>Furtner</surname> <given-names>M. R.</given-names></name></person-group> (<year>2012</year>). <article-title>Eyes as windows to the soul: gazing behavior is related to personality</article-title>. <source>J. Res. Pers.</source> <volume>46</volume>, <fpage>147</fpage>&#x02013;<lpage>156</lpage>. <pub-id pub-id-type="doi">10.1016/j.jrp.2011.12.010</pub-id></citation></ref>
<ref id="B64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Raveh</surname> <given-names>D.</given-names></name> <name><surname>Lavie</surname> <given-names>N.</given-names></name></person-group> (<year>2015</year>). <article-title>Load-induced inattentional deafness</article-title>. <source>Atten. Percept. Psychophys.</source> <volume>77</volume>, <fpage>483</fpage>&#x02013;<lpage>492</lpage>. <pub-id pub-id-type="doi">10.3758/s13414-014-0776-2</pub-id><pub-id pub-id-type="pmid">25287617</pub-id></citation></ref>
<ref id="B65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reisberg</surname> <given-names>D.</given-names></name> <name><surname>Scheiber</surname> <given-names>R.</given-names></name> <name><surname>Potemken</surname> <given-names>L.</given-names></name></person-group> (<year>1981</year>). <article-title>Eye position and the control of auditory attention</article-title>. <source>J. Exp. Psychol. Hum. Percept. Perform.</source> <volume>7</volume>, <fpage>318</fpage>&#x02013;<lpage>323</lpage>. <pub-id pub-id-type="doi">10.1037/0096-1523.7.2.318</pub-id><pub-id pub-id-type="pmid">6453926</pub-id></citation></ref>
<ref id="B66"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Risko</surname> <given-names>E. F.</given-names></name> <name><surname>Anderson</surname> <given-names>N. C.</given-names></name> <name><surname>Lanthier</surname> <given-names>S.</given-names></name> <name><surname>Kingstone</surname> <given-names>A.</given-names></name></person-group> (<year>2012</year>). <article-title>Curious eyes: individual differences in personality predict eye movement behavior in scene-viewing</article-title>. <source>Cognition</source> <volume>122</volume>, <fpage>86</fpage>&#x02013;<lpage>90</lpage>. <pub-id pub-id-type="doi">10.1016/j.cognition.2011.08.014</pub-id><pub-id pub-id-type="pmid">21983424</pub-id></citation></ref>
<ref id="B67"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Risko</surname> <given-names>E. F.</given-names></name> <name><surname>Richardson</surname> <given-names>D. C.</given-names></name> <name><surname>Kingstone</surname> <given-names>A.</given-names></name></person-group> (<year>2016</year>). <article-title>Breaking the fourth wall of cognitive science</article-title>. <source>Curr. Dir. Psychol. Sci.</source> <volume>25</volume>, <fpage>70</fpage>&#x02013;<lpage>74</lpage>. <pub-id pub-id-type="doi">10.1177/0963721415617806</pub-id></citation></ref>
<ref id="B68"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rochais</surname> <given-names>C.</given-names></name> <name><surname>Henry</surname> <given-names>S.</given-names></name> <name><surname>Hausberger</surname> <given-names>M.</given-names></name></person-group> (<year>2017</year>). <article-title>Spontaneous attention-capture by auditory distractors as predictor of distractibility: a study of domestic horses <italic>(Equus caballus)</italic></article-title>. <source>Sci. Rep.</source> <volume>7</volume>:<fpage>15283</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-017-15654-5</pub-id><pub-id pub-id-type="pmid">29127367</pub-id></citation></ref>
<ref id="B69"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>R&#x000F6;er</surname> <given-names>J. P.</given-names></name> <name><surname>K&#x000F6;rner</surname> <given-names>U.</given-names></name> <name><surname>Buchner</surname> <given-names>A.</given-names></name> <name><surname>Bell</surname> <given-names>R.</given-names></name></person-group> (<year>2017</year>). <article-title>Attentional capture by taboo words: a functional view of auditory distraction</article-title>. <source>Emotion</source> <volume>17</volume>, <fpage>740</fpage>&#x02013;<lpage>750</lpage>. <pub-id pub-id-type="doi">10.1037/emo0000274</pub-id><pub-id pub-id-type="pmid">28080086</pub-id></citation></ref>
<ref id="B70"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosen</surname> <given-names>S.</given-names></name> <name><surname>Souza</surname> <given-names>P.</given-names></name> <name><surname>Ekelund</surname> <given-names>C.</given-names></name> <name><surname>Majeed</surname> <given-names>A. A.</given-names></name></person-group> (<year>2013</year>). <article-title>Listening to speech in a background of other talkers: effects of talker number and noise vocoding</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>133</volume>, <fpage>2431</fpage>&#x02013;<lpage>2443</lpage>. <pub-id pub-id-type="doi">10.1121/1.4794379</pub-id><pub-id pub-id-type="pmid">23556608</pub-id></citation></ref>
<ref id="B100"><citation citation-type="web"><person-group person-group-type="author"><collab>R Development Core Team</collab></person-group>. (<year>2012</year>). <article-title>R: A Language and Environment for Statistical Computing. Vienna: R foundation for Statistical Computing</article-title>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://www.R-project.org/">http://www.R-project.org/</ext-link>.</citation></ref>
<ref id="B71"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schomaker</surname> <given-names>J.</given-names></name> <name><surname>Walper</surname> <given-names>D.</given-names></name> <name><surname>Wittmann</surname> <given-names>B. C.</given-names></name> <name><surname>Einh&#x000E4;user</surname> <given-names>W.</given-names></name></person-group> (<year>2017</year>). <article-title>Attention in natural scenes: affective-motivational factors guide gaze independently of visual salience</article-title>. <source>Vision Res.</source> <volume>133</volume>, <fpage>161</fpage>&#x02013;<lpage>175</lpage>. <pub-id pub-id-type="doi">10.1016/j.visres.2017.02.003</pub-id><pub-id pub-id-type="pmid">28279712</pub-id></citation></ref>
<ref id="B72"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwartz</surname> <given-names>J.-L.</given-names></name> <name><surname>Berthommier</surname> <given-names>F.</given-names></name> <name><surname>Savariaux</surname> <given-names>C.</given-names></name></person-group> (<year>2004</year>). <article-title>Seeing to hear better: evidence for early audio-visual interactions in speech identification</article-title>. <source>Cognition</source> <volume>93</volume>, <fpage>B69</fpage>&#x02013;<lpage>B78</lpage>. <pub-id pub-id-type="doi">10.1016/s0010-0277(04)00054-x</pub-id><pub-id pub-id-type="pmid">15147940</pub-id></citation></ref>
<ref id="B73"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schweizer</surname> <given-names>K.</given-names></name> <name><surname>Moosbrugger</surname> <given-names>H.</given-names></name></person-group> (<year>2004</year>). <article-title>Attention and working memory as predictors of intelligence</article-title>. <source>Intelligence</source> <volume>32</volume>, <fpage>329</fpage>&#x02013;<lpage>347</lpage>. <pub-id pub-id-type="doi">10.1016/j.intell.2004.06.006</pub-id></citation></ref>
<ref id="B74"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seli</surname> <given-names>P.</given-names></name> <name><surname>Beaty</surname> <given-names>R. E.</given-names></name> <name><surname>Cheyne</surname> <given-names>J. A.</given-names></name> <name><surname>Smilek</surname> <given-names>D.</given-names></name> <name><surname>Oakman</surname> <given-names>J.</given-names></name> <name><surname>Schacter</surname> <given-names>D. L.</given-names></name></person-group> (<year>2018</year>). <article-title>How pervasive is mind wandering, really?</article-title> <source>Conscious. Cogn.</source> <volume>66</volume>, <fpage>74</fpage>&#x02013;<lpage>78</lpage>. <pub-id pub-id-type="doi">10.1016/j.concog.2018.10.002</pub-id><pub-id pub-id-type="pmid">30408603</pub-id></citation></ref>
<ref id="B75"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Simpson</surname> <given-names>S. A.</given-names></name> <name><surname>Cooke</surname> <given-names>M.</given-names></name></person-group> (<year>2005</year>). <article-title>Consonant identification in N-talker babble is a nonmonotonic function of N</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>118</volume>, <fpage>2775</fpage>&#x02013;<lpage>2778</lpage>. <pub-id pub-id-type="doi">10.1121/1.2062650</pub-id><pub-id pub-id-type="pmid">16334654</pub-id></citation></ref>
<ref id="B76"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Smith</surname> <given-names>T. J.</given-names></name> <name><surname>Lamont</surname> <given-names>P.</given-names></name> <name><surname>Henderson</surname> <given-names>J. M.</given-names></name></person-group> (<year>2013</year>). <article-title>Change blindness in a dynamic scene due to endogenous override of exogenous attentional cues</article-title>. <source>Perception</source> <volume>42</volume>, <fpage>884</fpage>&#x02013;<lpage>886</lpage>. <pub-id pub-id-type="doi">10.1068/p7377</pub-id><pub-id pub-id-type="pmid">24303751</pub-id></citation></ref>
<ref id="B77"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>S&#x000F6;rqvist</surname> <given-names>P.</given-names></name> <name><surname>Marsh</surname> <given-names>J. E.</given-names></name> <name><surname>N&#x000F6;stl</surname> <given-names>A.</given-names></name></person-group> (<year>2013</year>). <article-title>High working memory capacity does not always attenuate distraction: bayesian evidence in support of the null hypothesis</article-title>. <source>Psychon. Bull. Rev.</source> <volume>20</volume>, <fpage>897</fpage>&#x02013;<lpage>904</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-013-0419-y</pub-id><pub-id pub-id-type="pmid">23479339</pub-id></citation></ref>
<ref id="B78"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Spence</surname> <given-names>C.</given-names></name> <name><surname>Ranson</surname> <given-names>J.</given-names></name> <name><surname>Driver</surname> <given-names>J.</given-names></name></person-group> (<year>2000</year>). <article-title>Cross-modal selective attention: on the difficulty of ignoring sounds at the locus of visual attention</article-title>. <source>Percept. Psychophys.</source> <volume>62</volume>, <fpage>410</fpage>&#x02013;<lpage>424</lpage>. <pub-id pub-id-type="doi">10.3758/bf03205560</pub-id><pub-id pub-id-type="pmid">10723219</pub-id></citation></ref>
<ref id="B79"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sumby</surname> <given-names>W. H.</given-names></name> <name><surname>Pollack</surname> <given-names>I.</given-names></name></person-group> (<year>1954</year>). <article-title>Visual contribution to speech intelligibility in noise</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>26</volume>, <fpage>212</fpage>&#x02013;<lpage>215</lpage>. <pub-id pub-id-type="doi">10.1121/1.1907309</pub-id></citation></ref>
<ref id="B80"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Szinte</surname> <given-names>M.</given-names></name> <name><surname>Jonikaitis</surname> <given-names>D.</given-names></name> <name><surname>Rangelov</surname> <given-names>D.</given-names></name> <name><surname>Deubel</surname> <given-names>H.</given-names></name></person-group> (<year>2018</year>). <article-title>Pre-saccadic remapping relies on dynamics of spatial attention</article-title>. <source>Elife</source> <volume>7</volume>:<fpage>e37598</fpage>. <pub-id pub-id-type="doi">10.7554/elife.37598</pub-id><pub-id pub-id-type="pmid">30596475</pub-id></citation></ref>
<ref id="B82"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Treisman</surname> <given-names>A. M.</given-names></name></person-group> (<year>1964</year>). <article-title>The effect of irrelevant material on the efficiency of selective listening</article-title>. <source>Am. J. Psychol.</source> <volume>77</volume>, <fpage>533</fpage>&#x02013;<lpage>546</lpage>. <pub-id pub-id-type="doi">10.2307/1420765</pub-id><pub-id pub-id-type="pmid">14251963</pub-id></citation></ref>
<ref id="B83"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tsuchida</surname> <given-names>Y.</given-names></name> <name><surname>Katayama</surname> <given-names>J.</given-names></name> <name><surname>Murohashi</surname> <given-names>H.</given-names></name></person-group> (<year>2012</year>). <article-title>Working memory capacity affects the interference control of distractors at auditory gating</article-title>. <source>Neurosci. Lett.</source> <volume>516</volume>, <fpage>62</fpage>&#x02013;<lpage>66</lpage>. <pub-id pub-id-type="doi">10.1016/j.neulet.2012.03.057</pub-id><pub-id pub-id-type="pmid">22484011</pub-id></citation></ref>
<ref id="B84"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vergauwe</surname> <given-names>E.</given-names></name> <name><surname>Barrouillet</surname> <given-names>P.</given-names></name> <name><surname>Camos</surname> <given-names>V.</given-names></name></person-group> (<year>2010</year>). <article-title>Do mental processes share a domain-general resource?</article-title> <source>Psychol. Sci.</source> <volume>21</volume>, <fpage>384</fpage>&#x02013;<lpage>390</lpage>. <pub-id pub-id-type="doi">10.1177/0956797610361340</pub-id><pub-id pub-id-type="pmid">20424075</pub-id></citation></ref>
<ref id="B85"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vestergaard</surname> <given-names>M. D.</given-names></name> <name><surname>Fyson</surname> <given-names>N. R. C.</given-names></name> <name><surname>Patterson</surname> <given-names>R. D.</given-names></name></person-group> (<year>2011</year>). <article-title>The mutual roles of temporal glimpsing and vocal characteristics in cocktail-party listening</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>130</volume>, <fpage>429</fpage>&#x02013;<lpage>439</lpage>. <pub-id pub-id-type="doi">10.1121/1.3596462</pub-id><pub-id pub-id-type="pmid">21786910</pub-id></citation></ref>
<ref id="B86"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Walker</surname> <given-names>F.</given-names></name> <name><surname>Bucker</surname> <given-names>B.</given-names></name> <name><surname>Anderson</surname> <given-names>N. C.</given-names></name> <name><surname>Schreij</surname> <given-names>D.</given-names></name> <name><surname>Theeuwes</surname> <given-names>J.</given-names></name></person-group> (<year>2017</year>). <article-title>Looking at paintings in the vincent van gogh museum: eye movement patterns of children and adults</article-title>. <source>PLoS One</source> <volume>12</volume>:<fpage>e0178912</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0178912</pub-id><pub-id pub-id-type="pmid">28636664</pub-id></citation></ref>
<ref id="B87"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Warm</surname> <given-names>J. S.</given-names></name> <name><surname>Parasuraman</surname> <given-names>R.</given-names></name> <name><surname>Matthews</surname> <given-names>G.</given-names></name></person-group> (<year>2008</year>). <article-title>Vigilance requires hard mental work and is stressful</article-title>. <source>Hum. Factors J. Hum. Factors Ergon. Soc.</source> <volume>50</volume>, <fpage>433</fpage>&#x02013;<lpage>441</lpage>. <pub-id pub-id-type="doi">10.1518/001872008X312152</pub-id><pub-id pub-id-type="pmid">18689050</pub-id></citation></ref>
<ref id="B88"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weissman</surname> <given-names>D. H.</given-names></name> <name><surname>Roberts</surname> <given-names>K. C.</given-names></name> <name><surname>Visscher</surname> <given-names>K. M.</given-names></name> <name><surname>Woldorff</surname> <given-names>M. G.</given-names></name></person-group> (<year>2006</year>). <article-title>The neural bases of momentary lapses in attention</article-title>. <source>Nat. Neurosci.</source> <volume>9</volume>, <fpage>971</fpage>&#x02013;<lpage>978</lpage>. <pub-id pub-id-type="doi">10.1038/nn1727</pub-id><pub-id pub-id-type="pmid">16767087</pub-id></citation></ref>
<ref id="B89"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wiemers</surname> <given-names>E. A.</given-names></name> <name><surname>Redick</surname> <given-names>T. S.</given-names></name></person-group> (<year>2018</year>). <article-title>Working memory capacity and intra-individual variability of proactive control</article-title>. <source>Acta Psychol.</source> <volume>182</volume>, <fpage>21</fpage>&#x02013;<lpage>31</lpage>. <pub-id pub-id-type="doi">10.1016/j.actpsy.2017.11.002</pub-id><pub-id pub-id-type="pmid">29127776</pub-id></citation></ref>
<ref id="B90"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wood</surname> <given-names>N.</given-names></name> <name><surname>Cowan</surname> <given-names>N.</given-names></name></person-group> (<year>1995</year>). <article-title>The cocktail party phenomenon revisited: how frequent are attention shifts to one&#x02019;s name in an irrelevant auditory channel?</article-title> <source>J. Exp. Psychol. Learn. Mem. Cogn.</source> <volume>21</volume>, <fpage>255</fpage>&#x02013;<lpage>260</lpage>. <pub-id pub-id-type="doi">10.1037/0278-7393.21.1.255</pub-id><pub-id pub-id-type="pmid">7876773</pub-id></citation></ref>
<ref id="B91"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yi</surname> <given-names>A.</given-names></name> <name><surname>Wong</surname> <given-names>W.</given-names></name> <name><surname>Eizenman</surname> <given-names>M.</given-names></name></person-group> (<year>2013</year>). <article-title>Gaze patterns and audiovisual speech enhancement</article-title>. <source>J. Speech Lang. Hear. Res.</source> <volume>56</volume>, <fpage>471</fpage>&#x02013;<lpage>480</lpage>. <pub-id pub-id-type="doi">10.1044/1092-4388(2012/10-0288)</pub-id><pub-id pub-id-type="pmid">23275394</pub-id></citation></ref>
<ref id="B92"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zion Golumbic</surname> <given-names>E. M.</given-names></name> <name><surname>Cogan</surname> <given-names>G. B.</given-names></name> <name><surname>Schroeder</surname> <given-names>C. E.</given-names></name> <name><surname>Poeppel</surname> <given-names>D.</given-names></name></person-group> (<year>2013a</year>). <article-title>Visual input enhances selective speech envelope tracking in auditory cortex at a &#x0201C;Cocktail Party&#x0201D;</article-title>. <source>J. Neurosci.</source> <volume>33</volume>, <fpage>1417</fpage>&#x02013;<lpage>1426</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.3675-12.2013</pub-id><pub-id pub-id-type="pmid">23345218</pub-id></citation></ref>
<ref id="B93"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zion Golumbic</surname> <given-names>E. M.</given-names></name> <name><surname>Ding</surname> <given-names>N.</given-names></name> <name><surname>Bickel</surname> <given-names>S.</given-names></name> <name><surname>Lakatos</surname> <given-names>P.</given-names></name> <name><surname>Schevon</surname> <given-names>C. A.</given-names></name> <name><surname>McKhann</surname> <given-names>G. M.</given-names></name> <etal/></person-group>. (<year>2013b</year>). <article-title>Mechanisms underlying selective neuronal tracking of attended speech at a &#x0201C;cocktail party&#x0201D;</article-title>. <source>Neuron</source> <volume>77</volume>, <fpage>980</fpage>&#x02013;<lpage>991</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2012.12.037</pub-id><pub-id pub-id-type="pmid">23473326</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn id="fn0001"><p><sup>1</sup><ext-link ext-link-type="uri" xlink:href="http://www.icast.co.il">www.icast.co.il</ext-link></p></fn>
<fn id="fn0002"><p><sup>2</sup><ext-link ext-link-type="uri" xlink:href="http://unity3d.com">unity3d.com</ext-link></p></fn>
<fn id="fn0003"><p><sup>3</sup><ext-link ext-link-type="uri" xlink:href="http://fieldtriptoolbox.org">fieldtriptoolbox.org</ext-link></p></fn>
</fn-group>
</back>
</article>