# WHAT'S SHARED IN SHARING TASKS AND ACTIONS? PROCESSES AND REPRESENTATIONS UNDERLYING JOINT PERFORMANCE

EDITED BY : Motonori Yamaguchi, Timothy N. Welsh, Karl Christoph Klauer and Kerstin Dittrich PUBLISHED IN : Frontiers in Psychology

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use. ISSN 1664-8714 ISBN 978-2-88945-900-1 DOI 10.3389/978-2-88945-900-1


# WHAT'S SHARED IN SHARING TASKS AND ACTIONS? PROCESSES AND REPRESENTATIONS UNDERLYING JOINT PERFORMANCE

Topic Editors:

Motonori Yamaguchi, Edge Hill University, United Kingdom Timothy N. Welsh, University of Toronto, Canada Karl Christoph Klauer, Albert-Ludwigs-Universität Freiburg, Germany Kerstin Dittrich, Albert-Ludwigs-Universität Freiburg, Germany

Citation: Yamaguchi, M., Welsh, T. N., Klauer, K. C., Dittrich, K., eds. (2019). What's Shared in Sharing Tasks and Actions? Processes and Representations Underlying Joint Performance. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-900-1

# Table of Contents


Motonori Yamaguchi, Emma L. Clarke and Danny L. Egan

*The Influence of Co-action on a Simple Attention Task: A Shift Back to the Status Quo*

Jill A. Dosso, Kevin H. Roberts, Alessandra DiGiacomo and Alan Kingstone

*"Two Minds Don't Blink Alike": The Attentional Blink Does not Occur in a Joint Context*

Merryn D. Constable, Jay Pratt and Timothy N. Welsh

*Intimacy Effects on Action Regulation: Retrieval of Observationally Acquired Stimulus–Response Bindings in Romantically Involved Interaction Partners Versus Strangers*

Carina Giesen, Virginia Löhl, Klaus Rothermund and Nicolas Koranyi

*What's Shared in Movement Kinematics: Investigating Co-representation of Actions Through Movement*

Matilde Rocca and Andrea Cavallo


Francesca Ciardo and Agnieszka Wykowska

*The Social Situation Affects how we Process Feedback About our Actions*

Artur Czeszumski, Benedikt V. Ehinger, Basil Wahn and Peter König

# Editorial: What's Shared in Sharing Tasks and Actions? Processes and Representations Underlying Joint Performance

#### Motonori Yamaguchi<sup>1</sup>\*, Timothy N. Welsh<sup>2</sup>, Karl Christoph Klauer<sup>3</sup> and Kerstin Dittrich<sup>3</sup>

<sup>1</sup> Department of Psychology, Edge Hill University, Ormskirk, United Kingdom, <sup>2</sup> Faculty of Kinesiology & Physical Education, University of Toronto, Toronto, ON, Canada, <sup>3</sup> Department of Psychology, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany

Keywords: joint task, joint action, task sharing, co-representation, joint cognition

**Editorial on the Research Topic**

Edited and reviewed by:

Bernhard Hommel, Leiden University, Netherlands

\*Correspondence: Motonori Yamaguchi cog.yamaguchi@gmail.com; yamagucm@edgehill.ac.uk

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 06 March 2019 Accepted: 08 March 2019 Published: 02 April 2019

#### Citation:

Yamaguchi M, Welsh TN, Klauer KC and Dittrich K (2019) Editorial: What's Shared in Sharing Tasks and Actions? Processes and Representations Underlying Joint Performance. Front. Psychol. 10:659. doi: 10.3389/fpsyg.2019.00659

**What's Shared in Sharing Tasks and Actions? Processes and Representations Underlying Joint Performance**

# SIX BLIND MEN AND AN ELEPHANT

There is an ancient story about six blind men who met to discuss what an elephant is like. Each of them touched a different part of the elephant and formed an image of the animal from that part alone. The first man touched its leg and thought that an elephant was like a pillar; the second man touched its ear and thought that an elephant was like a fan; the third man touched its trunk and thought that an elephant was like a snake, and so on. Six blind men, six different images of what an elephant was like. The six men quarreled over the elephant until they realized that they had touched different parts of the same animal. How can these blind men ever learn what an elephant is really like? Doing so requires integrating other people's images of an elephant into their own, a process known as co-representation.

The story of six blind men and an elephant offers several morals. One of them is that it is very difficult to see the whole from its parts, especially when the parts are distributed among different individuals. Individual actors face this challenge when they share a task with co-actors, and it is important to understand the cognitive mechanisms that meet it. The present research topic aimed to bring together different approaches and perspectives to the study of sharing tasks and actions between co-acting individuals. It was hoped that these perspectives would collectively present a big picture that delineates the cognitive processes and representations underlying joint performance.

# THE BEGINNING: SHARING TASKS AND ACTIONS

The last decade has seen a surge of interest in experimental studies of joint task performance. These studies have suggested that actors who share a single task not only perform it together, but also share a mental representation of the whole task; that is, actors co-represent both their own and their co-actors' portions of the task. Initial evidence supporting the idea of co-representation was garnered through the joint Simon task (Sebanz et al., 2003), wherein pairs of actors divide the work involved in performing a choice reaction task. In the standard version of the Simon task with a single actor, response times (RTs) are shorter when the responses spatially correspond to the location of stimuli (e.g., pressing a left key in response to circles on the left side of a computer monitor) than when they do not (pressing a left key in response to circles on the right). This Simon effect disappears if the task setting is altered in such a way that spatial attributes of stimuli or choice of responses are eliminated. For example, in a go/nogo version of the task, the actor responds to one type of stimulus (e.g., red circles) by pressing a key and withholds responding to another type of stimulus (green circles). The Simon effect disappears in this task context because only a single response is involved in the task, so the spatial attribute is no longer used to represent the response and there is no response conflict to resolve. However, the Simon effect re-emerges when two actors perform the go/nogo version of the task together. In this joint Simon task, each co-actor operates one of the two response keys to respond to one type of stimulus, and is told to withhold a response when the stimulus assigned to their partner is presented. The critical finding of these studies is that RTs are still shorter if stimuli occur on the same side as the response location than if they occur on the opposite side. This finding has been used to argue that co-acting individuals co-represent (or share a mental representation of) the task.
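The Simon effect described above is a difference score, which can be illustrated with a minimal sketch. The function name, the dictionary keys, and the four example trials are hypothetical and made up for illustration; they are not taken from any study in this collection.

```python
from statistics import mean


def simon_effect(trials):
    """Mean RT on non-corresponding trials minus mean RT on corresponding trials.

    Each trial is a dict with hypothetical keys 'stimulus_side' and
    'response_side' ('left' or 'right') and 'rt' in milliseconds.
    A positive value means responses were slower when the stimulus
    appeared on the side opposite to the response.
    """
    corresponding = [t["rt"] for t in trials
                     if t["stimulus_side"] == t["response_side"]]
    noncorresponding = [t["rt"] for t in trials
                        if t["stimulus_side"] != t["response_side"]]
    if not corresponding or not noncorresponding:
        raise ValueError("need at least one trial of each type")
    return mean(noncorresponding) - mean(corresponding)


# Made-up illustrative data.
example_trials = [
    {"stimulus_side": "left", "response_side": "left", "rt": 400},
    {"stimulus_side": "right", "response_side": "right", "rt": 420},
    {"stimulus_side": "right", "response_side": "left", "rt": 440},
    {"stimulus_side": "left", "response_side": "right", "rt": 460},
]
print(simon_effect(example_trials))
```

In a joint Simon task, the same score would be computed separately for each co-actor, whose `response_side` is fixed by the key they operate.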

# THIRTEEN ARTICLES, THIRTY-SEVEN AUTHORS, AND ONE CONCLUSION?

More than a decade after the original study, the joint Simon task remains a popular paradigm of task sharing, but it is now used in more complex situations that involve multiple modalities (Dolk and Liepelt) or a large number of display items (Baess et al.) and with more elaborate measures, such as autocorrelation (Ciardo and Wykowska) and sequential modulations (Mendl et al.) of RTs as well as event-related potentials (Michel et al.). Other behavioral paradigms have also been developed to study interpersonal phenomena in a joint task setting, such as attentional blink in the rapid sequential visual presentation (Constable et al.), four alternative forced choice (Czeszumski et al.), line bisection (Dosso et al.), stimulus and response priming in a prime-probe task (Giesen et al.), and Stroop interference (Yamaguchi et al.). As in the Simon task, these paradigms measure discrete actions (e.g., pressing a key), but paradigms that require continuous actions have also made important contributions to our understanding of joint performance (Ray and Welsh; Rocca and Cavallo; Wahn et al.).

Studies of task sharing now demonstrate a variety of issues in joint tasks and actions. Several groups investigated the influences of interpersonal relationships on joint tasks and actions (Ciardo and Wykowska; Czeszumski et al.; Giesen et al.; Mendl et al.) while others examined the influences of joint settings on the frame of reference (Baess et al.; Dolk and Liepelt; Dosso et al.; Ray and Welsh). Although most studies in this collection focused on co-representation (integration) between co-acting individuals, others pointed out the importance of a division of labor in task sharing (Constable et al.; Wahn et al.; Yamaguchi et al.). The neural basis of joint task performance is still an underinvestigated area of study (Czeszumski et al.; Michel et al.) that requires further development in future research.

The present collection includes 13 articles by 37 authors. What do these studies tell us about task sharing? In a nutshell, most studies in the present collection found little evidence for co-representation (Baess et al.; Constable et al.; Dolk and Liepelt; Dosso et al.; Michel et al.; Yamaguchi et al.) or limited support for co-representation that was conditional on the interpersonal relationship with the co-acting partners (Ciardo and Wykowska; Giesen et al.; Czeszumski et al.; Mendl et al.; Ray and Welsh).

After the initial demonstration of co-representation (Sebanz et al., 2003), a large number of studies have explored conditions under which co-representation occurs or does not occur (e.g., Welsh et al., 2009; Dittrich et al., 2013; Yamaguchi et al., 2018). These efforts have enriched the empirical ground to understand cognitive processes and representations underlying joint task performance, and alternative accounts of task sharing have been proposed (e.g., Dittrich et al., 2012; Dolk et al., 2014; Prinz, 2015; Yamaguchi et al., 2019). The present collection adds further empirical evidence to aid such efforts. Its findings imply that the six blind men still have difficulty seeing what a real elephant is like, but we have started to understand why it is so difficult to see the elephant.

# AUTHOR CONTRIBUTIONS

MY wrote the first draft, and all authors contributed to revisions and approved the final version for publication.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Yamaguchi, Welsh, Klauer and Dittrich. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# When a Social Experimenter Overwrites Effects of Salient Objects in an Individual Go/No-Go Simon Task – An ERP Study

René Michel<sup>1</sup>\*, Jens Bölte<sup>1</sup> and Roman Liepelt<sup>1,2</sup>\*

<sup>1</sup> Institute of Psychology, University of Münster, Münster, Germany, <sup>2</sup> Institute of Psychology, German Sport University Cologne, Cologne, Germany

#### Edited by:

Timothy N. Welsh, University of Toronto, Canada

#### Reviewed by:

Cristina Iani, Università degli Studi di Modena e Reggio Emilia, Italy Luisa Lugli, Università degli Studi di Bologna, Italy Olav Krigolson, University of Victoria, Canada

\*Correspondence:

René Michel r.michel@uni-muenster.de Roman Liepelt r.liepelt@dshs-koeln.de

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 20 September 2017 Accepted: 18 April 2018 Published: 17 May 2018

#### Citation:

Michel R, Bölte J and Liepelt R (2018) When a Social Experimenter Overwrites Effects of Salient Objects in an Individual Go/No-Go Simon Task – An ERP Study. Front. Psychol. 9:674. doi: 10.3389/fpsyg.2018.00674

When two persons share a Simon task, a joint Simon effect occurs. The task co-representation account assumes that the joint Simon effect is the product of a vicarious representation of the co-actor's task. In contrast, recent studies show that even (nonhuman) event-producing objects can elicit a Simon effect in an individual go/no-go Simon task, arguing in favor of the referential coding account. For the human-induced Simon effect, a modulation of the P300 component in electroencephalography (EEG) is typically considered a neural indicator of the joint Simon effect and task co-representation. Showing that the object-induced Simon effect also modulates the P300 would lead to a re-evaluation of the interpretation of the P300 in individual go/no-go and joint Simon task contexts. To this end, the present study conceptually replicated Experiment 1 from Dolk et al. (2013a), adding EEG recordings and an experimenter controlling the EEG computer, to test whether a modulation of the P300 can also be elicited by adding a Japanese waving cat to the task context. Subjects performed an individual go/no-go Simon task with or without a cat placed next to them. Results show an overall Simon effect regardless of the cat's presence and no modulatory influence of the cat on the P300 (Experiment 1), even when conceivably interfering context factors are diminished (Experiment 2). These findings may suggest that the presence of a spatially aligned experimenter in the laboratory produces an overall Simon effect that overwrites a possible modulation by the Japanese waving cat.

Keywords: Simon effect, EEG, joint action, action perception, referential coding, compatibility effect

# INTRODUCTION

Coordinating human interaction is one of the challenges of daily life. Even simple activities such as carrying furniture together require precise coordination of one's own actions with those of co-actors (van der Wel et al., 2016). Self-produced actions and perceived actions are mentally represented in a functionally similar way (Prinz, 1997; Hommel et al., 2001), a perspective that follows from the logic of common coding between perception and action planning and control (Prinz, 1997). An often-used task to test the concomitant interplay between perception and action in a shared task context is a modified version of the Simon task (Simon and Rudell, 1967; Lu and Proctor, 1995).

In the standard Simon task, a participant is asked to respond to a non-spatial, dichotomous stimulus attribute (e.g., color: red/green) with two spatially arranged response buttons (e.g., horizontally: left/right) while ignoring a task-irrelevant spatial stimulus dimension (e.g., stimulus location: left/right). According to the dimensional overlap model (Kornblum et al., 1990), the stimulus location primes the spatially compatible response. This results in faster and more accurate responses on compatible trials (required response and stimulus on the same side) than on incompatible trials (spatial location of response and stimulus differ), which elicit a response conflict that requires additional time to be resolved (De Jong et al., 1994; Nicoletti and Umiltà, 1994; Hommel et al., 2001). This compatibility effect is the so-called Simon effect (Simon and Rudell, 1967; Hedge and Marsh, 1975).

In a variant of the standard Simon task, the individual go/no-go Simon task, subjects are asked to respond only to one of the two stimulus attributes (e.g., respond to a green stimulus; do not respond to a red stimulus; Sebanz et al., 2003). Here, the Simon effect is typically absent (but see Stenzel and Liepelt, 2016b). It is argued that there is no stimulus-response compatibility because the actor's response is not spatially coded (Hommel, 1996; Sebanz et al., 2003, 2006; Liepelt et al., 2011, 2013; Kiernan et al., 2012; Dolk et al., 2013a).

Sebanz et al. (2003) developed the joint Simon task to test the impact of another person's action on one's own task performance during joint action. Two participants performed a standard Simon task together, sitting side by side. As in the individual go/no-go Simon task, each participant responded to only one of the two stimulus attributes (Sebanz et al., 2003, 2006). Although, when regarded separately, each participant performed an individual go/no-go Simon task (which normally does not elicit a Simon effect), the Simon effect re-appeared in this joint setup and is therefore called the joint Simon effect (Hommel et al., 2001; Sebanz et al., 2003, 2005, 2006; Tsai et al., 2006; Tsai and Brass, 2007; Kiernan et al., 2012; Welsh et al., 2013a).

Sebanz et al. (2003, 2006) explained the joint Simon effect by assuming an automatic representation of our co-actors' actions and tasks. The task co-representation account implies that merely seeing the stimuli relevant for a co-actor already activates the required action of our interaction partner based on the knowledge about his/her task rules, stressing the social aspect of the effect (Tsai and Brass, 2007). As one's own and others' actions are mentally represented in a functionally similar way (Sebanz et al., 2006), the concept of common coding (Prinz, 1997) is extended to entire tasks. The joint representation of both task shares (own plus other half of the Simon task) evokes a mental representation of an entire Simon task. Given this shared representation, a spatially driven stimulus-response compatibility effect emerges, as if, e.g., the action of the partner to my left were represented like my left response hand in the standard Simon task.

The task co-representation account assumes that shared representations measured by the joint Simon effect reflect the basis for social interaction (Knoblich and Sebanz, 2006), as the effect was found to be mediated by social factors like group membership and cooperative or competitive relationship of the co-actors (Hommel et al., 2009; Ruys and Aarts, 2010; Iani et al., 2011).

However, studies with non-social set-ups, e.g., with robots or programmed wooden hands, are inconclusive with respect to the question of whether an interaction partner always needs to be socially encoded (Tsai and Brass, 2007; Müller et al., 2011; Stenzel et al., 2012, 2013; Stenzel and Liepelt, 2016b; Puffe et al., 2017). Furthermore, the size of joint and individual go/no-go Simon effects seems to depend on agency cues like human body form (Tsai and Brass, 2007), ostensive cues like turn-taking characteristics of the response (Stenzel and Liepelt, 2016b), and the exact task conditions, showing some dependence on stimulus modality (Lien et al., 2016; Puffe et al., 2017). Additional factors that influence the presence of a Simon effect in an individual go/no-go Simon task are related to the degree to which participants spatially code their responses (Dittrich et al., 2012, 2013). Enhanced spatial response coding may be achieved, for example, by using different hand positions (Liepelt, 2014) or by responding with pointing actions (Porcu et al., 2016), as well as by decreasing the spatial distance between two actors so that the other person's action moves from extrapersonal space into peripersonal space (Guagnano et al., 2010).

Due to the increased number of findings showing a Simon effect in individual go/no-go Simon task settings, a new account has been proposed: the referential coding account (Weeks et al., 1995; Vlianic et al., 2010; Dolk et al., 2011, 2013a,b). Its theoretical grounding is the theory of event coding (TEC; Hommel et al., 2001). According to TEC, actions are mentally represented by a bundle of feature codes representing a combination of their attributes (e.g., spatial orientation, sound, color, form, etc.). Based on early assumptions of ideomotor theory (Lotze, 1852; James, 1890), these feature codes resemble those perceptual events that typically follow the action in the outside world. The more attributes internal and external events share, the more likely they are to activate each other. High similarity between perceived events and events used for action control increases self-other integration (Prinz, 2005; Dolk and Prinz, 2016). The referential coding account explains the joint Simon effect by assuming a discrimination problem between externally perceived and internally activated events (Dolk et al., 2014): the higher the similarity between internal and external events (i.e., the more features they share), the harder the discrimination problem. To resolve it, an actor must focus on task features that best distinguish own from other events in a given task context. Spatial orientation can serve as such a discriminating feature (Miller et al., 2011), but depending on task context other features such as color (Sellaro et al., 2015) or valence (Stenzel and Liepelt, 2016a) can be used as well to resolve the discrimination problem.

According to the referential coding account, individual go/no-go Simon effects occur when an event-producing object shares enough attributes with the participant's action (e.g., a clicking sound representing an auditory effect of an action) and when two actors produce events in relative spatial proximity. Thereby, in principle the referential coding account is able to explain the presence of a joint Simon effect produced by a social co-actor and non-socially produced Simon effects produced by objects such as a Japanese waving cat or a metronome (Dolk et al., 2013a) parsimoniously by applying the same basic mechanism.

To investigate the neural mechanisms underlying the Simon effect, the EEG is an appropriate method providing a high temporal resolution (for an overview see Leuthold, 2011). The P300 is a positive component at parietal electrodes with a latency of 250 to 500 ms after stimulus onset. It serves commonly as a relative measure for stimulus evaluation (Kutas et al., 1977; Magliero et al., 1984; Kok, 2001), functioning as a mediator between perceptual analysis and response preparation (Verleger et al., 2005) as well as an indicator for action control (Fallgatter and Strik, 1999). Using visual stimuli in a standard Simon task, the stimulus-response compatibility has been shown to influence the amplitude and the latency of the P300 (Ragot and Renault, 1981; Magliero et al., 1984; Renault et al., 1988; Valle-Inclán, 1996; Zhou et al., 2004). Regarding individual go/no-go Simon tasks, no-go-trials in contrast to go-trials show larger amplitudes and longer latencies for the P300, which provides evidence for its involvement in response inhibition (Roberts et al., 1994; Falkenstein et al., 1995; Bokura et al., 2001; Tekok-Kilic et al., 2001).
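As a rough illustration of how a component defined by a latency window can be quantified, the sketch below averages the amplitude of a waveform between 250 and 500 ms after stimulus onset. Mean amplitude in a fixed window is only one common way to score an ERP component and is an assumption here, not necessarily the scoring used in the studies cited above. The function name and the two short waveforms are hypothetical, made-up values standing in for an averaged signal at a single parietal electrode.

```python
def mean_amplitude(times_ms, amplitudes_uv, start_ms=250, end_ms=500):
    """Average the amplitude of all samples with start_ms <= time <= end_ms.

    times_ms and amplitudes_uv are equal-length sequences describing one
    averaged waveform (time relative to stimulus onset, amplitude in microvolts).
    """
    window = [a for t, a in zip(times_ms, amplitudes_uv)
              if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the window")
    return sum(window) / len(window)


# Made-up illustrative waveforms sampled every 100 ms.
times = [0, 100, 200, 300, 400, 500, 600]
nogo_wave = [0.0, 1.0, 2.0, 6.0, 8.0, 4.0, 1.0]
go_wave = [0.0, 1.0, 2.0, 3.0, 5.0, 1.0, 0.0]

nogo_p300 = mean_amplitude(times, nogo_wave)
go_p300 = mean_amplitude(times, go_wave)
print(nogo_p300 - go_p300)  # a positive value means a larger no-go amplitude
```

Real recordings are sampled far more densely and are typically baseline-corrected before scoring; those steps are omitted to keep the sketch short.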

Sebanz et al. (2006) investigated this no-go P300 effect in a joint Simon task contrasting a group condition (two participants in a joint Simon task) with an individual condition (one participant performing an individual go/no-go Simon task). A Simon effect was found only in the group condition. Further, a higher P300 amplitude on no-go-trials in the group as compared to the individual condition was interpreted as an indication of the joint Simon effect and task co-representation. To confirm the referential coding account's postulation that human- and object-induced Simon effects have the same underlying mechanisms, a study with an object-induced Simon effect investigating the no-go P300 is needed. If the postulation is correct, the no-go P300 effect found by Sebanz et al. (2006) should also be observed for an object-induced Simon effect. For this investigation, the experimental setup used in Experiment 1 of Dolk et al. (2013a) qualifies best: they asked participants to perform an auditory individual go/no-go Simon task with or without sitting next to a Japanese waving cat. In contrast to the cat absent condition, a Simon effect occurred in the cat present condition.

Lien et al. (2016) already adopted this Japanese waving cat manipulation used in Experiment 1 by Dolk et al. (2013a) and added EEG recordings. In two experiments, subjects performed subsequently both a standard and a go/no-go Simon task with or without the cat placed next to them and with auditory (Experiment 1) or visual stimuli (Experiment 2). In contrast to Dolk et al. (2013a), Lien et al. (2016) used pitched tones instead of reversed Dutch words as auditory stimuli and red or green colored points presented within a picture of a hand pointing to the left, right or central direction as visual stimuli. Whereas a Simon effect was found for the standard Simon task independent from cat presence, for the go/no-go task a Simon effect was only observed in the cat present condition when using auditory stimuli but not when using visual stimuli. Regarding EEG, they found a modulation of the lateralized readiness potentials (LRPs) induced by the cat in the go/no-go task only for auditory stimuli. As they used LRPs as a neuronal indicator instead of the P300 used for human-induced Simon effects by Sebanz et al. (2006), the question whether object-induced Simon effects also elicit such a P300 effect still needs neuropsychological confirmation.

The present study therefore aims to provide this pending evidence by replicating Experiment 1 of Dolk et al. (2013a) with added EEG recordings to investigate the P300, which was previously taken as an indicator for a joint Simon effect in humans (Sebanz et al., 2006). We tested whether a Japanese waving cat elicits a joint Simon effect (cat present condition) and compared the participant's performance to an individual go/no-go Simon task (cat absent condition). Additionally, we contrasted the P300 on no-go-trials in the cat present and cat absent condition. Deviating from Dolk et al. (2013a), visual instead of auditory stimuli were presented for a better comparability of the P300 with Sebanz et al. (2006).

Based on the referential coding account, we predict (1) a larger Simon effect in the cat present condition as compared to the cat absent condition and (2) a significantly increased (more positive) amplitude for the P300 component for the cat present condition compared to the cat absent condition in the collected EEG data. In contrast, based on the task co-representation account, we predict neither (1) a behavioral Simon effect in the cat present or cat absent condition nor (2) a compatible/incompatible P300 difference corresponding to the Simon effect.

# EXPERIMENT 1

# Method

# Participants

Twenty-four participants (12 female) aged 20 to 30 years (M = 23.08, SD = 2.22) took part in the experiment<sup>1</sup>. Nineteen of them were psychology students. All participants were right-handed and had normal or corrected-to-normal vision. All participants gave their written informed consent to participate in the study, which was conducted in accordance with the ethical standards laid down in the World Medical Association Declaration of Helsinki (2013) and approved by the ethical committee of the University of Muenster. Participants with psychiatric diseases, severe head injuries in the past or metallic cranial implants were excluded with the help of a screening questionnaire. Participating students received course credit.

#### Material

The participant sat on a fixed chair in front of the right edge of the screen. A fixed button was placed in front of the participant. A Japanese waving cat (height: 12.5 cm, width: 9 cm, depth: 7 cm, see **Figure 1**) facing the subject was placed to the left of the participant in the cat present condition (for the entire task arrangement see **Figure 1**). The cat's left arm waved at a steady frequency of 0.4 Hz and a movement angle of 50° in the vertical plane. While waving, the cat produced a steady clicking sound.

The participant was instructed to place the right hand flat on the table while putting the index finger on the button. The left hand was placed on the left upper leg during the whole experiment. The laboratory was slightly dimmed, and the examiner controlling the EEG measurement was positioned out of the participant's field of view, two meters away on the left side.

<sup>1</sup>An a priori power analysis was conducted with G*Power 3.0 using the effect sizes reported in the study of Dolk et al. (2013a), indicating that 23 subjects were required to reach a statistical power of 0.90 at an alpha level of 0.05.

#### Procedure

The participant's task was to push the button as quickly and accurately as possible only when a blue dot (diameter = 2.2 cm) was presented. A yellow dot (diameter = 2.2 cm) was used as the stimulus in the no-go-trials.

After eight warm-up trials, the actual experiment with eight blocks containing 64 trials each was initiated. Each block consisted of 32 go- and 32 no-go-trials, with half of the trials being response-compatible (stimulus on the right side of the screen) and half response-incompatible (stimulus on the left side of the screen). Within each block the trial sequence was randomized. Each block was followed by a short break of 1.5 min. The cat was present either in the first or in the second half of the experiment, assigned randomly but counterbalanced over all participants. Both the cat present and the cat absent condition were preceded by an instruction; the two instructions differed only in introducing the Japanese waving cat.
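The block structure described above can be sketched in a few lines of Python. This is only an illustration, not the presentation software used in the study. The function name `make_block` and the seed are hypothetical, and the sketch assumes that go- and no-go-trials were each split evenly between the two stimulus positions, which the text implies but does not state explicitly.

```python
import random
from collections import Counter

def make_block(rng, n_trials=64):
    """Build one randomized block (hypothetical helper).

    Assumption: go and no-go trials are each split evenly between the
    right (response-compatible) and left (response-incompatible) position.
    """
    per_cell = n_trials // 4  # 16 trials per trial type x position cell
    block = [
        (trial_type, side)
        for trial_type in ("go", "no-go")
        for side in ("right", "left")
        for _ in range(per_cell)
    ]
    rng.shuffle(block)  # randomize the trial sequence within the block
    return block

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
blocks = [make_block(rng) for _ in range(8)]
total_trials = sum(len(block) for block in blocks)  # 8 blocks x 64 trials = 512
cell_counts = Counter(blocks[0])
```

Under these assumptions each participant completes 512 experimental trials, which is the total used as the denominator for the error rates in **Table 1**.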

Each go-trial started with the sole presentation of a fixation cross (200 ms, 0.6 cm × 0.6 cm). Then, along with the fixation cross, a blue dot (to the left or right of the fixation cross, distance = 5.8 cm) was presented for 500 ms. If the response button was pressed within the 500 ms, the fixation cross and the blue dot disappeared immediately. Each no-go-trial started with the sole presentation of a fixation cross for 200 ms followed by the combined presentation of the fixation cross and a yellow dot to the left or right of it (distance = 5.8 cm). The yellow dot's initial presentation duration was 350 ms; it was then adjusted to the participant's reaction time (RT) by setting the preceding no-go stimulus presentation duration off against the participant's last RT.

Time out was set to 1000 ms, after which the participant received the feedback "too slow." False positive answers led to the feedback "mistake." The whole procedure took about 40 min. Finally, participants completed a questionnaire assessing the extent to which the Japanese waving cat attracted the participants' attention or was perceived as an object (instead of human-like).

#### EEG Measurement

EEG was recorded with ASA© (Advanced Source Analysis, ANT Neuro, Enschede, Netherlands) with a 32-electrode configuration of a 64 ANT-Waveguard cap (10–20 system). Resistance was kept below 5 kΩ. The signal was amplified (ExG 20x, fixed = 5 mV/V) and recorded continuously during the whole experiment with an average reference and a lowpass-Butterworth-filter (half-power cut off = 0.27 × sampling frequency) and a sampling frequency of 256 Hz. Vertical EOG was measured by placing a bipolar electrode beneath and above the left eye. Horizontal EOG was measured by placing a bipolar electrode at the outer canthus of each eye. AFz was used as ground electrode.

TABLE 1 | Mean reaction time (RT) in milliseconds (trimmed 10%) and error rates (ER) in percentages per cat presence, compatibility and experiment.

Error rates were calculated as percentage of all 512 trials. Standard deviations for RTs are shown in parentheses.

#### EEG Preprocessing

The continuous data were filtered in ASA© (version 4.8.1) with a half-power Butterworth-bandpass filter (0.1–20 Hz, 24 dB/oct) based on the FFT-method. Noisy channels were interpolated. For artifact correction, a principal component analysis (PCA; Ille et al., 2002) was implemented based on manually marked artifacts. In EEGLAB (version 12.0.2.06b, MATLAB R2012b) the signal was down-sampled to 128 Hz sampling frequency and re-referenced to the mastoid electrodes (M1 and M2). The signal was epoched (200 ms before, 500 ms after stimulus onset) along with a baseline correction (200 ms before stimulus onset). Epochs with artifacts (threshold = ±75 µV) were excluded. In ERPLAB (version 4.0.2.3) only errorless and artifact-free epochs were averaged to event-related potentials (ERPs) separately for each condition.
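The epoching, baseline correction and threshold rejection steps can be illustrated with a minimal single-channel sketch in plain Python. It does not reproduce the EEGLAB or ERPLAB routines that were actually used. All function names are hypothetical. The sketch assumes that the threshold is applied to the baseline-corrected epoch and that the 200 ms and 500 ms windows are rounded to the nearest sample at 128 Hz.

```python
FS = 128                  # Hz, sampling rate after down-sampling
PRE_MS, POST_MS = 200, 500
THRESHOLD_UV = 75.0       # absolute rejection threshold in microvolts

def ms_to_samples(ms, fs=FS):
    """Convert a duration in ms to a sample count (rounded; an assumption)."""
    return round(ms * fs / 1000)

def extract_epoch(signal, onset, fs=FS):
    """Cut one epoch around the stimulus onset (a sample index) and
    subtract the mean of the pre-stimulus baseline from every sample."""
    pre = ms_to_samples(PRE_MS, fs)
    post = ms_to_samples(POST_MS, fs)
    epoch = signal[onset - pre: onset + post]
    baseline = sum(epoch[:pre]) / pre
    return [value - baseline for value in epoch]

def is_clean(epoch, threshold=THRESHOLD_UV):
    """True if no sample exceeds the absolute amplitude threshold."""
    return all(abs(value) <= threshold for value in epoch)

def average_erp(epochs):
    """Point-wise average of the artifact-free epochs of one condition."""
    clean = [e for e in epochs if is_clean(e)]
    return [sum(samples) / len(clean) for samples in zip(*clean)]
```

In the actual pipeline these steps operate on all channels at once, and epochs from error trials are additionally discarded before averaging.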

# Results

Two participants had to be excluded from all further analyses (one because the mean RT was twice as high as for the rest of the participants, the other one due to EEG recording problems) leading to a sample size of 22 participants.

#### Behavioral Measurement

R (version 3.3.2) was used for statistical analysis. Analysis of error rates showed a mean error rate below 1% (for a detailed overview of RTs and error rates see **Table 1**); all error related trials were excluded from further analysis. For the following analysis, trimmed means (10% trim) of RT of correct go-trials were taken as dependent variable.
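A 10% trimmed mean can be sketched as follows. The analysis itself was run in R. This Python version assumes the convention of R's `mean(x, trim = 0.1)`, in which 10% of the observations are dropped from each end of the sorted values. The text does not state which convention was used, and the RT values below are invented purely for illustration.

```python
def trimmed_mean(values, trim=0.10):
    """Mean after dropping a fraction `trim` of the observations from
    each end of the sorted sample (assumed convention, see lead-in)."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)           # observations cut per tail
    kept = ordered[k: len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical RTs in ms with one fast and one very slow response
example_rts = [250, 280, 290, 295, 300, 305, 310, 320, 330, 900]
plain = sum(example_rts) / len(example_rts)   # 358.0, pulled up by the outlier
robust = trimmed_mean(example_rts)            # 303.75
```

Trimming limits the influence of occasional very fast or very slow responses on the condition means that enter the ANOVA.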

For a repeated-measures analysis of variance (ANOVA), mean RTs (10% trim) were calculated for each combination of the variables compatibility (compatible vs. incompatible) and cat presence (cat present vs. cat absent). An ANOVA<sup>2</sup> including the within-subject factors compatibility and cat presence showed a significant main effect for compatibility, F(1,21) = 7.59, p = 0.01, η<sup>2</sup><sub>g</sub> = 0.01 (see **Figure 2**) with a facilitation for compatible trials, M = 297 ms, compared to incompatible trials, M = 302 ms (compatibility effect = 6 ms). The interaction compatibility × cat presence was not significant, F(1,21) < 1.

### EEG Analysis

For seven participants 1–4 channels were interpolated. Based on artifact detection for the preprocessed data, on average 0.8% of the trials per participant had to be excluded, SD = 1.79, maximum = 6.8%. Remaining trials were averaged to ERPs across the factors compatibility, cat presence and go/no-go.

<sup>2</sup>To control for order effects, the same ANOVA with the additional factor order of presentation (cat presented in first half of experiment vs. cat presented in second half of experiment) only revealed a significant interaction cat presence × order of presentation, F(1,20) = 43.41, p < 0.001, η<sup>2</sup><sub>g</sub> = 0.07, with faster RT for the cat absent condition when it was presented in the first half (295 ms, SD = 10.5; vs. cat present: 311 ms, SD = 10.7) and faster RT in the cat present condition when it was presented in the second half (286 ms, SD = 13.4; vs. cat absent: 306 ms, SD = 13.6). The main effect order of presentation, the interaction order of presentation × compatibility, both F(1,20) < 1, and order of presentation × cat presence × compatibility, F(1,20) = 2.56, p = 0.125, were not significant.

To investigate the main effect of cat presence on the P300 component for no-go-trials, a repeated measure, two-tailed cluster-based permutation test was calculated for a time window from 300 to 500 ms after stimulus onset. There were 2500 random permutations for each participant (Bullmore et al., 1999; Groppe et al., 2011). This resulted in 1530 tests (over 30 electrodes and 51 time points). To achieve an overall alpha-level of 0.05, a test-wise alpha-level of 0.00033 was applied. Electrodes were considered as spatial neighbors within a radius of approximately 5.44 cm, leading to clusters with a mean of 2.7 neighboring electrodes, SD = 1.2. The main effect cat presence for no-go-trials was not significant, p-values ≥ 0.56 (see **Figure 3** for corresponding waveforms). The same tests were calculated for the main effect cat presence for both go- and no-go-trials within an interval from 0 to 500 ms to cover the whole epoch, leading to 3840 comparisons with a test-wise alpha-level of 0.00013, but no cluster reached significance, p-values ≥ 0.21.
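The logic of a repeated-measures permutation test can be illustrated for a single electrode and time point. The sketch below is a deliberate simplification and not the cluster-based procedure used here: it omits the spatial neighborhood and the clustering step entirely. It assumes a sign-flipping scheme on the within-participant difference scores (cat present minus cat absent), which is a common way to permute paired data. The function names and the seed are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

def paired_t(diffs):
    """One-sample t statistic of within-participant difference scores."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def sign_flip_p_value(diffs, n_perm=2500, seed=0):
    """Two-tailed permutation p-value for one electrode and time point.

    Under the null hypothesis the sign of each participant's difference
    is arbitrary, so the signs are flipped at random to build the null
    distribution of |t| (an assumed permutation scheme).
    """
    rng = random.Random(seed)
    observed = abs(paired_t(diffs))
    exceed = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(paired_t(flipped)) >= observed:
            exceed += 1
    # Count the observed data as one permutation so that p is never 0
    return (exceed + 1) / (n_perm + 1)
```

A cluster-based variant would additionally group neighboring electrodes and adjacent time points whose statistics exceed a threshold and compare cluster-level statistics against their permutation distribution.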

To investigate a main effect of compatibility, a repeated measure, two-tailed cluster-based permutation test was calculated as described above with the following changes: To detect compatibility effects at all stages of the reaction process, an interval from 0 to 500 ms after stimulus onset was used, leading to 3840 comparisons (over 30 electrodes and 128 time points). To achieve an overall alpha-level of 0.05, a test-wise alpha-level of 0.00013 was applied. The main effect of compatibility was significant with higher amplitudes for incompatible trials. The effect was present in the entire left hemisphere within a time interval of 100 to 150 ms. The peak was located in the parietal and centro-parietal area with the smallest significant t-value t(21) = −2.09 and significant corrected p-values of 0.0016 (see **Figures 4**, **5**). The opposite effect in the right hemisphere, with larger amplitudes for compatible trials than for incompatible trials within the same time window, did not reach significance.
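The number of tests reported for each analysis window follows from multiplying the number of electrodes by the number of time points. As a worked check, the sketch below reproduces both counts under the assumption that time points are counted at a sampling rate of 256 Hz. This rate is inferred from the reported numbers of 51 and 128 time points and is not stated explicitly for this analysis step. The function names are hypothetical.

```python
SAMPLING_RATE_HZ = 256   # assumed rate; it reproduces the reported counts
N_ELECTRODES = 30

def n_time_points(start_ms, end_ms, fs=SAMPLING_RATE_HZ):
    """Number of whole samples inside the analysis window."""
    return (end_ms - start_ms) * fs // 1000

def n_tests(start_ms, end_ms, n_electrodes=N_ELECTRODES):
    """One test per electrode and time point."""
    return n_electrodes * n_time_points(start_ms, end_ms)

p300_window_tests = n_tests(300, 500)   # 30 x 51 = 1530
full_epoch_tests = n_tests(0, 500)      # 30 x 128 = 3840
```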

## Discussion

Experiment 1 of the present study aimed to replicate the object-induced joint Simon effect found by Dolk et al. (2013a) and the P300 effect induced by a human co-actor (Sebanz et al., 2006) using an individual go/no-go Simon task with visual stimuli.

Behavioral data showed a (1) main effect of compatibility with faster RTs for compatible trials than for incompatible trials but (2) no significant interaction between cat presence and compatibility. Regarding the ERPs, there was also a (3) main effect of compatibility located in the (centro-) parietal left hemisphere within a time interval of 100 to 150 ms with larger amplitudes for incompatible trials than for compatible trials. Regarding no-go-trials, there was (4) no significant modulation effect of cat presence on the P300 component.

The (1) main effect of compatibility with faster RTs for compatible trials than for incompatible trials indicates the presence of a Simon effect (Simon and Rudell, 1967) in the cat present as well as in the cat absent condition. The EEG data provide a neuronal correlate for this omnipresent Simon effect in the form of the (3) early main effect of compatibility located in the (centro-) parietal left hemisphere. A similar early activation pattern for a compatibility effect was found by Valle-Inclán (1996) as well as Wascher and Wauschkuhn (1996) using a standard Simon task. In the present study, the Simon effect was not modulated by the Japanese waving cat, which was indicated by the non-significant interaction of cat presence and compatibility (2). The compatibility effect was not moderated by any effect of order of presentation. In accordance with our behavioral findings, there was (4) no significant modulatory effect of cat presence on the P300 for no-go-trials in the EEG data. In summary, the Japanese waving cat failed to modulate action inhibition despite the presence of an overall compatibility effect in an individual go/no-go Simon task with visual stimuli, which was contrary to our prediction.

Why did we observe a Simon effect even in the cat absent condition? Neither the referential coding account (Dolk et al., 2013a) nor the task-co-representation account (Sebanz et al., 2006) can easily explain this pattern of Simon effects. Referential coding can explain the finding of a Simon effect in the cat present condition, but not in the cat absent condition, whereas task-co-representation fails to explain the finding of a Simon effect in both conditions because of a missing social co-actor.

One might speculate that the time-consuming preparation of the EEG measurement by the examiner and the examiner's presence throughout the whole experiment could have evoked some kind of examiner effect. The examiner was located two meters left of the participant to control the EEG recording on a separate computer, executing some mouse clicks or taking notes, which may have served as visual or auditory events that attracted the actor's attention. If so, according to referential coding, one would need to assume that the experimenter's actions on the participant's left side forced the participant to spatially code their own action as right throughout the entire experiment, which may have been a stronger effect than that of the presence of the Japanese cat itself.

Another explanation may arise from the no-go-stimuli's presentation time: Sebanz et al. (2006) and Dolk et al. (2013a) worked with fixed presentation times matched to the maximum presentation time of go stimuli. Note that the go stimulus presentation duration is often shorter than the maximum presentation duration because the stimulus disappears as soon as the participant reacts. In the present study, the presentation time of no-go-stimuli was matched to the participant's RT in go-trials to achieve a better comparability of go- and no-go-trials. This matching might have changed the task structure for participants.

The cross modality of the visual stimuli and the primarily auditory events produced by the Japanese waving cat might also have influenced task performance. Dolk et al. (2011) used auditory stimuli, which did not cause any cross modality. In addition, participants had to discriminate between auditory go- and no-go-stimuli, shifting attention to the auditory system. Discriminating auditory events (between the clicking sound produced by the cat and one's own button press) was mandatory for task achievement. In the present study, we used visual stimuli, which might have taken attentional resources away from the visual events produced by the Japanese cat, undermining its modulatory effect. This may explain why the cat did not further modulate the Simon effect.

Despite the shortcomings of the above explanation, we changed the paradigm to investigate the influence of the aforementioned problems. We (1) tried to reduce as much as possible the spatial coding of the examiner in our EEG task context, (2) used a fixed stimulus duration for the no-go-trials and (3) shifted from visual stimuli to auditory stimuli.

# EXPERIMENT 2

Experiment 2 was a replication of Experiment 1 introducing some minor changes aimed at adapting the experimental setup more closely to the study of Dolk et al. (2013a). As Dolk et al. (2013a) found a significant impact of a Japanese waving cat on the Simon effect, which we did not find in Experiment 1 using visual stimuli, we made the following changes to our task setup. We reduced the examiner's influence to a minimum by screening off the examiner from the participant's room with a curtain. Further, in line with the study of Dolk et al. (2013a), we now used auditory stimuli, and the no-go-stimulus presentation duration was no longer matched to the participant's RT.

# Method

#### Participants

Twenty-four participants (19 female) aged 18–52 years, M = 22.92, SD = 7.19, took part in the Experiment. Sixteen of them were psychology students. Screening procedure and participants' payment were the same as in Experiment 1.

#### Material

The experimental arrangement was identical to Experiment 1 except for the following changes: Two near field studio monitors (M-Audio AV32) were placed left and right of the screen (see **Figure 6**). Additionally, the examiner sat on the participant's left side behind a noise-absorbing curtain that completely screened off the examiner from the participant.

#### Procedure

The experimental procedure was identical to Experiment 1 except for the following changes: the participant had to press the button as quickly and accurately as possible when the target sound was presented. No reaction was required for the no-go stimulus. Time-reversed versions of the spoken Dutch words paars or groen were used as target or distractor, respectively, counterbalanced over participants.

Each trial started with the presentation of a fixation tone (80 ms) and a fixation cross in the center of the screen (200 ms, 0.6 cm × 0.6 cm). Then, along with the fixation cross, the target or distractor sound was presented via the left or the right speaker for 300 ms. If the response button was pressed within the 300 ms, fixation cross and target sound disappeared immediately.

#### EEG Measurement and Preprocessing

Both measurement and preprocessing were implemented in the same way as outlined for Experiment 1.

### Results

Three participants had to be excluded from all following analyses. One of them due to a high mean error rate of 11% (compared to 1% of the rest of the sample), while the other two had to be excluded due to recording problems. This led to a sample size of 21 participants.

#### Behavioral Measurement

R (version 3.3.2) was used for statistical analysis. Analysis of error rates showed a mean error rate of 1% (see **Table 1**); all error related trials were excluded from further analyses. Trimmed mean RTs (10% trim) of correct go-trials served as dependent variable in the following analyses (see **Table 1**).

As in Experiment 1, mean RTs were submitted to an ANOVA<sup>3</sup> including the within-subject factors compatibility (compatible vs. incompatible) and cat presence (cat present vs. cat absent). This analysis showed a significant main effect of compatibility, F(1,20) = 7.12, p = 0.015, η<sup>2</sup><sub>g</sub> < 0.01 (see **Figure 7**) with faster RTs for compatible trials, M = 571 ms, as compared to incompatible trials, M = 577 ms (compatibility effect = 6 ms). The interaction compatibility × cat presence was not significant, F(1,20) = 2.79, p = 0.11.

#### EEG Analysis

Data of one participant required interpolation of one channel. Based on artifact detection for the preprocessed data, on average 1.1% of the trials per participant had to be excluded, SD = 1.79, maximum = 7.5%. Remaining trials were averaged to ERPs across the factors compatibility, cat presence, and go/no-go.

To investigate the main effect of cat presence on the P300 for no-go-trials, similar to Experiment 1, a repeated measure, two-tailed cluster-based permutation test was calculated for a time window from 300 to 500 ms leading to 1530 tests (over 30 electrodes and 51 time points) with an overall alpha-level of 0.05 by establishing a test wise alpha-level of 0.00033. Definition of electrode neighbors and clusters was parallel to Experiment 1. The main effect cat presence was not significant (p-values ≥ 0.42; see **Figure 8** for corresponding waveforms). The same tests as before were calculated separately for go and no-go-trials within an interval from 0 to 500 ms to cover the whole epoch leading to 3840 comparisons with a test wise alpha-level of 0.00013. These tests did not reach statistical significance either, no significant t-score, p-values ≥ 0.08. Thus, there was no effect of cat presence for go- or no-go-trials.

To analyze the main effect of compatibility, a repeated measure, two-tailed cluster-based permutation test was calculated: An interval from 300 to 500 ms after stimulus onset was used leading to 1530 comparisons (over 30 electrodes and 51 time points) with an overall alpha-level of 0.05 by establishing a test wise alpha-level of 0.00003. The main effect compatibility was significant with higher amplitudes for incompatible trials than for compatible trials. The difference was evident in the right hemisphere in the parietal and centroparietal area within an interval of 100 to 150 ms, smallest significant t-value t(20) = −2.09, significant p-values < 0.05 (see **Figures 9**, **10**).

# Discussion

Experiment 2 was conducted for conceptual replication of Experiment 1 using an optimized task design. It served to investigate whether a joint Simon effect (Sebanz et al., 2006) is evoked by a Japanese waving cat. Furthermore, the underlying neurophysiological processes were registered using EEG.

Similar to Experiment 1, there was an (1) overall Simon effect with faster response times for compatible trials than for incompatible trials regardless of the presence or absence of the Japanese waving cat. The predicted (2) interaction effect between cat presence and compatibility was not found. Additionally, a (3) compatibility effect was found in EEG. In contrast to Experiment 1, this effect was located in the (centro-) parietal right hemisphere within a later time window of 300–500 ms after stimulus onset and not in the left hemisphere as in Experiment 1. Furthermore, there was no (4) significant P300 effect regarding no-go-trials.

The finding of an (1) overall Simon effect suggests that the adapted paradigm, namely fixing the no-go-trials' presentation time, screening off the examiner by a curtain as well as changing the stimulus modality did not affect the Simon effect. This suggests that despite the presence of the curtain the knowledge about the presence of the experimenter was enough to produce referential coding. This would be in line with studies showing evidence for a joint Simon effect when the two actors are seated in different rooms (Tsai et al., 2008; Ruys and Aarts, 2010) and the spatial arrangement of the two rooms allows a spatial coding of responses (Sellaro et al., 2013). Alternatively, or in addition, other factors of our setup may also contribute to a spatial coding of one's own action. The (2) absence of the interaction of cat presence and compatibility shows that cat presence had no further modulatory influence on task performance. The (3) compatibility effect observed in a different location and later time window compared to Experiment 1 can be understood as a neurophysiological correlate of the Simon effect. The different location and time window is best explained by the change in stimulus modality from visual to auditory stimuli.

<sup>3</sup>The same ANOVA including the factor order of presentation did not show a main effect of order of presentation, F(1,19) < 1, no interaction of order of presentation × compatibility, F(1,19) < 1, no interaction of order of presentation × cat presence, F(1,19) = 2.15, p = 0.159, and no three-way interaction of order of presentation × cat presence × compatibility, F(1,19) = 1.95, p = 0.178.

The (4) missing modulation of the P300 effect for no-go-trials in the EEG data, however, fits the overall Simon effect and the missing interaction of cat presence and compatibility and provides ERP evidence that no modulation due to the Japanese waving cat took place.

# GENERAL DISCUSSION

In this study, we performed two experiments replicating previous research on individual go/no-go Simon effects (Dolk et al., 2013a) to investigate the ERP effects underlying object induced Simon effects. Sebanz et al. (2006) reported an enhanced P300-effect in no-go-trials when two participants shared a Simon task compared to when the same go/no-go task was performed alone. We aimed to find a similar P300-enhancement when a go/no-go Simon task was performed next to a Japanese waving cat compared to when the task was performed alone.

In Experiment 1, we observed a Simon effect regardless of the presence or absence of the Japanese waving cat. Along with this, an early compatibility effect located in the (centro-) parietal left hemisphere was registered in the EEG data. A further modulation of the P300 component elicited by the Japanese waving cat was not observed.

As the influence of the cat might have been obscured by situational factors, in Experiment 2 the examiner was screened off with the help of a curtain, the modality was changed from visual to auditory stimuli and the presentation time of no-go-stimuli was no longer matched to the participant's go-RT. These changes again led to a Simon effect independent of the presence or absence of the Japanese waving cat. As in Experiment 1, there was a comparable compatibility effect present in the EEG. It differed from the EEG in Experiment 1 by a later onset of the compatibility effect and a different scalp location. We attribute this difference to the change in stimulus modality between Experiment 1 (visual) and Experiment 2 (auditory). Similar to Experiment 1, there was no significant P300 effect modulated by the presence of a Japanese waving cat.

The explanation of these findings leads to two main questions: which factors elicited a Simon effect independent from the presence or absence of the Japanese waving cat in the current two experimental settings? Why did we not find a clear modulation of the P300 by the Japanese waving cat?

The lack of a modulation of the P300 by the object may be understood when taking the voltage differences in the P300 between object absent and present condition as an indicator for an object-induced Simon effect (Sebanz et al., 2006). This procedure is based on the prerequisite that a Simon effect is absent in the object absent condition and present in the object present condition. As pointed out before, this requirement was not met in the present study. We clearly found a significant Simon effect in both object present and object absent conditions. The missing modulation of the P300 is therefore in line with the finding of an overall Simon effect observed for RTs and indicates that the cat had no modulating influence on the Simon effect in the present study.

This finding partly matches recent results from Lien et al. (2016). The Lien study only found a significant modulation of the LRP by a Japanese waving cat when auditory stimuli were presented, but not for visual stimuli. Nevertheless, this only matches our findings of Experiment 1 where visual stimuli were used. The question remains why we did not find a modulation of the Simon effect by cat presence when auditory stimuli (Experiment 2) were used.

On a behavioral basis, the interpretation that the visual stimulus modality may have diminished the modulating influence of the Japanese waving cat is supported by recent findings of Lien et al. (2016) and Puffe et al. (2017) who also replicated Experiment 1 by Dolk et al. (2013a) with a hidden or visible cat and with visual and auditory stimuli. Both studies found no modulation of cat presence with visual stimuli but a significant modulation when auditory stimuli were used. Our experiments fit into this pattern for Experiment 1 (visual stimuli), but not for Experiment 2 (auditory stimuli). Therefore, it remains unclear which factors elicited a Simon effect independent of the presence or absence of the Japanese waving cat in our study.

By adapting the task setup to the study of Dolk et al. (2013a), stimulus modality and the presentation times of the no-go stimuli could be ruled out as possible explanations for the missing effect of cat presence in Experiment 1. However, due to the EEG setup we used, the impact of the experimenter could not be fully prevented. According to Tsai and Brass (2007), one factor modulating a joint Simon effect is the presence of a responding social co-actor. The only additionally present person in our study was the experimenter.

While the experimenter might have caused the Simon effect in Experiment 1, sitting two meters away on the left side, we tried to reduce his influence to a minimum by screening him off with a curtain in Experiment 2. As we also found a Simon effect in Experiment 2, it seems that even when placed behind a curtain in extra-personal space, the experimenter might have an impact on the spatial response coding for the participants, which would be in line with previous studies (Tsai et al., 2008; Ruys and Aarts, 2010; Sellaro et al., 2013). Our finding of a Simon effect when the experimenter was located in extra-personal space (in Experiment 2) is contrary to those of Guagnano et al. (2010) who did not show a joint Simon effect when two co-actors were located outside of peri-personal space (i.e., in extra-personal space) but supports studies of Welsh et al. (2013a,b) showing a joint Simon effect when two co-actors were located in extra-personal space.

In addition to these previous studies, our findings seem to show that the exact task of the person placed behind the curtain is not relevant to induce a Simon effect in an individual go/no-go Simon task setting. One should be aware that a person sitting directly next to the participant simply observing the task does not elicit a joint Simon effect (Sebanz et al., 2003). Furthermore, our findings are in line with studies showing that it is not only relevant what we actually perceive of other persons' actions, but also what we imagine other persons might be doing even when we cannot see them (Sellaro et al., 2013). The EEG experimenter represents a socially acting person in the same room as the participant, making it likely that this person catches attention. However, this person is clearly not involved in taking over the other half of the Simon task as is the case in typical joint Simon tasks. Therefore, we do not think that action or task co-representation can account for the finding of the overall Simon effect we observed.

However, a weaker form of social attention might be involved in the effect we observed. In line with this assumption, we would therefore argue that perceiving an event-producing experimenter (Experiment 1) or imagining an event-producing experimenter (Experiment 2) is enough to induce referential coding and the Simon effect (Sellaro et al., 2013; Dittrich et al., 2017; Klempova and Liepelt, 2017).

This would also be in line with the findings of Puffe et al. (2017) who not only investigated whether a Japanese waving cat next to the subject can elicit a Simon effect, but who also implemented a condition in which the cat was hidden behind a speaker so that it could not be seen but only heard. This condition is somewhat comparable to our approach of screening off the experimenter behind a curtain so he could not be seen but only heard (Experiment 2). As the experimenter had to produce some events while controlling the EEG recordings, he might have functioned in a similar way as the hidden but sound-producing Japanese waving cat in Puffe et al. (2017). As the hidden cat elicited a Simon effect when auditory stimuli were used, this might also be a suitable explanation for the overall Simon effect in our Experiment 2.

The assumption of attention induced effects fits our finding that the omnipresent Simon effect in both experiments amounted to six or seven milliseconds, respectively. This effect size is not comparable to compatibility effects elicited by a standard Simon paradigm (approximately up to 26 ms, Simon and Rudell, 1967) but it is comparable to joint Simon effects (ranging between 7 and 15 ms, Kiernan et al., 2012; Dittrich et al., 2013). This relatively small effect size, together with several studies showing that other small adjustments of the experimental setting influence the joint Simon effect, suggests that the joint Simon effect is very sensitive to setting and task adjustments in general (Guagnano et al., 2010; Dittrich et al., 2012, 2013; Lien et al., 2016; Stenzel and Liepelt, 2016b).

For instance, Dittrich et al. (2013) observed a Simon effect by emphasizing the spatial dimension (correspondence of response button and seat position). We followed this approach by placing the response button and the participant's seat position on the right side of the middle axis of the screen. This could also have stressed the spatial dimension in both the cat present and cat absent condition, resulting in a spatial coding of the participant's actions. Lugli et al. (2015) further systematically altered the seating position in a joint Simon task with two actors after a training phase. Results showed that the seating position is even more important for the emergence of a Simon effect than the spatial compatibility of stimulus and response button. Thus, the positioning of response button and participant's seat might also contribute to the finding of an omnipresent Simon effect in the current study.

Furthermore, findings from Stenzel and Liepelt (2016b) showed that the response mode is more influential for the joint Simon effect than the attributes of the object placed next to the participant. Having an object or co-actor in a turn-taking response mode results in a larger joint Simon effect than a continuously waving Japanese cat. Thus, a continuously waving Japanese cat might not be sufficient to evoke an enhanced joint Simon effect under all circumstances. In a paradigm similar to the one used by Dolk et al. (2013a), a Japanese cat might bring about a Simon effect. In paradigms in which an individual Simon effect is already present, the Japanese cat does not exert enough influence to modulate the already existing Simon effect, neither in behavioral nor in electrophysiological measures. This is in line with findings of Lien et al. (2016) showing that the presence of a Japanese waving cat did not modulate the size of the standard Simon effect.

Nevertheless, the present study has the limitation that we did not include a human co-actor condition (e.g., Sebanz et al., 2006) to directly compare object-induced and human-induced Simon effects. Further, there was no control condition in which either participant and cat changed positions or the experimenter changed position (from left to right) to clarify whether the cat or the experimenter functions as a stronger reference frame. Although combining all those conditions in a single study using a within-subject design might cause undesired effects of fatigue or lapses of attention due to the required length of the experiment, further research should address these needs with suitable experimental designs, e.g., between-subject designs. An enlarged series of experiments to cover all control conditions is also conceivable.

All in all, considering the small effect size of the joint Simon effect and the evidence attesting to its high sensitivity to changes in the experimental setup, it is most likely that, in our case, minimal differences in experimental setup relative to Dolk et al. (2013a) led to an omnipresent Simon effect. Thus, we argue that the EEG experimenter caused the Simon effect independent of cat presence in our experiments. Nevertheless, based on our findings we were not able to provide evidence that a social co-actor and a salient object elicit the same ERP effect and neuronal process. However, our findings suggest that attention to other event-producing humans or objects may be an important factor for future research on joint action. Further, our findings suggest caution regarding where to position the experimenter, whose position might unintentionally influence experimental outcomes.

# AUTHOR CONTRIBUTIONS

RM: data collection, data analysis, drafts and revisions of manuscript, and study design. RL: drafts and revisions of manuscript and study design. JB: drafts and revisions of manuscript, study design, and data analysis.

# FUNDING

The present research was financially supported by the German Research Foundation Grants DFG LI 2115/1-1; 1–3 awarded to RL. We acknowledge support by Open Access Publication Fund of the Westfälische Wilhelms-Universität Münster, Münster.





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Michel, Bölte and Liepelt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Are You Keeping an Eye on Me? The Influence of Competition and Cooperation on Joint Simon Task Performance

#### Jonathan Mendl, Kerstin Fröber and Thomas Dolk\*

Department of Experimental Psychology, University of Regensburg, Regensburg, Germany

Social interaction plays an important role in human life. While there are instances that require cooperation, there are others that force people to compete rather than to cooperate in order to achieve certain goals. A key question is how the deployment of attention differs between cooperative and competitive situations; however, empirical investigations have yielded inconsistent results. By manipulating the (in-)dependence of individuals via performance-contingent incentives in a visual go–nogo Simon task, the current study aimed at improving our understanding of complementary task performance in a joint action context. In the independent condition each participant received what s/he achieved; in the cooperative condition each participant received half of what both achieved; and in the competitive condition participants were instructed that the winner takes it all. Extending previous findings, we found sequential processing adjustments of the Simon effect as a function of the interdependency (i.e., competition, cooperation) and transition between (i.e., go–nogo requirements) interacting individuals. While sequential processing adjustments of the Simon effect in both the competition and cooperation condition were unaffected when alternating between responsible actors (i.e., nogo–go transition), sequential processing adjustments were enlarged under competition for repeating responsibilities of one and the same actor (i.e., go–go transitions). In other words, the prospect of performance-contingent reward in a competitive context exclusively impacts flexible behavioral adjustments of one's own actions. Rather than fostering the consideration and differentiation of the other actor, pushing one's own performance to the limit appears to be the suitable strategy in competitive instances of complementary tasks. Therefore, people keep their eyes on themselves when aiming at beating a co-actor and emerging as the winner.

Keywords: joint action, go–nogo Simon task, reward, cooperation and competition, sequential processing adjustments, referential coding

#### Edited by:

Motonori Yamaguchi, Edge Hill University, United Kingdom

#### Reviewed by:

Andrea Cavallo, Università degli Studi di Torino, Italy; Cristina Iani, Università degli Studi di Modena e Reggio Emilia, Italy

> \*Correspondence: Thomas Dolk thomas.dolk@ur.de

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 29 March 2018 Accepted: 16 July 2018 Published: 03 August 2018

#### Citation:

Mendl J, Fröber K and Dolk T (2018) Are You Keeping an Eye on Me? The Influence of Competition and Cooperation on Joint Simon Task Performance. Front. Psychol. 9:1361. doi: 10.3389/fpsyg.2018.01361

# INTRODUCTION

For humans, it is nearly impossible not to interact with others (Watzlawick et al., 1967). Beyond the significant role of exchanging information and communicating with each other, there are many instances in everyday life that require cooperation (e.g., carrying a table together), while there are others that force people to compete in order to achieve a certain goal (e.g., career position, success in nearly any kind of sport). Although all of those instances are quite familiar to all of us, the question arises: When and to what extent (if at all) do we consider the other person's actions during social interaction?

Experimental approaches aiming at investigating the underlying mechanisms of social interactions in the laboratory typically use the Simon task (Craft and Simon, 1970). In the standard, two-choice version of the Simon task, participants are asked to respond with a left or right keypress to a particular feature of the stimulus (e.g., the color blue or green), which randomly appears to the left or right side of the screen. If the spatial location of the stimulus corresponds with the spatial location of the assigned response (i.e., compatible trial), responses are typically faster and less error prone. In contrast, if the spatial location of the stimulus and the assigned response differ (i.e., incompatible trial), response times (RTs) and error-rates (ERs) increase. The difference between RTs or ERs in incompatible and compatible trials is called the Simon effect (cf. Simon and Rudell, 1967; for an overview, see Simon, 1990; Lu and Proctor, 1995). This Stimulus–Response (S–R) compatibility effect is typically explained by the dimensional overlap model (Kornblum et al., 1990), which concerns the effect of a match between the task-irrelevant feature of the stimulus (i.e., location) and the required response (i.e., left/right location). Accordingly, the location of the stimulus is assumed to automatically activate a spatially corresponding response, which facilitates task performance on compatible trials and impairs performance on incompatible trials, because resolving the conflict between automatically activated and required responses takes time.
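The definition of the Simon effect given above can be illustrated with a minimal sketch. It assumes that reaction times have already been split by compatibility; the function name and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean

def simon_effect(rts_compatible, rts_incompatible):
    """Simon effect in ms: mean RT on incompatible trials minus
    mean RT on compatible trials (positive = interference)."""
    return mean(rts_incompatible) - mean(rts_compatible)

# Hypothetical RTs in ms, for illustration only.
compatible = [400, 420]
incompatible = [430, 450]
effect = simon_effect(compatible, incompatible)
```

The same subtraction applies to error rates if proportions of errors are passed in place of reaction times.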

To investigate social interactions, Sebanz et al. (2003) distributed the Simon task among two individuals. Accordingly, each participant was responsible for one stimulus color by operating his/her assigned response-key, while sitting on one side of the screen, converting the two-choice Simon task into a joint go–nogo task. In order to investigate the impact of performing the go–nogo task together with another person, Sebanz et al. (2003) had participants carry out the same go–nogo Simon task individually, thus in the absence of the partner (i.e., individual go–nogo task). While there was no significant S–R compatibility effect in the individual go–nogo condition, there was a compatibility effect when two participants performed the same go–nogo Simon task together, and this finding became known as the social or joint go–nogo Simon effect (JSE; Sebanz et al., 2003; for a review, see Dolk et al., 2014). Interestingly, even though the other person's (or one's own nogo) action typically has no direct consequence for the continuation of the experiment (as it simply offers no additional information to facilitate one's own performance), the mere perception (or expectation) of an alternative action in the joint but not in the individual go–nogo condition seems to impact one's own task performance. In other words, while it would be an appropriate strategy to concentrate on one's own task exclusively and completely ignore everything else, people seem unable to do so as soon as there are other (attention attracting) action events in the environment. Thus, the (social) task context seems to modulate the allocation of attention toward the specific S–R associations (cf. Baess and Prinz, 2015).
Accordingly, signifying the spatial S–R assignments reintroduces a dimensional overlap of the corresponding dimensions, thereby facilitating (S–R match) or impairing (S–R mismatch) task performance in the joint condition, whereas the lack of an alternative action to one's own in the individual condition eliminates the need for spatial response coding and thus, there is no dimensional overlap of spatial S–R features.

This explanation nicely fits with the existing theoretical frameworks aiming to explain the emergence of JSEs: The action co-representation (Sebanz et al., 2003) and the referential coding account (Dolk et al., 2013; for spatially inspired accounts, see Guagnano et al., 2010; Dittrich et al., 2012). Grounded in the deeply social nature of human beings, the action co-representation account assumes that one's own actions and others' actions are (automatically) represented in a functionally equivalent way. Accordingly, spatially assigned stimuli and responses of the whole task set are considered to be represented, which facilitates task performance in cases of an S–R match, but interferes with performance when there is an S–R mismatch. Based on the Theory of Event Coding (TEC; Hommel et al., 2001), the referential coding account in contrast holds that actions are cognitively represented by codes of their perceivable effects. Given that self- and other-generated actions are represented by the same kind of effect codes, the representation referring to one's own action needs to be discriminated from all concurrently activated representations in order for the individual to behave appropriately in a given context. Emphasizing the spatial nature of one's own action as left/right in reference to the other person's action provides not only a powerful strategy to differentiate alternative action events, it also reintroduces the dimensional overlap of spatially defined S–R features. Consequently, discriminating alternative action events should be more challenging the more similar those events are, resulting in varying effect sizes with varying degrees of similarity (i.e., more similar = larger JSE).
These assumptions nicely converge with previous findings showing increased (nonsocial and social) JSEs with increasing similarity of alternative action events: e.g., Human<sub>Romantic-Partner</sub> > Human<sub>Friend</sub> (Quintard et al., 2018), Human<sub>Ingroup</sub> > Human<sub>Outgroup</sub> (Iani et al., 2011; Müller et al., 2011), Human > Puppet (Tsai and Brass, 2007), Human > Computer (Tsai et al., 2008), Robot<sub>Human-like</sub> > Robot<sub>Machine-like</sub> (Stenzel et al., 2012); Japanese waving cat > Clock > Metronome (Dolk et al., 2013).

Thus, while social variables appear to play an influential role for the co-representation account, the referential coding account highlights the role of the similarity between alternative action events irrespective of its (non-)social nature. In both cases, however, attention allocation toward the spatially distinct alternative actions seems to impact the cognitive representation thereof and, what is more, subsequent behavior (for an attentional focusing account of joint compatibility effects, see Dittrich et al., 2017). This brings the introductory question back into play: When and to what extent do individuals take other people's (i.e., alternative) actions into account or in other words, which situations require the discrimination of self- and other-generated events? One straightforward approach to tackling this issue is to manipulate the relationship between, or the interdependence of, interacting individuals. While the offending behavior of an intimidating confederate reduced the JSE as compared to a friendly co-actor (Hommel et al., 2009), other studies manipulated the interdependence or in-/dependence of interacting individuals by inducing a more cooperative or competitive relationship via incentives (Ruys and Aarts, 2010; Iani et al., 2011; Ruissen and de Bruijn, 2016).

Using an auditory joint go–nogo Simon task (i.e., reacting to the pitch of a sound), Ruys and Aarts (2010) investigated three reward manipulations to induce different relationships between participants. In advance of the Simon task, participants were instructed that either: (i) the ten best performing subjects will win ten euros (independent group), (ii) each actor of the five best performing teams will earn five euros (cooperative group), or (iii) ten team winners will be randomly selected for the ten euros reward (competition group). Results revealed a larger JSE in the dependent groups (i.e., the cooperative and competitive groups), in comparison to the independent group. This finding has been taken to suggest that interdependency leads to a stronger attentional focus on the partner, and therefore to stronger shared representations and to a larger JSE. Iani et al. (2011) improved the definition of competition by instructing participants that each actor of the best performing team will be rewarded with five euros in the cooperative group, or that only the team winner will receive five euros as a reward in the competitive group. In sharp contrast to Ruys and Aarts (2010), the results revealed a significant difference between dependent groups, with a JSE in the cooperative group but no JSE in the competitive group. Thus, considering the co-actor might selectively occur when aiming to beat others as a team, but the exact opposite takes place when having an opponent. Further support for the crucial role of the exact type of interdependency is provided by a study of Ruissen and de Bruijn (2016) showing a smaller JSE in a group of participants that played Tetris in a competitive as compared to a cooperative (or isolated, i.e., solo) style of social interaction prior to the joint go–nogo Simon task performance.

Even though previous findings suggest more attention to the other's actions in a cooperative compared to a competitive relationship, there are several methodological issues that warrant further investigation in order to fully understand the processes that drive these socially driven flexible adjustments of attention allocation. In addition to investigating go–nogo Simon task performance in an independent, cooperative and competitive joint condition, the present study made use of an individual go–nogo task at the beginning of the experiment to provide a valuable reference for the resulting JSEs. Furthermore, the definitions of the terms cooperation and competition were further adjusted from those used by Iani et al. (2011). Here, cooperation instructions emphasize team work for achieving a common goal (and not a cooperative competition against unspecified others as in Iani et al., 2011), while competition instructions more clearly highlight the battle of opponents (i.e., the winner takes it all) to achieve the individual goal of emerging as the winner. To further amplify the effect of interactive contexts, reward was given for every (correct and fast enough) trial.

More importantly, however, the present study followed the recommendation of Liepelt et al. (2013) in taking sequential processing adjustments (i.e., trial-by-trial dependencies) in go–nogo Simon task performance into account to achieve a more detailed picture of the underlying processes (cf. Liepelt et al., 2013). That is, compatibility effects like Flanker, Simon, and Stroop are typically smaller after incompatible compared to compatible trials (Gratton et al., 1992; for a review, see Egner, 2014). This conflict adaptation or Gratton effect is considered to reflect reduced interference as a consequence of cognitive control already being up-regulated in the trial following an incompatible (conflicting) trial (Botvinick et al., 2001; for a review, see Botvinick, 2007). Liepelt et al. (2011) emphasized this effect in an individual and joint go–nogo Simon task, while highlighting the role of sequential processing adjustments for different types of trial-to-trial transitions. These can either be (i) nogo–go transitions, where the participant had to withhold a response in the previous trial but is required to respond in the current trial, or (ii) go–go transitions, in which the participant was required to respond in both the previous and the current trial. Interestingly, while sequential adaptation effects were stronger for nogo–go transitions than for go–go transitions in both tasks, these were overall smaller in the individual go–nogo task, suggesting additional between-person discrimination (i.e., whose turn is it?) processes in the course of a nogo–go transition (Liepelt et al., 2011; Yamaguchi et al., 2016). For the present study, those transition effects are particularly interesting as they can indicate changes of the attentional focus, by signifying differences in sequential processing adjustments after one's own compared to the partner's response.
Considering a positive to neutral and thus, rather cooperative style when engaged in social interactions with others as default (Iani et al., 2011), constantly attending to the partner's action enables flexible adjustments to the other in order to achieve the common goal together. The critical question, however, is whether and, if so, to what extent participants' attention is drawn to their own or the other's performance in competitive interactive contexts. In other words, do participants in a competitive relationship apply a self- or other-referenced focus (Poortvliet and Darnon, 2010)? While the latter is suggested to be applied when aiming to outperform the other, the former might be more suitable in particular (task) circumstances in which constantly monitoring and comparing one's own and the other's performance is quite demanding and thus resource-consuming. That is, attending to the co-actor's actions might simply not be an appropriate strategy for participants who are trying to improve their own performance, because they have little to no direct influence on changing their opponent's actions in a go–nogo Simon task. The only thing they can influence in such a situation is their own action, which should result in a self-referenced focus (Iani et al., 2011), leading to no reliable nogo–go, but notable go–go transition effects, because the latter refer to sequential processing adjustments after one's own response.
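The sequential processing adjustment described above can be written as a difference of two Simon effects. The following is a minimal sketch, assuming that mean RTs are available for the four combinations of previous-trial and current-trial compatibility; the function name, the dictionary layout, and the example values are hypothetical.

```python
def conflict_adaptation(mean_rt):
    """Sequential adjustment in ms.

    mean_rt maps (previous compatibility, current compatibility)
    to a mean RT. The result is the Simon effect after compatible
    trials minus the Simon effect after incompatible trials, so a
    positive value indicates a Gratton-type reduction.
    """
    after_compatible = (mean_rt[("compatible", "incompatible")]
                        - mean_rt[("compatible", "compatible")])
    after_incompatible = (mean_rt[("incompatible", "incompatible")]
                          - mean_rt[("incompatible", "compatible")])
    return after_compatible - after_incompatible

# Hypothetical cell means in ms, for illustration only.
cells = {
    ("compatible", "compatible"): 400,
    ("compatible", "incompatible"): 440,
    ("incompatible", "compatible"): 415,
    ("incompatible", "incompatible"): 425,
}
adjustment = conflict_adaptation(cells)
```

Computing this quantity separately for go–go and nogo–go transitions gives the transition-specific adjustments that the hypotheses refer to.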

Based on this framework, the present study investigated the processes underlying flexible adjustments to the contextual challenge of either cooperating or competing for reward when interacting with others. To that end, participants performed an individual go–nogo Simon task at the beginning of the experiment followed by a go–nogo joint Simon task with the prospect of reward that was largely independent of the co-actor's performance. That is, prior to the joint go–nogo Simon task, participants were instructed that each participant in the pair would receive the amount of reward that s/he actually earned for fast and correct responses on their own (i.e., maximizing my own reward irrespective of the co-actor; independent goal). Prior to the final part of the experiment participants were informed that the amount of reward would be equally divided between both participants (i.e., maximizing the total reward sum together with the co-actor; shared goal = cooperation) or that the participant who earned the most reward would receive the whole amount of reward, including the amount earned by the other person (i.e., being better than the co-actor in order to win, rather than lose, the whole reward sum in the end; competitive goal; see **Table 1** for an overview of the experimental procedure). If participants in the competition group develop an other-referenced focus, we would expect larger JSEs and sequential processing adjustments after nogo–go transitions compared to the cooperative group. On the other side, if participants in the competitive group apply a self-referenced focus, they should show a smaller JSE, and larger adjustments after go–go transitions compared to the cooperative group. Results in the individual go–nogo Simon task and the independent joint go–nogo Simon task should be in line with previous findings showing no go–nogo Simon effect but significant sequential processing adjustment in the former and significant effects in both measures in the latter condition.

TABLE 1 | Procedure of an experimental session.

<sup>a</sup>Tasks 1 and 2 comprised one training block of 16 trials and two testing blocks of 128 trials each, while Task 3 contained only two testing blocks of 128 trials each.

# MATERIALS AND METHODS

# Participants

Forty-eight right-handed undergraduate students of the University of Regensburg (44 female; M<sub>age</sub> = 19.7, SD<sub>age</sub> = 1.9, range = 18–28 years) participated in the present study.<sup>1</sup>,<sup>2</sup> Participants had normal or corrected-to-normal vision and were naive with regard to the hypothesis of the experiment. Participants gave their written informed consent before their inclusion in the study in accordance with the ethical standards of the German Psychological Society (DGPs; 2016) and the 1964 Declaration of Helsinki. According to the DGPs ethics commission, an institutional research board's ethical approval is only required if (i) research carries additional risk beyond daily activities or (ii) any funding is subject to such an ethical review. No such requirements were present for this study. After the session, all participants were debriefed and rewarded with partial course credit. Participants were tested in pairs and did not know each other prior to the experiment. Data from three participants were excluded due to mean reaction times or error rates of more than 2.5 SDs from the task mean.

<sup>1</sup> In order to select the sample size, we entered Experiment 2 of the study by Iani et al. (2011) into the following website https://designingexperiments.shinyapps.io/BUCSS\_ss\_power\_spa/. Entering the design (i.e., two-factor mixed ANOVA), sample size (i.e., N = 32), observed F-value (i.e., F = 8.82), number of levels for between-subjects (i.e., two) and within-subjects factors (i.e., two), the effect of interest (i.e., the interaction between both factors), the alpha-level for the previous and current study (i.e., both 0.05), a desired level of assurance of 0.5 (i.e., correcting for publication bias), and a desired level of statistical power of 0.8 yielded a sample size of 21 participants per group. However, based on the findings by Liepelt et al. (2011) for sequential trial-by-trial adjustments using 24 participants, we decided to also test 24 participants per group.

<sup>2</sup>The difference between the number of male and female participants could bias the internal validity of the present study. Given, however, that most studies investigating the influence of interdependency on go–nogo Simon task performance had comparable female samples (i.e., Ruissen and de Bruijn, 2016: 92%; Hommel et al., 2009: 89%) and the fact that the manipulation of interdependency can be considered to be more effective in males as compared to females (Van Vugt et al., 2007), the present results should, if anything, be further strengthened in a gender-balanced sample.

FIGURE 1 | Experimental setup in the individual go–nogo Simon task (Task 1; A) and in the joint go–nogo Simon tasks (Tasks 2 and 3; B). In both go–nogo Simon task contexts (A,B) the participants are required to respond to their assigned stimulus (blue circle, person on the right, incompatible trial; green circle, person on the left, compatible trial) by operating the response key in front of them. Stimulus–Response assignments as well as spatial position of the participants were counterbalanced across participants but held constant across the tasks. Whereas in the individual go–nogo Simon task (1; A) participants worked on adjacent computers, in the joint go–nogo Simon tasks (2 and 3; B) both participants sat in front of one computer.

# Material and Procedure

For the present go–nogo Simon tasks, a green and a blue circle with a diameter of one centimeter were used as stimuli (0.96° × 0.96°; cf. Hommel et al., 2009). They were presented 8.75 cm to the left or the right of the center (eccentricity of 8.7° visual angle) using E-Prime 2.0 (Psychology Software Tools, Sharpsburg, PA, United States).

Upon arrival at the laboratory, pairs of participants were informed about the three consecutive segments of the experiment, namely performing the first task alone and the following two together with the other person. Prior to the instruction phase of the first (an individual go–nogo Simon) task, both participants were seated at their respective workspaces composed of two seats in front of a computer with a 17-inch monitor (display resolution at 1,024 × 768 pixels) at a viewing distance of approximately 60 cm (**Figure 1A**). To enable a consistent spatial arrangement of left/right chair and corresponding response across all tasks, participants in Task 1 sat back-to-back leaving the second chair at each workspace empty. That is, while the participant assigned to the left workspace was seated in the left chair and responded via the left response key (i.e., the "Y"-key on a QWERTZ keyboard), the participant assigned to the right workspace was seated in the right chair and operated the right response ("M"-) key (**Figure 1A**). Both participants were instructed to put their right index finger on the respective response key while leaving their left hand underneath the table on their left thigh.

To familiarize participants with the task, the experiment started with an instruction phase (∼5 min) including the presentation of the two stimuli, their assignment to each participant and a training of 16 trials in total. After the instruction phase was completed, the experimental phase of Task 1 started. There were two blocks of 128 trials, which equally often contained each stimulus (blue vs. green) with each S–R mapping (compatible vs. incompatible). This task was used to calculate the individual reaction time (RT) threshold for performance-contingent reward receipt in Task 2 and Task 3. The threshold was determined by the 0.33-quantile of all correct responses sorted from fast to slow (cf., Fröber and Dreisbach, 2014, 2016). To maintain vigilance throughout the whole experiment, short self-paced breaks between blocks and a 2-min break between Task 1 and Task 2 outside the laboratory were provided.
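The individual RT threshold can be sketched as follows. The source states only that the 0.33-quantile of the correct responses, sorted from fast to slow, was used; the nearest-rank rule below is an assumption, as are the function names and the example RTs. The original E-Prime implementation may have differed.

```python
import math

def rt_threshold(correct_rts_ms, q=0.33):
    """q-quantile of the correct RTs from Task 1.

    Assumption: nearest-rank definition (the value at rank
    ceil(q * n) of the sorted RTs); the source does not say
    which quantile rule was applied.
    """
    if not correct_rts_ms:
        raise ValueError("need at least one correct response")
    ordered = sorted(correct_rts_ms)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def earns_reward(correct, rt_ms, threshold_ms):
    """Reward requires a correct response faster than the threshold."""
    return correct and rt_ms < threshold_ms

# Hypothetical correct RTs in ms, for illustration only.
task1_rts = [460, 300, 420, 340, 380, 320, 440, 360, 400]
threshold = rt_threshold(task1_rts)
```

With this rule roughly the fastest third of a participant's Task 1 responses would have qualified for reward, which makes the criterion demanding but attainable.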

Following Task 1 and a recovery break, participants reentered the lab to continue with the second segment, a joint go–nogo Simon task with performance-contingent reward. In order to keep S–R assignments and responsibilities consistent with Task 1, both participants were asked to take their respective seat of either the left or the right workspace (counterbalanced across pairs of participants). Thus, while the workspace remained the same for one participant, the other had to change, but the spatial assignment of chair and response-key remained the same (see **Figure 1B**). After participants were reminded about stimuli and respective assignments, they were instructed about the possibility of earning four cents for every correct and very fast response (i.e., faster than the individual RT threshold) and irrespective of the partner's performance, to explicitly emphasize an independent relationship between interacting individuals. Note, however, that to keep the task fair, a participant would lose two cents in case of an error and the partner would gain these two cents, because an error of one participant always represented a lost opportunity for the other participant to gain reward. After the instruction, participants performed 16 more training trials in order to get familiar with the task and to give the participants a feeling for how fast they had to react to receive the reward. Following this short training, participants got feedback about the amount of money they would have received before the experimental phase of Task 2 started. As in Task 1, participants had to perform 256 testing trials divided into two blocks and they received feedback about the earned amount of money after each block.

After Task 2, participants continued with the third segment, again a joint go–nogo Simon task with performance-contingent reward. The procedure was similar to the last task with the following exception: In contrast to Task 2, the amount of reward each participant received at the end of Task 3 depended upon the interactive mode, that is, whether participants competed or cooperated. More precisely, in the cooperative group, participants were instructed that the amount of reward both participants earned during the course of the experiment would be equally divided at the end of the experiment, thereby aiming to emphasize working as a team for a common goal. Consequently, error punishment was changed such that wrong responses still led to a loss of two cents, but the amount was not added to the partner's score. In the competitive group, however, participants were informed about "the winner takes it all principle," aiming to increase the challenge of achieving the desired goal. Thus, the participant who earned the most during the course of the experiment would receive not only her/his own reward but also the amount of the reward earned by the co-actor. Accordingly, error punishment was the same as in Task 2: Producing an error resulted in a loss of two cents and a gain of two cents for the opponent. After this instruction, the experimental phase started immediately (i.e., without further training) with 256 testing trials, divided into two blocks, and feedback about the earned amount of money after each block (see **Table 1** for an overview of the experimental procedure).
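The reward rules of Tasks 2 and 3 can be summarized in a short sketch. The amounts (four cents per fast correct response, two cents per error) and the three modes follow the description above; the function names, the data layout, and in particular the handling of a tie in the competitive mode are assumptions, since the source does not say what happened when both participants earned the same amount.

```python
REWARD_CENTS = 4         # correct response faster than the threshold
ERROR_PENALTY_CENTS = 2  # deducted from the actor after an error

def update_scores(scores, actor, partner, correct, fast_enough, mode):
    """Apply the outcome of one trial to the running scores (in cents)."""
    if correct and fast_enough:
        scores[actor] += REWARD_CENTS
    elif not correct:
        scores[actor] -= ERROR_PENALTY_CENTS
        # The partner gains the penalty except under cooperation.
        if mode in ("independent", "competitive"):
            scores[partner] += ERROR_PENALTY_CENTS
    # Correct but too slow: no change.

def final_payout(scores, mode):
    """Split the earned amounts according to the interactive mode."""
    total = sum(scores.values())
    if mode == "independent":
        return dict(scores)
    if mode == "cooperative":
        return {name: total / 2 for name in scores}
    if mode == "competitive":
        best = max(scores.values())
        winners = [n for n, s in scores.items() if s == best]
        # Assumption: a tie is split equally between both participants.
        return {n: (total / len(winners) if n in winners else 0)
                for n in scores}
    raise ValueError(f"unknown mode: {mode}")

scores = {"left": 0, "right": 0}
update_scores(scores, "left", "right", True, True, "independent")
update_scores(scores, "right", "left", False, False, "independent")
```

After these two hypothetical trials the left participant holds six cents and the right participant minus two, which makes the contrast between the three payout rules easy to see.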

Each trial of the different Simon tasks started with a fixation cross in the center of the screen for 250 ms followed by the imperative stimulus (i.e., a blue or a green circle) presented to either the left or the right side of the screen for 1,000 ms or until a response was given. If the response was correct and fast enough, the next trial started after an inter-trial interval (ITI) varying randomly between 500 ms and 1,200 ms in steps of 100 ms. If not, the German words for error (i.e., "Falsch!") or too slow (i.e., "Zu langsam!") were displayed on the screen for 1,000 ms, thus extending the ITI to about 1,500–2,200 ms in Task 1 and the training trials of Task 2. There was no error feedback in the testing trials of Task 2 or in Task 3.
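The timing between response and next trial can be sketched as follows. The durations are those stated above; the function name and the use of Python's `random` module are assumptions for illustration, since the experiment itself was run in E-Prime 2.0.

```python
import random

FIXATION_MS = 250
STIMULUS_MAX_MS = 1000
FEEDBACK_MS = 1000
# ITI between 500 and 1,200 ms in steps of 100 ms.
ITI_VALUES_MS = list(range(500, 1201, 100))

def interval_after_response(needs_feedback, feedback_enabled, rng=random):
    """Time in ms between the end of a trial and the next fixation cross.

    needs_feedback: the response was wrong or too slow.
    feedback_enabled: True in Task 1 and the training trials of Task 2,
    False in the testing trials of Task 2 and in Task 3.
    """
    iti = rng.choice(ITI_VALUES_MS)
    if needs_feedback and feedback_enabled:
        return FEEDBACK_MS + iti
    return iti

rng = random.Random(0)  # seeded only to make the illustration repeatable
example = interval_after_response(True, True, rng)
```

Passing a seeded `random.Random` instance keeps the sketch repeatable; in an actual experiment the interval would be drawn afresh on every trial.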

After the three sessions of go–nogo Simon task performance, participants were asked to complete three computerized questionnaires, each at their own workspace. The first questionnaire involved the "Inclusion of Other in the Self" (IOS) scale (Aron et al., 1992), a single-item pictorial measure of perceived interpersonal connectedness. Here, participants are asked to indicate which of seven pictures best describes their own relationship with the co-actor. The IOS was intended to provide a proof of concept for the interactive mode (i.e., competitive vs. cooperative) in Task 3. Following the IOS, participants answered six questions about their focus of attention (I fully concentrated on my own task in the last two blocks; In the last two blocks of the experiment, I kept a close eye on the other participant's reaction; A sort of rhythm developed between my reaction and the reaction of the other participant; I tried to ignore the reaction of the other participant; The reaction of the other participant strongly distracted me from my task; I strongly concentrated on the other participant's task). Participants could answer on a five-point scale with possible answers "very true for me," "somewhat true for me," "neutral," "somewhat false for me," and "very false for me." Those questions were intended to measure the attentional focus of the participants in Task 3. The last questionnaire was the BIS/BAS Scale (Carver and White, 1994), which has 24 items in the form of statements indicating approach and avoidance motivation. Participants responded on a four-point scale with "very true for me," "somewhat true for me," "somewhat false for me," or "very false for me." The reward responsiveness subscale of the behavioral approach system in particular could influence participants in the rewarded Tasks 2 and 3.

# Design

A 2 (Compatibility<sub>N</sub>: compatible, incompatible) × 2 (Compatibility<sub>N−1</sub>: compatible, incompatible) × 2 (Transition: go–go, nogo–go) × 2 (Block: 1, 2) × 2 (Group: cooperation, competition) mixed analysis of variance (ANOVA) was conducted for each of the three tasks. The within-subjects factors were compatibility in the current trial (Compatibility<sub>N</sub>), compatibility in the previous trial (Compatibility<sub>N−1</sub>), Transition and Block, while Group was a between-subjects factor. In order to investigate the impact of the specific interdependence (i.e., the in-/dependence) on interacting individuals in the go–nogo Simon task, we included the within-subjects factor Task (2, 3) in the original 2 × 2 × 2 × 2 × 2 ANOVA (**Supplementary Table S1**).

# RESULTS

# Data Preprocessing

For statistical analysis, we excluded the first trial of each block, erroneous and post-error trials (together 3.2%), as well as trials with RTs lower than 100 ms and RTs that were more than 3 SDs from the individual cell mean (together 0.4%). Error rates were rather low (1.3%) and were not analyzed further. The significance criterion was set to p < 0.05.
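The exclusion rules can be sketched as a two-pass filter over one participant's trials. The dictionary keys, the function name, and the order of operations (structural exclusions first, then the 3 SD criterion computed on the remaining trials using the sample SD) are assumptions for illustration; the text does not specify these details.

```python
from collections import defaultdict
from statistics import mean, stdev


def preprocess(trials, min_rt=100.0, sd_cutoff=3.0):
    """Return the trials of one participant that survive the exclusion rules.

    Each trial is a dict with hypothetical keys: 'index' (0-based position
    within its block), 'cell' (design-cell label), 'rt' (ms, or None when no
    response was given) and 'correct' (bool). Trials are in presentation order.
    """
    kept = []
    prev_error = False
    for t in trials:
        if t["index"] == 0:
            prev_error = False  # assumption: errors do not carry over blocks
        is_error = not t["correct"]
        analyzable = (
            t["rt"] is not None      # nogo trials carry no RT
            and t["index"] != 0      # first trial of each block
            and not is_error         # erroneous trials
            and not prev_error       # post-error trials
            and t["rt"] >= min_rt    # anticipations below 100 ms
        )
        if analyzable:
            kept.append(t)
        prev_error = is_error

    # Second pass: drop RTs more than `sd_cutoff` SDs from the cell mean.
    rts_by_cell = defaultdict(list)
    for t in kept:
        rts_by_cell[t["cell"]].append(t["rt"])
    bounds = {}
    for cell, rts in rts_by_cell.items():
        if len(rts) >= 2:  # an SD needs at least two observations
            m, s = mean(rts), stdev(rts)
            bounds[cell] = (m - sd_cutoff * s, m + sd_cutoff * s)

    result = []
    for t in kept:
        lo, hi = bounds.get(t["cell"], (float("-inf"), float("inf")))
        if lo <= t["rt"] <= hi:
            result.append(t)
    return result
```

Note that with a sample SD, a single extreme value can only exceed 3 SDs when the cell contains at least 11 trials, so the criterion is only effective for reasonably filled cells.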

# RT Analysis for Task 1 (Individual Go–Nogo)

The 2 × 2 × 2 × 2 × 2 ANOVA revealed no significant main effect of Compatibility<sub>N</sub>, F(1,43) = 1.42, p > 0.05. However, there was a sequential adaptation effect as indicated by a significant interaction of Compatibility<sub>N</sub> and Compatibility<sub>N−1</sub>, F(1,43) = 93.61, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.69. The Simon effect was smaller after incompatible than after compatible trials [−10 vs. 15 ms; t(44) = 9.57, p < 0.001, d = 1.54; for descriptive details, see **Table 2**]. This interaction was further qualified by a higher order interaction between Transition, Compatibility<sub>N</sub> and Compatibility<sub>N−1</sub>, F(1,43) = 61.55, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.59. As can be seen in **Figure 2**, this interaction can be explained by larger sequential processing adjustments of the Simon effect for nogo–go transitions than for go–go transitions [49 vs. 0 ms; t(44) = 7.76, p < 0.001, d = 1.79]. The significant main effect of Block, F(1,43) = 9.54, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.18, indicated that participants responded faster in the first (M = 329, SD = 32) than in the second block (M = 337, SD = 32). The significant main effect of Transition, F(1,43) = 16.10, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.27, showed faster RTs for go–go (M = 328, SD = 34) than for nogo–go transitions (M = 338, SD = 30). This was further qualified by a significant interaction between Block and Transition, F(1,43) = 25.49, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.37, revealing a larger Transition effect in the second than in the first block [16 vs. 4 ms; t(44) = 5.09, p < 0.001, d = 0.68]. The significant two-way interaction between Block and Compatibility<sub>N−1</sub>, F(1,43) = 6.94, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.14, indicated faster RTs after incompatible trials compared to compatible trials in block 2 than in block 1 [4 vs. −1 ms; t(44) = 1.16, p < 0.05, d = 0.51]. The interaction between Transition and Compatibility<sub>N</sub>, F(1,43) = 4.56, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.10, showed a larger Simon effect for nogo–go as compared to go–go transitions [5 vs. 0 ms; t(44) = 2.17, p < 0.05, d = 0.33]. All other main effects or interactions did not reach significance (all Fs < 3.11, all ps > 0.084).

TABLE 2 | Response times (SD) in milliseconds for compatible and incompatible trials as a function of task (Individual go–nogo, Independent go–nogo, Dependent go–nogo) and transition.
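The Simon effect and its sequential modulation reported in this analysis are simple differences between cell means, which the following sketch makes explicit. The function names are hypothetical, and the cell means in the usage note are made-up values (not data from the study), chosen only so that the resulting effects match the −10 and 15 ms reported for Task 1.

```python
def simon_effect(rt_compatible, rt_incompatible):
    """Simon effect in ms: incompatible minus compatible mean RT."""
    return rt_incompatible - rt_compatible


def sequential_adjustment(cell_means):
    """Difference between the Simon effect after compatible and after
    incompatible trials.

    `cell_means` maps (previous_trial, current_trial) compatibility labels
    to a mean RT in ms.
    """
    after_compatible = simon_effect(
        cell_means[("compatible", "compatible")],
        cell_means[("compatible", "incompatible")],
    )
    after_incompatible = simon_effect(
        cell_means[("incompatible", "compatible")],
        cell_means[("incompatible", "incompatible")],
    )
    return after_compatible - after_incompatible
```

With hypothetical cell means of 320 and 335 ms after compatible trials and 335 and 325 ms after incompatible trials, the Simon effect is 15 ms after compatible and −10 ms after incompatible trials, giving a sequential adjustment of 25 ms. Computing this quantity separately for go–go and nogo–go transitions yields the values that enter the three-way interaction.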

# RT Analysis for Task 2 (Independent Joint Go–Nogo)

The respective 2 × 2 × 2 × 2 × 2 ANOVA revealed a significant main effect of Compatibility<sub>N</sub>, F(1,43) = 7.85, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.15, indicating faster responses for compatible compared to incompatible trials (M<sub>compatible</sub> = 301, SD<sub>compatible</sub> = 27, M<sub>incompatible</sub> = 307, SD<sub>incompatible</sub> = 26). As in Task 1, there was a sequential adaptation effect as indicated by a significant interaction of Compatibility<sub>N</sub> and Compatibility<sub>N−1</sub>, F(1,43) = 172.07, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.80, with a smaller Simon effect after incompatible than after compatible trials [−9 vs. 21 ms; t(44) = 12.96, p < 0.001, d = 1.90; **Table 2**], as well as between Compatibility<sub>N</sub>, Compatibility<sub>N−1</sub> and Transition, F(1,43) = 41.94, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.49 (for a comparison between Tasks, **Supplementary Table S1**). As can be seen in **Figure 2**, the sequential processing adjustments were larger for nogo–go compared to go–go transitions [54 vs. 7 ms; t(44) = 6.55, p < 0.001, d = 1.64]. The significant main effect of Transition, F(1,43) = 34.28, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.44, showing faster RTs for go–go (M = 300, SD = 27) than for nogo–go transitions (M = 309, SD = 24), varied as a function of Block, F(1,43) = 13.37, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.24, such that there was a smaller Transition effect in block 1 as compared to block 2 [5 vs. 14 ms; t(44) = 3.69, p < 0.01, d = 0.65]. The interaction between Transition and Compatibility<sub>N−1</sub>, F(1,43) = 6.37, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.13, indicated faster RTs after compatible trials compared to incompatible trials for nogo–go transitions than for go–go transitions [4 vs. −3 ms; t(44) = 2.55, p < 0.05, d = 0.59]. Furthermore, the interaction between Block, Compatibility<sub>N−1</sub> and Group, F(1,43) = 4.09, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.09, was significant, indicating larger RT differences after compatible trials compared to incompatible trials between blocks in both the competitive and the cooperative group [6 ms vs. 3 ms; t(43) = 2.02, p < 0.05, d = 0.60]. No other main effects or interactions reached significance (all Fs < 2.24, all ps > 0.142).
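The partial eta squared values that accompany each F statistic can be recovered from F and its degrees of freedom through the standard identity η<sub>p</sub><sup>2</sup> = (F · df<sub>effect</sub>) / (F · df<sub>effect</sub> + df<sub>error</sub>). The helper below is a small sketch for checking reported effect sizes; its name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its dfs."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and dfs positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)
```

For instance, F(1,43) = 172.07 gives 172.07 / (172.07 + 43) ≈ 0.80, and F(1,43) = 7.85 gives 7.85 / (7.85 + 43) ≈ 0.15, in agreement with the values reported for Task 2.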

# RT Analysis for Task 3 (Dependent Joint Go–Nogo)

In the RT analysis of Task 3, a significant main effect of Compatibility<sub>N</sub> was observed, F(1,43) = 8.40, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.16, indicating faster responses for compatible than for incompatible trials (M<sub>compatible</sub> = 299, SD<sub>compatible</sub> = 25, M<sub>incompatible</sub> = 305, SD<sub>incompatible</sub> = 25; **Table 2**). The interaction between Compatibility<sub>N</sub> and Compatibility<sub>N−1</sub>, F(1,43) = 145.02, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.77, with a smaller Simon effect after incompatible than after compatible trials [−12 vs. 23 ms; t(44) = 12.14, p < 0.001, d = 2.22], and the interaction between Compatibility<sub>N</sub>, Compatibility<sub>N−1</sub> and Transition, F(1,43) = 71.64, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.63, with larger sequential processing adjustments for nogo–go compared to go–go transitions [58 vs. 12 ms; t(44) = 8.19, p < 0.001, d = 1.69], were further qualified by a higher order interaction between Compatibility<sub>N</sub>, Compatibility<sub>N−1</sub>, Transition, and Group, F(1,43) = 4.60, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.10. As can be seen in **Figure 3**, this four-way interaction can be explained by a significant sequential adaptation of the Simon effect for go–go transitions in the competition group [20 ms; F(21) = 11.38, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.35], but non-significant sequential adaptation of the Simon effect for go–go transitions in the cooperative group [5 ms; F(22) = 0.63, p = 0.438]. There was a smaller Simon effect after incompatible trials than after compatible trials for go–go transitions in the competitive group [−7 vs. 13 ms; t(21) = 3.37, p < 0.01, d = 1.04], but not in the cooperative group [4 vs. 9 ms; t(22) = 0.79, p = 0.438], and a smaller Simon effect after incompatible trials than after compatible trials for nogo–go transitions in the competitive group [−22 vs. 32 ms; t(21) = 10.11, p < 0.001, d = 2.99] and the cooperative group [−24 vs. 39 ms; t(22) = 14.37, p < 0.001, d = 4.17].
However, the interaction between Compatibility and Group did not reach significance [F(1,43) = 0.66, p > 0.05]. Furthermore, the main effect of Transition reached significance, F(1,43) = 44.41, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.51, suggesting faster RTs for go–go (M = 295, SD = 24) than for nogo–go transitions (M = 309, SD = 25). The interaction between Transition and Compatibility<sub>N−1</sub>, F(1,43) = 4.11, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.09, indicated faster RTs after incompatible trials compared to compatible trials for go–go transitions than for nogo–go transitions [4 vs. 0 ms; t(44) = 2.05, p < 0.05, d = 0.41]. All other main effects or interactions were not significant (all Fs < 3.66, all ps > 0.062).

# Between-Task Analysis and Questionnaires

Although the go–nogo Simon effect increased as a function of the interdependence of interacting individuals – competition < independence < cooperation (**Figure 4**) – the respective interaction between the factors Compatibility<sub>N</sub>, Group and Task did not reach significance, F(1,43) = 1.82, p = 0.184 (**Supplementary Table S2**). The remaining results of this analysis added no further information to the findings detailed above. In the analyses of the questionnaires, t-tests showed no significant difference between groups on the IOS scale, the mean response to the strategy questions, or the BAS reward responsiveness score (all ts < 0.52, all ps > 0.604). Furthermore, there were no significant Spearman correlations between Group, IOS response, mean Strategy response and BAS reward responsiveness (all ps > 0.689).

# DISCUSSION

The present research investigated the influence of in-/dependence on interacting individuals in joint go–nogo Simon tasks. More precisely, the reward prospect for each (fast and correct) trial of each participant was manipulated in a context-dependent manner to instantiate different interdependencies between co-acting participants. That is, prior to the joint go–nogo Simon task, participants were instructed that (i) each participant of the pair would receive the amount of reward that s/he actually earned for fast and correct responses on their own (i.e., independent reward), or (ii) that the amount of reward would be equally divided between both participants (cooperative dependence), or (iii) that the participant who earned the most reward would receive the whole amount of reward, including the amount earned by the other person (competitive dependence). Extending previous findings, the present study revealed sequential processing adjustments of the go–nogo Simon effect as a function of the interdependency of interacting individuals (i.e., competition, cooperation) and of the transition between them (i.e., go–nogo requirements). While sequential processing adjustments of the Simon effect in both the competition and cooperation condition were unaffected when alternating between responsible actors (i.e., nogo–go transition), sequential processing adjustments were enlarged under competition for repeating responsibilities of one and the same actor (i.e., go–go transitions). In other words, the prospect of performance-contingent reward in a competitive context exclusively impacts flexible behavioral adjustments of one's own actions. Rather than fostering the consideration and differentiation of the other actor (i.e., other-referenced frame), pushing one's own performance to the limit appears to be the suitable strategy in competitive instances of complementary tasks (i.e., self-referenced frame; Poortvliet and Darnon, 2010).
Therefore, people keep their eyes on themselves when aiming at beating a co-actor and emerging as the winner.

Even though the present findings provide further valuable insight into the mechanisms driving flexible adjustments to changing contextual challenges when interacting with others (Liepelt et al., 2011; Yamaguchi et al., 2016), two critical aspects need further elaboration to close the gaps in the literature. One concerns the obviously crucial role of defining in-/dependence, and the other why the present study failed to show a modulation of the joint go–nogo Simon effect as a consequence of the interdependency of interacting individuals beyond the sequential trial-to-trial processing adjustments. First, how in-/dependence is defined appears to be particularly important for how attention is deployed. Ruys and Aarts (2010) found a JSE difference between dependence and independence, but no difference between the dependent conditions (cooperation, competition). In contrast, in our study, the sequential processing adjustments indicate a different attentional focus between cooperation and competition. In this way, the findings are in line with the study of Iani et al. (2011) as well as Ruissen and de Bruijn (2016), which show a distinction between cooperative and competitive dependence on the level of the JSE. An explanation for this inconsistency lies in the rather vague definitions of cooperation and competition in the study of Ruys and Aarts (2010). While they manipulated competition by rewarding 10 randomly selected team winners, Iani et al. (2011) improved this manipulation by rewarding one winner within each team. This distinction could explain the discrepancy of the JSE modulations between the different studies. However, only the present study shaped the cooperative relationship without alluding to unspecified other teams, while in the study of Iani et al. (2011) as well as Ruys and Aarts (2010), reward was given to the best performing team, which induced a competitive relationship with other teams. 
In this respect, the present definition captures the complex construct of cooperative interaction appropriately by solely manipulating the relationship within the team.

More interestingly, however, the modulation of the JSE as a consequence of the interdependency manipulation found by Iani et al. (2011) and by Ruissen and de Bruijn (2016) did not reach significance in the present study, even though descriptively the results point in the same direction of a smaller JSE in the competitive as compared to the cooperative group (**Figure 4**). One reasonable explanation concerns the specific reward manipulations in the present study. Highlighting the significance of each trial via reward prospect for each correct and fast enough trial (i.e., each response in the fastest third of all correct RTs in the individual go–nogo Task 1) seems to have pushed task performance to the ceiling, leading overall as well as within each task to decreasing RTs and smaller JSEs, and thus leaving little room for significant variability. Interestingly, this observation stands in sharp contrast to what is typically found for the standard (i.e., two-choice) Simon task, namely increasing Simon effects with decreasing RTs (Hommel, 1993). Even though Hommel et al. (2009) use this pattern of a standard Simon task to reject the possibility that the non-significant JSE in the negative relationship condition, where participants reacted alongside an intimidating confederate, is solely driven by response speed, the attenuation of the JSE with decreasing RTs is perfectly in line with the present and previous findings. A search of the Google Scholar citation index for the initial JSE study of Sebanz et al. (2003) on April 1st 2018 revealed 17 viable studies that used a visual joint go–nogo Simon task with two participants sitting next to each other and sharing the same workspace (**Figure 5**)<sup>3</sup>. The positive correlation of r = 0.52 (p < 0.01) indicates that smaller RTs were predictive of smaller JSEs. Thus, in contrast to Hommel et al. (2009) and the findings in a standard Simon task showing increasing Simon effects with decreasing RTs (Hommel, 1993), go–nogo Simon effects are attenuated with decreasing RTs, suggesting the involvement of different processes in the emergence of those two effects. In a standard Simon task with two different stimulus features and two response alternatives, the irrelevant spatial feature of the stimulus overlaps with the spatial feature of the response and is considered to automatically activate a representation of the spatially corresponding response. Interestingly, if participants react more slowly, response code activation induced by the location, which may conflict with the correct response, seems to decay over time, leading to smaller Simon effects (Hommel, 1994). If participants try to maximize performance and react as fast as possible, conflict resolution in a two-choice task takes up extra time in incompatible trials, thereby leading to larger Simon effects. In contrast, in a joint go–nogo Simon task, there are substantially different processes at play (Dolk and Prinz, 2016). Participants have only one response key and need to respond to only one of two stimulus features, thus a selective rather than a (two-) choice reaction is required on any given trial. While one's own alternative response hand in the two-choice Simon task provides a reference for spatial response coding that signifies spatial S–R overlap and thus elaborated Simon effects, this attention allocation toward task inherent S–R assignments seems to require a salient alternative (social or non-social) event in the individual's workspace (Dolk et al., 2011, 2013, 2014). Accordingly, if participants try to react as quickly as possible by applying top-down control to focus primarily on their own task, this might be responsible for smaller Simon effects with faster RTs.
In any case, it will be important for future work to clarify the different underlying processes that govern the emergence of the standard and the (joint) go–nogo Simon effect when performance is pushed to the limit.
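The direction of the relation discussed above (a positive correlation means that studies with shorter mean RTs tend to show smaller JSEs) can be illustrated with a plain Pearson correlation. The function name is hypothetical, and the numbers in the usage note are invented for illustration; they are not the values of the 17 studies shown in Figure 5.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)
```

Passing per-study mean RTs as `xs` and per-study JSEs as `ys` (for example, the made-up lists `[300, 320, 340, 360]` and `[4, 9, 7, 12]`) returns a positive coefficient whenever faster studies tend to have smaller effects, which is the pattern described in the text.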

<sup>3</sup>Used studies (in alphabetic order): Colzato et al. (2012a): Independent group, interdependent group; Colzato et al. (2013): Convergent group, divergent group; Dittrich et al. (2012) and Costantini et al. (2013): Experiment 2 (horizontal joint go–nogo condition), Experiment 3 (horizontal joint go–nogo condition); Dittrich et al. (2017): Partition group, no partition group; Ferraro et al. (2011): Experiment 1; Hommel et al. (2009): Positive confederate, negative confederate; Iani et al. (2011): Experiment 1 (same group, different group), Experiment 2 (positive interdependence, negative interdependence); Iani et al. (2014): Experiment 1, Experiment 2; Klempova and Liepelt (2017): Experiment 1 (joint go–nogo condition), experiment 2 (joint go–nogo condition); Lam and Chua (2010): Joint go–nogo different alternative condition; Liepelt et al. (2011), Malone et al. (2014), and Ruissen and de Bruijn (2016): Solo condition, competitive condition, cooperative condition; Sebanz et al. (2003): Experiment 1, Experiment 2 (no feedback group); Stenzel et al. (2014): Experiment 1 (agency+/intentionality+ condition); Yamaguchi et al. (2016).

The most noteworthy finding of the present study is that the flexible adjustments of attention allocation differ based on the dependencies of interacting individuals, as shown in the four-way interaction between compatibility in the present trial, compatibility in the last trial, trial-to-trial transition, and group. Sequential trial-by-trial processing adjustments were enlarged under competition for repeating responsibilities of the same actor (go–go transitions), which implies a stronger focus on one's own task. This self-focus may be an attempt to maximize performance in order to have a higher chance of getting the reward. This finding nicely converges with behavioral and electrophysiological results of de Bruijn et al. (2008) showing that disengaging from the partner can be beneficial for one's own performance. Together with the present result, these findings provide compelling evidence against the view of Ruys and Aarts (2010) arguing that the type of relationship between the participants is irrelevant for the emergence of shared representations and, as long as there is interdependence, participants attend to the partner's performance. Compared to the finding of Ruissen and de Bruijn (2016) as well as Iani et al. (2011), who observed differences on the level of the JSE, the present findings provide an even more complex distinction of different types of interdependencies between interacting individuals derived by attention allocation, namely a stronger focus on one's own performance under competition. This interplay between attention allocation and the size of the JSE is perfectly in line with various experiments. For example, Colzato et al. (2012a) found that participants, whose attention was drawn to interdependence by circling interdependent pronouns (e.g., we, our) in essays, show a larger JSE compared to participants with a self-centered focus after having circled independent pronouns (e.g., I, me). Similarly, Colzato et al. 
(2013) found a larger JSE in a group of participants after a divergent thinking task, which led to a broader attentional focus, compared to a convergent thinking task that promoted an exclusive cognitive-control state. All of those findings support the view that, if the context at hand enables narrowing the focus to one's own task, the JSE is typically decreased. As such, the (joint) go–nogo Simon task appears to be a viable tool to investigate flexible adjustments of attention allocation governing self-other integration when interacting with others (cf. Colzato et al., 2012a,b; Dolk et al., 2012, 2013, 2014).

# CONCLUSION

Taken together, the present study demonstrates that participants flexibly adjust their allocation of attention based on the in-/dependence of receiving performance-contingent reward when interacting with others and thus to the contextual specificity of social interactions. Rather than fostering the consideration and differentiation of the other person, as happens when the relationship is characterized by cooperative dependence, pushing one's own performance to the limit appears to be the suitable strategy in a competitive context. Therefore, people keep their eyes on themselves when aiming at beating a co-actor and emerging as the winner.

# AUTHOR CONTRIBUTIONS

All authors contributed equally to study design, data analysis, drafts and revisions of the manuscript. In addition, JM was responsible for data collection.

# FUNDING

This work was supported by the German Research Foundation (DFG) within the funding program Open Access Publishing.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01361/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Mendl, Fröber and Dolk. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Multimodal Go-Nogo Simon Effect: Signifying the Relevance of Stimulus Features in the Go-Nogo Simon Paradigm Impacts Event Representations and Task Performance

#### Thomas Dolk<sup>1</sup> \* and Roman Liepelt<sup>2</sup>

<sup>1</sup> Department of Psychology, University of Regensburg, Regensburg, Germany, <sup>2</sup> Institute of Psychology, German Sport University Cologne, Cologne, Germany

#### Edited by:

Motonori Yamaguchi, Edge Hill University, United Kingdom

#### Reviewed by:

Basil Wahn, The University of British Columbia, Canada; Francesca Ciardo, Fondazione Istituto Italiano di Tecnologia, Italy

\*Correspondence:

Thomas Dolk, thomas.dolk@psychologie.uni-regensburg.de

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 14 May 2018; Accepted: 01 October 2018; Published: 25 October 2018

#### Citation:

Dolk T and Liepelt R (2018) The Multimodal Go-Nogo Simon Effect: Signifying the Relevance of Stimulus Features in the Go-Nogo Simon Paradigm Impacts Event Representations and Task Performance. Front. Psychol. 9:2011. doi: 10.3389/fpsyg.2018.02011

Numerous studies have shown that stimulus-response-compatibility (SRC) effects in the go-nogo version of the Simon task can be elicited as a result of performing the task together with another human or non-human agent (e.g., a Japanese waving cat, a working clock, or a ticking metronome). A parsimonious explanation for both social and non-social SRC effects is that highlighting the spatial significance of alternative (non-/social) action events makes action selection more difficult. This holds even when action events are task-irrelevant. Recent findings, however, suggest that this explanation holds only for cases of a modality correspondence between the Simon task as such (i.e., auditory or visual) and the alternative (non-/social) action event that needs to be discriminated. However, based on the fact that perception and action are represented by the same kind of codes, an event that makes the go-nogo decision more challenging should impact go-nogo Simon task performance. To tackle this issue, the present study tested if alternative stimulus events that come from a different sensory modality do impact SRC effects in the go-nogo version of the Simon task. This was tested in the presence and absence of alternative action events of a human co-actor. In a multimodal (auditory–visual) go-nogo Simon paradigm, participants responded to their assigned stimulus – e.g., a single auditory stimulus while ignoring the alternative visual stimulus or vice versa – in the presence or absence of a human co-actor (i.e., joint and single go-nogo condition). Results showed reliable SRCs in both single and joint go-nogo Simon task conditions, independent of the modality participants had to respond to.
Although a correspondence between stimulus material and attention-grabbing event might be an efficient condition for SRCs to emerge, the driving force underlying the emergence of SRCs rather appears to be whether the attentional focus prevents or facilitates alternative events to be integrated. Thus, under task conditions in which the attentional focus is sufficiently broad to enable the integration and thus cognitive representation of alternative events, go-nogo decisions become more difficult, resulting in reliable SRCs in single and joint go-nogo Simon tasks.

Keywords: stimulus-response compatibility, go-nogo Simon task, modality, event representations, referential coding, Theory of Event Coding

# INTRODUCTION


In the last 15 years, cognitive scientists have invested much effort into investigating how and to what extent people mentally represent their own and other people's actions/tasks and how these cognitive representations influence an individual's own behavior when interacting with another person. The most prominent paradigm of this line of research is widely known as the joint Simon paradigm, in which two people share the standard version of the Simon task (Sebanz et al., 2003).

In the standard Simon task, single participants execute spatially defined actions in response to non-spatial stimulus features (e.g., "Press right in response to the high-pitched tone and press left in response to the low-pitched tone"). Critically, however, both tones randomly appear to the left and the right of participants, leading to trials of spatially compatible and spatially incompatible stimulus-response (S-R) assignments (i.e., a high-pitched tone presented to the right side of the participant would be compatible, whereas the same tone presented to the left would be incompatible). Note that although stimulus locations are entirely task-irrelevant, they automatically activate spatially corresponding responses (i.e., the spatial location of the stimulus primes the response on the same side of space). In the case of a spatial match between the automatically activated and the assigned response, task performance is facilitated, whereas performance is impaired in the case of a spatial mismatch (Kornblum et al., 1990). This stimulus-response compatibility (SRC) effect, also known as the Simon effect (Simon, 1969; for reviews, see Proctor and Vu, 2006; Rubichi et al., 2006; Hommel, 2011), typically does not occur if the task is turned into a go-nogo task by having the participant execute single key presses in response to only a specific stimulus feature (i.e., a single tone/color; Hommel, 1996). However, an SRC re-emerges if the participant shares the same go-nogo task with another participant who responds to the other stimulus by operating the other response key, a phenomenon known as the social/joint SRC (Sebanz et al., 2003).

Such joint action effects have been taken to suggest that interacting individuals do not only form a cognitive representation of their own action or task but also (co-) represent the action or task of their co-actor (Sebanz et al., 2003, 2005; Welsh, 2009; Welsh et al., 2013; van der Wel and Fu, 2015). Co-representation is considered to be automatic and mandatory social in nature (Knoblich and Sebanz, 2006; Schmitz et al., 2017), such that the joint Simon task re-introduces a functionally similar kind of response competition as the standard Simon task (Kornblum et al., 1990; Sebanz et al., 2003). Recently, however, an increasing number of studies have challenged a purely social interpretation of SRC effects (e.g., Guagnano et al., 2010; Dolk et al., 2011, 2013; Dittrich et al., 2012, 2013; Sellaro et al., 2015; Stenzel and Liepelt, 2016; Michel et al., 2018). Some studies also provided evidence against a functional equivalence between the joint Simon task and the standard Simon task (Liepelt et al., 2011; Klempova and Liepelt, 2016). In line with these findings, Dolk et al. (2013) showed that the presence of another responding person is not required for (joint) SRC-like effects to occur. The presence of non-human "co-actors," such as a Japanese waving cat, a clock, and a metronome, elicited SRCs that were comparable in size to the SRCs typically found when two people perform a go-nogo Simon task together (e.g., Sebanz et al., 2003; Guagnano et al., 2010; Liepelt et al., 2011; Welsh et al., 2013). Thus, response competition in a go-nogo Simon task may not be driven by the presence of another person performing a task-related action, but rather by the presence of another attention-grabbing action event during the task processing. According to the Theory of Event Coding (TEC; Hommel et al., 2001, 2009), actions are cognitively represented by codes of their sensory consequences that are shared between self- and other-generated actions. Therefore, action control faces a discrimination problem between self-related event representations and simultaneously externally activated (non-self-related) event representations (Dolk et al., 2013). However, the exact nature of this action discrimination problem is not yet understood.

Studies analyzing the sequential modulation of Joint and Solo go-nogo SRC effects (Liepelt et al., 2011, 2013; Yamaguchi et al., 2018b) suggest that the relevant decision in the joint Simon task is a decision between the own go stimulus and the nogo stimulus (=go stimulus of the partner). When the go-nogo decision has to be performed together with a joint action partner, the presence of additional events due to the response of the partner during the nogo processing may enhance the relevance of the nogo stimulus, via a process that has been termed nogo tagging (Liepelt et al., 2011). In line with this idea, Baess and Prinz (2015) showed a modulation of stimulus processing as indicated by the Go- and NoGo-N1 component of the electroencephalogram (EEG). The modulation of the nogo decision by the presence of the responding partner has been interpreted as a change in agent identification – my turn vs. your turn (Liepelt et al., 2011; Wenke et al., 2011; Baess and Prinz, 2015). Based on the assumption that the presence of additional events during nogo processing enhances the task relevance of these events (Liepelt et al., 2011), we hypothesize that the presence of additional events during nogo processing may make it more difficult to discriminate between go and nogo processing (Kühn and Brass, 2010a,b; Weller et al., 2017). However, up to now, studies targeting SRCs in go-nogo versions of the Simon task either concentrated on manipulating the nature of alternative (social or non-social) action events (Tsai and Brass, 2007; Tsai et al., 2008; Lam and Chua, 2010; Müller et al., 2011; Stenzel et al., 2012, 2014; Dolk et al., 2013; Stenzel and Liepelt, 2016; Klempova and Liepelt, 2017) or varied the presence or absence of the response event by means of a responding partner (Sebanz et al., 2003; Welsh et al., 2007; Atmaca et al., 2011; Sellaro et al., 2013). To our knowledge no previous study has tested the impact of additional stimulus events on joint task performance. 
This is, however, a theoretically important question, as referential coding (Dolk et al., 2013) and TEC (Hommel et al., 2001) accounts would assume that perception and action are cognitively represented by the same kinds of codes (Prinz, 1997) and therefore alternative stimulus events that are present during the go-nogo decision should increase the difficulty of the discrimination problem. If so, this would indicate that joint go-nogo effects are driven not by the social context and co-representation of the action or task producing an agent discrimination conflict, but rather by concurrently activated stimulus or response events increasing the difficulty of the actor's own go-nogo decision (Kühn and Brass, 2010a,b).

Two recent studies testing the joint go-nogo effect using event-producing non-social objects found reliable SRC effects for the auditory modality using an auditory go-nogo Simon task when a Japanese waving cat provided visual waving cues and auditory cues (Lien et al., 2016; Puffe et al., 2017). These studies, however, did not find reliable SRC effects in a visual go-nogo Simon task when using the same objects. Due to this asymmetry, Puffe et al. (2017) suggested that the correspondence between the attention-attracting event and the stimulus material of the Simon task determines whether or not an SRC is present. However, a visual task may have focused visual attention on the visual stimuli on the screen, which would also explain why subjects did not perceive the event-producing object placed on the table in front of the screen. The auditory stimuli were presented via two laterally located loudspeakers separated by a distance of about one meter, which could have broadened the attentional focus, bringing the event-producing object back into it.

In the present study, we therefore tested the impact of stimulus and response events concurrently present during the go-nogo decision on single (single condition) and joint Simon task (joint condition) performance. Due to the previously observed asymmetry of task modality and the externally activated (task-irrelevant action/stimulus) event (Puffe et al., 2017), we also manipulated the modality of the go-nogo Simon task. By presenting the additional (task-irrelevant) event at the same location as the task-relevant stimulus, the width of the attentional focus was held constant. This was done to test whether the presence of the SRC in the go-nogo Simon task is due to (a) a modality correspondence between the attention-attracting event and the stimulus material or (b) a broadening of the attentional focus to integrate alternative (action and/or stimulus) events.

We predicted that if the integration of alternative events within the attentional focus and the corresponding enhanced difficulty of response discrimination underlie the SRC in the go-nogo Simon task, we should find an SRC effect in the presence of alternative events in Single visual and auditory go-nogo task conditions. Effects for both modalities should be larger when a concurrent response event is additionally present in the joint condition. In contrast, if the SRC effect is due to the modality correspondence of the attention-attracting event and the stimulus material, we should not find an SRC effect in Single visual and auditory go-nogo task conditions. That is because alternative events in our study are always presented in a different modality. Naturally, effects should be present in the joint condition in both visual and auditory modality conditions, as the co-actor's response contains both visual and auditory information.

# MATERIALS AND METHODS

### Participants

G\*Power 3.1 software (Faul et al., 2009) revealed that a sample size of N = 32 is required to guarantee sufficient statistical power of 1−β = 0.80 with α = 0.05 and partial η<sup>2</sup> = 0.23 (Iani et al., 2011, Experiment 2). Based on this analysis and aiming to extend the classical finding of Sebanz et al. (2003) with 40 participants to a multimodal go-nogo Simon paradigm, we tested N = 40 participants (28 female; M<sub>age</sub> = 23.5, SD<sub>age</sub> = 2.8, R<sub>age</sub> = 18–29 years). This guaranteed sufficient statistical power and compensated for potential dropouts. Participants had no history of neurological or hearing problems. They were all right-handed as assessed by the Edinburgh Inventory (Oldfield, 1971; M<sub>LQ</sub> = 92.8, SD<sub>LQ</sub> = 8.3, R<sub>LQ</sub> = 80–100), were naive with regard to the hypothesis of the experiment and were paid for their participation. Participants gave their written informed consent before their inclusion in the study in accordance with the ethical standards of the German Psychological Society (DGPs; 2016) and the 1964 Declaration of Helsinki. According to the DGPs ethics commission, an institutional research board's ethical approval is required only if (i) research carries additional risk beyond daily activities or (ii) any funding is subject to such an ethical review. No such requirements were present for this study.
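Power software commonly expects the effect size as Cohen's f rather than as partial η<sup>2</sup>. The following is a minimal sketch of the standard conversion, f = sqrt(η<sup>2</sup> / (1 − η<sup>2</sup>)), applied to the value reported above. The function name is illustrative, and the sketch does not reproduce the full power analysis, because the remaining G\*Power settings are not reported here.

```python
import math


def cohens_f_from_partial_eta_sq(eta_p_sq: float) -> float:
    """Convert partial eta squared to Cohen's f (standard conversion)."""
    if not 0.0 <= eta_p_sq < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p_sq / (1.0 - eta_p_sq))


# Effect size taken from the text (Iani et al., 2011, Experiment 2).
f = cohens_f_from_partial_eta_sq(0.23)
print(round(f, 2))  # 0.55
```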

### Stimuli and Procedure

Only one auditory and one visual signal were chosen as go and nogo stimuli in the present bi-modal go-nogo version of the Simon task. The auditory signal consisted of the spoken Dutch color word – "pars" (purple) – played in reverse so that no word was recognizable to our German participants (i.e., "chap") and presented at approximately 60 dB to either the left or right loudspeaker separated by a distance of one meter (i.e., 50 cm to the left or 50 cm to the right of the midline of the screen). The visual stimulus, a green light, was delivered via the left or the right light emitting diode (LED, r = 1 cm) attached to the top of the left and right loudspeaker (exceeding a visual angle of 79.6° × 18.9°; see **Figure 1**). However, to maintain participants' fixation at the center of the computer screen, an array of three squares, framed in white on a gray background (10.7° × 2.2°), was presented throughout each trial (i.e., from beginning until response execution), with the middle square serving as the fixation point (2.2° × 2.2°).

Upon arrival at the laboratory, pairs of participants were informed that they would perform the same task in two different conditions, i.e., they would perform the task alone in one condition (i.e., single condition, **Figure 1**, upper panel) and the same task together with the other person in the other condition (i.e., joint condition, **Figure 1**, lower panel; see Tsai et al., 2008; Atmaca et al., 2011; Pfister et al., 2014, for the same practice of introducing different experimental conditions to the participants).

In the joint condition (**Figure 1**, lower panel), both participants were seated next to each other. They operated a response button with their right index finger (25 cm in front and 25 cm from the midline of a 17″ computer monitor) and were asked to place their left hand underneath the table on their left thigh. Prior to the experiment, participants were familiarized with the task, including the presentation of the two stimuli and their assignment as go and nogo stimuli (e.g., "Person on the RIGHT press the response key if you see the green light and person on the LEFT respond by pressing the key if you hear 'chap'"). The individual target stimulus (auditory, visual), the response side (left, right) and the order of conditions (Single, Joint) were counterbalanced across participants (i.e., half of the participants started with the joint followed by the single condition, while the other half performed both conditions in reversed order).

In the single condition (**Figure 1**, upper panel), everything was held constant (i.e., assigned stimulus and response side) except that the left or right chair remained empty.

The whole experiment consisted of two consecutive sessions, one single and one joint session, with the order of sessions counterbalanced across participants. Each session comprised three blocks: one training block of 2 trials for each stimulus and S-R mapping (8 trials in total) and two experimental blocks of 64 trials for each stimulus (auditory vs. visual) and S-R mapping (compatible vs. incompatible; equals 256 trials). To maintain participant vigilance throughout the whole experiment, short breaks between blocks and a 5 min break between conditions outside the laboratory were provided.

Each trial (irrespective of the condition) began with the simultaneous presentation of the square array and a fixation-sound for 300 ms. After 700 ms, the critical stimulus – either the auditory or the visual signal – was presented for 300 ms to the left or the right loudspeaker/LED. Participants were encouraged to respond as quickly and as accurately as possible. After a response was given or 1500 ms had passed, a 1000 ms inter-stimulus-interval (i.e., a blank screen) followed. Note that in the Single go-nogo condition, 1500 ms had to pass in case of a nogo trial before the inter-stimulus-interval started.

# RESULTS

### Reaction Times

For statistical analysis, we excluded all trials in which the responses were incorrect (0.7%), or had a reaction time (RT) less than 150 ms or greater than 1000 ms (1.2%; Röder et al., 2007; Dolk et al., 2011, 2013; Liepelt et al., 2011). Responses were coded as compatible (stimulus ipsilateral to the correct response side) and incompatible (stimulus contralateral to the correct response side). To investigate the SRCs, correct RTs were submitted to an analysis of variance (ANOVA) with Compatibility (compatible, incompatible), and Condition (single, joint) as within-subjects factors and Modality (auditory, visual) as a between-subjects factor.
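The exclusion and coding rules above can be sketched as follows. This is an illustration only, not the authors' analysis code: the `Trial` record, its field names, and the helper functions are hypothetical, while the RT cutoffs and the compatibility definition are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    stimulus_side: str  # "left" or "right" (hypothetical field)
    response_side: str  # side of the correct response key (hypothetical field)
    rt_ms: float
    correct: bool


RT_MIN_MS = 150   # trials faster than this are excluded
RT_MAX_MS = 1000  # trials slower than this are excluded


def is_valid(trial: Trial) -> bool:
    """Keep only correct responses within the RT window."""
    return trial.correct and RT_MIN_MS <= trial.rt_ms <= RT_MAX_MS


def compatibility(trial: Trial) -> str:
    """Compatible: stimulus ipsilateral to the correct response side."""
    if trial.stimulus_side == trial.response_side:
        return "compatible"
    return "incompatible"


def mean_rt_by_compatibility(trials):
    """Mean RT of valid trials, split by compatibility."""
    rts = {"compatible": [], "incompatible": []}
    for trial in trials:
        if is_valid(trial):
            rts[compatibility(trial)].append(trial.rt_ms)
    return {level: sum(v) / len(v) for level, v in rts.items() if v}
```

The compatibility effect for one participant and condition is then the incompatible mean minus the compatible mean.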

This 2 × 2 × 2 ANOVA revealed a significant main effect of Compatibility, F(1,38) = 95.42, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.72, showing that responses were faster with stimulus-response compatibility (mean RT = 269 ms, SD = 43 ms) than with stimulus-response incompatibility (mean RT = 286 ms, SD = 45 ms)<sup>1</sup>. The main effect of Condition was also significant, F(1,38) = 8.56, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.18, showing that responses in the single condition were overall faster (mean RT = 269 ms, SD = 46 ms) than in the joint condition (mean RT = 286 ms, SD = 41 ms). The main effect of Modality was not significant (F < 1).
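For readers who wish to cross-check the reported effect sizes, partial η<sup>2</sup> can be recovered from an F value and its degrees of freedom with the standard identity η<sub>p</sub><sup>2</sup> = F · df<sub>effect</sub> / (F · df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies it to the two main effects above; the function name is illustrative.

```python
def partial_eta_sq(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Main effect of Compatibility, F(1,38) = 95.42
print(round(partial_eta_sq(95.42, 1, 38), 2))  # 0.72
# Main effect of Condition, F(1,38) = 8.56
print(round(partial_eta_sq(8.56, 1, 38), 2))   # 0.18
```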

More importantly, the SRC varied between conditions, as indicated by a significant interaction of Compatibility × Condition, F(1,38) = 9.15, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.19. The step-down analysis by the factor Condition revealed significant SRCs in both conditions, with a 21 ms compatibility effect observed in the joint condition, F(1,38) = 90.72, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.70, and a 14 ms compatibility effect in the single condition, F(1,38) = 41.16, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.51 (**Figure 2** and **Table 1**). Note that this modulation of the SRC by condition, as well as the SRC as such, was independent of the specific stimulus modality to which participants responded (all Fs < 1)<sup>2,3</sup>.

However, responses to the auditory modality were faster in the single condition (mean RT = 265 ms, SD = 59 ms) than in the joint condition [mean RT = 293 ms, SD = 50 ms; F(1,19) = 7.78, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.29], while this was not the case for responses to the visual modality [mean RT<sub>Single</sub> = 274 ms, SD<sub>Single</sub> = 31 ms; mean RT<sub>Joint</sub> = 279 ms, SD<sub>Joint</sub> = 31 ms; F(1,19) < 1], as indicated by a significant interaction of Condition × Modality, F(1,38) = 4.20, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.10.

visual). Error bars represent the standard error (SE).

As one reviewer pointed out, the data of both co-actors might not be fully independent. To account for this, we split the data by the factor Modality and ran two separate ANOVAs. However, the results did not change (for details, see **Table 2**).

### Error Rates

The 2 × 2 × 2 ANOVA revealed a significant main effect of Condition, F(1,38) = 7.61, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.17, indicating that participants made more errors when performing the task together with another person (0.6%) compared to when working alone (0.2%). This effect varied as a function of Modality, F(1,38) = 7.44, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.16, showing that participants made more errors in response to auditory compared to visual stimuli in the single condition (0.3% vs. 0.0%) but the reverse was true in the joint condition (0.3% vs. 0.9%). No other effects or interactions reached significance (all Fs < 1).

<sup>1</sup>To provide the reader with a baseline effect, we ran the same experiment with 10 new subjects (6 female; M<sub>age</sub> = 24.1, SD<sub>age</sub> = 3.3, R<sub>age</sub> = 20–31 years) in the standard two-choice version. Results revealed a significant SRC effect (33 ms), F(1,9) = 24.72, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.73, showing that responses were faster with stimulus-response compatibility (mean RT = 398 ms, SD = 73 ms) than with stimulus-response incompatibility (mean RT = 431 ms, SD = 64 ms).

<sup>2</sup>To rule out any effect of the order of conditions (single and joint), we included Order as a between-subjects factor into the 2 × 2 × 2 ANOVA (Compatibility, Condition, Modality). The respective 2 × 2 × 2 × 2 analysis revealed no significant four-way interaction, F(1,36) = 2.16, p = 0.150, η<sub>p</sub><sup>2</sup> = 0.06, suggesting that the order had no influence on the observed overall pattern of results. However, given that Modality overall had no influence on the emergence of SRCs in single and joint conditions, one might still wonder whether the order of conditions might influence the SRCs independent of Modality. A respective 2 × 2 × 2 ANOVA revealed no significant interaction between Compatibility, Condition, and Order, F(1,38) = 3.66, p = 0.063, η<sub>p</sub><sup>2</sup> = 0.09. For the sake of completeness, however, we still performed an additional step-down analysis by order. While the Compatibility × Condition interaction for those who started with the single go-nogo condition did not reach significance, F(1,19) = 0.61, p = 0.443, η<sub>p</sub><sup>2</sup> = 0.031, this interaction was significant for those who started with the joint go-nogo condition, F(1,19) = 16.65, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.47. Note, however, that although the go-nogo SRC effect in the single condition was significantly smaller [t(19) = 4.64, p < 0.001] as compared to the joint condition [t(19) = 8.74, p < 0.001] the SRC effect was reliable across both single [t(19) = 4.47, p < 0.001] and the joint [t(19) = 5.38, p < 0.001] tasks in both groups. Thus, even though there is some variation depending on the order of conditions, the overall pattern of a reliable SRC in the single and joint go-nogo condition is consistent.

<sup>3</sup>As requested by one reviewer, we provide an additional bin analysis in order to shed more light on the temporal dynamics of the multimodal SRC. To that end, we computed, separately for each condition and participant, the RT distributions, which we divided into four bins (quartiles). These data were analyzed by means of an ANOVA with condition, compatibility, bin, and modality as factors.

A respective 2 × 2 × 4 × 2 ANOVA with Bin (1,2,3,4) as additional within-subjects factor revealed no significant four-way interaction (F < 1). These results provide no evidence of a modality-driven difference in the time course of the go-nogo Simon effect, unlike what is often observed in the two-choice Simon task (for more discussion on the issue, see Wascher et al., 2001; Leuthold and Schröter, 2006; Xiong and Proctor, 2016; D'Ascenzo et al., 2018). Whether these results are a further example of the difference, rather than the similarity, between the two-choice and go-nogo Simon tasks is an interesting topic that warrants further investigation.
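The binning step of such a distributional analysis can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, and rank-based splitting into equally sized bins is one common way to form quartiles that the text does not specify.

```python
def quartile_bin_means(rts, n_bins=4):
    """Sort RTs, split them into n_bins rank-ordered bins, return bin means."""
    ordered = sorted(rts)
    n = len(ordered)
    if n < n_bins:
        raise ValueError("need at least one RT per bin")
    means = []
    for b in range(n_bins):
        # Integer arithmetic keeps bin sizes as equal as possible.
        chunk = ordered[b * n // n_bins:(b + 1) * n // n_bins]
        means.append(sum(chunk) / len(chunk))
    return means


def compatibility_effect_per_bin(compatible_rts, incompatible_rts, n_bins=4):
    """Incompatible minus compatible mean RT for each bin."""
    comp = quartile_bin_means(compatible_rts, n_bins)
    incomp = quartile_bin_means(incompatible_rts, n_bins)
    return [i - c for c, i in zip(comp, incomp)]
```

Applied separately per participant and condition, the per-bin values would then enter the ANOVA with Bin as an additional within-subjects factor.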

TABLE 1 | Mean and standard deviation of reaction time (ms) and error rate (%) for compatible and incompatible trials as well as spatial compatibility effect (SRC; compatible minus incompatible trials) as a function of condition (joint, single) and modality (auditory, visual).


<sup>∗</sup>p < 0.001, †not significant.

# DISCUSSION

TABLE 2 | Results of separate ANOVAs for the auditory and visual participants.

The aim of the present study was to investigate the effect of alternative stimulus events in the absence (single task) or presence (joint task) of alternative action events on task performance. When participants responded to stimuli in a single sensory modality and withheld responses to stimuli in another modality, we found reliable SRCs in both the single and the joint go-nogo Simon task condition (single < joint), for both visual and auditory sensory modalities. This finding contradicts the assumption that reliable go-nogo SRCs in the single go-nogo condition are restricted to cases in which there is correspondence between the modality of stimulus material and attention-grabbing alternative events. Rather, the present findings suggest that the spatial coupling of alternative events, here accomplished by presenting auditory and visual stimuli in the same locations, facilitates their integration, and thus creates the need to discriminate between them in order to respond appropriately in a given context. The finding of such integration is in line with multisensory research showing that the processing of spatial stimuli coming from different sensory modalities seems to rely on a shared pool of attentional resources (Wahn and König, 2017). When the task of responding to events coming from visual and tactile modalities is distributed across two persons, the crossmodal congruency effect was found to be socially modulated (Heed et al., 2010). However, in contrast to our finding of an increased SRC in the joint as compared to a single go-nogo Simon task condition, Heed et al. (2010) observed a significantly reduced crossmodal congruency effect under joint as compared to single conditions. This reduction was mainly due to faster performance on incongruent trials. One might attribute these different findings to different modality combinations used across these studies – visual-auditory in our study vs. visual-tactile in the study of Heed et al. (2010). However, a more recent study by Wahn et al. (2017) showed a similar reduction of the joint crossmodal congruency effect with an audio-visual crossmodal congruency task. Thus, an effect of different modality pairings is unlikely to explain this discrepancy. Instead, the opposite effects between the Heed study and our study are more likely to be attributed to different task demands (Liepelt and Fischer, 2016) and whether the joint task allows a division of labor or not. When a division of labor across persons is possible, the burden or distraction of alternative event representations is reduced (cf. Sellaro et al., 2013, 2018).
In the present study, however, the discrimination of alternative events cannot be handed over to the partner and thus cannot be separated. On each trial a discrimination has to be performed in order to either go or withhold the response. Thus, in the present study the need to discriminate between these events is an additional demand, explaining the increase in reaction time in the joint as compared to the single-task condition (cf. Yamaguchi et al., 2018a). Furthermore, our findings relate to a study showing that peripersonal space boundaries shrink when subjects face another individual (Teneggi et al., 2013). During joint action it has been shown that attention to items appearing in the peripersonal space and intentional weighting interact, so that the effect of enhanced spatial processing for those items is counteracted by a stronger weighting of discriminative action features (Liepelt, 2014), thus increasing the Simon effect.

In previous work using tasks that require performance of selective (i.e., go-nogo) responses to different features within the same sensory modality (e.g., auditory, tactile and/or auditory sensation), SRCs are typically observable in the presence (i.e., "joint" condition) of (social or non-social) reference-providing events in the response dimension, but not when those attention-grabbing events are absent (i.e., single condition; Sebanz et al., 2003; Dolk et al., 2011, 2013; for a review, see Dolk et al., 2014). The present findings extend this body of work by indicating that stimuli presented in different sensory modalities influence information processing and response selection not only when jointly performing such a complementary multimodal go-nogo Simon task, but even in the absence of any perceivable reference-providing event in the response dimension, viz. the single go-nogo condition (Stenzel and Liepelt, 2016). Additionally, this finding provides further evidence against the notion of action and/or task co-representation (Sebanz et al., 2003; Atmaca et al., 2011)<sup>4</sup>, thereby calling for an alternative explanation (for a review, see Prinz, 2015).

Sudden onsets of stimulus events in two different modalities that call for distinct, corresponding (spatially defined) action alternatives – to act/go or not to act/nogo – may inevitably direct attention to features that enable perceptual discrimination in the stimulus domain. Given that this (stimulus) event discrimination is typically followed by perceivable consequences of spatially related action alternatives (cf. Baess and Prinz, 2015)<sup>5</sup> (Milanese et al., 2010, 2011; Iani et al., 2014), discriminable features can increase the weight of codes on which their cognitive representation is determined (Hommel et al., 2001; Memelink and Hommel, 2013). As stimulus events in the Simon tasks are typically coupled with particular action events, the tight spatial and temporal co-occurrence of perceptual (i.e., stimulus and action) events leads to the transient, episodic integration of the respective features into event-files, object-files, or object tokens (Kahneman et al., 1992; Schneider, 1995; Hommel et al., 2001, respectively).

Consequently, strengthening one member of these cognitive bindings through intentional weighting (or the distribution of attentional weights thereupon; Bundesen, 1990; Schneider, 1995) may influence the activation of other members involved in such bindings, such as the spatial features that discriminate their subsequent responses from other events in the Simon task (Hommel et al., 2001; Memelink and Hommel, 2013). The activation strength of specific features depends upon whether and how strongly the dimension of features is defined by task-relevance and task setting. In the present study, the sensory stimulus modality (auditory/visual), the size or scope of the attentional focus, and spatially pre-defined action alternatives (left/right) seem to be important dimensions receiving the most weight in the event-file.

In other words, making the representation of alternative stimulus and action events more task-relevant – by emphasizing the coding of discriminable features via stimulus processing – increases the competition between these representations as well as those events associated and spatially/temporally coupled with them. Based on the experimental setting of the present study, this means that these representations involve sensory features according to the specific stimulus modality and spatial features of the to-be-executed action alternatives, which induces at least two different competitions between feature codes (Duncan, 1996; Dutzi and Hommel, 2009). Given that response selection can only proceed when stimulus events have successfully been dissociated, reaction time should increase with every extra feature dimension that is considered in the process of event-coding in go-nogo settings (single < joint, see **Figure 2**). Accordingly, in contrast to previous findings of (social) SRCs, the present results provide no indication for social facilitation when sharing a multimodal Simon task with another person. Instead, and in line with the presented framework, additional action events that need to be discriminated in the course of response coding further signified the task-relevance of nogo stimuli, thereby providing an explanation for the further increase of SRCs from single go-nogo to joint go-nogo conditions, a process that has been termed nogo-/inhibitory tagging (Liepelt et al., 2011).

From a mechanistic perspective, stimulus events in the Simon task are widely accepted to exert their impact on response competition mainly via task-irrelevant (i.e., spatial) features. This results either in the activation of the same (compatible trials) or the opposite (incompatible trials) response leading to facilitation or interference, respectively. This impact of competing event representations should be even stronger if the significance of task-relevant stimulus features (i.e., via the multi-modality) highlights the corresponding (spatially defined) action alternatives. This seems to hold irrespective of whether the action is to-be-executed or not (cf. Kühn and Brass, 2010a,b) and is even more relevant when alternative stimulus events share locations of possible occurrences (Stenzel and Liepelt, 2016; Puffe et al., 2017). In prior work, stimulus events and attention-grabbing alternative (action) events were spatially distinct and influential only in cases of a (modality) correspondence between stimulus and response event (e.g., Dolk et al., 2013; Lien et al., 2016; Puffe et al., 2017). In the present experiment, the spatial overlap of both the relevant feature dimension of the stimuli and the alternative stimulus event in single and joint go-nogo Simon task conditions seems to challenge go-nogo decisions reliably. In the joint condition, where associated action events are to be distinguished on top of the perceptual discrimination via the stimulus modality, the task relevance of those go-nogo decisions can be considered to be further strengthened, thereby providing an explanation for the significantly increased SRC in the joint condition.

<sup>4</sup>A similar pattern of results was already shown by Sebanz et al. (2005), who forced one participant to respond to the pointing direction of the stimulus hand whereas the other person had to respond to a colored ring attached to the stimulus hand. Counterbalancing single and joint go-nogo conditions of this task across participants revealed an SRC in the joint, but most interestingly also a reliable effect in the single go-nogo condition. Even though the authors described the latter finding more as an accident, i.e., as a compatibility effect in its own right (see Hommel, 1996) or due to a carryover effect from joint to single conditions (Sebanz et al., 2005), it highlights the so far widely underestimated impact of stimulus features (i.e., attentional breadth) on information processing and response selection, which clearly warrants further investigation.

<sup>5</sup>Based on recent findings of Kühn and Brass (2010a,b), who showed that withholding an action (e.g., a nogo due to instruction) is explicitly and more importantly represented as action-specific, we expect an instruction "not to act" to be cognitively represented as a simple alternative to one's own action event representation. Thus, withholding a response does not necessarily need the perception of such an alternative (as long as it refers to a comparable action event) to activate its sensory consequences.

In sum, although a spatial and temporal correspondence of stimulus material and attention-grabbing event might be an efficient condition for SRCs to emerge, the driving force underlying the emergence of SRCs rather appears to be (the width of) the attentional focus, which either prevents or allows the integration of alternative events that then require discrimination from task-relevant events. This assumption is in line with previous findings showing reliable SRCs in the single go-nogo condition or even enlarged SRCs in the presence of (non-)social action events when: (i) attentional capacities are available to integrate alternative events (e.g., Dolk et al., 2013; Lien et al., 2016; Puffe et al., 2017), (ii) all perceivable events are in the focus of attention (e.g., Stenzel and Liepelt, 2016), (iii) attention is directed toward the space of alternatives by acting upon those directly (e.g., Porcu et al., 2016), or (iv) current cognitive states attenuate or enlarge the attentional focus (Colzato et al., 2012a,b). Thus, as soon as the attentional focus is broad enough to enable the integration and cognitive representation of alternative events, the difficulty of discriminating between events that are concurrently active is increased by any additional stimulus or response event challenging this process. The result is reliable SRCs in single and "joint" go-nogo Simon tasks.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# FUNDING

This research was financially supported by the German Research Foundation Grant DFG LI 2115/1-3 awarded to RL.

# ACKNOWLEDGMENTS

We thank Patricia Grocke for help with data acquisition.



of Dolk, Hommel, Prinz, and Liepelt (2013). PLoS One 12:e0184844. doi: 10.1371/journal.pone.0184844



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Dolk and Liepelt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Sharing Different Reference Frames: How Stimulus Setup and Task Setup Shape Egocentric and Allocentric Simon Effects

Pamela Baess\*, Tom Weber and Christina Bermeitinger

Institute of Psychology, University of Hildesheim, Hildesheim, Germany

Different reference frames are used in daily life in order to structure the environment. The two-choice Simon task setting has been used to investigate how task-irrelevant spatial information influences human cognitive control. In recent studies, a Go/NoGo Simon task setting was used in order to divide the Simon task between a pair of participants. Yet, not only a human co-actor but even an attention-grabbing object can provide sufficient reference to reintroduce a Simon effect (SE), indicating cognitive conflict, in Go/NoGo task settings. Interestingly, the SE could only occur when a reference point outside of the stimulus setup was available. The current studies exploited the dependency between different spatial reference frames (egocentric and allocentric) offered by the stimulus setup itself and the task setup (individual vs. joint Go/NoGo task setting). Two studies (Experiments 1 and 2) were carried out along with a human co-actor. Experiment 3 used an attention-grabbing object instead. The egocentric and allocentric SEs triggered by different features of the stimulus setup (global vs. local) were modulated by the task setup. When interacting with a human co-actor, an egocentric SE was found for global features of the stimulus setup (i.e., stimulus position on the screen). In contrast, an allocentric SE emerged in the individual task setup, illustrating the relevance of more local features of the stimulus setup (i.e., the manikin's ball position). Results point toward salience shifts between different spatial reference frames depending on the nature of the task setup.

#### Edited by:

Motonori Yamaguchi, Edge Hill University, United Kingdom; Kerstin Dittrich, University of Freiburg, Germany

#### Reviewed by:

Francesca Ciardo, Fondazione Istituto Italiano di Tecnologia, Italy; Cordula Vesper, Aarhus University, Denmark

\*Correspondence: Pamela Baess baessp@uni-hildesheim.de

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 15 April 2018; Accepted: 08 October 2018; Published: 30 November 2018

#### Citation:

Baess P, Weber T and Bermeitinger C (2018) Sharing Different Reference Frames: How Stimulus Setup and Task Setup Shape Egocentric and Allocentric Simon Effects. Front. Psychol. 9:2063. doi: 10.3389/fpsyg.2018.02063

Keywords: egocentric frame of reference, allocentric frame of reference, Simon effect, task sharing, joint action

# INTRODUCTION

Imagine yourself as the pilot in the cockpit of an airplane. In front of you, there is a multitude of displays, electronic flight instruments, and instruments ensuring safety during your flight. Next to you is another pilot who, alongside you, controls and checks all visual aids for flight security. As far as operational issues are concerned, both pilots share the responsibility for managing the flight and the incoming stream of information provided by all visual displays. In general, human-machine displays are an example par excellence of the need to form spatial codes in order to structure the environmental input. One dominant way of structuring the environment makes use of spatial labels such as up and down or left and right. With reference to the example of the cockpit, flying is a shared responsibility involving both pilots and still requires the formation of one's own spatial codes while concurrently representing the task and responsibilities of the other pilot as well.


In laboratory tasks, it is well known that information about the spatial location of a stimulus is hard to ignore even when it is completely task-irrelevant, a phenomenon known as the Simon effect (SE) (for review, Simon, 1990; Lu and Proctor, 1995; Proctor and Vu, 2006; Hommel, 2011). In a two-choice Simon task setting, participants have to respond to one stimulus feature (for example, red or green color), with one response assigned to each color.

For example, the left/right response button is required as a reaction to a green/red stimulus shown on the screen if the left/right button was assigned to green/red by task instruction. The location of the stimuli varies: they are displayed either on the left or the right side of the screen's center (e.g., Craft and Simon, 1970). Interestingly, responses are much quicker when stimulus location and response location overlap [stimulus–response (SR) compatible; e.g., a green stimulus on the left side requiring the left button response] than when they do not overlap (SR incompatible). This difference in reaction time is referred to as the SE, and it is explained in terms of an interaction between two parallel and independent processing routes connecting perception to action: an unconditional and a conditional route (Kornblum et al., 1990; Dejong et al., 1994). The unconditional route leads to automatic activation of the spatially corresponding response (for example, a stimulus on the left side triggering the left response), irrespective of task instructions. In contrast, in the conditional route, the response is activated based on the task-required associations between stimuli and spatial codes (for example, left button when a green stimulus occurs). Importantly, both routes activate the same response on SR compatible trials (e.g., green stimulus on left side requiring left response). On SR incompatible trials (e.g., green stimulus on right side requiring left response), however, the two activated responses differ. The resulting conflict between both activations slows down the response.
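As a purely illustrative aid, the SE described above can be sketched as the difference between the mean reaction time on SR incompatible trials and the mean reaction time on SR compatible trials. The function name, the trial format, and the reaction times below are hypothetical and are not taken from any of the cited studies.

```python
from statistics import mean


def simon_effect(trials):
    """Mean RT on SR incompatible trials minus mean RT on SR compatible trials (ms)."""
    compatible = [t["rt_ms"] for t in trials
                  if t["stimulus_side"] == t["response_side"]]
    incompatible = [t["rt_ms"] for t in trials
                    if t["stimulus_side"] != t["response_side"]]
    return mean(incompatible) - mean(compatible)


# Hypothetical reaction times, for illustration only (not data from any study).
example_trials = [
    {"stimulus_side": "left", "response_side": "left", "rt_ms": 400},
    {"stimulus_side": "right", "response_side": "right", "rt_ms": 420},
    {"stimulus_side": "right", "response_side": "left", "rt_ms": 440},
    {"stimulus_side": "left", "response_side": "right", "rt_ms": 460},
]

effect_ms = simon_effect(example_trials)
```

A positive value indicates slower responses on SR incompatible trials, which is the pattern the dual-route account attributes to response conflict.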

If one response alternative is removed, turning the two-choice task setting into a Go/NoGo task setting (e.g., react only to green stimuli with the left button and withhold responding to red stimuli), typically no reliable SEs are obtained (Hommel, 1996; Shiu and Kornblum, 1999; Ansorge and Wuhr, 2004). This is explained by the absence of a source of conflict between stimulus codes and response codes (codes referring to the cognitive representation of stimulus and response, respectively) in Go/NoGo task settings.

Most compelling was the seminal finding of Sebanz et al. (2003), who reported the reoccurrence of a SE in a Go/NoGo task setting when the task was shared with a partner in such a way that each participant was responsible for reacting only to a particular stimulus color with a specified response button, but no SE in an individual Go/NoGo task setting. This was interpreted as a so-called joint SE (JSE), i.e., a SE in the tradition of the two-choice task setting obtained by dividing the Simon task between two participants (for review, Dolk et al., 2014), introducing the idea of a co-representation of the co-actor's task. Although others (Dolk et al., 2011, 2013; Dittrich et al., 2012, 2013) emphasized that a human co-actor is not necessarily required in order to obtain a JSE (see below), it is nevertheless theoretically fascinating that, under certain conditions, a co-actor or object might provide a spatial reference in joint Go/NoGo task settings. Thus, the co-actor or object apparently strengthens a spatial representation of the task (e.g., I-Go, You-Go), thereby reintroducing response conflict as the source of the reoccurrence of a SE. Here, we will refer to "task setting" in order to differentiate between two-choice and Go/NoGo variants of the Simon task. Different versions of Go/NoGo task settings were contrasted: with a co-actor present (joint Go/NoGo task setting) or alone (individual Go/NoGo task setting). Both Go/NoGo task settings introduce variations in the task setup: a joint task setup requires the differentiation between the responding agents (I-Go vs. You-Go), and thus cognitive representations as the basis of this differentiation, whereas this is not required in the individual task setup. Thus, the task setup might include cognitive representations of how the stimuli and responses in the Go/NoGo task setting are divided between two participants. 
In the joint task setup, this includes how both participants represent their part of the Simon task, including their critical stimulus feature (e.g., green or red) and response button (left or right button). In contrast, in the individual task setup, only cognitive representations of one's own stimulus feature (e.g., green) and response button are required. In this reading, task setup contains all the representations involved in representing one's own task during a joint or individual Go/NoGo task setting, but also all the other, task-irrelevant specifications of how the Go/NoGo task settings are carried out (e.g., physical distance between co-actors; objects in the room). Others have coined the term "task shaping" (Prinz, 2015; Dolk and Prinz, 2016) in joint two-choice or Go/NoGo task settings as a broader term when studying task setups in the joint or individual context. However, task setup in our reading refers more to the concrete situation in which the two-choice or Go/NoGo task setting is accomplished. In addition, the stimulus setup (see below) contains the exact representation of the alignment of the stimuli visible in the Simon task on the screen.

Coming back to the seminal finding of a JSE in the joint Go/NoGo task setting, further studies explored how the JSE could be re-established even without a human co-actor (Dolk et al., 2011, 2013; Dittrich et al., 2012, 2013). Enriching the task setup (e.g., by presenting external, attention-grabbing objects or ambiguous response devices) provided sufficiently salient reference points to reintroduce the response conflict that is an essential source of a SE in Go/NoGo task settings. For example, Dolk et al. (2013) documented a SE for an auditory Simon task in the individual Go/NoGo task setting when the task setup included a Japanese waving cat or other attention-grabbing objects placed, for example, on the left side of the participant. Likewise, a task setup enriched by increasing the salience of the responses through a joystick (Dittrich et al., 2012) can provide sufficient reference points for a SE in the individual Go/NoGo task setting. Reports of a SE even without the involvement of a co-actor have promoted alternative accounts (Dittrich et al., 2012; Dolk et al., 2013) emphasizing the potential role of the cues in the enriched task setup (e.g., provided by an attention-grabbing object) as reference points for a spatial coding of the scenario. Importantly, all this research inspected how changes in the task setup can reintroduce a SE in Go/NoGo task settings. However, the possibility that the stimulus setup itself could introduce response conflict in joint and individual Go/NoGo task settings remains largely untested. The current study explicitly explores the role of an enriched stimulus setup by contrasting different task setups, i.e., joint and individual task setups in Go/NoGo task settings. The use of an enriched stimulus setup is based on the assumption that it allows different spatial reference frames to be formed besides the commonly used one, which refers to the spatial side relative to the screen's center.

While the idea of multiple reference frames has received a substantial amount of consideration in studies with two-choice Simon task settings (for review, Rubichi et al., 2006), it has been (almost) completely neglected in Go/NoGo Simon task settings. In the following, we will first briefly review the available evidence of multiple spatial codes in two-choice task settings and then elaborate on the idea of multiple spatial codes under Go/NoGo task settings.

# Multiple Spatial Codes in Two-Choice Simon Task Settings

As discussed above, the SE in the standard two-choice Simon task setting (for review, Simon, 1990) is explained with the concept of SR overlap. However, the standard stimuli (e.g., green/red circle on the left/right side of the screen's center) used in Simon tasks offer only one kind of crucial SR overlap, namely with regard to the center of the screen (which is in this case identical with the participant's body midline). This is a very reduced laboratory situation and can by no means be compared to a scenario as complex as the cockpit with its multiple spatially aligned displays. Therefore, prior research studied whether different SEs, indicating the existence of different spatial codes, can be observed in the two-choice Simon task setting when the stimulus setup itself provides more reference points for a potential SR mapping (for review, Rubichi et al., 2006). Crucially, the stimulus setup (and not the task setup) was enriched in order to provide reference points for a spatial coding relative to the center of the screen (also called hemispace) along with a spatial code within the left or right half of the screen (labeled as hemifield or relative stimulus position). With reference to the literature, two principal ways of implementation can be contrasted. Some studies used an "external-object-approach," meaning that the stimulus setup was enriched by presenting additional, external objects (such as vertical lines or horizontally aligned boxes) on the screen in spatial relation to the critical stimulus. These objects, however, were not part of the critical stimulus as such, but helped to introduce different spatial locations on the screen. Consequently, the critical stimuli could occur at different spatial locations along the horizontal or vertical dimension (Nicoletti and Umilta, 1984; Umilta and Liotti, 1987; Lamberts et al., 1992; Roswarski and Proctor, 1996). 
For example, in the study by Roswarski and Proctor (1996), three short vertical lines were presented on the screen, demarcating four potential locations for the occurrence of the critical stimulus (two in each hemifield). Here, SEs occurred for both possible reference frames, i.e., for hemispace (with reference to the center of the screen) and hemifield (referring to the relative position within each hemispace), provided that the reference lines were visible before the critical stimulus. Through such external objects, multiple spatial locations of stimulus occurrence were established, allowing the formation of spatial codes, as indicated by the presence of different SEs with regard to different spatial reference frames. Yet, these spatial codes were not formed automatically: the spatial codes possibly formed for the relative position within each hemispace might have been overwritten during response selection when the reference frame and the target stimuli were presented simultaneously. To summarize, the "external-object-approach" provided evidence for SEs (and/or the SR prober as utilized in some studies) recruiting different spatial reference frames, depending crucially on the experimental manipulation: different spatial reference frames were only established when reference objects or spatial cues (for example, indicating the side of the screen of the upcoming stimuli) were provided before the occurrence of the critical stimulus (Lamberts et al., 1992; Roswarski and Proctor, 1996). In other words, different spatial reference frames (as indicated by SEs) were only formed when additional cues, be they temporal and/or spatial, were provided. Besides this, a dominant SE based on the center of the screen (i.e., hemispace) was the robust finding.

Another set of studies followed a different procedure ("same-object-approach") by enriching the stimulus setup through embedding the critical stimulus into a more global object (Wang et al., 2016; Baess and Bermeitinger, unpublished). For example, Wang et al. (2016) presented the critical stimulus, a fork, in combination with another object, a plate, so that the fork was superimposed on the plate. Participants were required to react to the color of the fork, whereas the position of the fork with respect to the plate was completely task-irrelevant. The fork's position could be assessed in two different ways: regarding one's own body midline (i.e., egocentric position; fork and plate on the left or right side of the screen's center) or regarding its position on the plate (i.e., allocentric position; fork on the left or right side of the plate). With this stimulus setup, egocentric and allocentric SEs were obtained simultaneously; however, the allocentric SE was subject to carry-over effects from a preceding spatial judgment task inducing the allocentric perspective. In contrast, Baess and Bermeitinger (unpublished) reported evidence for the simultaneous formation of egocentric and allocentric SEs independent of previous task instructions. The authors used drawings of stick-figure manikins holding a colored ball in either hand (allocentric reference frame). The manikins were presented on either the left or right side of the screen (egocentric reference frame). Here, reliable egocentric (with reference to the manikin's screen position) and allocentric (with reference to the manikin's ball position) SEs occurred simultaneously, without any previous task demands or spatial or temporal cues presented before the critical stimulus. A further manipulation contrasted the number of manikin stimuli (one manikin vs. nine manikins) simultaneously shown on the screen, introducing the possibility of another, non-spatial perceptual reference frame recruiting the Gestalt law of grouping (Koffka, 1935/1963). Interestingly, the egocentric reference frame interacted with this non-spatial perceptual one: larger egocentric SEs were reliably observed when one manikin was presented than when nine manikins were presented simultaneously. In contrast, the allocentric SEs remained unaffected by the manipulation of the non-spatial perceptual reference frame: reliable allocentric SEs were observed for both variants of the non-spatial perceptual reference frame. To conclude, in contrast to the previous external-object-approach, the reference points required for the formation of spatial codes surrounded the critical stimulus itself, embedding it into another, more global object. Throughout the rest of this paper, we will use "egocentric" SEs when referring to the reference frame based on the screen's center (in the "external-object-approach" labeled as hemispace) and "allocentric" SEs when the reference frame was given as part of a more global object (in the "external-object-approach" called hemifield).
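To make the two reference frames concrete, the following minimal sketch codes a single trial of such a "same-object" stimulus as compatible or incompatible in each frame. The function name and the string labels are hypothetical, and the sketch assumes that both the screen side and the side within the global object are coded as "left" or "right" from the viewer's perspective.

```python
def code_trial(screen_side, object_side, response_side):
    """Classify one trial in the egocentric and the allocentric reference frame.

    screen_side:   side of the global object relative to the screen's center
    object_side:   side of the critical feature within the global object
                   (assumed to be coded from the viewer's perspective)
    response_side: side of the required response button
    """
    return {
        "egocentric_compatible": screen_side == response_side,
        "allocentric_compatible": object_side == response_side,
    }


# A manikin on the left half of the screen holding the ball on its right side,
# with a left button response required: egocentrically compatible,
# allocentrically incompatible.
example = code_trial(screen_side="left", object_side="right", response_side="left")
```

Because the two classifications are independent of each other, an egocentric and an allocentric SE can be estimated separately from the same set of trials.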

# The Present Study

Although the idea of different spatial reference frames present in one stimulus setup is well established, not much is known about how different spatial reference frames are shared between two participants. As shown with the case of the JSE (for review, Dolk et al., 2014), a co-actor or an attention-grabbing external object has proven to be salient enough to enrich the task setup in order to provide a reference frame (and thus a JSE) even in joint and individual Go/NoGo task settings. The possibility of an interaction between reference frames introduced by the task setup and those implemented through the stimulus setup is definitely fascinating. To the best of our knowledge, only the study by Ciardo et al. (2016) has addressed this question so far. In that study, which used the "external-object-approach" and divided the screen with short vertical lines, the critical stimulus could occur randomly at any of four different locations. Thus, the stimulus setup included two different reference frames, namely hemispace (in our reading, egocentric) and hemifield (in our reading, allocentric). Moreover, the task was conducted either together with a co-actor or alone (joint and individual Go/NoGo task setting). Evidence was reported for a SE for hemispace in the joint Go/NoGo task setting, but not in the individual Go/NoGo task setting. No SEs based on hemifield were observed in either task setup. In a further experiment, two participants performed the task as a two-choice task setting with this stimulus setup under joint and individual task setups. Here, SEs for both spatial reference frames were observed and no difference based on task setup was evident. Consequently, this study provides initial evidence for the interaction between different spatial reference frames provided by stimulus setup and task setup. To date, however, it is unclear how different spatial reference frames given through task setup and stimulus setup are effective when using the "same-object-approach" with an enriched stimulus setup. 
Therefore, the present research further investigated the formation of different spatial reference frames (as indexed by the egocentric and allocentric SEs), as indicative of an enriched stimulus setup, under different (i.e., individual and joint) task setups with Go/NoGo task settings. The present study was tailored to investigate the formation of multiple spatial codes in joint and individual Go/NoGo task settings using the enriched stimulus setup with stick-figure manikins.

Of particular interest to us is how different task setups (individual vs. joint) influence the formation of multiple reference frames based on the stimulus setup presented. Based on the previous literature on different task setups (i.e., individual and joint) with Go/NoGo task settings, one would predict a SE in the joint Go/NoGo task setting, but none in the individual one: an egocentric SE based on the stimulus' screen position should only be elicited when a co-actor or an attention-grabbing object provides sufficient reference as a source for the occurrence of cognitive conflict. Regarding the allocentric SE based on the manikin's relative ball position, the hypothesis would be similar: no SE in the individual Go/NoGo task setting and, if any, a SE in the joint Go/NoGo task setting. Alternatively, the available evidence of different spatial reference frames in enriched stimulus setups points toward the possibility of simultaneous egocentric and allocentric SEs in the two-choice Simon task setting (Baess and Bermeitinger, unpublished). Yet, it is so far unclear whether these two SEs can also be observed in Go/NoGo task settings of this Simon task, given that SEs are generally absent in Go/NoGo task settings, unless triggered by the task setup, because of the lack of response conflict.

In order to scrutinize the saliency of different reference frames depending on task setup and stimulus setup, we conducted three experiments. Experiment 1 used the stick-figure manikins of the two-choice version of the Simon task with egocentric and allocentric reference frames (Baess and Bermeitinger, unpublished) under an individual and a joint Go/NoGo task setting. Experiment 2 repeated Experiment 1 with different stimulus material. In Experiment 3, an external, attention-grabbing object (i.e., a Japanese waving cat) was placed next to the participant, with otherwise the same stimulus setup as in Experiment 1. Across all three experiments, the enriched stimulus setup with different spatial reference frames remained constant, but the task setup changed (with/without co-actor/attention-grabbing object).

# EXPERIMENT 1

The present experiment implemented a Go/NoGo version of the two-choice egocentric and allocentric Simon task introduced by Baess and Bermeitinger (unpublished). Drawings of stick-figure manikins with a ball of blue or yellow color in either hand (allocentric reference frame) were presented to the left or right of the screen's center (egocentric reference frame). With this enriched stimulus setup, both spatial reference frames were instantly processed, as indicated by reliable egocentric and allocentric SEs in the two-choice task setting. Further, Baess and Bermeitinger (unpublished) reported that the size of the egocentric SE (i.e., based on the manikin's position with reference to the screen's center) was modulated by a non-spatial perceptual reference frame introduced through the number of identical stimuli shown on the screen.

The current experiment used exactly the same version of the Simon task in a Go/NoGo task setting. The task was divided between two participants in such a way that each one responded to only one stimulus color (i.e., color of the ball in the manikin's hand: blue or yellow). The task setup contained a joint Go/NoGo task setting and an individual Go/NoGo task setting. Moreover, the number of stimuli presented on the screen was manipulated: either one manikin or a set of nine manikins was shown simultaneously. This variation created the possibility of another, non-spatial perceptual reference frame based on the Gestalt law of grouping (Koffka, 1935/1963). As this non-spatial perceptual reference frame particularly influenced the size of the egocentric SE in the two-choice task setting (Baess and Bermeitinger, unpublished), it was also included in the present set of studies.

Based on previous literature on the influence of task setup on the emergence of a SE (for review, Dolk et al., 2014), egocentric and allocentric SEs should occur in the joint Go/NoGo task setting, but not the individual one. As suggested by the two-choice task setting (Baess and Bermeitinger, unpublished), when nine manikins were simultaneously shown on the screen, responses should be generally faster than when one manikin was presented.

# Materials and Methods

#### Subjects

Forty participants were recruited for this study. One participant was excluded due to a lack of compliance (two-choice responses in the single Go/NoGo condition). Thus, the final sample consisted of 39 participants (mean age: 21.6 years; range: 19–34 years; five male). Six participants were left-handed (mean laterality quotient: −55.80, SD = 46.30) as assessed with a handedness questionnaire (Oldfield, 1971). Participants were individually recruited through advertisement at the University of Hildesheim and received partial course credit for participation. Parts of the experiments were performed together with a same-gender participant. Their personal relationship was assessed with the IOS scale (Aron et al., 1992), showing a mean relationship of 2.64 (SD = 1.48) on a scale from 1 to 7. All participants gave written informed consent and were treated in accordance with the Declaration of Helsinki. The study was approved by the local ethics committee of the University of Hildesheim ("Fachbereich 1").

#### Stimuli and Experimental Tasks

Stick-figure manikins holding a blue or yellow ball were used as stimuli (**Figure 1A**). The manikins were created using Adobe Illustrator; they were 22 mm in width and 37 mm in height. The critical stimulus feature, i.e., the ball, was 7 mm in diameter. The number of manikins shown simultaneously on the screen was manipulated (see **Figure 1A**, right part), resulting in the one-element and nine-element conditions, which manipulated the non-spatial perceptual reference frame and were implemented in separate blocks. In both conditions, stimuli occurred randomly at any of 16 stimulus positions on the left or right side of the midline (in the nine-element condition, two additional midline positions were used, resulting in 18 possible positions, see **Figure 1A** right part). The stimulus positions were chosen along four imagined rows on the screen [row 1: four positions; row 2: five positions (including one midline position that was only used in the nine-element condition); row 3: five positions (including one midline position that was only used in the nine-element condition); row 4: four positions]. In the nine-element condition, nine stimulus positions were filled simultaneously on a given trial in such a way that the majority of all presented stimuli were on either the egocentric left or right side of the screen. In both conditions, the exact stimulus positions used varied on a trial-by-trial basis. Both the one-element and the nine-element condition were performed under two different Go/NoGo task settings. In the joint task setup, a pair of participants performed the Go/NoGo task setting together in such a way that each participant was assigned to one particular stimulus color (i.e., blue or yellow) throughout the whole experiment (see **Figure 1A**, left part). 
In contrast, during the individual task setup, the participant performed exactly the same Go/NoGo task setting (i.e., same relevant stimulus color and response button) alone, without the involvement of a co-agent. The non-spatial perceptual reference frame, as indexed by the one-element or nine-element condition, was implemented under both task setups with counterbalanced order across the subjects (i.e., half of the subjects started with the one-element condition, the other half with the nine-element condition). The stimuli and experimental program were identical to those used for the two-choice task setting (Baess and Bermeitinger, unpublished).
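The selection of positions in the nine-element condition can be illustrated with a small sketch. The text does not describe how the positions were actually drawn, so the rejection-sampling procedure, the position labels, the function name, and the `max_on_side` default (a reading of the figure caption, which mentions between 4 and 7 stimuli on one side) are all assumptions made for illustration only.

```python
import random

# 16 lateral positions (eight per side) plus 2 midline positions, as described in the text.
# The index is only a placeholder; the real screen coordinates are not given.
LEFT = [("left", i) for i in range(8)]
RIGHT = [("right", i) for i in range(8)]
MIDLINE = [("midline", i) for i in range(2)]
ALL_POSITIONS = LEFT + RIGHT + MIDLINE


def sample_nine_positions(majority_side, rng, max_on_side=7):
    """Draw 9 of the 18 positions so that most stimuli fall on `majority_side`.

    Assumption: simple rejection sampling; `max_on_side` caps the count on the
    majority side and is a hypothetical parameter.
    """
    minority_side = "right" if majority_side == "left" else "left"
    while True:
        chosen = rng.sample(ALL_POSITIONS, 9)
        sides = [side for side, _ in chosen]
        n_major = sides.count(majority_side)
        n_minor = sides.count(minority_side)
        if n_minor < n_major <= max_on_side:
            return chosen


positions = sample_nine_positions("left", random.Random(1))
```

With nine stimuli and at most two midline positions, a strict majority on one side already implies at least four stimuli on that side, so only the upper bound needs an explicit check.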

#### Procedure

After arriving in the laboratory, each participant was paired with another participant. The participants were asked to take a seat on one of the two chairs in a custom-made sound-attenuated chamber. Spatial labels regarding the assignment of the chairs (e.g., left vs. right chair) were avoided during the whole experiment. Instead, both participants were referred to as either Participant A or Participant B, and their corresponding chairs were labeled accordingly. The label of "Participant A" or "Participant B" was randomly assigned between both participants, but remained the same during the whole experiment. The chair, and thus the spatial seating position regarding the screen, remained the same for each participant during all parts of the experiment. The order of the non-spatial perceptual reference frame (i.e., one-element vs. nine-element condition) was counterbalanced across all participants and remained the same for each part of the Go/NoGo task setting.

The participants were instructed to respond as quickly as possible to their relevant stimulus color (i.e., either blue or yellow) by pressing one of two custom-made response devices with their dominant hand (see also **Figure 1A**, left part for the setup). The custom-made response buttons did not produce any perceivable sound when pressed. The response devices, and therefore the responses of each participant, were covered by a paper box in the joint Go/NoGo task setting.

The experiment was run with the Presentation software (Neurobehavioral Systems, Version 18) on a 16″ color CRT screen (116 cm distance to the participants). For each condition (i.e., one-element or nine-element condition), 192 trials were recorded, split into three separate blocks of 64 trials each. The stimuli were shown against a white background for a maximum of 2500 ms or until a response was executed. One trial lasted for a maximum of 4500 ms (500 ms centrally presented fixation cross, max. 2500 ms stimulus duration, 1500 ms inter-trial interval). In each block, the (majority of) stimuli were presented on the egocentrically left side of the screen in half of the trials and on the egocentrically right side in the other half. Orthogonally to this, the ball was on the left side of the manikin for half of the trials and on the right side of the manikin for the other half. This ensured that each combination of manikin's screen position and manikin's ball position was presented equally often. As shown in **Figure 2**, four different cases can be differentiated as a function of manikin's screen position (egocentric reference frame) and manikin's ball position (allocentric reference frame): (1) Screen Position-compatible – Ball Position-compatible trials, (2) Screen Position-incompatible – Ball Position-compatible trials, (3) Screen Position-compatible – Ball Position-incompatible trials, and (4) Screen Position-incompatible – Ball Position-incompatible trials.

button is required for a blue stimulus; however, the color-response button associations were alternated across all participants. Transparent stimuli were not visible on the screen and are only displayed for illustrative purposes. In the one-element condition (A), the given stimulus in a trial occurred at any of the 16 lateral positions on the screen (eight left positions, eight right positions). In the nine-element condition (B), 9 of the 18 possible stimulus positions (16 lateral stimulus positions and 2 midline stimulus positions) were filled with the actual stimulus. The majority of the stimuli were on either the left or right side of the screen, marking either Stimulus Screen Position compatible or incompatible trials. As shown, the number of stimuli on the left or right side varied (between 4 and 7, as shown in the examples of the upper and lower panel).

For half the trials in each manikin position × ball position condition, the ball in the manikin's hand was blue whereas for the other half, the ball was yellow. In total, eight trials were presented per block for each combination of (egocentric) manikin position, (allocentric) ball position, and ball color.
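The factorial trial structure described above can be sketched in a few lines of Python. This is only an illustration of the counts reported in the text (2 screen sides × 2 ball sides × 2 ball colors × 8 repetitions = 64 trials per block, three blocks = 192 trials per condition). The names `build_block` and `build_condition`, the dictionary keys, and the random shuffling within a block are assumptions made for this sketch; the text does not state how trial order was determined.

```python
import itertools
import random

SCREEN_SIDES = ("left", "right")   # egocentric manikin position on the screen
BALL_SIDES = ("left", "right")     # allocentric ball position relative to the manikin
BALL_COLORS = ("blue", "yellow")
REPEATS_PER_CELL = 8               # eight trials per combination and block (from the text)
N_BLOCKS = 3                       # three blocks per condition (from the text)

# Maximum trial duration: fixation + maximum stimulus duration + inter-trial interval.
FIXATION_MS, MAX_STIMULUS_MS, ITI_MS = 500, 2500, 1500
MAX_TRIAL_MS = FIXATION_MS + MAX_STIMULUS_MS + ITI_MS


def build_block(rng):
    """Return one block with every design cell repeated equally often."""
    cells = itertools.product(SCREEN_SIDES, BALL_SIDES, BALL_COLORS)
    trials = [
        {"screen_side": screen, "ball_side": ball, "ball_color": color}
        for (screen, ball, color) in cells
        for _ in range(REPEATS_PER_CELL)
    ]
    rng.shuffle(trials)  # assumption: randomized order within a block
    return trials


def build_condition(seed=0):
    """Return the three blocks of one condition (one-element or nine-element)."""
    rng = random.Random(seed)
    return [build_block(rng) for _ in range(N_BLOCKS)]


blocks = build_condition()
print(len(blocks[0]), sum(len(block) for block in blocks), MAX_TRIAL_MS)
# prints: 64 192 4500
```

Because each participant responds to only one ball color, half of the 64 trials in a block are Go trials for a given participant.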

The response of each participant was required in half of the trials of one block (i.e., either for the yellow or the blue balls). For half of the participants, Participant A responded to blue stimuli whereas Participant B reacted to yellow stimuli.

The experiment started with a short training (20 trials in total) together with a partner. At the end of the training block (but not in the other parts of the experiment), the participants received visual feedback regarding the accuracy of the button presses.

After the training, Participant B left the chamber, filled out the IOS scale and handedness questionnaire, and performed another task completely unrelated to the present experiment (in the same room, but outside of the chamber). The other participant (Participant A) executed both versions of the non-spatial perceptual reference frame (i.e., one-element and nine-element condition, order counterbalanced across participants) under the individual Go/NoGo task setting. After completion, both participants performed both variants of the non-spatial perceptual reference frame (i.e., one-element and nine-element condition, in the same order as in the individual Go/NoGo task setting) under the joint Go/NoGo task setting. Finally, Participant B executed both versions of the non-spatial perceptual reference frame (i.e., one-element and nine-element condition) whereas Participant A filled out the IOS scale and handedness questionnaire and performed another unrelated experiment outside of the chamber. Participant A always sat on the left chair and Participant B on the right chair (distance between both participants: 60 cm); however, the relevant stimulus color was varied between both participants.

#### Data Analysis

Only correct trials were analyzed further (1.07% of all trials were erroneous). Outlying reaction times were identified as those more than 1.5 interquartile ranges above the third quartile of the individual's response times (Tukey, 1977) or below 100 ms. In total, 10.3% of trials were discarded as outliers.
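The outlier rule can be illustrated with a short Python sketch. It assumes that the quartiles are computed per participant over that participant's correct responses and uses the "inclusive" quartile method of the standard library; the text does not specify the quartile algorithm, so that choice, the function names, and the example values are hypothetical.

```python
import statistics


def rt_bounds(rts, lower_ms=100.0, k=1.5):
    """Return (lower, upper) cutoffs: a fixed floor and Q3 + k * IQR."""
    q1, _median, q3 = statistics.quantiles(rts, n=4, method="inclusive")
    upper = q3 + k * (q3 - q1)
    return lower_ms, upper


def trim_rts(rts, lower_ms=100.0, k=1.5):
    """Keep reaction times that are neither below the floor nor above the upper fence."""
    lower, upper = rt_bounds(rts, lower_ms, k)
    return [rt for rt in rts if lower <= rt <= upper]


# Hypothetical reaction times (ms) of one participant, not data from the study.
example = [90, 350, 360, 370, 380, 390, 400, 410, 900]
print(trim_rts(example))
# prints: [350, 360, 370, 380, 390, 400, 410]
```

In this example the quartiles are 360 ms and 400 ms, so the upper fence is 460 ms; the 900 ms response exceeds the fence and the 90 ms response falls below the 100 ms floor.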

Data were analyzed with a repeated measures analysis of variance (ANOVA) with the within-subject factors Task Setup (individual Go/NoGo, joint Go/NoGo), Number of Elements (one-element condition, nine-element condition), egocentric Stimulus Screen Position (compatible, incompatible to participant's side), and allocentric Stimulus Ball Position (compatible, incompatible to participant's side). A further analysis included Task Order (single Go/NoGo task first, joint Go/NoGo task first) as a between-subjects variable in the outlined repeated measures ANOVA. Mean values are given along with standard errors of the mean (SEM).

# Results

The overall ANOVA yielded a significant effect of Number of Elements, F(1,38) = 23.07, MSE = 31,912.27, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.378, indicating faster responses for the nine-element condition (366.63 ms ± 8.37) compared to the one-element condition (380.93 ms ± 8.20). The main effect of Stimulus Screen Position was almost significant, F(1,38) = 3.78, MSE = 1352.01, p = 0.059, η<sub>p</sub><sup>2</sup> = 0.090, illustrating generally faster responses for egocentrically compatible SR mappings (372.31 ms ± SEM) compared to egocentrically incompatible SR mappings (375.25 ms ± SEM). The interaction between Number of Elements and Stimulus Screen Position approached significance, F(1,38) = 3.99, MSE = 1110.28, p = 0.053, η<sub>p</sub><sup>2</sup> = 0.095. We further obtained an interaction between Number of Elements, Task Setup, and allocentric Stimulus Ball Position, F(1,38) = 4.45, MSE = 1913.02, p = 0.041, η<sub>p</sub><sup>2</sup> = 0.105. The results are displayed in **Figure 3**. Based on these interactions and following our research interest, we disentangled the interactions by conducting separate analyses for the one-element condition and the nine-element condition. In addition, the analysis including the potential effect of Task Order (single Go/NoGo first, joint Go/NoGo first) showed no main effect of Task Order and, importantly, the five-way interaction Task Order × Number of Elements × Task Setup × Stimulus Ball Position × Stimulus Screen Position was clearly not significant (see **Appendix Table A1** for the complete summary of the ANOVA).
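The reported effect sizes can be cross-checked against the F values, because partial eta squared follows directly from F and its degrees of freedom: η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies this standard identity to the four omnibus effects listed above; the function name is chosen for this illustration only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported for the omnibus ANOVA of Experiment 1.
reported = [
    ("Number of Elements", 23.07, 1, 38),
    ("Stimulus Screen Position", 3.78, 1, 38),
    ("Number of Elements x Screen Position", 3.99, 1, 38),
    ("Number of Elements x Task Setup x Ball Position", 4.45, 1, 38),
]

for label, f_value, df1, df2 in reported:
    print(f"{label}: {partial_eta_squared(f_value, df1, df2):.3f}")
# prints 0.378, 0.090, 0.095 and 0.105, matching the values reported in the text
```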


Allocentric SE (i.e., referring to Ball Position) 2.88 (±2.23) −2.05 (±3.11)

TABLE 1 | Mean reaction times (in ms) and standard error of the mean (SEM) for compatible and incompatible trials as a function of the mapping between Stimulus Ball Position and Stimulus Screen Position in the joint Go/NoGo Task setup and the individual Go/NoGo Task setup, respectively, as well as the egocentric and allocentric Simon Effects (SE, in ms, SEM in parenthesis), separately for the one-element and the nine-element condition from Experiment 1.

Egocentric Stimulus Screen Position and allocentric Stimulus Ball SEs (i.e., the difference between SR incompatible mappings and SR compatible mappings) are presented separately for the one-element and nine-element condition and the joint and individual Go/NoGo Task setting in Experiment 1. Asterisks refer to significant SEs (p < 0.05).
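As defined in the table note, a Simon effect is the reaction time difference between SR incompatible and SR compatible mappings. A minimal sketch of how such an effect and its SEM could be computed from per-participant condition means is given below; the function names and the four example participants are hypothetical and are not data from the study.

```python
import math
import statistics


def simon_effect(rt_compatible, rt_incompatible):
    """Simon effect in ms: incompatible minus compatible mean reaction time."""
    return rt_incompatible - rt_compatible


def mean_and_sem(values):
    """Mean and standard error of the mean across participants."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical per-participant means (compatible, incompatible) in ms.
participants = [(370.0, 380.0), (365.0, 372.0), (390.0, 395.0), (400.0, 398.0)]
effects = [simon_effect(comp, incomp) for comp, incomp in participants]
mean_se, sem_se = mean_and_sem(effects)
print(f"SE = {mean_se:.2f} ms (SEM {sem_se:.2f})")
# prints: SE = 5.00 ms (SEM 2.55)
```

A positive value indicates faster responses on compatible trials, and a negative value indicates a reversed effect.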

#### One-Element Condition

The overall ANOVA with the factors Task Setup, egocentric Stimulus Screen Position, and allocentric Stimulus Ball Position revealed a significant main effect of Stimulus Screen Position, F(1,38) = 5.91, MSE = 2456.34, p = 0.020, η<sub>p</sub><sup>2</sup> = 0.135, pointing to faster responses for SR compatible trials (378.12 ms ± 8.30) compared to incompatible ones (383.74 ms ± 8.26). More interestingly, we found two interactions involving the factor Task Setup, i.e., an interaction Task Setup × egocentric Stimulus Screen Position, F(1,38) = 5.79, MSE = 1147.18, p = 0.021, η<sub>p</sub><sup>2</sup> = 0.132, and an interaction Task Setup × allocentric Stimulus Ball Position, F(1,38) = 4.36, MSE = 1607.37, p = 0.044, η<sub>p</sub><sup>2</sup> = 0.103. The three-way interaction Task Setup × Stimulus Screen Position × Stimulus Ball Position approached significance, F(1,38) = 3.37, MSE = 1007.38, p = 0.074, η<sub>p</sub><sup>2</sup> = 0.081. In addition, Task Order was included as a between-subjects factor in the ANOVA mentioned above. This analysis showed that the factor Stimulus Ball Position significantly interacted with Task Order, F(1,37) = 7.70, MSE = 744.98, p = 0.009, η<sub>p</sub><sup>2</sup> = 0.172; however, the main effect of Stimulus Ball Position remained nonsignificant. Stimulus Screen Position was not influenced by Task Order, as indicated by the significant main effect of Stimulus Screen Position, F(1,37) = 5.82, MSE = 2475.78, p = 0.021, η<sub>p</sub><sup>2</sup> = 0.136, and the absence of an interaction with Task Order. Both interactions with Task Setup observed in the omnibus ANOVA remained significant when controlling for the influence of Task Order.

Further analysis was continued with separate ANOVAs for each level of the factor Task Setup.

#### **Joint Go/NoGo task setting**

In an ANOVA with the factors Stimulus Screen Position and Stimulus Ball Position, only the main effect of Stimulus Screen Position reached significance, F(1,38) = 8.22, MSE = 3480.41, p = 0.007, η<sub>p</sub><sup>2</sup> = 0.178: faster responses were observed for egocentric SR compatible trials (375.28 ms ± 8.28) compared to SR incompatible trials (384.73 ms ± 7.83). The corresponding SEs, i.e., the reaction time difference between SR incompatible trials and SR compatible trials, are given in **Table 1**. The additional analysis of the influence of Task Order confirmed this pattern of results: only the main effect of Stimulus Screen Position was significant, F(1,38) = 8.00, MSE = 3435.91, p = 0.008, η<sub>p</sub><sup>2</sup> = 0.178, whereas the interaction with Task Order was nonsignificant. No other main or interaction effect was observed. The corresponding mean values are listed in **Appendix Table A2**.

#### **Individual Go/NoGo task setting**

The ANOVA with the factors Stimulus Screen Position and Stimulus Ball Position yielded a significant main effect of Stimulus Ball Position, F(1,38) = 6.22, MSE = 2199.90, p = 0.017, η<sub>p</sub><sup>2</sup> = 0.141, illustrating faster responses for allocentric SR compatible trials (378.10 ms ± 8.79) compared to incompatible ones (385.61 ms ± 9.38). The additional ANOVA with Task Order yielded a main effect of Stimulus Ball Position, F(1,37) = 7.99, MSE = 2324.88, p = 0.008, η<sub>p</sub><sup>2</sup> = 0.178, and a significant interaction of Stimulus Ball Position and Task Order, F(1,37) = 9.24, MSE = 2687.89, p = 0.004, η<sub>p</sub><sup>2</sup> = 0.200 (see also **Appendix Table A2**). The interaction between Task Order and Stimulus Screen Position was close to significance, F(1,37) = 3.72, MSE = 659.91, p = 0.062, η<sub>p</sub><sup>2</sup> = 0.091, but there was no main effect of Stimulus Screen Position.

#### Nine-Element Condition

The overall ANOVA with the factors Task Setup, Stimulus Screen Position, and Stimulus Ball Position revealed no influence of any main factor on the reaction times. There was only a tendency for an interaction between Stimulus Screen Position and Stimulus Ball Position, F(1,38) = 3.16, MSE = 505.58, p = 0.083, η<sub>p</sub><sup>2</sup> = 0.083, but given its tentative nature, it was not analyzed further. Nevertheless, the corresponding SEs are displayed in **Table 1**, despite failing to reach the significance level. The additional ANOVA with Task Order did not yield any significant main effect or interaction.

# Discussion

In line with the two-choice task setting with the same stimulus setup, i.e., stick-figure manikins (Baess and Bermeitinger, unpublished), a set of nine manikins was processed faster than a single manikin. This provides evidence for a non-spatial perceptual reference frame based on the simultaneous presentation of nine manikins arranged around the screen's center, yet still with a spatial alignment that allows a spatial left or right coding in a given trial. Further studies are needed to address whether the non-spatial reference frame with the nine manikins could also be observed if the nine manikins were exclusively assigned to one side of the screen's center. Yet this is beyond the main scope of the current paper.

In contrast to the two-choice task setting, reliable egocentric or allocentric SEs were only observed in the one-element condition. Moreover, the results showed that both the egocentric and the allocentric SE depended on the task setup. An egocentric SE was found in the joint Go/NoGo task setting when the task was performed alongside a partner. This finding is well in line with the existing literature on the JSE with the standard stimulus setup (for review, Dolk et al., 2014) as well as the study by Ciardo et al. (2016): only with a co-actor as part of the task setup, offering some sort of reference frame, can an egocentric SE based on the stimulus' position on the screen be observed. Likewise, Ciardo et al. (2016) showed that only an egocentric SE, but no allocentric SE, occurred when an enriched stimulus setup was utilized that allowed the formation of different reference frames.

Surprisingly, we also obtained an allocentric SE in the individual Go/NoGo task setting. This effect is compelling because, as outlined before, the individual Go/NoGo task setting was carried out without the partner's involvement and thus without a salient reference point to elicit an SE. Thus, the source of the SE can only be found in the stimulus setup itself, as the task setup per se did not provide any sufficiently salient reference points. At first glance, this finding is apparently at odds with the study by Ciardo et al. (2016). Yet, although both studies used an enriched stimulus setup with different spatial reference frames, they crucially differed in how this was implemented ("external-object" vs. "same-object" approach). These differences could explain why the stimulus setup in our study may have been salient enough to promote an allocentric reference frame based on the manikin's ball position, whereas this might not have been the case for the vertical lines used in the other study (Ciardo et al., 2016).

Therefore, Experiment 1 showed how different spatial reference frames were shaped by the task setup. As one kind of SE was observed in each of the joint and individual Go/NoGo task settings (albeit differing in regard to the responsible reference frame), a salience shift between the reference frames occurred. When a partner was involved in the task, as in the joint Go/NoGo task setting, the egocentric reference frame (i.e., left vs. right of the screen's center) received more weight, resulting in an egocentric SE for the manikin's screen position. In contrast, without a partner in the individual Go/NoGo task setting, the allocentric reference frame became more salient, capturing fine-grained features of the manikins such as the side on which the manikin was holding the ball. In other words, the task setup determined whether more global features (such as the spatial side of the manikin's position, egocentric SE) or more local features (such as the side of the ball, allocentric SE) of the stimulus setup were processed further in Go/NoGo task settings, resulting in the formation of the corresponding reference frames. As the stimulus setup was identical for both variants of the task setup, the presence of the co-actor seemingly modulated the formation of egocentric and allocentric spatial reference frames differently. The reliance of the SE on the physical distance between the co-actors in the joint Go/NoGo task setting (Guagnano et al., 2010) could be used as another argument: despite an identical stimulus setup, a JSE was only observed for participants within each other's peripersonal space. Applied to the current study, it is possible that the task setup as a whole (with or without a partner) produced a salience shift between the different spatial reference frames, as other details of the stimulus setup became salient depending on the individual or joint task setup. When performing the task in the individual Go/NoGo task setting, the participants focused more on the manikins themselves. In contrast, when another person is seated next to the participant, the more global left/right differentiation in this scenario might be fostered, resulting in the salience of the egocentric reference frame. Accordingly, the shift in salience between the different spatial reference frames could be the underlying principle explaining the two different SEs.

However, given the nature of this task setup (i.e., involving two participants, shared instructions for both participants, and so on), it is possible that carry-over effects as a function of task setup occur, depending crucially on the order in which the joint and individual Go/NoGo task settings were carried out. It has been shown that carry-over effects occur between related tasks such as a spatial compatibility task (spatial location is task-relevant) and a Simon task (spatial location is task-irrelevant) (Lugli et al., 2013), even with joint and individual Go/NoGo task settings (Milanese et al., 2010). Our additional analyses with task order as a between-subjects factor partially support this idea. The additional analysis in the **Appendix Tables**, displaying the allocentric SE as a function of task order, is in line with this idea, showing that the allocentric SE was only present when the joint Go/NoGo task setting was carried out first. Yet, these values have to be interpreted with caution as they only consider half of the sample. Moreover, the possible influence of task order, depending on whether the joint or individual Go/NoGo task setting was carried out first, shows exactly how different task setups can potentially influence the formation of spatial reference frames. However, the potential influences of task order in our study were still clearly different from those in studies observing a learning transfer between two different kinds of spatial compatibility tasks, i.e., spatial compatibility task vs. Simon task (Milanese et al., 2010; Lugli et al., 2013). Further studies are needed to explicitly investigate potential task order effects between different variants of Go/NoGo Simon task settings.

To sum up, Experiment 1 showed that the joint or individual task setup prompts the saliency of different spatial reference frames when those are directly embedded in the stimulus setup. Depending on the presence of a co-actor, either the egocentric or allocentric reference frame received more weight introducing different forms of response conflict as the source of the observed egocentric or allocentric SE. Experiment 2 tested this assumption further with different stimulus material, but otherwise identical stimulus setup and task setup.

# EXPERIMENT 2

As a salience shift between spatial reference frames triggered by the joint or individual task setup occurred in Experiment 1, this finding was explored further in Experiment 2. To this end, new stimulus material was used, but all other features of the stimulus setup and task setup remained the same as in Experiment 1. The stick-figure manikins used in Experiment 1 were quite human-like, albeit inanimate, and were easily semantically connoted as such. Consequently, it is possible that the salience shift between the different spatial reference frames was facilitated (if not enabled) by the human-like features of the manikin (e.g., body midline, two arms, two legs, head). In order to explicitly address this possibility, new abstract stimulus material was created by rearranging the parts of the stick-figure manikins in an abstract way (**Figure 1B**). Importantly, the abstract patterns (Experiment 2) and the manikins (Experiment 1) consisted of physically identical elements; the only difference was that the abstract patterns did not represent any semantically meaningful content.

## Materials and Methods

#### Subjects

Forty-four new participants were recruited as in Experiment 1 at the University of Hildesheim (mean age: 20.89 years, 18–28 years; six male). They received partial course credit for participation. The participants' mean score on the IOS scale, ranging from 1 to 7 (Aron et al., 1992), was 2.59 (SD = 1.48). Three participants (mean laterality quotient = −55.00, SD = 42.72) were left-handed (Oldfield, 1971). All participants gave written informed consent and were treated in accordance with the Declaration of Helsinki.

#### Stimuli and Apparatus

Abstract geometrical patterns were used here. They were made out of the single elements of the stick-figure manikin, but newly arranged in such a way that they did not form any meaningful object (**Figure 1B**). They were 26 mm in width and 35 mm in height on the screen. All other experimental details were exactly as described in Experiment 1.

#### Procedure

The procedure was identical to the one laid out in Experiment 1 except for the following details. Stimuli were presented on a 17″ CRT screen. The distance of the participants to the screen was 60 cm and the distance between both participants was 75 cm. The participants responded with their dominant hand on different custom-made response buttons (without any perceivable sound associated with a button press). In contrast to Experiment 1, the responses of both participants were not covered. The experiment was carried out in a different room (without separate experimental chambers). While one participant was executing the individual Go/NoGo task setting, the other participant performed another study unrelated to this experiment in the same room (yet still out of sight, separated by a black curtain).

#### Data Analysis

Errors (1.4%) and reaction time outliers (6.5%) were removed in the same manner as in Experiment 1. The omnibus ANOVA was calculated with the within-subject factors Number of Elements (one-element condition, nine-element condition), Task Setup (joint Go/NoGo, individual Go/NoGo), egocentric Stimulus Screen Position (compatible, incompatible to participant's side), and allocentric Stimulus Ball Position (compatible, incompatible to participant's side). An additional analysis was carried out including Task Order (single Go/NoGo first, joint Go/NoGo first) as a between-subjects factor in the ANOVA.

# Results

In the omnibus ANOVA, the main effects of Number of Elements, F(1,43) = 19.00, MSE = 12,445.47, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.306 (faster responses for the nine-element condition: 318.28 ms ± 6.40, compared to the one-element condition: 326.79 ms ± 5.91) and Task Setup, F(1,43) = 23.15, MSE = 35,622.05, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.350 (joint Go/NoGo task: 315.47 ms ± 6.06, single Go/NoGo task: 329.69 ms ± 6.46), were significant. Moreover, there was a main effect of egocentric Stimulus Screen Position, F(1,43) = 18.14, MSE = 1891.58, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.297, pointing to faster responses for SR compatible trials (320.94 ms ± 6.09) compared to SR incompatible trials (324.22 ms ± 6.10). There were several two-way interactions, i.e., an interaction of Number of Elements and Stimulus Screen Position, F(1,43) = 15.90, MSE = 2012.20, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.270, an interaction of Task Setup and Stimulus Ball Position, F(1,43) = 5.01, MSE = 380.97, p = 0.030, η<sub>p</sub><sup>2</sup> = 0.104, as well as an interaction of Task Setup and Stimulus Screen Position, F(1,43) = 8.27, MSE = 829.80, p = 0.006, η<sub>p</sub><sup>2</sup> = 0.161. Further, the three-way interaction between Number of Elements, Task Setup, and Stimulus Screen Position, F(1,43) = 11.65, MSE = 1070.53, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.213, was significant (see also **Figure 4**). An additional ANOVA included the factor Task Order in the omnibus ANOVA (see **Appendix Table A3** for all values). The interaction Task Order × Number of Elements × Task Setup × Stimulus Ball Position was significant, F(1,42) = 8.43, MSE = 798.20, p = 0.006, η<sub>p</sub><sup>2</sup> = 0.167. However, the five-way interaction was not significant.

#### One-Element Condition

In the ANOVA with the factors Task Setup, Stimulus Screen Position, and Stimulus Ball Position, the main effects of Task Setup, F(1,43) = 17.66, MSE = 14,813.94, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.291 (faster responses under the joint Go/NoGo Task Setup: 320.30 ms ± 5.87 vs. the individual Go/NoGo Task Setup: 333.27 ms ± 6.34) and Stimulus Screen Position, F(1,43) = 28.03, MSE = 3902.85, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.395 (compatible SR mapping: 323.46 ms ± 5.95 vs. incompatible SR mapping: 330.12 ms ± 5.94) reached significance. Further, there was a significant interaction between Task Setup and Stimulus Screen Position, F(1,43) = 20.60, MSE = 1892.67, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.324. Additional analyses were conducted including Task Order as a between-subjects factor in the ANOVA. No main effect or interaction with Task Order was found.

are compatible (SR compatible), dashed bars display the conditions, in which the abstract pattern's position and the participant's seating position are incompatible (SR incompatible). Gray bars show conditions, in which the ball's position and the participant's seating position are compatible, green bars illustrate conditions in which the ball's position and the participant's seating position are incompatible. Bars are given separately for the individual and joint Go/NoGo task setting.

#### **Joint Go/NoGo task setting**

In the ANOVA with the factors Stimulus Screen Position and Stimulus Ball Position, there was only a main effect of Stimulus Screen Position, F(1,43) = 36.25, MSE = 5615.63, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.457. Responses were faster on SR compatible trials (314.65 ms ± 5.83) than on SR incompatible trials (325.95 ms ± 6.07). The corresponding SEs are presented in **Table 2**. The additional analysis with the between-subjects factor Task Order did not yield any interaction or main effect.

#### **Single Go/NoGo task setting**

In the corresponding ANOVA, neither a main effect nor an interaction was observed. The non-significant SEs are nonetheless given in **Table 2**. The additional analysis with Task Order showed an interaction between Stimulus Screen Position and Task Order, F(1,42) = 4.28, MSE = 303.14, p = 0.045, η<sub>p</sub><sup>2</sup> = 0.093 (see **Appendix Table A4** for the SEs as a function of Task Order).

#### Nine-Element Condition

The ANOVA with the factors Task Setup, Stimulus Screen Position, and Stimulus Ball Position yielded a main effect of Task Setup, F(1,43) = 20.35, MSE = 21,084.01, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.321, indicating faster responses under the joint Go/NoGo Task Setup (310.64 ms ± 6.41 vs. 326.12 ms ± 6.84). Further, there was a two-way interaction between Task Setup and Stimulus Ball Position, F(1,43) = 4.88, MSE = 537.10, p = 0.033, η<sub>p</sub><sup>2</sup> = 0.102. The additional analysis including the factor Task Order yielded a significant two-way interaction between Stimulus Ball Position and Task Order, F(1,42) = 13.28, MSE = 1697.98, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.240, as well as a three-way interaction between Task Setup, Stimulus Ball Position, and Task Order, F(1,42) = 8.94, MSE = 831.06, p = 0.005, η<sub>p</sub><sup>2</sup> = 0.175.

#### **Joint Go/NoGo task setting**

No main effect or interaction was obtained in an ANOVA with Stimulus Screen Position and Stimulus Ball Position. Nonsignificant SEs are listed in **Table 2**. The additional ANOVA including Task Order as a between-subjects factor showed, however, that the factor Stimulus Ball Position was modulated by Task Order, F(1,42) = 27.44, MSE = 2452.43, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.395 (see also **Appendix Table A4**). The allocentric SEs based on Stimulus Ball Position were significant under both Task Orders [single Go/NoGo first: t(21) = 2.72, p = 0.013 vs. joint Go/NoGo first: t(21) = 4.89, p < 0.001], but differed in direction, resulting in an overall non-significant SE for Stimulus Ball Position.
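The cancellation described above, with reliable effects of opposite sign in two subgroups yielding no effect in the pooled sample, can be illustrated with a small sketch. All values are hypothetical, the subgroups are much smaller than in the study, and which task order carries the positive sign is arbitrary here; the function name is likewise an assumption of this illustration.

```python
import math
import statistics


def one_sample_t(values):
    """t statistic for testing whether the mean of the values differs from zero."""
    n = len(values)
    return statistics.fmean(values) / (statistics.stdev(values) / math.sqrt(n))


# Hypothetical allocentric Simon effects (ms) per participant, split by task order.
group_a = [6.0, 4.0, 8.0, 5.0, 7.0]        # one task order: positive effects
group_b = [-6.0, -4.0, -8.0, -5.0, -7.0]   # other task order: reversed effects
pooled = group_a + group_b

print(f"group A: t = {one_sample_t(group_a):.2f}")
print(f"group B: t = {one_sample_t(group_b):.2f}")
print(f"pooled mean = {statistics.fmean(pooled):.2f} ms")
```

Each subgroup shows a clear effect on its own, whereas the pooled mean is zero, which is why the overall main effect can be absent while the interaction with Task Order is significant.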

#### **Single Go/NoGo task setting**

The ANOVA yielded a marginally significant main effect of Stimulus Ball Position, F(1,43) = 3.86, MSE = 503.02, p = 0.056, η<sub>p</sub><sup>2</sup> = 0.082, indicating faster responses for SR incompatible trials compared to SR compatible trials (see also **Table 2**). The additional analysis with Task Order as a between-subjects factor did not reveal any significant interactions involving Task Order or a main effect of Task Order.

## Discussion

Consistent with Experiment 1 and the results from the two-choice Simon task setting (Baess and Bermeitinger, unpublished), faster responses were obtained in the nine-element condition, pointing to the formation of a non-spatial perceptual reference frame. Regarding our research scope, a result pattern similar to that of Experiment 1 was observed: an egocentric SE in the joint Go/NoGo task setting of the one-element condition and an allocentric SE in the individual Go/NoGo task setting of the nine-element condition. Again, evidence was obtained for a saliency shift between different spatial reference frames (as in Experiment 1) and the non-spatial perceptual reference frame depending on the task setup. The egocentric SEs obtained in Experiments 1 and 2 were comparable in size. As the stimulus' screen position is a rather global feature of the stimulus setup, this consistency was expected. Yet, the occurrence of the allocentric SE differed between Experiment 1 (allocentric SE in the one-element condition) and the present one (allocentric SE in the nine-element condition). Thus, what drives the distinction between the stick-figure manikins and the abstract patterns? A stick-figure manikin represents a meaningful semantic category (with well-established spatial labels, like left/right arm and so on) compared to an abstract pattern of circles and lines (without any pre-established spatial labels). Moreover, the manikins naturally introduced a differentiation between the left and right ball position. Technically, this differentiation was also present in the abstract geometrical patterns but, as intended, as part of a non-meaningful object. This might explain why the occurrence of the allocentric SEs was determined by the non-spatial perceptual reference frame. When one abstract pattern was presented, it might have been more difficult to form spatial codes based on the allocentric reference frame. In contrast, when a set of stimuli was presented simultaneously, it might have been easier to spot the fine-grained spatial differences required for the formation of an allocentric reference frame. Interestingly, and consistent with Experiment 1, the allocentric SE was only obtained in the individual Go/NoGo task setting, that is, when no partner was involved in one's own task. As the additional analysis showed, the allocentric SE in the individual Go/NoGo task setting of the nine-element condition was not influenced by task order. Therefore, carry-over effects from one task setup to the other were less likely to be the cause of the observed salience shift between the different spatial reference frames.

TABLE 2 | Mean reaction times (in ms) and standard error of the mean (SEM) for compatible and incompatible trials as a function of the mapping between Stimulus Ball Position and Stimulus Screen Position in the joint Go/NoGo Task setup and the individual Go/NoGo Task setup, respectively, as well as the egocentric and allocentric Simon Effects (SE, in ms, SEM in parenthesis), separately for the one-element and the nine-element condition, from Experiment 2.

Asterisks refer to significant SEs (p ≤ 0.05).

To conclude, Experiment 2 again showed a salience shift between different spatial reference frames, which was modulated by the task setup and the non-spatial perceptual reference frame. The presence of a co-actor promoted the formation of an egocentric SE with regard to the abstract pattern's screen position (and thus an egocentric reference frame). In contrast, when no co-actor was involved, details of the stimulus setup were processed in much greater detail, as indicated by the allocentric SE related to the ball's position in the abstract pattern (and thus the allocentric reference frame).

# EXPERIMENT 3

Both previous experiments provided evidence that different spatial reference frames enabled through an enriched stimulus setup are modulated by the task setup, that is, by whether a Go/NoGo task setting was performed alone or together with a co-actor. The co-actor promoted the formation of spatial codes based on the egocentric reference frame in both experiments so far. In addition, without a co-actor, that is, in the individual task setup, local details of the stimuli received a greater amount of processing, as indicated by the formation of spatial codes based on the allocentric reference frame. Recent studies on the JSE with the standard stimulus setup have shown that a human co-actor is not required in order to evoke an SE (for overview, Dolk et al., 2014). As reported, an external, attention-grabbing object such as a golden Japanese waving cat can also serve as a reference point crucial for the appearance of an SE (Dolk et al., 2013). Newer studies have emphasized that the stimulus modality (auditory vs. visual Simon Go/NoGo task setting) is an important factor for the efficacy of the Japanese waving cat as an attention-grabbing object (Lien et al., 2016; Puffe et al., 2017). Whereas the Japanese waving cat could successfully be used as an external salient reference point in the auditory Go/NoGo Simon task setting, it failed to do so in the visual Go/NoGo task setting. This was interpreted as evidence that the waving cat was not salient enough to induce SEs for visual stimuli. It was further assumed that the visual stimuli bound attention more strongly to the screen in the visual Go/NoGo task setting, whereas auditory stimuli broadened the attentional focus due to the setup with loudspeakers on each side of the screen (Puffe et al., 2017). However, in these studies, visual stimuli were either presented centrally, superimposed on a task-irrelevant directional photo of a hand (Lien et al., 2016), or spatially aligned left or right of the midline of the screen (Puffe et al., 2017). In both studies, no SE could be obtained for visual Go/NoGo task settings, neither in the condition with the Japanese waving cat nor without. Yet, Stenzel and Liepelt (2015) provided evidence for SEs in an individual visual Go/NoGo task setting when a photo of a Japanese waving cat was displayed in one corner of the screen, as part of the task setup<sup>1</sup>, but clearly outside of the critical stimulus. Under this task setup, reliable SEs were obtained both for a photo of a human hand and for a photo of the Japanese waving cat. Interestingly, the size of the SE did not vary between a photo of the Japanese waving cat and a photo of a human hand. This illustrates that spatial codes can also be induced under a visual Go/NoGo task setting if the task setup is salient enough to include spatial reference points.

As attention-grabbing objects such as the Japanese waving cat are in principle salient enough to support the formation of spatial codes as the source of the SE (cf. Dolk et al., 2013), the present study aimed at replicating Experiment 1 while replacing the co-actor with a Japanese waving cat. This manipulation allowed us to investigate the formation of spatial reference frames within the enriched stimulus setup as used in Experiments 1 and 2 in a Go/NoGo task setting without any co-actor. Because the task setup never included a co-actor, Experiment 3 provides a baseline for how different spatial reference frames could be formed in individual Go/NoGo task settings without the influence of a co-actor, but with or without the potential impact of an external, attention-grabbing object.

# Materials and Methods

#### Subjects

Forty-one new participants were recruited for this study. One participant was excluded due to lack of compliance (two-choice responses in the individual Go/NoGo task setting). One further participant was excluded because attendance of a course taught by one of the authors meant that the participant was not naïve to the purpose of the study. Thus, the final sample consisted of 39 participants (mean age: 22.0 years, range: 18–35 years; five male). Three participants (laterality quotient: −55.00, SD = 44.44) were left-handed according to a handedness questionnaire (Oldfield, 1971). One participant did not have a preferred hand. One participant did not fill out the handedness questionnaire or the questions regarding age and gender. All participants gave written informed consent and were treated in accordance with the Declaration of Helsinki.

#### Stimuli and Apparatus

The same stick-figure manikins as in Experiment 1 were presented on a 16-inch CRT monitor. In contrast to the previous experiments, this experiment was carried out alone, without a co-actor's involvement. For the sake of consistency, two chairs were placed in front of the monitor, although the second chair was never used (distance between the chairs: 5 cm). Further, only the one-element condition was run, under two variations of the task setup. In the cat-present task setup, a golden Japanese waving cat (height: 17 cm; width: 10.5 cm; depth: 7 cm) was placed to the left of the monitor and the participant (**Figure 1C**). The automatic battery-driven movement of its left arm produced a barely noticeable, unsystematic sound as part of the waving movement. The participants were clearly able to see the cat in their peripheral visual field. In the cat-absent task setup, the whole arrangement remained the same except that the cat was no longer visible (it was hidden inside a paper cylinder) and was switched off. The participants performed both task setups as an individual Go/NoGo task setting.

#### Procedure

The order of the cat-present and cat-absent task setups was counterbalanced across participants. To allow the necessary changes to be made, the participants were briefly asked to leave the testing chamber with the explanation that the experimenter had to start the new condition. The experimenters then changed the task setup in the test chamber according to the counterbalanced order. As indicated by **Figure 1C**, the two test chambers differed from those used in the previous experiments. Importantly, the cat itself never left the test chamber but was hidden inside a paper cylinder (not visible to the participants) in the cat-absent task setup. In the cat-present task setup, the cat was placed in front of the paper cylinder. To maintain symmetry, a lamp was positioned on the right side of the monitor and remained switched on during the whole experiment. All participants were seated on the right chair throughout the experiment and used the custom-made response button of Experiment 2 to respond with their right index finger (distance between monitor and participant: 52 or 55 cm, depending on the test chamber). Only one response button was placed on the table. The cat (if present) was always on the left side of the screen. Half of the participants responded to blue stimuli and the other half to yellow stimuli.

### Data Analysis

Errors (0.42%) and reaction time outliers (2.87%) were identified as in previous experiments. The omnibus ANOVA was calculated with the within-subject factors Task Setup (cat-present, cat-absent), egocentric Stimulus Screen Position (compatible, incompatible), and allocentric Stimulus Ball Position (compatible, incompatible).
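As a rough illustration of the 2 × 2 × 2 within-subject design described above, the following sketch aggregates trial-level reaction times into the eight design cells and derives the egocentric and allocentric SE as the difference between incompatible and compatible trials. The record layout, the function names, and all reaction times are hypothetical and chosen only for illustration; they are not data or code from the study, and the error and outlier screening is not reproduced here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trial records: (task_setup, screen_compat, ball_compat, rt_ms).
# All values are invented for illustration; they are NOT data from the study.
trials = [
    ("cat-present", "compatible", "compatible", 310.0),
    ("cat-present", "compatible", "incompatible", 314.0),
    ("cat-present", "incompatible", "compatible", 318.0),
    ("cat-present", "incompatible", "incompatible", 322.0),
    ("cat-absent", "compatible", "compatible", 312.0),
    ("cat-absent", "compatible", "incompatible", 316.0),
    ("cat-absent", "incompatible", "compatible", 320.0),
    ("cat-absent", "incompatible", "incompatible", 324.0),
]


def cell_means(records):
    """Average RT for each cell of the 2 x 2 x 2 within-subject design."""
    cells = defaultdict(list)
    for setup, screen, ball, rt in records:
        cells[(setup, screen, ball)].append(rt)
    return {cell: mean(rts) for cell, rts in cells.items()}


def simon_effect(records, factor_index):
    """Mean RT on incompatible trials minus mean RT on compatible trials.

    factor_index 1 selects the egocentric factor (screen position),
    factor_index 2 selects the allocentric factor (ball position).
    """
    compatible = [r[3] for r in records if r[factor_index] == "compatible"]
    incompatible = [r[3] for r in records if r[factor_index] == "incompatible"]
    return mean(incompatible) - mean(compatible)


means = cell_means(trials)
egocentric_se = simon_effect(trials, 1)
allocentric_se = simon_effect(trials, 2)
```

In an actual analysis, the cell means would be computed per participant and then submitted to the repeated-measures ANOVA; the sketch only shows how the three factors map onto the data.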

### Results

In the omnibus ANOVA, a main effect of egocentric Stimulus Screen Position was obtained, F(1,38) = 6.28, MSE = 469.94, p = 0.017, η<sub>p</sub><sup>2</sup> = 0.142, 90% CI of the effect size [0.01; 0.31], indicating faster responses for SR compatible trials (317.48 ms ± 6.21) compared to SR incompatible ones (319.93 ms ± 6.08), irrespective of the task setup. The corresponding egocentric SE was 2.45 ms (±0.98). None of the other main effects or interactions were significant (all ps > 0.3); see also **Table 3** and **Figure 5**.
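The reported effect size and SE can be cross-checked from the other reported values. The sketch below assumes the standard identity for partial eta squared, F · df<sub>effect</sub> / (F · df<sub>effect</sub> + df<sub>error</sub>), and takes the SE as the difference between the two reported condition means; the function name is ours and purely illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the reported main effect of egocentric Stimulus Screen Position.
eta_p2 = partial_eta_squared(6.28, 1, 38)

# Egocentric SE: mean RT on SR incompatible trials minus SR compatible trials.
simon_effect_ms = 319.93 - 317.48

print(round(eta_p2, 3))
print(round(simon_effect_ms, 2))
```

Both rounded values agree with the figures given in the text (0.142 and 2.45 ms).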

### Discussion

Experiment 3 provided evidence for an egocentric SE in an individual Go/NoGo task setting when using a stimulus setup that offers different possibilities to form spatial codes. The egocentric SE was completely unaffected by the presence of a Japanese waving cat. At first glance, this result is surprising, as previous research failed to find reliable SEs in a visual Go/NoGo task setting (e.g., Hommel, 1996). Yet, the emergence of an egocentric SE in our study might as well illustrate that spatial reference frames can also be utilized in Go/NoGo Simon task settings, provided that sufficiently salient reference points are embedded in the stimulus setup. This might explain why we found a small but reliable egocentric SE when others failed to do so. This notion is further supported by the fact that even changing the task setup by including the Japanese waving cat did not modulate the SEs. Therefore, one might even claim that the Japanese waving cat in our task setup did not serve as a spatial reference point as in other studies (Dolk et al., 2013; Lien et al., 2016; Puffe et al., 2017). The stimulus setup used in our experiments, with the possibility to form different spatial reference frames, was already salient enough to promote the formation of the egocentric reference frame. In line with this statement, the Japanese waving cat did not add "new" reference points to the task setup on top of the ones already inherent in the stimulus setup of our study. Thus, our overall egocentric SE in both individual Go/NoGo task settings might illustrate that our stimulus setup is per se salient enough to boost the formation of spatial reference frames. Finally, one might wonder why no allocentric SE occurred at all in Experiment 3. In our previous experiments, an allocentric SE consistently occurred in the individual Go/NoGo task setting, but not in the joint Go/NoGo task setting. Hence, the allocentric SE in our previous experiments was demonstrated when the whole task setup involved a human co-actor. Only under this condition did we observe a salience shift between more global features of the stimuli (as indicated by the egocentric SE) and more local features of the stimuli (as evident from the allocentric SE).

<sup>1</sup>As the photo of the cat was not part of the relevant stimulus setup, we consider it a modulation of the task setup, in the sense that the features of an enriched task setup were projected onto the screen itself instead of being spatially arranged in the scenario.

#### TABLE 3 | Egocentric and allocentric SE from Experiment 3.

FIGURE 5 | […] are incompatible (SR incompatible). Gray bars show conditions in which the ball's position and the participant's seating position are compatible; green bars show conditions in which they are incompatible. Bars are given separately for the individual Go/NoGo task setting with and without the Japanese waving cat.

To conclude, this study showed that the stimulus setup itself could promote the formation of spatial reference frames. Other, external attention-grabbing objects did not modulate the spatial reference frames further.

# GENERAL DISCUSSION

The present study examined how the formation of egocentric and allocentric reference frames was modulated by the task setup, that is, by performing a visual Go/NoGo Simon task alone or together with a co-actor. Central to our studies was the use of an enriched stimulus setup ("same-object-approach") allowing the simultaneous formation of egocentric, allocentric, and even non-spatial perceptual reference frames. The possibility that the stimulus setup itself might include enough salient reference points to establish cognitive conflict as the source of the SE has not yet received much attention in Go/NoGo Simon task settings. Experiment 1 gave evidence for an egocentric SE under the joint Go/NoGo task setting and an allocentric SE under the individual Go/NoGo task setting. Both SEs were obtained when one critical stimulus was presented on the screen (one-element condition). Experiment 2 confirmed, in principle, the previous results using abstract stimulus material. Here, an egocentric SE was obtained in the joint Go/NoGo task setting when one stimulus was shown on the screen, and an allocentric SE was found in the individual Go/NoGo task setting when a set of nine identical stimuli was shown on the screen, allowing the formation of a non-spatial perceptual reference frame by applying the Gestalt principle of grouping. Lastly, Experiment 3 investigated whether an external, attention-grabbing object such as the Japanese waving cat would offer additional reference points in the task setup (besides the ones already inherent in our stimulus setup). The finding of an overall egocentric SE that was entirely independent of the Japanese waving cat showed that our enriched stimulus setup is already salient enough to provide reference points as a core of spatial conflict. The reference points offered by the Japanese waving cat did not add anything further to the scenario.
In the following, we will discuss our results along these two main lines, i.e., (i) the salience shift between egocentric and allocentric reference frames and (ii) the influence of stimulus setup and task setup on the formation of spatial reference frames.

# Salience Shift Between Egocentric and Allocentric Reference Frames: The Influence of Task Setup

When the participants worked together with a human co-actor at any point during the experimental session (as in Experiments 1 and 2), we observed an egocentric SE in the joint Go/NoGo task setting and even an allocentric SE in the individual Go/NoGo task setting, although no co-actor or other attention-grabbing object was involved in the latter task setup. This salience shift between spatial reference frames (egocentric vs. allocentric) as a function of individual or joint task setup is compelling. Previous work showed that a JSE emerged in a Go/NoGo task setting when a human co-actor or an external, attention-grabbing object was present to provide the spatial reference crucial for the appearance of a SE (for an overview, see Dolk et al., 2014). Both the human co-actor and the external, attention-grabbing object enriched the joint task setup as part of the representation of the whole task. In these studies, the crucial comparison between joint and individual task setups illustrated how a co-actor or an external object could be used as salient reference frames. These studies utilized a stimulus setup that allowed only one possible spatial reference frame, namely the egocentric reference frame based on the stimulus' screen position. Yet, the possibility that the stimulus setup itself could foster the formation of spatial codes, provided it is sufficiently salient, has not yet been systematically considered. Only the study by Ciardo et al. (2016) used some sort of enriched stimulus setup ("external-object-approach") while varying the task setup. Here, an egocentric SE (also labeled SE for hemispace) was reported in the joint Go/NoGo task setting, but, in contrast to our variation of the enriched stimulus setup ("same-object-approach"), no allocentric SE (also labeled as relative position within each hemispace).
This discrepancy regarding the allocentric reference frame might best be explained by recalling the differences between the enriched stimulus setups used in their study and ours: while in our study the reference points for the allocentric reference frame were in close proximity to (or even part of more global features of) the critical stimulus, the reference points for the allocentric reference frame in the other study were clearly separated from the critical stimulus. It has been shown elsewhere that the enriched stimulus setup used in our study is per se salient enough to simultaneously provoke different spatial and even non-spatial perceptual reference frames (Baess and Bermeitinger, unpublished). This was not observed in the study by Ciardo et al. (2016), which used a different approach to enrich the stimulus setup (see also Roswarski and Proctor, 1996, for the two-choice task setting). We might therefore conclude that a salience shift between different spatial reference frames occurred in our studies depending on the task setup: When the Go/NoGo task setting involved a co-actor, the global features of the stimulus setup (i.e., the spatial location of the stimulus with regard to its position on the screen) received detailed processing, resulting in an egocentric SE. In contrast, when no co-actor was part of the task setup, as in the individual Go/NoGo task setting, more local features of the stimulus setup were elaborated, leading to the emergence of an allocentric SE. Such a salience shift between different spatial reference frames in a Go/NoGo Simon task setting has not been shown before. Most likely, the previously used (enriched or standard) stimulus setups were not salient enough to foster cognitive conflict based on different spatial reference systems. This salience shift between egocentric and allocentric reference frames could follow the idea of an intentional weighting mechanism suggested as a central principle underlying human cognitive control (cf.
Memelink and Hommel, 2013). In a nutshell, some features (e.g., the left/right labels with regard to the screen's center) are weighted more strongly during the joint task setup because a co-actor is next to the corresponding agent, so that the left/right features representing an egocentric reference frame receive a stronger emphasis, resulting in an egocentric SE for the joint Go/NoGo task setting. In contrast, when the very same task is executed alone, those features promoting an egocentric reference frame might be less salient and receive less weight. Instead, the fine-grained local features of the manikin itself, i.e., the side of the "hand" holding the ball, might now receive more weight, leading to the dominance of an allocentric reference frame. In sum, the observed salience shifts between different spatial reference frames show how a co-actor's presence can change the relevance of reference frames within the same enriched stimulus setup.

The human flexibility to switch between different reference frames and even perspectives has been shown in other paradigms as well. Samson et al. (2010) showed that the perspective of a human avatar influenced one's own perspective ("altercentric intrusions") in visual perspective taking experiments, although the participants were explicitly instructed not to do so. Here, the perspective of the avatar could not easily be ignored. In this study, participants had to mentally rotate themselves into the avatar's position in order to take over the perspective of the avatar. In contrast, the participants in our study were neither instructed to explicitly take over a certain perspective nor did the stick-figure manikin's frontal view promote the idea of mentally rotating oneself into the manikin's perspective. Nevertheless, the automatic human ability to mentally take over others' perspectives might explain the differences in the allocentric reference frame between the stick-figure manikins and the abstract geometrical patterns.

A study by Freundlieb et al. (2017) illustrated that spatial compatibility effects as a marker of visuospatial perspective taking occurred only when the co-actor had visual access to the stimulus setup, even when the co-actor performed a different task. As the co-actors in our joint Go/NoGo task setting performed the same Simon task with mutual visual access, this might further illustrate why the egocentric reference frame might be the dominant one. It has also been shown in a Navon task that reaction times slowed down when different features of the same stimulus (global vs. local) had to be considered within a pair of co-actors (Bockler et al., 2012). When the co-actor's focus of attention (e.g., global features) differed from one's own focus of attention (e.g., local features), this led to a conflict in selecting the appropriate response, as evident from a slowdown of response times. In our study, the switch between global and local features of the stimulus setup took place uninstructed and automatically when the task setup changed between individual and joint Go/NoGo task settings.

# The Influence of Stimulus Setup and Task Setup on the Formation of Spatial Reference Frames

Our Experiment 3 illustrated the influence of the overall task setup as the practical abstraction level of "task shaping" (Prinz, 2015; Dolk and Prinz, 2016). When no "task shaping" could take place in any form in the task setup, i.e., no co-actor was involved in the task setup at any point, the enriched stimulus setup of our study evoked the formation of spatial reference frames differently. Here, only an overall egocentric SE was obtained, unrelated to other attention-grabbing objects as part of the task setup. As the stimulus setup utilized in Experiments 1 and 3 was identical and the presence of a co-actor was the only difference, this might illustrate the "core" impact of a task setup involving a co-actor. In other words, when the overall task setup (i.e., the experiment in general) involved a co-actor, even independent of the current task setup (i.e., the joint or individual task setup), this might be salient enough to represent, to some extent, the co-actor as part of the overall task setup. Yet, the level of the co-actor's representation as part of the overall task setup might be a rather general one; for example, it could be restricted to acknowledging that the overall task setup involved, at some point, a human co-actor. Hence, this level of "joint encounter" as part of the overall task setup, inseparably inherent within this line of research, could possibly boost the mechanisms assigning different weights to different spatial reference frames during the processing of the identical stimulus setup under different Go/NoGo task setups. Consequently, the effects were salience shifts between egocentric and allocentric reference points as observed between joint and individual Go/NoGo task settings. These salience shifts did not occur when no co-actor, but an external attention-grabbing object, was part of the task setup.
Thus, the effects of the human co-actor were seemingly twofold: (i) the co-actor's presence shapes the overall task setup and (ii) the co-actor's presence reinforces the egocentric reference frame in the joint task setup. Importantly, as Experiment 3 showed, the co-actor was not per se required for the formation of spatial codes within the egocentric reference frame, but served as a trigger (i.e., weight) that fostered the switch between different spatial reference frames across different task setups.

# CONCLUSION

Our series of experiments provides evidence that an enriched stimulus setup influences the formation of spatial (i.e., egocentric and allocentric) reference frames differently for the joint and the individual task setup. SEs were obtained in Go/NoGo task settings when using a stimulus setup that provided sufficient reference points for the formation of spatial reference frames. If the overall task setup involved a co-actor at some point, a salience shift between spatial reference frames, and thus between global and local details of the stimulus setup as the source of the underlying cognitive conflict, was observed. Further studies are required to scrutinize the interplay between stimulus setup and task setup in social and non-social contexts more thoroughly.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the German Society of Psychology's research standards. The protocol was approved by the local ethics committee ("Fachbereich 1") of the University of Hildesheim. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

# AUTHOR CONTRIBUTIONS

PB designed, programmed, and analyzed all experiments and wrote the manuscript. TW collected the data for Experiment 1 as part of his Bachelor's degree. CB developed the initial version of the paradigm and commented on parts of the manuscript.

# FUNDING

We acknowledge financial support by the Stiftung Universität Hildesheim for the open access fee.

# ACKNOWLEDGMENTS

The authors are grateful to their student helpers Marcel Dietrich, Martin Honscha, Jonas Jänig, Johanna Murr, Sarah Klages, and Milena Szabo for their help with the data collection.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.02063/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Baess, Weber and Bermeitinger. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Is Your Color My Color? Dividing the Labor of the Stroop Task Between Co-actors

Motonori Yamaguchi\*, Emma L. Clarke and Danny L. Egan

Department of Psychology, Edge Hill University, Ormskirk, United Kingdom

Performing a task with other actors involves two opposing forces, division of labor between co-acting individuals and integration of divided parts of the task into a shared mental representation (co-representation). Previous studies have focused primarily on the integration of task representations, and limited attention has been paid to the division of labor. The present study devised a test of the integration and the division in a joint task setting. A joint version of the Stroop task was developed, in which pairs of actors were assigned different sets of target colors. If the actors integrate their co-actor's task, the colors assigned to their co-actor should be represented as if they were the actor's own target colors; the Stroop effect should be as large when distractor color words denote their co-actor's target colors as when these words denote the actor's own target colors. If the actors divide the labor of the Stroop task, the colors assigned to their partner should be represented as non-target colors; the Stroop effect should be smaller when the distractor color words denote the co-actor's target colors than when these words denote the actor's own target colors. The response time results did not provide clear support for either position, while the response accuracy results supported the division of labor. Possible cognitive mechanisms that support the division of labor and the integration of task representation are discussed.

#### Edited by:

Anna M. Borghi, Sapienza Università di Roma, Italy

#### Reviewed by:

Filomena Anelli, IRCCS Istituti Clinici Scientifici Maugeri (ICS Maugeri), Italy Pietro Spataro, Sapienza Università di Roma, Italy

#### \*Correspondence:

Motonori Yamaguchi yamagucm@edgehill.ac.uk; cog.yamaguchi@gmail.com

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 17 November 2017 Accepted: 19 July 2018 Published: 06 August 2018

#### Citation:

Yamaguchi M, Clarke EL and Egan DL (2018) Is Your Color My Color? Dividing the Labor of the Stroop Task Between Co-actors. Front. Psychol. 9:1407. doi: 10.3389/fpsyg.2018.01407

Keywords: joint performance, Stroop interference, semantic gradient, division of labor, co-representation

# INTRODUCTION

Performing a single task jointly with other actors provides an opportunity to divide the labor of the task between co-acting individuals. The division of labor reduces the workloads of the individuals and allows them to focus more effort on the part of the task to which they are assigned. This gives rise to an advantage of group performance (Wegner, 1986). A joint task may also require coordinating actions of the co-acting individuals, which enables collective efforts to accomplish work that is greater than what might be achieved by each of the individuals alone (e.g., moving heavy furniture). Coordinating actions with co-actors requires monitoring actions of the others, and this is made possible by integrating others' task contexts into the actor's own task representation (Knoblich and Sebanz, 2006). Nevertheless, coordination requires each actor to monitor their co-actors at a given moment, which would impose an additional workload and make the actors' actions interdependent, imposing additional constraints on their own actions. Therefore, the division of labor and the coordination of actions are two opposing forces that need to be balanced for a successful completion of a joint task (see Moreland, 1999, for a similar idea in organizational contexts).

Studies of joint action have focused on the integration of task representations, or task co-representation (Sebanz et al., 2003; Knoblich et al., 2011). The notion of co-representation has been supported by the findings in the joint Simon task (Sebanz et al., 2003). The Simon task is a choice-reaction task in which an individual actor responds to non-spatial attributes of stimuli (e.g., colors) by pressing response keys on the left or right. The actor is asked to ignore the stimulus location, but responses are still faster and more accurate when stimulus and response locations correspond than when they do not, yielding the Simon effect. The actors in the joint Simon task divide the labor of the Simon task in such a way that one actor responds to one stimulus type (e.g., red circles) by pressing one response key (e.g., on the left) and the other actor responds to the other stimulus type (green circles) by pressing the other key (on the right). This is essentially a go/nogo task that only requires each actor to respond to stimuli on some trials and withhold responding on other trials. When the same go/nogo Simon task is performed by a single actor, no Simon effect is obtained (Hommel, 1996), because the spatial attribute is no longer relevant to represent the response, eliminating the spatial correspondence between stimulus and response. With two actors performing together, the joint Simon task still produces the Simon effect, implying that the spatial attribute is used to represent the responses. This finding has led researchers to suggest that co-acting individuals not only represent their own part of the task but also integrate their co-actor's part into their own task representation. Such a joint task representation completes the entire picture of the Simon task. A strong version of this co-representation account suggests that the actors represent their co-actor's actions as if these actions were on the actors' own command (Knoblich and Sebanz, 2006).

Task co-representation would enable a better coordination of actions between co-acting individuals. However, by considering their co-actor's actions, task co-representation can also cause additional cognitive conflicts between the actor's own action program and the action program representing their co-actor's response that does not need to be executed (Sebanz et al., 2006). Nevertheless, there has also been evidence suggesting that co-acting individuals may not represent their co-actor's part of the task or actions; instead, they represent their own actions with reference to their co-actor's actions (Dolk et al., 2014). If so, the co-acting individuals do not necessarily monitor what their co-actor does or how their co-actor performs their part of the task, but they may simply be aware of the fact that they have divided the labor of a joint task with their co-actor. Previous findings support this position, showing that the actors in the joint Simon task monitor the proportion of compatible trials for their own part but not for their co-actor's part (Yamaguchi et al., 2018) and that the actors in a joint task-switching setting do not monitor the task that their co-actor has performed on a preceding trial (Wenke et al., 2011; Yamaguchi et al., 2017b). Such task monitoring appears to occur under specific conditions (Dudarev and Hassin, 2016; Liefooghe, 2016; Yamaguchi et al., 2017a). Therefore, the actors may only represent limited aspects of the co-actor's part of the task, and they divide the labor of the joint task, eliminating the additional burden of monitoring their co-actor's part of the task. The purpose of the present study was to devise another test of the division of labor and the integration of task representation in a joint task setting.

# Joint Stroop Task

Previous studies suggest that actors do represent stimuli that occurred on the co-actor's trials (Dolk et al., 2013; Eskenazi et al., 2013) and the action that the co-actor has made on a preceding trial (Welsh et al., 2005). The present study assessed whether stimuli (and, to some extent, responses) assigned to their co-actor are represented as part of the actor's own task. To this end, a joint version of the Stroop task (Stroop, 1935) was utilized. The Stroop effect is one of the most robust interference phenomena and can easily be reproduced even in an uncontrolled environment, such as a college seminar room. It occurs when people try to name the colors of color words whose meanings are incongruent with the colors that they are meant to name (e.g., the word "BLUE" printed in red). The Stroop effect is often thought to involve a quintessential form of automaticity (e.g., LaBerge and Samuels, 1974; Posner and Snyder, 1975; MacLeod, 1992), but it has also been shown that the effect depends on a number of factors (e.g., Kahneman and Treisman, 1984; Moors and De Houwer, 2006). Importantly for the present study, the Stroop effect has been shown to depend on how the irrelevant word names are related to the target colors to which participants respond (Klein, 1964; Levin and Tzelgov, 2016). In particular, Stroop interference is largest when the task-irrelevant word names come from the set of the target colors, but it decreases when the task-irrelevant word names are from outside the set of the target colors. For instance, if the target colors were 'red' and 'green,' then the word 'YELLOW' would produce less interference than the word 'RED.' This finding is known as the semantic gradient. The present study used this finding to address the issue of what aspects of the co-actor's task the actors represent when performing a task jointly with others.

In the present version of the joint Stroop task, a pair of actors performed the Stroop task with a set of four target colors. Each actor responded to two of the four colors by pressing response keys. As Klein (1964) showed, the size of the Stroop effect should depend on how closely the word meanings are related to the target colors to be named, producing the semantic gradient of Stroop interference. Levin and Tzelgov (2016) recently showed that the semantic gradient occurs when different types of distractor words were presented in separate blocks but not when they were intermixed within a block. Consequently, the present study tested three types of blocks across which different types of irrelevant word names occurred (see **Figure 1** for examples).

In the first block, incompatible word meanings denoted the target color names assigned to the actor him- or herself (own target color block); in the second block, incompatible word meanings denoted the target color names assigned to the co-actor (co-actor's target color block); and in the third block, incompatible word meanings denoted the non-target color names that were not assigned to either actor (non-target color block). All of these blocks also included compatible trials for which the irrelevant word meanings denoted the names of the colors in which the words occurred. Based on Klein's (1964) semantic gradient, it was expected that the Stroop effect would be larger in the own target color block than in the non-target color block. The main question was whether the Stroop effect in the co-actor's target color block was similar to that in the own target color block or that in the non-target color block.

If the actors in the joint Stroop task co-represent their co-actor's part of the task, then the actors should react to the co-actor's target colors as if they were their own target colors. In this case, the Stroop effect in the co-actor's target color block should be similar to the effect in the own target color block and should be larger than the effect in the non-target color block. Such outcomes would imply that actors integrated their co-actor's target colors into their own set of target colors. If the actors do not co-represent their co-actor's part of the task, then the actors should not react to the co-actor's target colors as if they were their own target colors. Thus, the Stroop effect in the co-actor's target color block should be similar to the effect in the non-target color block and should be smaller than the effect in the own target color block. Such outcomes would imply that actors divided the labor of monitoring their own set of target colors from their co-actor's target colors.

It is worth noting two recent studies that also used a joint version of the Stroop task (Demiral et al., 2016; Saunders et al., 2018). In the first study by Demiral et al. (2016), the main purpose was to compare individual and joint task conditions in terms of the ERP signal. An important finding was that the P3b component of ERP, which was thought to reflect a translation from stimulus to response, increased on nogo trials of the joint task (which were performed by the co-actor) as compared to nogo trials of the individual task. This suggested to the authors that actors 'mapped stimuli onto the co-actor's response' even when these actors did not need to perform the task on the co-actor's trials, consistent with the co-representation view. However, on nogo trials of the joint task in their experiment, the actors were required to report whether their co-actor's response was correct by saying 'yes' or 'no' and by pressing a key in case the co-actor made an error. This requirement on nogo trials forced the actors to monitor their co-actor's trials and determine what response the co-actor should be making. Thus, monitoring of the co-actor's responses was built into the task, and it is not surprising that the actors had to perform the co-actor's trials (mentally) in such a situation. The present study used a version of the Stroop task without this additional requirement, so whether the actors co-represented their co-actor's target colors as part of their own task representation was their spontaneous choice. Hence, the present design provided a stronger test of task co-representation in a joint Stroop task than Demiral et al.'s (2016). The second study by Saunders et al. (2018) was similar to the present study, but a main difference was that all types of distractor words were intermixed within a single block in that study. The present study separated the three types of distractor words (own target color, co-actor's target color, and non-target color) into different blocks because Levin and Tzelgov (2016) reported that the semantic gradient was observed only when different distractor words were separated between blocks. As the semantic gradient played a central role in formulating the hypotheses, this is an important methodological feature of the present study. Also, the present study involved two alternative responses per actor, rather than one response per actor as in Saunders et al.'s (2018) study. These differences could determine whether the actors co-represent in a joint task, so it is important to assess whether the results of the present study would deviate from those reported by Saunders et al. (2018).

The present study consisted of two experiments, which differed in two respects. First, the individual task in Experiment 1 consisted only of trials for which the target colors were always from the actor's own target colors, so that the actors responded on all trials (i.e., all trials were go trials). The individual task in Experiment 2 consisted of trials for which the target colors were either from the actor's own target colors or their co-actor's target colors, so that the actors responded on half of the trials (go trials) and withheld responding on the other half (nogo trials). The number of trials was the same for the two experiments, but these procedural differences meant that the number of go trials that each actor performed between the individual and joint tasks was the same in Experiment 1, whereas the number of go and nogo trials that occurred in a block between the individual and joint tasks was the same in Experiment 2. Second, the sample size was nearly doubled in Experiment 2 to examine whether the main results of Experiment 1 were replicated.

# EXPERIMENT 1


# Method

## Participants

Thirty-two participants took part in the present experiment (21 females; mean age = 19.42, SD = 1.50, range = 18–21). Twenty-four participants were originally recruited, and eight participants were added later as per the suggestion from a reviewer to match the sample size in Saunders et al.'s (2018) Experiment 1. All participants were recruited from the Edge Hill University community in pairs. With the current design, a statistical power of at least 0.95 is achieved for a medium effect size if the sample size is 18 or above. Each participant in a pair received £3 for participation. All participants reported having normal color vision and normal or corrected-to-normal visual acuity. They were naïve as to the purpose of the experiment. The present study followed the recommendations of the British Psychological Society Code of Ethics. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Departmental Research Ethics Committee of the Psychology Department at Edge Hill University.

## Apparatus, Stimuli, and Procedure

The apparatus consisted of a 23-in widescreen monitor and a personal computer. Stimuli were six color words (GREEN, RED, BLUE, YELLOW, PINK, and WHITE), presented in one of the same six colors against a gray background. The stimuli were presented in the Arial font at the 60-pt font size. Responses were registered by pressing keys on a QWERTY desktop keyboard.

The experiment was conducted individually for each pair in a cubicle under normal fluorescent lighting. Participants read onscreen instructions, and the participant who sat on the left side (Actor A) placed their left and right index fingers on the 'z' and 'c' keys, respectively, and the participant who sat on the right side (Actor B) placed their left and right index fingers on the '1' and '3' keys on the numerical keypad. For each pair, four colors were chosen and assigned randomly to the four keys (see **Figure 1** for an example). These colors appeared as the target color to which participants responded, and the remaining two colors were used only as irrelevant word meanings.

Each pair performed two phases, the individual task phase and the joint task phase. In the individual task phase, each participant performed trials alone while the co-actor remained inactive. For each participant, there was one block of 12 practice trials and three blocks of 64 test trials each. Each test block consisted of 32 compatible trials for which the task-irrelevant word meaning was the same as the target color, and 32 incompatible trials for which the task-irrelevant word meaning was different from the target color. The target colors were either of the two colors assigned to the actor. There were three types of blocks in which the irrelevant word meanings on incompatible trials were manipulated (**Figure 1**). In the first type of blocks (own target color block), the task-irrelevant word meanings were colors assigned to the actor who was performing the task. In the second type of blocks (co-actor's target color block), the task-irrelevant word meanings were colors assigned to the co-actor who was not performing the task. In the third type of blocks (non-target color block), the task-irrelevant word meanings were colors that were not assigned to either participant in the pair. In each of the test blocks, there were two possible color words on incompatible trials that occurred with equal frequency and in a random order; the two target colors also occurred in an equal number of trials. In the practice block, all types of task-irrelevant word meanings could occur, and trials were randomly chosen without replacement. The order of the test blocks was counterbalanced across pairs, and it was the same for the two actors within a pair.
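The block structure described above can be illustrated with a short sketch. It is not the authors' experiment script: the function names, the seed, and the rule of cycling through the available distractor words are assumptions made for illustration. Only the six colors, the four keys, and the 32/32 split of a 64-trial block come from the text.

```python
import random

COLORS = ["green", "red", "blue", "yellow", "pink", "white"]
KEYS = {"A": ["z", "c"], "B": ["1", "3"]}  # keys named in the procedure


def assign_colors(rng):
    """Pick four target colors at random and map them onto the four keys."""
    chosen = rng.sample(COLORS, 4)
    targets = {"A": chosen[:2], "B": chosen[2:]}
    key_map = {}
    for actor, colors in targets.items():
        key_map.update(dict(zip(colors, KEYS[actor])))
    non_targets = [c for c in COLORS if c not in chosen]
    return targets, key_map, non_targets


def build_block(own_colors, distractor_words, rng, n_trials=64):
    """Return a shuffled block: half compatible, half incompatible trials."""
    per_cell = n_trials // (2 * len(own_colors))  # 16 per color x compatibility
    trials = []
    for color in own_colors:
        # Compatible trials: the word names its own ink color.
        trials += [{"color": color, "word": color.upper(), "compatible": True}
                   for _ in range(per_cell)]
        # Incompatible trials: cycle through distractors that differ from the ink
        # (assumed balancing rule; the text only states equal frequency).
        candidates = [w for w in distractor_words if w != color]
        trials += [{"color": color,
                    "word": candidates[i % len(candidates)].upper(),
                    "compatible": False}
                   for i in range(per_cell)]
    rng.shuffle(trials)
    return trials


rng = random.Random(2018)  # fixed seed only to make the sketch reproducible
targets, key_map, non_targets = assign_colors(rng)
blocks_for_a = {
    "own": build_block(targets["A"], targets["A"], rng),
    "co_actor": build_block(targets["A"], targets["B"], rng),
    "non_target": build_block(targets["A"], non_targets, rng),
}
```

Under these assumptions, each of the two distractor words appears on 16 incompatible trials in every block type, and in the own target color block the distractor is always the actor's other target color.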

In the joint task phase, both participants performed trials, and which actor responded on a given trial depended on the target color. The joint task phase was similar to the individual task phase; it consisted of one block of 12 practice trials and two cycles of three blocks of 64 test trials each (six test blocks in total). Two of the three test blocks in each cycle used the colors assigned to one of the actors, so a block was the own target color block for one actor and it was the co-actor's target color block for the other actor. The remaining block was the non-target color block for both actors. The order of the test blocks was the same in the two cycles.

Each trial started with a fixation cross at the screen center for 500 ms, followed by a 500-ms blank display. A word appeared at the center for 1,200 ms unless a response was made before the deadline. A feedback message was presented for 500 ms. The message was "Correct!" for the correct response, "Error" for an incorrect response, "Not your turn!" for a response by a wrong actor, and "Faster!" when there was no response within the 1,200-ms response window. Another 500-ms blank display was presented before the next trial. In the joint task, half of the trials in each block were assigned to one actor, and the other half to the other actor. In the individual task, all trials were assigned to one actor, and the other actor remained inactive. Response time (RT) was the interval between word onset and a keypress.
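The trial sequence and feedback rule can be summarized in a minimal sketch. The durations and messages are taken from the description above; the function name, its arguments, and the order in which the conditions are checked are hypothetical choices, not details reported by the authors.

```python
# Event durations in ms, in the order described in the text.
TIMELINE = [
    ("fixation", 500),
    ("blank", 500),
    ("word", 1200),      # upper bound; a response ends the display early
    ("feedback", 500),
    ("blank", 500),
]
RESPONSE_WINDOW_MS = 1200


def feedback(assigned_actor, correct_key, responder=None, key=None, rt_ms=None):
    """Return the feedback message for one trial (hypothetical helper)."""
    # No keypress within the response window.
    if responder is None or rt_ms is None or rt_ms > RESPONSE_WINDOW_MS:
        return "Faster!"
    # A keypress by the actor to whom the trial was not assigned.
    if responder != assigned_actor:
        return "Not your turn!"
    return "Correct!" if key == correct_key else "Error"


max_trial_ms = sum(duration for _, duration in TIMELINE)
```

For example, `feedback("A", "z", responder="B", key="1", rt_ms=430)` yields "Not your turn!", and a trial without a response lasts at most `max_trial_ms` milliseconds.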

# Results

Trials were discarded if RT was less than 200 ms, a wrong actor responded, or no response was registered within the 1,200-ms time window (1.23% of all trials). Mean RT and percentage errors (PEs) are summarized in **Table 1**. The Stroop effects are shown in **Figure 2**. RT and PE were submitted to 2 (Task Condition: joint vs. individual) × 3 (Block Type: own target color vs. co-actor's target color vs. non-target color) × 2 (Stimulus Compatibility: compatible vs. incompatible) ANOVAs (see **Table 2**).
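As an illustration of the exclusion rule and of how a Stroop effect is derived from the retained trials, consider the following sketch. The trial records and field names are hypothetical, and computing mean RT from correct trials only is an assumption, since the text does not state how error trials were handled in the RT analysis.

```python
from statistics import mean


def is_valid(trial):
    """Exclusion rule from the text: no response, wrong actor, or RT < 200 ms."""
    if trial["rt"] is None:        # no response within the 1,200-ms window
        return False
    if trial["wrong_actor"]:       # the other actor pressed a key
        return False
    return trial["rt"] >= 200


def stroop_effects(trials):
    """Incompatible minus compatible, for mean RT (ms) and percentage error."""
    valid = [t for t in trials if is_valid(t)]

    def cell(compatible):
        cell_trials = [t for t in valid if t["compatible"] == compatible]
        # Assumption: RT means are based on correct responses only.
        rt = mean(t["rt"] for t in cell_trials if t["correct"])
        pe = 100 * mean(0 if t["correct"] else 1 for t in cell_trials)
        return rt, pe

    rt_c, pe_c = cell(True)
    rt_i, pe_i = cell(False)
    return {"rt": rt_i - rt_c, "pe": pe_i - pe_c,
            "excluded": len(trials) - len(valid)}


# Hypothetical trials, for illustration only (not data from the study).
trials = [
    {"rt": 450, "compatible": True, "correct": True, "wrong_actor": False},
    {"rt": 470, "compatible": True, "correct": True, "wrong_actor": False},
    {"rt": 500, "compatible": False, "correct": True, "wrong_actor": False},
    {"rt": 520, "compatible": False, "correct": False, "wrong_actor": False},
    {"rt": 150, "compatible": True, "correct": True, "wrong_actor": False},
    {"rt": None, "compatible": False, "correct": False, "wrong_actor": False},
    {"rt": 480, "compatible": False, "correct": True, "wrong_actor": True},
]
result = stroop_effects(trials)
```

In the actual analysis these quantities would be computed per participant and per cell of the Task Condition × Block Type design before being submitted to the ANOVAs.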

Response time revealed main effects of Task Condition (Ms = 423 ms for individual task and 490 ms for joint task) and of Stimulus Compatibility (Ms = 444 ms for compatible trials and 469 ms for incompatible trials), and their interaction, which indicated that the Stroop effect was smaller for the individual task (M = 17 ms) than for the joint task (M = 34 ms). In the individual task, the Stroop effect was 21 ms for the own color block, 21 ms for the co-actor's color block, and 8 ms for the non-target color block; in the joint task, the Stroop effect was 40 ms for the own target color block, 28 ms for the co-actor's target block, and 34 ms for the non-target color block.

Percentage error revealed main effects of Block Type (Ms = 5.50% for own target color, 4.15% for co-actor's target color, and 3.99% for non-target colors) and Stimulus Compatibility (Ms = 3.30% for compatible trials and 5.78% for incompatible trials). There was also a significant interaction between Block Type and Stimulus Compatibility. The Stroop effect tended to be larger for the own target color block (M = 4.21%) than for the co-actor's target color block (M = 1.62%) or the non-target color block (M = 1.62%).

## Discussion

The Stroop effect was obtained in the present experiment. In RT, the Stroop effect was only numerically smaller for the non-target colors than the other colors in the individual task, and it was only numerically larger for the own target colors than the co-actor's target or non-target colors in the joint task; however, none of these differences were supported statistically. These results provided little evidence for the semantic gradient that would have been expected if colors were represented differently according to whether they belong to the actor's target set. Consequently, the RT data supported neither the division of labor nor the integration of task representations. In PE, the Stroop effect depended on the block type, yielding a larger Stroop effect for the own target colors than the co-actor's target colors or non-target colors. These results are consistent with the division of labor, that is, with the co-actor's target colors being represented as if they were non-targets.

As noted by Saunders et al. (2018), the discrepancy between RT and PE may reflect the possibility that there are two sources of the Stroop effect, stimulus recognition and response selection, and these measures may be sensitive to different processes. The Stroop effect could occur in stimulus recognition due to stimulus conflict, a conflict between the color name and an incongruent word meaning. The Stroop effect could occur in response selection due to response conflict, a conflict between the response that the color name indicates and the response that the incongruent word meaning indicates. The joint task eliminated response conflict in the co-actor's target color block because the actors were never required to make the responses assigned to the co-actor's target colors. Thus, the Stroop effect for the own target colors could involve stimulus conflict and response conflict, whereas the Stroop effect for the co-actor's target colors could involve stimulus conflict but not response conflict. If this is the case, the RT results would imply that the Stroop effect in RT only reflected stimulus conflict and that all types of colors are represented similarly. The PE results would then imply that the Stroop effect in PE reflected both stimulus conflict and response conflict. Although these findings are consistent mostly with Saunders et al.'s (2018) study, the outcomes may depend on the use of manual responses as in the present study and Saunders et al.'s (2018). It would be interesting to see if the same results are obtained with vocal responses as in Demiral et al.'s (2016) study.

Another interesting outcome of the present experiment was that, in RT, the Stroop effect was larger for the joint task than for the individual task. Apart from the fact that the task was performed by one actor or two actors, a difference between the individual and joint task settings in the present experiment is that the target colors in the individual task were always the actor's own target colors (i.e., go trials), whereas the target colors in the joint task were either the actor's own target colors (go trials) or the co-actor's target colors (nogo trials). The additional requirement of withholding responses on the co-actor's trials might have increased the task demand and reduced attention that the actors could devote to their own trials. In fact, responses were generally slower in the joint task than in the individual task. The actors might have exercised stronger proactive control in the individual task than in the joint task, reducing the Stroop effect in the former case.

In Experiment 2, this difference between the individual and joint tasks was excluded, so that the actors were now presented with their own target colors on half of the trials and the co-actors' target colors on the other half in the individual and joint tasks. The sample size was also nearly doubled to examine whether the lack of differences among the three block types in RT merely reflected low statistical power in Experiment 1.

# EXPERIMENT 2

# Method

## Participants

A new group of 62 participants (49 females; mean age = 21.23, SD = 3.14, range = 18–43) were recruited from the same university community.

## Apparatus, Stimuli, and Procedure

The apparatus and stimuli were the same as those used in Experiment 1. The procedure closely followed that of Experiment 1, with the following changes in the individual task phase. In the individual task, the color words appeared in one of the four colors, two target colors of the actor and two target colors of the co-actor. On half of the trials, the stimuli were in the actor's target color to which the actor responded (go trials); on the other half, the stimuli were in the co-actor's target color, to which the actor withheld responding (nogo trials). Each of these target colors occurred equally frequently, and half of the trials were compatible trials and the other half were incompatible trials. Each test block consisted of 64 trials as in Experiment 1.

# Results

Trials were filtered in the same manner as in Experiment 1 (2.18%). Mean RT and PE are summarized in **Table 3**, and the Stroop effect is shown in **Figure 2**. RT and PE were submitted to 2 (Task Condition: joint vs. individual) × 3 (Block Type: own target color vs. co-actor's target color vs. non-target color) × 2 (Stimulus Compatibility: compatible vs. incompatible) ANOVAs (**Table 4**).

TABLE 2 | The results of ANOVAs on response time and percentage error in Experiment 1.

Effects in bold are significant at α = 0.05.

TABLE 3 | Mean response time (ms) and percentage error (PE; values in the parentheses are standard errors of the mean) in Experiment 2.

Response time showed that there were significant main effects of Task Condition (Ms = 553 ms for the individual task, and 526 ms for the joint task) and Stimulus Compatibility (Ms = 524 ms for compatible trials, and 555 ms for incompatible trials). There was a significant 3-way interaction among all three variables. To follow up this interaction, Bonferroni-corrected multiple comparisons were performed on the Stroop effect in the three blocks separately for the individual and joint tasks. For the individual task, the Stroop effect was 33 ms for the own target color block and 45 ms for the co-actor's target block, which did not differ significantly (p = 0.405). The Stroop effect for the non-target color block (M = 19 ms) was significantly smaller than that for the co-actor's target color block (p = 0.031) but did not differ significantly from that for the own target color block (p = 0.277). Therefore, the interaction was driven by the larger Stroop effect for the co-actor's target block in the individual task phase.
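For readers unfamiliar with the correction, a Bonferroni adjustment for the three pairwise block comparisons within one task can be sketched as follows. The uncorrected p-values below are hypothetical placeholders, not values from the present data, and the function name is an assumption; the text does not report which software or procedure produced the corrected values.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical uncorrected p-values for the three pairwise comparisons
# (own vs. co-actor, co-actor vs. non-target, own vs. non-target).
raw = [0.01, 0.2, 0.5]
adjusted = bonferroni(raw)  # approximately [0.03, 0.6, 1.0]
```

With three comparisons per task, an uncorrected p-value has to fall below 0.05 / 3 ≈ 0.0167 for the corrected value to remain below α = 0.05.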

Percentage error showed that the only significant effect was the main effect of Stimulus Compatibility (Ms = 5.35% for compatible trials, and 7.00% for incompatible trials). The Stroop effects were 1.99%, 1.62%, and 0.39% for the own target color, co-actor's target color, and non-target color blocks, respectively, in the individual task, and were 3.24%, 1.17%, and 1.45% for these three blocks in the joint task.

TABLE 4 | The results of ANOVAs on response time and percentage error in Experiment 2.

Effects in bold are significant at α = 0.05.

## Discussion

The results of the present experiment agreed mostly with Experiment 1, except for two outcomes. First, in RT, there was no overall difference in the Stroop effect between the individual and joint tasks. However, in the individual task, the Stroop effect for the co-actor's target color was elevated as compared to the non-target colors, although the Stroop effect for the own target colors did not differ from the co-actor's target colors or non-target colors. In the joint task, the Stroop effect was similar among the three types of target colors. That the elevated Stroop effect was obtained only for the co-actor's target in the present experiment but not in Experiment 1 suggests that it was likely due to the additional requirement to withhold responding when the target color was that of the co-actor in the individual task. This may be due to binding of response inhibition with the co-actor's target colors (e.g., Yamaguchi et al., 2018), which slowed responding when the word meanings were the co-actor's target colors. Second, in PE, Experiment 1 showed a larger Stroop effect for the own target colors than the co-actor's target colors or non-target colors, but Experiment 2 showed only numerical tendencies (especially in the joint task) and no statistically significant differences. The lack of this tendency in the individual task might reflect an elevated Stroop effect for the co-actor's target color as in RT, but the results are not conclusive in this respect. Overall, there was little evidence of semantic gradient across the different types of color words in RT, suggesting that all color words were represented similarly in the individual and joint tasks. There was some tendency in the joint task that the Stroop effect in the own target color block was larger than those in the other blocks, as in Experiment 1.

# GENERAL DISCUSSION

The present study used the joint version of the Stroop task and examined whether actors in a joint task setting integrate their co-actors' part of the task or divide the labor between them. If co-acting individuals share a mental representation of the joint Stroop task, they represent their co-actor's target colors as if they were their own target colors. Consequently, the co-actor's target colors should be represented in the same way as the actor's own target colors, and the Stroop effect would be as large when the color names are from their co-actor's target colors as when the color names are from their own target colors. If co-acting individuals divide the labor of the Stroop task, the co-actor's target colors should be represented in the same way as the non-target colors, and the Stroop effect would be smaller when the color names are from their co-actor's target colors than when the names are from the actor's own target colors. Both predictions presume the semantic gradient (Klein, 1964); the Stroop effect is smaller for non-target colors than the actor's own target colors.

Nevertheless, the results of Experiment 1 showed little evidence that the semantic gradient occurred in RT. The Stroop effect for the actor's own target colors or for the co-actor's target colors was no different from the Stroop effect for the non-target colors. Given that the semantic gradient has been one of the key findings in Stroop interference, this outcome was unexpected, but the lack of differences in the Stroop effect between the actor's own target colors and the co-actor's target colors is consistent with the finding of Saunders et al. (2018). With a sample size nearly twice as large as in Experiment 1, the results of Experiment 2 also showed no evidence that the semantic gradient occurred in RT. There was little difference in the Stroop effect between the actor's own target colors and the co-actor's target colors. Although the co-actor's target colors did show a Stroop effect that was larger than the effect for the non-target colors in the individual task, there was no such evidence in the joint task.

The PE data did show a larger Stroop effect for the actor's own target colors than for the co-actor's target colors or the non-target colors in Experiment 1, and the joint task of Experiment 2 also showed this pattern. The outcomes are consistent with the division of labor, but the discrepancy with the RT results made it difficult to consider this finding to be conclusive. Saunders et al. (2018) also found a similar discrepancy between RT and PE and suggested the possibility that the Stroop effect arises from two different processes and the Stroop effects in RT and PE depend on different processes. The present results would be expected if RT mostly reflected stimulus recognition and PE reflected both stimulus recognition and response selection. If so, all distractor words are processed in a similar manner at the level of stimulus recognition, but there is a division of labor at the level of response selection. It should be acknowledged, however, that these results may depend on the use of manual responses because the actors could not make their co-actor's responses in this setting. It is still possible to utter the co-actor's target colors if vocal responses are used. Hence, the generalizability of the results to vocal responses should be tested in future investigations.

The present results differed from the previous joint Stroop study by Demiral et al. (2016), which suggested that the actors monitored the co-actor's target colors as reflected in the ERP components during nogo trials that were larger in the joint task than in the individual task. The discrepancy is likely due to the additional task requirement on nogo trials of Demiral et al.'s (2016) study, in which the actors were required to report whether their co-actor made an error. This requirement forced the actors to monitor and determine the co-actor's responses on every trial, so co-representation of the co-actor's trial was built into the task. There was no such requirement in the present study, so the actors were free to choose whether to represent their co-actor's target colors. The present results suggest that the actors did not choose to represent their co-actor's target colors.

This conclusion corroborates the recent findings from other types of joint tasks. For instance, co-acting individuals in the joint Simon task monitored the proportion of compatible trials of their own but did not monitor the proportion of compatible trials of their co-actor's (Yamaguchi et al., 2018). Although the actors appear to monitor certain information about stimuli to which their co-actor responded in the joint Simon task, it is simply because the stimulus information is required to determine whether the actors had to make a response on that trial (e.g., colors) or because encoding of the stimulus information was obligatory (e.g., stimulus location; Treisman and Gelade, 1980; Logan, 1998). Thus, representing certain aspects of the co-actor's stimuli was also built into the task itself. It has also been argued that certain aspects of the co-actor are salient and may be used as a reference point to represent part of the actor's own task (Dolk et al., 2014; Prinz, 2015), which would explain why the Simon effect is obtained in the joint task while the proportions of compatible and incompatible trials for one actor do not affect the Simon effect for the other actor. Similarly, a study using a joint version of task switching showed that a switch cost was obtained only when the preceding trial was performed by the same actor as the current trial, but not when the preceding trial was performed by the co-actor (Yamaguchi et al., 2017b). This finding also suggests that the actors recognize the task on a preceding trial as their own if they actually performed the trial themselves but not if their co-actor performed it, implying the division of labor.

There is a study showing that actors recognized stimuli presented to their co-actor better than stimuli that were new (Eskenazi et al., 2013), suggesting that they do not ignore irrelevant stimuli assigned to their co-actor. The present study does agree that the actors processed the color names from the co-actor's target colors to a degree that the co-actor's colors still produced the Stroop effect, but there were no significant differences between the co-actor's colors and the non-target colors and no evidence that the co-actor's color names were processed so far as to activate the action program representing the co-actor's response (Knoblich and Sebanz, 2006; Sebanz et al., 2006). Therefore, the co-actor's part of the task appears to have no special status in the present task setting.

Although a strong claim about task co-representation receives little support from the present results, it is clear that joint tasks would benefit from both the division of labor and the integration of the co-actor's part of the joint task into one's task representation. The question of which part of the co-actor's task is taken into account in a joint task setting still remains unanswered. There are at least two possibilities by which the division of labor and the integration of task representation coexist in a joint task setting. The first possibility is that they are two different modes of joint task performance from which actors can choose depending on the demands of the given task setting. For instance, there are a number of studies on the joint Simon task that showed that the joint Simon effect depended on various social factors (e.g., Hommel et al., 2009; Ruys and Aarts, 2010; Iani et al., 2011). It has also been shown in joint task switching that switch cost was reinstated after the co-actor's trials when two actors shared the same action effect (Yamaguchi et al., 2017a). These observations seem to be consistent with the two-mode hypothesis of joint task performance.

The second possibility is that the integration of task representation and the division of labor reflect two different levels of cognitive processes that control joint task performance. Cognitive processes are structured hierarchically (e.g., Logan and Crump, 2011), with the higher level process monitoring information relevant to the global task goal and the lower level process monitoring information relevant to the local task goal. In a joint task setting, the higher level process may monitor aspects of the task that are relevant to the global goal such as who performs a given trial, while the lower level process monitors aspects of the task that are relevant to the local goal such as which of the alternative responses should be made. It is possible that the higher level process represents aspects of the co-actor's part of the task as long as it is relevant to the global goal of selecting an actor.

The hierarchical processing hypothesis is consistent with recent findings from a naming task (Philipp and Prinz, 2010), in which actors uttered their own name or their co-actor's name in response to target stimuli (black and white diamonds) that were superimposed on a photographic image of the actor, the co-actor, or an unfamiliar individual. A critical finding was that, in the joint task for which each participant uttered only one of the names in response to one of the targets, responses were faster when the picture was the actor's own face than when it was their co-actor's or that of an unfamiliar individual (the face-actor compatibility effect; also see Baess and Prinz, 2017), but responses did not depend on whether the name was compatible with the picture (the face-name compatibility effect). When a single actor performed the same task alone with the two alternative names, the face-name compatibility effect emerged. The authors suggested that the irrelevant pictures primed who should take a turn on a given trial when two responses are divided between two actors in the joint task, but the same pictures primed what response should be made when the two responses are assigned to a single actor in the individual task. Wenke et al. (2011) also proposed that actors in joint task settings represent when it is their own turn or the co-actor's turn, rather than the actions that co-acting individuals perform. Therefore, joint performance reflects conflict in actor identification, but not conflict in response selection.

There are not enough data to distinguish between these two possible mechanisms of joint performance in which the integration and division may co-exist. Further investigations are necessary to explore what cognitive mechanisms support the division of labor and the integration of task representations.

As a general remark, theories of group cognition and team performance tend to emphasize the similarity of individuals as a hallmark of collective behavior (Wegner, 1986), but an advantage of group performance also comes from the diversity of knowledge and skills (Lewis and Herndon, 2011). Studies of joint performance have focused mainly on what is shared between co-actors, but limited attention has been paid to what is divided between co-actors (Wahn et al., 2017). These two questions address two ends of the spectrum in task sharing. Future studies of joint task performance should shed light on the processes and representations that underlie effective task sharing by assessing how individuals balance the division of labor and the integration of task representations in a joint task setting.

# AUTHOR CONTRIBUTIONS

MY conceived, designed, and prepared the materials. MY supervised data collection of Experiment 1, and EC and DE collected the data of Experiment 2. MY wrote the first draft. All authors contributed to revisions and approved the final version.

# ACKNOWLEDGMENTS

The authors wish to thank Rachel Martin and Deanna Myers for their assistance in data collection of Experiment 1. The data of Experiment 2 were collected as part of the undergraduate dissertation projects of EC and DE, supervised by MY. The experimental data are available for reanalysis from the OSF project page (https://osf.io/p65gm/).



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer PS and handling Editor declared their shared affiliation at the time of the review.

Copyright © 2018 Yamaguchi, Clarke and Egan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Influence of Co-action on a Simple Attention Task: A Shift Back to the Status Quo

#### Jill A. Dosso\*, Kevin H. Roberts, Alessandra DiGiacomo and Alan Kingstone

Department of Psychology, University of British Columbia, Vancouver, BC, Canada

There is a growing consensus among researchers that a complete description of human attention and action should include information about how these processes are informed by social context. When we actively engage in co-action with others, there are characteristic changes in action kinematics, reaction time, search behavior, as well as other processes (see Sebanz et al., 2003; Becchio et al., 2010; Wahn et al., 2017). It is now important to identify precisely what is shared between co-actors in these joint action situations. One group recently found that participants seem to withdraw their attention away from a partner and toward themselves when co-engaged in a line bisection judgment task (Szpak et al., 2016). This effect runs counter to the typical finding that attention is drawn toward social items in the environment (Birmingham et al., 2008, 2009; Foulsham et al., 2011). As such, the result suggests that joint action can uniquely lead to the withdrawal of covert attention in a manner detectable by a line bisection task performed on a computer screen. This task could therefore act as a simple and elegant measure of interpersonal effects on attention within particular pairs of participants. For this reason, the present work attempted to replicate and extend the finding that attention, as measured by a line-bisection task, is withdrawn away from nearby co-actors. Overall our study found no evidence of social modulation of covert attention. This suggests that the line bisection task may not be sensitive enough to reliably measure interpersonal attention effects – at least when one looks at overall group performance. However, our data also hint at the possibility that the effect of nearby others on the distribution of attention may be modulated by individual differences.

Keywords: line bisection, social presence, replication, joint attention, joint action, covert attention

#### Edited by:

Motonori Yamaguchi, Edge Hill University, United Kingdom

#### Reviewed by:

Roland Pfister, Universität Würzburg, Germany; Thomas Gallagher-Mitchell, Liverpool Hope University, United Kingdom

> \*Correspondence: Jill A. Dosso jill.dosso@psych.ubc.ca

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 12 April 2018; Accepted: 15 May 2018; Published: 04 June 2018

#### Citation:

Dosso JA, Roberts KH, DiGiacomo A and Kingstone A (2018) The Influence of Co-action on a Simple Attention Task: A Shift Back to the Status Quo. Front. Psychol. 9:874. doi: 10.3389/fpsyg.2018.00874

# INTRODUCTION

By its very nature, spatial attention involves the selection of some locations or objects rather than others. This is readily seen when the normal operation of attention breaks down, as in the case of patients with unilateral spatial neglect. Such patients experience pathological disruptions to their spatial attention as a function of right parietal lobe damage. This damage results in biased attention to rightward locations and objects at the expense of attention to leftward locations and objects (Corbetta and Shulman, 2011; Karnath, 2015). Even in the typical population, however, there is evidence of asymmetries in spatial attention. Reliably, typically developing individuals allocate slightly more attention to the left side of space. This small bias to overestimate or over-attend the left side of space can be seen in the overestimation of the length of felt and imagined lines (Brooks et al., 2014), in the greater tendency to miss rightward items when left and right locations are stimulated simultaneously (Goodbourn and Holcombe, 2015), in spontaneous looking behavior (Nuthmann and Matthias, 2014), and perhaps most routinely, in the standard visual line bisection task (Jewell and McCourt, 2000).

In the prototypical line bisection task, participants are asked to judge whether a mark ("transector") on a long horizontal line is located to the right or to the left of the horizontal line's true center. Typically, on-screen cues that precede the presentation of the line have been shown to attract attention, inducing a perceived lengthening of the line segment nearest the cue (McCourt et al., 2005; Toba et al., 2011). Importantly, one study found that distractors could influence line bisection performance without being fixated. Covert attention, therefore, is sufficient to produce these effects (Thomas et al., 2015).

Recently, the notion that social stimuli could induce these same types of attention shifts has been investigated. In non-bisection tasks, gazing eyes have been shown to reflexively bias attention in the direction of their gaze (Friesen and Kingstone, 1998; Kuhn et al., 2009), even among patients with left-neglect (Bonato et al., 2008). Moreover, social stimuli including the eyes are preferentially looked at when images are viewed (Birmingham et al., 2008, 2009; Foulsham et al., 2011); and when a visible experimenter was used as a distractor in a line bisection task, a perceptual-attentional bias in line bisection toward the experimenter was documented (Garza et al., 2008). Thus, the consensus across a large body of work is that attention is shifted by and toward social information within a scene (Friesen and Kingstone, 1998; Driver et al., 1999; Vuilleumier, 2000; Kuhn and Land, 2006; Theeuwes and Van der Stigchel, 2006; Birmingham et al., 2009; Laidlaw et al., 2012; Rösler et al., 2017).

It is unclear, however, whether the presence of co-actors will also shift attention in a similar manner (Hayward et al., 2017). Commonly, joint action studies feature two individuals facing and acting together on stimuli presented on a computer screen (e.g., Sebanz et al., 2003; Eskenazi et al., 2013; Brennan and Enns, 2015; Dudarev and Hassin, 2016; Wahn et al., 2017). Employing the line bisection task in this format could therefore provide a simple index of the co-actor's impact on the topography of attention within the screen. Recent papers have found evidence that, in contrast to the large body of evidence touched on above, attention is directed away from live co-actors, inducing a perceived shortening of the line segment nearest the other person (i) in the horizontal plane when pairs of individuals sit facing the same direction, and (ii) in the radial direction when individuals face one another (albeit only among those who show a high level of physiological arousal) (Szpak et al., 2015, 2016). This unique finding that, in some cases, joint action can lead to changes in the static topography of covert on-screen attention is surprising because it suggests that live co-actors impact attention quite differently than one would expect, given the established literature. In addition, this effect seems to extend beyond the physical body of the co-actor, and to include the jointly attended computer monitor. This task could therefore act as a simple and elegant measure of interpersonal effects on attention within particular pairs of participants.

There are, however, two outstanding points regarding this measure. First, the attentional withdrawal effect appears to be quite small. The Social Influence Score (SIS), the index of attentional attraction or withdrawal that was used, was calculated as a value in millimeters across three experiments (Szpak et al., 2016). This value was obtained by comparing the perceived midpoint of horizontal lines when seated beside a co-actor versus when seated alone. A shift in the perceived midpoint toward the co-actor (positive SIS) was taken as evidence of attentional attraction, whereas a shift away from the co-actor (negative SIS) was taken as evidence of attentional withdrawal. SIS had a negative value in all three experiments, consistent with attentional withdrawal away from the co-actor, but this value was significantly different from zero (i.e., no change in attention) in only two of the three experiments. Moreover, the three SIS values were not different from one another across the three experiments, rendering any conclusions equivocal. Thus, it seemed valuable to replicate the effect in a different laboratory to assess its reliability. Second, though Szpak and colleagues report the attentional withdrawal effect at a group level, the significance of the effect across individuals, and its relationship to individual and pair factors, is not yet known.

Given the potential value of the paradigm regarding social attention, the present work sought to replicate the reported bias in horizontal line bisection away from nearby others (Szpak et al., 2016), and to form an exploratory profile of individual differences in the extent to which participants show this effect. Based on Szpak and colleagues' attentional withdrawal hypothesis, one would predict that participants will overestimate the length of the line segment nearest themselves to a larger extent when in the presence of a partner than when alone. On the other hand, if the partner draws attention in the same way as other cue types, one would expect participants to instead overestimate the length of the more distant line segment (Toba et al., 2011) to a larger extent in the direction of the partner's location. A third possibility, given the possibly marginal magnitude of the effect, is that nearby others may have no impact on attention in this context, from which one would predict no significant shift in line bisection performance across manipulations of partner position.

# MATERIALS AND METHODS

Sample size was selected based on an a priori power analysis using G\*Power 3.1.9.2 software (Faul et al., 2007). In their first experiment, Szpak et al. (2016) report as their main measure of interest a SIS of −0.22 mm (SD = 0.43) and this was compared to a theoretical value of zero (which would indicate that attention is neither attracted nor withdrawn from the co-actor). An effect size (d) was calculated to have a value of 0.51. In order to detect this effect with a power of 0.80, a total sample size of 16 pairs (32 participants) was required. More participants than this were collected in anticipation of the need to make exclusions. Participants were recruited from a pool of undergraduate students and received course credit for participating. Twenty-seven pairs (n = 54) were tested. Mean age was 21.4 years (SD = 5.2). Participants self-identified as female (n = 42) and male (n = 12). Their self-reported ethnicities were Asian (n = 35), Caucasian/White (n = 14), Latin American (n = 1), Middle Eastern (n = 1), Multiethnic (n = 1), and undisclosed/could not be categorized (n = 2). Based on their handedness responses (Oldfield, 1971), they were right-handed (n = 51) or ambidextrous (n = 3). Participants were paired with one another at random, and provided informed consent before participating.
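The effect size and sample size reasoning above can be illustrated with a short sketch. It is not the G\*Power computation itself: it assumes a two-tailed one-sample test at α = 0.05 and uses a normal approximation, so the function names (`cohens_d_one_sample`, `approx_n_one_sample`) and the approximation are illustrative assumptions. Because the approximation ignores the extra uncertainty of the t distribution, it should be expected to land slightly below the 32 participants reported.

```python
import math
from statistics import NormalDist


def cohens_d_one_sample(mean, sd, mu0=0.0):
    """Standardized distance of a sample mean from a comparison value."""
    return abs(mean - mu0) / sd


def approx_n_one_sample(d, alpha=0.05, power=0.80):
    """Normal-approximation sample size for a two-tailed one-sample test.

    Assumption: this is a rough stand-in for an exact noncentral-t
    calculation and will tend to underestimate n by a small amount.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-tailed test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(((z_alpha + z_power) / d) ** 2)


# Values reported for the first experiment of Szpak et al. (2016)
d = cohens_d_one_sample(mean=-0.22, sd=0.43)
n = approx_n_one_sample(d)
print(f"d = {d:.2f}, approximate n = {n}")
```

The approximation is useful mainly as a sanity check that the reported effect size and target power imply a sample in the low thirties.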

Three chinrests were placed 450 mm apart. Stimuli were created and presented using PsychoPy software (Peirce, 2007). Each black and white line was 18 mm long, and was bisected at one of six possible locations (−3, −2, −1, 1, 2, or 3 cm from true center, see **Figure 1**). The central chinrest was located in front of the monitor at a distance of 600 mm, within peripersonal space (Gamberini et al., 2008). On each trial, participants were instructed to indicate, using keypresses, the shorter side of each line.<sup>1</sup> The absolute position of each line was jittered between trials from −1.5 to 1.5 mm of the true center of the screen. Three circles indicated when each participant should provide their response; these circles were also jittered −1.5 to 1.5 mm from true center. Participants' hands were covered by a cloth, preventing them from seeing one another's responses. On each trial, the order of participants' responses was randomized. There were 72 trials per block. Each pair participated in six blocks: one person would be seated in the center for three blocks in which their partner was seated on the left, seated on the right, or absent from the room (these blocks were presented in a random order). Then, the procedure was repeated with the other participant seated in the center.

After testing, questionnaire responses were collected: demographic information, ratings of participants' liking and awareness of their partner, the Inclusion of Other in the Self scale, the Edinburgh Handedness Inventory-short form, the Self-Consciousness Scale, and the Autism Spectrum Quotient (Scheier and Carver, 1985; Aron et al., 1992; Baron-Cohen et al., 2001; Veale, 2014).

Data from two pairs were excluded because one member failed to comply with instructions. In addition, three individuals were excluded when testing sessions were forced to end early, one individual was excluded for self-reporting an attention-related diagnosis (ADHD), and three individuals were excluded after reporting that their vision was below normal and uncorrected; in these cases, data from partners were retained in the analysis. Furthermore, 12 additional participants were excluded because they responded with more than 90% right or left answers on a single block or because, on any block, they made more "right is longer" responses in the most extreme leftward bisection condition than in the most extreme rightward bisection condition. This yielded 31 participants in the final analysis. From participants' responses, the point of subjective equality (the theoretical line bisection position for which the participant would produce 50% "left" and 50% "right" responses) was calculated for each block in which they were seated in the center. This procedure was intended to match previous work (Nicholls et al., 2014; Szpak et al., 2016). Line bisection thresholds were estimated separately for each participant and each seating condition (partner left, no partner, partner right) by fitting psychometric functions to response data using the Palamedes toolbox (Prins and Kingdom, 2009). A cumulative Gaussian function was fit to response data using a Maximum Likelihood criterion, where the threshold parameter was free to vary, the slope was fixed at 1, and the guess and lapse rates were both fixed at 0 (**Figure 2**); these parameters are consistent with the function fitting performed in Szpak et al. (2016).
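The threshold estimation described above can be sketched in a few lines. This is not the Palamedes routine: it is a minimal stand-in that assumes the slope parameter corresponds to 1/σ of the cumulative Gaussian, fixes guess and lapse rates at 0, and finds the threshold by golden-section search. The function names and the response counts are hypothetical and serve only to illustrate the idea.

```python
import math
from statistics import NormalDist


def neg_log_likelihood(threshold, offsets, n_right, n_total, slope=1.0):
    """Binomial negative log-likelihood of a cumulative Gaussian.

    Assumption: slope is treated as 1/sigma; guess and lapse rates are 0.
    """
    curve = NormalDist(mu=threshold, sigma=1.0 / slope)
    nll = 0.0
    for x, k, n in zip(offsets, n_right, n_total):
        p = min(max(curve.cdf(x), 1e-12), 1.0 - 1e-12)  # guard against log(0)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll


def fit_threshold(offsets, n_right, n_total, lo=-3.0, hi=3.0, tol=1e-6):
    """Golden-section search for the threshold that minimizes the NLL."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if (neg_log_likelihood(c, offsets, n_right, n_total)
                < neg_log_likelihood(d, offsets, n_right, n_total)):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


# Hypothetical block: 6 transector offsets, 12 trials each (72 trials in total)
offsets = [-3, -2, -1, 1, 2, 3]
n_total = [12] * 6
n_right = [0, 1, 3, 9, 11, 12]  # made-up counts of "right" judgments
pse = fit_threshold(offsets, n_right, n_total)
print(f"Estimated point of subjective equality: {pse:.3f}")
```

With only the threshold free, the fitted curve can shift left or right but cannot change steepness, which is why responses that never reach 0% or 100% at the extreme offsets are fit poorly.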

<sup>1</sup>After testing was completed, we noted that this instruction varied from Szpak et al. (2016), who asked participants to respond to the longer side of the line. However, our instruction is more consistent with the instruction sometimes used in the literature to judge whether the transector is to the left or right of the true line midpoint in the presence of a cue (McCourt and Olafson, 1997; Toba et al., 2011). Because task instructions may interact with performance on this type of task (Fink et al., 2002), we performed a control experiment to address the possibility that task instructions could yield a difference between our findings and those of Szpak and colleagues. Matching the main sample, we targeted a sample size of 32 participants after exclusions. Forty-seven new participants performed two blocks of the control task. In one block, participants followed the "respond shorter" instruction. In the other block, participants followed the "respond longer" instruction. Following the same exclusion criteria used for the main sample, 11 participants were excluded, leaving 36 for the analysis. Thresholds obtained in the two conditions were not statistically different [t(35) = −1.24, p = 0.22, BF<sub>10</sub> = 0.36], indicating that the point of subjective equality measurement was not affected by instruction.

# RESULTS

All supporting data for this paper are available at https://osf.io/pghe5/. **Figures 3**, **4** were generated using the ggplot2 package in R software (Wickham, 2009).

# Preplanned Analyses

Based on the thresholds identified for each participant for each partner location (**Figures 2**, **3**), the mean change in threshold toward the other individual was calculated in mm, termed the "SIS" (Szpak et al., 2016). A positive SIS indicates a shift in attention toward the other individual while a negative score indicates a shift in attention away (and toward the self). In their first experiment, Szpak and colleagues found a mean SIS of −0.22 mm which was significantly different from zero. In the current study, mean SIS was found to be 0.12 mm, with a 95% confidence interval of (−0.06, 0.30). A Bayes Factor for this analysis was obtained using the ttestBF function in the BayesFactor package for R (Morey and Rouder, 2015). Mean SIS was not significantly different from zero [two-tailed, one-sample t-test: t(30) = 1.35, p = 0.19; BF<sub>10</sub> = 0.44]. A Bayes Factor smaller than one indicates greater evidence for the null hypothesis (the measured value is not different from zero) than the alternative hypothesis (the measured value is different from zero). In this case, the data are 1/0.44 or 2.3 times more likely under the null than the alternative hypothesis. However, the present mean SIS was significantly different from that calculated by Szpak and colleagues [two-tailed, one-sample t-test: t(30) = 3.87, p = 0.0006, BF<sub>10</sub> = 55.7]. A Bayes Factor between 10 and 100 is considered "strong" evidence for the alternative hypothesis that our measured value is different than the comparison value (Kass and Raftery, 1995).

FIGURE 3 | Mean line bisection thresholds across partner locations. Error bars represent within-subject 95% confidence intervals (Cousineau, 2005; Morey, 2008). Note that thresholds are not different based on partner position.
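The Bayes Factor reasoning in the preplanned analyses can be made concrete with a small helper. The sketch below inverts BF<sub>10</sub> when it falls below one and attaches the verbal categories from the Kass and Raftery (1995) scale (1 to 3.2, 3.2 to 10, 10 to 100, above 100). The function name `describe_bf10` is hypothetical, and applying the same verbal scale to evidence for the null is an assumption made here for illustration.

```python
def describe_bf10(bf10):
    """Return (favoured hypothesis, ratio in its favour, verbal label).

    Assumption: the Kass and Raftery (1995) categories are applied
    symmetrically, i.e., to 1/BF10 when the null is favoured.
    """
    if bf10 >= 1:
        favoured, ratio = "alternative", bf10
    else:
        favoured, ratio = "null", 1.0 / bf10

    if ratio < 3.2:
        label = "not worth more than a bare mention"
    elif ratio < 10:
        label = "substantial"
    elif ratio <= 100:
        label = "strong"
    else:
        label = "decisive"
    return favoured, ratio, label


# Bayes Factors reported in this section
for bf in (0.44, 55.7):
    favoured, ratio, label = describe_bf10(bf)
    print(f"BF10 = {bf}: {ratio:.1f} times more likely under the {favoured} ({label})")
```

Read this way, the test against zero only weakly favors the null, whereas the test against the original estimate of −0.22 mm favors a difference much more clearly.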

# Exploratory Analyses

In the original work by Szpak and colleagues, calculation of SIS involved collapsing effects across left and right seating positions. To investigate the possibility that leftward and rightward effects might differ in our sample, a within-subjects ANOVA was performed with partner location (left, right, and absent) as the IV and threshold as the DV. This analysis revealed no effect of partner location on line bisection thresholds [F(2,60) = 1.07, p = 0.35]. To evaluate whether this constituted good evidence for the null hypothesis, a Bayesian ANOVA was performed using the BayesFactor package in R (Morey and Rouder, 2015). The Bayes Factor (BF<sub>10</sub>) for this analysis was 0.13. Therefore, these data are 1/0.13 or 7.7 times more likely under the null hypothesis than under the alternative hypothesis.

To address the question of whether individual participants could show meaningful shifts toward or away from their partner, the error around each participant's individual SIS was calculated (**Figure 4**). To estimate the error of threshold estimates, a non-parametric bootstrap was performed, using 1000 bootstrap simulations for each condition. The standard error for the SIS for each participant was calculated as the standard deviation of the composite bootstrapped sampling distribution created by averaging the subtraction of the "no partner" from the "partner right" and the "partner left" from the "no partner" bootstrapped sampling distributions. The 95% CI was calculated individually for each participant as their SIS estimate, ±1.96 times the standard error. A negative SIS with a 95% CI that did not include zero was considered attentional withdrawal for that individual. A positive SIS with a 95% CI that did not include zero was considered attentional attraction for that individual. Within the final sample of 31 participants, five instances of attentional withdrawal were found (three females paired with females, one female paired with a male, one male paired with a female) and nine instances of attentional attraction (seven females paired with females, two females paired with males). The remaining 17 individuals in the sample did not fit either definition and could be considered attentionally neutral with respect to their co-actor. To investigate potential sources of this individual difference in SIS, the correlation between SIS and the following measures was calculated: rating of liking the partner (1–5), rating of awareness of the partner (1–5), self-other integration score, total score on the self-consciousness scale, and total score on the autism quotient. None of these measures was significantly correlated with SIS (all r between −0.16 and 0.01, all p > 0.42). 
However, male and female subjects differed from one another in their SISs [t(15.6) = 2.57, p = 0.02, BF<sub>10</sub> = 1.58], with women showing positive scores on average (M = 0.20) and men showing negative scores on average (M = −0.19), see **Figure 3**. Subjects who were tested first within their pair did not significantly differ in SIS from subjects who were tested second [t(23.8) = −0.19, p = 0.85, BF<sub>10</sub> = 0.35].
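The per-participant classification described in the exploratory analyses can be sketched as follows. The sketch takes bootstrapped thresholds as given rather than re-running the bootstrap, and it assumes that thresholds are signed so that positive values lie to the right, that bootstrap samples are paired across conditions by index, and that the sample standard deviation is used. The function names and the tiny input lists are hypothetical; a real analysis would use 1000 bootstrap values per condition.

```python
from statistics import stdev


def social_influence_score(left, none, right):
    """Mean threshold shift toward the partner, in mm.

    Averages (partner right - no partner) and (no partner - partner left).
    Assumption: positive thresholds lie to the right of true center.
    """
    return ((right - none) + (none - left)) / 2


def sis_with_ci(estimates, boot, z=1.96):
    """Return SIS, its bootstrap standard error, the 95% CI, and a label."""
    sis = social_influence_score(
        estimates["left"], estimates["none"], estimates["right"]
    )
    # Composite sampling distribution built from the bootstrapped thresholds
    composite = [
        social_influence_score(l, n, r)
        for l, n, r in zip(boot["left"], boot["none"], boot["right"])
    ]
    se = stdev(composite)
    lower, upper = sis - z * se, sis + z * se
    if lower > 0:
        label = "attraction"
    elif upper < 0:
        label = "withdrawal"
    else:
        label = "neutral"
    return sis, se, (lower, upper), label


# Hypothetical participant with three bootstrap values per condition
estimates = {"left": -0.5, "none": 0.0, "right": 0.5}
boot = {
    "left": [-0.4, -0.5, -0.6],
    "none": [0.0, 0.0, 0.0],
    "right": [0.4, 0.5, 0.6],
}
sis, se, ci, label = sis_with_ci(estimates, boot)
print(f"SIS = {sis:.2f} mm, SE = {se:.2f}, CI = ({ci[0]:.2f}, {ci[1]:.2f}), {label}")
```

With a single no-partner estimate, the two differences average to half of the partner-right minus partner-left difference, so the score is driven by the contrast between the two seated-partner blocks.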

# DISCUSSION

The present work attempted to replicate and extend line bisection as an effective method for measuring a spatial change in social attention. Previous work found that during a joint line bisection task, on-screen attention was biased away from the side of the screen nearest the co-actor (Szpak et al., 2016). Thus this task could provide a useful and straightforward index of social attentional shifts, and could be used alongside paradigms that measure action kinematics, reaction time, and search behavior in joint contexts (Sebanz et al., 2003; Becchio et al., 2010; Wahn et al., 2017). To further characterize the tool, measures about the individual (Autism-Spectrum Quotient, Self-Consciousness Scale) and the pair (Inclusion of Other in the Self Scale, ratings of awareness and liking of the other individual) were collected in order to try to capture sources of individual differences in this measure. The task was matched to the original paradigm on a host of factors, including stimulus dimensions, viewing distance, interpersonal spacing, and sequence of blocks and of trials. Task instructions differed from those used in the original paradigm but more closely resembled those used in the literature (McCourt and Olafson, 1997; Toba et al., 2011). A control experiment (see footnote 1) excluded instruction as a meaningful source of empirical variation between experiments.

This work failed to replicate the effect of attentional withdrawal from the co-actor as measured by on-screen line bisection performance. These discrepant results suggest three possibilities. First, it may be that the attentional withdrawal phenomenon is real but fragile, such that small cross-laboratory differences or demographic differences between previous and current samples extinguish the effect at the group level. In this scenario, the present work would represent a false negative with respect to the "true" effect, or would capture a boundary condition under which this effect is not observed. Assuming that the effect size of the original study is accurate, the present failure to replicate is unlikely to be a false negative caused by inadequate power, given the combination of an achieved power of 0.79, the observation of a positive overall SIS, and strong evidence that this value differed from that obtained by Szpak et al. (2016). A second possibility, given the discrepancy between current and previous work, is that the attentional withdrawal phenomenon is real but, due to the small power of the original study, the original effect size estimate was inflated and thus the present study was underpowered (Ioannidis, 2008; Button et al., 2013; Open Science Collaboration, 2015). This seems unlikely for the same reasons mentioned above: the two results were significantly different from one another and differ in their direction rather than simply their magnitude. This seems to indicate that the two studies do not capture the same process.

A third possibility is that co-actors do not impact line bisection performance in this paradigm, and prior work reflects an unfortunate false positive.

There are two methodological points that merit consideration here. First, Szpak et al. (2016) do not report details about the fit of their curves. Our participants often failed to reach 100% "left" responses in the leftmost stimulus condition (and 100% "right" in the rightmost, see **Figure 2**), presumably because even the most extreme stimulus conditions remained somewhat difficult. Assuming that the current data resemble the previous sample, this raises a concern about the validity of this procedure as a measure of line bisection thresholds. While the current work followed the procedure used by Szpak et al. (2016) for the purpose of a straightforward replication, future work might employ more sophisticated curve-fitting (e.g., allowing additional parameters in addition to threshold to vary) to ensure that PSE calculations are truly reflective of participants' response patterns. Second, the present study excluded a number of participants whose data did not meet criteria regarding accurate task performance (12 participants were excluded who responded with more than 90% right or left answers on a single block or who, on any block, made more "right is longer" responses in the most extreme leftward bisection condition as compared to the most extreme rightward bisection condition). Szpak and colleagues report excluding a maximum of three participants per experiment based on the width of their psychometric functions. While it is certainly possible that the original group were able to obtain superior participant compliance through some other means, the discrepancy is notable. If the present data are re-examined to include all participants who were initially excluded for data quality reasons, mean SIS actually takes on a significantly positive value [M = 0.24 mm; two-tailed, one-sample t-test: t(42) = 2.06, p = 0.046, BF<sub>10</sub> = 1.11]. Thus, the inclusion of additional participants does not lead to a replication of the attentional withdrawal effect obtained by Szpak and colleagues; if anything, it provides support for an attentional attraction effect that dovetails with much of the social attention literature (e.g., Toba et al., 2011).

While evidence of attentional withdrawal in the joint line bisection task was not shown at the group level, exploratory analyses revealed an interesting underlying structure within the current sample. First, a subset of individuals showed evidence of attentional withdrawal (16%) while others showed attentional attraction (29%). As noted, attentional attraction is consistent with the task performance one would expect based on the bulk of the social attention and line bisection literatures (Friesen and Kingstone, 1998; Theeuwes and Van der Stigchel, 2006; Garza et al., 2008; Toba et al., 2011), suggesting that for these participants, the co-actor might impact the attention system through similar mechanisms as those involved for other cue types. Attentional withdrawal, on the other hand, is consistent with the social discomfort hypothesis: that attention is withdrawn from nearby others under conditions of personal space invasion (Terry and Lower, 1979; Szpak et al., 2016). None of the questionnaire measures correlated with the SIS, so it is difficult to speculate about any underlying dimensions on which participants varied that could explain their different performances: self-consciousness, autistic traits, integration of the other into the self, and awareness or liking of the other individual were all independent of SIS. However, gender emerged as an organizing variable, with men generally showing attentional withdrawal from the co-actor, and women showing attentional attraction. It would be interesting to investigate in the future whether men experienced the situation as more invasive of their personal space (as would be predicted by the social discomfort hypothesis), perhaps due to larger body size, and/or whether women were more likely to attend to the other individual as they would other cue types (as would be predicted by the majority of the line bisection literature). The latter prediction could be consistent with work finding differences in sensitivity to social information across the sexes. This includes a higher willingness to make eye contact and a stronger tendency to orient to faces by female as compared to male infants, and stronger gaze-cueing effects in female as compared to male adults (Connellan et al., 2000; Lutchmaya and Baron-Cohen, 2002; Lutchmaya et al., 2002; Bayliss et al., 2005; Frischen et al., 2007). In conclusion, based on the current evidence we see little support for the joint line bisection task as a reliable overall measure of spatial allocation of social attention. Thus we cannot recommend it for future application within this domain. However, the data do suggest that, should researchers wish to pursue the bisection task as a means of measuring social attention, it is best investigated at the individual level rather than the group level.

# ETHICS STATEMENT

The protocol was approved by the University of British Columbia Behavioural Research Ethics Board. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

# AUTHOR CONTRIBUTIONS

JD, AD, and AK conceived and designed the study. JD programmed the experiment and coordinated the acquisition of the data; wrote the first draft of the manuscript. JD and KR analyzed and interpreted the data. All authors contributed to the final manuscript.

# FUNDING

This study was supported by the Natural Sciences and Engineering Research Council of Canada (Award ID: RGPIN 170077) and the Social Sciences and Humanities Research Council of Canada (Award ID: 435-2013-2200).

# ACKNOWLEDGMENTS

The authors would like to thank Alissa Burrows and Jane J. Kim for assistance in testing participants.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Dosso, Roberts, DiGiacomo and Kingstone. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# "Two Minds Don't Blink Alike": The Attentional Blink Does Not Occur in a Joint Context

#### Merryn D. Constable1,2 \*, Jay Pratt2,3 and Timothy N. Welsh1,2,3 \*

<sup>1</sup> Faculty of Kinesiology and Physical Education, University of Toronto, Toronto, ON, Canada, <sup>2</sup> Department of Psychology, University of Toronto, Toronto, ON, Canada, <sup>3</sup> Centre for Motor Control, University of Toronto, Toronto, ON, Canada

Typically, when two individuals perform a task together, each partner monitors the other partner's responses and goals to ensure that the task is completed efficiently. This monitoring is thought to involve a co-representation of the joint goals and task, as well as a simulation of the partner's performance. Evidence for such "co-representation" of goals and task, and "simulation" of responses has come from numerous visual attention studies in which two participants complete different components of the same task. In the present research, an adaptation of the attentional blink task was used to determine if co-representation could exert an influence over the associated attentional mechanisms. Participants completed a rapid serial visual presentation task in which they first identified a target letter (T1) and then detected the presence of the letter X (T2) presented one to seven letters after T1. In the individual condition, the participant identified T1 and then detected T2. In the joint condition, one participant identified T1 and the other participant detected T2. Across two experiments, an attentional blink (decreased accuracy in detecting T2 when presented three letters after T1) was observed in the individual condition, but not in joint conditions. A joint attentional blink may not emerge because the co-representation mechanisms that enable joint action exert a stronger influence at information processing stages that do not overlap with those that lead to the attentional blink.

#### Edited by:

Roberta Sellaro, Leiden University, Netherlands

#### Reviewed by:

Mario Dalmaso, Università degli Studi di Padova, Italy; Pamela Baess, University of Hildesheim, Germany

#### \*Correspondence:

Merryn D. Constable, merryndconstable@gmail.com; Timothy N. Welsh, t.welsh@utoronto.ca

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 02 February 2018 Accepted: 24 August 2018 Published: 12 September 2018

#### Citation:

Constable MD, Pratt J and Welsh TN (2018) "Two Minds Don't Blink Alike": The Attentional Blink Does Not Occur in a Joint Context. Front. Psychol. 9:1714. doi: 10.3389/fpsyg.2018.01714

Keywords: attentional blink, joint action, co-representation, joint information processing, cognition, attention

# INTRODUCTION

In many daily tasks, such as cooking in a kitchen or searching for several items in a room, an individual will recruit the help of other people to complete the task more efficiently than if that individual performed the task alone. For this efficiency to occur, each individual in the group should know the overall goal of the task and the smaller sub-goals of their co-actors. Further, each individual should monitor their co-actors' actions so that they can coordinate efforts and decrease redundant performance. Consider, for example, a situation in which Bob and Doug are searching for the items they need to go out and buy coffee and jelly doughnuts from the local coffee shop. Both Bob and Doug understand the super-ordinate goal of leaving the house efficiently and that, to achieve that goal, each person might be responsible for finding different items: Bob may be tasked with retrieving the wallet and the keys to the van while Doug must find the hats and mittens. To ensure the overall job is completed efficiently, Bob and Doug will likely maintain the goals of
the other person in mind and monitor the performance of each other to know when the jobs are done. In this scenario, holding the co-actors' task in mind will not only help to determine when the whole task is done, but it may also help to complete the overall task more efficiently because each individual would not ignore a target of their partner if they happen to come across it first: that is, Doug should not ignore and leave behind the keys if he finds them before Bob. Stopping their own search to identify and obtain the target of the partner might slow down their own sub-tasks, but may increase the efficiency of the overall search task. Thus, maintaining (co-representing) a partner's goals in addition to one's own goals may make the overall task more efficient despite a small and temporary cost of the individual's own performance.

To gain an understanding of the processes enabling the completion of joint action and search tasks, researchers have typically adapted paradigms that have been developed to understand how people perform tasks individually to the joint action context for use with dyads (e.g., Sebanz et al., 2003; Welsh et al., 2005; Atmaca et al., 2008; Constable et al., 2015). The key feature of these studies is that the task is divided between two individuals such that each individual performs a subtask that is essentially independent of their co-actor, but that collectively the pair of individuals are performing the full task in a social environment. The logic behind this approach is the following: If individuals working independently in this social environment do not co-represent or code for the actions and goals of the partner, then the behavioral effect that emerges when an individual completes the whole task while acting alone should not emerge in the performance of the co-actors. However, if individuals working independently in this social environment co-represent the actions and goals of the co-actor, then the behavioral effect that emerges when an individual completes the whole task while acting alone should emerge in the performance of the co-actors. The results of these joint action studies have been largely consistent with the latter hypothesis because behavioral effects that emerge when individuals complete a whole task alone also emerge in the behavior of individuals completing subcomponents of the whole task. Thus, even though each individual has a distinct and independent task to complete, the data from joint action and search studies suggest that individuals know and code for the goals and tasks of their partner simultaneously with their own goals and responses.

An example of such a joint action and social search task that has been used to generate an understanding of the co-representation process is one in which two participants sit across from each other at a table and execute a series of movements from separate starting positions to a pair of target locations (e.g., Welsh et al., 2005, 2007, 2009; Hayes et al., 2010; Skarratt et al., 2010; Cole et al., 2012, 2018; Doneva and Cole, 2014; Janczyk et al., 2016; see also Ondobaka et al., 2012). In the studies by Welsh and colleagues, the targets appear randomly at one of two locations such that the location of the target on trial "n" does not predict the location of the target on trial "n+1." Participants take turns responding to the targets in a paired-alternating manner such that the actor (Bob) would make two responses and then the partner (Doug) would execute two responses and so on (i.e., BBDDBBDD, etc.). With this method, the researchers were able to examine reaction times (RTs) on trials on which the target was in the same or a different location as the previous trial. When individuals perform such a sequence of responses, there are a multitude of studies that show RTs on trials in which the target is at the same location as the previous trial are longer than if the target is at a different location. These longer RTs for trials with repeated relative to different target locations are thought to emerge because shifting attention to and executing a response at one location eventually leads to the activation of an inhibitory code at that location. This inhibitory code hinders the return of attention and/or the reactivation of the response to that location – an inhibition of return (IOR) effect [e.g., Posner and Cohen, 1984; Maylor and Hockey, 1985; Welsh and Pratt, 2006; see Klein (2000) for review].

The key finding of the Welsh et al. (2005, 2007, 2009) studies [see also Cole et al. (2012, 2018)] was that an IOR effect emerged both when the participants acted two times in a row (an individual IOR effect on BB and DD trials) and when the participants acted after observing the response of their partner (a social IOR effect on BD and DB trials). Thus, IOR emerged when the individual executed their own response or observed the response of the partner. It is important to reemphasize here that, although both individuals executed movements to the same set of targets, their responses were independent from each other and were incidental to each partner's task. In other words, the partner's previous response neither predicted nor was coordinated with the subsequent response of the actor, yet IOR emerged. Although some researchers have suggested that the social IOR effect emerges solely due to attentional mechanisms (see Atkinson et al., 2014; Doneva and Cole, 2014), the most common account is that the social IOR effect is generated because the knowledge and observation of the partner's action lead to a co-representation and simulation of the partner's response, subsequently activating the same mechanisms that generate the IOR effect when the person acts alone. In support of the hypothesis that the same mechanisms are activated following the execution and observation of the response, Welsh et al. (2009) found that the magnitude of the social IOR effects (RTs on same target trials minus RTs on different target trials) was significantly correlated with the magnitude of the IOR effect on individual trials. Overall, the data from the studies of the social IOR effect indicate that, even though two individuals complete independent tasks in succession in a common environment, the tasks, goals, and actions of the independent partners are co-represented and affect each other's performance.
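As a rough illustration of how the individual and social IOR effects described above could be computed from a paired-alternating sequence, the following Python sketch labels each trial by whether the preceding response came from the same actor or from the partner, and then takes the mean RT at repeated target locations minus the mean RT at changed locations. The trial records, field names, and RT values are hypothetical and are not taken from the cited studies.

```python
from statistics import mean

def ior_effects(trials):
    """Mean RT (same location) minus mean RT (different location), computed
    separately for trials that follow the actor's own response ("individual")
    and trials that follow the partner's response ("social")."""
    cells = {"individual": {"same": [], "different": []},
             "social": {"same": [], "different": []}}
    for prev, cur in zip(trials, trials[1:]):
        context = "individual" if cur["actor"] == prev["actor"] else "social"
        relation = "same" if cur["location"] == prev["location"] else "different"
        cells[context][relation].append(cur["rt_ms"])
    return {context: mean(c["same"]) - mean(c["different"])
            for context, c in cells.items()}

# Hypothetical BBDDBBDD... sequence (B = Bob, D = Doug); all values invented.
actors = "BBDDBBDDB"
locations = ["L", "L", "L", "R", "R", "L", "R", "R", "L"]
rts = [400, 430, 425, 395, 420, 400, 390, 435, 405]
trials = [{"actor": a, "location": loc, "rt_ms": rt}
          for a, loc, rt in zip(actors, locations, rts)]

effects = ior_effects(trials)
print(effects)  # a positive value means slower responses at repeated locations
```

A positive difference in the "social" cell is the pattern that the co-representation account would predict for trials following the partner's response.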

Similar co-representation and simulation accounts have been extended to account for other joint action and social search tasks such as the joint negative priming effect (Frischen et al., 2009; Welsh and McDougall, 2012). In these studies, participants are presented with a pair of displays (first a prime and then a probe display). Each display has a target and a distractor stimulus, and the task is to respond to the location of the target and ignore the location of the distractor. The location of the target and distractor varies from trial-to-trial and from prime to probe display. The two key trial types in the negative priming task are: (1) the baseline control trials – the target and distractor on the probe display appear at different locations from the
target and distractor on the prime display and (2) the ignored repetition trials – the target on the probe display appears at the same location as the distractor on the prime display. It has repeatedly been demonstrated that RTs for probe targets on ignored repetition trials are longer than on baseline control trials. One of the predominant explanations of the longer RTs for probe targets of ignored repetition trials than on baseline trials is the selection inhibition account [Tipper, 1985; see Tipper (2001) for a review]. According to this account, selection of the target from the distractor on the prime display involves both the activation of the target information and the active inhibition of the distractor information. The inhibitory mechanism activated for the distractor on the prime display persists for some time. If the probe target is subsequently presented at the location of the prime distractor, the residual inhibition at that location hinders processing of the probe target at that location, increasing RTs. On baseline trials, the probe target is presented at a previously unoccupied location and thus processing of that probe target is unaffected by the selection process on the prime display and is relatively more efficient than the processing of the probe target on ignored repetition trials. Thus, this negative priming effect for probe targets occurs because of the successful target/distractor selection on the prime display.

In the individual version of the task used in studies of the joint negative priming effect (Frischen et al., 2009; Welsh and McDougall, 2012), a single participant completed the selection on both prime and probe displays. In the joint version, participants completed the task in pairs – one participant (Bob) completed the selection on the prime display and only responded to target 1, and the second participant (Doug) completed the selection on the probe display and only responded to target 2 (Frischen et al., 2009; Welsh and McDougall, 2012). These studies have revealed that, even though each individual is responsible for only responding to their own stimuli on separate displays (and could effectively ignore the stimuli in their partner's display), a negative priming effect still emerges on joint trials – Doug's RTs to target 2 on the probe trials are longer when target 2 in the probe display is presented at the same location as distractor 1 on the prime display than when target 2 is presented at a different location. This joint negative priming effect was suggested to emerge because, even though Bob's (the first person) task precedes and is irrelevant to Doug (the second person) and Doug could have simply ignored the prime display, Doug will spontaneously co-represent the goals and actions of Bob and simulate Bob's performance (i.e., simulate the target selection and response execution as well as the subsequent inhibition of the distractor). This co-representation and subsequent simulation of task performance activates the same mechanisms that would be activated if Doug worked alone and performed the entire task. This simulation leads to the same interference effects that emerge as though the individual performed the task on their own [see also Welsh et al. (2005) for a similar account of the social IOR effect]. 
In support of the hypothesis that the same mechanisms are activated on individual and joint trials, Welsh and McDougall (2012) reported that the magnitude of the negative priming effect on individual and joint trials was significantly correlated (see also Welsh et al., 2009). Overall, the results of the joint negative priming and social IOR studies provide evidence in favor of the hypothesis that coactors maintain a representation of their partner's task and may engage in a simulation of their partner's performance when they observe that selection, even when it is temporally distinct and independent from their own task.

It is important to recognize that although the work reviewed here has shed some important new light on the processes of joint action and social searches, the tasks used in this work largely engage spatial and response selection processes (e.g., Sebanz et al., 2003; Welsh et al., 2005; Frischen et al., 2009; see also Ray and Welsh, 2011). That is, even though the social IOR and negative priming tasks have a temporal component in that participants consistently alternate their task performance, the main features that define these tasks are that participants must determine target from non-target locations and rapidly execute spatially defined responses to the selected target. How and if co-representation affects processes involving temporal selection and identification is largely unknown.

The primary goal of the present studies was to address this gap regarding temporal selection and identification by adapting a task that is better suited to investigating those processes in a joint action context: the attentional blink task. Importantly, the attentional blink is thought to result from the activation of mechanisms that are distinct from those that generate IOR and negative priming effects. In other words, the attentional blink task allows us to explore the co-representation of targets and temporal selection in joint action tasks in that participants alternate identifying targets in a task that does not involve spatial and response selection and execution as in previous joint tasks (i.e., we were not just measuring the IOR and NP processes in a different way).

In the typical (single participant) attentional blink task, an individual participant watches a series of stimuli (often letters) presented in rapid succession. The task of the participant is to watch the string of stimuli and determine if two targets are presented in the series of stimuli (e.g., Raymond et al., 1992, 1994). The key to the design of these tasks is that the two targets are embedded in the series of stimuli at different intervals apart from each other – the second target could be presented immediately after the first target (Lag 1) or two or more stimuli after the first target (Lag 2, Lag 3, etc.). The key finding from this work is that the detection of the second target (T2) is impaired by detection of the first target (T1), with the greatest impairment in performance occurring when T2 is presented two to three stimuli (Lag 2–3 or approximately 180 ms) after T1. Performance at identifying T2 typically increases and returns to baseline levels when T2 is four or more stimuli after T1 (Lag 4+). This short-term decrement in performance for identifying T2 at Lag 2–3 is known as the attentional blink [Raymond et al., 1992; see Dux and Marois (2009) for a review]. Although there is no single account of the attentional blink effect that can explain all the findings, most accounts are based on the notion that the effect occurs because of early attentional mechanisms or limited loading or processing resources in working memory, not response selection and production processing (see Dux and Marois, 2009; cf. Jolicoeur, 1998).

Participants in the present studies completed a series of attentional blink tasks. Each task consisted of a series of rapidly presented letters and participants were required to determine if two targets appeared in the string of letters. The three conditions were: (1) an individual condition in which one participant responded to both targets, (2) a joint condition in which one person (Bob) identified the first target (T1) and the partner (Doug) identified the second target (T2), and (3) a second joint condition in which the roles were reversed – the partner (Doug) responded to T1 and the other person (Bob) responded to T2. The tasks were completed such that one joint task was always completed first with Bob responding to T1 and Doug responding to T2. After the first joint task, the participants completed their individual task conditions. For the final block, participants completed the joint task again but with the roles switched – Doug responded to T1 and Bob responded to T2. The rationale for choosing this specific order will be discussed in subsequent paragraphs.

The most theoretically relevant conditions for the present study were the joint conditions in which one of the participants responded to T2 only. The performance of participants on identifying T2 when their partner identified T1 (joint condition) provided an index of the joint attentional blink. If knowledge and co-representation of a co-actor's task influences the mechanisms associated with the joint attentional blink, then a joint attentional blink will emerge. Such a finding would be consistent with the studies suggesting that knowledge and co-representation may lead to other social attention effects such as social IOR (Welsh et al., 2005) and negative priming (Welsh and McDougall, 2012). A joint attentional blink effect should emerge if the partner responding to T2 co-represents and simulates the performance of their partner who identifies T1. If knowledge and co-representation of the other person's task does not occur or if co-representation does not influence the processing of target information at these stages, then a joint attentional blink should not emerge.

Although the two joint conditions were the most critical, the individual condition served two important purposes. First, it served as a measure of internal validity to ensure that the stimulus conditions employed in the present study could evoke the attentional blink. Second, because each participant completed the individual task in between the two joint tasks, the individual task provided one-half of the participants with task experience prior to the critical joint task in which they identified T2 after their partner identified T1. Research has revealed that recent task experience can modulate the perception and imagination of action (e.g., Chandrasekharan et al., 2012; Wong et al., 2013) – two processes thought to involve action simulation. It is likely that task performance enhances these processes because experience strengthens the representations of the action and perceptual codes associated with the task, and leads to increased knowledge of the task and response conditions. Thus, providing one-half of the participants with task experience prior to responding to T2 allowed us to investigate whether or not experience with the task potentiates the co-representation and the subsequent joint attentional blink.

# EXPERIMENT 1

We adapted a conventional attentional blink task such that pairs of participants could complete both an individual attentional blink task and a joint attentional blink task. In the individual task, the participant identified both T1 and T2. In the joint task, one participant identified T1 and the other participant identified T2.

# Materials and Methods

#### Participants

Twenty-six undergraduate students from the University of Toronto participated in the experiment for course credit. Participants were aged 17–28 years (M = 19.88, SD = 2.78) and 19 were female. All participants had normal or corrected-to-normal vision. All participants provided informed consent prior to completing the tasks. The methods employed were approved by the Office of Research Ethics at the University of Toronto.

#### Apparatus

Stimuli were presented on a 1024 × 768 CRT monitor with a refresh rate of 85 Hz. Presentation of the stimuli was controlled by Python using PsychoPy (Peirce, 2007). All responses were entered on a standard QWERTY keyboard. The computer screen and keyboard were positioned on a table in front of the participants. During the individual block, participants sat directly in front of a computer screen (a distance of approximately 57 cm away). During the joint blocks, the participants sat side-by-side approximately 57 cm from the computer screen. The computer used in the joint tasks was different from that used in the individual tasks. The three computers were separated by an office partition. The position of the participants in the room and the task order they performed were randomized.

#### Design and Procedure

In each testing session, there was a total of four blocks of 240 trials. Each participant, however, only participated in three of the four blocks. The first block was always a joint condition, the second and third blocks were individual conditions that participants completed separately and simultaneously, and the last block was a joint condition. Specifically, the first block was a joint task in which Participant A responded to T1 and Participant B responded to T2. The second/third blocks consisted of individual task trials in which both participants completed the task individually by responding to both T1 and T2. The individual tasks were completed at the same time on separate computers. The final block of trials consisted of joint task trials in which Participant B responded to T1 and Participant A responded to T2.

The experimental program, and hence the trial sequence, was the same for each condition. A trial began with a black central fixation cross that was presented on a gray background for 16 frames (187.2 ms). This cross was followed by a stream of 19 black letters and 1 white letter (1 VA). Each letter was presented for two frames (23.4 ms) with an inter-stimulus interval of seven frames (81.9 ms). Each non-target letter was selected from a pool of letters without replacement. T1 was selected from a pool of eight target letters, was colored white, and could appear at position 4, 5, 6, or 7 in the letter stream. A T1 was presented on every trial. T2 was always a black X and could appear 1, 3, 5, or 7 letter positions after T1. T2 was presented on 50% of trials. Participants were instructed to remember T1 and T2 to respond to two probe questions following the stream of letters (**Figure 1**) using the keyboard. For T1, participants pressed the key that corresponded to the identity of the letter. For T2, participants pressed "Y" or "N" to indicate if they detected the presence of the black "X" or not, respectively. The response for T1 was always inputted prior to the response for T2.
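The trial structure described above can be sketched in Python, the language the experiment itself was run in, although this is not the authors' program. The sketch builds one 20-item stream and converts a lag into a T1-to-T2 onset asynchrony. Several details are assumptions made only for illustration: the eight T1 letters (`T1_POOL`), the composition of the non-target pool (all letters except X and the trial's T1), and the 11.7 ms frame duration, which is inferred from the reported timings (187.2 ms / 16 frames) rather than stated in the text.

```python
import random
import string

FRAME_MS = 11.7              # assumed: implied by 187.2 ms / 16 frames
LETTER_FRAMES = 2            # each letter shown for two frames
ISI_FRAMES = 7               # blank interval between letters
STREAM_LENGTH = 20           # 19 black letters + 1 white letter
T1_POSITIONS = (4, 5, 6, 7)  # 1-based positions in the stream
LAGS = (1, 3, 5, 7)
T1_POOL = "ABCDEFGH"         # hypothetical: the actual eight letters are not given

def make_trial(rng, t2_present, lag):
    """Build one stream as a list of (letter, color) tuples."""
    t1_pos = rng.choice(T1_POSITIONS)
    t1 = rng.choice(T1_POOL)
    t2_pos = t1_pos + lag if t2_present else None
    # Assumed non-target pool: every letter except X and this trial's T1.
    filler_pool = [c for c in string.ascii_uppercase if c not in ("X", t1)]
    n_fillers = STREAM_LENGTH - (2 if t2_present else 1)
    fillers = rng.sample(filler_pool, n_fillers)  # sampled without replacement
    stream = []
    for pos in range(1, STREAM_LENGTH + 1):
        if pos == t1_pos:
            stream.append((t1, "white"))
        elif pos == t2_pos:
            stream.append(("X", "black"))
        else:
            stream.append((fillers.pop(), "black"))
    return {"t1": t1, "t1_pos": t1_pos, "t2_pos": t2_pos, "stream": stream}

def onset_asynchrony_ms(lag):
    """Time from T1 onset to T2 onset for a given lag."""
    return lag * (LETTER_FRAMES + ISI_FRAMES) * FRAME_MS

rng = random.Random(1)
trial = make_trial(rng, t2_present=True, lag=3)
print(trial["t1"], trial["t1_pos"], trial["t2_pos"])
print(round(onset_asynchrony_ms(3), 1))
```

Under these assumptions a new letter appears every nine frames (about 105 ms), so Lag 3 places T2 roughly 316 ms after T1 onset.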

Trials in the different task conditions were always the same. For the individual condition, participants identified and responded to both T1 and T2. Participants shared the task in the joint blocks – one participant would respond to T1 and the other responded to T2. For a given block of trials in the joint conditions, the role of the participants remained the same such that one participant (Bob) responded to T1 and the other participant (Doug) responded to T2 in the first joint task, and then changed roles in the second joint task block so that Doug responded to T1 and Bob responded to T2 in the last block of trials. Although each participant was present for the instructions and knew the task of their partner, they were not specifically instructed to attend to or monitor their partner's task. In between the two joint tasks, each participant completed an individual block in which one person responded to both T1 and T2.

# Results and Discussion

Accuracy rates for T2 at each lag were calculated. For individual blocks, responses at T2 were only analyzed if the response at T1 was accurate. For the joint blocks, responses at T2 were analyzed regardless of accuracy at T1 because responses were made by two separate individuals (cognitive systems) and participants were not given any specific instructions to monitor the performance of their partner on T1. T1 was identified accurately on an average of 95.61% of trials (SD = 4.70%) in the joint task and an average of 90.46% of trials (SD = 7.49%) in the individual task. Data sets characterized by exceptionally low (below 50%) T2 accuracy at Lag 7 (at a time point in which identification should be at baseline levels; i.e., high) were removed prior to the analysis. This performance criterion accounted for the removal of two paired data sets in the joint condition and six individual data sets. To determine if an attentional blink was present in each condition, the analysis focused on the difference between the accuracy of detecting T2 at Lag 3 and Lag 5 (MacLean and Arnell, 2012).
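The scoring and exclusion rules above might be expressed as follows. This is a minimal sketch with invented trial records and field names. It also assumes that lag-wise accuracy is computed over T2-present trials only, which the text does not state explicitly.

```python
from collections import defaultdict

def t2_accuracy_by_lag(trials, require_t1_correct):
    """Percent correct T2 responses at each lag.

    require_t1_correct=True mirrors the individual blocks (T2 scored only
    when T1 was correct); False mirrors the joint blocks."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t in trials:
        if not t["t2_present"]:
            continue  # assumption: only T2-present trials enter lag-wise accuracy
        if require_t1_correct and not t["t1_correct"]:
            continue
        total[t["lag"]] += 1
        correct[t["lag"]] += int(t["t2_correct"])
    return {lag: 100.0 * correct[lag] / total[lag] for lag in sorted(total)}

def keep_data_set(accuracy_by_lag, threshold=50.0):
    """Exclusion rule: drop data sets with Lag 7 accuracy below 50%."""
    return accuracy_by_lag[7] >= threshold

# Invented records for illustration only.
trials = [
    {"lag": 3, "t2_present": True, "t1_correct": True, "t2_correct": False},
    {"lag": 3, "t2_present": True, "t1_correct": True, "t2_correct": True},
    {"lag": 3, "t2_present": True, "t1_correct": False, "t2_correct": True},
    {"lag": 7, "t2_present": True, "t1_correct": True, "t2_correct": True},
    {"lag": 7, "t2_present": False, "t1_correct": True, "t2_correct": True},
]
individual = t2_accuracy_by_lag(trials, require_t1_correct=True)
joint = t2_accuracy_by_lag(trials, require_t1_correct=False)
print(individual, joint, keep_data_set(individual))
```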

Separate paired samples t-tests were conducted on the individual and joint conditions (**Figure 2**).

An attentional blink was detected in the individual condition with T2 detection rates at Lag 3 being lower than at Lag 5, t(19) = −7.14, p < 0.001, 95% CI of Lag 3/Lag 5 difference [−30.84, −16.86]. Conversely, no joint attentional blink was observed, t(21) = 0.552, p = 0.587, 95% CI of Lag 3/Lag 5 difference [−3.36, 5.79]. To further determine if participants demonstrated an attentional blink in the joint task with a magnitude that is consistent with the attentional blink in the individual task, the difference between the detection rates at Lag 3 and Lag 5 in the joint task was calculated for each participant and compared to the 95% confidence intervals for the attentional blink in the individual task (−30.84 to −16.86). Only 1 of the 22 participants had a Lag 3/5 difference in the joint task that was in the range of the difference scores in the individual task.
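For readers who want to reproduce this kind of comparison, the paired-samples test on the Lag 3 and Lag 5 detection rates can be sketched with the Python standard library. The accuracy values below are invented, and the critical t value is passed in by hand rather than computed; neither comes from the study's data.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(lag3, lag5, t_crit):
    """Paired-samples t-test on Lag 3 minus Lag 5 detection rates.

    t_crit is the two-tailed critical t for the chosen confidence level
    and df = n - 1, supplied by the caller."""
    diffs = [a - b for a, b in zip(lag3, lag5)]
    n = len(diffs)
    mean_diff = mean(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    return {
        "mean_diff": mean_diff,
        "t": mean_diff / se,
        "df": n - 1,
        "ci": (mean_diff - t_crit * se, mean_diff + t_crit * se),
    }

# Invented percent-correct values for four hypothetical participants.
lag3 = [50, 60, 55, 65]
lag5 = [80, 85, 75, 90]
result = paired_t(lag3, lag5, t_crit=3.182)  # 95% two-tailed critical t, df = 3
print(result)
```

A negative mean difference with a confidence interval that excludes zero corresponds to the blink pattern reported for the individual condition, whose interval is centered on −23.85 percentage points.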

To further explore the possibility that an attentional blink was present in the individual and joint conditions, the detection rates for T2 at Lags 3 and 5 in the different tasks were submitted to separate Bayesian analyses. This analysis has the benefit of generating an estimate of the amount of evidence in favor of the alternative hypothesis that there is an attentional blink and the null hypothesis that there is no attentional blink in the different conditions. The model used in the Bayesian analysis specified that the detection rates in the Lag 5 condition would be higher than the Lag 3 condition. The results of the analysis were consistent with results of the t-tests. That is, the estimated Bayes factor (BF) for the individual condition indicated that the data were 40,789 times more likely under the alternative hypothesis than the null hypothesis (BF<sub>10</sub> = 40,789). This BF equates to extreme evidence in favor of the alternative hypothesis that there was an attentional blink in the individual condition. For the joint condition, the BF indicated that the data were 6.462 times more likely under the null hypothesis (BF<sub>10</sub> = 0.155). This result is considered moderate evidence in favor of the null hypothesis that there would be no difference between the detection rates for the Lags 3 and 5 in the joint condition. Overall, the results of the t-tests, Bayesian analyses, and the confidence intervals of the difference in detection rates at Lags 3 and 5 are consistent and provide converging evidence for the conclusion that an attentional blink was present in the individual condition, whereas no attentional blink was present in the joint condition.
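The verbal labels attached to these Bayes factors follow from the reciprocal relation between the two directions of evidence: a Bayes factor below 1 for the alternative hypothesis is read as its reciprocal in favor of the null. The Python sketch below applies commonly used descriptive cut-offs (3, 10, 30, and 100). These cut-offs are an assumption here, since the text does not say which classification scheme was used.

```python
def describe_bf10(bf10):
    """Return (favored hypothesis, evidence ratio, descriptive label).

    Cut-offs of 3, 10, 30, and 100 are assumed conventional categories,
    not thresholds stated in the article."""
    favored = "H1" if bf10 >= 1 else "H0"
    ratio = bf10 if bf10 >= 1 else 1.0 / bf10
    if ratio < 3:
        label = "anecdotal"
    elif ratio < 10:
        label = "moderate"
    elif ratio < 30:
        label = "strong"
    elif ratio < 100:
        label = "very strong"
    else:
        label = "extreme"
    return favored, ratio, label

print(describe_bf10(40789))  # individual condition
print(describe_bf10(0.155))  # joint condition
```

The reciprocal of the rounded value 0.155 is about 6.45. The reported 6.462 was presumably computed from the unrounded Bayes factor.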

As discussed earlier, it could be that experience performing a task increases the knowledge of the task and increases the potential for, or strength of, the co-representation and simulation of the partner's task. As such, a joint attentional blink might only emerge after the participant responding to T2 in the joint task has experience performing both parts of the task in the individual condition; that is, activation of the mechanisms leading to a joint attentional blink for individual participants may be dependent on the person being able to form a representation of the whole task. To test this prediction, additional analyses were performed on the subgroup of participants who performed the individual task before they completed the joint task in which they responded to T2 – the group of participants who identified T2 in the last block of trials. No joint attentional blink was observed in this subgroup, t(10) = 1.07, p = 0.31, 95% CI of the Lag 3/Lag 5 difference [−10.27, 3.61]. The results of the Bayesian analysis that tested a model where detection rates were lower at Lag 3 than at Lag 5 revealed that the data were 1.27 times more likely under the null hypothesis (BF<sub>10</sub> = 0.79). This analysis provides only anecdotal/inconclusive evidence in favor of the null hypothesis. Despite the low sample size in this case, it is clear that there is no behavioral evidence in favor of a joint attentional blink, even though the attentional blink is a robust phenomenon at the individual level.

# EXPERIMENT 2

Although an attentional blink was present in the individual task where the individual responded to both T1 and T2, there was no evidence for an attentional blink in the joint conditions of Experiment 1. This finding stands in contrast to previous joint visual search literature in which selection by the partner on the preceding trial/display subsequently affects the selection of the individual (e.g., Frischen et al., 2009; Welsh et al., 2005; Welsh and McDougall, 2012). Thus, it is possible that co-representation does not influence the mechanisms leading to the attentional blink. It is interesting to note, however, that there has been one previous report of a null joint effect – the psychological refractory period [see Dux and Marois (2009) for some discussion of the mechanisms involved in this effect]. Interestingly, Liepelt and Prinz (2011) reported that a social psychological refractory period was not spontaneously elicited in conditions similar to Experiment 1 in which no specific instructions were given to participants to monitor the partner's performance. A social psychological refractory period was observed, however, when participants were instructed to "monitor" their partner's task. These instructions essentially asked participants to perform the whole task as an individual, but only actually respond to one-half of the task.

In light of the findings of Liepelt and Prinz (2011), a second experiment was conducted to determine if specific instructions to monitor the performance of the partner could produce a joint attentional blink. Specifically, in Experiment 1, participants were not given any specific instructions to monitor the performance of the partner, and co-representation and the mechanisms of the attentional blink were left to emerge spontaneously. Thus, Experiment 2 was conducted to determine if a joint attentional blink would emerge when participants were specifically asked to monitor what their partner was doing.

# Materials and Methods


#### Participants

Forty-four undergraduate students from the University of Toronto participated in the experiment for course credit. A larger sample size was collected for Experiment 2 to increase the power for the analysis on the subgroup of participants who completed the individual task before completing the joint task – the subgroup that was analyzed to determine if completing the individual task first increases the potential for observing an attentional blink in the joint task. Participants were aged 18–30 years old (M = 18.78, SD = 1.88) and 26 were female. All participants had normal or corrected-to-normal vision.

#### Design, Stimuli, Apparatus, and Procedure

All aspects of Experiment 2 were identical to those of Experiment 1 except for two important differences. First, the experimenter specifically instructed participants to monitor the performance of their partner during the joint task. That is, participants were told that they would receive global feedback on their performance on the trials. Participants were also told that, to determine who made an error on an incorrect trial, they would need to pay attention to the other person's task. Global feedback was provided to participants after the response to the T2 was registered. If both participants answered correctly, they were notified that they were correct. If one participant made an error or both participants answered incorrectly, then they were notified that they were incorrect. Note that this manipulation is only a subtle promotion of monitoring behavior because if participants had faith in their own answer and abilities, then they would not need to monitor what the other person was doing. Further, there was no direct incentive for participants to monitor because they were not asked if the other person made a correct response or not.
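The global feedback rule described above can be sketched in a few lines of Python. This is only an illustration of the logic as stated in the text (the function name and the boolean encoding are hypothetical); it shows why the feedback alone does not reveal which partner erred.

```python
from itertools import product


def global_feedback(t1_correct: bool, t2_correct: bool) -> str:
    """Pair-level feedback: 'correct' only if both partners answered correctly."""
    return "correct" if (t1_correct and t2_correct) else "incorrect"


# Enumerate all four outcome patterns (T1 partner, T2 partner).
patterns = list(product([True, False], repeat=2))

# Outcome patterns that all produce the same 'incorrect' message.
ambiguous = [p for p in patterns if global_feedback(*p) == "incorrect"]

for t1_ok, t2_ok in patterns:
    print(f"T1 correct={t1_ok}, T2 correct={t2_ok} -> {global_feedback(t1_ok, t2_ok)}")
```

Three different outcome patterns map onto the same "incorrect" message, so a participant who wanted to know who made the error would have to attend to the partner's task.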

The second difference was that the number of trials in each block was decreased from 240 in Experiment 1 to 160 in Experiment 2. Because the proportions of target present and absent trials remained the same, this decrease in overall trial number meant that there were only 80 trials on which T2 was present in the given task. The number of trials was decreased in Experiment 2 because the global feedback took additional time to deliver. Thus, to maintain relative consistency in the overall time required to complete the task (and prevent boredom), the number of trials was decreased.

### Results

The data from one participant in both conditions were removed because they only completed half of the trials. One joint data set was lost along with five individual data sets because the program failed to record the output file correctly. The data from one final participant from the individual condition were removed because their accuracy rate for T1 was 0%. Accuracy rates for T2 at each lag for each participant were then calculated in the same way as in Experiment 1. For the individual task, T2 accuracy was only considered for trials on which T1 was correctly identified, whereas T2 accuracy on all trials was considered for the joint task. T1 was identified accurately on an average of 93.68% of trials (SD = 6.01%) in the individual task and an average of 97.07% of trials (SD = 5.75%) in the joint task. All participants had accuracy rates for T2 above 50% at Lag 7 and, as such, all remaining data were retained.
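The scoring rule for T2 accuracy can be illustrated with a short Python sketch. The trial format (dictionaries with the keys `lag`, `t1_correct`, and `t2_correct`) and the function name are assumptions made for illustration, as is the premise that the list contains only T2-present trials; the example trials are invented.

```python
from collections import defaultdict


def t2_accuracy_by_lag(trials, require_t1_correct):
    """Percentage of correct T2 responses at each lag.

    trials: iterable of dicts with hypothetical keys
        'lag' (int), 't1_correct' (bool), 't2_correct' (bool).
    require_t1_correct: True for the individual task (only T1-correct trials
        are scored), False for the joint task (all trials are scored).
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for trial in trials:
        if require_t1_correct and not trial["t1_correct"]:
            continue  # individual task: skip trials with a T1 error
        counts[trial["lag"]] += 1
        hits[trial["lag"]] += int(trial["t2_correct"])
    return {lag: 100.0 * hits[lag] / counts[lag] for lag in sorted(counts)}


# Invented example trials, for illustration only.
example_trials = [
    {"lag": 3, "t1_correct": True, "t2_correct": False},
    {"lag": 3, "t1_correct": False, "t2_correct": True},
    {"lag": 3, "t1_correct": True, "t2_correct": True},
    {"lag": 5, "t1_correct": True, "t2_correct": True},
]

print(t2_accuracy_by_lag(example_trials, require_t1_correct=True))   # individual scoring
print(t2_accuracy_by_lag(example_trials, require_t1_correct=False))  # joint scoring
```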

Consistent with the approach to analysis in Experiment 1, separate paired samples t-tests and the equivalent Bayes test were conducted for joint and individual conditions on accuracy for T2 at Lag 3 and Lag 5. An attentional blink was detected in the individual condition, t(37) = −6.63, p < 0.001, 95% CI of Lag3/Lag5 differences [−28.06, −14.93]. The results of the Bayesian analysis are consistent with this finding: the data were 334,853 times more likely under the alternative hypothesis, which is extreme support for a difference between Lag 3 and Lag 5 in the individual condition (BF<sub>10</sub> = 334,853). Conversely, as can be seen in **Figure 3**, no attentional blink was observed in the joint condition, t(41) = −1.57, p = 0.12, 95% CI of difference scores [−7.252, 0.911]. The BF was unable to differentiate between support for the null and the alternative hypotheses (BF<sub>10</sub> = 0.973). Finally, as in Experiment 1, the number of participants who demonstrated a joint attentional blink of the magnitude of the attentional blink in the individual task was determined by comparing the difference between the detection rates at Lag 3 and Lag 5 in the joint task to the 95% confidence interval for the attentional blink in the individual task (−28.06 to −14.93). Only 8 of the 41 participants had a Lag 3/5 difference in the joint task that was in the range of the difference scores in the individual task.
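The two computations used in this paragraph, a paired-samples t statistic on Lag 3 versus Lag 5 accuracy and a count of participants whose difference score falls inside a reference interval, can be sketched with the Python standard library. The function names are hypothetical and the accuracy values below are invented for illustration; they are not the study data.

```python
import math
import statistics


def paired_t(lag3, lag5):
    """Paired-samples t statistic for Lag 3 minus Lag 5 accuracy.

    Returns (t, df, mean_difference). A negative t means lower accuracy
    at Lag 3 than at Lag 5, the direction of an attentional blink.
    """
    if len(lag3) != len(lag5) or len(lag3) < 2:
        raise ValueError("Need two equally long samples with n >= 2")
    diffs = [a - b for a, b in zip(lag3, lag5)]
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if se == 0:
        raise ValueError("All difference scores are identical; t is undefined")
    return mean_diff / se, len(diffs) - 1, mean_diff


def count_within(diffs, lower, upper):
    """Number of difference scores inside the closed interval [lower, upper]."""
    return sum(lower <= d <= upper for d in diffs)


# Invented percent-correct scores for four hypothetical participants.
lag3 = [60, 70, 65, 80]
lag5 = [80, 85, 90, 95]
t_value, df, mean_diff = paired_t(lag3, lag5)
print(f"t({df}) = {t_value:.2f}, mean difference = {mean_diff:.2f}")

# Compare invented joint-task differences with the individual-task interval.
joint_diffs = [-20, -15, -25, -15, -3]
print(count_within(joint_diffs, -28.06, -14.93))
```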

Although completing the individual task before identifying T2 in the joint task did not seem to potentiate the joint attentional blink in Experiment 1, this analysis was conducted on a relatively low sample size. With the larger sample size in Experiment 2, we again conducted a paired sample t-test on the participants who performed the individual task before responding to T2 in the joint task – the group of participants who identified T2 in the last block of trials. Consistent with the findings of Experiment 1, no joint attentional blink was observed in this subgroup in Experiment 2, t(19) = −1.70, p = 0.11, 95% CI of the Lag3/Lag5 differences [−12.29, 1.29]. The results of the Bayesian analysis in which the detection rates at Lag 3 were compared to those at Lag 5 again provided inconclusive evidence that is slightly in favor of the alternative (BF<sub>10</sub> = 1.47). Overall, even with the increased sample size and instructions that prompted participants to monitor the behavior of the partner, an attentional blink did not clearly emerge in the joint task.

# General Discussion

The purpose of the present study was to determine if an attentional blink would emerge in a task in which two people search for two different targets in a series of rapidly presented characters. Although robust attentional blinks emerged in the individual task in both Experiments 1 and 2 (accuracy at detecting T2 at Lag 3 was worse than at Lag 5), no such effect emerged in the joint task. Interestingly, neither previous experience with the task (i.e., completing the individual task prior to the joint task) nor instructions to monitor the performance of the person identifying T1 potentiated or activated the mechanisms of attentional blink in the joint task. Overall, the absence of the attentional blink in the joint task suggests that the mechanisms that generate the attentional blink were not activated when individuals were aware that their partner must identify the first target.

The finding that the detection of T2 was not affected in the joint task was unexpected given the joint action and social attention literature showing that individuals spontaneously co-represent and simulate the performance of their partner. In particular, in the studies of the joint negative priming effect (Frischen et al., 2009; Welsh and McDougall, 2012), the participant responding to the second (probe) display could completely ignore the first (prime) display because it is irrelevant to their task. Nonetheless, the joint negative priming effect emerged, suggesting that the participant responding on the probe display not only pays attention to the prime display, but also engages in the target/distractor selection process that leads to negative priming. Based on the findings of the joint negative priming effect (and similar findings in the social IOR effect; e.g., Welsh et al., 2005, 2007), it was predicted that the person responding to T2 could spontaneously co-represent their partner's task and search for and identify T1 even though it was not part of their task. Evidently, such was not the case.

The absence of the joint attentional blink is similar to previous research on the attentional blink when individuals act alone. Specifically, Raymond et al. (1992) reported that the accuracy of responses to T2 was essentially unaffected in a task in which T1 was present, but the participant was instructed to ignore it. Thus, at first glance, it might not seem surprising that the detection of T2 in the joint task was not affected by T1 in the present studies because the participant detecting T2 never had to identify T1 and could effectively ignore it. However, previous work that examined how the (non)identification of T1 affected the processing of T2 was always conducted in individual task contexts (i.e., without a co-actor identifying T1 and with T1 identification not relevant at all). In the present study, each co-actor knew the task of their partner: the participant detecting T2 knew that the other participant was attempting to identify T1. Further, previous work using other social visual search tasks has revealed that the preceding action of a partner affects the performance of an individual in a manner that is similar to when the individual performs the entire task on their own, even if that response is independent of and not immediately relevant to the subsequent response (e.g., Welsh et al., 2005; Frischen et al., 2009; Welsh and McDougall, 2012). Thus, the absence of a social attentional blink requires a theoretical explanation, and a detailed discussion of the possible reasons for this absence will be the focus of the remainder of the paper.

## Co-representation

Previous information processing effects observed in joint contexts were suggested to emerge because co-actors observed and knew (co-represented) their partners' task and response, and because this co-representation leads to the spontaneous activation of the mechanisms that are activated when the individual performs the whole task on their own (e.g., Sebanz et al., 2003; Welsh et al., 2005; Frischen et al., 2009). Based on this premise, it was predicted that each partner in the present studies would co-represent the task of the partner. As a result of this co-representation, even though they were not required to respond to T1, the participant responding to T2 alone would represent (and perhaps simulate) the task of the partner, and this co-representation would subsequently activate the mechanisms leading to the attentional blink. Such was evidently not the case. Before addressing why the effect did not emerge, two further observations will be discussed.

The first observation is that completing the individual task before the joint task did not affect the emergence of the joint attentional blink. Completing the individual task first could have increased the potential for a joint attentional blink because recent work suggests that experience with a movement task increases the accuracy of action perception (e.g., Chandrasekharan et al., 2012; Wong et al., 2013), increases the responsiveness of cortical areas activated during action observation (Calvo-Merino et al., 2005; Catmur et al., 2007), and affects the manner in which a co-actor adapts their actions for their partner (Ray et al., 2017). Previous experience is thought to have these effects because performance of the task (generating the action and sensing and perceiving the outcomes of the action) establishes, refines, and/or strengthens the coupling between the representations of the action and the perceptual consequences of those actions (Prinz, 1992; Hommel et al., 2001; Kunde, 2001; Elsner and Hommel, 2004; Gozli et al., 2016). Because it is these coupled perception-action codes that are thought to be activated during action observation and joint action, experience-based enhancements of these perception-action codes would have increased the knowledge and potential strength of the co-representation processes thereby increasing the potential for a joint attentional blink. No joint attentional blink, however, was observed in the performance of these individuals who gained task experience before performing the T2 detection in the joint task.

The second observation is that an attentional blink did not emerge even under instructions to monitor the performance of the partner (Experiment 2). These overt instructions to monitor the performance of the partner who identified T1 were expected to promote co-representation and the potential for the activation of the mechanisms that would generate a joint attentional blink. The absence of a joint effect under these instructions is not consistent with the findings in a paper reporting a social psychological refractory period effect – this effect only emerged under instructions that prompted partners to monitor each other's performance (Liepelt and Prinz, 2011). However, it is possible that the social psychological refractory period effect emerged (though not spontaneously) because it involves response initiation or selection processes (Lien and Proctor, 2002) similar to many other effects that have companion joint effects such as the joint Simon effect (Sebanz et al., 2003) and the social IOR effect (Welsh et al., 2005).

So why was it that the attentional blink did not emerge in this study? First, despite instructions and expectations, it is possible that the participant responding to T2 did not know what the partner was doing and, as such, did not engage in co-representation. Without co-representation, joint effects are unlikely or unable to emerge. Although this possibility cannot be definitively ruled out, we believe it is likely that co-representation did occur because both participants were present during the delivery of the instructions and there is a wealth of previous research showing that joint effects (presumably due to spontaneous co-representation) emerge under such conditions. Further, the participants in Experiment 2 were explicitly instructed to monitor the performance of the partner. Finally, the joint attentional blink did not emerge even in the subgroup who experienced the individual task prior to completing the T2 detection in the joint task – the subgroup who definitely had knowledge of both of the task components. Thus, we are confident that each participant knew the task and that co-representation occurred. The discussion will now turn to possible reasons why an attentional blink did not emerge despite co-representation.

### Potential Reasons Why the Joint Attentional Blink Did Not Emerge

Based on the assumption that co-representation did occur, it seems that co-representation does not exert an effect upon the processes linked to the attentional blink. There are a number of possible reasons why the joint attentional blink did not emerge. The three most likely possible accounts will be addressed in turn. First, note that the majority of the previous studies on joint action have accounts that emphasize the role of "action" processing in generating the effects – processes that operate in spatial attention and response planning and selection such as the joint Simon effect (Sebanz et al., 2003), joint negative priming (Frischen et al., 2009; Welsh and McDougall, 2012), and social IOR (Welsh et al., 2005). Although some explanations of the attentional blink effect have a response selection component (Jolicoeur, 1998), the majority of the accounts of the attentional blink hold that the attentional blink emerges because of earlier attentional processes and/or limitations in the loading of or processing of information in working memory (see Dux and Marois, 2009). Hence, it is possible that mechanisms of co-representation preferentially operate on the level of decision making, response selection, and response programming rather than at earlier attentional and working memory processes.

In this context, it should be noted that there is evidence that the presence of another individual does affect perceptual and attentional processing. For example, there is evidence for spontaneous visuospatial perspective taking across a number of tasks (e.g., Böckler et al., 2011; Freundlieb et al., 2016, 2017, 2018). Further, Böckler et al. (2011) reported that the global/local processing of a stimulus was affected by the partner's level processing (performance was less efficient when co-actors were to report a different level of feature than when they were to report the same level of feature). Finally, Constable et al. (2015) revealed that an object-specific recognition effect was altered by the hand posture of a co-actor. Interestingly, all these perceptual and attentional tasks, like negative priming and IOR, involve a spatial dimension either regarding the features of the stimuli or of the co-actor. Thus, the attentional blink might not have emerged in the joint condition because the task employed in the present study is essentially non-spatial in nature (all stimuli were presented centrally), involved stimuli that were distinguished based on timing and identity, and did not involve response selection during the critical period of processing (cf. Dolk et al., 2014; Dittrich et al., 2017; for demonstrations of a spatial effect where co-representation may not exert an influence). In sum, the joint attentional blink might not have emerged because the processes activated and affected by co-representation and those involved in the attentional blink do not overlap.

Another, potentially related, possibility concerns the conceptual overlap (or non-overlap) in tasks. Much in the same way that observing another person's actions interferes more with one's task when they are relevant for one's own task (Bortoletto et al., 2013), perhaps another person's task only interferes when there is close conceptual or dimensional overlap across tasks. For example, in the social Simon task (Sebanz et al., 2003), there are spatial and color features of the targets that are shared, or at least relevant, across participants. Similarly, in the work of Böckler et al. (2011) and Constable et al. (2015), perceptual state is relevant for the task. In the case of the present attentional blink task, the two tasks might not have had sufficient conceptual overlap to generate the joint effect – there is a temporal staggering of the stimuli: one partner completes an identification task before the other partner completes a detection task of a white stimulus in a string of black stimuli. These differences might have made the overall joint task less of a dynamic interaction than typical joint action tasks (e.g., Welsh et al., 2005; Frischen et al., 2009) and made each partner's task more conceptually distinct.

A final explanation of the findings concerns the mental (attentional) states induced by completing a task with another individual. Previous work has revealed that if participants acting alone are required to complete an additional task (such as thinking about a holiday) while concurrently doing the attentional blink task, the attentional blink effect is attenuated (Olivers and Nieuwenhuis, 2005, 2006). Further, positive affect has also been shown to attenuate the attentional blink (Olivers and Nieuwenhuis, 2006). Given that positive affect is linked with diffusion of attention (Ashby et al., 1999), there is converging evidence suggesting that a diffuse attentional state can attenuate the attentional blink. Because of the social nature of the present joint task, it is also possible that the resulting positive environment and affect in the joint condition may have led to a diffuse attentional state. Thus, a joint attentional blink might not have been observed because of this diffuse attentional state. It should be noted, however, that previous studies typically report attenuated attentional blink effects rather than an abolishment of the effect as seen in the present study. As such, we feel that it is unlikely that a diffuse attentional state was the sole source of the absence of a joint attentional blink in the present study.

# SUMMARY

In sum, the two experiments reported herein provide no evidence for the emergence of a joint attentional blink even when participants had previous task experience and specific instructions to monitor the performance of the partner. The possible reasons for the lack of such a joint effect are explored which can guide future research into understanding joint temporal processes.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Declaration of Helsinki. The protocol was approved by the Office of Research Ethics at the University of Toronto. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

# AUTHOR CONTRIBUTIONS

MC designed the study, ran the experiment, and edited the manuscript. TW wrote the manuscript. JP consulted on the design of the study and the manuscript.

# FUNDING

This research was supported by research grants from the Natural Sciences and Engineering Research Council and the Ontario Ministry of Research and Innovation.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Constable, Pratt and Welsh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Intimacy Effects on Action Regulation: Retrieval of Observationally Acquired Stimulus–Response Bindings in Romantically Involved Interaction Partners Versus Strangers

Carina Giesen\*, Virginia Löhl, Klaus Rothermund and Nicolas Koranyi

General Psychology II, Institute of Psychology, Friedrich Schiller University Jena, Jena, Germany

#### Edited by:

Kerstin Dittrich, Albert-Ludwigs-Universität Freiburg, Germany

#### Reviewed by:

Anouk van der Weiden, Utrecht University, Netherlands Roman Liepelt, German Sport University Cologne, Germany

> \*Correspondence: Carina Giesen carina.giesen@uni-jena.de

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 12 April 2018 Accepted: 16 July 2018 Published: 03 August 2018

#### Citation:

Giesen C, Löhl V, Rothermund K and Koranyi N (2018) Intimacy Effects on Action Regulation: Retrieval of Observationally Acquired Stimulus–Response Bindings in Romantically Involved Interaction Partners Versus Strangers. Front. Psychol. 9:1369. doi: 10.3389/fpsyg.2018.01369

Previous research has shown that stimulus–response (SR) binding and retrieval processes also occur when responses are only observed in another person (Giesen et al., 2014). Importantly, this effect depends on the two individuals interacting interdependently during the task (e.g., competition or cooperation). Interdependence, however, need not necessarily result from task-related demands, but can also reflect an intrinsic feature of a given relationship. The present study examines whether observing responses of one's romantic partner also produces stimulus-based retrieval of observed responses even if the task itself does not involve interdependence. Participants performed a task pairwise, either with their romantic partner or with a stranger. In a sequential prime-probe design, both participants of a pair gave color responses themselves (actors) or merely observed these (observers) in alternating fashion. As expected, stimulus-based retrieval of observationally acquired SR bindings occurred only in romantically involved pairs; participants interacting with a stranger showed no retrieval effects. We conclude that mental representations of self and other are more closely intertwined in romantic couples, which produces automatic retrieval of observationally acquired SR binding effects even independently of the task itself.

Keywords: stimulus–response binding, event files, joint action, romantic relationship, observational learning

# INTRODUCTION

"Only let me assure you, my dear Miss Elizabeth, that I can from my heart most cordially wish you equal felicity in marriage. My dear Charlotte and I have but one mind and one way of thinking. There is in everything a most remarkable resemblance of character and ideas between us."

Mr Collins, Pride and Prejudice by Jane Austen

It is an obvious truth that relationships with other people represent a central aspect of our social lives and influence our thinking, feeling, and behavior in various ways. Among the numerous relationships that we initiate and maintain, the one that we have with our romantic partner or spouse is a special one. The relationship partner is of primary significance for satisfying fundamental affiliation and intimacy motives (Baumeister and Leary, 1995) and it is in most cases him/her to whom we turn when we need someone to talk to or when support is needed in stressful times (e.g., Coyne and DeLongis, 1986; Revenson, 1994).

A core characteristic of satisfied and stable couples is that the relationship partners display high interdependence in thoughts and feelings and strongly co-ordinate their behavior (Agnew et al., 1998). Specifically, it has been shown that high-functioning couples have a strong tendency to match their responses to tasks or challenges and thereby become rather effective in dealing with everyday stress and developmental tasks (e.g., Bodenmann et al., 2006; Papp and Witt, 2010; Neff and Broady, 2011).

To date, joint action regulation in intimate relationships has typically been examined on the macro-level, for instance by assessing couples' overt responses to a (demanding) task either by observational procedures or self-reports. In contrast, the underlying cognitive micro-processes of joint action regulation have received far less attention. The present research tries to fill this gap by combining the "couple perspective" with recent advances in research on social influences in automatic joint action regulation. Specifically, the present study focuses on stimulus–response (SR) binding and retrieval processes which reflect a fundamental mechanism of action automatization. It will be argued that due to the high relevance of one's relationship partner, SR binding and retrieval also occurs by mere observation of one's romantic partner and thereby forms the basis for dyadic behavior coordination on a more elaborate level.

# Stimulus–Response (SR) Binding and Retrieval Processes

Processes of stimulus–response (SR) binding and retrieval constitute a fundamental mechanism of automatic action regulation (Logan, 1988): whenever a response is made to a stimulus in a given (prime) trial, the mental representations of stimulus and response will be transiently bound together in an SR binding or event file (Hommel, 1998). Repeating one element of this binding in a subsequent (probe) trial (e.g., a stimulus repetition probe) will retrieve the entire SR binding from memory, meaning that re-execution of the previous response is facilitated. If the retrieved prime response is also appropriate in the current probe trial, SR retrieval will produce performance benefits, compared with a situation without stimulus repetition (i.e., a stimulus change probe). However, if the retrieved response is inappropriate in the current probe trial, SR retrieval will produce performance costs (relative to stimulus change probes; Rothermund et al., 2005). To date, a burgeoning amount of evidence documents that processes of SR binding and retrieval apply to a broad scope of stimuli, modalities, and responses (see Henson et al., 2014, for an overview), and thus play a dominant role in automatic action regulation.
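The qualitative predictions described in this paragraph can be written down as a toy Python function. This is only a sketch of the verbal logic (binding on the prime, stimulus-based retrieval on the probe); it is not a model proposed in the literature cited here, and the function name, stimulus words, and response labels are hypothetical.

```python
def predicted_effect(prime_stimulus, prime_response, probe_stimulus, probe_response):
    """Predicted effect of SR retrieval on probe performance.

    Relative to a stimulus change probe, a stimulus repetition retrieves the
    prime response: helpful if it matches the required probe response
    ('benefit'), harmful if it does not ('cost'). Without stimulus
    repetition nothing is retrieved ('baseline').
    """
    binding = {prime_stimulus: prime_response}  # transient SR binding (event file)
    retrieved = binding.get(probe_stimulus)     # retrieval is stimulus-based
    if retrieved is None:
        return "baseline"
    return "benefit" if retrieved == probe_response else "cost"


print(predicted_effect("TABLE", "left", "TABLE", "left"))   # repetition, same response
print(predicted_effect("TABLE", "left", "TABLE", "right"))  # repetition, different response
print(predicted_effect("TABLE", "left", "CHAIR", "right"))  # stimulus change
```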

Since the seminal work by Albert Bandura on social learning by observation, it is known that most of our action routines are not based on our own experience, but result from the observation of others. However, one will not blindly copy any action observed in another person. On the contrary, particular moderating conditions determine to which extent one will incorporate an observed response in one's own action repertoire (Bandura, 1986). Intriguingly, principles of observational learning may also (and to a similar extent) influence micro-processes of automatic action regulation. For instance, recent studies revealed that response execution is no necessary pre-condition for the formation of SR bindings: Notably, SR bindings are also created if the response to a stimulus is only observed in another person (Giesen et al., 2014, 2017). In their study, Giesen et al. (2014) created a joint version of the standard SR binding task. Two participants performed a shared color categorization task. One participant categorized the color of a word stimulus presented in the prime trial (prime actor). At the same time, the other participant (prime observer) only saw the word, but no color, and had to observe the prime response that was given by the prime actor – which should lead to the formation of an observational SR binding. Crucially, to test whether SR bindings were indeed acquired by observation, the former prime observer became probe actor and had to categorize the color of a word stimulus presented during the probe trial (see **Figure 1**). Stimulus relation from prime to probe (i.e., word repetition versus word change) and compatibility between observed prime responses and to-be-performed probe response (i.e., compatible vs. incompatible) were manipulated orthogonally. Analogously to the logic of "standard" SR retrieval effects, probe trials with stimulus repetition should trigger retrieval of the observationally acquired SR binding. The crucial question was thus whether probe actors' performance in the probe would reflect a pattern that is consistent with SR retrieval effects (indicated by a Stimulus Relation × Response Compatibility interaction). Indeed, this was the case: When to-be-performed probe responses were compatible with observed prime responses, performance was faster on probe trials with stimulus repetition than on stimulus change probes (yielding performance benefits due to SR retrieval of "appropriate" responses). However, when to-be-performed probe responses were incompatible with observed prime responses, performance was slower on stimulus repetition probes than on stimulus change probes (yielding performance costs due to SR retrieval of "inappropriate" interfering responses).
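The Stimulus Relation × Response Compatibility interaction described above can be expressed as a simple contrast on the four cell means. The Python sketch below assumes mean probe response times (in ms) per cell; the function name, the dictionary layout, and the numbers are invented for illustration and are not data from Giesen et al. (2014).

```python
def sr_retrieval_contrast(mean_rt):
    """Benefit, cost, and interaction contrast (ms) from four cell means.

    mean_rt maps (stimulus_relation, compatibility) to a mean probe RT, with
    stimulus_relation in {'repetition', 'change'} and compatibility in
    {'compatible', 'incompatible'}.
    """
    # Benefit: repetition is faster than change when responses are compatible.
    benefit = mean_rt[("change", "compatible")] - mean_rt[("repetition", "compatible")]
    # Cost: repetition is slower than change when responses are incompatible.
    cost = mean_rt[("repetition", "incompatible")] - mean_rt[("change", "incompatible")]
    # A positive interaction contrast is the pattern expected under SR retrieval.
    return {"benefit": benefit, "cost": cost, "interaction": benefit + cost}


# Invented cell means, for illustration only.
example_means = {
    ("repetition", "compatible"): 500,
    ("change", "compatible"): 520,
    ("repetition", "incompatible"): 545,
    ("change", "incompatible"): 525,
}
print(sr_retrieval_contrast(example_means))
```

Defining the benefit and the cost so that both are positive under SR retrieval makes their sum equal to the usual interaction contrast, that is, the difference between the stimulus repetition effects in the compatible and incompatible conditions.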

Crucially (and analogous to social learning phenomena on the macro level, see Bandura, 1986), social dependence among pairs of interacting participants during the task modulated this pattern of results. Giesen et al. (2014) contrasted three conditions. Some pairs of participants had to cooperate to gain an extra reward (a chocolate bar): pairs were informed that both participants would gain the extra reward if, and only if, both performed well in terms of response speed and accuracy; otherwise, neither would get the extra reward. In a second group, pairs had to compete against each other, meaning that only the better participant of each pair would gain the extra reward, whereas the other would leave empty-handed. In the last group, participants worked independently of each other to gain the extra reward, meaning that distribution of the reward depended solely on participants' individual performance. This manipulation of social interdependency between co-actors had a considerable influence on retrieval of observationally acquired SR bindings: only participants who were socially dependent on their co-actor (i.e., pairs in the cooperative or competitive condition) showed retrieval of observational SR bindings. In turn, participants who did not depend on their co-actor to gain the extra reward showed no retrieval effects at all. These findings attest that retrieval of observationally acquired SR bindings is a conditionally automatic process that is contingent on the situational interdependency between interaction partners.

Assignment of males (blue figures)/females (pink figures) to the roles of prime actor vs. observer was random. Stimuli are not drawn to scale.

The idea that observed actions are mentally represented like one's own actions is central to a range of paradigms that investigate related phenomena, such as observational acquisition of action-effect bindings (Paulus et al., 2011), imitation tasks (e.g., Brass et al., 2001; van Baaren et al., 2009), or co-representation effects evident in interactive/joint action tasks like the Joint Simon task (Sebanz et al., 2003). It is noteworthy that the type of social relation during the task has a strong modulating influence in these paradigms as well. For instance, interference effects in the Joint Simon task are also stronger as social relations become more interdependent (Hommel et al., 2009; Ruys and Aarts, 2010; Iani et al., 2011; Müller et al., 2011); the same holds true for effects of unconscious imitation (mimicry; van Baaren et al., 2009).

In previous research on stimulus-based retrieval of observed responses, interdependence between two individuals was situationally induced by instructing participants to cooperate with or compete against each other (Giesen et al., 2014). Interdependence, however, need not be the result of task-related demands, but can also reflect a permanent feature of a given relationship. Romantic relationships are a prime example in this respect (Aron et al., 1991). According to Aron et al. (1991, p. 242), the interdependent structure of romantic relationships even implies that "the person acts as if some or all aspects of the partner are partially the person's own." Thus, persons tend to represent their romantic partner in their mental "self" representations to a considerable extent (this aspect is nicely illustrated in the starting quote). Furthermore, romantic partners perceive themselves as less individualistic and more as part of a "self-and-partner" collective (Agnew et al., 1998).

## Aims of the Present Study

In the present study, we examined whether romantic relationships exert an influence on the retrieval of observational SR bindings that mimics the effects of social dependence documented by Giesen et al. (2014). To this end, we only recruited participants who were involved in a committed relationship. Participants first answered an online questionnaire in which we assessed relationship quality (among other measures). Then, pairs of participants were invited to the lab; each pair consisted either of the two partners of a relationship or of two people from different relationships. Participants thus worked through the observational SR binding task either with their romantic partner or with a stranger. Note that relationship status was constant between groups. In other words, groups only differed in whether pairs of participants were romantically involved with each other ("romantic partner" condition) or with someone else ("stranger" condition). We expected that working with one's own romantic partner (compared with working with a stranger) should directly influence retrieval of observational SR bindings as a function of attention. Specifically, participants are likely to regard actions performed by their romantic partner as more relevant and consequently attend more to them. According to Logan (1988), attention is not only beneficial for encoding, but also for retrieving SR episodes (see also Moeller and Frings, 2014). Furthermore, we reasoned that this should hold true not only for "standard" SR episodes (i.e., transient bindings between stimuli and self-performed responses), but also for bindings of stimuli and observed responses. Thus, if attention due to increased relevance of actions performed by one's romantic partner (vs. a stranger) is critical for the retrieval of observationally acquired SR bindings, the Interaction Partner × Stimulus Relation × Response Compatibility three-way interaction should be significant. Specifically, we expected that (a) the performance of probe actors in the "romantic partners" condition would reflect a pattern that is indicative of SR retrieval. In statistical terms, retrieval of observationally acquired SR bindings is indicated by an interaction of the factors stimulus relation (repetition vs. change) and response compatibility between observed prime and to-be-performed probe responses (compatible vs. incompatible). In other words, stimulus repetition in the probe trial should retrieve observationally acquired SR bindings from memory, reactivating the observed prime response. Thus, when to-be-performed probe responses are compatible with observed prime responses, performance should be faster on stimulus repetition probes than on stimulus change probes. In turn, when to-be-performed probe responses are incompatible with observed prime responses, performance should be slower on stimulus repetition probes than on stimulus change probes. Furthermore, based on the findings of Giesen et al. (2014), we expected that (b) SR retrieval effects should be absent for probe actors in the "stranger" condition (i.e., no Stimulus Relation × Response Compatibility interaction), because the task itself did not create any kind of interdependence between the participants of the pair.

# MATERIALS AND METHODS

## Participants

According to a priori calculations with the G∗Power 3.1 software (Faul et al., 2007), a sample size of n = 27 per "interaction partner" condition is required to achieve a statistical power of 1−β = 0.80 with α = 0.05 to detect a medium-sized (d = 0.50) effect in the "romantic partners" condition (where we predicted an effect that is statistically different from zero) and in the "stranger" condition (for which we predicted a null effect).

The study took place in a predetermined time period during which the lab was available. Recruiting romantic couples in which both partners could take part turned out to be particularly challenging. In the given time period, we managed to recruit 52 native German-speaking participants for the experiment (32 female<sup>1</sup>; age: M = 24.9 years, SD = 7.1; relationship duration: M = 3.4 years, SD = 4.1). Participants were either students at FSU Jena (n = 40), received other educational training (n = 4), or were already working (n = 8). All participants were involved in a permanent, committed heterosexual romantic relationship. Due to an error in recruitment lists, the resulting sample size per condition was slightly unbalanced (n = 22 for the "romantic partners" condition; n = 30 for the "stranger" condition). Since the recruited sample sizes deviated from those calculated in a priori power analyses, we performed post hoc power calculations with G∗Power to check the achieved power of each condition. Calculations showed that achieved power to detect a medium-sized effect (d = 0.50) was 1−β > 0.73 in the "romantic partners" condition (meaning that this condition was slightly under-powered) and 1−β > 0.84 in the "stranger" condition (meaning that this condition was sufficiently powered, which is especially important since we predicted a null finding).

Ethical approval of the study was granted by the Ethical Commission of the Faculty of Social and Behavioural Sciences, FSU Jena (FSV 18/25). All participants provided written informed consent.

All participants answered a brief online questionnaire (5 min) individually at home. In the lab, participants performed the computer experiment in pairs and then answered another brief questionnaire on their own. Lab sessions lasted 45–50 min. Participants received partial course credits or sweets for their voluntary participation. To incentivize participation, three Amazon vouchers (£15; £10; £5) were raffled among all participants. If participants showed an appropriate performance during the computer experiment, participants received more sweets as an extra reward. Importantly, distribution of the extra reward depended solely on the participants' individual performance and not on their partners' performance ("independence" condition of Giesen et al., 2014).

## Experimental Set-Up and Stimuli

During the computer experiment, two participants sat opposite each other at a table, each one in front of a 19-in. flat-screen monitor, to prevent direct eye contact between participants. The experiment was programmed with E-Prime 2.0. On each participant's monitor, word stimuli (25 neutral, frequently used German adjectives that were either mono- or disyllabic and consisted of four to seven letters) were presented in Times New Roman 16-pt font centrally on a blank black screen. Two response pads (one with a red and one with a green push-button in the middle, and two black rest-state keys in front of and behind each push-button; see **Figure 1**) were fastened to the table and served to collect responses. Specifically, participants kept the rest-state keys pressed with their left and right hand, respectively. Each participant had the task to categorize the color of the presented word stimulus. Participants performed this task in turns (i.e., only one participant saw a colored word stimulus, whereas the other saw the word stimulus presented in white font; see **Figure 1**). They gave their responses by releasing one of the rest-state keys to hit the corresponding (red or green) push-button in front of the released rest-state key. The response pads were connected to the computer via the parallel port to collect the color categorization responses. Both the release responses of the rest-state keys and the hit responses of the red/green push-buttons were measured, but only the release reaction times (RTs) were used for analysis, because probe hit responses are confounded with movement speed (i.e., time to reach the push-buttons). Release RTs represent a purer measure of the time it took participants to initiate a response.

<sup>1</sup>Twelve female participants in the "stranger" condition performed the task with a male confederate, since no male participant was available at the scheduled time of testing. Data of the confederate were excluded from all analyses.

## Procedure

The current study consisted of three different parts. First, an online questionnaire requested personal demographics. To verify that the inclusion criteria were met, participants had to report their relationship status, relationship duration, and sexual orientation. Furthermore, we assessed general relationship satisfaction with the German version of the "Relationship Assessment Scale" (RAS; Sander and Böcker, 1993). Participants answered seven items containing questions about their current romantic relationship. Using 5-point scales, they were asked to rate their relationship from 1 (low satisfaction) to 5 (high satisfaction). Items 4 and 7 of the scale are reverse coded (Cronbach's alpha = 0.75). Second, after answering the online questionnaire at home, participants were invited to the lab to take part in the computer experiment. Two participants (referred to as Participants A and B) worked in a pair and performed a color categorization task in alternating fashion (see Giesen et al., 2014, Experiment 1, for a similar procedure). Instructions were presented on each participant's screen, and participants could read them at their own pace. For both prime and probe displays, participants' task was to categorize the color of the presented word stimuli by pressing the corresponding (i.e., red/green) push-button in the middle of the response pads. Thus, color of word stimuli was task-relevant, whereas the identity/meaning of the word was irrelevant in prime and probe displays and served as a distractor (Rothermund et al., 2005). Importantly, the color categorization task was shared between both participants. Hence, only one participant of each pair saw a colored word during the prime or probe display (the actor). For the other participant (the observer) the same word was presented in white font (see **Figure 1**). 
In particular, for the first 160 prime-probe sequences, Participant A was "prime actor" and had to categorize the color of word stimuli presented during the prime display. By implication, Participant B was "prime observer" and had the task to observe the color categorization response carried out by the prime actor. Importantly, Participant B then became "probe actor" and had to categorize the color of the word stimulus presented during the probe display. By implication, Participant A was "probe observer" and had to observe the color categorization response carried out by the probe actor. For the remaining 160 prime-probe sequences, Participant B was the prime actor/probe observer and Participant A the prime observer/probe actor. This was done to collect probe responses from both participants, since we used probe actors' release RTs as the primary dependent variable for the analyses of interest.

Each prime-probe sequence followed the pattern shown in **Figure 1** (right side). First, as a ready signal, three exclamation marks were displayed centrally on each participant's screen in white font for 500 ms. After that, a fixation cross appeared for 250 ms. Subsequently, the prime display started with a word stimulus presented in red or green font for the prime actor and in white font for the prime observer. Stimuli remained on screen until prime actors hit one of the push-buttons to categorize the word color or until a maximal duration of 1,500 ms had elapsed. Immediately after the prime actor resumed pressing both rest-state keys, another fixation cross appeared for a duration that varied randomly between 150 and 350 ms (M = 250 ms). The duration varied between sequences to prevent exact anticipation of the probe display's onset. Then the probe display started with another word stimulus presented in red or green font for the probe actor and in white font for the probe observer. Stimuli remained on screen until probe actors hit one of the push-buttons to categorize the word color or until a maximal duration of 1,500 ms had elapsed. Immediately after the probe actor resumed pressing both rest-state keys, the experiment continued as follows. In 25% of randomly selected prime-probe sequences, a memory test for the prime observer appeared after the probe display. The memory test served to ensure that prime observers attended to the color responses of prime actors. Prime observers had to press the push-button that corresponded to the observed (prime) response. The memory test remained on screen until one of the push-buttons was pressed. Once prime observers resumed pressing both rest-state keys, a black screen appeared for 1,250 ms, reminding participants to keep both rest-state keys pressed. Then, the next prime-probe sequence started.

Participants performed a practice block of 32 prime-probe sequences before starting the first experimental block. Only the practice block included immediate feedback for erroneous or too slow responses. If release responses were slower than 750 ms, the message "Respond faster!" was displayed. If actors in prime and probe hit the wrong push-button, the message "Error–wrong key!" appeared. If the wrong person released a rest-state key, the message "Error–wrong person!" appeared. All feedback messages were shown to both participants centrally on a red background in white font for 1,000 ms. If participants made too many erroneous or too slow responses in the practice block, a second practice block followed. Upon successful completion of the practice, participants were informed that they worked independently of their interaction partner (Giesen et al., 2014), meaning that distribution of the extra reward for each of the two participants depended only on their own individual performance. Participants then worked through two experimental blocks comprising 160 prime-probe sequences each. After every 40 prime-probe sequences, both interaction partners received brief feedback on their own performance (% errors; % slow responses).

Third, after completion of the computer task, participants received a brief paper questionnaire to assess how they perceived the situation and their interaction partner during the task. Using 7-point bipolar scales, three items assessed participants' experienced discomfort versus comfort during the experiment (i.e., 1 = difficult/unpleasant/negative vs. 7 = easy/pleasant/positive; Cronbach's alpha = 0.73). Additionally, participants were asked to rate the experimental situation from 1 (competitive) to 7 (cooperative). With four other items, participants were further asked to indicate the impression the interaction partner had left (i.e., 1 = disagreeable/insecure/unfriendly/incompetent vs. 7 = agreeable/confident/friendly/competent; Cronbach's alpha = 0.94). Using a 5-point scale, participants were asked whether they were acquainted with their interaction partner (1 = not at all vs. 5 = very familiar). A last dichotomous item asked whether they had used any strategies to perform the task. After completion of the questionnaire, participants were thanked and rewarded. Participants received the extra reward if more than 75% of responses were faster than 750 ms, and if fewer than 10% of color categorizations and fewer than 20% of memory tests were erroneous. Further, participants could leave their e-mail address to receive a debriefing.
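The reward rule stated above can be written as a simple check. The following Python sketch is illustrative only: the function name and its arguments are hypothetical, and it assumes that all three criteria had to be met jointly and that the 750 ms criterion refers to release RTs, neither of which is stated explicitly in the text.

```python
def earns_extra_reward(release_rts_ms, n_color_errors, n_color_trials,
                       n_memory_errors, n_memory_tests):
    """Return True if a participant meets all three reward criteria.

    Assumptions (not confirmed by the article): the criteria are combined
    with a logical AND, and speed is judged on release RTs.
    """
    # Share of responses faster than the 750 ms speed criterion
    fast_share = sum(rt < 750 for rt in release_rts_ms) / len(release_rts_ms)
    # Error rates for color categorization and for the memory test
    color_error_rate = n_color_errors / n_color_trials
    memory_error_rate = n_memory_errors / n_memory_tests
    return (fast_share > 0.75
            and color_error_rate < 0.10
            and memory_error_rate < 0.20)


# Hypothetical participant: 90% fast responses, 5% color errors, 10% memory errors
rts = [500] * 90 + [800] * 10
print(earns_extra_reward(rts, n_color_errors=5, n_color_trials=100,
                         n_memory_errors=4, n_memory_tests=40))
```

Because the rule depends only on a participant's own responses, it matches the "independence" condition under which both interaction partner groups were tested.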

## Design

The experimental design comprised a 2 × 2 × 2 mixed-factors design with the within-subject factors stimulus relation and response compatibility and the between-subject factor interaction partner. Stimulus relation was manipulated by presenting the same prime word in the probe in 50% of all prime-probe sequences (word repetition, e.g., small–small) and by presenting a probe word differing from the previously presented prime word in 50% of all prime-probe sequences (stimulus change/baseline, e.g., quiet–small). Response compatibility was varied by requiring probe responses that were compatible with observed prime responses in 50% of all prime-probe sequences (compatible response, e.g., red–red) and by requiring probe responses that were incompatible with observed prime responses in 50% of all prime-probe sequences (incompatible response, e.g., green–red). The between-subject factor (interaction partner) was manipulated by assigning participants either to work with their romantic partner (n = 22) or to perform the task together with a stranger (n = 30). Condition assignment depended partially on how feasible it was for participants to bring their romantic partner to the lab. To achieve homogeneous and comparable groups, all participants were involved in heterosexual romantic relationships and worked with an opposite-sex interaction partner during the experimental session. Details on relationship ratings in both conditions are reported below. Release reaction time (RT) of the rest-state keys in the probe served as the primary dependent variable during the color categorization task. However, analyses of probe hit RTs yielded very similar results (see Footnote 4). Since probe hit RTs are confounded with movement speed, we refrained from interpreting any results relating to probe hit RTs.

Font color of prime words was counterbalanced (50% of all prime stimuli were presented in red, 50% were presented in green to the prime actor). Likewise, font color of probe words was counterbalanced (50% red; 50% green; note that probe color depended on the experimental factor response compatibility).
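To make the within-subject part of the design concrete, the following Python sketch builds one block of 160 prime-probe sequences in which stimulus relation, response compatibility, and prime color each take their two levels in 50% of sequences. It is a sketch under stated assumptions, not the original E-Prime script: the word list is a placeholder for the 25 German adjectives, and the full crossing of the three factors and the random shuffling are assumptions.

```python
import itertools
import random

# Placeholder labels standing in for the 25 German adjectives (hypothetical)
WORDS = [f"word{i:02d}" for i in range(25)]


def build_block(n_sequences=160, seed=0):
    """Return a shuffled list of prime-probe sequences for one block."""
    rng = random.Random(seed)
    cells = list(itertools.product(
        ("repetition", "change"),        # stimulus relation
        ("compatible", "incompatible"),  # response compatibility
        ("red", "green"),                # prime color
    ))
    reps, remainder = divmod(n_sequences, len(cells))
    if remainder:
        raise ValueError("n_sequences must be a multiple of 8")

    sequences = []
    for relation, compatibility, prime_color in cells * reps:
        prime_word = rng.choice(WORDS)
        if relation == "repetition":
            probe_word = prime_word
        else:
            probe_word = rng.choice([w for w in WORDS if w != prime_word])
        # Probe color follows from prime color and response compatibility
        if compatibility == "compatible":
            probe_color = prime_color
        else:
            probe_color = "green" if prime_color == "red" else "red"
        sequences.append({
            "relation": relation,
            "compatibility": compatibility,
            "prime_word": prime_word,
            "prime_color": prime_color,
            "probe_word": probe_word,
            "probe_color": probe_color,
        })
    rng.shuffle(sequences)
    return sequences


block = build_block()
print(len(block), sum(s["probe_color"] == "red" for s in block))
```

Under this crossing, probe color is balanced automatically, which is consistent with the note above that probe color depended on response compatibility.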

# RESULTS

All statistical analyses were performed with R.

### Manipulation Checks

#### Ratings of Experimental Situation and Interaction Partner

We computed mean ratings of participants' perception of the experimental situation and their interaction partner for both interaction partner conditions (see **Table 1**). Results indicated that the interaction conditions differed significantly only with respect to the perceived (dis)agreeableness of the interaction partner. Not surprisingly, romantically involved interaction partners judged each other as more agreeable, confident, friendly, and competent (M = 6.1, SD = 1.1) than interaction partners in the "stranger" condition (M = 4.8, SD = 1.3), t(50) = 2.99, p = 0.004. The interaction partner conditions did not differ significantly with respect to ratings of perceived (dis)comfort of the situation, t(50) = 1.84, p = 0.07, and to the question how cooperative/competitive they experienced the situation, t(50) = 1.58, p = 0.12. Cooperation/competition and perception of the situation thus seem to be unaffected by the interaction partner manipulation. Participants in the stranger condition reported not to be acquainted with their interaction partner (M = 1.4, SD = 0.7). Naturally, romantic partners were acquainted with each other (M = 5.0, SD = 0.0). Acquaintance scores differed significantly between both conditions, t(50) = −23.66, p < 0.001.

TABLE 1 | Means (SD) of participants' ratings of the experimental situation and memory test performance.

Means in the same row with different subscripts differed at p < 0.01.

#### Relationship Ratings

As part of the online questionnaire, participants rated their relationship satisfaction with the RAS before taking part in the computer experiment. We computed the average RAS scores (cf. Sander and Böcker, 1993) separately for each participant. In general, RAS scores were rather high. Importantly, however, the relationship satisfaction of participants who interacted with their romantic partner (M = 4.3, SD = 0.4) did not differ from the relationship satisfaction of participants who interacted with a stranger (M = 4.3, SD = 0.4), |t| < 1. However, and unexpectedly, the duration of the current relationship differed significantly between both interaction groups: Relationship duration was longer for participants in the "romantic interaction partners" condition (M = 4.8 years, SD = 5.7 years) than for participants in the "stranger" condition (M = 2.3 years, SD = 1.7 years), t(50) = −2.21, p = 0.032. Post hoc data exploration revealed that this difference was due to two outliers in the romantic partner sub-sample (i.e., a couple with very long relationship duration). When this outlier couple was removed, relationship duration no longer differed between both interaction partner conditions<sup>2</sup> .

<sup>2</sup>Because the "romantic partner" condition comprised a small sample, outlier values may exert a stronger influence than they would in a larger sample, which makes it even more important to control for such biases (we thank a reviewer for pointing this out). Thus, we removed the couple with the outlier value on relationship duration from the analyses. After excluding this couple from the subsample of all romantically involved interaction pairs (remaining N = 20), groups no longer differed in relationship duration (romantic partners: M<sub>duration</sub> = 3.2; strangers: M<sub>duration</sub> = 2.3), t(48) = −1.40, p = 0.16. Measures of relationship quality were unaffected by removal of the outlier couple and remained statistically equal between groups (romantic partners: M<sub>RAS</sub> = 4.3; strangers: M<sub>RAS</sub> = 4.3, |t| < 1). Most importantly, removal of the outlier couple did not affect ANOVA results: The three-way interaction of stimulus relation, response compatibility, and interaction partner remained significant, F(1,48) = 3.07, p = 0.043, η<sub>p</sub><sup>2</sup> = 0.06 (one-tailed; given our specific prediction with regard to the nature of the three-way interaction, a one-tailed test is allowed and recommended, see Maxwell and Delaney, 1990, p. 144). The Stimulus Relation × Response Compatibility interaction remained significant in the romantic partner condition, F(1,19) = 5.03, p = 0.037, η<sub>p</sub><sup>2</sup> = 0.21, but not in the stranger condition, F < 1. All results indicate that the Stimulus Relation × Response Compatibility × Interaction Partner interaction was not due to the duration of the current relationship.

#### Memory Test Performance

Additionally, we computed probe actors' average error rates in the memory test (see **Table 1**) to ensure that participants of both conditions were motivated to a comparable extent to observe their interaction partner's prime reactions. Error rates were low in general (3.0%); most importantly, they did not differ between interaction partner conditions, |t| < 1. We conclude that all prime observers adequately attended to and thus memorized their interaction partner's prime response.

### Probe Performance

Only probe actors' release RTs after correct prime responses and for correct probe responses were analyzed. Thus, 1.6% of prime-probe sequences with erroneous responses of the prime and/or probe actor were excluded. We also excluded probe responses for sequences with erroneous responses in the memory test (3.0%; overall: 0.7%) and probe release RT outlier values<sup>3</sup> (5.4%). We then computed probe actors' mean release RTs for every condition of the factorial design (see **Table 2**). These means were entered into a 2 (stimulus relation: stimulus repetition vs. stimulus change/baseline) × 2 (response compatibility: compatible vs. incompatible) × 2 (interaction partner: romantic partners vs. strangers) mixed-factors analysis of variance (ANOVA).<sup>4</sup>

Results revealed a significant main effect of response compatibility, F(1,50) = 20.15, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.29, indicating that probe actors responded faster in sequences in which a compatible probe response was required (454 ms) compared to sequences in which an incompatible response was required (466 ms). The main effect of interaction partner was also significant, F(1,50) = 5.94, p = 0.018, η<sub>p</sub><sup>2</sup> = 0.11, showing that probe actors who worked with a stranger (441 ms) were faster than probe actors who worked with their romantic partner (486 ms) during the experiment. Most central to our prediction, the three-way interaction of stimulus relation, response compatibility, and interaction partner was significant, F(1,50) = 4.37, p = 0.042, η<sub>p</sub><sup>2</sup> = 0.08, indicating that retrieval effects for bindings between observed prime responses and word stimuli (i.e., the Stimulus Relation × Response Compatibility interaction) differed between the "romantic partner" and "stranger" conditions (see **Figure 2**). To investigate the three-way interaction in more detail, we conducted follow-up ANOVAs separately for both conditions of the interaction partner factor. In line with our hypothesis, the Stimulus Relation × Response Compatibility interaction was significant in the romantic partner condition, F(1,21) = 7.50, p = 0.012, η<sub>p</sub><sup>2</sup> = 0.26 (see **Figure 2A**). When required probe responses were compatible with observed prime responses, stimulus repetition from prime to probe significantly facilitated performance compared with stimulus change probes (Δ = 9 ms; t[21] = 2.35, p = 0.014 [one-tailed], d<sub>z</sub> = 0.50). In turn, when required probe responses were incompatible with observed prime responses, stimulus repetition from prime to probe led to a descriptive slowing of responses, compared with stimulus change probes, although this performance cost just missed conventional levels of significance (Δ = −7 ms; t[21] = 1.59, p = 0.063 [one-tailed], d<sub>z</sub> = 0.34). In contrast, the Stimulus Relation × Response Compatibility interaction was completely absent in the stranger condition, F < 1, p = 0.868, η<sub>p</sub><sup>2</sup> = 0.00 (**Figure 2B**). No other effect was significant.

TABLE 2 | Means (SD) of probe actors' release RT (ms).

SR, stimulus repetition; SC, stimulus change (baseline); SR-effect, stimulus repetition effect, computed as the difference SC minus SR; S × R Interaction Effect, interaction between stimulus relation and response compatibility, computed as the difference between SR-effects for compatible responses minus SR-effects for incompatible responses; C, compatible; IC, incompatible. Standard errors of the means in square brackets.

FIGURE 2 | Probe actors' average release RT (ms) as a function of stimulus relation (stimulus repetition: solid lines; stimulus change: dotted lines), response compatibility between observed prime and executed probe response, and interaction partner (A: probe performance of participants interacting with their own romantic partner; B: probe performance of participants interacting with a stranger). Error bars depict 95% confidence intervals for paired differences (CIPD; Pfister and Janczyk, 2013), computed for the difference of stimulus change minus stimulus repetition (SC-SR) within each probe response compatibility level.

<sup>3</sup>Probe release RTs below 250 ms or more than 1.5 interquartile ranges above the third quartile of the individual distribution of probe release RTs were regarded as outliers (Tukey, 1977).

<sup>4</sup>The same 2 (stimulus relation) × 2 (probe response compatibility) × 2 (interaction partner) ANOVA on mean probe hit RTs as dependent measures yielded similar (though somewhat noisier) results. Specifically, the three-way interaction just missed conventional levels of significance with F(1,50) = 2.59, p = 0.057 (one-tailed), η<sub>p</sub><sup>2</sup> = 0.05. For illustrative purposes, we performed follow-up analyses similar to those reported in the main text to make sense of the underlying data pattern for probe hit RTs. Accordingly, the Stimulus Relation × Response Compatibility interaction was significant for probe hit RTs in the romantic partner condition, F(1,21) = 7.52, p = 0.012, η<sub>p</sub><sup>2</sup> = 0.26, but was completely absent in the stranger condition, F < 1. However, we refrain from interpreting results for probe hit RTs, since they are confounded with movement speed and are hence not an ideal performance indicator.
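The outlier criterion described in Footnote 3 (release RTs below 250 ms, or more than 1.5 interquartile ranges above the third quartile of a participant's own RT distribution) can be sketched in Python as follows. This is a minimal illustration rather than the original R analysis: the function name is hypothetical, and the quartile method (`method="inclusive"`) is an assumption, since the article does not state how quartiles were computed.

```python
from statistics import quantiles


def trim_release_rts(rts_ms, lower_bound_ms=250.0, iqr_factor=1.5):
    """Drop outlier RTs for one participant (Tukey-style upper fence).

    Returns the retained RTs and the upper fence that was applied.
    """
    # Quartiles of this participant's own RT distribution
    # (quartile method is an assumption, see lead-in)
    q1, _, q3 = quantiles(rts_ms, n=4, method="inclusive")
    upper_fence = q3 + iqr_factor * (q3 - q1)
    # Keep RTs that are neither below the fixed lower bound
    # nor above the upper fence
    kept = [rt for rt in rts_ms if lower_bound_ms <= rt <= upper_fence]
    return kept, upper_fence


# Hypothetical release RTs (ms) for a single participant
rts = [200, 400, 420, 440, 460, 480, 500, 1200]
kept, fence = trim_release_rts(rts)
print(kept, fence)
```

Computing the fence per participant, as the footnote specifies, means that a generally slow participant is not penalized relative to a fast one; only the fixed 250 ms lower bound is shared across participants.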

# DISCUSSION

The present study examined stimulus-based retrieval of observationally acquired SR bindings in romantically involved couples versus pairs of strangers. We assumed that due to the interdependent structure of romantic relationships, romantically involved individuals would more closely represent their own and their partner's actions and would do so even if the task itself does not involve interdependence (i.e., even without instruction to cooperate or compete). Consequently, retrieval of observational SR bindings should be present in romantically involved interaction partners, but should be absent in unacquainted interaction partners. The present findings support our reasoning: Stimulus-based retrieval of observationally acquired SR bindings occurred only in romantically involved pairs; prime observers interacting with a stranger showed no retrieval effects for their interaction partners' behaviors.

Although numerically, stimulus repetition effects produced facilitation (i.e., positive) as well as interference (i.e., negative) effects for the "romantic partners" condition, the statistical pattern of stimulus repetition effects suggests that the effects are primarily driven by facilitation (i.e., significantly faster RTs for probe trials with compatible responses), rather than interference effects (since RT differences for probe trials with incompatible responses did not differ significantly from zero). The presently observed asymmetry is not uncommon in studies on SR-binding and retrieval effects and has been reported before (e.g., Rothermund et al., 2005; Frings et al., 2007; Frings and Rothermund, 2011; Horner, 2015). However, we want to emphasize that the most central test for stimulus-based binding and retrieval effects is the interaction term (i.e., the net effect of both facilitation and interference effects); importantly, this interaction was significant for the "romantic partners" condition, but was absent (with F < 1) for the "stranger" condition.

Before discussing the theoretical implications of our findings, we address some alternative explanations for the present results. First, and somewhat unexpectedly, interaction partner conditions differed significantly in relationship duration. Thus, one might argue that participants in the "romantic partner" condition might have been those who are more able to enter and maintain long-lasting relationships which might be associated with a general disposition or ability to rely on observational SR retrieval. However, post hoc data exploration showed that this significant effect was due to an outlier couple with very long relationship duration in the "romantic partner" condition. Exclusion of this couple (a) removed any significant differences between interaction partner conditions on relationship duration, but (b) did not affect relationship quality scores between groups, which did not differ statistically. Most importantly, (c) the pattern of results obtained for probe release RT was unaffected by outlier removal since the three-way interaction remained significant (see Footnote 2 for details). We can therefore conclude that differences in retrieval of observational SR bindings between both interaction partner conditions cannot be explained by differences in relationship duration. In our view, findings are uniquely attributable to differences in mutual interdependence that accrue from interacting with a stranger or one's romantic partner.

Second, it is possible that participants in the "romantic partner" condition implicitly assumed that their romantic partner would share her/his outcome in the experiment (as they would probably do themselves), although the distribution of extra rewards was based on each participant's individual performance and was independent of the performance of the interaction partner. Expectation of shared outcomes is known to produce a perception of "common fate," which is a key element of cooperative contact (e.g., Gaertner et al., 1999). This might have produced a more cooperative condition for participants in the "romantic partners" condition, compared with participants in the "stranger" condition, and thus reflects a possible alternative explanation for the observed effects.<sup>5</sup> However, we regard this possibility as somewhat unlikely, for several reasons: (a) Sharing outcomes might characterize only some, but not all romantic couples, and is highly influenced by various additional factors (e.g., individual preferences, personality style, etc.). We simply do not know whether and to which extent some or all of the romantic couples formed such a "common fate" perception. (b) Romantic relationships are the paramount example of positively interdependent relationships (possibly reflecting a "ceiling" effect in terms of positive interdependency). Thus, we consider it unlikely that any relationship became even more positively interdependent than it already was based on the mere possibility of shared profits. (c) We explicitly assessed to which extent participants perceived the experimental situation as cooperative/competitive. Importantly, the two groups did not differ significantly on this measure (see "Results" section). This finding argues against any confounding influence due to expectations of "shared outcomes" or "common fate" perceptions in the "romantic partners" condition. However, we concede that most of these speculations are post hoc, and that it would be preferable to explicitly assess whether and to which extent the expectation of shared profits alone shaped participants' perception of the experimental task as more cooperative and thus affected retrieval effects. To address this, one would need a follow-up study with the following design: Pairs of participants work independently of each other on the observational SR binding task. Importantly, participants may acquire a claim for an extra reward (based on their individual performance). However, each extra reward is then submitted to a "pool of shared profits," which may hold none, one, or two extra rewards (based on the individual performance of each participant). Crucially, both participants are informed that this pool of shared profits is distributed equally between both interaction partners at the end of the task. If the outlook of shared profits is sufficient to produce retrieval of observational SR bindings, the pattern obtained in the present study for the "romantic partners" condition should replicate. Future research is therefore needed to address this issue.

Third, another concern relates to the fact that interaction partners in the "stranger" condition showed no SR retrieval at all. Post hoc power calculations (see "Materials and Methods" section) showed that the achieved power in the "stranger" condition was sufficient to detect an effect of at least medium size. We can therefore conclude that the absence of SR retrieval effects in the "stranger" condition does not stem from insufficient statistical power. Several explanations are possible. On the one hand, it is possible that working on a task with one's own romantic partner goes along with closer monitoring of the interaction partner, compared with working with a stranger. Thus, observational SR bindings in the "stranger" condition might suffer from a lack of additional attentional processing, resulting in weaker SR bindings (Logan, 1988). However, if this was truly the case, one would also expect group-specific differences in the memory test for prime observers (i.e., higher error rates in the "stranger" condition). Notably, error rates in the memory test did not differ between groups. We can therefore conclude that prime observers in both interaction partner conditions attended to and consequently encoded observed prime responses to equal extent.

On the other hand, it is possible that the very fast overall RT level of the "stranger" condition affected retrieval of observational SR bindings. For instance, it is possible that the absence of a facilitation effect (on stimulus repetition compared with stimulus change probes) for compatible responses is due to the very fast overall RT pattern (i.e., a floor effect). In other words: Participants in this condition already responded so quickly that any further speed-up effect was negligible (or even impossible). However, if this line of reasoning is correct, one would expect that the interference effect (on stimulus repetition compared with stimulus change probes) for incompatible responses should in fact be stronger in the "strangers" compared with the "romantic partners" condition. That is because participants in the "romantic partners" condition are already so slow on a general level (i.e., reflecting a ceiling effect) that any further slowing due to retrieval of inappropriate responses has no further detrimental effect on probe performance. In our view, this is somewhat implausible, given that both effects, i.e., retrieval-induced facilitation for compatible responses and retrieval-induced interference for incompatible probe responses were more pronounced in the "romantic partners" condition. Nevertheless, we wanted to test this possibility empirically and performed quintile analyses.<sup>6</sup> However, none of the effects of interest did interact with the quintile factor, indicating that overall differences in response speed cannot account for the observed pattern of results.
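The binning step behind such a quintile analysis can be sketched as follows. This is a minimal illustration in Python, not the authors' analysis code: the function names `assign_quintiles` and `quintile_means` and the example RT values are hypothetical, and the rank-based rule for cutting the distribution into fifths is an assumption, since Footnote 6 states only that quintiles were based on each participant's individual probe release RT distribution.

```python
def assign_quintiles(rts):
    """Label each RT with its quintile (1 = fastest fifth, 5 = slowest fifth).

    Assumption: quintiles are formed by rank within one participant's
    own RT distribution; the source does not specify the exact rule.
    """
    n = len(rts)
    # Indices of the RTs ordered from fastest to slowest.
    order = sorted(range(n), key=lambda i: rts[i])
    labels = [0] * n
    for rank, index in enumerate(order):
        labels[index] = rank * 5 // n + 1
    return labels


def quintile_means(rts):
    """Mean RT within each quintile of one participant's distribution."""
    labels = assign_quintiles(rts)
    means = {}
    for quintile in range(1, 6):
        values = [rt for rt, label in zip(rts, labels) if label == quintile]
        if values:  # a quintile can be empty when there are very few trials
            means[quintile] = sum(values) / len(values)
    return means


# Hypothetical probe release RTs (ms) of a single participant.
example_rts = [520, 310, 450, 390, 610, 480, 350, 560, 420, 700]
print(assign_quintiles(example_rts))
print(quintile_means(example_rts))
```

In a full analysis, the quintile label would enter the ANOVA as an additional within-participant factor, so that a retrieval effect confined to slow or fast responses would show up as an interaction with quintile.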

Related to the previous point, we want to emphasize that we cannot exclude that the overall speed differences between interaction partner conditions occurred as a consequence of (and hence were caused by) the manipulation. Put differently, working on the task together with one's romantic partner might have relaxed participants to a certain degree due to this positive interdependency so that participants eased off (and also slowed down) a bit in their general wish to "get done" with the experiment. In turn, working with a stranger did not have this "easing" effect on participants, which is why participants in this condition responded significantly faster on a global level. Tentatively – although this is only a post hoc speculation – we want to point out that a similar main effect was also apparent in the study by Giesen et al. (2014, Exp 1). Namely, probe release RTs of participants in the cooperative condition were significantly slower than those of participants in the independent (Δ = 34 ms) or competitive (Δ = 30 ms) condition. Importantly though, the overall difference in global RT did not affect the retrieval of observationally acquired SR bindings, which was apparent in cooperative and competitive pairs (but absent in independent pairs). Against this background, we want to stress that the main effect of interaction partner condition cannot account for the qualitatively different pattern due to SR retrieval effects.

<sup>5</sup>We thank a reviewer for drawing our attention to this important alternative explanation.

<sup>6</sup>Specifically, we computed quintiles based on each participant's individual probe release RT distribution. We then ran a 2 (stimulus relation) × 2 (response compatibility) × 2 (interaction partner) × 5 (quintile) mixed-models ANOVA. One participant had to be excluded from this analysis due to empty cells, meaning that data of n = 51 participants entered into the analysis. However, the quintile factor did not interact with the effects of interest: Specifically, neither the stimulus relation × response compatibility × quintile interaction, F(4,46) < 1, p = 0.67, nor the four-way interaction of stimulus relation × response compatibility × interaction partner × quintile, F(4,46) = 1.76, p = 0.15, reached significance.

As another explanation to account for the absence of SR retrieval effects in the "stranger" condition, one might argue that opposite-sex strangers represent a potential threat for participants in a committed relationship. As a result, these participants might shield their relationship by (un-)consciously activating self-regulatory mechanisms that corrupt social interactions with interaction partners from the opposite sex (e.g., Koranyi and Rothermund, 2012). Findings by Karremans and Verwijmeren (2008) support this reasoning. They observed that imitation of an unacquainted, attractive opposite-sex interaction partner was reduced when participants were in a committed relationship, compared with singles. However, note that retrieval of observationally acquired SR bindings was also absent in the study by Giesen et al. (2014) when pairs of participants worked independently of each other and although about half of the pairings were same-sex interaction partners. In addition, probably as many participants of this sample were not involved in a romantic relationship and thus had nothing to shield against, but still did not show any effects of observational SR binding.

In our view, it makes more sense to regard the present absence of SR retrieval in the "stranger" condition as important replication of the null finding from the initial study by Giesen et al. (2014) when pairs worked independently of each other. According to Bandura (1986), not everything that is encoded through observation will also be retrieved later. With respect to the present paradigm, this means that one will not blindly incorporate any observational SR binding for one's own action regulation. We therefore believe that the absence of observational SR retrieval represents the default in situations in which the interaction partner is not socially relevant either in the specific task/situation (e.g., when interaction partners work independently of each other) and/or in terms of more permanent forms of personal attachment (e.g., romantic partners, close friends, etc.).

# Theoretical Implications

An important question is how one can explain the modulating influence of social interdependence that is apparent not only for retrieval of observational SR bindings, but also for action co-representation effects. Building on earlier findings from Aron et al. (1991), several authors argued that the overlap between mental representations of self and other reflects a possible mediating process (e.g., Hommel et al., 2009; Giesen et al., 2014; Maister and Tsakiris, 2016). That is, as relationships become more interdependent or closer, the mental representations of self and other will be more closely interconnected. Following this line of reasoning, one is more likely to represent the response of another person like one's own response if that other person is a socially relevant other (e.g., a person with whom one interacts in a cooperative or competitive way). Hence, interacting with socially relevant others makes it more likely (a) to co-represent actions of a co-actor (Ruys and Aarts, 2010; Iani et al., 2011), but also (b) to rely on observationally acquired SR bindings to regulate one's own actions (Giesen et al., 2014). However, when another person is not socially relevant (e.g., when both participants work independently of each other), people are more likely to keep mental representations of self and other more distinct and separated from each other.

In this respect, the present study supports the notion that our cognitive system requires a minimum degree of connectedness between actor and observer in order to utilize observationally acquired SR bindings for one's own action regulation. Connectedness in this respect can be conceptualized as the extent to which the co-actor is socially relevant in a given situation. Importantly, perceiving another person as socially relevant might be the product of situationally induced dependencies (i.e., instructions to cooperate with or compete against a co-actor), but might also result from more chronic forms of personal attachment (e.g., romantic relationship status) that "bridge the gap" between co-actors whenever situational dependencies are absent. However, it is an unresolved issue whether the present findings would also generalize to other forms of close relationships (e.g., close friends, family members, or lifelong arch-enemies) or are restricted to romantic relationships, which show not only overlap in cognitive representations of self and other, but also share body representations (see Maister and Tsakiris, 2016).

In addition, the present findings advocate overlap between mental representations of self and other as a potential underlying mechanism in producing retrieval effects of observational SR bindings even if the task does not explicitly require representing the other's action. To bolster this claim, future research is needed to detect other conditions that also go along with closer or more distinct self-other representations. A worthwhile endeavor would be to explore manipulations that allow for a more direct test, for instance by experimentally inducing overlapping versus separate self-other representations.

# AUTHOR CONTRIBUTIONS

CG developed the research idea and study design, organized data collection and analyses, and prepared the manuscript. VL was responsible for data recruitment and analyses and was involved in manuscript preparation. KR was involved in manuscript preparation. NK was involved in the development of the research idea and study design as well as in manuscript preparation.

# FUNDING

This research was supported by a grant from the Deutsche Forschungsgemeinschaft to KR (DFG RO1272/6-2).

# REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Giesen, Löhl, Rothermund and Koranyi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# What's Shared in Movement Kinematics: Investigating Co-representation of Actions Through Movement

Matilde Rocca<sup>1,2</sup> and Andrea Cavallo<sup>1,2</sup>\*

*<sup>1</sup> Department of Psychology, University of Torino, Turin, Italy, <sup>2</sup> C'MoN, Cognition, Motion and Neuroscience Unit, Fondazione Istituto Italiano di Tecnologia, Genova, Italy*

Keywords: joint motor tasks, kinematics, co-representation, movement styles, social interaction

#### Edited by:

*Timothy N. Welsh, University of Toronto, Canada*

#### Reviewed by:

*James William Roberts, Liverpool Hope University, United Kingdom; Paul Forbes, University College London, United Kingdom*

\*Correspondence: *Andrea Cavallo andrea.cavallo@unito.it*

#### Specialty section:

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

Received: *13 April 2018* Accepted: *08 August 2018* Published: *28 August 2018*

#### Citation:

*Rocca M and Cavallo A (2018) What's Shared in Movement Kinematics: Investigating Co-representation of Actions Through Movement. Front. Psychol. 9:1578. doi: 10.3389/fpsyg.2018.01578*

# INTRODUCTION

In recent years, psychological research has shown a growing interest in the study of human social interaction. This has led researchers to develop new paradigms and to formulate new theories about how people adjust minds and bodies when interacting with each other (Schilbach et al., 2013; Gallotti et al., 2017). One intriguing question that arises when dealing with social interactions concerns what information actors share about each other when involved in a joint action. One of the most influential theories in this field states that, given the fundamental social nature of joint actions, people have the tendency to represent and map both one's own and others' task demands (Sebanz et al., 2003, 2005). However, this view has recently been challenged by proponents of the "referential coding account" who have criticized the apparent nonsocial nature of the tasks and methodologies used to formulate and support the co-representation theory (Dolk et al., 2011, 2014).

In the present opinion article, we briefly describe the experimental paradigms often employed to study the co-representation theory (section Co-representation theory: proponents and opponents). Then, we illustrate potential methodological issues related to these paradigms (section A methodological problem), and finally we propose a new strategy, based on the characterization of movement kinematics, to address the open question about what is shared in shared actions (section A motor solution).

# CO-REPRESENTATION THEORY: PROPONENTS AND OPPONENTS

Investigating joint performance requires researchers to focus on interactive experimental settings, trying to overcome the long-lasting trend of studying humans in isolation (De Jaegher et al., 2010; Schilbach et al., 2013). To this end, Sebanz et al. (2003) proposed a social version of a well-known individual Stimulus-Response Compatibility (SRC) paradigm: the Simon Task.

In the joint version of the task, two stimulus-response mappings of a two-choice task are distributed between two agents (e.g., Agent 1 presses for green squares; Agent 2 presses for red squares). Even with no need of taking the other's mapping into account, the results highlight an interference effect between a task-irrelevant aspect of the stimulus (e.g., its position on the screen) and a task-relevant aspect of the response (e.g., the position of the button to press). The similarity with the original Simon effect led researchers to formulate the co-representation theory, which states that, given the social nature of joint actions, people tend to co-represent automatically each other's portion of the task in a functionally equivalent way (Sebanz et al., 2003, 2005). This theory has received support from many other studies that have used SRC tasks to test its assumptions (e.g., Atmaca et al., 2008, 2011; Elekes et al., 2016).

The co-representation theory has nevertheless received criticism. Some authors have argued that the behavior people display during the joint Simon task derives from a universal information-processing rule, having little to do with social skills (Dolk et al., 2011, 2014). Different studies have demonstrated that a nonsocial attention-attracting event, such as a Japanese waving cat, elicits the very same behavior observed in the joint Simon task (Dolk et al., 2013; Puffe et al., 2017). The main idea, expressed by opponents of the co-representation theory in what they call the referential coding account, is that the other person's action simply provides a spatial reference for one's own action, in the same way as any sufficiently salient event would do.

These two perspectives seem hard to reconcile, as they rest on contrasting interpretations. The debate thus appears to have reached a stalemate, and the co-representation theory is facing an unexpected impasse.

# A METHODOLOGICAL PROBLEM

It is worth noticing that the referential coding account does not intend to deny the social nature of joint performances: what the authors claim as nonsocial is the behavior that arises from joint SRC tasks used to investigate the co-representation theory (e.g., Dolk et al., 2011, 2014; Yamaguchi et al., 2018). The referential coding account is in fact a nonsocial way to explain the observed effects, which thus sometimes fall in an interpretational ambiguity.

This consideration raises a methodological problem. Two possible issues may in fact concern the use of SRC tasks in investigating co-representations: one is interpretational, one is practical. Both issues stem from the task that the two participants perform, which is for both a key press. This type of response is described as discrete, and is often contrasted with continuous responses (e.g., Song and Nakayama, 2009).

The interpretational issue relates to the poverty of the actions performed. Investigating joint performance with a task that involves discrete responses seems to reduce the social nature of the interaction. Using such a simple task is surely helpful in controlling the experimental setting, yet it comes at the cost of an unnatural social setting. In daily environments, our social partners engage in actions that are much more complex, which we understand and predict (for a review see Springer et al., 2012; Hasson and Frith, 2016). Therefore, joint SRC tasks restrict the focus to a partner's action that may be too minimal to highlight a social effect.

The practical issue concerns the dependent measure obtained from joint SRC tasks: response time (RT). Although RT measures have helped to infer several aspects of human cognitive processes, it is well established that they restrict the investigation to a unidimensional assessment of behavior, without the opportunity of accessing the "continuity of the mind" (e.g., Spivey and Dale, 2004; Song and Nakayama, 2009). In joint SRC tasks, RTs show an interference effect, which suggests that we represent the other person's task and that this representation weakens our performance. However, RTs do not allow access to the content of this representation, which limits the investigation of the phenomenon. For example, to coordinate with others, we must consider not only what movements others are doing, but also how they are moving (Keller et al., 2014; Gallotti et al., 2017). RTs can thus provide insightful information about the what component of co-representations, but they cannot be informative about the how, i.e., whether we also represent the specific movement styles of others' actions (but see Schmitz et al., 2017).

# A MOTOR SOLUTION

To overcome the methodological issues that seem to affect joint SRC paradigms, here we propose a different experimental strategy that might shed light on the co-representation phenomenon.

We propose to turn to experimental paradigms that elicit a more complex and enriched overt motor activity. These joint motor tasks could help to address both the interpretational and the practical issues linked to joint SRC tasks.

On the interpretational level, dealing with a partner that makes complex movements can enhance the ecological validity of the experiments, bringing the setting closer to a real-life social interaction. Human movements present unique features that distinguish them from artificially generated motions (Thompson and Parasuraman, 2012; Steel et al., 2014); furthermore, besides fundamental regularities (Viviani and Flash, 1995), individuals show specific movement styles (Ting et al., 2015; Koul et al., 2016). The exclusively human capability to understand, predict, anticipate, and adjust to how other people move establishes the profound social aspect of joint performances. We thus believe that, assuming the validity of the co-representation theory, the use of motor tasks could help to reject alternative nonsocial interpretations of joint SRC results.

On the practical level, movement kinematics might constitute a much more informative dependent measure than RTs, although caution must be taken when dealing with multivariate measures that provide huge amounts of data (e.g., high levels of false positives; Simmons et al., 2011). When investigating internal processes, some authors suggest replacing RT measures with dependent variables that are more fluid, continuous, and that can change over time (Freeman et al., 2011); movement kinematics could be a good candidate because of their capacity to reflect the unfolding of internal dynamic processes over time (Song and Nakayama, 2009; Freeman et al., 2011). Indeed, despite the role played by inhibitory processes (for a review see Schall et al., 2017), human movements reveal a lot about both our external and our internal world. For example, movement kinematics have proven to be different depending on objects' size, shape, mass, and even texture and fragility (Weir et al., 1991; Castiello et al., 1992; Savelsbergh et al., 1996; Ansuini et al., 2015; for review see Jeannerod et al., 1995; Castiello, 2005). Even more interestingly, kinematic features encode information about more abstract internal states, including intentions (Cavallo et al., 2016; Becchio et al., 2018), decisions (McKinstry et al., 2008), numerical representations (Song and Nakayama, 2008), and other cognitive processes (Song and Nakayama, 2009; Freeman et al., 2011). Therefore, movement kinematics could be an adequate measure to investigate complex internal representations, like those of other persons' tasks and actions.

The characterization of human movement has already been extensively investigated in social interaction studies (Krishnan-Barman et al., 2017); however, these studies often focus on distinguishing between individual and social behavior, without fully addressing the question of whether and how we use information about the others to succeed in a joint action. A vast literature suggests that our movements are different in a social setting (Becchio et al., 2010; Krishnan-Barman et al., 2017), and that they are highly influenced by other people's movements (Blakemore and Frith, 2005; Heyes, 2011). This seems to indicate that other people's actions are actually represented in our brains when we act together; yet it remains unclear how specific these representations are, and how they come into play during joint performances: How and to what extent is the representation of others' task demands integrated within one's own motor system during joint actions? Does this representation include information about the others' motor behavior? Is this information specific to the confederate one is interacting with?

To address these questions, we propose to use joint motor tasks involving participants in sequential actions, with the aim of reaching a common goal. A possible method could be to maintain the movement requirements of the first agent (A1) constant throughout the interaction, while manipulating those of the second agent (A2), i.e., modifying the difficulty of A2's task, while keeping that of A1 constant. The kinematic profile of the first agent's movements could then be a good predictor of the movement that the second agent is about to make (**Figure 1**).

Compared to simultaneous actions, sequential motor tasks might increase the internal validity of the studies that aim to investigate co-representations, as they prevent potential confounds caused by automatic imitation and motor contagion effects (Kilner et al., 2003; Heyes, 2011). Examining the similarity between the movement profiles of the two agents might in fact help in understanding how specific the representation of the other person's actions is, letting us begin to access the content of co-representations.

FIGURE 1 | … grasps the object in the intermediate target area and places it in a final target area that varies across trials (e.g., different distance and size; upper panels). We expect that A2's task demands will be processed by A1. If so, kinematic profiles of A1 movements should encode information about the movement that A2 is *about to* make (lower panel).

Consider the kinematic modulation that occurs when an action is directed toward a small target: compared to large targets, movements toward small targets require greater precision, which is achieved through an earlier reach of the peak velocity and a longer deceleration phase (e.g., Marteniuk et al., 1987). We would in fact expect A2 to present an earlier time to peak velocity and a longer deceleration phase when his movement is directed toward a small, compared to a large target. If A1's velocity profile shows a modulation similar to that of A2, we would be facing two possible explanations. The first would suggest that A1 has formed a generic representation of A2's task: A2's targets might act as distractors for A1, producing an interference effect. The second would suggest that A1 has formed a detailed representation of A2's action, including kinematic information about the specific way in which A2 is going to move. In both cases, we would expect a positive correlation between the velocity profiles of the two agents. However, in the second case, the observed correlation would be higher than any other correlation obtained by permuting agents between pairs (e.g., correlation between A1 movements of pair n and A2 movements of pair m).
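The comparison between within-pair and permuted-pair correlations can be made concrete with a short sketch. The Python code below is a hypothetical illustration under stated assumptions, not a published analysis pipeline: the function names, the data layout (one kinematic value per trial for each agent of each pair, for instance time to peak velocity), and the synthetic numbers are all invented for the example, and trials are assumed to follow the same condition order in every pair so that values from different pairs can be correlated.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def time_to_peak_velocity(velocity, dt):
    """Time (s) from the first sample to the sample with the highest velocity."""
    peak_index = max(range(len(velocity)), key=lambda i: velocity[i])
    return peak_index * dt


def pair_specificity(a1, a2):
    """Within-pair correlations versus correlations after re-pairing agents.

    a1[i] and a2[i] hold one kinematic value per trial for pair i
    (hypothetical layout; trial order is assumed identical across pairs).
    """
    n_pairs = len(a1)
    true_r = [pearson_r(a1[i], a2[i]) for i in range(n_pairs)]
    permuted_r = [pearson_r(a1[i], a2[j])
                  for i in range(n_pairs) for j in range(n_pairs) if i != j]
    return true_r, permuted_r


# Synthetic data for illustration only: every A2 shows a shared effect of
# target size plus an individual style, and each A1 tracks their own partner.
rng = random.Random(0)
condition_effect = [0.5 if t % 2 == 0 else -0.5 for t in range(40)]
a2 = [[effect + rng.gauss(0.0, 1.0) for effect in condition_effect]
      for _ in range(5)]
a1 = [[value + rng.gauss(0.0, 0.1) for value in partner] for partner in a2]

true_r, permuted_r = pair_specificity(a1, a2)
print("smallest within-pair r:", min(true_r))
print("largest permuted-pair r:", max(permuted_r))
```

In these synthetic data the shared target-size effect stands in for a generic representation of A2's task, whereas the margin by which within-pair correlations exceed permuted ones stands in for a representation that is specific to the actual partner. With real recordings, the permuted correlations would serve as the reference distribution against which each observed within-pair correlation is evaluated.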

Another interesting aspect to explore would concern how the first agent's actions change over the course of the interaction. Building a representation of a person's actions may be a process that needs time and practice. The quantification of this kinematic adaptation could help to investigate how we learn to adjust to others in a joint task, and this would open the way to exploring the applicability of other theoretical models, such as associative learning (Catmur et al., 2009) and predictive coding (Kilner et al., 2007), to the joint action domain.

Furthermore, sequential motor tasks could provide a good tool to investigate whether co-representations arise exclusively in the joint action domain, where a common goal has to be achieved. Recent literature suggests that common goals might not be fundamental for creating social interactions (Gallotti et al., 2017). At the same time, other evidence places common goals at the heart of reciprocal motor influence (della Gatta et al., 2017). To disentangle these perspectives, it could be useful to investigate, by manipulating the instructions, whether and how others' motor representations change as a function of the presence or absence of a common goal.

# CONCLUSION

In this opinion paper, we aimed to describe and address the methodological issues connected to the paradigms currently used to support the co-representation theory. We presented an alternative approach to investigating the co-representation of actions, one focused on the use of joint motor tasks.

We believe that shifting the attention to movement kinematics, and specifically to those emerging during sequential joint actions, could further the current understanding of how people successfully engage in joint performances. On the one hand, it is reasonable to think that a motor approach may allow the co-representation theory to answer the current criticism. On the other hand, a motor approach might provide the opportunity of bringing the investigation forward. Movement kinematics could in fact be a good tool to investigate not only how we form representations about others, but also how we use co-representations to coordinate and adjust to others.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct, and intellectual contribution to the work, and approved it for publication.

# ACKNOWLEDGMENTS

AC was funded by Fondazione Compagnia di San Paolo, Excellent Young PI Grant (CSTO167915).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Rocca and Cavallo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Let's Move It Together: A Review of Group Benefits in Joint Object Control

#### Basil Wahn<sup>1,2</sup>\*, April Karlinsky<sup>3</sup>, Laura Schmitz<sup>4,5</sup> and Peter König<sup>1,6</sup>

<sup>1</sup> Institute of Cognitive Science, Universität Osnabrück, Osnabrück, Germany, <sup>2</sup> Department of Psychology, University of British Columbia, Vancouver, BC, Canada, <sup>3</sup> School of Kinesiology, University of British Columbia, Vancouver, BC, Canada, <sup>4</sup> Department of Cognitive Science, Central European University, Budapest, Hungary, <sup>5</sup> Fakultät für Philosophie, Wissenschaftstheorie und Religionswissenschaft, Ludwig-Maximilians-Universität, Munich, Germany, <sup>6</sup> Institut für Neurophysiologie und Pathophysiologie, Universitätsklinikum Hamburg-Eppendorf, Hamburg, Germany

In daily life, humans frequently engage in object-directed joint actions, be it carrying a table together or jointly pulling a rope. When two or more individuals control an object together, they may distribute control by performing complementary actions, e.g., when two people hold a table at opposite ends. Alternatively, several individuals may execute control in a redundant manner by performing the same actions, e.g., when jointly pulling a rope in the same direction. Previous research has investigated whether dyads can outperform individuals in tasks where control is either distributed or redundant. The aim of the present review is to integrate findings for these two types of joint control to determine common principles and explain differing results. In sum, we find that when control is distributed, individuals tend to outperform dyads or attain similar performance levels. For redundant control, conversely, dyads have been shown to outperform individuals. We suggest that these differences can be explained by the possibility to freely divide control: Having the option to exercise control redundantly allows co-actors to coordinate individual contributions in line with individual capabilities, enabling them to maximize the benefit of the available skills in the group. In contrast, this freedom to adopt and adapt customized coordination strategies is not available when the distribution of control is determined from the outset.

#### Edited by:

Karl Christoph Klauer, Albert Ludwigs Universität Freiburg, Germany

#### Reviewed by:

Janeen Dawn Loehr, University of Saskatchewan, Canada

Merryn Dale Constable, University of Toronto, Canada

#### \*Correspondence:

Basil Wahn bwahn@uos.de

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 29 March 2018 Accepted: 18 May 2018 Published: 07 June 2018

#### Citation:

Wahn B, Karlinsky A, Schmitz L and König P (2018) Let's Move It Together: A Review of Group Benefits in Joint Object Control. Front. Psychol. 9:918. doi: 10.3389/fpsyg.2018.00918

Keywords: social cognition, joint action, social interaction, motor coordination, coordination strategies

# 1. INTRODUCTION

Humans frequently coordinate their actions to jointly manipulate and control objects. These object-directed joint actions range from basic tasks such as carrying a table together (Sebanz et al., 2006) to complex ones such as flying an airplane (Hutchins, 1995). By controlling an object jointly, co-actors in a group may reach higher performance levels than individuals performing the same task alone: They may reach a group benefit (Reed et al., 2006; Wahn et al., 2016). However, controlling an object jointly also introduces additional coordination demands because co-actors need to predict or react to each other's actions and adjust their own action planning accordingly (Knoblich and Jordan, 2003; Sebanz et al., 2006; Vesper et al., 2017). Thus, joint object control introduces dependencies between co-actors because one actor's actions directly affect the actions of the other actor and vice versa.

Research on object-directed joint action (henceforth referred to as "joint object control") has investigated under which circumstances groups outperform individuals. In particular, researchers have identified different task types that determine whether the benefits of controlling an object together outweigh the costs of action coordination. In the present review, we discuss and compare two types of joint object control that have been shown to influence the emergence of group benefits: "distributed control" and "redundant control". Distributed control refers to tasks where co-actors have predetermined complementary action possibilities. For instance, one co-actor controls object movement along the horizontal dimension while the other co-actor controls object movement along the vertical dimension. In contrast, redundant control refers to tasks where co-actors have the same action possibilities. For example, both co-actors can control object movement along the horizontal and vertical dimensions (see **Figure 1**). Note that in all of the studies that we consider in the present review, participants had visual access to the controlled object such that they could observe the combined effects of their own and their co-actor's actions on the object. In the following, we first review studies that have investigated distributed control. We then turn to studies that have investigated redundant control. Finally, we integrate the findings to determine factors that may explain differences in outcomes between the two task types and we point out directions for future research.

# 2. DISTRIBUTED CONTROL

For distributed object control tasks, researchers have investigated how co-actors' access to information about each other's actions influences group benefits. In an early study, Knoblich and Jordan (2003) manipulated whether or not co-actors received information about each other's actions. Two co-actors were instructed to control a cursor on a computer screen in order to track an object that moved along the horizontal axis. Control over the cursor's movements was distributed such that one co-actor could press a key to increase the cursor's acceleration to the left whereas the other co-actor could press a key to increase acceleration to the right. Critically, co-actors either heard a tone whenever their co-actor pressed a key, or they did not receive any auditory information about each other's key presses. Joint performance was compared to an individual condition where individual participants controlled both movement directions bimanually. While individuals initially outperformed dyads, dyads eventually reached, but never exceeded, individual performance levels. Notably, joint performance improved only if co-actors received auditory information about each other's actions. Thus, individuals seem to have an initial performance advantage for this type of object control task.

Similar findings were observed when co-actors received haptic as opposed to auditory information about each other's actions (van der Wel et al., 2011). In the study by van der Wel et al. (2011), individuals and dyads moved a pole (similar to a pendulum) back and forth between two targets by pulling on two cords attached to the base of the pole. Control over the pole was either bimanual or distributed between two co-actors so that each co-actor controlled only one of the two movement directions by pulling one of the cords. Dyads reached a similar performance level as individuals, consistent with the study by Knoblich and Jordan (2003). The authors posited that receiving information about a co-actor's actions via the direct haptic coupling through the cords (in addition to seeing the pole move) was critical for dyads to achieve similar levels of performance as individuals (van der Wel et al., 2011). In a follow-up study, van der Wel et al. (2012) tested whether joint performance of the task would facilitate subsequent individual performance (and vice versa). However, they did not observe any transfer effects.

**Figure 1.** (A) Distributed control: Control is divided between co-actors such that the left co-actor can move the cursor in the vertical dimension while the right co-actor can move the cursor within the horizontal dimension (see Wahn et al., 2016). (B) Redundant control: Both co-actors can move the cursor in the vertical and horizontal dimensions.

Taken together, these results suggest that when co-actors distribute control over an object to move it along one spatial dimension, co-actors need to receive information about each other's actions (beyond the visible outcomes of these actions) to attain performance levels akin to individuals performing the same task alone. Otherwise, individuals outperform dyads. Arguably, receiving such additional information allows co-actors to more easily simulate and predict each other's actions, thereby overcoming the problem of not being able to access each other's internal models (Wolpert et al., 2003; Sebanz et al., 2006).

Bosga and Meulenbroek (2007) investigated differences between individuals and dyads using a task where participants pressed force transducers to lift a virtual horizontal bar to a target area. In the individual condition, participants used two transducers to control both ends of the bar bimanually. In the joint distributed condition, each co-actor used a single hand and controlled only one end of the bar. In both conditions, participants could see the bar. Thus, co-actors could observe the combined effects of their actions on the controlled object but did not have direct information regarding the specific actions of their co-actor. In line with findings by Knoblich and Jordan (2003), Bosga and Meulenbroek (2007) found that individuals outperformed dyads: Individuals performed faster movement corrections while lifting the bar and were better at stabilizing the bar in the lifted position. These findings have been replicated in a follow-up study (Newman-Norlund et al., 2008).

Recently, Wahn et al. (2016) investigated distributed control across two spatial dimensions. Two participants controlled either the horizontal or vertical movement of a cursor via key presses. Their joint goal was to move the cursor from a start position to a target position as fast as possible. Reaching the target necessitated both coarse and fine types of control: Coarse control was needed for steering the cursor toward the target during the approach phase whereas fine control was needed for placing the cursor precisely on the target position during the homing-in phase. Compared to an individual bimanual condition, dyads did not attain a group benefit in the approach phase, but they did so in the homing-in phase. Thus, in contrast to the studies discussed above, dyads outperformed individuals even though co-actors were not provided with information about each other's actions (i.e., key presses) but could only observe the combined effects of their actions on the controlled object. As dyads exceeded individual performance levels only when a fine type of control was required, this suggests that group benefits for joint object control depend on the task demands (i.e., coarse vs. fine control).

In light of the reviewed findings, we suggest that the emergence of group benefits in distributed control tasks may be explained by the degree of coordination required. Specifically, if two co-actors distribute control over an object that moves within one spatial dimension such that they can steer the object in opposite directions, the actions of one co-actor immediately affect the actions of the other co-actor. This requires a high degree of interpersonal coordination. In contrast, when control is distributed across two spatial dimensions such as when one actor controls the horizontal and the other the vertical dimension, the actions of one co-actor do not directly constrain the actions of the other. This lowers coordination demands and facilitates group benefits.

Besides the degree of coordination required to control an object jointly, a further factor affecting group benefits is co-actors' interindividual skill differences. The similarity in co-actors' individual performance levels has been shown to predict group benefits in the two-dimensional object control task described above (Wahn et al., 2016): The more similar the co-actors' individual skills, the higher the group benefit when they perform together. There is also evidence that individuals do not benefit equally from interpersonal coordination (Mojtahedi et al., 2017). In particular, when two co-actors physically lifted and balanced an object by each grasping one of the two handles of the object, only the "worse" co-actor benefited (relative to her individual bimanual performance) whereas the "better" co-actor's performance tended to decrease when performing the task jointly (Mojtahedi et al., 2017). However, in line with Bosga and Meulenbroek (2007), the joint performance was still worse than the individual performance in this type of control task (Mojtahedi et al., 2017).

In sum, the majority of studies investigating joint tasks with distributed object control find that individuals outperform dyads (Knoblich and Jordan, 2003; Bosga and Meulenbroek, 2007; Newman-Norlund et al., 2008; Mojtahedi et al., 2017). Findings also indicate that joint performance depends on (1) whether co-actors receive specific information about each other's actions (beyond seeing their combined effects on the controlled object) (Knoblich and Jordan, 2003; van der Wel et al., 2011); (2) the degree of coordination required (e.g., coordination in one or two spatial dimensions); (3) the type of control required (i.e., a coarse or fine type of control) (Wahn et al., 2016); and (4) co-actors' interindividual skill differences (Wahn et al., 2016; Mojtahedi et al., 2017).

# 3. REDUNDANT CONTROL

Group benefits have also been investigated using redundant object control tasks where two co-actors have the same sets of action possibilities and are free to exercise control redundantly or to flexibly distribute control. That is, despite the option to use all of their action possibilities, one co-actor may choose to use only a subset of her possible actions while the other co-actor may choose to use the complementary set. This type of voluntary distribution of control was demonstrated in a study by Reed et al. (2006). Dyads were instructed to accelerate an object within one spatial dimension toward a target position and then to decelerate the object until it stopped on the target. Control was redundant such that both co-actors could accelerate and decelerate the object. The authors found that dyads collaborated by having each co-actor focus on either accelerating or decelerating the object. Thus, co-actors chose to distribute control even though redundant control was possible. This coordination strategy successfully enabled dyads to reach a group benefit. These results suggest that in joint tasks where control is not distributed a priori, group benefits can be reached because co-actors can freely coordinate their preferred distribution of control. Of note, no such role specialization or group benefits were observed when participants performed the same task with a playback of human behavior (despite participants believing they acted with another person), suggesting that real, online interaction is necessary to reach a group benefit (Reed and Peshkin, 2008).

Further evidence that dyads adopt customized control strategies under redundant control conditions has been provided by Masumoto and Inui (2013, 2015). In a periodic force reproduction task, co-actors were required to jointly reproduce a target force (by continuously pressing force transducers) which varied periodically over time. While performing this task, they could see a visualization of the target force as well as their reproduced force on a computer screen. The authors found that dyads with redundant control achieved a more accurate performance than individuals performing the same task alone (Masumoto and Inui, 2013, 2015). Similar to the study by Reed et al. (2006), dyads used a distributed control strategy. That is, when one co-actor increased the exerted force, the other co-actor decreased her exerted force and vice versa. In a follow-up study, the authors manipulated co-actors' level of experience and found that whereas pairs with one experienced member initially showed greater levels of action coordination (i.e., more complementary force production) than pairs of two novices, the latter achieved similar performance levels after only one block of practice (Masumoto and Inui, 2014). Consistent with the benefits of practice in distributed control tasks discussed above (Knoblich and Jordan, 2003), these findings suggest that initial performance deficits (i.e., relative to individual performance or more skilled dyads) may be compensated for already within one experimental session.
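To make the notion of complementary force production concrete, the following Python sketch computes one possible index of it: the correlation between the two co-actors' sample-to-sample force changes. This is an illustrative assumption rather than the measure reported in the cited studies, and the function names `force_changes` and `complementarity` are hypothetical. Values near -1 indicate that one co-actor tends to decrease force whenever the other increases it.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def force_changes(force):
    """Sample-to-sample changes in a force time series."""
    return [later - earlier for earlier, later in zip(force, force[1:])]


def complementarity(force_a, force_b):
    """Correlation of the two co-actors' force changes.

    Negative values: complementary production (one goes up, the other down).
    Positive values: both co-actors change force in the same direction.
    """
    return pearson(force_changes(force_a), force_changes(force_b))
```

For example, `complementarity([1, 2, 1, 3, 2], [3, 2, 3, 1, 2])` is strongly negative because every increase by one co-actor is mirrored by a decrease by the other.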

In another set of studies, researchers investigated the effects of redundant object control on subsequent individual motor learning of a tracking task (Ganesh et al., 2014; Takagi et al., 2017). Dyads initially tracked the movements of a target object using a redundantly controlled cursor. Subsequently, they performed the same task individually. Individual performance on the task improved more after participants had practiced with a co-actor compared to when they had practiced alone, with a computer, or with a playback of a co-actor's performance (Ganesh et al., 2014). Thus, individuals benefited most from practicing with an interactive human partner. In a follow-up study, acting with a simulated interactive partner that was based on a human co-actor led to similar benefits in individual motor learning (Takagi et al., 2017). These findings, together with the results obtained by Reed and Peshkin (2008), suggest that using (simulated) interactive partners, rather than playback of human behavior, could be highly beneficial in real-world applications such as motor rehabilitation.

In sum, studies investigating redundant object control have shown that dyads outperform individuals, and that they distribute control when having redundant action possibilities (Reed et al., 2006; Masumoto and Inui, 2013, 2015). In addition, practicing a motor task jointly can benefit subsequent individual motor learning (Ganesh et al., 2014; Masumoto and Inui, 2017; Takagi et al., 2017).

# 4. INTEGRATION OF FINDINGS AND FUTURE DIRECTIONS

When comparing results for distributed and redundant control, findings suggest that dyads are more likely to attain a group benefit when they have redundant control. Why is redundant control more beneficial? We suggest that the opportunity to freely distribute control is a crucial factor as to whether or not group benefits are attained. Co-actors with redundant action possibilities have the option to distribute control in accordance with their coordination strategies and their individual capabilities, enabling them to combine their skills in the most efficient manner. Such customized control strategies are not available when the distribution of control is determined from the outset.

So far, a number of factors that have been investigated in distributed control have not yet been investigated in redundant control. In particular, it remains to be tested whether co-actors' performance in redundant control tasks is affected by the degree of coordination (e.g., coordination in one or two spatial dimensions), by the type of control required (i.e., a coarse or fine type of control) (Wahn et al., 2016), and by co-actors' interindividual skill differences (Wahn et al., 2016; Mojtahedi et al., 2017). Future research could also investigate how much time co-actors typically need to voluntarily distribute control, and whether the type of control distribution varies across dyads and across time.

An interesting factor that has not yet been investigated for either of the two types of control is group size. Does the size of a group benefit increase proportionally with the size of the group? Or is there an upper limit where the optimal group size has been reached such that further increasing the size will not lead to larger benefits? Another open question is how the social relationship between co-actors affects joint performance. Relatedly, a recent study on joint visual search found that the joint performance of two friends was better than that of two strangers (Brennan and Enns, 2015a). Furthermore, it would be worthwhile to investigate how individuals adjust their behavior to the specific co-actor with whom they are paired. It is likely that after first coordinating with one co-actor and then switching to a different one, individuals need to modify how they integrate the new co-actor's actions into their own action planning, possibly leading to initial decrements in joint performance.

A more technical direction for future research would be to introduce more informative measurements of joint performance. To date, the typical measure used to assess group benefits is the averaged performance difference between joint and individual conditions. Going beyond this measure, recent studies on joint visuospatial tasks have developed criteria to assess to what extent a group benefit can be ascribed to an actual collaboration between co-actors (Brennan and Enns, 2015b; Wahn et al., 2017, 2018a,b). That is, researchers have simulated a joint performance (based on the co-actors' individual performances) for which they assumed that co-actors act independently (i.e., do not collaborate). This simulated performance was then compared to the veridical joint performance. If veridical performance levels are higher than simulated performance levels, this suggests that co-actors did in fact collaborate. Similarly, future studies of joint object control could use measures that go beyond mere performance averages, thereby gaining valuable insight into how group benefits come about.
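The comparison against an independence baseline can be sketched as follows. This is a minimal sketch under stated assumptions, not the exact procedure of the cited studies: performance is assumed to be a completion time per trial, and independent co-actors are modeled as a race in which the faster of two randomly drawn individual times determines the simulated joint time. The function names `simulate_independent_joint` and `collaboration_benefit` are hypothetical.

```python
import random
from statistics import mean


def simulate_independent_joint(times_a, times_b, n_sim=10000, seed=0):
    """Simulate joint completion times for co-actors who do not collaborate.

    Assumption: each simulated joint trial is finished by whichever co-actor
    would have been faster alone (an independent race).
    """
    rng = random.Random(seed)
    return [min(rng.choice(times_a), rng.choice(times_b)) for _ in range(n_sim)]


def collaboration_benefit(joint_times, times_a, times_b, n_sim=10000, seed=0):
    """Simulated independent mean time minus the veridical joint mean time.

    Positive values mean the dyad was faster than the independence baseline,
    which is consistent with actual collaboration.
    """
    baseline = mean(simulate_independent_joint(times_a, times_b, n_sim, seed))
    return baseline - mean(joint_times)
```

A benefit near zero or below would suggest that the joint advantage over individual averages can be explained without assuming any collaboration. For joint object control, the race assumption would need to be replaced by a baseline suited to the specific task.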

Finally, future research may explore whether the factors affecting group benefits in joint object control tasks are applicable to related real-world tasks. Areas of application range from aviation where pilot and co-pilot exercise joint control over an airplane, to motor rehabilitation where practice with another person might benefit subsequent individual motor learning (Takagi et al., 2017). In these contexts, research on joint object control may provide insights into how to circumvent individual motor limitations or how best to promote injury recovery.

# REFERENCES


Hutchins, E. (1995). How a cockpit remembers its speeds. Cogn. Sci. 19, 265–288.


# AUTHOR CONTRIBUTIONS

BW and AK literature research. BW, AK, and LS manuscript draft. BW, AK, LS, and PK manuscript revision.

# ACKNOWLEDGMENTS

We acknowledge the support of a postdoc fellowship of the German Academic Exchange Service (DAAD) awarded to BW. Furthermore, we acknowledge the support by H2020-FETPROACT-2014 641321-socSMCs for PK and the support from the Deutsche Forschungsgemeinschaft (DFG), and the Open Access Publishing Fund of Osnabrück University.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Wahn, Karlinsky, Schmitz and König. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Multiple Frames of Reference Are Used During the Selection and Planning of a Sequential Joint Action

Matthew Ray <sup>1</sup> and Timothy N. Welsh<sup>2</sup> \*

<sup>1</sup> Offshore Safety and Survival Centre, Marine Institute of Memorial University, St. John's, NL, Canada, <sup>2</sup> Faculty of Kinesiology and Physical Education, University of Toronto, Toronto, ON, Canada

Co-actors need to anticipate each other's actions to successfully perform joint actions. The frames of reference (FOR) used to simulate a co-actor's action could impact what information is anticipated. We hypothesized that co-actors would adopt their co-actor's body-centered FOR, even when they do not share the same spatial orientation, so that they could anticipate body-related aspects of their co-actor's task. Because it might be beneficial to plan joint actions based on environment and body-centered information, we hypothesized that individuals would utilize multiple FORs during response planning. To test these hypotheses, participants performed a sequential aiming task where the goal was to move a wooden dowel to one of four potential targets as quickly and accurately as possible. A cue was presented at the beginning of each trial that was either 25, 50, or 75% valid. Following the cue presentation, the first person to act (initiator) placed the wooden dowel, anywhere they liked, in the workspace. Then, the finisher performed their aiming movement from the location that the initiator had placed the dowel. The key dependent measure was the dowel placement of the initiator because it provided an index of how much the initiator attempted to facilitate the efficient performance of the finisher. The results revealed that individuals adopted an allocentric FOR (dowel placement was more biased toward cued locations as cue validity increased) and partially adopted their co-actor's body-centered FOR (dowel placement was biased toward the finisher's body, but not toward the co-actor's contralateral space). In conclusion, multiple FORs can be used to anticipate both body- and environment-related information of a co-actor's task. It may be difficult, however, for individuals to fully adopt their co-actor's body-centered FOR when they have differing orientations.

Keywords: joint action, shared task representations, response selection and planning, frames of reference, motor simulation, sequential joint actions

# INTRODUCTION

Joint actions have been defined as "any form of social interaction whereby two or more individuals coordinate their actions in space and time to bring about a change in the environment" (Sebanz et al., 2006, p. 70). Examples of joint actions that occur on a daily basis include passing a bag of groceries, helping a child put on their shoes, and navigating through a crowd of individuals. Despite the fact that joint actions appear to be performed with ease and little thought, there are numerous motor and cognitive problems that must be solved to enable successful joint actions.

#### Edited by:

Anna M. Borghi, Sapienza Università di Roma, Italy

#### Reviewed by:

Annelie Rothe-Wulf, Albert Ludwigs Universität Freiburg, Germany

Francois Quesque, Lyon Neuroscience Research Center (INSERM), France

#### \*Correspondence:

Timothy N. Welsh t.welsh@utoronto.ca

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 30 October 2017 Accepted: 29 March 2018 Published: 01 May 2018

#### Citation:

Ray M and Welsh TN (2018) Multiple Frames of Reference Are Used During the Selection and Planning of a Sequential Joint Action. Front. Psychol. 9:542. doi: 10.3389/fpsyg.2018.00542

For example, each individual movement that contributes to the joint action originates from different people with unique bodies, abilities, thoughts, and experiences. In addition, individuals engaging in a joint action occupy different locations in space and might be oriented in a variety of ways toward other co-actors, the goal, and/or other important features of the environment. Despite these issues that need to be considered when selecting and planning joint actions, individuals are able to come together in space and time to achieve shared goals.

One way that individuals can overcome the challenges of coordinating actions in space and time is by accurately anticipating the actions of co-actors (Sebanz and Knoblich, 2009). The anticipation of another's action is thought to be enabled through a process in which each individual in the group develops shared task representations and simulates their co-actor's actions in their own motor system (Sebanz and Knoblich, 2009; Vesper et al., 2010). Shared task representations are hypothesized to contain information pertaining to the shared goal, each co-actor's task for achieving that goal, beliefs, and contextual information (Vesper et al., 2010; Pezzulo, 2011). This information from the shared task representation is hypothesized to inform the simulation processes used during the anticipation of a co-actor's action. Based on the importance of shared task representations and action simulation for anticipating a co-actor's action, it is apparent that the factors that influence what information is represented and simulated will also impact what is anticipated during joint actions.

One factor that could influence the anticipation of a co-actor's task is the spatial frames of reference that are used when representing and simulating a co-actor's action. In individual actions, spatial information used for the selection and planning of actions can be represented in different frames of reference. In egocentric frames of reference, information is coded relative to its spatial relationship to an individual's own body (e.g., eyes, hands and trunk; Colby, 1998; Klatzky, 1998; Galati et al., 2010). Allocentric frames of reference are environment-centered and information is coded based on the spatial relationship between objects or places in the environment (Colby, 1998; Klatzky, 1998; Galati et al., 2010). An additional frame of reference that needs to be considered during joint actions is the body-centered frame of reference of a co-actor. Being able to represent and simulate a co-actor's action from their body-centered frame of reference would be particularly important in scenarios where individuals are trying to predict how movement features that can be coded to the co-actor's body might impact the actual execution of an action (i.e., is the movement more difficult or uncomfortable based on the spatial relationship of the co-actor's body, effector and the environmental goal). Depending on the type of joint action context, critical action-related information could be coded in egocentric, allocentric and/or the body-centered frame of reference of a co-actor. Therefore, the frame of reference used when representing and simulating a co-actor's action could influence what is predicted, and hence, how effectively co-actors plan their actions to achieve the shared goal.

Pezzulo et al. (2013) have developed a shared action space framework to describe the frames of reference that might be used during joint actions. They have proposed that shared action spaces develop based on the same mechanisms that recalibrate spatial representations during tool use. Their framework is based on the notion that different frames of reference would be used in different contexts and that learning is critical in the development of these shared action spaces. For instance, they have hypothesized that goals (congruent, competitive, complementary), spatial orientation (angular disparity between co-actors), type of perspective taking [what another individual perceives (level 1 perspective) vs. how another individual experiences or would act in the world (level 2 perspective)], social factors (parent and child), and the complexity of an action all influence the frame(s) of reference adopted during joint actions.

In the simplest scenario, when co-actors share the same viewpoint and/or they only need to consider what each other can perceive, and the task requirements are low, co-actors might adopt a merged egocentric perspective, one in which each individual's egocentric action space is combined and represented in a shared action space. However, when individuals need to consider how a co-actor will actually experience their executed action and they do not share the same viewpoint, a more complicated scenario emerges. In this case, if there is a large angular disparity between co-actors and they have opposite spatial codes (R/L) relative to their body and the environment, then adopting the body-centered frame of reference of a co-actor would depend on complex spatial transformations to align the frames of reference. Due to the complexity of these transformations, co-actors may adopt different frames of reference to perform joint actions together. Therefore, Pezzulo et al. (2013) suggested that in those complex joint action contexts, multiple frames of reference could be adopted. By considering the proposals of Pezzulo et al. (2013), it becomes clear that there are numerous factors that can affect the frames of reference used in joint actions and, hence, the type of action-related information (environment-centered, body-centered, or other person body-centered) that could be anticipated and integrated into the selection and planning of joint actions. Although there is a growing body of literature on the complex processes that underlie joint action, the frames of reference that are used during the representation and simulation of a co-actor's action remain a research topic that requires further attention.

The current literature that focuses on the frames of reference used during different types of joint actions provides evidence that individuals will co-represent a partner's response and take into account their co-actor's perspective. For example, previous research has revealed that co-actors will: adjust the height of their reaching trajectories based on the eye level of their co-actor (Quesque and Coello, 2014); laterally shift their pointing trajectories toward the person being addressed in communicative pointing (Cleret de Langavant et al., 2011); change their reach-to-grasp kinematics based on the relationship, relative position, and pronoun use of co-actors (Gianelli et al., 2013); spontaneously adopt the visuospatial perspective of a co-actor during a stimulus-response compatibility task (Freundlieb et al., 2016); perform slower pointing movements and increase end point hold time during communicative pointing movements (Oosterwijk et al., 2017); and partially adopt a co-actor's body-centered frame of reference during a shared negative priming task (Frischen et al., 2009) and when performing a mental rotation task in the presence of a co-actor (Böckler et al., 2011). Taken together, these studies show that individuals can take their co-actor's body-centered frame of reference into account when performing social motor behaviors. However, based on the framework put forth by Pezzulo et al. (2013), one could make the case that the task requirements (action demands, level of perspective taking required) were low and/or the angular disparity between co-actors was small. Therefore, the findings from these studies may not apply or scale to more complex joint action contexts.
For example, when actions are used to communicate or signal information (Cleret de Langavant et al., 2011; Quesque and Coello, 2014; Oosterwijk et al., 2017) or no physical interaction is required (Böckler et al., 2011; Gianelli et al., 2013; Freundlieb et al., 2016), then individuals could have performed the task by using a level 1 visual perspective (i.e., what another person perceives). In contrast, when an individual is trying to anticipate how a co-actor will execute their action so that they can select and plan an action to help facilitate the achievement of a shared goal, then a level 2 visual perspective (i.e., how another person experiences or would act in the world) would be required.

Sequential joint actions (i.e., joint actions that involve multiple steps performed in a serial manner) are one type of joint action where co-actors could better achieve the shared goal if they could fully adopt each other's body-centered frame of reference during the anticipation of potential actions. Adopting the perspective of a co-actor is important because it would better enable individuals to plan actions that accommodate specific features of their co-actor's task, and hence, would help in achieving the shared goal. To date, findings in the sequential joint action literature demonstrate that when co-actors shared a similar spatial alignment to the environment and/or performed simple tasks (e.g., binary response alternatives), they adopted their co-actor's body-centered frame of reference. For example, individuals planned their actions to accommodate a comfortable grasping posture for their co-actor when manipulating a passed object (Gonzalez et al., 2011; Ray and Welsh, 2011; Meyer et al., 2013; Dötsch and Schubö, 2015; Constable et al., 2016; Scharoun et al., 2016). Overall, the research reviewed thus far is consistent with the framework of Pezzulo et al. (2013) and shows that when action and task requirements are low and/or co-actors have a small angular disparity between them (<90°), then individuals can represent space from their co-actor's body-centered frame of reference. However, there is currently a paucity of research that has explicitly investigated the frames of reference that are used in complex joint action tasks where co-actors have a large angular disparity between them (e.g., 180°). Therefore, it is unclear if, as suggested by Pezzulo et al. (2013), co-actors will still attempt to adopt their co-actor's body-centered frame of reference in these complex scenarios, due to the complex spatial transformations required to align body-centered frames of reference, or if they will adopt an allocentric frame of reference or multiple frames of reference.

Although not the primary purpose of their research, Ray et al. (2017) have provided some initial evidence that individuals can adopt their co-actor's body-centered frame of reference during the response selection and planning of a more complex sequential joint action task where co-actors did not share the same spatial alignment. Ray et al. (2017) sought to determine if individuals could represent and simulate the difficulty of their co-actor's actions and, hence, accommodate that difficulty during their response planning. The role of the initiator of the sequential joint action task was to place a dowel on a line in between two targets. The location that the initiator placed the dowel on the line determined the location from which the finisher would have to initiate their final reaching movement. Movement difficulty was manipulated based on the size of the targets (i.e., index of difficulty; Fitts, 1954) and the side of space of the target (reaching movements to contralateral space are slower and less accurate than reaching movements in ipsilateral space; Fisk and Goodale, 1985). Participants performed a joint version of the task and an individual version of the task (in which the same person was both the initiator and the finisher). The joint task was completed before and after the individual version of the task and comparisons in the performance of the initiator before and after the individual task allowed for the investigation of how firsthand motor experience would impact the response selection and planning of the initiator.

Ray et al. (2017) hypothesized that if the initiator represented and simulated the difficulty of the finisher's potential actions, then the initiator would bias the dowel placement toward the smaller target of the pair and toward targets in contralateral space. The results showed that the initiator planned their actions to accommodate the index of difficulty of the finisher's potential movements, whereas side of space only partially influenced their response planning, and only after first-hand motor experience (i.e., in the joint task that was performed after the individual task). These results could be interpreted as showing that the initiator represented and simulated the finisher's potential actions from the finisher's body-centered frame of reference. However, given that the side of space only partially influenced response planning, and only after first-hand motor experience, it still remains unclear if the initiator actually adopted their co-actor's body-centered frame of reference. The other factor that makes it difficult to determine what frames of reference were used is that the co-actors used mirror effectors (the initiator and finisher sat across from each other and the initiator used their right hand and the finisher their left hand). Therefore, side of space (contralateral/ipsilateral) was the same for both individuals, and hence, the initiator could have simulated the potential actions from their own egocentric frame of reference and not the finisher's body-centered frame of reference. Because of these issues, it remains unclear if individuals can adopt their co-actor's body-centered frame of reference during complex joint actions where co-actors do not share the same spatial alignment or if they will use an allocentric frame of reference or multiple frames of reference.

In terms of the use of multiple frames of reference, a clarification is required here. There is already some evidence that when action and task requirements are low and co-actors have a small angular disparity between them, then it appears as though multiple frames of reference can be adopted. For example, there is evidence that when individuals have to physically interact with the object that they will pass to their co-actor (e.g., Ray and Welsh, 2011; Meyer et al., 2013; Dötsch and Schubö, 2015), or synchronize the timing of imagined movements with a co-actor (Vesper et al., 2014), then individuals will use both their own body-centered frame of reference and their co-actor's body-centered frame of reference. In addition, there are findings in the joint attention literature that show that individuals might represent space from both their own body-centered frame of reference and their co-actor's body-centered frame of reference (e.g., Frischen et al., 2009; Böckler et al., 2011). However, what has not been demonstrated thus far is whether individuals will represent their co-actor's portion of the task from multiple frames of reference (e.g., environment-centered and other person body-centered). This is an important point because depending on the joint action task demands, individuals may need to anticipate both body-centered and environment-centered features of their co-actor's action so that they can facilitate the achievement of the shared goal.

In summary, it is clear that to expand our understanding of the frames of reference that are used in joint actions, research needs to be undertaken that utilizes a joint action task that: (1) has high task and action requirements, (2) has a high angular disparity between co-actors, and (3) allows individuals to facilitate their co-actor's task using either, or both, environment-centered and body-centered frames of reference. To that end, the present studies were designed to investigate the frames of reference used during a complex sequential joint action task where co-actors had a large angular disparity between them, and individuals could use information derived from multiple frames of reference to facilitate their co-actor's task. If different pieces of information can only be generated from simulations that occur from specific frames of reference, then the ability of co-actors to adopt different frames of reference is integral to co-actors reaching shared goals. For instance, allocentric frames of reference would be useful when there are multiple potential actions and the individuals need to consider the spatial relationship of those actions. In contrast, simulating actions from a co-actor's body-centered frame of reference would allow the increased difficulty of reaching movements in contralateral space (termed the "side of space" effect for this document) (Fisk and Goodale, 1985) or the preference for performing extensor over flexor movements (termed the proximity-to-body effect here) (Brown et al., 1948; Reed and Smith, 1961) to be anticipated. The current studies were not designed to test the dominance of one frame of reference over another.
Instead, the key questions of interest were whether individuals could adopt their co-actor's body-centered frame of reference during complex joint actions where co-actors do not share the same viewpoint and whether individuals could represent their co-actor's task from both allocentric and other person body-centered frames of reference during the response selection and planning of a sequential joint action.

# EXPERIMENT 1

To investigate the frames of reference adopted during sequential joint actions, individuals performed a sequential task, either alone or with a partner, which required them to move a wooden dowel as quickly and accurately as possible to one of four potential targets. During the joint version of the task, co-actors sat across from each other and the task was divided between the two individuals. The initiator was told that their co-actor (the finisher) would have to make their movement to the target from wherever they had placed the dowel on the board. Although the finisher's action is important to the task, the theoretically-relevant component of the task is the manner in which the initiator places the object for the finisher. Where the object is placed is important because the response selection and planning of the initiator can provide insight into the frames of reference used when representing and simulating the finisher's portion of the task (e.g., Ray and Welsh, 2011; Ray et al., 2017).

The targets were organized in a square and were equidistant from the center of the black board. The task was broken up into two steps. Prior to the initiation of a trial, a spatial cue (flash of an LED) indicated the potential target location for the trial with different degrees of predictability. In one block, the cue was non-predictive. Because each block consisted of 48 trials, the target was at the cued location on 12 of the 48 trials (25% valid block). In the other blocks, the cue predicted the target location with 50% (24 of the 48 trials) or 75% (36 of the 48 trials) predictability. After receiving the cue, the initiator of the sequential joint action moved a wooden dowel to anywhere they wished in the task environment, except for onto the targets themselves. The participant was told that the subsequent movement of the dowel onto the actual target needed to be initiated from wherever the dowel was placed. When the location of the dowel at the end of the first movement was recorded, one LED flashed to indicate the actual location of the target for that trial and the dowel had to be moved as quickly as possible onto the target location. Based on this design, several hypotheses and predictions were made.

The first hypothesis was that the initiator would adopt an allocentric frame of reference to integrate the cue validity information into their response planning. Based on this hypothesis, it was predicted that the dowel placement of the initiator would be influenced by both the cue validity and the spatial location of the other potential targets. If the dowel placement was influenced by the cue validity and the spatial location of the other potential targets, then the dowel placement would be closer to the cued location as cue validity increased. In contrast, if the dowel placement was not influenced by cue validity and the other potential target locations, then the dowel would not be placed closer to the cued location as cue validity increased, nor would it reflect the uncued target locations (i.e., it would be placed close to the center of the board regardless of the predictability or location of the cue).
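The direction of this prediction can be illustrated with a small numerical sketch. It is not a model proposed by the authors: it simply assumes that a placement integrating cue validity and the layout of all four targets behaves like the probability-weighted centroid of the target locations. The target geometry (a square aligned with the board, 200 mm from the center along the diagonal) and the names `TARGETS`, `weighted_centroid`, and `axis_distance_to_cue` are assumptions introduced for illustration only.

```python
import math

# Assumption: targets sit at the corners of a square aligned with the board,
# 200 mm from the board centre along the diagonal.
DIAGONAL_MM = 200.0
A = DIAGONAL_MM / math.sqrt(2)  # per-axis offset of each target from centre
TARGETS = [(A, A), (-A, A), (-A, -A), (A, -A)]  # hypothetical ordering


def weighted_centroid(cued_index, validity):
    """Probability-weighted mean of the four target locations for one cue.

    The cued target gets weight `validity`; the remaining probability is
    split evenly over the three uncued targets.
    """
    other = (1.0 - validity) / 3.0
    weights = [validity if i == cued_index else other for i in range(4)]
    x = sum(w * t[0] for w, t in zip(weights, TARGETS))
    y = sum(w * t[1] for w, t in zip(weights, TARGETS))
    return x, y


def axis_distance_to_cue(cued_index, validity):
    """Per-axis distance (mm) between the centroid and the cued target."""
    cx, cy = weighted_centroid(cued_index, validity)
    tx, ty = TARGETS[cued_index]
    return abs(tx - cx), abs(ty - cy)


if __name__ == "__main__":
    for v in (0.25, 0.50, 0.75):
        dx, dy = axis_distance_to_cue(0, v)
        print(f"validity {v:.2f}: {dx:.1f} mm from cue on X, {dy:.1f} mm on Y")
```

Under these assumptions the centroid sits at the board center for a non-predictive cue and moves toward the cued target as validity increases, which is the qualitative pattern the first hypothesis predicts. Actual placements need not follow this rule.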

The second hypothesis was that the initiator would adopt their co-actor's body-centered frame of reference during response planning, and therefore, would plan the action to accommodate the side of space effect (Fisk and Goodale, 1985) and the proximity-to-body effect (Brown et al., 1948; Reed and Smith, 1961). First, during the individual task, if the individuals planned their action from an egocentric perspective and anticipated the increased difficulty associated with movements in contralateral space, then their dowel placement should be closer to the targets in contralateral space in comparison to targets in ipsilateral space (e.g., Ray et al., 2017). In addition, if they adopted an egocentric frame of reference and anticipated the differences in initiating movements near and far from the body, then consistent with the proximity-to-body effect (Brown et al., 1948; Reed and Smith, 1961), the dowel placement should be biased toward their own body. During the joint task, if the initiator adopted their co-actor's body-centered frame of reference, then the initiator should bias the dowel placement toward their co-actor's contralateral space (their own ipsilateral space) and toward their co-actor's body (closer to the co-actor's body and farther from their own body).

The third hypothesis was that the initiator would represent and integrate information based on multiple frames of reference into the selection and planning of sequential joint actions. In the present task, the cue validity should be represented in an allocentric frame of reference while the proximity-to-body and side of space effects should arise from a body-centered frame of reference. If, during the individual or joint task, the initiator adopted multiple frames of reference and integrated spatial information from these frames of reference into the response planning, then the dowel placement would be influenced by a combination of cue validity and one or both of the body-related features (proximity-to-body, side of space). In contrast, if during the individual or joint task, the initiator did not adopt multiple frames of reference during the selection and planning of the sequential actions, then the response would be based solely on cue validity or the body-related factors (proximity-to-body, side of space), but not both.

# Methods

#### Participants

Twenty-two right-handed participants (mean age = 23.5, SD = 4; 6 males, 16 females) were recruited from the student population at the University of Toronto. One participant was unable to follow instructions and their data were not included in any of the analyses. All participants were naïve to the purpose of the study. Handedness was self-reported and all participants reported normal or corrected-to-normal vision. There were two separate experimental sessions, each lasting approximately 45–60 min, and participants were compensated \$20 for their time. Written informed consent was given by all participants. This research complied with the Declaration of Helsinki and the procedures were approved by the University of Toronto Health Sciences Research Ethics Board.

#### Experimental Set Up and Apparatus

Participants sat on opposite sides of a table (see **Figure 1**). On top of the table, there was a foam board sheet, affixed by two C-clamps, that was 102 cm wide, 76 cm long, and 1 cm thick. On the side of the table that was closest to the participant there was a 3 cm diameter circle that served as the home position (location: 51 cm from right and left edge, and 10 cm from the bottom edge, relative to the participant). The black circle was on a piece of paper that was laminated and fixed to the foam board. In the middle of the board between both participants, there was a black square sheet of poster board (46 by 46 cm) taped to the foam board. On top of the sheet of poster board, there were four white circles (6 cm diameter) made out of poster board that functioned as target locations during the task. The four targets were equidistant to each other and were 20 cm, on a diagonal path, from the midpoint of the board.

[Figure caption fragment: … circles are the four potential targets and the four red circles are spatial locations of the cues (i.e., light emitting diodes).]

To present cue and target location signals, four red LED lights were attached to the foam board. The LED lights were at the same distance from the front and back edges of the board as the target locations. The LEDs were 12 cm from the left and right edges of the foam board so that they were not blocked by the limbs of the individuals performing the task. The object that was moved to these targets was a wooden dowel (2.2 cm in diameter and 8 cm in height). To capture the position of the wooden dowel, an infrared light-emitting diode (IRED) from an active motion tracking system (Optotrak Certus) was attached to the center of one tip of the wooden dowel. The position of the IRED marker was recorded at 200 Hz. The tip of the wooden dowel was cut on a 45° angle so that the IRED could be easily seen by the motion tracking system. Two Dell speakers were used to present auditory signals. A Dell Optiplex 780 computer was used to run custom Matlab software and experimental output was displayed on a 19-inch LCD monitor. The custom Matlab software sent signals to the speakers and the red LED lights, recorded and analyzed the dowel position, performed block and trial randomizations and organized the structure of the experimental session.

#### Design and Procedure

Participants performed individual and joint versions of the task on separate days. The order of the sessions was counterbalanced across participants. The experimental sessions were performed 1–2 days apart.

In the joint version of the task, the participant performed the task with a confederate (a 23-year-old male who was unknown to the participants). Once the partners had been introduced and the informed consent was read and signed, verbal instructions were given. The participant and confederate were told that they were teammates and their goal was to move a wooden dowel as quickly and accurately as possible onto a target. In addition, they were told that the task was to be divided up between them and they each had very specific roles for completing the task. The participant was told that they would be the initiator and the confederate was told that they would be the finisher. The participant was told that the role assignment was random; however, the participant always performed the initiator role. The initiator was responsible for moving the wooden dowel from the home position and placing it anywhere they wanted on the black poster board sheet that contained the targets (but not on one of the actual targets). In addition, they were told that the finisher would have to make their movement to the target from wherever the initiator had placed the dowel and that, although the finisher had to move as fast as possible, there was no time constraint for the initiator's portion of the task.

The team was told that prior to the beginning of each trial, there would be a cue that indicated the potential target location and that, depending on the block, the cue validity was either random/non-predictive (25% valid) or predictive (50 or 75% valid). The cue was signaled via a flash of light from a red LED that spatially corresponded to one of the target locations. The teammates were told that the cue validity remained constant during a block and that they would be told the cue validity at the beginning of each block. The initiator performed their portion of the task following the cue presentation. Once the initiator placed the dowel on the black square, they released the dowel and then the finisher grasped the dowel and waited for one of the four red LEDs to flash and signal the actual target location.

The participants were told that there were three blocks; one block for each cue validity. The block order was counterbalanced across participants. In each block there were 48 trials. Each target location was cued 12 times per block. In the 25% cue validity block, the target appeared in the same location as the cue approximately one out of every four trials (three out of the 12 trials for that cued location). The target locations for the remaining nine trials were divided evenly between the other three target locations (three trials per location). Thus, target location was random with respect to the cue. In the 50% cue validity block, the target was at the same location as the cue on six out of 12 trials; on the remaining six trials the location of the target was divided evenly between the remaining three locations (two trials per location). For the 75% cue validity block, the target was in the same location as the cue for nine out of the 12 trials for a particular cued location. For the remaining three trials, the actual target location was divided evenly between the remaining target locations (one trial per location). The cue and target locations were chosen, according to the parameters mentioned above, via a randomization procedure using custom Matlab software.
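The allocation of cue and target locations within a block can be sketched as follows. This is a minimal Python illustration of the counts described above, not the authors' custom Matlab procedure. The function name `build_block`, the integer location indices 0 to 3, and the use of a single shuffle over the whole block are assumptions made for illustration.

```python
import random

N_LOCATIONS = 4      # four potential target locations
TRIALS_PER_CUE = 12  # each location is cued 12 times per block


def build_block(validity, rng=None):
    """Return a shuffled list of (cue, target) index pairs for one block.

    `validity` is the proportion of trials on which the target appears at
    the cued location (0.25, 0.50 or 0.75 in the experiment).
    """
    rng = rng or random.Random()
    n_valid = round(validity * TRIALS_PER_CUE)  # 3, 6 or 9 valid trials
    n_invalid_each, remainder = divmod(TRIALS_PER_CUE - n_valid,
                                       N_LOCATIONS - 1)
    if remainder:
        raise ValueError("invalid trials cannot be split evenly")

    trials = []
    for cue in range(N_LOCATIONS):
        # Valid trials: target at the cued location.
        trials += [(cue, cue)] * n_valid
        # Invalid trials: spread evenly over the three uncued locations.
        for target in range(N_LOCATIONS):
            if target != cue:
                trials += [(cue, target)] * n_invalid_each
    rng.shuffle(trials)  # assumed: simple random order within the block
    return trials


if __name__ == "__main__":
    for v in (0.25, 0.50, 0.75):
        block = build_block(v, random.Random(1))
        n_valid = sum(cue == target for cue, target in block)
        print(f"{int(v * 100)}% block: {len(block)} trials, {n_valid} valid")
```

Passing a seeded `random.Random` instance makes a given block order reproducible, which is convenient when checking the counts per cued location.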

Each trial began with the teammates sitting across from each other and the dowel in the home position in front of the initiator. Following the cue presentation, which occurred after a variable foreperiod (range of 100–1,000 ms), the initiator placed the dowel on the black sheet and the dowel location was recorded. The dowel position was displayed to the experimenter who verified that the recording procedure had worked. If there were any recording issues, then the experimenter was prompted to record another sample of the dowel position. Sample recordings were repeated until a valid dowel position was recorded for each trial. Once the location of the dowel was recorded, the finisher grasped the dowel and waited for the target location to be signaled. Following a variable foreperiod (range of 100–1,000 ms) that was determined by a randomization procedure in Matlab, the target location was signaled by a flash of light from one of the LEDs. The finisher moved the dowel as quickly and accurately as possible to the target location. Once the movement was finished, an auditory beep was presented for 50 ms to signal to the initiator that they could bring the dowel back to the home position for the next trial.

The trial procedure for the individual version of the task was similar to the joint version of the task except that the individual performed both the initiator and finisher roles. Therefore, the key difference was that after the cue presentation and dowel placement, the participant continued holding on to the dowel until the target location was signaled. Once the dowel position was recorded, the target location was signaled and then the dowel was moved as quickly and accurately as possible to the target.

#### Data Analysis

The current experiment was designed to determine if dowel placement, following the presentation of a cue, was influenced by cue validity, the side of space effect and/or the proximity-to-body effect. The main dependent measure for this study was the distance, in millimeters, that the dowel was placed relative to the cued location. The dowel position was recorded in absolute X and Y coordinates; therefore, each data point had to be transformed to a relative distance, in the X and Y coordinates, to the cued location. Separate statistical analyses were performed on the X and Y coordinates. The analysis in the X coordinate would reveal differences in ipsilateral and contralateral space, whereas the Y coordinate analysis would reveal differences in near and far space relative to the position of the participant. The relative distance to the cued location was calculated by taking the absolute position of the dowel and subtracting the distance to the center of the cued location. The cued locations were coded by their position relative to the initiator. Because the initiator used their right hand to perform the task, the cued locations on the right side of space were coded as ipsilateral and the cued locations on the left side of space were coded as contralateral. In addition, the two cued locations on the bottom row, relative to the initiator (always the participant), were coded as near locations and the two cued locations closer to the finisher were coded as far locations. Because the confederate was in the opposite side of space and used their right hand, all of the spatial coding is reversed; therefore, the initiator's ipsilateral space is the finisher's contralateral space and the initiator's near space is the finisher's far space. The data were also coded based on the cue validity (25, 50, and 75%) and task (individual vs. joint).
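The coordinate transformation and spatial coding described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' analysis code: the origin is assumed to be the center of the board, +X points to the initiator's right, +Y points away from the initiator, the targets are assumed to form a square aligned with the board (so each lies 200/√2 mm from the center on each axis), and the relative distance is taken as an unsigned per-axis difference. All names (`CUE_LOCATIONS`, `code_trial`, `recode_for_finisher`) are hypothetical.

```python
import math

# Assumed geometry: targets 200 mm from the centre along the diagonal.
A = 200.0 / math.sqrt(2)

# Hypothetical frame: origin at board centre, +X to the initiator's right,
# +Y away from the initiator (toward the finisher).
CUE_LOCATIONS = {
    "near_left": (-A, -A),
    "near_right": (A, -A),
    "far_left": (-A, A),
    "far_right": (A, A),
}


def code_trial(dowel_xy, cue_name):
    """Per-axis distance to the cued location, coded from the initiator's view."""
    cue_x, cue_y = CUE_LOCATIONS[cue_name]
    return {
        "dist_x_mm": abs(dowel_xy[0] - cue_x),  # assumed: unsigned distance
        "dist_y_mm": abs(dowel_xy[1] - cue_y),
        # The initiator used the right hand, so right = ipsilateral.
        "side": "ipsilateral" if cue_x > 0 else "contralateral",
        "proximity": "near" if cue_y < 0 else "far",
    }


def recode_for_finisher(coded):
    """Flip the spatial labels for a right-handed finisher seated opposite."""
    flip = {
        "ipsilateral": "contralateral",
        "contralateral": "ipsilateral",
        "near": "far",
        "far": "near",
    }
    out = dict(coded)
    out["side"] = flip[coded["side"]]
    out["proximity"] = flip[coded["proximity"]]
    return out


if __name__ == "__main__":
    trial = code_trial((20.0, -60.0), "near_right")
    print(trial)
    print(recode_for_finisher(trial))
```

Keeping the finisher recoding as a separate step leaves the stored data in the initiator's frame, matching the coding convention stated in the text, while making the reversed labels available for interpretation.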

# Results

#### Absolute Data

**Figure 2** provides a pictorial representation of the average placement of the dowel, in absolute coordinates for each cue validity and task condition. The dotted lines represent the midline for both the X axis and the Y axis.

### Differences in the Distances to Cued Locations Along the Y Axis (Near to Far)

The Y coordinate data were analyzed with a mixed model ANOVA with 2 (task context: Individual, Joint) × 3 (cue validity: 25, 50, 75) × 2 (side of space: Ipsilateral, Contralateral) × 2 (proximity-to-body: Near, Far) as repeated measures factors and Order (Joint task first, Individual task first) as the between-subjects factor. This analysis revealed statistically significant main effects for task context, F(1, 19) = 7.81, p = 0.012, cue validity, F(2, 38) = 46.72, p < 0.001, and proximity-to-body, F(1, 19) = 4.90, p = 0.039. The main effects for side of space, F(1, 19) = 0.24, p = 0.632 and order, F(1, 19) = 0.690, p = 0.797, were not statistically significant. The full ANOVA table is presented in Appendix A.

Post-hoc testing of effects with 3 or more means was completed using the Bonferroni (Dunn's test) correction. The task context analysis revealed that, overall, when participants performed the individual task (M = 72.5 mm, SD = 29.8) they placed the dowel closer to the cued location in comparison to when they performed the task with a partner (M = 89.8 mm, SD = 19.1). The cue validity analysis (see **Figure 3**) showed that the dowel was placed significantly closer to the cued location when the cue was 75% valid (M = 53.3 mm, SD = 25.0) than when the cue was 50% valid (M = 75.4 mm, SD = 28.4), t(20) = 4.47, p < 0.001, and 25% valid (M = 114.6 mm, SD = 26.7), t(20) = 8.35, p < 0.001. In addition, the dowel was placed significantly closer to the 50% cue in comparison to the 25% cue, t(20) = 5.94, p < 0.001. The proximity-to-body analysis revealed that the dowel was placed closer to the cued location when the cue was in Near space (M = 75.6 mm, SD = 28) in comparison to when the cue was in Far space (M = 86.6 mm, SD = 17.2). Taken together, these results show that the dowel was placed closer to the cued location along the Y axis during the individual task. In addition, when the cued location was in the Near space, relative to the initiator's body, the dowel was placed closer to the cued location in comparison to when the cued location was in Far space. Lastly, this pattern of effects also shows that as cue validity increased the dowel was placed closer to the cued location.
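For readers unfamiliar with the correction, the Bonferroni adjustment for a family of pairwise comparisons (three comparisons for the three cue validity levels) can be sketched as below. The function name `bonferroni` and the example p-values are illustrative assumptions. They are not values from this study, which reports only p < 0.001 for these contrasts.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni-adjust a family of p-values.

    Each raw p-value is multiplied by the number of comparisons (capped
    at 1.0) and then compared against the family-wise alpha level.
    Returns a list of (adjusted_p, is_significant) tuples.
    """
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)
        results.append((adjusted, adjusted < alpha))
    return results


if __name__ == "__main__":
    # Made-up raw p-values for three pairwise contrasts (illustration only).
    example = [0.010, 0.020, 0.040]
    for raw, (adj, sig) in zip(example, bonferroni(example)):
        print(f"raw p = {raw:.3f}, adjusted p = {adj:.3f}, significant = {sig}")
```

Multiplying each p-value by the number of comparisons is equivalent to testing each raw p-value against alpha divided by the number of comparisons.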

The ANOVA also showed a significant interaction between task context and proximity-to-body, F(1, 19) = 19.46, p < 0.001 (see **Figure 4**). Post-hoc analysis revealed that when the cue was presented in Near space (to the initiator), the dowel was placed closer to the cued location along the Y axis in the Individual task (M = 56.8 mm, SD = 45.0) in comparison to the Joint task (M = 94.5 mm, SD = 25.3), t(20) = 3.78, p = 0.001. In contrast, when the cue was in Far space (relative to the initiator) there was no significant difference in the distance that the dowel was placed between the Individual task (M = 88.2 mm, SD = 21.4) and the Joint task (M = 85.1 mm, SD = 18.3), t(20) = 0.82, p = 0.442. To determine how the dowel placement in Near and Far space varied as a function of task context, additional post-hoc testing was performed. That analysis showed that during the Individual task the dowel was placed closer to cued locations in Near space (M = 56.8 mm, SD = 45.0) in comparison to Far space (M = 88.2 mm, SD = 21.4), t(20) = 3.87, p = 0.001. In contrast, the analysis of the Joint task did not show a statistically significant difference between Near space (M = 94.5 mm, SD = 25.3) and Far space (M = 85.1 mm, SD = 18.3), t(20) = 1.97, p = 0.063. Overall, this pattern of effects reveals that in the individual task the placement of the dowel was clearly biased toward the initiator's body. In contrast, during the joint task the dowel placement was biased toward the finisher's body in Near space but not Far space.

Finally, the analysis also revealed an interaction between proximity to the initiator's body and side of space, F(1, 19) = 22.62, p < 0.001. In ipsilateral space, the dowel was placed closer to the cued location when it was in Near space (M = 71.3 mm, SD = 30.7) in comparison to when it was in Far space (M = 90.3 mm, SD = 17.7), t(20) = 3.54, p = 0.002. There was no significant difference between Near space (M = 79.9 mm, SD = 27.2) and Far space (M = 83.2 mm, SD = 18.2), t(20) = 0.66, p = 0.51, in the contralateral side of space relative to the initiator. This interaction reveals that the dowel was placed closer to the initiator's body in ipsilateral space in comparison to contralateral space. Note that the three-way interaction between task context, proximity-to-body, and side of space was not statistically significant, F(1, 19) = 4.00, p = 0.060, indicating that task context did not influence the interaction between proximity-to-body and side of space in dowel placements.

#### Differences in the Distances to Cued Locations Along the X Axis (Left to Right)

To determine what factors influenced dowel placement along the X axis, a 2 (task context: Individual, Joint) x 3 (cue validity: 25, 50, 75) x 2 (side of space: Ipsilateral, Contralateral) x 2 (proximity-to-body: Near, Far) repeated measures mixed model ANOVA with Order (Joint task first, Individual task first) as the between-subjects factor was conducted. There were statistically significant main effects for cue validity, F(2, 38) = 49.65, p < 0.001, and side of space, F(1, 19) = 32.62, p < 0.001. In contrast, the main effects for proximity-to-body, F(1, 19) = 2.17, p = 0.157, task context, F(1, 19) = 0.47, p = 0.501, and order, F(1, 19) = 0.52, p = 0.481, did not reach statistical significance. The full ANOVA table is presented in Appendix B.

Post-hoc testing, using Bonferroni's adjustment, was completed to determine differences in the levels of the main effects. The cue validity analysis (see **Figure 5**) showed that the dowel was placed significantly closer to the cued location when the cue was 75% valid (M = 59.0 mm, SD = 27.5) than when the cue was 50% valid (M = 84.0 mm, SD = 29.0), t(20) = 5.48, p < 0.001, and 25% valid (M = 120.9 mm, SD = 29.0), t(20) = 8.17, p < 0.001. In addition, the dowel was placed significantly closer to the cue when the cue was 50% valid (M = 84.0 mm, SD = 29.0) in comparison to the 25% valid cue (M = 120.9 mm, SD = 29.0), t(20) = 5.34, p < 0.001. The side of space analysis revealed that the dowel was placed closer to the cue when the cue was in contralateral space relative to the initiator's body (M = 73.7 mm, SD = 23.9) than when the cue was in ipsilateral space (M = 102.6 mm, SD = 27.3), t(20) = 5.54, p < 0.001. Overall, these results for cue validity indicate that the dowel was placed closer to the cued location as cue validity increased. In addition, when the cued location was in contralateral space (relative to the initiator), the dowel was placed closer to the cued location than when the cued location was in ipsilateral space.

For additional results, please see Appendix C.

FIGURE 4 | Experiment 1: Mean distance of the dowel to the cued location in Near and Far space during the Individual and Joint tasks. Error bars represent the standard error of the mean. The \* indicates a statistically significant difference.

FIGURE 5 | Experiment 1: The mean distance that the dowel was placed from the cued location along the X axis (left and right space) for each cue validity condition. Error bars represent the standard error of the mean. The \* indicates a statistically significant difference.

## Discussion

Experiment 1 was designed to test if: (1) the initiator of a sequential joint action adopted the finisher's body-centered frame of reference, even though there was a high angular disparity between them, and (2) the initiator adopted multiple frames of reference during the anticipation of the finisher's action, and hence, planned an action that accommodated multiple action features that are represented in different frames of reference. The following sections address these hypotheses and findings.

The first hypothesis was that an allocentric frame of reference would be adopted to utilize the cue validity information during response planning. The prediction was that the dowel would be placed, by the initiator, closer to the cued location as cue validity increased while simultaneously minimizing the distance to the other potential target locations. The results were congruent with the use of an allocentric frame of reference during the response selection and planning of the initiator. Specifically, the dowel was placed closer to the cued location as the cue validity increased and when the cue validity was low the dowel was placed in a location that was a similar distance to all the other potential actions. This finding builds on previous research by showing that when critical environment-related information which can be used to facilitate the achievement of the shared goal is available, individuals can use an allocentric frame of reference during the anticipation of a co-actor's action. The following section will explore whether individuals were able to adopt their co-actor's body-centered frame of reference.

In certain joint action contexts, if individuals adopted their co-actor's body-centered frame of reference, then they might be able to anticipate and accommodate action features that are body-based during response planning (e.g., posture). Therefore, an additional hypothesis was that the finisher's body-centered frame of reference would be adopted by the initiator when they anticipated the finisher's potential actions, even though there was a large angular disparity between the co-actors, and hence, body-related action features (side of space effect, proximity-to-body effect) would be integrated into the response planning of the initiator. There were two results that helped to elucidate whether individuals adopted their co-actor's body-centered frame of reference.

The first result, based on the Y axis (near and far from the initiator) analysis, showed that the proximity-to-body effect was modulated by both task context (Individual, Joint) and proximity to the initiator's body (Near, Far). In the Near space condition (relative to the initiator's body), the dowel was placed closer to the cued location (along the Y axis) in the individual task in comparison to the joint task. Secondly, in the individual task the dowel was placed closer to the cued location in Near space and farther from the cue in Far space. Taken together, this pattern of effects indicates that in the individual task the dowel was biased toward the initiator's body in both Near and Far space (i.e., consistent with the proximity-to-body effect: Brown et al., 1948; Reed and Smith, 1961). Therefore, this finding indicates that a body-centered frame of reference was used in the individual task because the dowel was placed closer to the more difficult near targets (relative to the initiator's body) requiring flexor movements than to the far targets requiring extensor movements (see Augustyn and Rosenbaum, 2005; Ray et al., 2017 for evidence of similar biasing in a Fitts' Law task). In contrast, it remains unclear to what degree the initiator adopted the finisher's body-centered frame of reference because the dowel was only biased toward the finisher's body in Near space (far from the finisher's body) but not Far space (near to the finisher's body). If the initiator had adopted the finisher's body-centered frame of reference during all of the trials, then the dowel should have been biased toward the finisher's body in both Near and Far space. However, the bias toward the finisher's body, in the initiator's Near space, does provide some evidence that body-related information was considered during response planning.

The second result, derived from the analysis of the X axis, showed a main effect for side of space (Ipsilateral, Contralateral), but no task context interaction. The X axis (left to right) dowel placement analysis showed that the dowel placement was biased toward the contralateral space (relative to the initiator), but not the finisher's contralateral space. The contralateral bias of the dowel placement in the individual task is consistent with the findings of Ray et al. (2017) and likely emerged because movements into contralateral space are less efficiently executed than those into ipsilateral space (Fisk and Goodale, 1985). Hence, the contralateral bias would help to equate the difficulty of the movements into each direction should the cue prove to be invalid (see also Ray et al., 2017). If the initiator had fully adopted their co-actor's body-centered frame of reference, then there should have been a side of space by task context interaction because contralateral space was the opposite side of space in the Individual and Joint tasks [i.e., a contralateral bias in the individual task and an ipsilateral bias (from the participants' perspective) in the joint task]. Such was not the case. Overall, the lack of a side of space effect and the presence of a partial proximity-to-body effect is congruent with the pattern of effects from the Frischen et al. (2009) study. In their study, the negative priming was strongest when the distracting stimulus was placed closest to their co-actor's hand. In contrast, the negative priming was stronger in the ipsilateral space of the observer and not their co-actor's ipsilateral space. Taken together, the pattern of effects from this experiment and the Frischen et al. (2009) study are consistent with the idea that when a task is complex and the co-actors are sitting opposite to one another, individuals might not completely adopt their co-actor's body-centered frame of reference (Pezzulo et al., 2013).

An additional purpose of this study was to investigate if the initiator adopted multiple frames of reference (i.e., both allocentric and the other person's body-centered frame) when they anticipated the finisher's potential actions and integrated that anticipated information into their response planning. In the joint task, the initiator appeared to select and plan their action based on information derived from an allocentric frame of reference (cue validity) and partially based on the body-centered frame of reference of the finisher (partial proximity-to-body effect). This pattern provides tentative support for the use of multiple frames of reference during the selection and planning of a joint action. This finding is consistent with the suggestion of Pezzulo et al. (2013) that during complex joint actions multiple frames of reference might be used simultaneously. This finding potentially goes one step further than previous work that has shown that individuals can represent a joint task from both their own egocentric frame of reference and their co-actor's body-centered frame of reference during joint tasks (e.g., Frischen et al., 2009; Böckler et al., 2011; Meyer et al., 2013; Vesper et al., 2014; Dötsch and Schubö, 2015), by showing that individuals may be able to represent the task from their own egocentric frame of reference, an allocentric frame of reference and a partially adopted body-centered frame of reference of their co-actor.

There is one design issue that potentially makes it difficult to interpret the different pattern of effects in Near and Far space in the Individual and Joint tasks. One reason why the dowel might not have been biased toward the finisher's body in Far space (their co-actor's Near space) might be the size of the action space. Although the initiator could clearly reach into Far space, because they were able to move the dowel to those targets during the individual task, it may have been undesirable to make such large amplitude reaches due to the effort required. Curiously, the dowel placement was almost identical, in both the individual and joint tasks, in Far space. This finding might be an indication that the initiator was using their own body-centered frame of reference for response planning in Far space (and not fully accounting for the proximity-to-body effect), or it might indicate that the response planning was influenced by the distance required to reach into Far space. Therefore, to test between these competing hypotheses an additional experiment was performed (Experiment 2). The task was identical except that the action space was reduced by half to limit the distance that the initiator would have to reach into Far space during both initiator and finisher roles.

# EXPERIMENT 2

This experiment was designed to test if a smaller action space would influence how the initiator placed the dowel for the finisher in both Near and Far space. If the dowel placement in Far space was shaped by the reaching distance and not the partial adoption of the finisher's body-centered frame of reference, then now that the action space is smaller the dowel should be biased toward the finisher's body in both Near and Far space. In contrast, if the dowel placement is due to partially adopting the finisher's body-centered frame of reference, then the dowel placement would be biased toward the finisher's body in Near space but not in Far space.

# Methods

#### Participants

Nineteen new participants (right-handed; mean age = 20.7, SD = 3.18; 6 males, 13 females) were recruited from the student population at the University of Toronto. All participants were naïve to the purpose of the study. Handedness was self-reported and all participants reported normal or corrected-to-normal vision. There were two separate experimental sessions, each lasting approximately 45 min, and participants were compensated \$15 for their time. Written informed consent was given by all participants and this research complied with the Declaration of Helsinki and the procedures were approved by the University of Toronto Health Sciences Research Ethics Board.

### Experimental Set Up and Apparatus

The experimental set up and apparatus were identical to Experiment 1 except for two features. In Experiment 1, the targets (6 cm diameter circles) were arranged in a square around the center of the black poster board and each target was 20 cm from the center of the board. In Experiment 2, each target was 10 cm from the center of the board. To maintain the same index of difficulty (Fitts, 1954) as the reaching movements in Experiment 1, the targets were reduced from 6 to 3 cm in diameter.
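The scaling argument can be checked with a short calculation. The sketch below assumes the classic Fitts (1954) formulation, ID = log2(2A/W), and uses the centre-to-target distance as the movement amplitude A; the actual reach amplitude depends on the start position, which is not restated here, so the absolute values are illustrative only. The function name `index_of_difficulty` is hypothetical.

```python
import math


def index_of_difficulty(amplitude_cm: float, width_cm: float) -> float:
    """Fitts' (1954) index of difficulty in bits: log2(2A / W)."""
    return math.log2(2 * amplitude_cm / width_cm)


# Assumption: amplitude is taken as the centre-to-target distance from the text.
id_exp1 = index_of_difficulty(amplitude_cm=20, width_cm=6)  # Experiment 1
id_exp2 = index_of_difficulty(amplitude_cm=10, width_cm=3)  # Experiment 2

print(f"Experiment 1: {id_exp1:.2f} bits")
print(f"Experiment 2: {id_exp2:.2f} bits")
```

Because the index depends only on the ratio A/W, halving both the distance and the target diameter leaves it unchanged, whatever amplitude is taken as the reference.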

### Design, Procedure, and Data Analysis

The design, procedure, and data reduction and analysis were identical to those of Experiment 1.

# Results

### Absolute Data

**Figure 6** provides a pictorial representation of the average locations of where the dowel was placed, in absolute coordinates for the different cue validity and task conditions. The dotted lines represent the midline for both the X axis and the Y axis.

### Differences in the Distances to Cued Locations Along the Y Axis (Near to Far)

To determine what factors influenced where the initiators placed the dowel, the Y coordinate data was analyzed with a 2 (task context: Individual, Joint) x 3 (cue validity: 25%, 50%, 75%) x 2 (side of space: Ipsilateral, Contralateral) x 2 (proximity-to-body: Near, Far) repeated measures mixed model ANOVA with Order (joint task first, individual task first) as the between-subjects factor. There was a statistically significant main effect for cue validity, F(2, 34) = 69.17, p < 0.001. However, the main effects for side of space, F(1, 17) = 0.22, p = 0.645, order, F(1, 17) = 0.71, p = 0.412, task context, F(1, 17) = 0.86, p = 0.366, and proximity-to-body, F(1, 17) = 4.27, p = 0.056, did not reach statistical significance. The full ANOVA table is presented in Appendix D.

Post-hoc testing, using the Bonferroni correction, on the dowel position data as a function of cue validity (see **Figure 7**) showed that participants placed the dowel significantly closer to the cued location when the cue was 75% valid (M = 28.3 mm, SD = 13.4) in comparison to when the cue was 50% valid (M = 46.4 mm, SD = 10.7), t(18) = 8.47, p < 0.001, and 25% valid (M = 60.8 mm, SD = 10.1), t(18) = 9.69, p < 0.001. In addition, the dowel was placed significantly closer to the cued location when the cue was 50% valid (M = 46.4 mm, SD = 10.7) in comparison to when the cue was 25% valid (M = 60.8 mm, SD = 10.1), t(18) = 5.29, p < 0.001. Similar to Experiment 1, these results demonstrate that as cue validity increased the dowel was placed closer to the cued location.

The task context and proximity to the initiator's body interaction did not reach conventional levels of statistical significance, F(1, 17) = 3.68, p = 0.072 (see **Figure 8**); however, planned comparisons were performed based on a priori predictions (and the results of Exp. 1). Comparisons were made between the dowel placement from the Individual and Joint tasks in both Near and Far space. In addition, separate comparisons were made between Near and Far space for the Individual task and the Joint task. Because there were four comparisons the alpha level was set at 0.013 following the Bonferroni t correction. The paired sample t-test on the dowel placement data in Near space revealed that there was not a statistically significant difference between the Individual (M = 40.8 mm, SD = 22.4) and Joint tasks (M = 54.2 mm, SD = 9.2), t(18) = 2.35, p = 0.030. Similarly, the analysis of the dowel placement in Far space revealed that there was not a statistically significant difference between the Individual (M = 46.9 mm, SD = 18.8) and Joint tasks (M = 38.3 mm, SD = 17.1), t(18) = 1.29, p = 0.213. The analysis of the dowel placement in the Individual task revealed that there was not a statistically significant difference between Near space (M = 40.8 mm, SD = 22.4) and Far space (M = 46.9 mm, SD = 18.8), t(18) = 0.80, p = 0.435. Lastly, the analysis of the dowel placement from the Joint task revealed that there was a statistically significant difference with the dowel being placed closer to the cued location in Far space (M = 38.27 mm, SD = 17.06) than in Near space (M = 54.22 mm, SD = 9.15), t(18) = 3.61, p = 0.002. Overall, this pattern of effects demonstrates that the dowel was biased toward the finisher's body in the joint task, but not toward the initiator's body in the individual task. The proximity-to-body effect in the joint condition in Near and Far space is partially consistent with the finding from Experiment 1. However, the lack of a proximity-to-body effect in the Individual task is not fully consistent with the findings of Experiment 1.
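The decision rule for the four planned comparisons can be written out as a short sketch. It assumes a familywise alpha of 0.05 (not stated explicitly in the text, but consistent with the reported per-test threshold of 0.013) and uses the p values reported above; the variable names are illustrative.

```python
# Assumption: familywise alpha of 0.05, split evenly across the planned comparisons.
FAMILYWISE_ALPHA = 0.05

# p values as reported for the four planned comparisons in Experiment 2.
p_values = {
    "Near space: Individual vs. Joint": 0.030,
    "Far space: Individual vs. Joint": 0.213,
    "Individual task: Near vs. Far": 0.435,
    "Joint task: Near vs. Far": 0.002,
}

# Bonferroni correction: divide alpha by the number of comparisons.
alpha_per_test = FAMILYWISE_ALPHA / len(p_values)  # 0.0125, reported as 0.013

significant = [label for label, p in p_values.items() if p < alpha_per_test]

for label, p in p_values.items():
    verdict = "significant" if p < alpha_per_test else "not significant"
    print(f"{label}: p = {p:.3f} -> {verdict}")
```

Under this threshold only the Joint task Near vs. Far comparison is significant; the Near space Individual vs. Joint comparison (p = 0.030) would pass an uncorrected 0.05 criterion but not the corrected one.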

For additional results please see Appendix E.

#### Differences in the Distances to Cued Locations Along the X Axis (Left to Right)

To determine what factors influenced where the dowel was placed during the planning of a sequential action, the X coordinate data was analyzed with a 2 (task context: Individual, Joint) x 3 (cue validity: 25%, 50%, 75%) x 2 (side of space: Ipsilateral, Contralateral) x 2 (proximity-to-body: Near, Far) repeated measures mixed model ANOVA with Order (Individual task first, Joint task first) as the between-subjects factor. There were statistically significant main effects for cue validity, F(2, 34) = 49.58, p < 0.001, and side of space, F(1, 17) = 126.21, p < 0.001. The main effects for order, F(1, 17) = 0.12, p = 0.732, task context, F(1, 17) = 1.34, p = 0.263, and proximity-to-body, F(1, 17) = 0.02, p = 0.965, did not reach statistical significance. The full ANOVA table is presented in Appendix F.

All post-hoc testing was completed using Bonferroni's t (Dunn's test) correction based on the number of comparisons. The cue validity analysis (see **Figure 9**) showed that participants placed the dowel closer to the cued location when the cue was 75% valid (M = 27.0 mm, SD = 18.6) in comparison to both 50% valid (M = 42.3 mm, SD = 17.6), t(18) = 7.09, p < 0.001, and 25% valid cues (M = 59.0 mm, SD = 13.7), t(18) = 8.18, p < 0.001. In addition, there was a significant difference in the dowel placement between the 50% valid cue condition (M = 42.3 mm, SD = 17.6) and the 25% valid cue condition (M = 59.0 mm, SD = 13.7), t(18) = 4.31, p < 0.001. These results demonstrate that as cue validity increased the dowel was placed closer to the cued location.

The side of space analysis revealed that the dowel was placed closer to the cued location when the cued location was in contralateral space (M = 27.78 mm, SD = 12.30), relative to the initiator, than in ipsilateral space (M = 57.7 mm, SD = 18.3), t(18) = 11.37, p < 0.001. Again, there was no interaction between task context and side of space, F(1, 17) = 2.34, p = 0.145, indicating that participants demonstrated a similar contralateral bias in dowel placement from their own perspective and did not adapt to the contralateral space perspective of the partner.

For additional results please see Appendix G.

# Discussion

The main purpose of Experiment 2 was to determine if the size of the action space influenced how the dowel was placed in Near and Far space. Therefore, the tasks from Experiment 1 were performed in a smaller action space. The key finding from Experiment 2 was that the dowel was biased toward the finisher's body in both Near and Far space. This finding is in contrast to Experiment 1 where the proximity-to-body effect was only seen in Near space. Therefore, the data indicate that the size of the action space did influence the dowel placement in Far space in Experiment 1. Experiment 2 also replicated the finding that cue validity influenced where the dowel was placed. Lastly, the dowel placement was only consistent with the side of space effect in the Individual task. These findings are discussed in greater detail in the General Discussion.

# GENERAL DISCUSSION

The main purpose of the present work was to further our understanding of the frames of reference that are used to anticipate a co-actor's potential action, and hence, what environment and/or body-centered information can be integrated into the selection and planning of actions that facilitate the achievement of shared goals. To build on previous literature, we investigated if individuals could adopt their co-actor's body-centered frame of reference, even though they had a large angular disparity between them and they were performing a joint task that had high action requirements. In addition, because individuals in our task could facilitate their co-actor's task using either (or both) environment and body-centered information, we went one step further than previous research and investigated if individuals would use multiple frames of reference during the anticipation of their co-actor's potential actions, so that they could accommodate both environment- and body-centered information during their response selection and planning. The following sections discuss the findings and the implications of this work in detail.

The results of both experiments provided evidence that the individuals represented their co-actor's portion of the task from an allocentric frame of reference. Specifically, when the cue validity was 25% the initiator placed the dowel close to the center of the board, and therefore, a similar distance would be required to reach to any of the four target locations. In addition, even as the cue validity increased and the dowel was placed closer to the cued location, the dowel was still placed in a location that minimized the distance to the other three targets. Because the initiator adopted an allocentric frame of reference, they were able to facilitate the finisher's task by reducing the impact of incorrectly anticipating the future location of a response. This finding builds on previous research that has shown that individuals will adopt their co-actor's body-centered frame of reference (e.g., Meyer et al., 2013) during the selection and planning of sequential joint actions by showing that individuals will adopt an allocentric frame of reference when they can facilitate the achievement of the shared goal based on the spatial relationship of targets, objects, and people in the shared environment.
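The trade-off between approaching the cued target and staying within reach of the others can be illustrated with a toy calculation. This is not the analysis reported here and does not attempt to reproduce the observed means. It assumes four targets on the corners of a square, each 20 cm from the centre (the Experiment 1 layout), that invalid cues are split equally among the three uncued targets, and that the cost of a placement is the expected squared distance to the eventual target. Under that cost the best placement is simply the probability-weighted mean of the target positions. All function and variable names are hypothetical.

```python
import math

TARGET_RADIUS_MM = 200.0  # centre-to-target distance in Experiment 1 (20 cm)


def square_targets(radius_mm):
    """Four targets on the corners of a square, each radius_mm from the centre."""
    c = radius_mm / math.sqrt(2)
    return [(c, c), (-c, c), (-c, -c), (c, -c)]


def least_squares_placement(targets, cued_index, validity):
    """Placement minimizing the expected squared distance to the eventual target.

    Assumption: uncued targets share the remaining probability equally.
    The minimizer is the probability-weighted mean of the target positions.
    """
    n = len(targets)
    weights = [(1 - validity) / (n - 1)] * n
    weights[cued_index] = validity
    x = sum(w * t[0] for w, t in zip(weights, targets))
    y = sum(w * t[1] for w, t in zip(weights, targets))
    return x, y


targets = square_targets(TARGET_RADIUS_MM)
cued = targets[0]
gaps = {}
for validity in (0.25, 0.50, 0.75):
    x, y = least_squares_placement(targets, cued_index=0, validity=validity)
    gaps[validity] = math.hypot(x - cued[0], y - cued[1])
    print(f"validity {validity:.2f}: placement ({x:.1f}, {y:.1f}) mm, "
          f"distance to cued target {gaps[validity]:.1f} mm")
```

In this sketch a 25% cue leaves the placement at the centre of the board, and the placement moves toward the cued target as validity increases without ever reaching it, which mirrors the qualitative pattern described above. A different cost function would give different placements, so the numbers should not be read as predictions.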

In terms of whether the initiator fully adopted the finisher's body-centered frame of reference, during a joint action task that had demanding action requirements, the results from Experiments 1 and 2 demonstrated that the initiator partially adopted the body-centered frame of reference of the finisher. The reason for stating that the initiator only partially adopted the finisher's body-centered frame of reference is that the initiator planned actions that accommodated the distance of the dowel to the finisher's body (consistent with the proximity-to-body effect; Brown et al., 1948; Reed and Smith, 1961), but did not accommodate the increased difficulty of moving to targets in the finisher's contralateral space (which would have been consistent with the side of space effect; Fisk and Goodale, 1985; also Ray et al., 2017). If the initiator had fully adopted the body-centered frame of reference of the finisher, then the dowel placement should have been biased toward ipsilateral space (which is actually the finisher's contralateral space) in the Joint task. Instead, the results showed that the dowel was placed closer to the initiator's contralateral space (the finisher's ipsilateral space) in both the individual and joint contexts. To the best of our knowledge, the existing literature on the frames of reference used to anticipate a co-actor's action, during tasks that have lower action requirements and a smaller angular disparity between them, consistently shows that individuals are able to fully adopt their co-actor's body-centered frame of reference to plan actions that facilitate the comfort of their co-actor (e.g., Gonzalez et al., 2011; Ray and Welsh, 2011; Meyer et al., 2013; Dötsch and Schubö, 2015; Scharoun et al., 2016) or synchronize the timing of imagined movements (Vesper et al., 2014).
Therefore, we present a novel finding by providing evidence that during more complex joint actions, where co-actors have a high angular disparity between them, individuals did not fully adopt their co-actor's body-centered frame of reference.

The presence of a proximity-to-body effect, but not a side of space effect, during the Joint task, could be based on a number of factors. First, it may be that the initiator never anticipated the differences in difficulty of moving into different sides of space, and hence, did not plan their actions to accommodate the difficulty of moving into contralateral space. Secondly, the initiator may have anticipated the differences in difficulty but simply chose not to integrate this information into their response planning. Neither of these explanations seems likely given that in the individual task the response planning was influenced by the increased difficulty of moving into contralateral space (i.e., the dowel was biased toward the targets located in contralateral space). A more likely explanation is that due to the increased cognitive effort required to fully adopt the finisher's body-centered frame of reference the initiator only partially adopted their co-actor's body-centered frame of reference. According to Pezzulo et al. (2013), when co-actors have a high angular disparity between them (>60–90°) and actions need to be coded based on origin-dependent spatial information (e.g., laterality information), individuals need to undergo effortful spatial transformations to align opposing egocentric frames of reference. Although both the proximity-to-body effect (which is based on the spatial relationship to the body) and the side of space effect (which is based on the spatial relationship for both the hand and body) would be based on origin-dependent spatial information, the coding for side of space would be more complex because both the hand and body need to be considered.
For example, side of space can be coded based on laterality (i.e., left or right side of the body, left or right hand) and the side of space of the effector relative to the midline of the body (i.e., contralateral or ipsilateral), whereas coding Near and Far space is only based on the distance to the body. Therefore, based on the pattern of effects in the present study, it appears that individuals might anticipate actions that originate from the body-centered frame of reference of a co-actor but that not all the spatial coding will be represented and simulated from their co-actor's body-centered frame of reference.

The lack of a side of space effect is somewhat congruent with a study by Ray et al. (2017) that showed that, in the first phase of the study, individuals did not plan actions to accommodate the difficulty of their co-actor's actions based on the side of space effect. However, after first-hand motor experience the dowel was biased toward targets in contralateral space. There are two variables from that study that might highlight factors that affect whether co-actors fully adopt each other's body-centered frame of reference. First, co-actors used mirror effectors (i.e., the initiator used the right hand while the partner in the finisher role used their left hand), and therefore this arrangement likely reduced (if not completely obviated) the need to fully align egocentric frames of reference. Secondly, the individual and joint tasks were performed in the same session and that design might have aided in the transfer of the response planning strategy from the first-hand motor experience to the joint task. In contrast, in the present study the individual and joint tasks were not performed on the same day (note also that, unlike in Ray et al., 2017, there were no statistically significant or theoretically relevant effects of order). The interpretations that spatial alignment (mirrored vs. opposite orientation) and learning affect the frames of reference adopted are both consistent with the Pezzulo et al. (2013) shared space framework, which suggests that learning may be necessary to form complex shared spatial representations and that alignment is one factor that will modulate what type of frame of reference is used.

An additional purpose of the present studies was to determine if individuals represented their co-actor's task using both environment and body-centered frames of reference during complex joint actions where co-actors had a high angular disparity between them. Previous research has shown that individuals can represent a joint task from their own body-centered frame of reference and their co-actor's body-centered frame of reference (e.g., Frischen et al., 2009; Böckler et al., 2011; Ray and Welsh, 2011; Meyer et al., 2013; Vesper et al., 2014; Dötsch and Schubö, 2015; Ray et al., 2017); however, there is no evidence regarding whether or not individuals represented their co-actor's task from multiple frames of reference. In addition, because previous research had not explicitly tested if individuals could represent their co-actor's task from both environment-centered and body-centered frames of reference, the tasks were not designed in such a way that individuals could facilitate their co-actor's task using either, or both, environment-centered or body-centered information. Consistent with our hypothesis and the framework of Pezzulo et al. (2013), our results provide novel evidence that the initiator represented the finisher's portion of the task using multiple frames of reference during the anticipation of their potential actions. Specifically, the initiator's response planning was influenced by the distance to the finisher's body (finisher's body-centered frame of reference), cue validity and the spatial relationship between targets (allocentric frame of reference), and the side of space of the initiator (initiator's own egocentric frame of reference). Evidence that the initiator adopted both the finisher's body-centered frame of reference and an allocentric frame of reference has already been discussed; therefore, the use of an egocentric frame of reference will be discussed next.

The conclusion that the initiator's response planning was also based on their own egocentric frame of reference is based on the finding that the dowel was placed closer to targets in the initiator's contralateral space, consistent with the side of space effect, in both the individual and joint conditions. Based on this result, it would appear that action codes that concerned side of space were origin dependent on the initiator's body and hands. As previously mentioned, Pezzulo et al. (2013) have suggested that the most complex and effortful spatial transformations would occur when co-actors have completely opposite spatial orientations to each other (180◦ angular disparity) and the task requires origin dependent spatial information. Therefore, due to the difficulty and effort required to completely adopt the finisher's body-centered frame of reference the initiator may have defaulted to coding certain aspects of their task to their own egocentric frame of reference.

In addition to cognitive effort influencing the frames of reference used during joint actions, the present experiments also demonstrate that physical effort can modulate how co-actors select and plan joint actions. Experiment 2 was conducted to investigate if the size of the action space, and hence the amplitude of reaching movements, influenced the dowel placement in the finisher's near space. The results of Experiment 2 did show that when the action space was reduced the proximity-to-body effect was observed in the finisher's near space. Given that the only variable that changed between Experiments 1 and 2 was the size of the action space, it seems reasonable to suggest that the lack of a proximity-to-body effect in the finisher's near space was due to effort and not due to an inability to adopt their co-actor's body-centered frame of reference. This explanation is also congruent with previous joint action research that has shown that physical effort modulates decision making in joint actions (Santamaria and Rosenbaum, 2011). The effect of effort on joint actions is clearly a topic that requires further research.

# CONCLUSION

The present research has revealed that during joint actions individuals will use multiple frames of reference to anticipate their co-actor's task and integrate information from those different frames of reference into their response planning. The finding that individuals can represent their co-actor's task from multiple frames of reference is an important contribution to the joint action literature because it provides a potential mechanism for how individuals can represent both environment- and body-related factors that need to be considered during response selection and planning. In addition, our research shows that there are limitations in a person's ability to fully adopt their co-actor's body-centered frame of reference. Although previous sequential joint action research has shown that individuals fully adopted their co-actor's body-centered frame of reference and planned actions to facilitate the use of particular postures when manipulating objects (Gonzalez et al., 2011; Ray and Welsh, 2011; Meyer et al., 2013; Dötsch and Schubö, 2015; Constable et al., 2016; Scharoun et al., 2016), the present study builds on that literature by showing that when co-actors have a large angular disparity between them and they are anticipating a complex action (one that can be facilitated based on numerous response features), they might not fully adopt their co-actor's body-centered frame of reference. Instead, this study shows that when a co-actor's task can be accommodated based on a number of action features, multiple frames of reference can be used. Although the exact underlying mechanisms that support the adoption of multiple frames of reference are still unclear and beyond the scope of this paper, the present results demonstrate that when individuals are planning joint actions to accommodate aspects of a co-actor's task, they consider multiple action features based on multiple frames of reference. Future research should investigate how modulations in task complexity impact the frames of reference used during joint actions and what learning experiences are required to fully adopt a co-actor's body-centered frame of reference.

# AUTHOR CONTRIBUTIONS

MR was the primary contributor to the design, data collection and analysis, and writing while TW contributed to the design, analysis and writing.

# FUNDING

Funding for this study was provided by an NSERC grant to TW.

# ACKNOWLEDGMENTS

The authors would like to thank all the participants and the confederate for their time and effort in completing the study.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.00542/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Ray and Welsh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Response Coordination Emerges in Cooperative but Not Competitive Joint Task

Francesca Ciardo<sup>1</sup>\* and Agnieszka Wykowska<sup>1,2</sup>

<sup>1</sup> Istituto Italiano di Tecnologia, Genoa, Italy, <sup>2</sup> Luleå University of Technology, Luleå, Sweden

Effective social interactions rely on humans' ability to attune to others within social contexts. Recently, it has been proposed that the emergence of shared representations, as indexed by the Joint Simon effect (JSE), might result from interpersonal coordination (Malone et al., 2014). The present study aimed at examining interpersonal coordination in cooperative and competitive joint tasks. To this end, in two experiments we investigated response coordination, as reflected in instantaneous cross-correlation, when co-agents cooperate (Experiment 1) or compete against each other (Experiment 2). In both experiments, participants performed a go/no-go Simon task alone and together with another agent in two consecutive sessions. In line with previous studies, we found that social presence differently affected the JSE under cooperative and competitive instructions. Similarly, cooperation and competition were reflected in co-agents' response coordination. For the cooperative session (Experiment 1), results showed a higher percentage of interpersonal coordination for the joint condition, relative to when participants performed the task alone. No difference in the coordination of responses occurred between the individual and the joint conditions when co-agents were in competition (Experiment 2). Finally, results showed that interpersonal coordination between co-agents implies the emergence of the JSE. Taken together, our results suggest that shared representations seem to be a necessary, but not sufficient, condition for interpersonal coordination.

#### Keywords: response coordination, shared representations, joint Simon effect, cooperation, competition

# INTRODUCTION

As a social species, humans are skillful in attuning to others in social contexts. Several studies have shown that individual task performance can be affected by social presence. Indeed, when embedded in the social environment, we dynamically coordinate our actions with those of others in time and space (Sebanz et al., 2006; Knoblich and Sebanz, 2008). Such coordination during joint actions is supported by a complex set of mechanisms, such as shared representations, sensorimotor coordination, and goal sharing (see Vesper et al., 2017 for a review). However, since these mechanisms have mostly been investigated independently, it is still unclear how they are orchestrated in order to support efficient joint tasks.

#### Edited by:

Kerstin Dittrich, Albert-Ludwigs-Universität Freiburg, Germany

#### Reviewed by:

Thomas Dolk, University of Regensburg, Germany Kerry Marsh, University of Connecticut, United States

> \*Correspondence: Francesca Ciardo francesca.ciardo@iit.it

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 13 April 2018 Accepted: 18 September 2018 Published: 09 October 2018

#### Citation:

Ciardo F and Wykowska A (2018) Response Coordination Emerges in Cooperative but Not Competitive Joint Task. Front. Psychol. 9:1919. doi: 10.3389/fpsyg.2018.01919

# Shared Representations and the Joint Simon Effect

According to the shared representations account, joint action is based on the ability to share task representations, i.e., the ability to represent the task as shared, and to create a representation of the task that includes both our own and co-agents' actions (Sebanz et al., 2006). In recent years, researchers have investigated joint action by means of the "Joint Simon" task (Sebanz et al., 2003). In the standard Simon task, participants respond to a non-spatial feature (e.g., color or shape) of stimuli presented to the left or to the right of fixation with assigned left and right key presses. The Simon effect (SE) refers to the finding that performance is faster and more accurate when stimulus and response location spatially correspond, as compared to when they do not (Simon and Rudell, 1967; see also Proctor and Vu, 2006 for a review). The SE is absent when participants perform a go/no-go version of the task, responding only to one feature while withholding the response for the other feature, which indicates that the SE is due to the activation of automatic links between stimulus location and the corresponding response position (Tagliabue et al., 2000). Sebanz et al. (2003) showed that the SE occurs even when the Simon task is shared between two participants, i.e., when two participants perform the go/no-go task in a joint context, each one responding to one color only. The spatial compatibility effect emerging in the joint go/no-go task is known as the Joint Simon Effect (JSE)<sup>1</sup>. Sebanz et al. (2003, 2006) interpreted the JSE as an indication that when people perform complementary parts of a task together, they tend to represent the whole task and to integrate both their own and the other's action options into a shared representation, as if they were performing the standard Simon task alone, i.e., performing the task with two hands.
In the absence of such a representation, no alternative action is represented and thus no conflict between alternative responses would arise, as is the case in the individually performed go/no-go Simon task. Thus, the JSE has been considered an index of the emergence of shared representations (e.g., Sebanz et al., 2003, 2006; Knoblich and Sebanz, 2008). However, studies that systematically investigated how social presence influences individual performance suggested that when we perform a task along with another person the representations guiding joint performance might differ from representations guiding performance in the individual task (e.g., Ferraro et al., 2011; Ciardo et al., 2016). Several alternative accounts have been proposed to explain the emergence of the JSE (see Prinz, 2015 for a review), including the referential coding account (Dolk et al., 2011, 2013; Dittrich et al., 2013; see Dolk et al., 2014 for a review). The referential coding account proposes that during a joint go/no-go Simon task, the presence of any salient action event, generated by a biological or non-biological agent (Stenzel and Liepelt, 2016; Miss and Burkhart, 2018), is represented by an action event code. Given the high similarity of the two action events in the Joint Simon task (i.e., pressing a button), participants need to discriminate between internally (one's own) and externally (the other agent's) activated events. In order to solve the conflict arising from this discrimination, the differences between the two action events (i.e., the left-right location of the response) are strengthened and automatically interfere with the task-irrelevant stimulus spatial code, which generates the JSE (Dolk et al., 2014).

# Interpersonal Coordination

Sharing the context with another agent does not always require intentional representation of one's own and others' actions. Indeed, the presence of another person can interfere with our performance at a lower level, as in the case of sensorimotor timing (e.g., Schmidt and Turvey, 1994; Richardson et al., 2007). For example, during a conversation, we tend to nod at the same rhythm as the speaker. Similarly, when we walk with someone, we reciprocally adapt our gait to each other. This tendency to unintentionally adapt the timing of our movements to others is called entrainment and it seems necessary in order to be temporally coupled with others (Marsh et al., 2009). Entrainment also underlies joint action, according to the dynamic account (see Marsh et al., 2009 for a review). For instance, it has been shown that pairs of participants performing rhythmic movements (i.e., swinging a pendulum or rocking chairs) tend to become temporally correlated by adopting the same movement rate (Schmidt and Turvey, 1994; Richardson et al., 2007). Vesper et al. (2011) showed that when pairs of participants perform two independent Simon tasks at the same moment, their responses tend to be coordinated. Specifically, response variability positively correlated with asynchrony in reaction times across the two members of the pairs, suggesting that reducing response variability may represent an implicit strategy to facilitate cooperation. Similarly, other results show that coordination supports access to others' mental states and spontaneous cooperation (Semin and Cacioppo, 2008; Koehne et al., 2016). In a recent study, Malone et al. (2014) investigated the dynamic structure of reaction times (RTs) in a Joint Simon task. The authors compared the response variability structure of participants performing a go/no-go Simon task.
Two groups of participants performed the same go/no-go Simon task individually or together with another person having the complementary go/no-go assignment. Results showed that variability structure was whiter<sup>2</sup> in the individual than in the joint condition (Malone et al., 2014), indicating that when participants performed the task side by side with another person, responses were characterized by nested patterns of variability which were not due to random fluctuations (Malone et al., 2014). In line with the idea of decreasing fractal structure of RT variability, the authors reported that RTs of pairs in the joint condition were more correlated across time scales than RTs of pseudo-pairs of participants who performed the task individually. These latter results suggested that responses of co-agents during the joint Simon task were coupled and that the dynamics of co-agents' responses might be mutually constrained. In sum, the authors proposed that "dynamic processes of constraints may decouple behavior over time" (cf. p. 6 Malone et al., 2014) and may underlie the JSE instead of any form of shared or integrated representation of the task. Alternatively, it is also plausible that the emergence of shared representations, or the integration of "self" and "other" action events, may actually promote emergent temporally evolving coupling and modulate the inter-agent response dynamics.

<sup>1</sup>According to the classification of spatial compatibility effects proposed by Donders (1969), the SE typically reported in the Joint go/no-go Simon task belongs to group "c," including spatial-compatibility effects emerging in tasks in which the spatial nature of the stimulus is task-irrelevant.

<sup>2</sup>Note that when decomposing reaction time variability, it is possible to identify three types of temporal structures. White noise indicates that the temporal variability in an action sequence is generated by unsystematic or unrelated changes from trial to trial. Brown noise corresponds to a stochastic function, indicating that each subsequent action is a function of the previous action to which a random increment is added. Pink noise is a mixture of randomness and rigidity, and it is typical of interaction-dominant (complex) systems across multiple time scales. For example, it has been suggested that pink noise reflects emergent coordination between cognitive processes and behavior (e.g., Van Orden et al., 2003).

# Social Context and Shared Representations: The Case of Cooperation and Competition

Cooperation and competition are social relations that rely on opposite goal interdependency (Deutsch, 2011), and differently affect social cognitive processes, such as joint attention (e.g., Ciardo et al., 2015), sensorimotor synchronization (e.g., Fairhurst et al., 2012), and reach-to-grasp kinematics (e.g., Ciardo et al., 2017). When we cooperate with someone, our goals are positively related. In contrast, when we compete, reaching our personal goal is negatively related to others' achievement of the goal: if our competitor reaches his/her goal, then we cannot reach our goal anymore (Deutsch, 2011). Positive and negative goal interdependency between co-agents differently affects the emergence of shared representations (e.g., Ruys and Aarts, 2010; Iani et al., 2011, 2014) and self-other integration (Hommel et al., 2009; Ruissen and de Bruijn, 2016). For instance, Hommel et al. (2009) manipulated the valence of the interaction between two co-agents during a Joint Simon Task. Participants performed the task with a friendly and cooperative, or with an intimidating and competitive confederate. Results showed that the JSE occurred only for participants involved in a positive relationship, whereas the negative relationship led to a reduction of the JSE. Similarly, Iani et al. (2011, 2014) showed that when pairs of participants performed a joint Simon task, the JSE emerged only when the two co-agents were required to cooperate but not when they were in competition against each other. Under the cooperative condition, participants were told that the pair with the fastest and most accurate responses would receive a reward. This condition elicited a positive interdependence, as the success of one individual rendered the success of the other more likely. Under the competitive condition, they were told that the participant of the pair with the fastest and most accurate responses would receive a monetary reward.
Such a design indicated that by manipulating goal interdependency, it is possible to promote or inhibit the emergence of shared representations without manipulating the physical and dynamical features of the social environment and its task constraints. According to the referential coding account (Dolk et al., 2013, 2014), the lack of a JSE during competitive tasks can be explained by the fact that negative interpersonal relationships do not promote self-other integration. Thus, during competitive tasks participants do not need to discriminate between internally and externally activated action events, and they do not need to strengthen task-relevant information (i.e., left and right response location), resulting in the lack of the JSE. Results from a recent study by Ruissen and de Bruijn (2016) are in line with the self-other integration account (Dolk et al., 2014), showing that the JSE is reduced following competitive game play. According to Ruissen and de Bruijn (2016), motivation and contextual factors might affect self-other integration during the Joint Simon task by exerting different effects on attentional processes. In a cooperative situation we might be motivated to attend to our co-agent's performance, even if, as in the Joint Simon task (Ferraro et al., 2011), it actually interferes with our own performance, in order to monitor the co-agent's potential mistakes and to better adapt our internal action model. In contrast, during competitive interactions, co-agents might be focused on stabilizing their own performance and not attend to the co-agent's behavior, which results in attenuation of self-other integration (Hommel et al., 2009; Ruissen and de Bruijn, 2016).

Recently, Keller et al. (2016) proposed a model of joint action, which connects shared representation of goals and interpersonal coordination. The authors proposed that during joint action, distinct self and other internal models are maintained in order to ensure that each co-agent controls their action planning and execution. When shared representations of goals are established, self and other models work together, allowing co-agents to anticipate, attend, and adapt to each other in real time (Keller et al., 2014). The coupling of self and other models into a joint model facilitates interpersonal coordination. Thus, the emergence of shared representations of goals guides joint action by supporting the interaction between cognitive and online sensorimotor processes. Previous studies investigating how cooperation and competition affect self-other integration or shared representations used a monetary reward to manipulate cooperation and competition between co-agents (Ruys and Aarts, 2010; Iani et al., 2011). However, individual and contextual differences can shape the actual perception and experience of a monetary reward as a motivational cue (e.g., Kahneman and Tversky, 1979; see Schultz, 2006 for a review). This would explain the controversial nature of the results reported by previous studies on how competition affects the JSE (Ruys and Aarts, 2010; Iani et al., 2011). In order to minimize the effect of individual and contextual differences in the motivation to cooperate or compete, in the present study, we manipulated positive and negative interdependency between co-agents through punishment avoidance. Indeed, it has been shown that reward and punishment avoidance emerge from different learning mechanisms that rely on distinct neural circuits (e.g., Palminteri et al., 2012). Thus, by using punishment avoidance instead of reward, we aimed at testing whether previous findings showing that the JSE can be modulated by cooperative vs. competitive instructions generalize to different types of experimental manipulation.

# Aim of Study

The present study aimed at examining the relationship between interpersonal coordination and the JSE, with the JSE being taken as an index of shared representations. To this end, in two experiments we asked participants to perform a go/no-go Simon task alone or side by side with another person. In Experiment 1, we investigated response coordination when co-agents were required to cooperate, with the assumption that cooperation promotes self-other integration or the emergence of shared representations. In Experiment 2, we examined response coordination when co-agents' goals were mutually exclusive, as in competition, assuming that in this case self-other integration would be attenuated, or shared representations would not be activated.

# EXPERIMENT 1

The present experiment aimed at assessing interpersonal response coordination during joint action. To this end, we compared the coordination between RTs when participants performed a go/no-go Simon task alone or together with another person. We focused on cooperative joint actions, i.e., when the goals of two co-agents are positively related to each other. In line with previous studies, we expected a non-significant SE (i.e., no difference between corresponding and non-corresponding trials) when participants perform the task alone, and a JSE when they are required to cooperate (Hommel et al., 2009; Ruys and Aarts, 2010; Iani et al., 2011, 2014; Ruissen and de Bruijn, 2016). Regarding response coordination, to explore interpersonal coordination in the context of the JSE, we examined if RTs of co-agents were correlated (i.e., coordinated) with each other over time. We hypothesized that if shared representations or self-other integration are reflected in the dynamics of the behavior, then response coordination should be greater between the RT time-series of individuals in the joint condition, as compared to RT time-series of pseudo-pairs created using RT time-series from the two individual conditions. Specifically, a higher percentage of response coordination in RT time-series is expected for the Joint compared to the Individual condition (Malone et al., 2014).

# Materials and Methods

#### Participants

Twenty participants (11 males; 4 left-handed; Mean age: 24 ± 3.9 years) took part in the study. All participants had normal or corrected-to-normal vision and were not informed with respect to the purpose of the experiment. Participants received a reimbursement of 15€ for their participation. All gave their written informed consent before participating. Both Experiment 1 and Experiment 2 were conducted in accordance with the ethical standards laid down in the 2013 Declaration of Helsinki and were approved by the local ethical committee (Comitato Etico Regione Liguria). Sample size was defined according to previous experiments (Ruys and Aarts, 2010; Iani et al., 2011), and by an a priori power analysis indicating a sample N = 18 to detect a medium effect size [Cohen's d for repeated measures (Dz) = 0.60, alpha (one-tailed) = 0.05 and power = 0.95] for within-subjects comparisons. Participants were recruited individually from the subject database of the Italian Institute of Technology. They were paired according to the time slots in which they were available to take part in the experiment.

#### Apparatus and Stimuli

Stimulus presentation, response timing, and data collection were controlled by the E-Prime version 3 software (Psychology Software Tools, Inc.). Stimuli were red and green solid squares (2.3° × 2.3°), which were randomly presented on the left or on the right of a central white fixation cross (0.6° × 0.6°) on a black background. Responses were executed by pressing with the index finger the "z" or "-" key of a standard Italian QWERTY keyboard. Response keys were highlighted with two white circular stickers. The experiment was carried out in a dimly lit and noiseless room. Participants were seated facing a 27″ LCD screen driven by a 2.4 GHz processor computer. Viewing distance was about 60 cm.

#### Procedure

Pairs of participants performed two consecutive sessions, separated by a 5-min interval and lasting about 60 min in total. To avoid transfer of learning effects typical of spatial compatibility tasks (Ansorge and Wühr, 2009; Dittrich et al., 2012; Lugli et al., 2013), the order of the two sessions was fixed: an Individual session was followed by a Joint session (for a similar procedure see Dittrich et al., 2017). In the Individual session, each participant performed the task alone, sitting to the right or to the left of the center of the screen, with an empty chair next to him/her. Left-handed participants were always seated on the left side of the screen, in order to let them perform the task with their dominant hand. On their arrival at the lab, participants were told that they were going to perform two different experiments. The two members of the pair participated in the Individual session in parallel (i.e., at the same time), sitting in two different rooms without any possibility to see or talk to each other. Instructions for the Individual session were provided separately to each participant by the same experimenter. In the Joint session, participants were seated side by side, one to the left and one to the right of the center of the screen. Pairs of participants were instructed to cooperate in order to be the best-performing pair, in terms of both speed and accuracy. They were told that, at the end of the experiment, if they did not perform as the best couple they would receive a punishment, consisting of performing an additional task (i.e., performing the first session again). In both sessions (i.e., Individual and Joint), the experimental procedure was as follows: A trial began with the presentation of the fixation cross at the center of the screen. After 1 s, the stimulus appeared to the right or to the left of the fixation and remained visible until a response was collected, or for 800 ms. Maximum time allowed for response was 1 s after stimulus presentation. Immediately after a response was collected, or the stimulus elapsed, a black screen was presented for 1 s. In the Individual session, the no-go stimulus was presented for 800 ms and followed by a 1 s black screen before the next trial started. For both sessions, the task consisted of 16 practice trials and 384 experimental trials divided into four blocks of 96 trials each. For half of the trials, stimulus and response location corresponded (corresponding trials); for the other half, they did not correspond (non-corresponding trials). A fictional partial score was displayed at the end of each block. The score was computed as the difference between corresponding and non-corresponding trials. Participants were told that the score was computed by an algorithm based on their speed in responding, corrected by the overall percentage of correct answers. For half of the pairs, the participant sitting on the right chair pressed the right key to the red stimulus whereas the participant sitting on the left chair pressed the left key to the green stimulus. The other half was assigned the opposite stimulus–response mapping. Response, seat, and stimulus assignment to each participant was identical across the two sessions.
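The block structure described above can be sketched in a few lines. This is only an illustrative reconstruction, not the E-Prime script used in the study: the function name `make_block`, the dictionary keys, and the assumption that colors and stimulus sides are fully crossed in equal numbers within a block are all hypothetical.

```python
import random

def make_block(go_color="red", response_side="right", n_trials=96, seed=None):
    """Build one shuffled block for a single participant (hypothetical layout).

    Assumes the two colors and two stimulus sides are fully crossed with
    equal cell sizes, so half of the trials are go trials and half are
    spatially corresponding for this participant's response side.
    """
    rng = random.Random(seed)
    per_cell = n_trials // 4
    trials = []
    for color in ("red", "green"):
        for side in ("left", "right"):
            for _ in range(per_cell):
                trials.append({
                    "color": color,
                    "stimulus_side": side,
                    "go": color == go_color,                 # respond only to own color
                    "corresponding": side == response_side,  # stimulus on response side
                })
    rng.shuffle(trials)
    return trials

block = make_block(seed=1)
```

Under these assumptions, the other member of a pair would receive the complementary call, e.g. `make_block(go_color="green", response_side="left")`.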

# Data Analysis

First, we analyzed correct responses to check the JSE. Mean correct RTs were submitted to a repeated-measures analysis of variance (ANOVA) with Condition (Individual vs. Joint), and Correspondence (non-corresponding vs. corresponding) as within-subjects factors.
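As a minimal sketch of the quantity entered into this analysis, the correspondence effect for one participant and condition can be computed as the mean correct RT on non-corresponding trials minus the mean correct RT on corresponding trials. The trial format (keys `rt`, `correct`, `corresponding`) and the demo values are invented for illustration.

```python
from statistics import mean

def simon_effect(trials):
    """Mean correct RT (ms) on non-corresponding minus corresponding trials.

    `trials` is a list of dicts with hypothetical keys:
    'rt' (ms), 'correct' (bool), 'corresponding' (bool).
    """
    corr = [t["rt"] for t in trials if t["correct"] and t["corresponding"]]
    noncorr = [t["rt"] for t in trials if t["correct"] and not t["corresponding"]]
    return mean(noncorr) - mean(corr)

# Made-up demo data, not taken from the study
demo = [
    {"rt": 340, "correct": True, "corresponding": True},
    {"rt": 336, "correct": True, "corresponding": True},
    {"rt": 352, "correct": True, "corresponding": False},
    {"rt": 348, "correct": True, "corresponding": False},
    {"rt": 500, "correct": False, "corresponding": False},  # error trial, excluded
]
effect = simon_effect(demo)
```

A positive value indicates slower responses on non-corresponding trials, i.e., a (Joint) Simon effect.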

In order to evaluate whether the JSE requires time to emerge, we conducted a distributional analysis of RTs (Ratcliff, 1979). To have enough observations in each bin, we chose to divide the RT distribution into quartiles (Liepelt et al., 2011). Thus, individual correct RTs for each condition were rank ordered and divided into four bins. Mean RTs for each bin were then entered into a repeated-measures ANOVA with Condition (Individual vs. Joint), Correspondence (non-corresponding vs. corresponding), and Bin (1–4) as within-participant factors. To investigate trial-by-trial modulations (Liepelt et al., 2011; Yamaguchi et al., 2018), mean RTs were submitted to an ANOVA with Condition (Individual vs. Joint), Trial Transition (n−1 go/n go vs. n−1 no-go/n go), Trial n−1 Correspondence (non-corresponding vs. corresponding), and Trial n Correspondence (non-corresponding vs. corresponding) as within-participant factors. When necessary, comparisons were performed using paired samples t-tests. Significance thresholds were corrected for the number of comparisons (Bonferroni correction).
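The binning step of the distributional analysis can be illustrated as follows. This is a sketch under the assumption that leftover trials (when the count is not divisible by four) go to the fastest bins; the text does not specify how such remainders were handled, and `quartile_bin_means` is a hypothetical helper name.

```python
from statistics import mean

def quartile_bin_means(rts, n_bins=4):
    """Rank-order RTs and return the mean RT of each bin, fastest first.

    Requires at least `n_bins` observations. Remainder trials are assigned
    to the earliest bins (an assumption, not stated in the text).
    """
    ordered = sorted(rts)
    size, extra = divmod(len(ordered), n_bins)
    means, start = [], 0
    for b in range(n_bins):
        stop = start + size + (1 if b < extra else 0)
        means.append(mean(ordered[start:stop]))
        start = stop
    return means

# Made-up correct RTs (ms) for one participant in one condition cell
demo_rts = [310, 420, 330, 390, 350, 360, 340, 400]
bin_means = quartile_bin_means(demo_rts)
```

The function would be applied separately to each Condition × Correspondence cell, and the resulting bin means entered into the ANOVA.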

To quantify the degree of coordination between the agents, following Malone et al. (2014), we applied instantaneous cross-correlation to the RT series (Barbosa et al., 2008), which allows determining the correlation between time-series across multiple time-scales. This is done by computing correspondence between two time-series recursively and generating a time-series of how past and future samples are correlated at all points in time. This method has been applied to determine objective coordination between non-synchronous behaviors occurring at different time lags, as in articulatory coordination of two vocal tracts (Vatikiotis-Bateson et al., 2014). Subsequently, an index of response coordination was estimated as the proportion of correlated activity (i.e., the proportion of r > 0.25, see Malone et al., 2013, 2014) between the RT time-series of the two members of a pair. RT time-series were computed by arranging each participant's RTs in the order they were collected, and then subtracting from each data point the mean of the respective condition for that participant. RTs for missing and incorrect responses were substituted by the mean of RTs for the respective condition. We ran the instantaneous correlation analysis for offsets of −9 to +9 trials with a conservative (η = 0.1) non-causal filter (Barbosa et al., 2008). The offset range was chosen by reducing the interval size applied in Malone et al.'s (2014) study, in order to consider delays proportional to the lower number of trials.
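The following sketch illustrates the general logic of this index: mean-centering, correlation at each trial and offset, and the proportion of values above r = 0.25. It is a deliberately simplified stand-in and not the Barbosa et al. (2008) algorithm: it uses a one-sided (causal) exponentially weighted filter rather than the non-causal filter reported above, and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean

def center_series(rts):
    """Mean-center an RT series; None marks a missing or incorrect response.

    Missing values are replaced by the condition mean, so they become 0
    after centering.
    """
    m = mean(x for x in rts if x is not None)
    return [(x - m) if x is not None else 0.0 for x in rts]

def running_corr(x, y, eta=0.1):
    """Exponentially weighted correlation at every sample (causal version).

    Simplified assumption: one-sided filter and zero-mean inputs.
    """
    sxy = sxx = syy = 0.0
    out = []
    for a, b in zip(x, y):
        sxy = (1 - eta) * sxy + eta * a * b
        sxx = (1 - eta) * sxx + eta * a * a
        syy = (1 - eta) * syy + eta * b * b
        denom = sqrt(sxx * syy)
        out.append(sxy / denom if denom > 0 else 0.0)
    return out

def coordination_index(x, y, max_offset=9, eta=0.1, threshold=0.25):
    """Proportion of (offset, trial) cells with r above `threshold`.

    Expects two equal-length, mean-centered series.
    """
    hits = total = 0
    for k in range(-max_offset, max_offset + 1):
        if k >= 0:
            xs, ys = x[k:], y[:len(y) - k]   # x leads y by k trials
        else:
            xs, ys = x[:len(x) + k], y[-k:]  # y leads x by |k| trials
        r = running_corr(xs, ys, eta)
        hits += sum(1 for v in r if v > threshold)
        total += len(r)
    return hits / total if total else 0.0

# Made-up RT series (ms); None marks an error or a missed response
a = center_series([340, 360, None, 350, 330, 370, 345, 355, 335, 365, 350, 340])
```

For a pseudo-pair, the same function would be applied to the two members' series from the Individual session; for a real pair, to their series from the Joint session.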

Finally, a paired-samples t-test was applied to test whether the proportion of correlated response activity (i.e., the index of response coordination) within each pseudo-pair in the individual condition differed from the percentage of coupling observed for pairs in the joint condition.

# Results

### Reaction Times

Errors were 0.4 and 0.5% of the total number of trials for the Individual and Joint conditions, respectively, and were not further analyzed. Tukey's (1977) outlier thresholds were used for each condition to identify outliers in the number of erroneous trials. No participants were excluded. Mean RTs are summarized in **Table 1**. The ANOVA revealed a main effect of Correspondence, F<sub>1,19</sub> = 8.70, p = 0.008, η<sup>2</sup><sub>p</sub> = 0.31, together with a significant two-way interaction with Condition, F<sub>1,19</sub> = 15.09, p = 0.001, η<sup>2</sup><sub>p</sub> = 0.44. Pairwise comparisons showed that the difference between corresponding (M = 340 ms) and non-corresponding trials (M = 350 ms) was significant for the Joint condition only, t<sub>19</sub> = 4.11, p<sub>Bonferroni-corrected</sub> = 0.001, d = 0.92. In the Individual session no effect of correspondence was evident (M = 345 and M = 347 ms for corresponding and non-corresponding trials, respectively), t<sub>19</sub> < 1 (**Figure 1**).
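As a reading aid, the reported effect sizes can be related to the test statistics with two standard conversions: partial eta squared from F and its degrees of freedom, and Cohen's d for paired samples as t divided by the square root of the sample size. The text does not state which formulas the authors used, so this is an assumption; the reported values are nonetheless consistent with it.

```python
import math

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)

def cohens_d_paired(t_value, n):
    """Cohen's d for paired samples, computed as t / sqrt(n)."""
    return t_value / math.sqrt(n)

# Values reported for Experiment 1 (N = 20 participants).
print(round(partial_eta_squared(8.70, 1, 19), 2))   # main effect of Correspondence
print(round(partial_eta_squared(15.09, 1, 19), 2))  # Correspondence x Condition
print(round(cohens_d_paired(4.11, 20), 2))          # Joint-condition comparison
```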

#### RTs Distribution

Besides the main effect of Correspondence and its interaction with Condition already reported in the previous analysis, the ANOVA revealed a main effect of Bin, F<sub>1,19</sub> = 175.98, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.90, indicating that RTs increased across quartiles. No other main effects or interactions were significant, all ps > 0.77 (**Figure 2**).

#### Trial-by-Trial Modulation and Transition Effects

The first trial of each block, errors, and responses that were preceded by an incorrect response were discarded from the analysis (1.15 and 1.41% of the total trials in the Individual and Joint condition, respectively). The results are summarized in **Table 2**. The ANOVA showed that responses were faster in corresponding (M = 342 ms, SE = 8.74 ms) than non-corresponding (M = 348 ms, SE = 8.71 ms) trials, as indicated by the main effect of Trial n Correspondence, F<sub>1,19</sub> = 9.66, p = 0.006, η<sup>2</sup><sub>p</sub> = 0.34. As in the previous analysis, the two-way interaction between Condition and Correspondence was significant, F<sub>1,19</sub> = 8.73, p = 0.008, η<sup>2</sup><sub>p</sub> = 0.32. Pairwise comparisons showed a significant 9-ms JSE for the Joint condition only, t<sub>19</sub> = 3.98, p<sub>Bonferroni-corrected</sub> < 0.001, d = 0.89, and a non-significant 3-ms Simon effect in the Individual session, t<sub>19</sub> = 1.37, p<sub>Bonferroni-corrected</sub> = 0.187, d = 0.31. The interaction between Trial n Correspondence and Trial n−1 Correspondence was also significant, F<sub>1,19</sub> = 58.88, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.32. Post hoc comparisons showed a 15-ms effect after a corresponding n−1 trial, t<sub>19</sub> = 6.4, p<sub>Bonferroni-corrected</sub> < 0.001, d = 1.43, and a non-significant 3-ms effect after non-corresponding n−1 trials, t<sub>19</sub> = 1.11, p<sub>Bonferroni-corrected</sub> > 0.05, d = 0.25. The three-way interaction between Trial Transition, Trial n−1 Correspondence, and Trial n Correspondence was significant, F<sub>1,19</sub> = 18.49, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.49. Planned comparisons showed that trial-by-trial modulation was always present for Nogo/go transitions, with a significant 24-ms effect following a corresponding n−1 trial, t<sub>19</sub> = 5.63, p<sub>Bonferroni-corrected</sub> < 0.001, d = 1.26, and a reversed 9-ms effect following a non-corresponding n−1 trial, t<sub>19</sub> = 3.01, p<sub>Bonferroni-corrected</sub> = 0.007, d = 0.67. In contrast, no trial-by-trial modulations occurred for Go/go transitions, all ps > 0.06. No other main effects or interactions were significant, all ps > 0.14.

TABLE 1 | Experiment 1: Mean correct reaction times (and standard deviation) in ms as a function of Condition (individual vs. joint) and Correspondence (non-corresponding vs. corresponding).

TABLE 2 | Experiment 1: Mean correct reaction times (and standard deviation) in ms as a function of Trial Transition (Nogo/go, Go/go), Trial n−1 (corresponding, C vs. non-corresponding, NC), and Trial n (corresponding, C vs. non-corresponding, NC).

panel) and Joint condition (Right panel). Error bars show standard errors of the means.

The Simon effect (SE) is computed as the difference in RTs between non-corresponding and corresponding trials.
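The trial-sequence coding behind the trial-by-trial analysis can be sketched as below. This sketch assumes a hypothetical per-trial record with the keys `go`, `corresponding`, `rt`, and `correct`; the exclusion rules (first trial of a block, errors, and trials following an error) follow the description in the text, but the data layout and function name are illustrative only.

```python
from collections import defaultdict
from statistics import mean

def sequence_cell_means(block):
    """Mean RT per (transition, trial n-1 correspondence, trial n correspondence) cell.

    `block` is one participant's trials in presentation order. The first
    trial is dropped implicitly because it has no predecessor.
    """
    cells = defaultdict(list)
    for prev, cur in zip(block, block[1:]):
        # Keep only correct Go trials that follow a correct trial.
        if not cur["go"] or not cur["correct"] or not prev["correct"]:
            continue
        transition = "Go/go" if prev["go"] else "Nogo/go"
        key = (
            transition,
            "C" if prev["corresponding"] else "NC",
            "C" if cur["corresponding"] else "NC",
        )
        cells[key].append(cur["rt"])
    return {key: mean(rts) for key, rts in cells.items()}

# Hypothetical block of six trials (RTs in ms; Nogo trials have no RT).
block = [
    {"go": True, "corresponding": True, "rt": 340, "correct": True},
    {"go": True, "corresponding": False, "rt": 350, "correct": True},
    {"go": False, "corresponding": True, "rt": None, "correct": True},
    {"go": True, "corresponding": True, "rt": 330, "correct": True},
    {"go": True, "corresponding": True, "rt": 300, "correct": False},
    {"go": True, "corresponding": False, "rt": 360, "correct": True},
]
print(sequence_cell_means(block))
```

In this toy block the last two trials are excluded: one is an error and the other follows an error.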

#### Response Coordination

As mentioned above, we compared the proportion of response coordination (i.e., the proportion of correlations between the time-series higher than 0.25) within each pseudo-pair (N = 10) in the individual condition with the proportion of response coordination observed for pairs in the joint condition. Results showed higher response coordination for the Joint (15.3%) than for the Individual session (12.7%), t<sub>9</sub> = 3.44, p = 0.007, d = 1.09.

# Discussion

In Experiment 1, we examined coordination of RTs when participants performed a go/no-go Simon task alone or side by side with another person in a cooperative context. In line with previous studies, results from mean RTs showed a non-significant SE when participants performed the task alone, and a significant JSE when they performed the task side by side and were instructed to cooperate (Iani et al., 2011, 2014; see Karlinsky et al., 2017b for a meta-analysis on the magnitude of the JSE). The comparison of the distributional trends showed that response speed affected the magnitude of the SE in neither the Individual nor the Joint task. The similarity in the distributional patterns between the Joint and the Individual tasks replicates previous results reported by Liepelt et al. (2011), suggesting that the emergence of the JSE cannot be attributed to different temporal dynamics underlying the two conditions. Trial-by-trial modulations occurred in both the Individual and Joint conditions, as indicated by the lack of a significant interaction involving trial sequence (n−1/n) and Condition. As reported by previous studies (Liepelt et al., 2011; Yamaguchi et al., 2018), trial-by-trial modulations occurred when Go trials were preceded by a Nogo trial (Nogo/go transition), probably reflecting response inhibition during Nogo trials. Trial-by-trial modulations mimic the pattern typically reported in standard two-choice Simon tasks (Iani et al., 2009; Ciardo et al., 2018), with a reversed effect following a non-corresponding n−1 trial and a positive effect following a corresponding n−1 trial. These trial-by-trial modulations have been taken as evidence that the conflict experienced in a trial is accompanied by changes aimed at preventing the recurrence of the conflict in the next trial by means of enhanced processing of task-relevant information (e.g., Egner and Hirsch, 2005) or inhibition of task-irrelevant features (e.g., Ridderinkhof, 2002).
Alternatively, it has been proposed that trial-by-trial modulation may reflect binding effects (e.g., Hommel et al., 2004). Indeed, in the Simon task, sequences of two corresponding trials (C–C) and sequences of two non-corresponding trials (NC–NC) are either complete repetitions or complete changes of both stimulus position and response. In contrast, mixed sequences (C–NC or NC–C) are always partial repetitions in which either stimulus position or response repeats. Thus, the SE may be reduced following a non-corresponding trial because responses are faster for complete repetitions and alternations compared to partial repetitions (Hommel et al., 2004). The results of response coordination showed that when co-agents were instructed to cooperate, their RT time-series were more coordinated (i.e., a higher percentage of correlation) with each other over time, as compared to when they were performing the task individually. This result confirms and extends Malone et al.'s (2014) evidence for the idea that coordination of behavior is observed together with the JSE.
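The confound between correspondence sequence and feature repetition in a standard two-choice Simon task can be verified by enumeration. The sketch below assumes two stimulus positions and two responses (left, right); the labels and function name are illustrative.

```python
from itertools import product

def classify(prev, cur):
    """Classify a trial pair by whether stimulus position and response repeat."""
    position_repeats = prev[0] == cur[0]
    response_repeats = prev[1] == cur[1]
    if position_repeats and response_repeats:
        return "complete repetition"
    if not position_repeats and not response_repeats:
        return "complete change"
    return "partial repetition"

def label(trial):
    """A trial is corresponding (C) when stimulus position matches the response side."""
    return "C" if trial[0] == trial[1] else "NC"

# Each event is (stimulus position, response), both either "L" or "R".
events = list(product("LR", repeat=2))
table = {}
for prev, cur in product(events, repeat=2):
    table.setdefault(f"{label(prev)}-{label(cur)}", set()).add(classify(prev, cur))

for sequence, kinds in sorted(table.items()):
    print(sequence, sorted(kinds))
```

The enumeration shows that C-C and NC-NC sequences contain only complete repetitions or complete changes, whereas C-NC and NC-C sequences contain only partial repetitions, which is why conflict-adaptation and binding accounts make overlapping predictions.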

# EXPERIMENT 2

Experiment 1 suggested that when co-agents' goals are positively related, the correlation between co-agents' responses across time scales increases. However, it could be that co-agents' coordination reflects their adaptation in space and time to any dynamic event occurring during the task, such as another agent (human or not) acting in the same environment, independently of any positive goal relation. Thus, it is possible that the increase in response coordination reported in Experiment 1 results from the natural tendency to adapt the timing of our movements to external events (e.g., Marsh et al., 2009), rather than resulting from integration of self and other action events, or from the emergence of a shared representation. In line with this hypothesis, several results show that the JSE can occur even when no shared representation is necessary, such as when the co-agent is not physically present (e.g., Sellaro et al., 2013) or when an object performs the complementary go/no-go task (Stenzel and Liepelt, 2016). For example, recent evidence showed that the JSE emerges even when the alternative response is executed by a non-human agent (e.g., a Japanese cat, a metronome, or a wooden hand; Dolk et al., 2013; Stenzel and Liepelt, 2016). Interestingly, the JSE was larger when the external event (i.e., the non-human agent) acted in a turn-taking way with respect to the participant than when it acted in a continuous way, i.e., not task-related (Stenzel and Liepelt, 2016). To test this alternative explanation, in Experiment 2 we examined the effect of mutually exclusive goals (assuming no shared representation) on response coordination. As in Experiment 1, we compared coordination between RT time-series when participants performed a go/no-go Simon task alone or side by side with another person. However, during the Joint session participants were instructed to compete against each other.
Note that competition is a particular case of a joint task in a shared environment, in which individuals work to reach an individual goal that, in order to be reached, excludes the goal of the other co-agent. In line with previous studies showing that competition disrupts the emergence of shared representations in joint tasks (Ruys and Aarts, 2010; Iani et al., 2011, 2014) or affects the integration of self and other action events (Hommel et al., 2009; Ruissen and de Bruijn, 2016), we expected no difference in the SE between the Individual and Joint conditions. Regarding response coordination, we hypothesized that if the findings of Experiment 1 are merely due to environmental perturbations produced by dynamic events, i.e., the co-agent acting in the shared environment, then results should replicate the pattern reported in Experiment 1, with greater response coordination (i.e., a higher percentage of correlation) in the joint condition than in the individual condition. This result would speak against the idea that shared representations are the consequence of response coordination. On the contrary, if the percentage of response coordination reflects the emergence of shared representations, or the integration of self-other action events, then social presence should not modulate coordination between RT time-series across the individual and joint conditions, similarly to the standard SE. This would speak in favor of the hypothesis that interpersonal coordination yields shared representation.

# Materials and Methods

#### Participants


Twenty-six new participants (8 males; 4 left-handed; Mean age: 24 ± 2.9 years), selected as in the previous experiment, took part in Experiment 2. All participants gave their written informed consent and the study was conducted in accordance with the ethical protocol applied also in Experiment 1. Three pairs, six participants in total, were excluded from the data analysis, given the number of errors made by at least one member of the pair.

#### Apparatus, Stimuli, and Procedure

The apparatus, stimuli, and procedure were the same as in Experiment 1, with the only exception that in the Joint session, pairs of participants were instructed to compete against each other. They were told that at the end of the experiment, the worst performer of the pair would receive a punishment, i.e., s/he had to perform an additional task. Apart from the instructions, all other aspects of the experimental design were as in Experiment 1.

# Results and Discussion

#### Reaction Times

Errors were 0.6 and 1.1% of the total number of trials for the Individual and Joint conditions, respectively, and were not further analyzed. Tukey's (1977) outlier thresholds were used for each condition to identify outliers in the number of erroneous trials. These thresholds removed 1 participant from the Individual condition and 2 participants from the Joint condition. In total, 3 pairs were excluded from the analyses; thus, data analysis was run on a sample of 10 pairs (N = 20). Mean correct reaction times (RTs) were analyzed as in Experiment 1. The results are summarized in **Table 3**. The analysis revealed a main effect of Condition, F<sub>1,19</sub> = 34.82, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.65, indicating that participants responded faster in the Joint condition (M = 305 ms) than in the Individual condition (M = 339 ms). The main effect of Correspondence, F<sub>1,19</sub> = 7.50, p = 0.013, η<sup>2</sup><sub>p</sub> = 0.28, indicated faster responses for corresponding (M = 320 ms) than non-corresponding trials (M = 325 ms). However, this effect did not differ across the Joint and the Individual conditions, as indicated by the non-significant two-way interaction, F < 1.

TABLE 3 | Experiment 2: Mean correct reaction times (and standard deviation) in ms as a function of Condition (individual vs. joint) and Correspondence (non-corresponding vs. corresponding).


### RTs Distribution

Besides the main effects of Correspondence and Condition already discussed in the previous analysis, the ANOVA revealed a main effect of Bin, F<sub>1,19</sub> = 528.90, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.97, indicating that RTs increased across quartiles. The two-way interaction between Condition and Bin was significant, F<sub>1,19</sub> = 20.90, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.52. Pairwise comparisons showed that responses were faster in the Joint than in the Individual condition for all quartiles, all ps < 0.001. No other main effects or interactions were significant, all ps > 0.55.

#### Trial-by-Trial Modulation and Transition Effects

The first trial of each block, errors, and responses that were preceded by an incorrect response were discarded from the analysis (1.20 and 1.04% of the total trials in the Individual and Joint condition, respectively). The results are summarized in **Table 4**. The ANOVA showed that responses were faster in corresponding (M = 319 ms, SE = 7.31 ms) than non-corresponding (M = 324 ms, SE = 7.61 ms) trials, as indicated by the main effect of Trial n Correspondence, F<sub>1,19</sub> = 6.78, p = 0.017, η<sup>2</sup><sub>p</sub> = 0.26. As in the previous analysis, the main effect of Condition was significant, F<sub>1,19</sub> = 34.24, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.64, as was its interaction with Trial n−1 Correspondence, F<sub>1,19</sub> = 6.40, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.64. The interaction between Trial n Correspondence and Trial n−1 Correspondence was also significant, F<sub>1,19</sub> = 91.40, p = 0.020, η<sup>2</sup><sub>p</sub> = 0.25. Post hoc comparisons showed a 14-ms effect after a corresponding n−1 trial, t<sub>19</sub> = 7.04, p<sub>Bonferroni-corrected</sub> < 0.001, d = 1.58, and a reversed 5-ms effect after non-corresponding n−1 trials, t<sub>19</sub> = 2.55, p<sub>Bonferroni-corrected</sub> = 0.02, d = 0.57. The three-way interaction between Trial Transition, Trial n−1 Correspondence, and Trial n Correspondence was significant, F<sub>1,19</sub> = 54.42, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.74. Planned comparisons showed trial-by-trial modulations for Nogo/go transitions, with a significant 22-ms effect following a corresponding n−1 trial, t<sub>19</sub> = 8.64, p<sub>Bonferroni-corrected</sub> < 0.001, d = 1.93, and a reversed 11-ms effect following a non-corresponding n−1 trial, t<sub>19</sub> = 4.94, p<sub>Bonferroni-corrected</sub> < 0.001, d = 1.11. When the trial transition was Go/go, a significant 6-ms effect occurred following a corresponding n−1 trial, t<sub>19</sub> = 2.65, p<sub>Bonferroni-corrected</sub> = 0.02, d = 0.59, and a non-significant 2-ms effect following a non-corresponding n−1 trial, t<sub>19</sub> < 1. No other main effects or interactions were significant, all ps > 0.36 (**Figure 3**).

TABLE 4 | Experiment 2: Mean correct reaction times (and standard deviation) in ms as a function of Trial Transition (Nogo/go, Go/go), Trial n−1 (corresponding, C vs. non-corresponding, NC), and Trial n (corresponding, C vs. non-corresponding, NC).

The Simon effect (SE) is computed as the difference in RTs between non-corresponding and corresponding trials.

#### Response Coordination

Reaction time-series were computed and analyzed as in Experiment 1. A paired-samples t-test was applied to test whether the proportion of correlated activity between the two members of the pair (N = 10) in the individual condition differed from the proportion of response coordination observed in the joint condition. Results showed no difference in the proportion of correlated activity between the Individual (12.7%) and the Joint session (13.1%), t<sub>9</sub> < 1.

# Discussion

Experiment 2 aimed to test whether the increased response coordination reported during the joint task in Experiment 1 can be interpreted as the consequence of mere temporal coupling with external events. To this end, we compared response coordination of participants performing a go/no-go Simon task alone or in competition with another person. Results from mean RTs showed that participants were faster in the Joint condition than in the Individual condition. This result is not surprising, since participants were instructed to be the best performer in the pair in order to avoid punishment. A similar increase in response speed has been reported in a recent study investigating the role of turn-taking in the emergence of the JSE (Karlinsky et al., 2017a). Specifically, the authors reported faster RTs when the structure of the task did not require participants to alternate their own actions with those of the co-agent. Thus, it is possible that in our experiment the competitive framework affected the perception of turn-taking during the task. In line with our prediction, results from mean RTs indicated no difference in the SE (5 ms) between the Individual and the Joint condition. In line with results from Experiment 1, no difference emerged from the analysis of distributional trends across the Individual and the Joint condition. Similarly, trial-by-trial modulations occurred in both the Individual and Joint conditions. Again, trial-by-trial modulations were stronger for the Nogo/go transitions than for the Go/go transitions.

Response coordination analysis showed that the percentage of coordination between RT time-series was similar for the Individual and the Joint condition. Results of Experiment 2 suggest that the increase in response coordination reported in Experiment 1 cannot be interpreted as merely the consequence of the perturbation produced by a dynamic event in the task environment. Indeed, if this was the case, a similar pattern should have emerged in Experiment 2. On the contrary, the present experiment shows that, despite the presence of the co-agent acting in a shared environment, participants did not coordinate their responses with those of a competitor.

## Comparisons Between Experiments

#### Linear Mixed-Effects Analysis

To examine the contribution of response coordination to the JSE, we used a linear mixed-effects model analysis on mean RTs to re-analyze data from both Experiment 1 and Experiment 2. We compared Model 1, which included Condition (Individual, Joint), Correspondence (corresponding, non-corresponding), and their interaction as fixed factors, with Model 2, which additionally included coordination as a random effect. We began with a maximal random effects structure (Barr et al., 2013; Bates et al., 2015). Then, we redefined the model by including coordination as a random effect to check whether the goodness of fit significantly increased or decreased after removing the variance accounted for by the random effect of coordination. In other words, by using the percentage of correlation between co-agents' RT time-series as the random effect, we checked whether it influenced the main effects. The significance of the effects and parameters was evaluated using chi-square tests. Analyses were carried out using the package lme4 (version 1.0−5; Bates et al., 2015) available for the statistical software R (version 3.0.1, freely available at http://www.r-project.org). Results of the two models are displayed in **Table 5**.

TABLE 5 | Model comparisons for the random effect of correlated response coordination on mean RTs.
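The chi-square comparison of two nested models can be illustrated with a likelihood-ratio test. The sketch below assumes that the models differ by a single parameter (one degree of freedom), which the text does not state, and uses made-up log-likelihood values; it is not the authors' lme4 code, and the function name is hypothetical.

```python
import math

def likelihood_ratio_test_df1(loglik_reduced, loglik_full):
    """Likelihood-ratio test for nested models differing by one parameter.

    Returns the chi-square statistic and its p value. For one degree of
    freedom, the upper tail of the chi-square distribution equals
    erfc(sqrt(chi2 / 2)).
    """
    chi2 = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Made-up log-likelihoods for a model without and with the extra random effect.
chi2, p_value = likelihood_ratio_test_df1(-512.0, -506.75)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

A significant result indicates that the fuller model fits reliably better than the reduced one, which is the sense in which including response coordination "improved model fit".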

#### Experiment 1

Results showed that including the percentage of response coordination as a random effect significantly improved model fit. Then, we re-estimated the mean differences in mean RTs for Experiment 1 using Model 2. Results showed that both the main effect of Correspondence and its interaction with Condition were still significant, χ<sup>2</sup> = 19.58, p < 0.001 and χ<sup>2</sup> = 8.22, p = 0.001, respectively.

### Experiment 2

Results showed that including the percentage of response coordination as a random effect significantly improved model fit. We then re-estimated the mean differences in mean RTs for Experiment 2 using Model 2. Results showed that only the main effect of Condition was still significant, χ<sup>2</sup> = 10.50, p = 0.001, while the main effect of Correspondence did not reach significance, χ<sup>2</sup> < 1.

Results from the linear mixed models analysis indicate that in both experiments introducing the percentage of response coordination as a random effect increased the goodness of fit. In Experiment 1, after removing the variance explained by response coordination, the main effect of Correspondence survived, and so did the two-way interaction with Condition. This result suggests that the significant JSE reported in the joint condition is not fully explained by the correlation between co-agents' responses. In contrast, no main effect of Correspondence emerged in Experiment 2 when the percentage of response coordination was introduced as a random effect. Thus, it is possible that the main effect of Correspondence in Experiment 2 was a false positive (i.e., a type I error). In sum, results from the linear mixed models analysis suggest that mixed linear models make it possible to account for random effects produced by response coordination. In both experiments, including response coordination as a random effect significantly improved model fit, suggesting that accounting for random effects at the pair level helps to reduce substantial biases in analyses (Dittrich et al., 2017). However, since in Experiment 1 response coordination did not mediate the interaction between Correspondence and Condition, the JSE under cooperative instructions cannot be interpreted as the mere consequence of the perturbation produced by a dynamic event in the task environment.

### RTs Distribution

To evaluate the time course of the JSE across experiments, we performed an ANOVA with Condition (Individual vs. Joint), Correspondence (non-corresponding vs. corresponding), and Bin (1–4) as within-participant factors. In addition, we included Experiment (Exp. 1 vs. Exp. 2) as a between-subjects factor. Results are reported in the **Supplementary Material (SM-1)**. In these analyses we observed two significant three-way interactions: Bin × Condition × Experiment, F<sub>1,38</sub> = 6.07, p = 0.001, η<sup>2</sup><sub>p</sub> = 0.14, and Experiment × Condition × Correspondence, F<sub>1,38</sub> = 6.66, p = 0.014, η<sup>2</sup><sub>p</sub> = 0.15. To explore the three-way interactions in more detail, we performed two separate ANOVAs for the Individual and Joint conditions, including Correspondence (non-corresponding vs. corresponding) and Bin (1–4) as within-participant factors, and Experiment (Exp. 1 vs. Exp. 2) as a between-subjects factor.

Individual condition. The ANOVA revealed a main effect of Bin, F<sub>1,38</sub> = 452.35, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.92, and a main effect of Correspondence, F<sub>1,38</sub> = 4.86, p = 0.034, η<sup>2</sup><sub>p</sub> = 0.11. Neither the main effect of Experiment nor any interaction with Experiment was significant, all Fs < 1.

Joint condition. The ANOVA revealed a main effect of Bin, F<sub>1,38</sub> = 269.29, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.88, and a main effect of Correspondence, F<sub>1,38</sub> = 21.26, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.36. The main effect of Experiment was significant, F<sub>1,38</sub> = 12.71, p = 0.001, η<sup>2</sup><sub>p</sub> = 0.25, indicating that participants performed faster under competitive (M = 305 ms, SE = 7.84 ms) than cooperative instructions (M = 344 ms, SE = 7.84 ms). No significant interactions with Experiment were found, all ps > 0.08.

#### Trial-by-Trial Modulation and Transition Effects

To compare trial-by-trial modulations across experiments, we performed an ANOVA with Condition (Individual vs. Joint), Trial Transition (n−1 Go/n go vs. n−1 Nogo/n go), Trial n−1 Correspondence (non-corresponding vs. corresponding), and Trial n Correspondence (non-corresponding vs. corresponding) as within-participant factors. For this analysis too, Experiment (Exp. 1 vs. Exp. 2) was included as a between-subjects factor. Results are reported in **Supplementary Material (SM-2)**. In these analyses we observed a marginally significant Condition × Trial n Correspondence × Experiment interaction, F<sub>1,38</sub> = 4.21, p = 0.047, η<sup>2</sup><sub>p</sub> = 0.10. To explore the three-way interaction in more detail, we performed two separate ANOVAs for the Individual and Joint conditions, including Trial Transition (n−1 Go/n go vs. n−1 Nogo/n go), Trial n−1 Correspondence (non-corresponding vs. corresponding), and Trial n Correspondence (non-corresponding vs. corresponding) as within-participant factors, and Experiment (Exp. 1 vs. Exp. 2) as a between-subjects factor.

Individual condition. The ANOVA showed a main effect of Trial n−1 Correspondence, F<sub>1,38</sub> = 5.26, p = 0.027, η<sup>2</sup><sub>p</sub> = 0.12, and a main effect of Trial n Correspondence, F<sub>1,38</sub> = 6.07, p = 0.018, η<sup>2</sup><sub>p</sub> = 0.14. The interaction between Trial n Correspondence and Trial n−1 Correspondence was also significant, F<sub>1,38</sub> = 45.68, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.55, as was the three-way interaction with Trial Transition, F<sub>1,38</sub> = 49.70, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.57. Planned comparisons showed trial-by-trial modulations for Nogo/go transitions, with a significant 22-ms effect following a corresponding n−1 trial, t<sub>39</sub> = 7.77, p<sub>Bonferroni-corrected</sub> < 0.001, d = 1.23, and a reversed 13-ms effect following a non-corresponding n−1 trial, t<sub>39</sub> = 4.57, p<sub>Bonferroni-corrected</sub> < 0.001, d = 0.72. No trial-by-trial modulations occurred for Go/go transitions, all ps > 0.08. No other main effects or interactions were significant, all ps > 0.14. Neither the main effect of Experiment nor any interaction with Experiment was significant, all ps > 0.17.

Joint condition. The ANOVA showed a main effect of Trial n Correspondence, F<sub>1,38</sub> = 18.61, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.33. The interaction between Trial n Correspondence and Trial n−1 Correspondence was also significant, F<sub>1,38</sub> = 115.96, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.75, as was the three-way interaction with Trial Transition, F<sub>1,38</sub> = 14.86, p < 0.001, η<sup>2</sup><sub>p</sub> = 0.28. Planned comparisons showed trial-by-trial modulations for Nogo/go transitions, with a significant 23-ms effect following a corresponding n−1 trial, t<sub>39</sub> = 7.49, p<sub>Bonferroni-corrected</sub> < 0.001, d = 1.18, and a reversed 9-ms effect following a non-corresponding n−1 trial, t<sub>39</sub> = 3.49, p<sub>Bonferroni-corrected</sub> = 0.001, d = 0.55. When the trial transition was Go/go, a significant 10-ms effect occurred following a corresponding n−1 trial, t<sub>39</sub> = 3.87, p<sub>Bonferroni-corrected</sub> = 0.02, d = 0.61, and a non-significant 2-ms effect following a non-corresponding n−1 trial, t<sub>19</sub> < 1. The main effect of Experiment was significant, F<sub>1,38</sub> = 12.65, p = 0.001, η<sup>2</sup><sub>p</sub> = 0.25, indicating that participants performed faster under competitive (M = 304 ms, SE = 7.80 ms) than cooperative instructions (M = 344 ms, SE = 7.80 ms). No significant interactions with Experiment were found, all ps > 0.10.

#### Response Coordination

To assess the role of goal interdependency (and thus shared representation) in modulating the percentage of response coordination across the two experiments, we conducted an additional analysis comparing the data patterns from the two experiments. The proportion of correlated activity was entered into an ANOVA with Condition (Individual vs. Joint) as a within-subjects factor and Experiment as a between-subjects factor. The analysis revealed a main effect of Condition, F<sub>1,18</sub> = 11.23, p = 0.004, η<sup>2</sup><sub>p</sub> = 0.38, and a significant two-way interaction with Experiment, F<sub>1,18</sub> = 6.43, p = 0.021, η<sup>2</sup><sub>p</sub> = 0.26. Two separate one-way ANOVAs indicated that for the Joint condition the percentage of response coordination was higher in Experiment 1 (15.3%) than in Experiment 2 (13.1%), F = 8.74, p = 0.008, d = 1.32. No difference emerged for the Individual condition between the two experiments, F < 1 (**Figure 4**).

The comparison of the two experiments showed that when participants performed the task alone, comparable response coordination between pseudo-pairs occurred, independently of the experiment to which they were assigned. In contrast, when they performed the task side by side with another person, response coordination in the joint condition increased (relative to the individual condition) only under cooperation. This latter result indicates that the increased response coordination in the joint Simon task cannot be explained by the perturbation of temporal and spatial features induced by the presence of a second agent performing a task. Our results suggest that positive goal interdependency (shared representation) may be a necessary condition for response coordination and inter-agent response dynamics.

# GENERAL DISCUSSION

The present study aimed at examining the role of temporally evolving coupling and inter-agent response dynamics during joint tasks. In two experiments, we investigated the contribution of interpersonal coordination to the emergence of JSE, as an index of shared representations. In both experiments, participants performed a go/no-go Simon task alone and together with another agent in two consecutive sessions. Across experiments, we manipulated goal interdependency by administering cooperative or competitive instructions. In Experiment 1, we instructed participants to cooperate during the social task, while in Experiment 2 participants were required to compete against each other. We examined JSE and response coordination between co-agents as a function of instructed competition or cooperation.

# Shared Representations and Joint Simon Task

Results from the analysis of mean RTs confirmed that when participants performed the go/no-go Simon task alongside another agent, social presence modulated the SE only for the group instructed to cooperate (Experiment 1). On the contrary, no influence of the presence of the co-agent was observed in the group who received competitive instructions (Experiment 2), as indicated by the lack of a significant interaction between Correspondence and Condition (Individual vs. Joint). Our results replicate previous studies showing that when co-agents' goals are mutually exclusive, the presence of another agent does not affect performance (Iani et al., 2011, 2014). Our results extend previous studies in different ways. First, we directly compared social and individual contexts within participants. Indeed, previous studies investigating cooperation and competition in the joint Simon task did not include an individual condition as a baseline (Ruys and Aarts, 2010; Iani et al., 2011; Ruissen and de Bruijn, 2016). This allowed us to compare the effect of goal interdependency in the social environment with a baseline performance for each participant in the individual go/no-go task. Sellaro et al. (2013) showed that when the SE emerges in a go/no-go task due to non-social factors (i.e., extended practice with spatially compatible actions, such as typing on a keyboard), the JSE vanishes in the subsequent task. However, this was not the case here: although in the individual condition participants had an empty chair next to them and were aware of the presence of another person in the neighboring room, the magnitude of the SE in the individual condition was comparable across experiments. Results from the distributional analysis do not show any differences in the time course of the SE in the Joint and Individual conditions across the two experiments, as indicated by the lack of a four-way interaction. This result is in line with evidence reported by Liepelt et al. (2011) showing that the time course of the JSE is stable across conditions, despite the faster performance under competitive instructions. Moreover, it should be noted that the lack of differences across experiments in the time course suggests that the non-significant JSE under competitive or negative relationships cannot be attributed to faster RTs (Hommel et al., 2009). However, further research is needed to address the relationship between the JSE and response speed. Thus, we can argue that the differential effect of social presence on the JSE between the two social conditions (competitive vs. cooperative) can be explained mainly by the difference in goal interdependency.
Second, we investigated the relationship between response coordination and the JSE under cooperative and competitive instructions. Our results showed that response coordination in the joint Simon task increased (relative to the individual condition) only under cooperative instructions. Interestingly, when participants were in competition, response coordination of the co-agents was equal to when they performed the task alone. This result suggests that the positive or negative interdependency between co-agents' goals is reflected not only in the representation of the task (e.g., Iani et al., 2011, 2014) but also at the sensorimotor timing level. In line with results reported by Liepelt et al. (2011), the RT distribution was comparable between conditions (Individual vs. Joint) and across experiments, suggesting that cooperative and competitive instructions do not affect the temporal dynamics of the JSE. Trial-by-trial effects were comparable across experiments, with a significant effect following a corresponding trial and a non-significant or even reversed effect following a non-corresponding one (Liepelt et al., 2011; Yamaguchi et al., 2018; but see also Ciardo et al., 2018 for results using social cues). Sequential modulations are thought to represent reactive adjustments of control settings (an increase of attention weights on relevant information after experiencing conflict in a non-corresponding trial; e.g., Iani et al., 2009), priming of an earlier stimulus episode (Hommel et al., 2004), or both. Interestingly, we found comparable trial-by-trial modulation in the Individual and Joint conditions when the preceding trial was a no-go trial, i.e., when no response occurred in the Individual condition or the trial required a response of the co-agent in the Joint task. Again, no differences emerged across experiments, suggesting that during both cooperative and competitive joint tasks participants represented the co-agents' S-R associations. Finally, we used punishment avoidance instead of reward to manipulate goal interdependency between co-agents. Our results generalize previous findings by showing that the JSE in cooperative vs. competitive conditions can be modulated by the need to avoid punishment. Taken together, our results confirm that the JSE is elicited by goal sharing between co-agents (Iani et al., 2011) and not by the mere fact that during a social task co-agents attend to each other (Ruys and Aarts, 2010).

# Interpersonal Coordination

In line with Malone et al.'s (2014) work, the analysis of temporally emergent response coordination revealed that when participants were instructed to cooperate, the percentage of coordination between co-agents' responses was higher relative to the individual condition (Experiment 1). Interestingly, this was not true when participants were in competition, as indicated by the lack of difference in the percentage of response coordination across conditions in Experiment 2. The lack of an increase in coordination occurred despite the overall speeding up of responses in the joint condition, which is in line with data showing that when performing a Simon task alongside another person, response speed does not correlate with asynchrony between co-agents' responses (Vesper et al., 2011).

It can be argued that the percentage of response coordination in our study is smaller than those reported in previous studies. Indeed, in two different studies, Malone et al. (2013, 2014) reported on average 24 and 33% of response coordination between co-agents for the individual and joint condition, respectively. In our study, we found on average 13% of response coordination for the individual condition and 15% of correlation between co-agents' responses in the cooperative condition (Experiment 1, joint condition). The discrepancy between our results and those reported by Malone et al. (2013, 2014) could be explained by the fact that in their work the joint task included 1100 trials, whereas our joint task comprised only 384 trials. Thus, it is possible that in our study participants did not reach the same amount of response correlation given the lower number of trials, which provided fewer samples for building an accurate model of the co-agent's behavior. However, the choice to include a lower number of trials was motivated by the need to avoid transfer-of-learning effects typical in spatial compatibility tasks (e.g., Lugli et al., 2013), since we manipulated the social presence (Individual vs. Joint) within participants. In addition, in computing RT time-series, we considered all the data points collected during the task, while Malone and colleagues analyzed only the last 512 responses. By analyzing all the collected responses, we also considered the initial phase of the task, during which participants may not yet have coordinated. Future studies should explore in more depth the lack of response coordination in the Joint Simon task under competitive instructions, for instance by analyzing the structure of RT variability in order to test whether, during a competitive task, responses of co-agents are characterized by nested patterns of variability or by random fluctuations (Malone et al., 2014).

Interestingly, results of Experiment 2 showed that response coordination was not modulated by the competitive social context. By comparing the two experiments, we showed that the lack of difference across conditions in Experiment 2 could not be explained by a difference between the two groups. Indeed, we reported that the percentage of response correlation did not differ across experiments in the Individual condition (12.7% for both experiments). Our results extend Malone et al.'s (2013, 2014) evidence by showing that when co-agents are in competition, response coordination in the joint Simon task is comparable to when they perform the go/no-go task alone, even though the dynamic nature of the tasks and their constraints differ. The increase of correlation between co-agents' responses reported in Experiment 1 cannot be interpreted only as the consequence of the natural tendency to adapt the timing of movements to the timing of external events (e.g., Marsh et al., 2009). Indeed, if this were the case, the same pattern should have emerged in Experiment 2. On the contrary, the comparison between the two experiments indicates that in Experiment 2 response coordination was not affected by the mere presence of a competitor acting in the shared setting (i.e., both the screen and the keyboard were shared), suggesting that response coordination emerges when the framework of the task allows co-agents to become an integrated perception-action system.

# Social Context and Coordination – The Case of Cooperation and Competition

Our results suggest that the emergence of shared representations is a necessary condition for temporally evolving response coordination but not necessarily a sufficient one. Accordingly, by introducing response coordination as a random effect in the models of mean RTs, we showed that response coordination did not affect the JSE in the cooperative joint task. This result is in line with a recent study by Dittrich et al. (2017), showing that including a random effect at the pair level improves model fit and reduces potential biases driven by differences across pairs. The current findings give hints about the relation between shared representations or self-other integration indexed by the JSE, sensorimotor coordination, and goal sharing, and about how these mechanisms are orchestrated to reach efficient joint action. Specifically, our results suggest that response coordination between co-agents does not account for the JSE. However, they also highlight the importance of taking into account response coupling between co-agents when investigating the nature of the JSE (Dittrich et al., 2017). The percentage of coupling and the JSE might be considered as two independent elements supporting effective joint action. Keller et al. (2016) proposed that the joint action outcome results from the integration and segregation of internal models of the self and of others. The authors proposed that during joint action, although goals are represented as shared, a distinction between self and other internal models is preserved in order to guide the joint performance and to allow each co-agent to keep control over their action planning and execution. This enables co-agents to anticipate, attend, and adapt to each other in real time, resulting in precise and flexible interpersonal coordination (cf. Keller et al., 2016). It is plausible that in our study, when co-agents' goals were positively related, shared goal representation may have promoted the integration of self and other models.
As a result, in the joint action model both response alternatives were represented. Self-other integration allowed co-agents to attend and adapt their performance to each other's sensorimotor timing, resulting in response coordination. In contrast, when the co-agents were in competition, the lack of shared representation may have favored self-other segregation. Thus, the resulting joint model did not include the co-agent's response alternative. Therefore, co-agents did not consider the other's behavior, but focused on stabilizing their own performance (i.e., increasing the response speed) in order to suppress the sensorimotor interference generated by the co-agent's action timing. As a final result, no response coordination emerged.

Our results are limited to joint tasks based on discrete nonrhythmic actions. However, they highlight the importance of competitive interactions in the context of understanding how different mechanisms support joint action. If we suppose that goal interdependency between co-agents may vary along a continuum between a positive and a negative relation, then we can assume that when cooperation is not explicitly requested, the shared nature of the task (i.e., context, setting, and space) prompts the integration of self and other models into a joint model. As a result, when others do not explicitly interfere with our goals, by default we perceive positive interdependence with them (Ruys and Aarts, 2010; Iani et al., 2011), and coordinate with them at the sensorimotor level (Malone et al., 2013, 2014) even if their actions have no direct consequences on our next action, as in the case of the joint go/no-go Simon task. Future studies should address the relation between coordination and the emergence of the JSE by de-personalizing the goal of the task. For instance, synchronization with external and variable events generated by human or artificial agents can be examined (e.g., a humanoid robot, see Wykowska et al., 2016; Wiese et al., 2017), as perceived natural/intentional vs. artificial agency may moderate self-other integration in the Joint Simon task. In addition, future studies need to address the role of visual access to the co-agent's action for the emergence of response coordination during joint tasks.

To conclude, the present study was designed to investigate the contribution of interpersonal entrainment to the emergence of shared representations by comparing response coordination in a joint Simon task during cooperation and competition. Our results show that emerging coordination increases during joint action only if co-agents' goals are shared, but not when co-agents' goals are mutually exclusive. The results show that interpersonal coordination requires the emergence of shared representations or self-other integration, indexed by the JSE. Therefore, in joint action shared representations seem to be a necessary condition for interpersonal coordination, but not a sufficient one.

# AUTHOR CONTRIBUTIONS

FC conceived, designed and performed the study, analyzed the data, discussed and interpreted the results, and wrote the manuscript. AW conceived and designed the study, discussed and interpreted the results, and wrote the manuscript. All authors reviewed the manuscript.

# FUNDING

This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (Grant Awarded to AW, Titled "InStance: Intentional Stance for Social Attunement." Grant Agreement No. 715058).

# ACKNOWLEDGMENTS

We thank Claudio Campus and Kyveli Kompatsiari for helping with data analysis.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.01919/full#supplementary-material

The data related to this study can be accessed online at https://doi.org/10.5281/zenodo.1421280.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Ciardo and Wykowska. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Social Situation Affects How We Process Feedback About Our Actions

Artur Czeszumski<sup>1</sup>\*, Benedikt V. Ehinger<sup>1</sup>, Basil Wahn<sup>1,2</sup> and Peter König<sup>1,3</sup>

<sup>1</sup> Institute of Cognitive Science, Universität Osnabrück, Osnabrück, Germany, <sup>2</sup> Department of Psychology, University of British Columbia, Vancouver, BC, Canada, <sup>3</sup> Institut für Neurophysiologie und Pathophysiologie, Universitätsklinikum Hamburg-Eppendorf, Hamburg, Germany

Humans achieve their goals in joint action tasks either by cooperation or competition. In the present study, we investigated the neural processes underpinning the processing of errors and monetary rewards in such cooperative and competitive situations. We used electroencephalography (EEG) and analyzed event-related potentials (ERPs) triggered by feedback in both social situations. Twenty-six dyads performed a joint four-alternative forced choice (4AFC) visual task either cooperatively or competitively. At the end of each trial, participants received performance feedback about their individual and joint errors and accompanying monetary rewards. Furthermore, the outcome, i.e., resulting positive, negative, or neutral rewards, was dependent on the pay-off matrix, defining the social situation either as cooperative or competitive. We used linear mixed effects models to analyze the feedback-related-negativity (FRN) and used the Threshold-free cluster enhancement (TFCE) method to explore activations of all electrodes and times. We found main effects of the outcome and social situation, but no interaction at mid-line frontal electrodes. The FRN was more negative for losses than wins in both social situations. However, the FRN amplitudes differed between social situations. Moreover, we compared monetary with neutral outcomes in both social situations. Our exploratory TFCE analysis revealed that processing of feedback differs between cooperative and competitive situations at right temporo-parietal electrodes, where the cooperative situation elicited more positive amplitudes. Further, the differences induced by the social situations were stronger in participants with higher scores on a perspective taking test. In sum, our results replicate previous studies about the FRN and extend them by comparing neurophysiological responses to positive and negative outcomes in a task that simultaneously engages two participants in competitive and cooperative situations.

Keywords: social cognition, joint action, EEG, feedback related negativity, cooperation, competition

#### Edited by:

Karl Christoph Klauer, University of Freiburg, Germany

#### Reviewed by:

Benjamin Ernst, Katholische Universität Eichstätt-Ingolstadt, Germany; Annelie Rothe-Wulf, University of Freiburg, Germany

> \*Correspondence: Artur Czeszumski aczeszumski@uos.de

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 17 July 2018 Accepted: 05 February 2019 Published: 25 February 2019

#### Citation:

Czeszumski A, Ehinger BV, Wahn B and König P (2019) The Social Situation Affects How We Process Feedback About Our Actions. Front. Psychol. 10:361. doi: 10.3389/fpsyg.2019.00361

# 1. INTRODUCTION

In everyday life, humans frequently commit errors. For example, they are prone to press incorrect buttons, trip over household objects or make typing mistakes. These errors often influence not only the person committing the mistake but also other people. Such erroneous actions may have a negative impact on others if people are cooperating in a task (e.g., moving furniture together). Conversely, they may have a positive impact on others if people are competing in a task (e.g., in a game of table tennis). These mistakes that involve others frequently require external feedback to find out about the impact of one's own and others' performed actions. Thus, it is likely that the human brain has mechanisms that distinguish between positive and negative outcomes of one's own and others' actions.

Earlier research on error processing in tasks performed individually shows that humans have a fast and efficient error detection mechanism (Coles et al., 2001; Yeung et al., 2004). In particular, studies using electroencephalography (EEG) identified event-related-potential (ERP) components that immediately follow awareness of one's own errors or feedback regarding the outcome of one's own actions (Falkenstein et al., 1991). These components are known as error-related-negativity (ERN) and feedback-related-negativity (FRN). The ERN is evoked 50–70 ms after an erroneous action is carried out (e.g., an incorrect button press) and it originates from the anterior cingulate cortex and the pre-supplementary motor area in the posterior medial frontal cortex (Holroyd et al., 2004; Ridderinkhof et al., 2004; de Bruijn et al., 2009). The FRN is elicited approximately 200–350 ms after performance feedback is received and is considered to have a similar origin as the ERN (Miltner et al., 1997). Holroyd and Coles (2002) proposed that the ERN/FRN component is elicited as soon as the outcome of an action can be detected by proprioceptive, motor or external feedback. They also proposed a direct relationship between the detection of a negative outcome and reward processing. In essence, the ERN/FRN is elicited whenever the result of an action is worse than expected and thus entails a loss of reward.

While these components have been widely studied in individuals, little research has investigated how humans process feedback about actions that involve others. A first step in this direction was made by van Schie et al. (2004). They found that the FRN component occurs after observing an error committed by others. Given the sensitivity of the FRN to mistakes of others, researchers suggest that it might reflect the processing of socially relevant stimuli. Further studies explored this idea by manipulating the social situation (i.e., either cooperative or competitive) while participants performed or observed actions and received feedback about monetary rewards (Itagaki and Katayama, 2008; Marco-Pallarés et al., 2010). Results showed that the FRN was elicited by losses of others in a cooperative situation. In a competitive situation, conversely, others' gains elicited the FRN. These results indicate that the FRN reflects the valence of an outcome, which in turn depends on the current social situation.

In contrast to studies of the FRN discussed above, the ERN, which is elicited for self-generated errors, appears not to be influenced by the social situation (de Bruijn and von Rhein, 2012). In another study, self-generated errors elicited the ERN in both cooperative and competitive situations; however, observed errors of others elicited the observed ERN (oERN) only in a cooperative situation (Koban et al., 2010). These studies focused on outcome processing in cooperative and competitive situations. However, the tasks used in these studies involved actions that are performed in turns, and there was always either a division between a performer and an observer participant (Koban et al., 2010; Marco-Pallarés et al., 2010; de Bruijn and von Rhein, 2012) or the partner was virtual (Itagaki and Katayama, 2008). Hence, it is not clear whether these findings would also generalize to designs in which co-actors perform a task that requires simultaneous responses to identical stimuli from both participants, in contrast to turn-taking tasks such as, for instance, joint Simon tasks (Sebanz et al., 2003, 2005; Dolk et al., 2014), in which co-actors respond to different stimuli at different time points.

To close this gap in the literature, a set of recent studies also investigated the FRN in situations in which humans perform tasks together. Humans in real life often perform actions together with others, instead of observing another human performing an action alone. Thus, studying the social aspect of outcome processing requires paradigms in which co-actors perform tasks jointly (Hari and Kujala, 2009; Schilbach et al., 2013). In line with this idea, Picton et al. (2012) tested dyads of participants in a cooperative joint choice reaction time task. In their study, participants were able to realize their own mistakes without feedback, which elicited the ERN, while mistakes of a partner had to be inferred from visual feedback, which elicited the FRN (Picton et al., 2012). In an even more naturalistic set-up, Loehr and colleagues tested piano duets (Loehr et al., 2013, 2015). Such a music paradigm allowed for a clear division between one's own, other's and joint errors. Results of both Picton et al.'s and Loehr et al.'s experiments confirmed that the FRN monitors both one's own and other's errors in joint situations. Interestingly, the FRN is stronger for one's own than joint mistakes, and stronger for joint mistakes than others' mistakes (Loehr et al., 2015). These studies focused on the monitoring of actions in cooperative joint set-ups. However, to our knowledge, there are no studies that involve two participants performing actions and receiving feedback about their individual and joint actions in both cooperative and competitive situations.

To fill this gap in the literature, in the present study we focused on two aspects. First, in our experiment both participants were actively performing a task. That is, in contrast to previous research there was no distinction between an active co-actor and a passively observing co-actor (Itagaki and Katayama, 2008; Marco-Pallarés et al., 2010). Instead, each of the participants performed their individually assigned task in parallel and observed their own and the co-actor's errors. Second, rewards (positive, negative, and neutral) associated with errors depended on whether the assigned task was performed in a cooperative or competitive situation. With this design, the main question we addressed was whether the FRN is influenced by different social situations when both co-actors actively perform a task. Additionally, by including neutral conditions (i.e., conditions without any monetary rewards) in the design, we were able to investigate whether FRN amplitudes differed between errors that are associated with monetary outcomes (positive and negative) and errors that are not associated with any monetary rewards (neutral). Such comparisons were only rarely addressed in previous research (Holroyd et al., 2006). We also aimed to relate FRN amplitudes to personality traits measured with a questionnaire. Namely, we focused on the Perspective taking subscale of the Interpersonal reactivity index (IRI, Davis, 1983) that measures the tendency to spontaneously adopt the psychological point of view of others. We chose this subscale because it was already shown that FRN amplitudes correlate with the Perspective taking scores (Koban et al., 2012). Finally, we performed an exploratory analysis of the time course of processing feedback about self-produced actions and co-actors' actions depending on the social situation.

# 2. METHODS

# 2.1. Participants

Fifty-two students (37 females, mean age = 24.1, standard deviation = 4 years) randomly grouped into 26 dyads (15 female-male and 11 female-female dyads) participated in the experiment. Twenty-six participants were measured with EEG (16 females, mean age = 24.5, standard deviation = 3.3 years). Prior to the experiment we asked all participants whether they knew each other and paired only strangers in a dyad. The ethics committee of the University of Osnabrück approved the experiment. We informed participants about their rights and all participants signed a written consent form. The study was conducted in Osnabrück and all participants were students in an international study program. Therefore, all instructions and questionnaires were provided in English. Participants could choose either a monetary reward or course credits in exchange for their participation. All participants who were measured with EEG opted for the monetary reward.

# 2.2. General Apparatus

We tested participants in dyads. They sat next to each other on the same side of a table in the same room. To avoid interference and communication during the experiment, we separated them with a cardboard screen (**Figure 1B**). We presented stimuli on two identical computer monitors (BenQ 24 inches, 1920 x 1080 pixels, refresh rate 120 Hz). We used two separate keyboards (Cherry RS 6000) to collect behavioral responses, one for each participant. The experiment was programmed using the Python library PsychoPy (Peirce, 2007) and the experimental procedure and data collection were implemented in Python 2.7.3 [code available (https://osf.io/c4wkx/)]. The experiment was run on an Intel Xeon CPU.

# 2.3. Experimental Design

Each member of a dyad performed a four-alternative forced-choice (4-AFC) visual task (**Figure 1A**) and later received feedback about their performance and associated monetary rewards (**Figure 1C**). In each dyad, one participant performed an orientation discrimination task and the other participant a spatial frequency discrimination task. The assignment of the participants to both tasks was randomized and counterbalanced. First, we presented a target object in the middle of the screen for 400 ms. The target object was a single Gabor patch of size 9.95° x 9.95° visual angle, oriented at a randomly chosen angle (between 20°–80° and 100°–160°) and with a randomly chosen spatial frequency (between 10 and 20 cycles/stimulus size). Subsequently, we displayed a gray mask with a fixation cross in the middle (linewidth of 0.13° visual angle) for 100 ms followed by four Gabor patches arranged in a 2 x 2 grid, each patch separated from neighboring patches by 0.41° visual angle on each side. Each of the four Gabor patches was of the same size as the target object. One Gabor patch always had the same orientation as the target object while the other three patches were manipulated according to a QUEST staircase procedure (Watson and Pelli, 1983). A different Gabor patch had the same spatial frequency as the target object and the other three patches again had different spatial frequencies according to a second QUEST staircase procedure (for more details about the QUEST procedure, see section 2.5). Therefore, both participants simultaneously had to respond to identical visual stimuli; however, their tasks were independent. This means that participants could not influence each other's performance while performing the task. Participants were informed that their partners had different tasks and they were familiar with the partner's instructions. The location of the correct answer for each of the participants was randomized between four possible locations.
Participants responded with key presses ("Q," "W," "A," "S," or "7," "8," "4," "5" on the num-pad, for the participants seated on the left or right, respectively). The keys corresponded spatially to the displayed Gabor patches. We displayed the Gabor patches until both of the participants gave their responses or 3,000 ms passed. In the case of no response, the answer was considered as incorrect. We instructed participants to give their answers as accurately and as quickly as possible. Subsequently, a gray mask with a fixation cross was displayed for 700–800 ms and then feedback appeared on the screen. We used a colored circle (radius: 3.94° visual angle) vertically divided in halves to inform participants about the performance of both participants. The color of the feedback was dependent on the participants' answers: green indicated correct answers and red incorrect answers. The left semicircle and right semicircle gave feedback to the left and right participants, respectively. Additionally, we presented a single letter (0.8° visual angle, "W" for wins, "L" for losses and "T" for ties; for more details, see section 2.4 below) in the middle of the circle. Feedback was displayed for 1,000 ms and was followed by a gray mask for 200 ms before moving on to the next trial (**Figure 1A**).
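The response rules in this paragraph can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the key labels (`"num_7"` etc.), the `(row, column)` cell coordinates, and the assumption that the keys mirror the 2 x 2 grid as they are laid out on the keyboard (Q top-left, S bottom-right, and likewise 7 and 5) are hypothetical choices made for the example.

```python
# Hypothetical mapping: each key is assumed to mirror its position in the 2 x 2 grid,
# written as (row, column) with (0, 0) the top-left patch.
KEY_TO_CELL = {
    "left": {"q": (0, 0), "w": (0, 1), "a": (1, 0), "s": (1, 1)},
    "right": {"num_7": (0, 0), "num_8": (0, 1), "num_4": (1, 0), "num_5": (1, 1)},
}

RESPONSE_WINDOW_MS = 3000  # patches stay on screen for at most 3,000 ms


def score_response(seat, key, rt_ms, target_cell):
    """Return True for a correct answer; no response (or a late one) counts as incorrect."""
    if key is None or rt_ms is None or rt_ms > RESPONSE_WINDOW_MS:
        return False
    return KEY_TO_CELL[seat].get(key) == target_cell


def feedback_colors(left_correct, right_correct):
    """Colors of the (left, right) semicircles: green for correct, red for incorrect."""
    def color(correct):
        return "green" if correct else "red"
    return color(left_correct), color(right_correct)
```

For example, `feedback_colors(True, False)` describes a circle whose left half is green and whose right half is red.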

# 2.4. Social Manipulation and Monetary Rewards

The feedback included information about individual and joint errors as well as the resulting positive, negative or neutral monetary rewards. Note, the schema of monetary rewards, as given in the pay-off matrix, defined the social situation as cooperative or competitive. The gain or loss of 5 cents was dependent on the particular social situation as follows (**Figure 1C**):

In the cooperative situation the trial was considered as a win, and consequently positively rewarded, only in the case in which both of the participants responded correctly (one green semicircle for each of the two participants). In the case that both participants were wrong, it was considered a loss and as a negative reward five cents were subtracted from their budgets (one red semi-circle for each of the two participants). In the case that one participant was correct and the other was incorrect, no money was added to or subtracted from either budget (half green and half red circle).

In the competitive situation both participants answering correctly or incorrectly resulted in a tie (full green or red circle). Thus, no money was added to or subtracted from either budget. A reward was achieved when one participant was correct and the other was incorrect (half green and half red circle). In this case the reward was added to the correct participant's budget and subtracted from the incorrect participant's budget. At the end of each block the participants' respective budgets were calculated and displayed on the screen.
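The pay-off rules for the two social situations can be summarized in a few lines of code. The following is a minimal sketch, not the authors' experiment code: the function name `payoff` and the string labels are assumptions introduced here for illustration, while the 5-cent stake and the win/loss/tie rules follow the description above.

```python
REWARD_CENTS = 5  # stake per trial, as stated in the text


def payoff(situation, left_correct, right_correct):
    """Return (left_change, right_change) in cents for a single trial.

    `situation` is "cooperative" or "competitive" (labels assumed here).
    """
    if situation == "cooperative":
        if left_correct and right_correct:
            return (REWARD_CENTS, REWARD_CENTS)    # joint win
        if not left_correct and not right_correct:
            return (-REWARD_CENTS, -REWARD_CENTS)  # joint loss
        return (0, 0)                              # one correct, one incorrect: neutral
    if situation == "competitive":
        if left_correct == right_correct:
            return (0, 0)                          # both correct or both incorrect: tie
        if left_correct:
            return (REWARD_CENTS, -REWARD_CENTS)   # left wins, right loses
        return (-REWARD_CENTS, REWARD_CENTS)       # right wins, left loses
    raise ValueError(f"unknown situation: {situation!r}")
```

Note that the same pair of responses (e.g., one correct and one incorrect) is monetarily neutral in the cooperative situation but a win for one participant and a loss for the other in the competitive situation.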

Social situations alternated between blocks (16 blocks in total, 8 cooperation and 8 competition). The order of blocks was counterbalanced across participants and randomly chosen for each dyad, with never more than three repetitions. To ensure that participants knew and understood both social situations, we provided information regarding the block number, the social situation, and the rewards associated with each feedback at the beginning of each block. In addition, "win" or "lose" was shown as text inside the feedback stimulus (**Figure 1C**). Each participant had an initial budget of 10 Euro that could increase or decrease by 5 cents based on their performance in each trial.
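One way to generate such a block order is rejection sampling: shuffle the 16 block labels and keep the first order that satisfies the repetition constraint. The sketch below is an illustration under assumptions, not the authors' procedure. In particular, it reads "never more than three repetitions" as "no run of more than three consecutive blocks of the same situation", and the function names are hypothetical.

```python
import random


def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = current = 1
    for prev, cur in zip(seq, seq[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest


def make_block_order(n_per_situation=8, max_run=3, seed=None):
    """Random block order with at most `max_run` identical blocks in a row."""
    rng = random.Random(seed)
    blocks = ["cooperative"] * n_per_situation + ["competitive"] * n_per_situation
    while True:
        rng.shuffle(blocks)
        if longest_run(blocks) <= max_run:  # reject orders with longer runs
            return list(blocks)
```

Calling `make_block_order(seed=...)` with a different seed per dyad would give each dyad its own randomly chosen order.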

# 2.5. Experimental Procedure

One participant of each dyad was invited one hour earlier than the other and was prepared for the EEG recordings outside of the recording chamber. Thus, the participant assignment for EEG recordings was done prior to the experiment. After around 45 min, when preparation was finished and the second participant had arrived, both participants were seated side-by-side in a room at a distance of 60 cm from their screen. For technical reasons, the participant measured with the EEG sat on the left side. The experimental session lasted approximately 90 min and was structured as follows: After detailed written and oral instructions, a QUEST staircase procedure (Watson and Pelli, 1983) was performed for each participant separately (one after another) for the assigned task with the goal of homing in on 50% performance, i.e., well above the chance level of 25%. To achieve this, we used the PsychoPy QuestHandler function with the threshold set to 0.63 and a gamma of 0.01. Both participants performed 100 training trials. For the participant performing the orientation discrimination task we varied the degree of orientation between 1° and 45° with a starting value of 15° and a standard deviation of 10°. For the other participant, who performed the spatial frequency discrimination task, we varied the spatial frequency between 1 and 25 cycles/stimulus size with a starting value of 3 cycles/stimulus and a standard deviation of 3 cycles/stimulus. Subsequently, participants proceeded to the actual experiment, which consisted of a total of 640 trials grouped in 16 blocks of 40 trials each. After 20 trials in each block, participants were asked to indicate whether the current block was a cooperative or competitive situation, in order to check whether they remembered the social situation manipulation correctly. 
Blocks were separated by short rests and the overall experiment was divided into three parts with short breaks. In these breaks experimenters made sure that participants were not exchanging any information about the experiment. When the tasks were completed, participants filled out the Interpersonal reactivity index (IRI, 28 questions) questionnaire (Davis, 1983).

# 2.6. Methods of EEG Data Acquisition and Preprocessing

Electrophysiological data were recorded using a 64-Ag/AgCl electrode system (ANT Neuro, Enschede, Netherlands) and a REFA-2 amplifier (TMSi, Enschede, Netherlands), with electrodes placed on a Waveguard cap according to the 5% electrode system (Oostenveld and Praamstra, 2001). The data were recorded using an average reference at a sampling rate of 1,024 Hz. Impedances of all electrodes were manually checked to be below 10 kΩ before each experiment. We used R and MATLAB to preprocess and analyze the data. All analysis scripts and data are available online (https://osf.io/c4wkx/). We used the eegvis toolbox (Ehinger, 2018) to visualize the exploratory analyses. Data were preprocessed using the EEGLAB toolbox (Delorme and Makeig, 2004) in the following order: First, the data were downsampled to 512 Hz and subsequently filtered using a 0.1 Hz high-pass filter and a 120 Hz low-pass filter (−6 dB cutoff at 0.5 Hz, 1 Hz transition bandwidth, FIRFILT, EEGLAB plugin). Channels exhibiting either excessive noise or strong drifts were manually detected and removed (2.1 ± 2.5, mean and standard deviation, respectively). After this, the continuous data were manually cleaned, rejecting data sequences including jumps, muscle artifacts, and other sources of noise. To remove eye and muscle movement-related artifacts, an independent component analysis based on the AMICA algorithm (Palmer et al., 2008) was computed on the cleaned data. The independent components (ICs) corresponding to eye, heart, or muscle activity were manually selected based on their time course, spectra, and topography, and removed before transforming the data back into the original sensor space (number of removed ICs 8.3 ± 5.2, mean and standard deviation, respectively). The initially removed channels were interpolated based on the activity of their neighboring channels (spherical interpolation). 
Subsequently, the continuous data were divided into epochs for each trial by including data from 200 ms pre-stimulus to 1,000 ms post stimulus, using the time window between –200 ms and stimulus onset for baseline correction. For the exploratory analysis we used 62 electrodes (Fp1, FPz, Fp2, F7, F3, Fz, F4, F8, FC5, FC1, FC2, FC6, T7, C3, Cz, C4, T8, CP5, CP1, CP2, CP6, P7, P3, Pz, P4, P8, POz, O1, Oz, O2, AF7, AF3, AF4, AF8, F5, F1, F2, F6, FC3, FCz, FC4, C5, C1, C2, C6, CP3, CPz, CP4, P5, P1, P2, P6, PO5, PO3, PO4, PO6, FT7, FT8, TP7, TP8, O7, PO8).
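The epoching and baseline-correction step can be illustrated for a single channel. This is a minimal sketch in plain Python rather than the EEGLAB routines the authors used. The sampling rate (512 Hz) and the epoch limits (200 ms pre-stimulus to 1,000 ms post-stimulus) are taken from the text; the function names and the rounding of the pre-stimulus interval to whole samples are assumptions made here.

```python
SRATE_HZ = 512           # sampling rate after downsampling (from the text)
PRE_MS, POST_MS = 200, 1000  # epoch limits relative to stimulus onset


def epoch_times_ms(srate=SRATE_HZ, pre_ms=PRE_MS, post_ms=POST_MS):
    """Time stamp (ms) of every sample in one epoch; 0 ms is stimulus onset."""
    n_pre = round(pre_ms * srate / 1000)    # whole samples before onset
    n_post = round(post_ms * srate / 1000)  # whole samples after onset
    return [1000.0 * i / srate for i in range(-n_pre, n_post + 1)]


def baseline_correct(epoch, times_ms):
    """Subtract the mean of the pre-stimulus samples (t < 0) from the epoch."""
    baseline = [v for v, t in zip(epoch, times_ms) if t < 0]
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in epoch]
```

After correction, the pre-stimulus interval of each epoch averages to zero, so post-stimulus amplitudes are expressed relative to the pre-stimulus level.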

# 3. RESULTS

# 3.1. Behavioral Analysis

# 3.1.1. Social Situation Awareness

To ensure that participants paid attention to the different social situations in the experiment, we asked them in the middle of each block whether the current block was a cooperative or competitive situation. Participants answered this question with high accuracy (mean correct answers = 97%, standard deviation = 7%), suggesting that they consistently understood and remembered the instructions about the differences between social situations.

# 3.1.2. Accuracy

Prior to running the actual experiment, we used a QUEST staircase procedure to adjust the difficulty of each task such that participants were expected to attain 50% accuracy. Confirming this expectation, the mean accuracy in the task was 53% (standard deviation = 9%) and the mean difference between paired participants was 8% (standard deviation = 6%). It was important that both paired participants performed with comparable accuracy so that the analyzed ERPs were not influenced by differences at the behavioral level. Further, comparable accuracy results in an even distribution of trials across the correct-correct, correct-false, false-correct, and false-false outcomes.

# 3.1.3. Response Time

We analyzed response times to test whether our experimental manipulations influenced behavioral responses. Prior to analysis, we excluded all trials with response times shorter than 50 ms (2 trials) because such fast responses are likely premature. Then, we used a linear mixed model (LMM) to analyze response times. The LMM was calculated with the lme4 package (Bates et al., 2015) and p-values were based on Wald t-tests using the lmerTest package. Degrees of freedom were calculated using the Satterthwaite approximation. We modeled response times with task, social situation, and correctness as fixed effects, including the interactions between them. As random effects, we used random intercepts for the grouping variables participant and dyad. In addition, we used random slopes for all fixed effects, including interactions, in the participant grouping variable. For all predictors, we used an effect coding scheme with binary factors coded as –0.5 and 0.5. Thus, the resulting estimates can be directly interpreted as the main effects. The advantage of this coding scheme is that the fixed effect intercept is estimated as the grand average across all conditions and not the average of the baseline condition. We found a main effect of correctness [t(50.18) = −8.1, p < 0.0001]. Correct answers were on average 80 ms faster than incorrect answers. The main effects of the two other predictors (task and social situation) and all possible interactions were not significant (p > 0.17). These results suggest that the different tasks (orientation and spatial frequency) and social situations (cooperative and competitive) were of comparable difficulty and engaged the two participants to similar degrees.
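The interpretation of effect-coded estimates can be made concrete with a small worked example. The sketch below is not the mixed model reported above: it uses a balanced two-factor design without random effects, for which the least-squares estimates have a closed form in terms of the cell means. The cell means are hypothetical values chosen for illustration, and the function names are assumptions.

```python
def effect_coded_estimates(cell_means):
    """Closed-form estimates for a balanced 2 x 2 design with -0.5/+0.5 coding.

    `cell_means` maps (a, b), with a and b in {-0.5, 0.5}, to the cell mean.
    Returns (intercept, effect_a, effect_b, interaction).
    """
    m = cell_means
    intercept = sum(m.values()) / 4  # grand average of the four cells
    effect_a = (m[(0.5, -0.5)] + m[(0.5, 0.5)]) / 2 - (m[(-0.5, -0.5)] + m[(-0.5, 0.5)]) / 2
    effect_b = (m[(-0.5, 0.5)] + m[(0.5, 0.5)]) / 2 - (m[(-0.5, -0.5)] + m[(0.5, -0.5)]) / 2
    interaction = (m[(0.5, 0.5)] - m[(0.5, -0.5)]) - (m[(-0.5, 0.5)] - m[(-0.5, -0.5)])
    return intercept, effect_a, effect_b, interaction


def predict(coefs, a, b):
    """Model prediction for the cell with codes (a, b)."""
    b0, ba, bb, bab = coefs
    return b0 + ba * a + bb * b + bab * a * b


# Hypothetical mean response times (ms); a = social situation, b = correctness
# (-0.5 = correct, 0.5 = incorrect). These numbers are illustrative only.
cells = {(-0.5, -0.5): 900.0, (-0.5, 0.5): 980.0,
         (0.5, -0.5): 910.0, (0.5, 0.5): 990.0}
coefs = effect_coded_estimates(cells)
```

With this coding the intercept equals the average over all four cells, and each slope equals the difference between the two marginal means of its factor, which is why the estimates can be read directly as main effects. Under treatment (0/1) coding, the intercept would instead be the mean of the baseline cell.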

# 3.2. Electrophysiological Data

To analyze EEG data in the form of ERPs, we applied a preselected single-trial based LMM analysis (Frömer et al., 2018). We defined the FRN as the mean amplitude over six electrodes (Fz, F1, F2, FCz, FC1, FC2) between 200 and 300 ms after the feedback of each trial. Our choice of electrodes and time window was based on previous research and was pre-specified before any analysis (Ullsperger et al., 2014). We modeled the FRN using outcome (win and lose) and social situation (cooperative and competitive) as fixed effects, together with their interaction. As random effects, we modeled random intercepts for participants and random slopes for both predictors (outcome and social situation) and their interaction. For the same reason as above, predictors were effect coded, i.e., binary factors were coded as –0.5 and 0.5. The results of this analysis are presented in **Table 1** and the ERPs in **Figure 2**. We found main effects of outcome [t(26.02) = −5.85, p < 0.001] and social situation [t(26.01) = 4.4, p < 0.001]. The FRN amplitudes were on average 1.03 (standard deviation = 0.23) µV higher in lose than win trials and 1.54 (standard deviation = 0.26) µV higher in competitive than cooperative trials. The interaction between these factors was not significant [t(27.15) = −0.93, p = 0.36]. These results suggest that the FRN differs between positive and negative outcomes and between cooperative and competitive social situations, and that these two effects are independent of each other.
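The single-trial FRN measure defined above (mean amplitude over six fronto-central electrodes between 200 and 300 ms) can be sketched as follows. The electrode list and time window are taken from the text; the data layout (a dictionary mapping electrode names to lists of samples) and the function name are assumptions made for this illustration, not the authors' implementation.

```python
FRN_ELECTRODES = ("Fz", "F1", "F2", "FCz", "FC1", "FC2")


def mean_amplitude(trial, times_ms, electrodes=FRN_ELECTRODES,
                   tmin=200.0, tmax=300.0):
    """Mean amplitude over `electrodes` and over samples in [tmin, tmax] ms.

    `trial` maps an electrode name to its list of samples for one epoch;
    `times_ms` gives the time stamp of each sample relative to feedback onset.
    """
    idx = [i for i, t in enumerate(times_ms) if tmin <= t <= tmax]
    values = [trial[ch][i] for ch in electrodes for i in idx]
    return sum(values) / len(values)
```

Applying this function to every feedback-locked epoch yields one value per trial, which can then serve as the dependent variable of a single-trial model.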

Additionally, we correlated individual estimates of the FRN difference between the two social situations with the Perspective Taking score. We calculated Spearman's rho to quantify the association between the Perspective Taking score and each participant's mixed model best linear unbiased prediction of the factor social situation from the mean amplitude analysis. We chose Spearman's correlation because our questionnaire data were rank data. We found a significant negative correlation (r = –0.54, p = 0.005, **Figure 3**). This result suggests that, on average, the effect of the social situation on the characteristic ERPs is stronger in participants with personality traits related to high perspective taking abilities.
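Spearman's rho is the Pearson correlation computed on ranks, with tied values receiving the average of the ranks they span. The following self-contained sketch illustrates the computation; it is not the authors' analysis script, and the function names are chosen here for illustration.

```python
def average_ranks(values):
    """1-based ranks of `values`; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank correlation between x and y."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the computation, the coefficient is unaffected by any monotonic rescaling of the questionnaire scores.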

[...] represent the outcome, i.e., win and lose trials, respectively. Solid and dashed lines represent cooperative and competitive situations. The gray box shows the preselected time window used for the confirmatory statistical analysis (200–300 ms).

Furthermore, after visual inspection of the grand average ERPs (**Figure 2**), we decided to also apply a peak-to-peak amplitude analysis because the FRN peaked earlier than expected (Ferdinand et al., 2012). For the peak-to-peak analysis we used the same electrodes as for the mean amplitude analysis (Fz, F1, F2, FCz, FC1, FC2). We used the grand average to identify the maximum positive peak between 140 and 200 ms and the maximum negative peak between 200 and 270 ms after feedback presentation. We subtracted the average maximum negative peak amplitude from the average maximum positive peak over these time windows. This is equivalent to directly applying a peak-to-peak analysis to data low-pass filtered by a boxcar kernel. Compared to the plain peak-to-peak analysis it is, however, less susceptible to high-frequency noise and therefore more robust. Then, we applied exactly the same LMM analysis as with the mean amplitude (details above). The results of this analysis are presented in **Table 2**. We found main effects of outcome [t(26) = 3.55, p = 0.001] and social situation [t(26.09) = −3.04, p = 0.005]. The peak amplitudes were on average 0.71 (standard deviation = 0.23) µV higher in win than lose trials and 1.3 (standard deviation = 0.36) µV higher in cooperative than competitive trials. The interaction between these factors was not significant [t(25.98) = −0.98, p = 0.34]. These results are in line with the results of the mean amplitude analysis, further corroborating that FRN amplitudes differ between positive and negative outcomes and between cooperative and competitive situations.
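For reference, the plain peak-to-peak measure mentioned above can be sketched as follows. Note that this is the simple variant (maximum in the first window minus minimum in the second window), not the averaged, noise-robust variant the authors applied. The window limits are taken from the text; the function name and data layout are assumptions.

```python
def peak_to_peak(erp, times_ms, pos_window=(140.0, 200.0),
                 neg_window=(200.0, 270.0)):
    """Plain peak-to-peak amplitude of one waveform.

    Returns the maximum within `pos_window` minus the minimum within
    `neg_window` (both in ms relative to feedback onset, inclusive).
    """
    pos = [v for v, t in zip(erp, times_ms) if pos_window[0] <= t <= pos_window[1]]
    neg = [v for v, t in zip(erp, times_ms) if neg_window[0] <= t <= neg_window[1]]
    return max(pos) - min(neg)
```

Because a single extreme sample determines each peak, this plain measure is sensitive to high-frequency noise, which is the motivation for averaging around the peaks.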

FIGURE 3 | Feedback locked difference waveforms at pooled electrode sites (F1, Fz, F2, FC1, FCz, FC2). Data are average referenced. Pink and green colors represent the monetary outcome, i.e., lose-win (monetary) and incorrect-correct (neutral) trials respectively. Solid and dashed lines represent cooperative and competitive situations. The gray box shows the preselected time window used for the confirmatory statistical analysis (200–300 ms). All ERP waveforms that were used to make difference waves are presented in Supplementary Materials.

For the exploratory analysis, we used the Threshold-Free-Cluster-Enhancement method (TFCE) and permutation analysis (Smith and Nichols, 2009; Mensen and Khatami, 2013; Ehinger et al., 2015). This method allows for comparisons between experimental conditions over all electrodes and time points of ERPs while at the same time controlling for multiple comparisons. We analyzed the EEG data with a two-way repeated measures ANOVA with outcome (win vs. lose) and social situation (cooperative vs. competitive) as within-participant factors, taking into account 62 electrodes and all time points between 0 and 600 ms. We enhanced the signal with the TFCE method and used permutation tests to account for multiple comparisons. We used 5,000 permutations, and for each permutation we randomized the assignment of each data point to the different experimental conditions within each participant. For each of these TFCE permutations, a repeated measures ANOVA was calculated. The maximum F-values across the chosen samples in time and space were used to construct a max F-value distribution, against which the actual F-values were compared. We considered F-values above the 95th percentile to be significant. The results of this analysis are presented in **Figure 5**. We found two separate clusters of significant activity for the main effect of outcome. One cluster spanned from 88 to 152 ms (median p value: p = 0.01, min p value: p = 0.001) with a peak at the C1 electrode 121 ms after the feedback and was more negative for lose than win outcomes. The other cluster ranged between 172 and 340 ms with a peak at the Fz electrode 240 ms following the feedback (median p value: p = 0.01, min p value: p = 0.0006). This cluster resembles the FRN spatially and temporally, and it was more negative for lose than win outcomes. Please note that, in contrast to the conventional analysis above, which makes assumptions about the timing of the relevant signals, the TFCE approach yields the intervals with significant differences as a result. Thus, the present analysis validates the assumptions of the analysis above and makes them more precise. Moreover, we found a main effect of the social situation. This cluster stretched from 68 to 600 ms (median p value: p = 0.0004, min p value: p = 0.0002) and encompassed all electrodes at different time points, suggesting a robust difference in the processing of feedback between cooperative and competitive situations. The peak significant value was at the FC5 electrode 143 ms after the feedback. 
Overall, these results support the observations above of large differences in feedback processing between cooperative and competitive situations and suggest that the difference in processing positive and negative feedback starts earlier than the classically considered time window for the FRN.
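The logic of controlling for multiple comparisons with a maximum-statistic permutation distribution can be illustrated with a deliberately simplified sketch. It differs from the analysis described above in several labeled ways: it omits the TFCE enhancement, uses a paired two-condition contrast with a t-statistic instead of a repeated measures ANOVA with F-values, and runs over time points of a single channel only. For two conditions, swapping condition labels within a participant is equivalent to flipping the sign of that participant's difference wave. All names and the synthetic data are assumptions for illustration.

```python
import random
import statistics


def t_values(diffs):
    """One-sample t-value per time point.

    `diffs[p][t]` is participant p's condition difference at time point t.
    """
    n = len(diffs)
    out = []
    for t in range(len(diffs[0])):
        col = [diffs[p][t] for p in range(n)]
        out.append(statistics.fmean(col) / (statistics.stdev(col) / n ** 0.5))
    return out


def max_stat_permutation(diffs, n_perm=1000, seed=0):
    """Corrected p-value per time point from the max-|t| null distribution."""
    rng = random.Random(seed)
    observed = [abs(t) for t in t_values(diffs)]
    max_null = []
    for _ in range(n_perm):
        # Swap condition labels within participants (sign flip of the difference).
        signs = [rng.choice((-1, 1)) for _ in diffs]
        flipped = [[s * v for v in row] for s, row in zip(signs, diffs)]
        # Keep only the largest statistic across all time points.
        max_null.append(max(abs(t) for t in t_values(flipped)))
    return [(1 + sum(m >= obs for m in max_null)) / (n_perm + 1)
            for obs in observed]


# Synthetic example: 12 participants, 5 time points, a true effect at index 2.
_rng = random.Random(1)
example = [[_rng.gauss(0.0, 1.0) + (3.0 if t == 2 else 0.0) for t in range(5)]
           for _ in range(12)]
p_corrected = max_stat_permutation(example)
```

Comparing every observed statistic against the distribution of the maximum over all tested points controls the family-wise error rate, because a false positive anywhere requires exceeding what the largest null statistic typically reaches.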

Next, to fully explore our design we analyzed differences in the FRN amplitudes between monetary vs. neutral outcomes crossed with social situations. For this, we utilized a difference wave approach (Li et al., 2018). In each of the social situations, we subtracted ERPs of negative from positive monetary outcomes. In addition, we subtracted incorrect responses from correct ones in the neutral monetary outcomes. Then, as in the LMM analysis above, we quantified the FRN as the mean amplitude between 200 and 300 ms after the feedback presentation for each condition. We used a two-way repeated measures ANOVA with social situation (cooperative vs. competitive) and type of outcome (monetary vs. neutral) as within-participant factors. We applied a different statistical method than in the mean amplitude analysis above to analyze difference waves, as it is unclear how one would pair trials for subtraction on a single-trial level. Thus, we used grand averages for each condition to calculate difference waves. Moreover, we used a difference wave approach to simplify the necessary statistical model and to answer a different question, namely, whether the types of outcome (monetary vs. neutral) differ, regardless of whether the outcome is positive or negative. In the time window from 200 to 300 ms we found a main effect of social situation [F(1,25) = 6.17, p = 0.02, η<sup>2</sup> = 0.022, **Figure 4**], a main effect of type of outcome [F(1,25) = 4.55, p = 0.04, η<sup>2</sup> = 0.026], and no interaction between these factors [F(1,25) = 0.34, p = 0.56, η<sup>2</sup> = 0.003]. The amplitudes were more negative for monetary than neutral outcomes and more negative in competitive than cooperative situations. These results suggest that the effect of the social situation on the FRN reported above extends to neutral outcomes. Furthermore, the significant difference between monetary and neutral outcomes suggests that the FRN is sensitive to both monetary rewards and task performance. In addition, after we observed a significant cluster resembling the FRN in our exploratory analysis (172 to 340 ms after the feedback presentation, see above), we analyzed this later time window as well. In particular, we calculated the mean amplitude between 300 and 340 ms for each difference wave and analyzed it with a two-way repeated measures ANOVA with social situation (cooperative vs. competitive) and type of outcome (monetary vs. neutral) as within-participant factors. The main effect of social situation [F(1,25) = 0.05, p = 0.83, η<sup>2</sup> = 0.0002, **Figure 4**] was not significant. However, we found a main effect of type of outcome [F(1,25) = 8.32, p = 0.007, η<sup>2</sup> = 0.02] and an interaction effect between the factors [F(1,25) = 4.91, p = 0.036, η<sup>2</sup> = 0.06]. 
Thus, in contrast to the time window predefined based on earlier literature, we did not observe a main effect of the social situation in the late window, but an interaction arose between the factors of social situation and outcome. Hence, the complete time window reaching up to 340 ms contains dynamics and is not a completely homogeneous block. Specifically, these results further corroborate that feedback processing is sensitive to both monetary rewards and task performance. Moreover, the social situation modulates amplitudes for both of them.

Lastly, we address a potential visual confound in our design. As we used four different visual stimuli to inform our participants about their performance and the associated rewards, the results potentially reflect differences in the visual feedback. To address this potential perceptual confound, we re-invited five participants who had previously completed the experiment for a control experiment. In this version of the experiment, the Gabor patches were not displayed and random feedback was provided. Thus, this experiment controls for the purely visual effect of the feedback. To assess this potential confound, we calculated grand average ERPs for the experimental and control data. We visually inspected the ERPs and found no difference between the different visual feedback displays, including the early visual components. Then, we subtracted these control data from the experimental data. Again, visual inspection suggests that differences in the visual appearance of the feedback information did not influence the FRN (**Figure 6**). This is in line with previous research showing that only early components (e.g., C1, P1, N1) in the first 150 ms are modulated by such low-level visual stimulus properties (Wijers et al., 1989; Hillyard and Anllo-Vento, 1998). Hence, we are reassured that our results reflect differences between outcomes and social situations and not differences in the visual stimuli.

# 4. DISCUSSION

The goal of the present study was to compare reward processing between different social situations as well as to test whether earlier results (Picton et al., 2012) generalize to a setting which actively involves two participants. For this purpose, we designed a joint 4-AFC visual task, in which two co-actors concurrently perform a task and receive rewards depending on the social situation. We were able to replicate the difference in FRN amplitudes between positive and negative outcomes in the cooperative situation (Picton et al., 2012). Moreover, we extended these earlier results by observing a significant difference between win and lose outcomes in the competitive situation. We also found that the FRN differs significantly between social situations, suggesting that reward processing is modulated by the social situation. However, we did not observe an interaction between these two factors. Further, the differences induced by the social situations were stronger in participants with higher perspective taking scores, which were obtained using a perspective taking questionnaire. Finally, we compared feedback with and without monetary outcomes (win/lose vs. neutral) in both social situations. We found that our reported effect, that the social situation affects the FRN, also extends to the processing of neutral outcomes. Moreover, we found a significant difference between feedback with and without monetary outcomes, suggesting that the FRN is sensitive to both monetary rewards and task performance.

Earlier behavioral findings support the idea that humans co-represent co-actors' actions even if they are irrelevant to one's own goals (Atmaca et al., 2011; for a recent general review, see Vesper et al., 2017). Such representations may also influence how humans process feedback about actions and associated monetary rewards while performing joint actions with another person. Therefore, our experiment involved two participants performing their tasks simultaneously and hence differs from previous studies that utilized a virtual partner to investigate differences between social situations (Itagaki and Katayama, 2008). Moreover, the design allows for concurrent actions from both participants, an aspect that is not present in designs that employ turn-taking tasks, which create a division between a performer and an observer (Koban et al., 2010; Marco-Pallarés et al., 2010; de Bruijn and von Rhein, 2012). Thus, with the results of the present study, we extended earlier findings by demonstrating that they also generalize to a setting involving co-actors that both actively and simultaneously perform a task.

Our result that the outcome (positive vs. negative reward) affects the FRN in both social situations is in line with a large body of earlier research (Ullsperger et al., 2014). We quantified the FRN in two different ways (mean and peak-to-peak amplitude) and applied additional exploratory analyses. The results of all three analyses provide strong evidence that negative outcomes elicit more negative amplitudes at mid-line electrodes around 200 to 300 ms after the feedback presentation. This outcome suggests that the FRN component is robust and that it generalizes from individual to joint set-ups and to different social situations. In contrast, our results are not compatible with the theory that the FRN represents differences in expectancies and probabilities (Alexander and Brown, 2010, 2011). In our task the probabilities of each outcome were nearly equal; therefore, there were no differences in probabilities or expectancies. Future studies could investigate whether reward processing is also affected by the outcome in tasks in which both co-actors actively perform a task collaboratively as, for instance, in joint perceptual tasks (e.g., Brennan et al., 2008; Brennan and Enns, 2015; Wahn et al., 2016b, 2017a,b, 2018c; for a recent review, see Wahn et al., 2018a) or in joint motor tasks (e.g., Knoblich and Jordan, 2003; Wahn et al., 2016a; for a recent review, see Wahn et al., 2018b).

The main question, namely, whether reward processing differs between social situations, was addressed in three ways. First, we analyzed the FRN as mean as well as peak-to-peak amplitude and found a main effect of social situation. Second, we also found a main effect of social situation when analyzing the difference waves. Third, using an exploratory analysis, we again found a main effect of social situation. Taken together, these results suggest that the FRN amplitudes are affected by the social situation, although the lack of an interaction in the pre-specified time window (200 to 300 ms) implies that positive and negative outcomes are equally affected. Additionally, analysis of a later time window (300 to 340 ms) in the difference waves revealed a main effect of type of outcome and an interaction effect. This raises the question of which aspect of the change in social situation affects the FRN. Potentially, the social situations might differ with respect to arousal state and the amount of attentional resources utilized. Previous research points in the direction of such an interpretation (Cui et al., 2015). However, we did not observe differences in the level of performance as a function of the social situation. This makes an influence of variations in arousal or attentional resources on the FRN unlikely. Therefore, our study provides evidence that reward processing is affected by social situations; however, further research is needed to unravel the details of the processes involved. Given that we find that the social situation (cooperative or competitive) modulates the processing of feedback about our actions, an interesting research direction would be to test how different social situations may also affect how co-actors monitor joint actions (Keller, 2008; Vesper et al., 2010), representations of co-actors in a dyad (Sebanz et al., 2005), and the prediction of co-actors' actions (Keller et al., 2007). 
Moreover, our results are in line with EEG hyperscanning studies suggesting that different cognitive processes are involved in cooperative and competitive situations (Astolfi et al., 2010; Sinha et al., 2016).

A previous study suggested that the FRN is only sensitive to the outcome, but not to task performance as such (Itagaki and Katayama, 2008). As the FRN in response to neutral outcomes is mostly neglected in the literature (but see Holroyd et al., 2006), this is difficult to disentangle. Because our design included neutral outcomes, we were in a better position to do so. Specifically, the comparison of FRN amplitudes between feedback with and without monetary outcomes, in combination with correct or incorrect individual performance, revealed a significant difference between feedback with and without monetary outcomes. This result suggests that different neural processes are involved in processing outcomes and task performance. Given that we find that the FRN is present for neutral outcomes, this result suggests that the FRN is sensitive to outcome as well as task performance. However, we have to be cautious with the interpretation of these results because performance and monetary feedback were delivered at once and it is not clear how to disentangle them in our design. In future research, one could manipulate the chance of winning (25, 50, and 75%) for individual participants to extend the current results.

In this study, we used state-of-the-art EEG analysis methods, namely Linear Mixed Models for hierarchical analysis of single-trial activity (Frömer et al., 2018) and TFCE to control for multiple comparisons (Smith and Nichols, 2009; Mensen and Khatami, 2013). In the following, we first discuss the benefits of using these analysis techniques and then further discuss the results of our exploratory analysis. We quantified the FRN on a single-trial basis and used the LMM to model the FRN. This approach helps to account for a multitude of problems. For instance, it handles unequal numbers of observations per cell, allows for between-participant variability in effect sizes, and combines single-participant variability and group-level variability (Pinheiro and Bates, 2000; Baayen et al., 2008; Barr et al., 2013; Matuschek et al., 2017). In our experiment, we tried to reduce the first problem of unequal cell sizes by using the QUEST procedure to obtain an almost equal number of trials. Nevertheless, EEG data have to be cleaned, and depending on the noise level the number of rejected trials varies between participants. Moreover, high variability between participants is prevalent in the field of cognitive neuroscience and has to be accounted for (Seghier and Price, 2018). The LMM approach is suitable to address this problem. Our additional motivation to use this method was its capability of estimating effect sizes for individual participants. We correlated these estimates with information about the personality traits of participants to test a possible association between neurophysiological and questionnaire data. We also made use of the TFCE permutation analysis to perform the exploratory analysis (Mensen and Khatami, 2013) without specifying electrode sites or a time window. 
This approach circumvents the need to preselect time points and electrodes (Bishop, 2007), which is an additional benefit as making these decisions may not always be straightforward, especially in the absence of clear guidelines.

Using this exploratory analysis, we found the same pattern of results as in our confirmatory analysis above. Namely, we found a main effect of the outcome and of the social situation in both the LMM and the permutation analysis, further corroborating earlier results that the FRN is sensitive to positive and negative outcomes and the social situation (Ullsperger et al., 2014). In addition, our exploratory analysis showed that these differences preceded the time window typically defined for the FRN, suggesting that the human brain differentiates the valence of the outcome and the social situation earlier than previously suggested (Koban et al., 2010; Marco-Pallarés et al., 2010; Rigoni et al., 2010; de Bruijn and von Rhein, 2012; Picton et al., 2012; Loehr et al., 2013, 2015). Our results (**Figure 5**, second row) suggest that there is stronger positive activation in the cooperative than in the competitive situation at two stages of feedback processing, namely, around 160 and 280 ms after the feedback presentation. The social situation main effect might arise from a source close to the CP6 and P6 electrodes. Because the superior temporal sulcus (STS) and the temporoparietal junction (TPJ) are close to these electrodes and earlier fMRI research suggests that these areas are involved in differentiating the self from others, this might be the origin (Saxe and Kanwisher, 2003). Thus, one interpretation is that when people receive and process feedback simultaneously in a cooperative situation, they merge their own and their co-actor's positive outcomes and process them together, whereas the competitive situation requires distinct processing of rewards. However, this interpretation has to be treated with caution due to the inverse problem.

Moreover, we investigated the relation between the Perspective taking score and the mixed model best linear unbiased prediction of the factor social situation. We found that the higher the Perspective taking score, the stronger the difference in FRN amplitudes between social situations. This result suggests that personality traits related to perceiving and understanding others might be related to the strength of the neurophysiological response to rewards. Thus, brain mechanisms involved in reward processing might be more sensitive to different social situations in people who show more consideration for others. However, this result and interpretation should be treated with caution, as using mixed model best linear unbiased prediction in combination with a correlation analysis is a new approach and still has to be fully validated (Houslay and Wilson, 2017).
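The two-step logic described above (obtain a per-participant prediction of the social-situation effect, then correlate it with a questionnaire score) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' analysis: it assumes a basic random-effects model in which the between-participant variance (`between_var`) and each participant's sampling variance (`sampling_vars`) are already known, whereas a fitted mixed model estimates them from the data. All names are hypothetical.

```python
import math

def shrunken_effects(raw_effects, sampling_vars, between_var):
    """BLUP-style predictions: shrink each raw effect toward the group mean.

    Assumes raw_effects[i] ~ Normal(mu_i, sampling_vars[i]) and
    mu_i ~ Normal(mu, between_var), with both variances treated as known.
    """
    # Precision-weighted (generalised least squares) estimate of the group mean.
    weights = [1.0 / (between_var + v) for v in sampling_vars]
    grand = sum(w * d for w, d in zip(weights, raw_effects)) / sum(weights)
    out = []
    for d, v in zip(raw_effects, sampling_vars):
        reliability = between_var / (between_var + v)  # 0 = noise, 1 = exact
        out.append(grand + reliability * (d - grand))
    return out

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical example values: per-participant FRN difference between
# social situations, its sampling variance, and a questionnaire score.
raw = [0.5, 1.0, 2.0, 3.5]
svar = [0.2, 0.8, 0.3, 1.5]
trait = [10, 14, 13, 20]
predictions = shrunken_effects(raw, svar, between_var=1.0)
r = pearson_r(predictions, trait)
```

The sketch also shows why caution is warranted: the correlation treats the predictions as if they were measured without error, although participants with noisier data are shrunk more strongly toward the group mean. This ignored uncertainty is the kind of concern raised by Houslay and Wilson (2017).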

Taken together, we investigated the neural underpinnings of feedback processing in cooperative and competitive situations. We found that the FRN component is sensitive not only to positive and negative outcomes but also to the social situation in a design in which both co-actors in a dyad actively perform a task.

# AUTHOR CONTRIBUTIONS

AC, BW, and PK: Study design; AC: data collection; AC and BE: data analysis; AC, BE, BW, and PK: draft and revisions of manuscript.

# ACKNOWLEDGMENTS

We gratefully acknowledge the support by the European Commission Horizon H2020-FETPROACT-2014 641321 socSMCs, the DFG-funded Research Training Group Situated Cognition (GRK 2185/1) and the Deutsche Forschungsgemeinschaft (DFG) Open Access Publishing Fund of Osnabrück University. Moreover, we would like to thank Anna Lisa Gert for her help with preprocessing the EEG data, as well as the following students, who collected the data: Chiara Carrera, Marketa Becevova, Maria Sokotushchenko, Greta Häberle and Susanne Schuberth.

# REFERENCES


Ehinger, B. V. (2018). EEGVIS Toolbox. Osnabrück.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2019.00361/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Czeszumski, Ehinger, Wahn and König. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.