Face-to-Face Communication in Aphasia: The Influence of Conversation Partner Familiarity on a Collaborative Communication Task

Aphasia is a language impairment due to acquired brain damage. It affects people’s ability to communicate effectively in everyday life. Little is known about the influence of environmental factors on everyday communication for people with aphasia (PWA). It is generally assumed that for PWA speaking to a familiar person (i.e. with shared experiences and knowledge) is easier than speaking to a stranger (Howard, Swinburn, and Porter). This assumption is in line with existing psycholinguistic theories of common ground (Clark, 1996), but there is little empirical data to support it. The current study investigated whether PWA benefit from conversation partner (CP) familiarity during goal-directed communication, and how this effect compared to that in a group of neurologically healthy controls (NHC). Sixteen PWA with mild to severe aphasia, sixteen matched NHC, plus self-selected familiar CPs participated. Pairs were videotaped while completing a collaborative communication task. Pairs faced identical Playmobil rooms; the view of the other’s room was blocked. Listeners attempted to replicate the 5-item set-up in the instructor’s room. Roles were swapped for each trial. For the unfamiliar condition, participants were paired with another participant’s CP (PWA were matched with another PWA’s CP based on their aphasia profile). The outcomes were canonical measures of communicative efficiency (e.g. accuracy and time to complete). Results showed different effects in response to the unfamiliar partner for PWA compared to NHC. In the instructor role, PWA showed faster trial times with the unfamiliar partner, but similar accuracy scores in both conditions. NHC, on the other hand, showed similar trial times across CPs, but higher accuracy scores with the unfamiliar partner. In the listener role, PWA showed a pattern more similar to NHC: equal trial times across conditions, and an improvement in accuracy scores with the unfamiliar partner.
Results show that conversation partner familiarity significantly affected communication for PWA dyads on a familiar task, but not for NHC. This research highlights the importance of identifying factors that influence communication for PWA and understanding how this effect varies across aphasia profiles. This knowledge will ultimately inform the assessment of, and intervention for, real-world communication.


INTRODUCTION
One-third of individuals who suffer a stroke will experience aphasia (difficulties speaking and understanding language, reading and writing) (Spaccavento et al., 2013), with detrimental effects on communication and functioning in everyday life (Lam and Wodchis, 2010; Hachioui et al., 2014). When compared against various health conditions (e.g. cancer and Alzheimer's disease) aphasia has the highest impact on quality of life (Hilari et al., 2003; Lam and Wodchis, 2010; Spaccavento et al., 2013). The loss of functional language use affects social, vocational, and emotional well-being (Hilari et al., 2003; Spaccavento et al., 2013), preventing People with Aphasia (PWA) from participating in society and maintaining relationships.
Traditionally, the study of aphasia has focused on impairments of language, with assessment tasks that present isolated language elements (e.g. sounds, words, sentences) in highly controlled lab environments. These studies have been the foundation for the development of reliable assessment instruments and intervention plans targeted at particular profiles of language impairment (Thompson et al., 2008). However, it is generally accepted that such impairment-based performance measures do not reliably predict communication ability in the real world (Holland, 1982; Kolk and Heeschen, 1992; Wilkinson, 1995; Beeke et al., 2007; Davidson et al., 2008; Armstrong et al., 2011). Perhaps because of the complexity of language and communication, the same level of detailed analysis has not been applied to real-world communication for PWA (Leaman and Edmonds, 2019). Providing reliable assessment and evidence-based interventions at the level of communication has, for that reason, remained problematic in aphasiology (Brady et al., 2016). This is a crucial gap in knowledge, as improvement in the ability to communicate in one's own day-to-day environment remains one of the most important long-term goals reported by clinicians and PWA themselves (Thompson et al., 2008).
There is a need for systematic, theoretically driven research on naturalistic communication in aphasia. Recently, we showed how a theoretical framework of situated language use, borrowed from research with neurologically healthy controls (NHC) (Clark, 1996), can be applied to aphasia rehabilitation (Doedens and Meteyard, 2019). It provides a structure along which different components of real-world communication, and their influence on a person's ability to communicate, can be examined systematically. The framework defines communication as being 1) interactive (including at least one other person), 2) multimodal (involving multiple channels of information) and 3) contextual (grounded in shared situational, personal and social knowledge).
Here, we will focus on the contextual aspect of communication. One part of contextual information is common ground shared with a conversation partner (CP), part of which is modulated by the familiarity of that CP. For PWA, questionnaires on communication often distinguish between the ability to communicate with familiar and unfamiliar CPs (e.g. the disability questionnaire of the Comprehensive Aphasia Test; Howard et al., 2004; or the Aphasia Impact Questionnaire-21; Swinburn et al., 2018). The assumption is often made that it is easier for PWA to speak to a familiar person than to a stranger (Green, 1982; Wirz et al., 1990; Ferguson, 1994; Perkins, 1995; Howe et al., 2008; Laakso and Godt, 2016). The familiarity advantage has also been reported by PWA as an influential factor when it comes to ease of communicating (Dalemans et al., 2010).
pairs. Brown-Schmidt (2009a) found a similar effect of task: only performance on an interactive task showed an influence of common ground between conversation partners (participants were manipulated on shared knowledge within the experiment, not on personal familiarity; Brown-Schmidt, 2009b). The authors argued that the type of task and its complexity might have been of influence: the more complex, the greater the need to rely on shared information to complete the task. Interestingly, Pollmann and Krahmer (2017) showed that in addition to the higher accuracy scores, familiar pairs reported higher levels of motivation and enjoyment of the game, suggesting that these factors might influence communicative efficiency and accuracy as well. Finally, on the non-interactive email task, an effect of friendship closeness on accuracy was found for the group of familiar pairs. Andersson and Ronnberg (1997) showed that participants performed better on a word association task when working with friends compared to strangers. Fussell and Krauss (1989) also reported higher levels of accuracy when subjects were asked to interpret a message that was recorded specifically for them by a friend, than when they were asked to interpret a message that was recorded by a stranger. While the difference in accuracy scores between these conditions was significant, it was very small. The authors hypothesized that the traditional referential communication task might not have required participants to rely on personal common ground. Instead, reliance on general, community-wide knowledge would have enabled participants to successfully interpret messages recorded by a stranger (Fussell and Krauss, 1989). Furthermore, the authors suggested that the degree to which familiar pairs know each other (i.e. length of time, level of intimacy) might have mediated this effect, as the familiar pairs in their study had known each other for less than six months (Clark and Schaefer, 1987;Fussell and Krauss, 1989).
The research on the benefit of conversation partner familiarity is, however, inconclusive. Gould et al. (2002) did not find differences across tasks between familiar and unfamiliar pairs. The authors also suggested that the familiarity effect might only be present in particular communicative or experimental contexts, as well as depend on the type of relationship that is studied. Schober and Carstensen (2010) also found no difference between familiar and unfamiliar pairs on their efficiency and accuracy in describing unfamiliar things, such as tangram shapes. While Pollmann and Krahmer (2017) found differences in accuracy between familiar and unfamiliar pairs on a face-to-face task, no differences were found between the groups on efficiency. Finally, it has also been suggested that the existence of shared knowledge between two interlocutors might not necessarily lead to a reliance on that shared knowledge per se. Instead, it might lead the speaker to rely more on their own knowledge. While this can facilitate communication on some topics (when speaking about topics that are part of shared knowledge), it can also lead to greater confusion when communicating about topics that are not part of common ground (Wu and Keysar, 2007;Savitsky et al., 2011). Overall, the research on the effect of conversation partner familiarity on communication efficiency and accuracy remains relatively inconclusive. It is suggested to depend on factors such as the type and complexity of task, the topic of conversation and whether it requires personally shared knowledge to be understood, the type, length and intimacy of the relationship under study and the motivation of the interlocutors on the task.

Conversation Partner Familiarity in the Aphasia Literature
Only a small number of studies have explored the influence of personal common ground on communication for PWA. Leaman and Edmonds (2019) analyzed and compared the unstructured conversations of eight PWA (most with mild anomic aphasia) with a familiar conversation partner (FCP) and an unfamiliar speech and language therapist (SLT). The authors reported no differences on measures of communicative success, on linguistic measures such as grammaticality (morphological and verb tense/mood errors) and sentence production (correct use of a complete sentence frame and the relevance of lexical items in the frame in the discourse context), or on lexical retrieval behaviors (false starts, repetitions, pauses of 2+ s, etc.). These findings suggest that some linguistic characteristics of conversation for PWA might remain stable across conversation partners. Kistner (2017) assessed gesture use by twenty PWA (ranging from severe to mild aphasia) and NHC in conversation with FCPs and unfamiliar conversation partners (UFCP). A procedural and a narrative conversational task were used to elicit conversation. UFCPs were SLT students or researchers with knowledge of aphasia. In this study, both PWA and NHC showed an increase in the number of gestures when speaking to the UFCP as compared to the FCP. The authors hypothesized that gesture production increased to help disambiguate meaning or as speech became more complex. With the UFCP, this need increased due to the lack of shared reference. Williams et al. (1994) explored the influence of conversation topic and conversation partner familiarity for 22 PWA and ten NHC on a procedural and story-retell task. The syntactic complexity measures in the study showed no effect of CP familiarity (Williams et al., 1994). On the same dataset, Li et al. (1995) found no significant differences on discourse grammar between conversations with FCPs and UFCPs, except on the description of the setting in the story retell task, where PWA provided more detail with the FCP. The authors suggested PWA might have felt more comfortable or at ease with the familiar CP, which could have facilitated recall of that particular aspect of the story.
Finally, case studies by Gurland et al. (1982) and Lubinski et al. (1980) showed that PWA used different communication styles depending on the familiarity of their CP. Gurland et al. (1982) showed that a greater number of acknowledgments were produced in conversation with a familiar CP, while with the unfamiliar CP, topic-relevant turns increased. The authors suggested PWA might take on a more "passive, less informative role with the spouse (familiar CP) vs. the clinician (unfamiliar CP)" (Williams et al., 1994). Lubinski et al. (1980) compared the unstructured conversation of one PWA with a familiar CP (spouse) and a therapy session with an UFCP (in this case, a SLT). The topic of conversation was not controlled for. The number of conversational breakdowns and repairs was assessed: similar types of conversational breakdowns were found with the FCP and UFCP. The way in which the breakdowns were repaired, however, differed significantly. UFCPs (SLT) tended to gloss over the breakdowns, while FCPs (spouse) actively attempted to repair them collaboratively with the PWA. The authors suggested that one reason for this difference was the different goals each CP had during their conversation with the PWA: the clinician often let the PWA repair the conversational trouble, while the spouse wanted to discuss the plans for that day collaboratively. Ferguson (1994) found no difference in trouble indicating behaviors between FCP and UFCP in a study with eight PWA, where the conversational topic was slightly more aligned. The authors found that the way these troubles were dealt with was different depending on the familiarity of the CP: UFCP more often took on the responsibility of repairing the trouble (i.e. "other-repair"), rather than letting the PWA repair the trouble (i.e. "self-repair"). The authors hypothesized that by not letting PWA repair the trouble as often, UFCPs might have been driven by a desire to avoid potential continued conversational breakdown. The familiarity manipulation might not have been sufficient in this latter study: the role of UFCP was filled by someone who knew the PWA less well compared to the FCP, but still had known the PWA for years.

Confounding Factors in Interactive Communication in Aphasia
In addition to the effect of personal common ground, there are two confounding factors that have been shown to influence the communicative ability of PWA. First, research has shown that communication for PWA is influenced by the extent of knowledge the CP has about aphasia, the language impairment and potential communication strategies they can use to facilitate communication (Rayner and Marshall, 2003). CPs with knowledge of communicating with PWA have been shown to enable PWA to communicate more effectively and increase the PWA's level of participation in conversation (Lindsay and Wilkinson, 1999; Pound et al., 2000; Kagan et al., 2001; Simmons-Mackie et al., 2010; Wilkinson and Wielaert, 2012; Nykanen et al., 2013). PWA also specifically self-report the positive impact of communicating with someone who knows about aphasia and what communication strategies to use during conversation (Dalemans et al., 2010; Harmon, 2020).
Second, the sense of comfort and support experienced during communication has been suggested as an important factor for communicative ability (Dalemans et al., 2010; Worrall et al., 2010; Harmon, 2020). Though not exclusively, this sense of comfort and support is often associated with the familiarity of the CP. This line of reasoning suggests that the fear of not being able to express oneself due to the language impairment, and subsequently the fear of "losing face" or of being perceived unfavourably because of the communication difficulties, can make communication with an UFCP more effortful and a more negative experience (Harmon, 2020). For PWA, this could result in more errors in their language production, more and longer word searches, or avoidance of the interaction with the UFCP, resulting in, for example, shorter interactions altogether.
Suggestions to this end have been made in the literature (Li et al., 1995; Kistner, 2017). In a discussion of the use of compensatory communication strategies by PWA, Simmons-Mackie and Damico (1995) showed that PWA may vary their communication strategies depending on the goal in a particular context, such as "looking okay", rather than being maximally communicatively effective. To the knowledge of the authors, the sense of being at ease during communication and its relation to conversation partner familiarity have not been explored empirically.
In sum, the existing research suggests that the presence of personal common ground can influence communication for PWA. The existing evidence base is small, but it seems that the effect of conversation partner familiarity might depend on the level at which communication is measured. It seems that lower level linguistic measures such as verb or sentence production could remain stable across different conversation partners, while higher level communication strategies such as the use of gesture or the repair of conversational trouble might vary. More work is needed, however, to assess whether this advantage exists, how it manifests, whether it exists for all types of aphasia, and if it is mediated by other factors such as aphasia severity. It is crucial to control for the influence of other confounding factors such as knowledge of aphasia of the CP, the sense of comfort experienced by the PWA as well as the conversation topic.

The Current Study
The aim of the current study was to investigate whether CP familiarity affects communication for PWA. Participants completed a collaborative task that required communication in two different conditions: once with a FCP, and once with an UFCP. Participants were in two groups: PWA with a NHC conversation partner, and NHC with a NHC conversation partner. To investigate the question of personal common ground we controlled for the potential influence of two confounding factors. Knowledge of aphasia was controlled for by swapping the CPs of pairs of PWA who were matched on their linguistic and communication impairment profiles. Knowledge of aphasia was also tested through a questionnaire. The sense of comfort was taken into account by asking each familiar and unfamiliar pair to indicate the level of comfort they felt while completing the task with their conversation partner. These research questions were part of a bigger pre-registration (https://osf.io/9xwm7).
A collaborative task was used to elicit naturalistic communication between the participant pairs. Different versions of this task have been used in previous research with NHC (Clark and Wilkes-Gibbs, 1986; Boyle et al., 1994; Clark, 1996; Clark and Krych, 2004; Howarth and Anderson, 2007; Lysander and Horton, 2012) where naturalistic communication is investigated in a controlled lab setting. This experimental setup made it possible to adhere to the previously described framework of real-world communication and to manipulate variables within that framework (Doedens and Meteyard, 2019; see Table 1).
To measure the effect of the experimental manipulation on communicative success for PWA and NHC, a selection of key outcome measures was made based on previous literature on CP familiarity with PWA and NHC. Based on research with NHC, measures of trial time and task accuracy were selected. Previous research with PWA suggests that the number of times trouble is identified during conversation can be indicative of communicative success (e.g., Beeke, 2012). We therefore also included a measure of self-initiated repair (i.e. instances where the "instructor" initiates a self-correction) and a measure of other-initiated repair (i.e. instances where the "listener" requests clarification on what has been said) as measures of communicative success.
Due to the nature of the task, an additional analysis was included (not part of the pre-registration). This analysis aimed to assess the influence of role (instructor or listener) on goal-directed communication. The current study included trials in which PWA and NHC took turns in an "instructor" role, requiring them to actively communicate new information to their CP. Conversely, participants also took on the "listener" role, requiring them to follow instructions from their CP. Previous studies with NHC have assumed no differences in role for measures such as time taken and accuracy (Boyle et al., 1994). Therefore, no difference in roles was expected for NHC for the measures of time and accuracy. However, as PWA present with impairments of language production and comprehension, a difference in performance based on role can be expected. For the number of self-initiated repairs and clarification requests, we expected an effect of role for both groups. Self-initiated repairs are naturally expected to be more frequent when someone speaks more (i.e. the "instructor" role), while clarification requests are naturally expected to be more frequent when someone is in the "listener" role. Finally, given the inherent variability of the language impairment within the aphasia group, we include a visual representation of the individual difference in scores between conditions (i.e. familiar-unfamiliar), ordered by a standardized measure of aphasia severity. This will provide insight into the spread of individual data-points within the aphasia group, and how this compares to the NHC group.
The analysis addressed the following research questions: (1) What is the effect of speaker role (instructor/listener) on goal-directed communication? (2) What is the effect of CP familiarity (personal common ground) on goal-directed, face-to-face communication in aphasia? (3) Do PWA differ from NHC in how they respond to CP familiarity during goal-directed communication?
Based on the existing literature, it was hypothesized that it would be easier for PWA to complete the task with a FCP than with an UFCP, as evidenced by the familiar pair taking less time, requiring fewer repairs and fewer requests for clarification, and obtaining higher accuracy scores. Based on the case study by Lubinski et al. (1980), it is also possible that the number of repairs is a lower-level behavior that remains stable across conversation partners. In comparison to NHC, we expect PWA to show a similar direction of the effect of CP familiarity. Due to the presence of the language impairment for PWA, we expect the CP familiarity effect to be greater for PWA compared to NHC, i.e. we expect PWA to have more difficulty adapting to communicating with an UFCP, or to benefit more from communicating with their FCP (see Table 2).

Ethics Statement
This study was carried out with ethical clearance from the School of Psychology and Clinical Language Sciences, University of Reading (Ref: 2018-093-LM). All participants provided informed consent prior to taking part in the study. Consent and information forms were adapted to an aphasia-friendly format for the participants with aphasia.

Type of code: Description. Example.
Self-initiated repairs
Revised repair: The interlocutor repeats the main clause with modifications. Example: "The man goes under the chair ... no I mean he goes on the chair"
Addition repair: The interlocutor provides additional information to the main clause. Example: "The sofa is in opposite the window ... the small window"
Word finding repair: The interlocutor explicitly has word-finding difficulties (repetitions without revisions, additions or explicit statements of difficulties finding a word are not included). Example: "... oh what is that word?"
Clarification requests
Request for elaboration or clarification: The interlocutor asks their CP to provide more information on what has been said. This type of clarification request includes most wh-questions. Example: "Which window?" or "Where?"
Statement of not understanding: The interlocutor indicates that they did not follow what their CP said. Example: "I don't understand" or "Huh?"
Partial or complete repetitions: The interlocutor repeats (part of) a phrase as produced by the CP, sometimes with a questioning intonation, to check if they have understood correctly. Example: CP1: "by the window on the left" CP2: "by the window on the left?"
Insertion: When the CP is speaking the interlocutor inserts a word or phrase that fits into the utterance of the CP. This can happen, for example, when the CP pauses to search for a word. The insertion functions as an evaluation for the interlocutor to assess if they have correctly understood the utterance of the CP. Example: CP1: "and then the sofa is facing the.." CP2: "The tv cabinet?" CP1: "yes, the tv cabinet"
Indirect request for clarification: The interlocutor asks for a repetition of what has been said, indirectly indicating that they (might) not have fully understood or followed. Example: "Please speak more slowly"

Frontiers in Communication | www.frontiersin.org | June 2021 | Volume 6 | Article 574051

Participants
Sixteen participants with post-stroke aphasia (42-72 years, M = 60.94, SD = 9.41) and sixteen control participants (NHC; 52-84 years, M = 64.94, SD = 9.66) took part in the current study. PWA and controls were matched for age (t(30) = 1.19, p = 0.245) and years of education (t(29) = −0.07, p = 0.946). Nine male and seven female PWA were recruited through the Aphasia Research Registry of the School of Clinical Language Sciences, University of Reading (British Academy Grant ARP scheme 190023), as well as through local stroke groups. PWA were at least one year post-stroke (1-14 years, M = 7.04, SD = 3.85) and were native speakers of English prior to the stroke. Exclusion criteria were coexisting neurological diagnoses such as dementia and an inability to provide consent due to severe comprehension difficulties. Seven male and nine female NHC were recruited through the older adult research panel at the School of Psychology, University of Reading. The exclusion criterion for NHC was a history of neurological illness. All subjects reported normal or corrected-to-normal vision and hearing.
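As a worked check of the age-matching statistic, the following minimal sketch recomputes a two-sample t value from the group means and standard deviations reported above. It assumes an independent-samples Student's t-test with pooled variance; the text does not state which variant was used, and the function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's two-sample t statistic and degrees of freedom from summary statistics.

    Assumes independent groups and a pooled variance estimate (an assumption,
    not a detail given in the text).
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df

# Age in years: NHC (M 64.94, SD 9.66) vs. PWA (M 60.94, SD 9.41), 16 per group
t_age, df_age = pooled_t(64.94, 9.66, 16, 60.94, 9.41, 16)
print(f"t({df_age}) = {t_age:.2f}")
```

Under these assumptions the result agrees with the reported t(30) of 1.19 after rounding.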
All participants brought along a FCP to take part in the study with them. The PWA self-nominated a FCP who they spoke to regularly. Six male and ten female FCPs agreed to take part (partner, friend or family member between the ages of 22-72 years, M = 54.12, SD = 15.12; see Table 3 for more details). All FCPs except those labeled child (only ID 48), ex-partner and friend lived in the same house as the PWA. For NHC, partners were recruited as the FCP (age range 51-79 years, M = 64.12, SD = 7.57; see Supplementary Table S1). All FCPs lived in the same house as their partner. All FCPs reported normal or corrected-to-normal vision and hearing and did not report a history of neurological illness.
All PWA completed the Western Aphasia Battery-Revised (WAB-R; Kertesz, 2009). The aphasia quotient (AQ) score ranged from 11.60 to 94.2 (M = 65.88, SD = 26.59), with severities ranging from very severe to mild (see Table 3 for an overview). To obtain a standardized measure of communicative ability, PWA also completed the Scenario Test United Kingdom (Hilari et al., 2018). Scores ranged from 20.25 to 54 (maximum score 54, M = 45.64, SD = 8.83; details shown in Table 3). Thirteen out of sixteen PWA had some degree of weakness (hemiparesis) on the right-hand side due to the stroke. All PWA were able to use their unaffected arm and hand effectively. All PWA were mobile enough to attend the experiment at the University clinic. One PWA attended the clinic in a wheelchair.
All participants without aphasia completed the Montreal Cognitive Assessment (MoCA; Nasreddine et al., 2005), a cognitive screening tool for mild cognitive impairment. Scores ranged from 17 to 30 (M = 27.23, SD = 2.49). Six participants scored below the cut-off score of the test (<26 points; one NHC with score 17; two FCP to NHC with scores 23; three FCP to PWA with scores 22, 23, 24), suggesting the potential presence of mild cognitive impairment. Due to difficulties in recruiting the PWA subjects, their partners and age- and education-matched controls, none were excluded from participation on the basis of their MoCA scores. Following reviewer comments, an additional statistical analysis was run in which these subjects were excluded, as described in the Statistical Analysis section. The MoCA was not administered to the people with aphasia. The heavy reliance of this test on language in its instructions and responses makes it unsuitable and unreliable for administration with people with existing language processing difficulties.

Procedure
All participants were invited to take part in a study about conversation and different CPs. Background testing with PWA was completed either at the participant's home or at the School of Clinical Language Sciences, University of Reading. All NHC completed background testing at the University of Reading.
For the experimental session, two participants and their respective FCP were invited to the Speech and Language Therapy Clinic at the University of Reading.

Task
The experiment used a collaborative, referential communication task (Clark and Krych, 2004) that allows pairs to interact and communicate freely, replicating a real-life face-to-face communicative setting. Pairs sat across from each other, in front of identical Playmobil rooms (see Figure 1). The view of the other person's room was blocked by a low barrier. Five items were placed in one room (instructor), while the other room (listener) remained empty with six items placed on the side of the room. Pairs were asked to replicate the setup of the instructor's room in the listener's room. They were asked to communicate as they normally would, including the use of any communication aids. Pen and paper were provided for both participants. Participants were instructed not to show items to their CP or to look over the barrier at the other room. In many ways, the current set-up echoes that of PACE in the aphasia literature (Davis and Wilcox, 1985; Davis, 2005). Aphasia-friendly images were used to visually support the instruction for all participants. The experimenter left the room for the duration of the task. When the pair completed the task, they pressed a button. The experimenter then re-entered, took a picture of both rooms, and showed the participants the result. Any paper used was collected by the experimenter and the next trial was set up. Each pair (familiar and unfamiliar) completed the game six times. For each trial, roles (instructor/listener) were swapped, resulting in three instructor trials and three listener trials for each participant. The starting role was counterbalanced across participants. A different setup of items was used for each trial, the order of which was randomized for each pair.
Notes: *Classification refers to the communicative ability of the PWA: "almost no communicative ability", "seriously limited communicative ability", "okay communicative ability in simple situations" and "good communicative ability in simple situations". * indicates a MoCA score below the cut-off (<26).
The experimental manipulations of the current study can be summarized according to the previously described framework of real-world communication (Doedens and Meteyard, 2019). See Table 4.
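The trial structure described in the Task section (six trials per pair, roles alternating on every trial, a different room setup per trial in randomized order) can be illustrated with a minimal sketch. The function name `trial_schedule`, the numeric setup IDs and the `seed` parameter are hypothetical and serve illustration only; they are not part of the study materials.

```python
import random

ROLES = ("instructor", "listener")

def trial_schedule(starting_role, n_trials=6, n_setups=6, seed=None):
    """Return (trial_number, participant_role, setup_id) tuples for one pair.

    Roles alternate on every trial, beginning with starting_role, and each
    trial uses a different setup drawn in random order.
    """
    rng = random.Random(seed)
    setups = list(range(1, n_setups + 1))
    rng.shuffle(setups)  # randomized setup order for this pair
    start = ROLES.index(starting_role)
    return [(i + 1, ROLES[(start + i) % 2], setups[i]) for i in range(n_trials)]

# One participant who starts as instructor; the seed is arbitrary
schedule = trial_schedule("instructor", seed=1)
for trial, role, setup in schedule:
    print(trial, role, setup)
```

Counterbalancing the starting role across participants would correspond to calling the function with "instructor" for half of the participants and "listener" for the other half.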

Materials
An empty Playmobil room with four windows and one door was used for the current experiment. Six Playmobil objects were selected based on psycholinguistic features that have been shown to influence lexical retrieval in PWA (Nickels and Howard, 1995; see Supplementary Table S2 for details). The items were selected based on high levels of concreteness, familiarity and imageability, as well as a (roughly) low number of phonemes, to facilitate naming of the items as much as possible.
Six different room setups were created by placing five Playmobil items in various configurations across the room (see Figure 2). One item (counterbalanced across trials) was a distractor and placed outside of the room. Three additional objects were permanently placed in the same location across all six trials, functioning as reference points for the other objects: 1) a chest of drawers with 2) a television on top and 3) a potted plant in the opposite corner of the room.

Table 4 (excerpt)
Unrestricted use of all communicative modalities (gesture, facial expressions, body posture, intonation, language)
Optional use of pen and paper for drawing and writing (specified as "if you need to, you can use")
Added option of communication aid
Common ground
Personal: Interaction with a familiar CP and with an unfamiliar CP (the main experimental manipulation)
Communal: -
Communicative: Repetition of the same task across 6 trials allowing CPs to build communicative context. Theoretically, this context could have carried over into the unfamiliar condition, where the same task was repeated
Situational: The use of 6 concrete, highly frequent, familiar, and recognisable objects and their physical location in relation to a physical space and each other

Between conditions, the physical appearance (i.e. the color) of the cat and the hair of the woman was changed to incorporate some variation in the stimuli. Two reference objects were also changed: the potted plant was replaced by a different potted plant and the television was replaced by a set of books. The location of all the items remained constant.

Familiarity Manipulation
In the unfamiliar condition, each participant was matched with another participant's FCP. PWA were matched with the FCP of a PWA with a similar aphasia profile based on their WAB-AQ score and their communication score on the Scenario Test (Meulen et al., 2010). This way, PWA were matched with an FCP who was unfamiliar at a personal level, but who had experience communicating with someone with roughly similar communication difficulties. Where possible, PWA were also matched on age and gender (see Supplementary Table S3 for more details). In the control group, NHCs were matched on gender, age and years of education (in order of priority). For the unfamiliar condition, each NHC was paired up with their matched NHC's FCP (see Supplementary Table S4 for details on matching).
At the beginning of each condition, each participant was asked to rate the familiarity of their CP on an aphasia-friendly Likert scale (0 = this person is a stranger, 5 = I know this person extremely well). For both groups, the FCP was rated higher in familiarity (PWA: M = 3.55, SD = 0.62; NHC: M = 3.97, SD = 0.12) compared to the UFCP (PWA: M = 0.52, SD = 0.92; NHC: M = 0.03, SD = 0.12). The difference in familiarity ratings was significant for both groups (PWA: t(30) = 10.97, p < 0.001; NHC: t(30) = 89.09, p < 0.001).
The order of conditions was not counterbalanced: All participants first completed the familiar condition, followed by the unfamiliar condition. The authors decided against counterbalancing the order of conditions to minimize potential anxieties about communicating with an UFCP for the PWA.

Controlling for Knowledge of Aphasia
To control for knowledge of aphasia, all CPs of PWA filled out a questionnaire testing their knowledge of aphasia (factual knowledge and knowledge of communication strategies as described in Rayner and Marshall, 2003).

Sense of Comfort With the CP
The degree of comfort participants felt with their FCP and UFCP during the task was taken into account: At the end of each condition, each participant was presented with a statement ("I feel that my partner and I communicate comfortably together") and a visual 5-point Likert scale (0 = completely disagree, 4 = completely agree). For both PWA and NHC, the degree of comfort they felt with their CP was roughly equal in the familiar (PWA: M = 3.56, SD = 0.51; NHC: M = 3.71, SD = 0.47) and unfamiliar condition (PWA: M = 3.28, SD = 0.52; NHC: M = 3.53, SD = 0.62). A non-parametric paired test (Wilcoxon signed-rank, reported as V) showed no significant difference between the degree of comfort participants felt with their FCP and UFCP (PWA: V = 18, p = 0.119; NHC: V = 20, p = 0.299).

Coding
All trials were video and audio recorded. Videos of the interactions were coded in ELAN (The Language Archive, 2019). For the purpose of this study, the following measures were coded:
Trial time. All videos were coded for trial time. Trial time was defined as the interval from the moment participants started to communicate on a trial (speak, draw, gesture, etc.) until the moment one of the participants pressed the button to signal the experimenter to come into the room.
Task accuracy. Task accuracy was defined as the correct placement of the items in the listener's room as compared to the instructor's room as set up by the experimenter. The setup of the instructor's and listener's room was photographed at the end of each trial. Both images were scored by two independent judges on accuracy (correct/incorrect) of two aspects of the item: its location (in the room and in relation to other objects) and its orientation. For the people, two additional aspects were coded: the action that was undertaken by the item (i.e. standing, sitting, etc.) and the positioning of the arms. For all other objects, the action was always coded as correct, resulting in a maximum score of three per item, and four per person (a maximum score of 20 and a minimum score of 4; examples of low, moderate and high accuracy scores are provided in Supplementary Figure S1). In case of doubt due to different angles of the pictures, a grid was superimposed on the floor of each image using Kinovea software (Charmant and Contrib., 2006-2011).
Self-initiated repairs. Self-initiated repairs were defined as instances where a participant explicitly attempted to repair or change their own output (often described as the repair initiation; Wilkinson, 2006; Schegloff et al., 1977). A self-initiated repair was always an explicit correction initiated by the interlocutor themselves, without any prompts from the conversation partner. Three different types of self-initiated repairs were coded, partially based on Perkins (1993) (see Table 1). For the word-finding repairs, repetitions of parts of words were expected, but if parts of a word were repeated without revisions, additions or explicit statements of difficulties finding a word, these were not coded as a repair. All self-initiated repairs were coded, regardless of the way in which the repair was resolved (i.e. by the interlocutor themselves, collaboratively with their conversation partner or by the conversation partner). Whether a repair was successful or not was not coded (i.e. whether the correction created a correct utterance or not, or whether the correct word was produced, or the search was abandoned). Nonverbal instances of self-initiated repairs were also included (e.g. direct gaze at the partner to provide help in a word search, Beeke, 2012). The total number of self-initiated repairs was counted for each trial and participant.
Clarification requests. Clarification requests were defined as instances when one interlocutor indicates to their conversation partner that they have not fully understood what has been said (also described as an "other-initiated" repair; Schegloff et al., 1977). Five types of clarification requests were coded, partly based on Schegloff et al. (1977) (see Table 1). Coding included verbal and non-verbal clarification requests such as clear eye gazes and frowns, or clear shrugs directed at the CP. The total number of clarification requests was counted for each trial and participant.
Coding of the latter two outcome measures is expected to be more subjective compared to the first two outcome measures due to the inherent nature of the coding process (Beeke et al., 2007). Self-initiated repairs were coded by a second rater (a native English-speaking speech and language therapy student), resulting in a moderate intraclass correlation coefficient (ICC = 0.74, CI = 0.51-0.87, p < 0.001; calculated in RStudio using the psych package version 1.9.12.31; Revelle, 2020).

Statistical Analysis
All outcome measures showed a non-normal distribution and contained outliers. The outcome measures also showed significant differences in variance between groups. Log-linear transformations did not eliminate the problems of normality or extreme values in the data. To avoid relying on assumptions of normality, a bootstrap procedure was used to obtain a distribution based on resampling of the existing data, from which the test statistic was derived (Wilcox, 2012). Outliers and differences in variance between groups were dealt with by choosing robust analyses based on the median (percentile bootstrap) and 20% trimmed means (bootstrap-t). An alpha threshold of 0.05 was used to determine statistical significance. All analyses were run in R Studio version 1.1.463 (RStudio, 2020). The results from the median analysis are reported in the paper. When there was a difference in outcome, results from both analyses are discussed. For all bootstrapping methods, 10,000 bootstrap samples were used (Rousselet et al., 2019).
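The resampling logic behind a percentile bootstrap can be illustrated with a minimal Python sketch. This is not the Wilcox (2012) R code used in the study: the function names, the example trial times and the fixed seed are hypothetical, and the sketch only shows how a confidence interval for a median or a 20% trimmed mean can be read off the sorted bootstrap estimates.

```python
import random
import statistics


def trimmed_mean(values, proportion=0.2):
    """Mean after removing floor(proportion * n) observations from each tail."""
    ordered = sorted(values)
    cut = int(proportion * len(ordered))
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


def percentile_bootstrap_ci(values, estimator, n_boot=10_000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for an estimator of location."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    n = len(values)
    # Resample the observed data with replacement and re-estimate each time.
    estimates = sorted(estimator(rng.choices(values, k=n)) for _ in range(n_boot))
    low = round(alpha / 2 * n_boot)
    return estimates[low], estimates[n_boot - low - 1]


# Hypothetical trial times in seconds, including one extreme value.
trial_times = [212, 251, 284, 307, 344, 363, 384, 404, 457, 980]
print(statistics.median(trial_times),
      percentile_bootstrap_ci(trial_times, statistics.median))
print(trimmed_mean(trial_times),
      percentile_bootstrap_ci(trial_times, trimmed_mean))
```

Both estimators discount the extreme value (980 s in the made-up data), which is why they are less sensitive to outliers than the ordinary mean.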
First, we ran an omnibus between-by-within-by-within 2 (group: PWA/NHC) x 2 (role: instructor/listener) x 2 (condition: familiar/unfamiliar) robust analyses on all outcome measures: of the median (bwwmcppb in Wilcox, 2012) and the 20% trimmed mean (bwwmcp in Wilcox, 2012). We then ran specific follow up comparisons to answer our research questions.
Research question 1: An effect of role (instructor or listener). Research question 2: An effect of CP familiarity for PWA. We analyzed each group separately (PWA or NHC). This helps us to identify patterns for each group of participants, and to address whether role and familiarity have an effect on goal-directed communication. Two factors were entered into the analysis. First, the condition of familiarity (familiar/unfamiliar), as this was our principal experimental manipulation. Second, the role of the participant (instructor/listener). Role was expected to affect the nature of communication in the goal-directed communication task for PWA.
Thus, within subjects 2 (role: instructor/listener) x 2 (condition: familiar/unfamiliar) robust analyses were conducted on all outcome measures: of the median (wwmcppb in Wilcox, 2012), and of the 20% trimmed mean (wwmcpbt in Wilcox, 2012). Planned comparisons were conducted for significant main effects: for a main effect of role, a dependent groups analysis on each level of condition (familiar/unfamiliar) was run on the median and 20% trimmed mean (bootdpci and ydbt, respectively, in Wilcox, 2012). For a main significant effect of condition, the same dependent groups analysis was conducted on each level of role (instructor/listener). The full results of these analyses are reported in the Supplementary Table S5. Results of the planned comparisons are reported in the Supplementary Tables S7-S10.
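As an illustration of a dependent-groups comparison, the following Python sketch applies a percentile bootstrap to the median of paired difference scores. It is written in the spirit of the planned comparisons described above but is not the `bootdpci` implementation from Wilcox (2012). The function name, the example data and the p-value convention (twice the smaller tail proportion of bootstrap estimates on either side of zero, with ties split evenly) are assumptions made for this sketch.

```python
import random
import statistics


def paired_percentile_bootstrap(first, second, n_boot=10_000, seed=1):
    """Bootstrap test of whether the median paired difference departs from zero."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Resample difference scores so that each pair stays intact.
    boot = [statistics.median(rng.choices(diffs, k=n)) for _ in range(n_boot)]
    above = sum(estimate > 0 for estimate in boot)
    ties = sum(estimate == 0 for estimate in boot)
    p_star = (above + 0.5 * ties) / n_boot
    p_value = 2 * min(p_star, 1 - p_star)
    return statistics.median(diffs), p_value


# Hypothetical instructor-role trial times (s) for six participants.
familiar = [380, 410, 365, 450, 395, 420]
unfamiliar = [300, 310, 290, 330, 305, 315]
print(paired_percentile_bootstrap(familiar, unfamiliar))
```

Resampling the difference scores rather than the two conditions separately preserves the within-subject pairing, which is what distinguishes a dependent-groups analysis from an independent-groups one.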
Research question 3: An effect of CP familiarity for PWA compared to NHC.
We first accounted for the effect of Role (see above) by splitting data into Instructor or Listener trials. We then completed between-by-within 2 (group: PWA/NHC) x 2 (condition: familiar/unfamiliar) robust analyses on all outcome measures: of the median (sppba, sppbb and sppbi in Wilcox, 2012) and the 20% trimmed mean (bwtrimbt in Wilcox, 2012). Planned comparisons on significant main effects of group (PWA vs NHC) were conducted with an independent groups analysis (pb2gen in Wilcox, 2012), to test the effect at each level of condition (familiar/unfamiliar). For a main significant effect of condition, a dependent groups analysis (bootdpci and ydbt, as described above, in Wilcox, 2012) was conducted on each level of group (PWA/NHC). The full results of these analyses are reported in the Supplementary Table S11. Results of the planned comparisons are reported in the Supplementary Tables S13-S16.
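For the between-group comparisons, the two groups are resampled independently. The Python sketch below shows this idea for a difference in medians. It is only an approximation of the logic of an independent-groups percentile bootstrap, not the `pb2gen` code from Wilcox (2012); the function name, the accuracy scores and the p-value convention are hypothetical.

```python
import random
import statistics


def two_group_percentile_bootstrap(group_a, group_b, n_boot=10_000,
                                   alpha=0.05, seed=1):
    """Percentile bootstrap for the difference in medians of two groups."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    observed = statistics.median(group_a) - statistics.median(group_b)
    # Each group is resampled on its own, because participants are not paired.
    diffs = sorted(
        statistics.median(rng.choices(group_a, k=len(group_a)))
        - statistics.median(rng.choices(group_b, k=len(group_b)))
        for _ in range(n_boot)
    )
    low = round(alpha / 2 * n_boot)
    ci = (diffs[low], diffs[n_boot - low - 1])
    above = sum(d > 0 for d in diffs)
    ties = sum(d == 0 for d in diffs)
    p_star = (above + 0.5 * ties) / n_boot
    return observed, ci, 2 * min(p_star, 1 - p_star)


# Hypothetical accuracy scores for two small groups.
nhc_scores = [18, 19, 18, 19, 20, 18]
pwa_scores = [14, 15, 16, 15, 14, 16]
print(two_group_percentile_bootstrap(nhc_scores, pwa_scores))
```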
To evaluate the influence of participants who scored below cut-off on the MoCA, all statistical analyses reported above were conducted a second time. In these analyses all the sessions (familiar and unfamiliar) in which one participant within a dyad had a MoCA score below the cut-off were excluded. This resulted in the exclusion of data from three dyads in the familiar and unfamiliar conditions, both for PWA and NHC. The results of the 2 × 2 analyses are shown in the Supplementary Tables S6-S12. Any differences in the outcomes of the 2 × 2 × 2 omnibus are mentioned in the results below.
To assess the individual patterns of behavior, a difference score between conditions was calculated for each role: for each participant, the value of each outcome measure for the familiar condition was subtracted from the value of the unfamiliar condition, so that negative scores indicate lower values in the unfamiliar condition. The difference scores were then plotted by group. This visual representation of individual difference scores by aphasia severity is not part of the formal statistical analysis, due to the small and unequal numbers of subjects within the different groups of aphasia severity.
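The difference score defined above (unfamiliar minus familiar, per participant and role) can be sketched in a few lines of Python. The record layout, participant labels and values are hypothetical and are used only to make the sign convention concrete.

```python
def difference_scores(records):
    """Return unfamiliar-minus-familiar scores keyed by (participant, role)."""
    by_key = {}
    for record in records:
        key = (record["participant"], record["role"])
        by_key.setdefault(key, {})[record["condition"]] = record["value"]
    # Only keep participant/role cells that have data for both conditions.
    return {
        key: cells["unfamiliar"] - cells["familiar"]
        for key, cells in by_key.items()
        if "familiar" in cells and "unfamiliar" in cells
    }


# Hypothetical total trial times (s) for one participant in both roles.
records = [
    {"participant": "P01", "role": "instructor", "condition": "familiar", "value": 380.0},
    {"participant": "P01", "role": "instructor", "condition": "unfamiliar", "value": 300.0},
    {"participant": "P01", "role": "listener", "condition": "familiar", "value": 290.0},
    {"participant": "P01", "role": "listener", "condition": "unfamiliar", "value": 310.0},
]
print(difference_scores(records))
```

With this convention a negative trial-time score means the dyad was faster with the unfamiliar partner, and a positive accuracy score means accuracy was higher with the unfamiliar partner.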

Trial Time
In the omnibus analysis, the analysis based on the median did not show any significant main effects or interactions. The 20% trimmed mean analysis resulted in a main effect of group (PWA vs NHC; estimated mean difference = 363.61 s, p = 0.026), with longer trial times for NHC compared to PWA. The main effect of condition was also significant (familiar vs unfamiliar; estimated mean difference = 248.78 s, p = 0.049), with longer trial times for the familiar condition. No other main effect or interaction was significant.
The omnibus analysis without the participants with low MoCA scores based on the median did show a main effect of group (PWA vs. NHC, p = 0.018). In the trimmed means analysis, the main effect of condition was no longer significant (p = 0.052).
Research question 1: An effect of role (instructor or listener). Research question 2: An effect of CP familiarity for PWA.

PWA
For PWA, total trial times were longer when they were in the instructor role as compared to the listener role. See Figure 3.
There was a main effect of condition (familiar vs. unfamiliar; estimated median difference = 167.34 s, p < 0.001), with longer trial times in the familiar condition (median = 363.92, CI = 307.84, 404.11) compared to the unfamiliar condition (median = 251.28, CI = 198.96, 277.92). Planned comparisons showed that the difference in trial time between familiar and unfamiliar conditions was significant when PWA took on the instructor role (p < 0.001) but not when they took on the listener role (p = 0.201). In the instructor role, PWA took less time to complete a trial in the unfamiliar condition compared to the familiar condition. In the listener role, trial times were more equal. See Figure 3.
The interaction of role*condition was not significant (estimated median difference = 38.02 s, p = 0.457).

NHC
There were no significant effects (role: estimated median difference = 173.4 s, p = 0.014; condition: estimated median difference = 75.75 s, p = 0.46; interaction: estimated median difference = 21.26 s, p = 0.76). For NHC, trial times were constant across both roles (instructor/listener) and conditions (familiar/unfamiliar). See Figure 3.
Research question 3: An effect of CP familiarity for PWA compared to NHC.

Instructor Trials
There was no significant main effect of group (PWA/NHC, estimated median difference = 78.37 s, p = 0.199), with PWA and NHC showing similar overall total trial times for Instructor trials.
There was a significant main effect of condition (familiar vs unfamiliar; estimated median difference = 68.09 s, p = 0.01; see footnote 3), with longer trial times in the familiar condition (median = 384.50, CI = 343.35, 491.88) compared to the unfamiliar condition (median = 284.29, CI = 259.57, 457.81). Planned comparisons within subjects showed that for PWA, total trial times were faster in the unfamiliar condition compared to the familiar condition (see Figure 3). Whilst the main effect of condition was significant, planned comparisons did not show a difference within subjects for the familiar vs unfamiliar conditions for NHC (p = 0.203).
The interaction of group*condition was not significant (estimated median difference = −53.52 s, p = 0.253).

Listener Trials
There was a main effect of group (PWA vs. NHC; estimated median difference = 144 s, p = 0.008). Planned comparisons between subjects showed a significant difference in the unfamiliar condition (p = 0.009), with trial times for PWA significantly faster than for NHC. The same comparison for the familiar condition was not significant (p = 0.158). See Figure 3.
The main effect of condition (estimated median difference = 60.4 s, p = 0.08) and the interaction of group*condition (estimated median difference = −53.52 s, p = 0.399) were not significant.
Supplementary Figure S2 shows the changes in total trial times for each group, condition and role by trial. This figure shows a relatively smooth transition in trial times between the final trial of the familiar condition and the first trial of the unfamiliar condition for both groups.

Summary of Results for Trial Time
Total trial times for NHC dyads were longer than for PWA dyads (this effect held when participants with low MoCA scores were removed). Total trial times were longer when PWA took on the instructor role, regardless of the familiarity of the CP. In addition, total trial times for PWA were shorter in the unfamiliar condition. For NHC, there was no significant difference in trial times between the familiar and unfamiliar conditions, or between the different roles.

Changes at the Level of Individual Dyads
To explore the results descriptively, we plotted the changes in total trial time for each dyad (Figure 4). Data for PWA has been grouped according to the severity of aphasia for the PWA participant. In general, the spread of data points for both groups (PWA or NHC) is greater for the Instructor role. There is a trend that, as aphasia severity decreases (moving left to right along the x axis), the distribution of difference scores increases, with more dyads showing faster total trial times in the unfamiliar condition (negative values). Note that this is confounded by there being more data points for moderate to mild PWA. However, it is tentative evidence that for PWA with less severe aphasia, total trial times were likely to be faster in the unfamiliar condition.

Task Accuracy
The omnibus analysis showed a main effect of group (PWA vs NHC; estimated median difference = 9.67; p < 0.001), with NHC scoring higher than PWA. There was also a main effect of condition (familiar vs unfamiliar; estimated median difference = −4.33; p = 0.008), with accuracy scores higher in the unfamiliar condition compared to the familiar condition. The main effect of role was not significant (instructor vs. listener; p = 0.707). No two-way interactions were significant. Finally, the three-way interaction was significant (group by role by condition; estimated median difference = −3; p = 0.033), indicating that accuracy scores were different, depending on the group (PWA vs. NHC), role (instructor vs. listener) and condition (familiar vs. unfamiliar). The patterns driving this three-way interaction are explored below.
The omnibus analysis without the participants with low MoCA scores showed the same effects and interactions.
Research question 1: An effect of role (instructor or listener). Research question 2: An effect of CP familiarity for PWA.

PWA
Planned comparisons showed that in the instructor role, PWA did not show a significant change in accuracy scores between familiar and unfamiliar conditions (p = 0.607). In the listener role, the difference in accuracy scores between conditions (familiar/unfamiliar) was significant in the trimmed mean analysis (p = 0.007, median analysis: p = 0.062). Accuracy was higher in the unfamiliar condition compared to the familiar condition. It therefore seems that the main effect of condition (familiar vs. unfamiliar) for PWA was driven by the improvement in accuracy scores in the listener role (see Figure 5).
There was no significant main effect of role (estimated median difference = 0.67, p = 0.538) and no significant interaction of role*condition (estimated median difference = 1.83, p = 0.167).

NHC
There was a significant main effect of condition (familiar vs. unfamiliar; estimated median difference = −0.67, p = 0.015), with NHC obtaining higher accuracy scores in the unfamiliar condition (median = 18.75, CI = 18.33, 19.0) compared to the familiar condition (median = 18.33, CI = 17.17, 18.67). Planned pairwise comparisons showed a significant effect of condition for NHC in the instructor role as measured by the 20% trimmed means analysis (p = 0.043, median analysis: p = 0.131), but not for the listener role (p = 0.182). NHC obtained higher accuracy scores in the unfamiliar condition compared to the familiar condition, an effect driven mainly by the improvement in scores in the instructor role.
There was no significant effect of role (estimated median difference = −0.67, p = 0.173) and no significant interaction of role*condition (estimated median difference = −0.33, p = 0.338).
Research question 3: An effect of CP familiarity for PWA compared to NHC.

Instructor Trials
The 2 (group: PWA/NHC) x 2 (condition: familiar/unfamiliar) analysis showed a significant main effect of group (PWA vs. NHC; estimated median difference = 2, p = 0.004), with higher accuracy scores for NHC (median = 18.33, CI = 17.5, 18.83) compared to PWA (median = 16.17, CI = 15.0, 16.75). Planned pairwise comparisons showed that the effect of group was significant in both conditions (familiar: p = 0.022; unfamiliar: p = 0.002). In the instructor role, NHC had significantly higher accuracy scores compared to PWA (see Figure 5).
The main effect of condition (familiar/unfamiliar) and the interaction of group*condition were not significant (condition: estimated median difference = −0.33, p = 0.407; interaction: estimated median difference < −0.01, p = 0.95).

Listener Trials
There was a significant main effect of group (PWA vs. NHC; estimated median difference = 2.83, p < 0.001). PWA obtained lower accuracy scores (median = 15.58, CI = 14.17, 17.0) compared to NHC (median = 18.42, CI = 18.33, 19.17). Planned pairwise comparisons showed that the effect of group was significant in both conditions (familiar: p < 0.001, unfamiliar: p < 0.001). In the listener role, NHC had significantly higher accuracy scores compared to PWA. See Figure 5.
The main effect of condition (familiar vs. unfamiliar) was significant in the 20% trimmed means analysis (Q = 14.09, Q crit = 4.36, p = 0.002; see footnote 4), with higher accuracy scores in the unfamiliar condition (median = 17.67, CI = 17.17, 18.67) compared to the familiar condition (median = 17, CI = 16, 18.17). Planned pairwise comparisons showed that the effect of condition was significant for PWA in the 20% trimmed means analysis (p = 0.007, median analysis: p = 0.062), but not for NHC (p = 0.182). In the listener role, PWA had significantly higher accuracy scores in the unfamiliar compared to the familiar condition. See Figure 5.
The interaction group*condition was not significant (estimated median difference = −1.67, p = 0.093).
Supplementary Figure S3 shows the changes in accuracy scores for each group, condition and role by trial. This figure shows a relatively smooth transition in accuracy scores between the final trial of the familiar condition and the first trial of the unfamiliar condition for both groups, such that there is no clear practice effect across trials. The distributions of accuracy scores differ: accuracy scores become less variable in the unfamiliar condition.

Summary of Results for Accuracy
Overall, NHC always scored higher on task accuracy compared to PWA. When analyzed as separate groups, accuracy scores were higher in the unfamiliar condition for both PWA and NHC. These main effects survived the removal of participants with low MoCA scores.

Changes at the Level of Individual Dyads
The changes in accuracy scores for each dyad are plotted in Figure 6. Data for PWA has been grouped according to the severity of aphasia for the PWA participant. In general, the spread of data points is greater for PWA than for NHC. Based on aphasia severity, there does not seem to be a clear pattern of change in accuracy scores between conditions: while the two participants with very severe aphasia have a higher accuracy score in the unfamiliar condition compared to the familiar condition, the opposite is true for the participant with severe aphasia. This is true in both the listener and the instructor role. The moderate and mild severity groups show a pattern that is more similar to the NHC group, with a tendency to show higher accuracy scores in the unfamiliar condition.

Self-Initiated Repairs
The omnibus analysis showed a significant effect of role (instructor vs listener, estimated median difference = 37.67, p < 0.001), with a higher number of self-initiated repairs in the instructor role. No other main effects or interactions were significant.
The omnibus analysis without the participants with low MoCA scores based on the median also showed a main effect of group (PWA vs. NHC, p = 0.021), with a greater number of repairs in the instructor role.
Research question 1: An effect of role (instructor or listener). Research question 2: An effect of CP familiarity for PWA.

PWA
The 2 (role: instructor/listener) x 2 (condition: familiar/unfamiliar) analysis showed a significant main effect of role (instructor vs. listener; estimated median difference = 17, p < 0.001). The number of self-initiated repairs was higher in the instructor role (median = 13, CI = 1.17, 18) compared to the listener role (median = 2.08, CI = 0.17, 4.42). Planned pairwise comparisons on the effect of role show that the significant difference in number of self-initiated repairs was present in both the familiar (p < 0.001) and unfamiliar condition (p < 0.001). For PWA, the number of self-initiated repairs was higher when they were in the instructor role compared to the listener role. See Figure 7.
There was no significant effect of condition (estimated median difference = 0.5, p = 0.201) or of the interaction role*condition (estimated median difference = −0.17, p = 0.806).

NHC
There was a significant main effect of role (instructor vs. listener; estimated median difference = 23, p < 0.001). The number of self-initiated repairs was higher in the instructor role (median = 15.25, CI = 13.17, 23.0) compared to the listener role (median = 5.75, CI = 2.5, 9.17). Planned pairwise comparisons on the effect of role show that for NHC the significant difference in number of self-initiated repairs was present in both the familiar (p = 0.007) and unfamiliar condition (p < 0.001). For NHC, the number of self-initiated repairs was higher when they were in the instructor role compared to the listener role. See Figure 7.
There were no significant effects of condition (estimated median difference = 0.33, p = 0.806) or interaction of role*condition (estimated median difference = −2.5, p = 0.173).
Research question 3: An effect of CP familiarity for PWA compared to NHC.

Instructor Trials
The 2 (group: PWA/NHC) x 2 (condition: familiar/unfamiliar) analysis showed no significant effects for group (estimated median difference = 2.25, p = 0.559), condition (estimated median difference = 0, p = 1) or the interaction group*condition (estimated median difference = −1, p = 0.539). In the instructor role, PWA and NHC self-initiated repairs a similar number of times. The rate of self-initiated repairs was the same in both conditions. See Figure 7.

Listener Trials
The 2 (group: PWA/NHC) x 2 (condition: familiar/unfamiliar) analysis showed a main effect of group (PWA vs. NHC; estimated median difference = 3.25, p = 0.039), with a larger number of self-initiated repairs by NHC (median = 5.75, CI = 2.5, 9.17) compared to PWA (median = 2.08, CI = 0.17, 4.42). Planned pairwise comparisons show that the number of self-initiated repairs did not differ significantly between groups in the familiar condition (p = 0.133) or the unfamiliar condition (p = 0.055; see footnote 5).

FIGURE 7 | Boxplots showing total number of self-initiated repairs by condition and group, for each role (instructor/listener).
5 In the unfamiliar condition, the difference in self-initiated repairs between groups was significant in the 20% trimmed means analysis (p = 0.031). The presence of a large number of outliers could have inflated the effect of the trimmed means analysis. We will therefore rely on the more conservative median analysis here.

As shown in Figure 7, averaged across conditions, NHC show a larger number of self-initiated repairs compared to PWA. This effect disappears when this difference is assessed at the level of each condition (familiar and unfamiliar).
The effect of condition and the interaction were not significant (condition: estimated median difference = 0.33, p = 0.511; interaction: estimated median difference = 0.17, p = 0.934).
Supplementary Figure S4 shows the changes in the number of self-initiated repairs for each group, condition and role by trial. This figure shows a relatively smooth transition in the number of self-initiated repairs between the final trial of the familiar condition and the first trial of the unfamiliar condition for PWA and NHC, such that there are no clear practice effects. More statistical analyses at the trial level would need to be conducted to confirm these observations.

Summary of Results for Number of Self-Initiated Repairs
The number of self-initiated repairs depended on the role participants fulfilled: in the instructor role, both PWA and NHC showed a higher number of self-initiated repairs compared to the listener trials. This main effect survived the removal of participants with low MoCA scores. Compared to NHC, PWA produced a similar number of repairs in the instructor role. As listeners, PWA produced fewer self-initiated repairs compared to NHC.

Changes at the Level of Individual Dyads
The changes in number of self-initiated repairs for each dyad are plotted in Figure 8. Data for PWA has been grouped according to the severity of aphasia for the PWA participant. In general, the spread of data points for both groups (PWA and NHC) is greater for the instructor role. In the instructor role, there is a trend that as aphasia severity decreases (moving left to right along the x axis), the distribution of difference scores becomes more like the NHC group, with more dyads showing a lower number of self-initiated repairs in the unfamiliar condition (negative values). Interestingly, PWA do not show the tendency to increase the number of self-initiated repairs to the extent that NHC do (positive values): PWA tend to show fewer self-initiated repairs in the unfamiliar condition compared to the familiar condition, while NHC show a slightly more equal distribution between decreases and increases in the number of self-initiated repairs. There is tentative evidence that for PWA with less severe aphasia, the number of self-initiated repairs was likely to be smaller in the unfamiliar condition.

Clarification Requests
In the omnibus analysis, there was a significant main effect of group (PWA vs. NHC, estimated median difference = 24.67, p = 0.002), with the NHC producing a higher number of clarification requests than PWA. There was a significant main effect of role (instructor vs. listener, estimated median difference = −46.67, p < 0.001), with a higher number of clarification requests produced in the listener role as compared to the instructor role. There was a significant main effect of condition in the trimmed means analysis (familiar vs unfamiliar, estimated mean difference = 10.6, p = 0.033), with a higher number of clarification requests with the familiar CP than the unfamiliar CP. There was a significant interaction between group and role (estimated median difference = −25.33, p = 0.001), with the NHC producing a greater number of clarification requests than PWA when in the listener role. This difference was absent for the instructor role, principally because so few clarification requests are made in the instructor role (see Figure 9 and Supplementary Figure S5). The omnibus analysis without the participants with low MoCA scores based on the median did not show a main effect of condition (familiar vs. unfamiliar, p = 0.344). All other effects were as reported above.
Research question 1: An effect of role (instructor or listener). Research question 2: An effect of CP familiarity for PWA.

PWA
The 2 (role: instructor/listener) x 2 (condition: familiar/unfamiliar) analysis showed a significant main effect of role (instructor vs listener; estimated median difference = −10, p < 0.001; see footnote 6). The number of clarification requests was higher when PWA took on the listener role (median = 6.17, CI = 3.0, 12.67) compared to the instructor role (median = 0.75, CI = 0.42, 1.33). Planned comparisons show that for PWA, the difference in number of clarification requests between instructor and listener role was significant in the familiar (p < 0.001) and the unfamiliar condition (p < 0.001). PWA showed a higher number of clarification requests in the listener role compared to the instructor role. See Figure 9.
The main effect of condition was significant (familiar vs. unfamiliar; estimated median difference = 3.33, p = 0.010; see footnote 7), with a higher number of clarification requests in the familiar condition (median = 4.42, CI = 2.0, 10.5) compared to the unfamiliar condition (median = 2.17, CI = 0.67, 3.25). Pairwise comparisons resulted in a significant difference between conditions for both the listener (p = 0.002) and instructor roles (p = 0.036). PWA showed a higher number of clarification requests in the familiar condition compared to the unfamiliar condition. See Figure 9.
The interaction of role*condition was also significant (estimated median difference = −2.17, p = 0.046; see footnote 8). In the instructor role, there is no difference in number of clarification requests between the familiar and unfamiliar conditions. In the listener role, PWA produced a smaller number of clarification requests in the unfamiliar condition compared to the familiar condition. See Figure 9.

NHC
For NHC there was a significant main effect of role (instructor vs. listener; estimated median difference = −34.5, p < 0.001), with more clarification requests produced in the listener role (median 18.17, CI 13.58, 28.17) compared to the instructor role (median 0.75, CI 0.5, 1.08). Planned pairwise comparisons for the effect of role show that the number of clarification requests between roles is significantly different in both the familiar (p < 0.001) and the unfamiliar condition (p < 0.001). NHC produced more clarification requests while in the listener role compared to when they were instructors. See Figure 9.
There were no significant effects of condition (estimated median difference = 5.33, p = 0.244) or interaction of role*condition (estimated median difference = −4.5, p = 0.388). For both roles, NHC produced similar numbers of clarification requests in the familiar and unfamiliar conditions. See Figure 9.
Research question 3: An effect of CP familiarity for PWA compared to NHC.

Instructor Trials
For the instructor trials the 2 (group: PWA/NHC) x 2 (condition: familiar/unfamiliar) showed no significant effects for group
FIGURE 9 | Boxplots showing total number of clarification requests by condition and group, for each role (instructor/listener).
Footnote 6: The 20% trimmed means analysis did not show a significant effect of role (role: Q = −11.7, p = 0.064). The variance in the instructor role is close to zero. This will have made the analysis based on the 20% trimmed mean less reliable. We will therefore rely on the outcome of the median analysis here.
Footnote 7: The 20% trimmed means analysis did not show a significant main effect of condition (Q = 4.33, p = 0.159). The same reasoning applies as discussed in footnote 5.
Footnote 8: The 20% trimmed means analysis did not show a significant interaction of role*condition (Q = −2.67, p = 0.112). The same reasoning applies as discussed in footnote 5.

Listener Trials
The 2 (group: PWA/NHC) x 2 (condition: familiar/unfamiliar) analysis showed a main effect of group (PWA vs. NHC; estimated median difference = 12.5, p = 0.001), with NHC producing a larger number of clarification requests (median 18.17, CI 13.58, 28.17) compared to PWA (median 6.17, CI 3.0, 12.67). Planned pairwise comparisons indicated that a significant difference between the two groups existed in both conditions (familiar: p = 0.032, see footnote 10; unfamiliar: p < 0.001). As listeners, NHC showed a higher number of clarification requests compared to PWA in both conditions. See Figure 9.
The effect of condition and the interaction were not significant (condition: estimated median difference = 3.33, p = 0.156; interaction: estimated median difference = 3.83, p = 0.454) (footnote 11).
Supplementary Figure S5 shows the changes in the number of clarification requests for each group, condition and role by trial. This figure reflects some potential differences in the number of clarification requests between the final trial of the familiar condition and the first trial of the unfamiliar condition for both groups. The reduction in clarification requests from familiar to unfamiliar conditions may be driven by practice effects, rather than the familiarity of the CP.
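The footnotes to this section prefer the median over the 20% trimmed mean when outliers are present. The short Python sketch below illustrates that reasoning on invented counts (not study data): when more than 20% of the values sit in one extreme tail, the trimmed mean is still pulled toward them, whereas the median is not. The function name `trimmed_mean` and the list `counts` are hypothetical, and this is not the robust analysis procedure used in the study.

```python
from statistics import mean, median


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping int(proportion * n) values from each tail."""
    ordered = sorted(values)
    k = int(proportion * len(ordered))  # number of values cut per tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical clarification-request counts with three high outliers
# (assumption for illustration only; these are not the study's data).
counts = [0, 0, 1, 1, 1, 2, 2, 9, 14, 20]

print("mean:", mean(counts))
print("20% trimmed mean:", round(trimmed_mean(counts), 2))
print("median:", median(counts))
```

With three of the ten values in the upper tail, 20% trimming removes only two of them, so the remaining outlier still raises the trimmed mean above the median.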

Summary of Results for Number of Clarification Requests
The number of clarification requests depended on the role the participants took on: both PWA and NHC asked their conversation partner for clarification more often as listeners than as instructors. Overall, PWA asked their conversation partner for clarification less often than NHC. These effects survived the removal of participants with low MoCA scores. As listeners, PWA asked for clarification less often when working with their unfamiliar conversation partner compared to a familiar conversation partner. In the listener role, NHC did not show a change in number of clarification requests between conditions.

Changes at the Level of Individual Dyads
The changes in number of clarification requests for each dyad are shown in Figure 10. Data for PWA have been grouped according to the severity of aphasia for the PWA participant. For the instructor role, the change in number of clarification requests was minimal for both groups, and the pattern seems roughly the same across all aphasia severities and groups. In the listener role, there is a trend that as aphasia severity decreases, the distribution of difference scores increases with more dyads showing lower numbers of clarification requests in the unfamiliar condition (negative values).
FIGURE 10 | Plot showing individual data points for PWA for difference score between familiar and unfamiliar conditions, by role, categorized by WAB categorization. Zero represents no change in the number of clarification requests between conditions; negative values indicate a smaller number of clarification requests in the unfamiliar condition compared to the familiar condition.
Footnote 9: The main effect of condition was significant based on the 20% trimmed mean analysis (Q = 4.74, Q crit = 4.38, p = 0.042). The variance for the groups will have been close to zero, which will have made the trimmed means analysis less reliable. We will therefore rely on the median analysis here.
Footnote 10: In the familiar condition, the trimmed mean analysis showed a non-significant difference between the two groups (p = 0.755). Again, the presence of multiple outliers will have inflated the trimmed mean for the PWA group, making the trimmed mean analysis less reliable.
Footnote 11: The main effect of condition was just significant based on the 20% trimmed mean analysis (Q = 4.29, Q crit = 4.27, p = 0.049). As for the instructor trials, the presence of a large number of outliers will probably have inflated the trimmed mean analysis more than the median analysis. To be on the safe side, we will again rely on the more conservative median analysis.
Overall, even the milder severities mostly show more variation in terms of reduction in clarification requests with the UFCP compared to the FCP. NHC show a slightly more equal distribution between decrease and increase in number of clarification requests. These effects are confounded by the uneven spread of data points across aphasia severities.
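The difference scores plotted in Figure 10 can be sketched as follows. This is a minimal Python illustration that assumes the score is the unfamiliar count minus the familiar count (so that negative values mean fewer clarification requests with the unfamiliar partner, as in the figure caption). The dyad labels, severity labels and counts are invented for illustration and are not the study's data; `difference_scores` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical per-dyad totals of clarification requests (listener role).
dyads = [
    {"dyad": "D1", "severity": "mild", "familiar": 12, "unfamiliar": 5},
    {"dyad": "D2", "severity": "mild", "familiar": 8, "unfamiliar": 5},
    {"dyad": "D3", "severity": "moderate", "familiar": 6, "unfamiliar": 6},
    {"dyad": "D4", "severity": "severe", "familiar": 3, "unfamiliar": 4},
]


def difference_scores(rows):
    """Group (unfamiliar - familiar) difference scores by aphasia severity."""
    by_severity = defaultdict(list)
    for row in rows:
        by_severity[row["severity"]].append(row["unfamiliar"] - row["familiar"])
    return dict(by_severity)


scores = difference_scores(dyads)
for severity, values in scores.items():
    # Zero means no change; negative means fewer requests when unfamiliar.
    print(severity, values)
```

Grouping the scores by severity in this way makes the uneven number of dyads per severity category visible, which is the confound noted above.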

DISCUSSION
This study examined the effect of conversation partner familiarity on goal-directed, face-to-face communication in aphasia, as part of the contextual component of a theoretical framework of real-world communication. We addressed three research questions.
Research question 1: Is there an effect of role (instructor or listener) during goal-directed communication on the collaborative communication task?
We hypothesized that the type of role (instructor/listener) would affect the outcome measure differently for each group. We predicted that role would have an impact on trial time and accuracy for PWA, but not for NHC. For both groups, we expected an effect of role on the number of self-initiated repairs and clarification requests, due to the nature of these communicative behaviors.
The omnibus analysis showed that overall, NHC showed longer total trial times compared to PWA. There was a significant effect of role for PWA: in the instructor role, PWA took longer to complete a trial compared to when they were in the listener role. For NHC, total trial time was stable across roles.
Overall, PWA obtained lower accuracy scores compared to NHC. For both PWA and NHC, accuracy scores did not significantly differ by role. Planned comparisons on the main effect of condition did show a different pattern of change between the familiar and unfamiliar conditions across the two roles for PWA, which will be discussed in the next section.
The number of self-initiated repairs showed the expected main effect of role: both groups initiated more self-repairs as instructors compared to when they were listeners. Overall, both groups showed equal numbers of self-initiated repairs in the instructor role, while PWA produced fewer repairs compared to NHC in the listener role.
The number of clarification requests also showed the expected main effect of role for both groups. These requests were more frequent in the listener role compared to the instructor role. As listeners, NHC produced more clarification requests compared to PWA.
Overall, these results show that the role participants take on during the task affected the process of goal-directed communication. This is true for PWA on all measures except accuracy. In line with our expectations, role only impacted communication for NHC on the measures of self-initiated repairs and clarification requests.
Research question 2: Do PWA benefit from the familiarity of their conversation partner (personal common ground) during goal-directed communication?
For each outcome measure, we tested the hypothesis that it would be easier for PWA to complete the collaborative task with a familiar CP than with an unfamiliar CP. "Easier" is characterized by needing less time to complete the task, achieving higher accuracy scores and requiring fewer self-initiated repairs and fewer requests for clarification to reach mutual understanding. The lack of counterbalancing in the design of the current study means that the unfamiliar condition was always presented after the familiar condition. We therefore have to assume that a practice effect is present in the unfamiliar condition. The conclusions we can draw in terms of causality are therefore limited, and we note that omnibus familiarity effects for Total Trial Time and Clarification Requests were no longer significant when participants with low MoCA scores were removed.
The differences between the familiar and unfamiliar condition went against our initial predictions (see Table 5). PWA showed shorter total trial times for the unfamiliar condition, higher accuracy for the unfamiliar condition (especially with PWA as listeners) and fewer clarification requests in the unfamiliar condition.
Table 5 notes: Red indicates the outcome is different from the original hypothesis. * Hypotheses were about the difference scores between the familiar and unfamiliar conditions. A larger difference score represents a bigger impact of the experimental manipulation. ** In these columns, red indicates a different directional effect in response to the experimental manipulation for PWA compared to NHC.
Despite the lack of a "familiarity advantage", it is of interest to note that none of the outcome measures show a change in the "negative" direction during communication with the UFCP (i.e. "worse" communication as evidenced by longer trial times, lower accuracy scores, higher number of self-initiated repairs and clarification requests) as a result of the familiarity manipulation. We expect this to be, at least in part, due to the lack of counterbalancing of conditions, as the unfamiliar condition always came second. If we assume that the familiar condition acted as a practice run, the results suggest that as a group, PWA dyads can show a practice effect (i.e. learning) on a communicative task. Furthermore, on a familiar, practised, concrete task, the communicative ability of PWA dyads is not negatively affected by the lack of personal common ground with their CP during goal-directed communication.
Research question 3: Do PWA differ from NHC in how they respond to conversation partner familiarity?
Finally, we tested whether PWA differ from NHC in how they respond to CP familiarity during goal-directed communication. We hypothesized that PWA and NHC would show an overall similar response to the familiarity manipulation on all outcome measures, but that the effect of the experimental manipulation would be greater for PWA compared to NHC, as evidenced by an interaction effect in the group*condition analysis. Results showed no significant interaction effects for any of the outcome measures. When each group was assessed separately for an effect of role and condition, a difference across the familiar and unfamiliar conditions did emerge (see Table 5). Due to the experimental design we, again, assume that both groups benefitted from a practice effect in the unfamiliar condition. However, the comparison between performance of both groups in the unfamiliar condition is possible because the practice effect is present for both NHC and PWA.
A comparison of the two groups by role shows that for most outcome measures (five out of eight), PWA and NHC show a different directional response to the change in CP familiarity. NHC showed a stable profile of communicative behavior across the two conditions, apart from an improvement in communicative performance (accuracy scores) as an instructor with a UFCP, which may have come from the practice effect of having the familiar CP condition first. NHC, therefore, generally did not show an effect of CP familiarity in their communicative behavior, nor a significant influence of practice.
In contrast to this, PWA showed a change in communicative behavior between the two conditions as an instructor (time and number of clarification requests) and as a listener (number of clarification requests). As listeners, communicative performance (accuracy) is also affected. In short, PWA show a more widespread change in communicative behavior and performance as a result of the familiarity manipulation compared to NHC. These differences are discussed below.

Instructors
We found that as instructors, PWA showed a different pattern of behavior when working with a FCP compared to a UFCP (shorter trial times, fewer clarification requests with the UFCP, and stable accuracy scores and self-initiated repairs). The stability of the number of self-initiated repairs is in line with previous studies that have suggested that certain aspects of communication might remain stable across different communicative settings and CPs (Lubinski et al., 1980; Gurland et al., 1982; Leaman and Edmonds, 2019). The higher number of clarification requests with the FCP is also in line with previous research with NHC (Boyle et al., 1994). As suggested by the authors, the unfamiliarity might have discouraged PWA from asking UFCPs for clarification more often. In addition, the experience PWA had gained on the task by the time they worked with the UFCP could have meant that fewer clarification requests were needed. The stability of the accuracy scores across familiar and unfamiliar CPs, and the reduction in trial time with the UFCP compared to the FCP, suggest that the ability to complete the task in less time with the unfamiliar CP was a result of increased experience and confidence on the task. With the UFCP, PWA were able to achieve the same result (i.e. stable accuracy scores), while putting in less "effort" (i.e. time and number of clarification requests). Put differently, PWA might have been more "efficient" at completing the task with the UFCP compared to the FCP, possibly due to greater experience on the task in the unfamiliar condition. In contrast to this, NHC were shown to put in the same amount of effort (i.e. time, repairs and clarification requests) with both CPs, which resulted in a better outcome with the UFCP (i.e. higher accuracy scores). While both groups had the same amount of practice on the task, a different pattern of behavior is observed in the unfamiliar condition.
There are a number of possible reasons for this difference in effort. Firstly, perhaps PWA felt more comfortable with their FCP compared to the UFCP, resulting in more time and effort spent with the FCP. In line with this, PWA might have felt more comfortable asking for clarification from the FCP compared to the UFCP. The results from our measure of comfort with the CP indicate that, at least at the group level, this explanation does not hold, as PWA reported the same level of comfort with both CPs. Another explanation for the reduced time and number of clarification requests is that familiarity with the task reduced the need for more time. The stability of the accuracy scores for PWA, while NHC still improved in the unfamiliar condition (showing a likely practice effect), is perhaps more surprising. It is possible that in the instructor role, PWA dyads reached a ceiling for accuracy and might not have been able to communicate more detail on the task to their CP, even with practice.
Finally, it is possible that as instructors, PWA and NHC differed (consciously or unconsciously) in the criterion they set for achieving mutual understanding. To communicate, interlocutors must continuously achieve mutual understanding together, i.e. they must understand what the other person is saying to continue the conversation (Clark and Wilkes-Gibbs, 1986; Clark and Brennan, 1991; Clark, 1996). Mutual understanding does not have to be perfect for conversation to work. Instead, interlocutors negotiate a criterion of mutual understanding "well enough for current purposes" (Clark, 1996, p. 221). NHC, unrestricted by any communication difficulties, might have set a higher criterion for mutual understanding on the current task (i.e. striving for a higher level of accuracy). This then resulted in similar amounts of effort made in an attempt to achieve higher accuracy scores, regardless of CP familiarity.
For PWA, this process might have unfolded differently. When confronted with the UFCP, PWA might have accepted the level of mutual understanding they had been able to achieve so far (with their FCP) as good enough for current purposes. This might have allowed PWA to strip away any communicative behaviors deemed unnecessary for current purposes (i.e. fewer clarification requests and less time). We can only speculate about the underlying reasons for such a shift. It could have been the desire to avoid unnecessary conversational difficulties (or: avoid "losing face") with the UFCP (as evidenced by fewer clarification requests initiated by the PWA in the unfamiliar condition) (Simmons-Mackie and Damico, 1995). It could also be that regardless of the CP, PWA tend to strive to minimize communicative (cognitive) effort in light of the good enough accuracy scores more generally.

Listeners
The changes in the number of self-initiated repairs and clarification requests were in line with previous research, as discussed for the instructor role. The increase in accuracy scores with the UFCP and the stable trial times across CPs go against our predictions and indicate the presence of a practice effect. The NHC group will be used as a reference in the discussion of current findings for PWA.
It seems that as listeners, PWA put in the same amount of "effort" in both conditions (as measured by total trial time), while achieving a better result with the UFCP (i.e. higher accuracy scores). NHC show the same pattern in trial time, but their accuracy scores remain stable. For NHC, this might reflect a ceiling effect rather than a strong behavioral pattern.
The most likely explanation in our view is that PWA benefitted from repeated practice on the task, resulting in better performance on the second half of the trials. Completing the same task with the same set of stimuli a number of times might have created a physical and communicative context (i.e. things that have been discussed within the same conversation become part of common ground) that could have helped restrict the number of possible interpretations for PWA (Skipper, 2014; Doedens and Meteyard, 2019).
Interestingly, while PWA showed shorter trial times with the UFCP when they were instructors, this effect disappeared when they were in the listener role. A potential explanation for this is that those who take on the instructor role are more in control of the way the trial unfolds over time. This would explain why the reduced trial time when PWA are listeners disappears: their CP might have taken the lead, resulting in similar patterns of "effort" as compared to the NHC group and no reduction in overall trial time. Further assessment of the CP role is needed to confirm this interpretation, however. An analysis as reported in this paper, conducted on data from the conversation partners of each PWA when they were in the instructor role, for example, could reveal whether they show a pattern of "effort" across conditions that is similar to NHC or not. Furthermore, insight into the number of turns taken, or the duration of turns for each CP (PWA and their familiar and unfamiliar CPs) could provide more detailed insight into the efforts made by both parties during the task, and how this changed (or not) as a result of the familiarity manipulation.

Aphasia Severity
The inspection of the difference scores on all outcome measures between the familiar and unfamiliar conditions allows us to draw tentative conclusions about the difference in behavioral patterns depending on aphasia severity. Visual inspection of the data shows the tendency for PWA with milder severity to show greater behavioral change as a result of CP familiarity. As might be expected, as aphasia severity decreases the behavioral pattern becomes more like that of the NHC group. Although more research is needed with a larger group of people with severe aphasia, an intuitive interpretation is that less flexibility in communicative behavior is seen for PWA with more severe aphasia, as they have less scope for flexible communication in the first place. More research is needed with a larger group of PWA, divided equally across severities, to draw stronger conclusions about this.
Finally, a limitation of the current study is that a number of the participants scored below the cut-off score on the MoCA, suggesting the potential presence of mild cognitive impairment, although such scores are perhaps typical for older dyads as sampled here. Previous studies have shown that the presence of mild cognitive impairment can influence performance on a referential communication task, due to impairments in cognitive functioning or impairments in Theory of Mind (Moreau et al., 2015; Moreau et al., 2016). The secondary analyses, excluding the data from the dyads these participants belonged to, showed that main effects of familiarity for Total Trial Time and Clarification Requests were no longer significant. Effects of familiarity were already confounded with practice, making it difficult to draw strong conclusions. However, future research should examine more closely the influence of potential cognitive and Theory of Mind deficits on performance on this task and on communication, especially in relation to the older participants, and the ability of conversation partners to provide optimal and flexible communicative support to their conversation partners with aphasia.

CONCLUSION
When communicating about a concrete, practised topic, PWA dyads do not show the often-assumed negative influence of a lack of shared personal common ground. Furthermore, the current results seem to suggest that PWA might be able to carry over the experience on a communicative task across conversation partners. More research is needed, however, to confirm this. It may be the case that in a more complex or abstract task, partner familiarity will have a greater impact on performance for PWA (Fussell and Krauss, 1989).
We found tentative evidence that PWA showed a different response to the presence of an unfamiliar conversation partner compared to NHC (where both groups had the same practice). Based on the current findings, it seems PWA aim to reduce communicative efforts in order to achieve good enough information transfer. This seems specifically the case when PWA are in the "instructor" role. In the listener role, it seems PWA might benefit from the repeated practice on the same task, i.e. building up of common ground within the task, as evidenced by their improved accuracy across conditions. In contrast to PWA, NHC show similar communicative behaviors across conversation partners. This group seems to strive for the most detailed information exchange, regardless of the familiarity of the CP. In the case of NHC, an improvement in performance suggests NHC might benefit from a building up of experience, or common ground, within the task, regardless of the familiarity of their conversation partner, especially considering that this task used highly concrete materials that the NHC should have found easy to describe. More research is needed to evaluate the effect of conversation partner familiarity on communicative behaviors and performance in PWA on, for example, an unfamiliar or more complex task. In such a case, the tendency of PWA to minimize communicative efforts with the unfamiliar conversation partner, without having had any practice, could potentially lead to lower performance scores.

CLINICAL IMPLICATIONS
The findings from the current study have clinical implications for treatment and assessment in aphasia rehabilitation. The current study partly supports the existing assumption that conversation partner familiarity affects communication for PWA. Importantly, the outcome on the current task was not negatively affected by the presence of an unfamiliar CP, as shown by equal or improved communicative performance on the task with the unfamiliar conversation partner. We assume these results are at least partly due to a practice effect. However, a positive effect of practice for PWA on a goal-directed communication task, in many ways similar to a setup like PACE (Davis, 2005) for intervention, is something to be celebrated. This research shows that PWA can show different communicative behaviors and communicative purposes, depending on the conversation partner they are communicating with (Simmons-Mackie and Damico, 1995). These findings also have implications for the way communicative behaviors that have been trained in one setting might generalize (or not) across conversation partners. The results also suggest that PWA with more severe aphasias might be less flexible in adapting to different communicative settings (and therefore might require training on a more generic set of communicative strategies that work across communication settings and partners). The lower MoCA scores for some CPs also suggest that the ability of the CP to flexibly support and enable the PWA to communicate effectively should be considered during intervention. Although the underlying reasons for the change in communicative behaviors between conversation partners remain unclear, this is important to keep in mind when profiling real-world communicative abilities for PWA.

DATA AVAILABILITY STATEMENT
The raw data presented in this article are not readily available because of the sensitivity of the video materials. Requests to access the anonymized datasets should be directed to the first author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the PCLS School Ethics Research Committee, School of Psychology and Clinical Language Sciences, University of Reading. The patients/participants provided their written informed consent to participate in this study.