Likeability and Expert Persuasion: Dislikeability Reduces the Perceived Persuasiveness of Expert Evidence

Younan, Mariam; Martire, Kristy A.

doi:10.3389/fpsyg.2021.785677

ORIGINAL RESEARCH article

Front. Psychol., 23 December 2021

Sec. Forensic and Legal Psychology

Volume 12 - 2021 | https://doi.org/10.3389/fpsyg.2021.785677

Likeability and Expert Persuasion: Dislikeability Reduces the Perceived Persuasiveness of Expert Evidence

School of Psychology, The University of New South Wales, Kensington, NSW, Australia

Article metrics

View details

Citations

16,5k

Views

2,4k

Downloads

Abstract

With the use of expert evidence increasing in civil and criminal trials, there is concern jurors' decisions are affected by factors that are irrelevant to the quality of the expert opinion. Past research suggests that the likeability of an expert significantly affects juror attributions of credibility and merit. However, we know little about the effects of expert likeability when detailed information about expertise is provided. Two studies examined the effect of an expert's likeability on the persuasiveness judgments and sentencing decisions of 456 jury-eligible respondents. Participants viewed and/or read an expert's testimony (lower vs. higher quality) before rating expert persuasiveness (via credibility, value, and weight), and making a sentencing decision in a Capitol murder case (death penalty vs. life in prison). Lower quality evidence was significantly less persuasive than higher quality evidence. Less likeable experts were also significantly less persuasive than either neutral or more likeable experts. This “penalty” for less likeable experts was observed irrespective of evidence quality. However, only perceptions of the foundational validity of the expert's discipline, the expert's trustworthiness and the clarity and conservativeness of the expert opinion significantly predicted sentencing decisions. Thus, the present study demonstrates that while likeability does influence persuasiveness, it does not necessarily affect sentencing outcomes.

Introduction

Expert evidence is ubiquitous in modern civil and criminal trials (Gross, 1991; Diamond, 2007; Jurs, 2016). Jurors involved in legal proceedings must assess the value of expert opinions to inform consequential decisions affecting lives and liberty. However, these assessments are sometimes mistaken, threatening the administration of justice, and contributing to unsafe trial outcomes (Innocence Project, 2021).

The Elaboration Likelihood Model (ELM) of persuasion is an information-processing model that has been used to understand jury decision-making about expert evidence (Petty and Cacioppo, 1986; McAuliff et al., 2003). This model suggests that jurors may struggle to accurately distinguish between low- and high-quality expert opinions because of the cognitive demands involved in the task (Petty and Cacioppo, 1986; Greene and Gordan, 2016). According to ELM, limited cognitive resources and insufficient knowledge increase reliance on readily accessible but potentially irrelevant, peripheral aspects of a message (Petty and Cacioppo, 1984, 1986; San José-Cabezudo et al., 2009; Salerno et al., 2017). This theory is supported by evidence suggesting that when information is unfamiliar, highly technical or complex—as is often the case for expert opinions—juror evaluations of credibility and persuasiveness may be swayed by superficial features of the expert and their evidence (Chaiken, 1980; Heuer and Penrod, 1994; Shuman et al., 1994; Cooper et al., 1996; Schuller et al., 2005; Ivković and Hans, 2006; Daftary-Kapur et al., 2010; Bornstein and Greene, 2011; Neal, 2014; Maeder et al., 2016). Expert likeability is one peripheral cue that may affect perceptions of persuasiveness.

“Likeability” refers to the extent to which an expert presents as friendly, respectful, well-mannered, and warm (McAdams and Powers, 1981; Kerns and Sun, 1994; Levin et al., 1994; Gladstone and Parker, 2002; Neal and Brodsky, 2008; Brodsky et al., 2009, 2010; Neal et al., 2012). The likeability of the expert is a prominent social cue that is readily accessible to jurors. It is considered important because likeability increases juror connection, attention and receptiveness (McGaffey, 1979; Schutz, 1997), thereby fostering perceptions of credibility and merit (Chaiken, 1980; Brodsky et al., 2009; Neal et al., 2012). The importance of likeability for expert credibility assessment is supported by evidence that the Witness Credibility Scale (Brodsky et al., 2010) accounts for ~70% of the observed variance in credibility using just four factors: likeability, confidence, knowledge, and trustworthiness. On its own likeability accounts for ~7% of the variance within this model.

Although likeability is clearly not the sole determinant of jurors' credibility assessments, experimental research further supports the significance of likeability in expert persuasion. For example, Brodsky et al. (2009) presented mock jurors one of two videos of the testimony of an expert who was a licenced clinical psychologist, with an established private practice, 14 years of experience conducting over 100 forensic risk evaluations, and a history of providing expert testimony in over 50 cases. The only difference between the two videos was the level of expert likeability, which was manipulated to be either “low” or “high” using verbal and non-verbal cues such as smiling, body language and deferential speech. The results showed that the likeable expert was rated as more credible and trustworthy than the less likeable expert. Thus, the more likeable expert was more persuasive than a less likeable expert of the same quality.

Adapting the materials used by Brodsky et al. (2009) and Neal et al. (2012) examine the effect of likeability and expert knowledge on perceptions of persuasiveness. In their study, mock jurors watched the testimony of a high or low likeability expert who was either a “high knowledge” experienced clinical psychologist, or a “low knowledge” inexperienced general psychologist. The results showed that the more knowledgeable expert was more credible to jurors than the low knowledge expert. They also found that likeability had a consistent effect, boosting the credibility of both high and low knowledge experts. Taken together, these findings show that likeability does influence perceptions of expert credibility. Yet there is no evidence that a more likeable expert provides evidence that is more scientifically sound, logically coherent, or empirically justified than a less likeable expert (Chaiken, 1980; Petty and Cacioppo, 1986; Greene and Gordan, 2016). Thus, a reliance on likeability may misdirect or misinform juror evaluations and contribute to unjust trial outcomes, especially when a highly likeable expert provides a low-quality opinion. However, it is important to consider the limitations of past research when assessing the potentially negative effects of expert likeability on juror assessments of credibility and persuasiveness.

To-date studies have typically conceptualised and manipulated expert evidence quality in simplistic ways, for example, using abridged trial vignettes, decontextualised expert extracts, and few or basic indicators of quality (e.g., years of experience or prestige of credentials; Petty et al., 1981; Swenson et al., 1984; Guy and Edens, 2003; McAuliff and Kovera, 2008; Brodsky et al., 2009; Neal et al., 2012; Parrott et al., 2015; Salerno et al., 2017). Given these somewhat impoverished materials, it is possible that the information that was available—including about likeability—may gain undue prominence in decision-making. Where peripheral cues are available, they may even “stand in” for useful but unavailable information (Petty and Cacioppo, 1986; Shuman et al., 1994; Sporer et al., 1995; Cooper et al., 1996; Ivković and Hans, 2006; Tenney et al., 2008). For example, there is evidence that likeability is used to make inferences about expert trustworthiness (Neal et al., 2012). Thus, it remains unclear how likeability may impact jurors' assessments of credibility and persuasiveness, when more realistic indicators of expertise are available to inform decision-making.

Another related limitation is the tendency to conflate expert evidence quality and likeability in experimental manipulations. For example, in previous studies, likeability manipulations also altered aspects of the evidence quality. Specifically, modest statements that acknowledge limited certainty and the potential for error used in studies such as Brodsky et al. (2009) and Neal et al. (2012) are generally considered to be higher quality than overstated conclusions that fail to acknowledge uncertainty (Koehler, 2012; Edmond et al., 2016). Thus, the influence of likeability on judgments of credibility might not have been entirely attributable to likeability, but rather, may partially be a response to differences in evidence quality. Consequently, it is unclear how influential peripheral cues such as likability are to credibility judgements when they are made in more realistic contexts where expert opinion quality is operationalised in more subtle and realistic ways.

Recent attempts to address this gap in the persuasion literature have used richer representations of expert opinion quality. Martire et al. (2020) operationalised expert opinion quality using the Expert Persuasion Expectancy (ExPEx) Framework. The ExPEx Framework specifies eight attributes that are logically relevant to the quality of an expert opinion: foundation, field, specialty, ability, opinion, support, consistency, and trustworthiness. Foundation refers to the empirical validity and reliability of the field in which the expert is opining (e.g., the discipline's error rate). Field relates to expert's training, study, and experience in an area generally relevant to their opinion (e.g., clinical psychology training). Specialty concerns whether the testifying expert has training, study or experience that is specifically relevant to the assertions they are making (e.g., risk assessment training). Ability relates to the expert's track record and their ability to form accurate and reliable opinions (e.g., personal proficiency). Opinion concerns the substantive opinion or judgment conveyed by the expert, its clarity, and the acknowledgement of limitations. Support concerns the presence and quality of evidence underpinning the opinion (e.g., the results of psychometric testing). Consistency relates to the level of agreement amongst different suitable experts. Trustworthiness refers to the experts' conscientiousness, objectivity, and honesty.

When information about all ExPEx attributes was available to jury-eligible respondents, participants were more persuaded by objectively high- compared to low-quality forensic gait expert evidence (Martire et al., 2020). Jurors were also particularly influenced by information about the experts' track record (ability), their impartiality (trustworthiness), and the acceptability of their conclusion to other experts (consistency). However, the nuanced operationalisation of expert evidence quality used in this research did not extend to the use of realistic trial materials. Participants were merely presented an eight-statement description of the expert and their opinion and were not given any information about peripheral cues such as likeability. Thus, the influence of likeability on the assessment of expert evidence quality, especially in information-rich decision scenarios, remains unknown. Our research addresses this gap.

Across two studies, jury-eligible participants rated the persuasiveness of an expert opinion and provided a sentencing decision in a Capitol murder case after viewing and/or reading ExPEx-enriched high- or low-quality expert testimony. The materials were adapted from Neal et al. (2012) and Parrott et al. (2015) and included versions of the testimony from a high- or low-likeability expert (Study 1) with a neutral likeability control (Study 2).

In line previous research using the ExPEx framework, we expect that jurors will regard higher quality expert evidence as more persuasive than lower quality evidence, and that sentencing decision will be affected by evidence quality. We also expect that persuasiveness ratings will predict sentencing decisions. In addition, if as previously observed, likeability does influence perceptions of expert credibility and persuasiveness, then we would expect to find that more likeable experts are more persuasive than less likeable experts irrespective of evidence quality. However, if the previous effects of likeability were a result of the simplistic or confounded experimental materials rather than the persuasiveness of likeability per se, then we would not expect an effect of likeability because jurors will instead rely on the numerous valid quality indicators available in the trial scenario. Main effects of both quality and likeability, and any interactions between quality and likeability, would suggest that both likeability and indicators of evidence quality affect the persuasiveness of an expert opinion and/or sentencing decisions.

Study 1

Method

Design

Study 1 used a two (expert evidence quality: low, high) × two (likeability: low, high) between-subjects factorial design. Expert evidence quality was operationalised using either eight “high-quality” or eight “low-quality” ExPEx attributes. Low-vs. high-likeability was operationalised using the trial materials and verbal components from Neal et al. (2012). The primary dependent variables were persuasiveness rating and sentencing decision. Persuasiveness was measured by averaging ratings of expert credibility, evidence value and evidence weight (all rated from 0 to 100). Sentencing decision was a binary choice between life in prison or death sentence per Neal et al. (2012). This study was pre-registered (AsPredicted#: 65017) and materials, data and analyses are available at [link for blind review to be updated if accepted: https://osf.io/yfgke/].

Participants

Participants were recruited from Amazon Mechanical Turk (MTurk). All participants resided in the United States and were aged 18 years or older. To maximise data quality, participation was limited to those who had not been involved in our similar studies and who had a 99% MTurk approval rating. Participants also completed attention checks and a reCAPTCHA to exclude non-human respondents (Von Ahn et al., 2008). Two-hundred and forty participants were recruited and were compensated US$2.00 for their time. Participants who either failed the age check, were ineligible to serve on a jury or failed the attention checks (n =22), were excluded from the final sample per our pre-registered exclusion criteria. The final sample consisted of 218 jury-eligible participants randomly allocated to condition as follows: high-quality, high-likeability: n = 55: high-quality, low-likability: n = 57; low-quality, high-likeability: n = 50; low-quality, low-likeability: n = 56.

Materials and Measures

Trial Materials

The trial materials used in this study were adapted from Neal et al. (2012) with the permission of the author. Departures from the original materials and procedures are specified below.

Pre-trial Instructions

Participants read written jury instructions indicating that the defendant had been found guilty of first-degree murder and that they were to return either a sentence of life in prison, or death, based on whether it could be shown “beyond a reasonable doubt that there is a probability that the defendant would commit criminal acts of violence that would constitute a continuing danger in society” (Neal et al., 2012). This jury instruction was adapted from the Texas Criminal Procedure Code, Article 370.071b-f (1985) by Krauss and Sales (2001).

Expert Evidence

The expert evidence transcript used by Neal et al. (2012) was based on an actual jury sentencing proceeding and portrayed the examination-in-chief and cross-examination of a forensic psychologist testifying about the likelihood that a convicted murderer would commit future violence (Krauss and Sales, 2001). The expert provided inculpatory evidence, ultimately stating that there is a “high probability that he will commit future acts of dangerousness” (Neal et al., 2012).

Participants were presented the original examination-in-chief and cross-examination of the expert used by Neal et al. (2012) without modifications. This transcript contained information about the experts' educational credentials, experience, method for conducting violence risk assessment, and their opinion about the defendants' future risk of violence. This information related to the field, specialty, and support attributes in the ExPEx framework, which together formed the manipulation of expert “knowledge” (see Evidence Quality Manipulation for further detail). Ability, foundation and opinion were also addressed though in a limited way. Specifically, in all conditions, the expert had ultimately concluded that the defendant posed a “continuing danger to society” and that despite research showing clinical psychologists can be inaccurate, as far they knew, they had “never been wrong” in their evaluations.

To ensure that there was information available about all ExPEx attributes, a three-page ‘re-examination’ was added to enrich the trial transcript. In this supplementary material participants were told that the prosecution and defence have recalled the expert for further testimony, were reminded of the jury instructions before reading the three new pages of written testimony. During the re-direct and cross-examination, the expert provided additional detail about their educational credentials, experience, methodology, and clarified their opinion. They also provided new information about the scientific basis for risk assessment (foundation), their own proficiency conducting risk assessments (ability), whether other experts agreed with their conclusions (consistency), and their track record working for the defence and prosecution (trustworthiness).

Evidence Quality Manipulation. All eight ExPEx attributes were manipulated in the transcript to produce either a high- or low-quality opinion (see OSF for evidence quality manipulations).

In the high-quality condition, participants read the materials developed by Neal et al. (2012) presenting the testimony of a clinical psychologist, educated at Yale, with a PhD, who was a Board-certified Forensic Psychologist with several academic publications in forensic risk assessment (field and specialty). The expert had 14 years of specialist training and experience in dangerousness and violence risk assessment and had used multiple clinical interviews totaling 15 h with the defendant to assess risk utilising the Violence Risk Assessment Guide (specialty and support). In the enriched re-examination, participants were also given information that the expert was highly proficient in conducting violence risk assessments, with an average performance of 90–94% accuracy (ability), that clinical psychology is a discipline that equips professionals to make accurate risk judgments, and that the V-RAG is an empirically supported and validated assessment method (foundation). The high-quality condition also read that the clinical psychologist managed the potential for bias in their opinion, had testified equally for the prosecution and defence, and did not know the defendant previously (trustworthiness). They also acknowledged the limits of their conclusion by suggesting that even experts do not always have perfect judgment and that risk assessments are not 100% accurate (opinion). The clinical psychologists' opinion was based on collateral information, interview and addressed the relevant risk factors (support). The opinion was also verified by independent experts in the same specialist field (consistency).

By contrast, adopting Neal et al. (2012) manipulations, participants in the low-quality condition read the testimony of a non-specialist psychologist, with 2 years of experience as a psychotherapist in private practice, who did not provide their educational credentials (field), had no specialisation or experience in violence risk assessment (specialty) and had only completed a 30-min interview with the defendant before using the V-RAG (support). In the enriched re-examination, participants also learned that the psychotherapist had not had their risk assessment performance tested but nevertheless reported they were highly proficient (ability), relied on unvalidated clinical judgment and modified the VRAG to assess risk (foundation and support). Those in the low-quality condition also learned that the psychotherapist had worked mostly for the prosecution, had known the defendant previously, and had only considered information they considered relevant in their personal opinion (trustworthiness). They communicated no uncertainty or limitations around their conclusions when re-clarifying their expert judgment (opinion). The psychotherapist's opinion was based only on information from the 30-min interview, did not refer to empirical literature (support), and was verified by a law enforcement official who was also working on the same case rather than an expert in risk assessment (consistency).

Likeability Manipulation. Likeability was manipulated in the original materials using verbal cues (Neal et al., 2012; see OSF for likeability manipulations). The same high and low likeability manipulations were applied throughout the enriched re-examination transcript to ensure consistency throughout the scenario. These likeability manipulations have been shown to be successful at differentiating an expert high in likeability from an expert low in likeability (Brodsky et al., 2009; Neal et al., 2012).

In the high-likeability conditions, participants read a version of the expert who used terms such as “we” or “us” when referring to themselves or others, used informal speech (e.g., referring to an individual by name), was genuine, humble and deferential (e.g., commended the work of others), showed considerate and respectful disagreement, agreeableness to requests and questions (e.g., stating “of course” when asked to repeat something), and had a pleasant and friendly interpersonal style.

In the low-likeability conditions, participants read a version of the expert who used individualistic pronouns (i.e., I, me), was disingenuous, arrogant, and non-deferential (e.g., displayed superiority relative to others, was self-complimenting), showed aggressive contradiction and disagreement, disagreeableness in response to requests, and questions (e.g., pointing out repetitiveness and labelling questions as redundant), and had an unfriendly and condescending interpersonal style.

Primary Dependant Measures

Persuasiveness. The persuasiveness measure comprised three questions. Participants rated the credibility of the expert (“how credible is Dr. Morgan Hoffman?”) from 0 “not at all” to 100 “definitely credible,” the value of the expert's evidence (“how valuable was Dr. Morgan Hoffman's testimony?”) from 0 “not at all” to 100 “definitely valuable,” and the weight of the expert's evidence (“how much weight do you give to Dr. Morgan Hoffman's testimony?”) on a scale from 0 “none at all” to 100 “the most possible.” Question order was randomised. These items have been previously found to be highly correlated (all r's > 0.847) and have high internal consistency (Cronbach's α = 0.954; Martire et al., 2020).

Sentencing Decision. Participants were asked “Considering all the evidence provided to you, what is the sentence you would recommend for the defendant?” and were required to answer either “I would recommend that the defendant receive a death sentence” or “I would recommend that the defendant receive a sentence of life in prison.”

Secondary Measures & Manipulation Checks

ExPEx Attribute Ratings.

Eight items were used to assesses decision-makers' perceptions of whether or not the expert opinion had a high-quality foundation, field, specialty, ability, opinion, support, consistency and trustworthiness from 0 “not at all” to 100 “definitely.” Question order was randomised. See Table 1 for verbatim wording and format.

Table 1

ExPEx Attribute	Question
Foundation	Does training, study, and experience in clinical psychology support assertions that a defendant will commit a violent offence and pose a danger to society?
Field	Does Dr. Morgan Hoffman have training, study, and/or experience in clinical psychology?
Specialty	Does Dr. Morgan Hoffman have training, study and/or experience specific to making assertions that defendant will commit a violent offence and pose a danger to society?
Ability	Does Dr. Morgan Hoffman make assertions that the defendant will commit a violent offence and pose a danger to societyaccurately and reliably?
Opinion	Did Dr. Morgan Hoffman convey their assertion that the defendant will commit a violent offence and pose a danger to societyclearly, and with necessary qualifications/limitations?
Support	Did Dr. Morgan Hoffman rely on evidence when forming their assertion that the defendant will commit a violent offence and pose a danger to society?
Consistency	Is Dr. Morgan Hoffman's assertion that defendant will commit a violent offence and pose a danger to society? consistent with what other experts in clinical psychology would assert?
Trustworthiness	Do you believe that Dr. Morgan Hoffman is fair, impartial, and objective?

Expert persuasion expectancy (ExPEx) quality items.

The bolded writing reflects the definitional component of the ExPEx attributes and the italicised writing reflects the key statement (i.e., the expert's conclusion) from which the ExPEx attribute is being rated in accordance with.

Witness Credibility Scale.

The Witness Credibility Scale (WCS) is a 20-item measure assessing expert credibility (Brodsky et al., 2010). Each item contains bipolar adjectives on a 10-point Likert scale [e.g., not confident (1) to confident (10)]. The presentation of the items was randomised. The highest possible score for overall credibility is 200, with higher scores indicating higher credibility ratings. The WCS also yields a sub-scale score for four credibility domains: knowledge, trustworthiness, confidence, and likeability. The highest possible score for each domain is 50, with a higher score indicating higher rating in a domain. The WCS has good validity and reliability—it can successfully differentiate between expert displaying varying levels of the four sub-domains (Brodsky et al., 2010). The WCS was included to provide an embedded measure of expert likeability. Participants were asked to rate the expert on the following bipolar adjectives: unfriendly (1) to friendly (10); unkind (1) to kind (10); disrespectful (1) to respectful (10); ill-mannered (1) to well-mannered (10) and unpleasant (1) to pleasant (10). Collectively, these items produced the WCS-Likeability subscale.

Agreement.

Participants were asked “If Dr. Hoffman reported that the defendant will commit a violent offence and pose a danger to society, would you agree with that opinion?” and were required to answer either “yes” or “no.” Analyses involving agreement are available on OSF.

Likeability.

Participants were asked to “rate how likeable Dr. Morgan Hoffman is to you, with zero being ‘not at all likeable’ and ten being ‘extremely likeable’.”

Expert Testimony Comprehension.

Comprehension of the expert evidence was measured using 6 multiple-choice items to assess engagement with the testimony and understanding of its substantive content. Higher comprehension scores (out of 6) indicated greater recall and comprehension of the expert testimony. Analyses involving comprehension are available on OSF.

Demographic Information.

Participants were asked to provide information about their age, gender, education level, cultural background, English proficiency, religiosity, political orientation, views of the death penalty, experience and familiarity with the expert's discipline and jury eligibility and experience.

Procedure

This study was approved by the UNSW Human Advisory Ethics Panel C—Behavioural Sciences (Approval #3308) and pre-registered. The study was advertised on MTurk and completed by participants online in Qualtrics. Before commencing the study, participants were asked to provide informed consent, complete age eligibility and reCAPTCHA, before random allocation to condition. Participants read the instructions and the version of the expert testimony transcript as determined by allocation to condition. Next, participants completed the ExPEx, WCS and likeability measure, in randomised order; completed the persuasiveness measure, and made their sentencing decision. Finally, participants completed the comprehension and demographic items. At the conclusion of the study, participants were given a completion code, were debriefed, and thanked. The average study completion time was 23.5 min.

Results

Participant Demographics

Participants were aged between 18 and 71 years (M = 39.36, SD =12.34) and 48.6% were male. Most participants reported that their highest level of completed education was college/university (54.1%) or a Masters degree (27.5%). Most identified as White/Caucasian (71.6%), followed by Asian (10.6%), African American (8.3%), and Hispanic (5%). Almost all participants (95.9%) were native English speakers.

About half of participants (53.2%) considered themselves more than “moderately” religious (on a 10-point scale from “not at all” to “very” religious). The largest proportion of participants (45.5%) rated themselves as conservative (on a 10-point scale from “very liberal” to “very conservative;” 42.2% were liberal; and 12.4% were neutral). The largest proportion of participants (47.7%) were in favour of the death penalty (on a 10-point scale from “strongly opposed” to “strongly in favor;” 45% were not in favour; 7.3% were neutral). A majority (61.5%) were unfamiliar with dangerousness and violence risk assessments (“none” to “some” familiarity), but approximately half (52.8%) reported being familiar with psychology/clinical psychology (from “some” to “extensive” familiarity). More than half of the sample (56.9%) had been called for jury duty; 46.8% of these participants had served on a jury, and 7.3% (n = 9) had served on a murder trial.

Manipulation Checks

All assumptions were tested before conducting the planned analyses. The analytic approach reported here either satisfies the relevant assumptions or is robust to violations.

Evidence Quality

A two-way (Pillai's Trace) MANOVA was conducted comparing the ratings of each of the eight ExPEx expert attributes between the low- and high-quality expert evidence conditions. There was a significant main effect of expert evidence quality overall [F_(8,207) = 11.9, p < 0.001, ηp² = 0.315] and for each attribute [all ≥ 25.83, all p's < 0.001, all ηp² ≥ 0.108] such that, on average, participants in the high-quality condition rated each ExPEx attribute as higher quality than those in the low-quality condition (see Table 2).

Table 2

	ExPEx Attribute Rating
ExPEx Attribute	High-Quality Mean (SE)	95% CI	Low-Quality Mean (SE)	95% CI	F	p	η²
Foundation	76.62 (2.18)	72.32, 80.92	60.70 (2.25)	56.28, 65.13	25.83	<0.001	0.108
Field	92.64 (1.82)	89.04, 96.23	72.86 (1.88)	69.17, 76.56	57.10	<0.001	0.211
Specialty	88.16 (1.99)	84.24, 92.09	63.08 (2.05)	59.04, 67.12	77.04	<0.001	0.265
Ability	78.93 (2.34)	74.32, 83.53	57.24 (2.41)	52.50, 61.98	41.82	<0.001	0.163
Opinion	82.87 (2.39)	78.16, 87.59	64.33 (2.46)	59.47, 69.18	29.18	<0.001	0.120
Support	76.48 (2.34)	71.87, 81.10	58.22 (2.41)	53.47, 62.97	29.56	<0.001	0.121
Consistent	78.74 (2.08)	74.64, 82.84	57.97 (2.14)	53.74, 62.19	48.40	<0.001	0.184
Trustworthy	73.07 (2.61)	67.93, 78.21	51.05 (2.69)	45.76, 56.34	34.59	<0.001	0.139

Table of marginal means and inferential statistics for expert persuasion expectancy (ExPEx) attributes by evidence quality condition.

Likeability

Independent samples Welch t-tests showed a significant difference between high- and low-likeability experts on the WCS-likeability sub-scale score [t_(163.79) = −10.99, 95% CI (−20.78, −14.45), p < 0.001]. On average participants in the high-likeability condition rated the expert 39.3 out of 50 (SD = 7.3) compared to 21.7 out of 50 (SD = 15.3) in the low-likeability condition.

The single item rating subjective likeability was strongly and positively correlated with the WCS-Likeability subscale score (r = 0.921, p < 0.001). Accordingly, we report all subsequent analyses using the validated WCS-Likeability scores rather than the single likeability item.

Persuasiveness Ratings

Consistent with Martire et al. (2020), ratings of expert credibility, value and weight were all strongly and positively correlated (r_{credibility/weight} = 0.905; r_{credibility/value} = 0.894; r_value/weight = 0.913, all p's < 0.001), and had high internal consistency (Cronbach's α = 0.966), so were combined into a single measure of persuasiveness.

Effect of Expert Evidence Quality and Likeability on Persuasiveness

Average persuasiveness ratings by condition are shown in Figure 1. The mean persuasiveness ratings by condition were: high-quality, high-likeability M = 84.5 (SD = 11): high-quality, low-likability M = 72.8 (SD = 24.2); low-quality, high-likeability M = 62.7 (SD= 24.4); low-quality, low-likeability M = 47.9 (SD = 32). A two-way ANOVA showed a significant main effect of evidence quality [F_{(1, 214)} = 50.76, p < 0.001, ηp² = 0.192] and a significant main effect of likeability [F_{(1, 214)} = 16.39, p < 0.001, ηp² = 0.071]. The high-quality expert was significantly more persuasive than the low-quality expert. The high-likeability expert was also more persuasive than the low-likeability expert. There was no significant interaction between evidence quality and likeability indicating that the effect of likeability was consistent for high- and low-quality evidence [F_{(1, 214)} = 0.234, p = 0.629, ηp² = 0.001].

Figure 1

Multiple regressions were conducted to examine whether continuous subjective ratings of the eight ExPEx attributes and WCS-likeability predicted persuasiveness ratings. The overall model was significant [F_{(9, 208)} = 111.06, p < 0.001] and accounted for 82% of the variance in persuasiveness ratings (adjusted R² = 0.82). Ratings of trustworthiness, consistency, support, ability, and specialty were all significant independent predictors (all p's ≤ 0.022), while foundation, field, opinion, and likeability were not (all p's ≥ 0.054; see Table 3). For example, holding all else constant, a one unit increase in perceptions of the trustworthiness of the expert was associated with a 0.347 unit increase in persuasiveness ratings.

Table 3

		95% CI forB
	B	Lower	Upper	SE B	β	p	R²	Adj. R²
Model						<0.001	0.828	0.82
Foundation	−0.048	−0.143	0.047	0.048	−0.042	0.324
Field	0.000	−0.097	0.098	0.05	0.000	0.993
Specialty	0.18^***	0.071	0.288	0.055	0.161^***	0.001
Ability	0.237^***	0.141	0.334	0.049	0.232^***	<0.001
Opinion	0.021	−0.063	0.105	0.042	0.021	0.619
Support	0.148^***	0.069	0.226	0.04	0.142^***	<0.001
Consistent	0.119^*	0.017	0.221	0.052	0.107^*	0.022
Q0Trustworthy	0.347^***	0.254	0.44	0.047	0.395^***	<0.001
WCS-Likeability	0.15	−0.002	0.303	0.077	0.081	0.054

Multiple regression predicting persuasiveness from continuous expert persuasiveness expectancy (ExPEx) ratings and witness credibility score (WCS) for likeability.

B, unstandardised regression coefficient; CI, confidence interval; SE B, standard error of the coefficient; β, standardised coefficient.

p < 0.05,

^**p < 0.01,

***

p ≤ 0.001.

Relationship Between Persuasiveness and Sentencing Decision

A binominal logistic regression was used to examine the relationship between persuasiveness and sentencing decision. The overall model was a good fit and significant [ = 56.14, p < 0.001]. Persuasiveness accounted for 31.7% of the variance in sentencing decision [Nagelkerke R² = 0.317; Wald = 29.79, p < 0.001], with a one unit increase in persuasiveness increasing the odds of the decision-maker choosing a death sentence by 1.063 (Exp B).

Effect of Expert Evidence Quality and Likeability on Sentencing Decision

The proportion of participants giving death sentences by condition is shown in Table 4. A binominal logistic regression was used to predict sentencing decision from expert quality condition, likeability condition, and their interaction. The overall model was a good fit but not significant [ = 6.84, p = 0.077] and accounted for only 4.3% of the variance in sentencing decision (Nagelkerke R² = 0.043). Neither expert evidence quality, likeability, nor their interaction were significant independent predictors of sentencing decision (all p's ≥ 0.158; see Table 5).

Table 4

Likeability	Low-Quality Evidence %	High-Quality Evidence %
Low	26.8	43.9
High	22	34.5

Proportion of participants selecting death sentence by evidence quality and likeability condition.

Table 5

	B	SE	Wald	df	p	Odds Ratio	95% CI for Odds Ratio
							Lower	Upper
Evidence quality	−0.63	0.44	1.99	1	0.158	0.53	0.22	1.28
Likeability	0.39	0.39	1.01	1	0.314	1.48	0.69	3.18
Evidence quality^* likeability	−0.13	0.6	0.05	1	0.826	0.88	0.27	2.84

Logistic regression predicting sentencing decision from evidence quality condition, likeability condition, and their interaction.

Life in prison was coded as zero and death was coded as one. B, unstandardised regression coefficient; SE B, standard error of the coefficient; CI, confidence interval.

p < 0.05,

^**p < 0.01, ^***p ≤ 0.001.

Another binominal logistic regression conducted to examine whether subjective continuous ExPEx ratings and WCS-likeability scores predicted sentencing decision. The overall model was a good fit and was significant [ = 53.35, p < 0.001], accounting for 30.4% of the variance in sentencing decision (Nagelkerke R² = 0.304). Ratings of foundation (p = 0.022) and trustworthiness (p = 0.004) uniquely predicted sentencing decision, while the remaining ExPEx attributes and likeability scores did not (all p's ≥ 0.104; see Table 6).

Table 6

	B	SE	Wald	df	p	Odds Ratio	95% CI for Odds Ratio
							Lower	Upper
Foundation^*	0.028^*	0.012	5.25	1	0.022	1.03	1.00	1.05
Field	−0.009	0.012	0.6	1	0.439	0.99	0.97	1.01
Specialty	−0.004	0.014	0.095	1	0.758	0.996	0.97	1.02
Ability	0.002	0.013	0.03	1	0.869	1.00	0.98	1.03
Opinion	0.016	0.012	1.83	1	0.177	1.02	0.99	1.04
Support	<0.001	0.008	0.001	1	0.974	1	0.98	1.02
Consistent	−0.006	0.012	0.293	1	0.588	0.99	0.97	1.02
Trustworthy	0.034^**	0.012	8.46	1	0.004	1.03	1.01	1.06
WCS-Likeability	−0.027	0.017	2.64	1	0.104	0.97	0.94	1.01

Logistic regression predicting sentencing decision from continuous expert persuasion expectancy (ExPEx) ratings and witness credibility score (WCS) for likeability.

Life in prison was coded as zero and death was coded as one. B, unstandardised regression coefficient; SE B, standard error of the coefficient; CI, confidence interval.

p < 0.05,

p < 0.01,

^***p ≤ 0.001.

Discussion

Study 1 examined whether expert quality and likeability affected jury-eligible participants' perceptions of expert persuasiveness and sentencing decisions. We found that participants' perceptions of persuasiveness were significantly affected by evidence quality and expert likeability whereby higher quality and higher likeability experts were more persuasive than lower quality and lower likeability experts. However, there was no interaction between evidence quality and likeability. We also found that subjective perceptions of the eight ExPEx attributes and likeability together accounted for ~80% of the variance in persuasiveness scores, which demonstrates these attributes have strong predictive power.

These results suggest that evidence quality impacts understanding, but persuasiveness is determined by both the underlying quality of the evidence, and superficial aspects of the experts' interpersonal style. Our results also suggest that previously observed effects of likeability on perceptions of credibility or persuasiveness were not merely an artefact of simplified evidence quality materials and manipulations. The expert evidence presented in this study was detailed and included extensive information about the quality of the opinion, yet the effect of likeability persisted and appeared to provide a boost to the persuasiveness of both lower and higher quality evidence. Thus, concerns about juror reliance on peripheral information in their decision-making remain.

However, it is important to note that neither likeability nor quality affected sentencing decisions in the same way that they affected persuasiveness. There were no significant associations between evidence quality or likeability conditions on sentencing outcomes. Continuous subjective likeability ratings also did not predict sentencing outcome, but perceptions of expert trustworthiness and foundation did. Thus, although likeability affected perceptions of persuasiveness, and persuasiveness affected sentencing outcomes, likeability did not directly affect the final sentencing outcome. This was not the case for evidence quality—elements of which remained influential for both evidence evaluation and sentencing decisions. Taken together, this suggests that lay decision-makers consider elements of expert evidence quality more so than peripheral likeability information when making their sentencing decisions.

These results raise further questions that should be explored. First, it is important to establish whether these effects are reliable by attempting to replicate the results. It is also important to consider whether our results are generalisable, especially given the lower ecological validity of trial transcript studies. Perceptions of likeability are strongly affected by non-verbal cues such as smiling, nodding, eye contact and open posture (Kleinke, 1986; Leathers, 1997; Gladstone and Parker, 2002). These cues were not available in our materials. Thus, it is important to examine whether the effects of expert likeability are replicated when more realistic video manipulations of likeability are used. Finally, we were interested to inform our general understanding of the relationship between likeability and persuasion by considering likeability's directional impact on persuasiveness. It is unclear whether being likeable increases persuasiveness, or if it is being disliked that decreases persuasiveness, or both. Study 2 is designed to tease apart these possibilities.