Creepy, but Persuasive: In a Virtual Consultation, Physician Bedside Manner, Rather than the Uncanny Valley, Predicts Adherence

Care for chronic disease requires patient adherence to treatment advice. Nonadherence worsens health outcomes and increases healthcare costs. When healthcare professionals are in short supply, a virtual physician could serve as a persuasive technology to promote adherence. However, acceptance of advice may be hampered by the uncanny valley effect—a feeling of eeriness elicited by human simulations. In a hypothetical virtual doctor consultation, 441 participants assumed the patient’s role. Variables from the stereotype content model and the heuristic–systematic model were used to predict adherence intention and behavior change. This 2 × 5 between-groups experiment manipulated the doctor’s bedside manner—either good or poor—and virtual depiction at five levels of realism. These independent variables were designed to manipulate the doctor’s level of warmth and eeriness. In hypothesis testing, depiction had a nonsignificant effect on adherence intention and diet and exercise change, even though the 3-D computer-animated versions of the doctor (i.e., animation, swapped, and bigeye) were perceived as eerier than the others (i.e., real and cartoon). The low-warmth, high-eeriness doctor prompted heuristic processing of information, while the high-warmth doctor prompted systematic processing. This pattern contradicts evidence reported in the persuasion literature. For the stereotype content model, a path analysis found that good bedside manner increased the doctor’s perceived warmth significantly, which indirectly increased physical activity. For the heuristic–systematic model, the doctor’s eeriness, measured in a pretest, had no significant effect on adherence intention and physical activity, while good bedside manner increased both significantly. Surprisingly, cognitive perspective-taking was a stronger predictor of change in physical activity than adherence intention. 
Although virtual characters can elicit the uncanny valley effect, their effect on adherence intention and physical activity was comparable to a video of a real person. This finding supports the development of virtual consultations.


INTRODUCTION
There is growing interest in adopting virtual characters to persuade people to change their health-related behaviors. Strategies that people adopt to persuade others, such as arguments and social cues, could also be employed by virtual characters (André et al., 2011). This could offer a novel and efficient way to increase adherence to treatment advice and improve health literacy more generally. Consulting with conversational agents has been found to increase users' physical activity (Yin et al., 2010) and guide older adults toward healthy behaviors (Looije et al., 2010). However, virtual characters, depending on their level of realism, could appear eerie, an effect called the uncanny valley (Mori, 2012). Thus, we must consider how eeriness influences persuasion in virtual characters.
Persuasion can be used socially to encourage behaviors that enhance health and discourage those that harm it (Umberson, 1987; Lewis and Rook, 1999). Adherence measures how much a patient's behavior follows the program they have established with their healthcare provider, such as taking prescribed medications on time and making recommended lifestyle changes (McDonald et al., 2002). Adherence intention is the patient's strength of determination to follow the program.
This work adopts as its use case scenario a virtual consultation involving a possible diabetes diagnosis. Diabetes and other chronic diseases pose a growing threat to global health. By 2030, the number of people with type II diabetes is projected to hit 578 million (Saeedi et al., 2019). Diabetes reduces life expectancy by 5 years, and its complications reduce quality of life (Zhuo et al., 2013). To stay healthy, diabetes patients must follow their doctors' diet and exercise advice (Nelson et al., 2002). Although a sedentary lifestyle and poor dietary habits are risk factors, a healthy diet and regular exercise can delay or prevent type II diabetes and its associated health complications (Eriksson and Lindgärde, 1991;Tuomilehto et al., 2001;Rejeski et al., 2012;Sami et al., 2017).
Increasing adherence is a major public health challenge. In the United States, nonadherence to medication affects 40-50% of patients with chronic diseases like diabetes and hypertension, leading to 100,000 preventable deaths each year (Kleinsinger, 2018). For diabetic patients, nonadherence to medication increases adverse health outcomes, hospitalization, and mortality (Ho et al., 2006). The rate of nonadherence can exceed 70% when the physician advises significant or complex changes in behavior (Martin et al., 2005). Nonadherence results in increased hospitalization and mortality rates, especially in older adults (Walsh et al., 2019).
The cost of nonadherence to healthcare systems is staggering. In the United States, estimates ranged from $100 billion to $300 billion per year (Chisholm-Burns and Spivey, 2012; McGuire and Iuga, 2014; Morello and Hirsh, 2017). Per patient, they ranged from $949 to $44,190 in 2015, depending on the disease (Cutler et al., 2018).
Physician bedside manner influences the physician-patient relationship, which, in turn, influences adherence (Miller, 1997; Safran et al., 1998; Ettner, 1999). For example, a poor physician-patient relationship lowered long-term adherence in patients with hypertension (Waeber et al., 2000). Conversely, a European study found that a good physician-patient relationship encouraged patients to follow recommendations (Stavropoulou, 2011). A good relationship also improved adherence for HIV-positive patients (Schneider et al., 2004).
The United States faces a shortage of clinicians and other healthcare professionals (McKechnie, 2016;Kirch and Petelle, 2017;Marć et al., 2019). Researchers estimate that nearly 52,000 more primary care physicians will be needed by 2025 (Petterson et al., 2012). A virtual clinical consultation could address the scarcity of healthcare experts. However, creating credible virtual characters remains a challenge for researchers and designers, in part because factors like the uncanny valley could negatively influence the persuasiveness of their advice (McDonnell and Breidt, 2010;Wang et al., 2013).
The uncanny valley effect, proposed in 1970 by Mori (2012), is a negative affective reaction toward objects that imperfectly resemble human beings, such as android robots or computer-animated characters. A large meta-analysis has confirmed the effect (Diel et al., 2022), which could hinder the acceptance of virtual characters and their advice. The uncanny valley effect can occur when a character's features are atypical, appear less real than others, or deviate from a familiar configuration (Diel and MacDorman, 2021). This can elicit in viewers a feeling of eeriness and revulsion. For example, eeriness could be elicited by atypical proportions in a photorealistic human face (Green et al., 2008), such as enlarged eyes, or inconsistent realism among facial features (MacDorman et al., 2009; Stein and Ohler, 2018). Eeriness has been considered detrimental to persuasion because of its negative effect on a virtual character's credibility (Patel and MacDorman, 2015). Increasing a human character's overall realism makes subtle nonhuman imperfections in its features more noticeable and disturbing (McDonnell et al., 2012). Different stylizations of computer-generated characters could affect the perception of eeriness, trustworthiness, attractiveness, and realism (Zell et al., 2015; Schindler et al., 2017).
MacDorman (2019) proposed that the uncanny valley effect could disrupt feelings of empathy for a virtual character. It could, therefore, have a greater effect on a high-warmth character than a low-warmth character because a high-warmth character elicits more empathy. Thus, in a virtual doctor's consultation, it would be useful to examine both realism and bedside manner to check for interaction effects.

Hypotheses
Virtual characters have been found to be at least as persuasive as real people (Bickmore et al., 2009;Dai and MacDorman, 2018;Ogawa et al., 2018). Their ability to persuade has been used to improve patient adherence (Brown et al., 2016;Richards and Caldwell, 2016). Thus, we hypothesize that (H1) the depiction of the virtual physician affects adherence. In our previous results, the animated physician was more persuasive than a video recording of a real person because interacting with the animated character was more enjoyable (Dai and MacDorman, 2018).
In the literature on persuasion, the heuristic-systematic model (HSM) explains how people process information (Chaiken, 1980; Chaiken and Eagly, 1983; Chen and Chaiken, 1999; Todorov et al., 2002). Systematic processing demands cognitive effort to understand and evaluate the content of a message. Thus, for systematic processing, message-related thoughts mediate persuasion. By contrast, heuristic processing relies on rules of thumb or peripheral cues, which reduce cognitive effort. For heuristic processing, persuasion is mediated by thoughts unrelated to the message, such as features of the message's source. These could include, for example, a speaker's interpersonal warmth, physical attractiveness, novelty, and realism (Miller et al., 1976; Chaiken, 1980). People may pay more attention to a virtual character than a real human because its visual novelty stimulates curiosity, discovery, and learning (Sokolov, 1963; Patel et al., 2014). For example, people stared longer at the faces of eerie digital human characters than real humans (Carter et al., 2013). Virtual environments have used behavioral measures like proximity (i.e., the minimum distance between a person and a virtual character). People moved closer to eerie, zombielike virtual characters and paid more attention to them than to more realistic characters (Zibrek et al., 2018). Thus, we hypothesize that by increasing the heuristic processing of information, (H2a) the high-eeriness virtual physician elicits more thoughts related to the message source than the low-eeriness virtual physician and that (H2b) the low-eeriness virtual physician elicits more thoughts related to the message itself than the high-eeriness physician.
Fiske and colleagues proposed the stereotype content model (SCM), which describes the formation of interpersonal impressions and group stereotypes on two dimensions: warmth and competence (Fiske et al., 2002). Warmth encompasses traits indicating whether a person intends to help or harm, such as the presence or absence of friendliness, helpfulness, morality, sincerity, and trustworthiness. Competence encompasses traits indicating the ability or inability to act on that intention, such as the presence or absence of creativity, efficacy, intelligence, and skill (Fiske et al., 2002, 2007). A person viewed as warm and competent elicits positive emotions and behaviors, whereas a person viewed as lacking these traits elicits negative emotions and behaviors. These responses confer selective advantage to individuals and groups. Warmth is the primary dimension of interpersonal perception. It accounts for 53% of the variance in global impressions, while competence accounts for 29% (Wojciszke et al., 1998). Warmth and competence have been used to explain interpersonal and intergroup social cognition, especially as related to competition and status.
The physician's warmth helps support the physician-patient relationship (Buller and Street, 1992). In clinical settings, it is exhibited by such affiliative behavior as attention to patients, empathy, and kindness. Affiliative behavior motivates patient adherence in real (Willson and McNamara, 1982;Kim et al., 2004) and virtual contexts (Dai and MacDorman, 2018). However, affiliative behavior does not ensure that patients will continue to follow the recommended treatment (O'Hair, 1986;Street, 1990;Buller and Street, 1992). We hypothesize (H3) greater adherence to the advice of a high-warmth physician than a low-warmth physician.
Warmth and competence are positively correlated when patients interact with physicians in clinical settings (Kraft-Todd et al., 2017). A rude, unfriendly physician may attract more attention than a polite, friendly one. Moreover, sarcasm and dark humor could make communication more entertaining and memorable (Ziegele and Jost, 2020). We hypothesize that (H4a) the low-warmth virtual physician elicits more source-related thoughts than the high-warmth virtual physician by encouraging the heuristic processing of information, and (H4b) the high-warmth virtual physician elicits more message-related thoughts than the low-warmth virtual physician by encouraging the systematic processing of information.
Our assumption may appear to run counter to findings in the persuasion literature, namely that warm, competent, or attractive sources of information elicit more heuristic processing than cold, incompetent, or unattractive ones (Chaiken, 1980, 1987; DeBono and Harnish, 1988; Wood and Kallgren, 1988). According to HSM, if a source is judged positively, less cognitive effort must be expended to evaluate the message. We justify our assumption based on the context of the clinical scenario: Patients expect their doctor to be warm and competent by default. This expectation is implied by the Hippocratic Oath. Behavior deviating from it is surprising, thus drawing attention away from the message to its source.
People's attitudes shape their intentions, which, in turn, predict behavior change (Austin and Vancouver, 1996;Abraham et al., 1998;Webb and Sheeran, 2006;Montano and Kasprzyk, 2015). In general, a shift in attitudes resulting solely from heuristic processing will be less stable and less resistant to counterarguments and have less of an effect on behavior change than a shift resulting from systematic processing (Chaiken, 1980). Prochaska and DiClemente (1983) proposed the transtheoretical model of behavior change, which has been used to increase physical activity (Prochaska et al., 1994;Prochaska et al., 2015;Stonerock and Blumenthal, 2017). In their model, adherence intention is a major predictor of behavior change (Austin and Vancouver, 1996;Abraham et al., 1998;Webb and Sheeran, 2006;Montano and Kasprzyk, 2015).
This study investigates the effect of bedside manner on people's behavior change in diet and physical activity. Good bedside manner in a virtual physician has been found to increase patients' adherence and satisfaction (Schmid Mast et al., 2007;Dai and MacDorman, 2018). Based on these findings, we hypothesize that (H5a) the consultation with the high-warmth physician will improve diet and (H5b) physical activity more than the consultation with the low-warmth physician. Based on the literature on adherence, we hypothesize that (H6a) those reporting greater adherence intention after the virtual consultation will improve their diet and (H6b) increase their physical activity more than others.
To the best of our knowledge, the effect of a virtual character on persuasion has not yet been fully explored in a virtual clinical setting. In particular, the extent to which the physician's warmth and eeriness affect heuristic and systematic processes and how these processes affect adherence intention and behavior change has not been considered. Our experiment manipulated a virtual doctor's bedside manner and depiction to determine how these variables influence perceived warmth and eeriness, heuristic and systematic processing, adherence intention, and behavior change. We further explored these variables of the stereotype content model and the heuristic-systematic model in a path analysis to predict adherence intention and behavior change. Thus, our overarching research question is this: How does a physician's bedside manner and depiction influence adherence intention and behavior change through the stereotype content model and heuristic-systematic model?

Addressing Threats to Validity
Our previous study examined how bedside manner, outcome (i.e., being awarded a fellowship or sued for malpractice), and virtual depiction affected adherence intention in a virtual clinical scenario (Dai and MacDorman, 2018). Its findings revealed two main threats to internal validity resulting from the experimental design. First, poor bedside manner caused the doctor to be rated significantly higher in eeriness than good bedside manner. The eeriness index was not designed as a posttest for comparing characters whose responses differ greatly in warmth or other traits between scenarios. To eliminate the effect of bedside manner on eeriness, this study measured the doctor's eeriness in a neutral setting as a pretest (see Procedure). The second threat to internal validity was that our previous study failed to separate the computer-animated doctor's positive effect on consultation enjoyment and adherence intention from the uncanny valley's negative effect. To address this threat, this study used multiple versions of the computer-animated doctor to measure eeriness. Finally, our previous study did not measure behavior change. This study measured dietary and exercise change at least 1 week after the virtual consultation.

METHODS
The experiment centered on an online virtual consultation. Participants took on the role of a patient in a doctor's consultation regarding a possible diabetes diagnosis. The virtual doctor was manipulated by bedside manner and depiction. Bedside manner had two stimulus conditions: either good or poor. Depiction had five stimulus conditions that varied in their level of realism: cartoon, bigeye, swapped, animation, and real (Figure 1).
These five conditions were selected to manipulate the virtual physician's level of eeriness to measure the effect of eeriness on adherence intention and behavior change, as mediated by the stereotype content model and the heuristic-systematic model. The real condition used a video of a male actor playing the part of the doctor. This video was used as a reference by a professional animator to create the animation condition. A realistic computer model was first developed from high-resolution reference photos of the human actor. In a previous study, the same model had been rated significantly eerier than the real actor (Patel and MacDorman, 2015).
The animation condition served as a reference for the remaining three conditions. We created a cartoon character as a low-eeriness condition with flat shading and fewer textural features. This approach has been used to avoid the uncanny valley in game development (McDonnell and Breidt, 2010;McDonnell et al., 2012). FaceSwap, a deep learning neural network program, was used to generate characters with a swapped face (swapped) and a swapped face with eyes enlarged 50% (bigeye). The bigeye condition was motivated by the fact that viewers are prone to be sensitive to atypical features in human faces, especially when they involve the eyes (Kätsyri et al., 2015;Schein and Gray, 2015). Increasing the size of the eyes or making their level of realism inconsistent with other facial features causes the face to appear eerier (MacDorman et al., 2009;Seyama and Nagayama, 2009;MacDorman and Chattopadhyay, 2016;Feng et al., 2018).

Participants
Undergraduate and graduate students randomly selected from a public university system in the Midwestern United States comprised the sample. Each was randomly assigned to one of 10 groups.
A total of 441 participants completed the experiment and first survey (72% female, n = 318) with between 36 and 52 in each group. Among those participants, 329 completed a second follow-up survey, with between 28 and 38 in each group (73% female, n = 244). One participant's responses were removed for not completing the thought-listing task. Participants ranged in age from 18 to 70 (Mdn = 22, IQR = [19, 28]), and most had grown up in the United States (89%, n = 394).
FIGURE 1 | The doctor in the experiment was presented with five styles of depiction. These were, from left to right, (A) cartoon, (B) bigeye, (C) swapped, (D) animation, and (E) real. Real was a video of a real person. Animation was a 3-D computer animation. Swapped was a face replacement of animation using deep learning. Bigeye was also a face replacement but with eyes enlarged 50%. Cartoon was a cartoon version of animation. The props and background for bigeye and swapped were identical to animation.

Power Analysis
The power analysis aimed to determine how many participants each condition would require for the effect of depiction on pretest Eeriness to be sufficient to have a significant effect on Adherence Intention. In MacDorman et al. (2009), a 50% increase in eye size significantly increased eeriness, $M_1 = -1.60$, $SD_1 = 1.22$, $M_2 = 1.37$, $SD_2 = 1.23$, $\eta_p^2 \approx 0.65$. To consider related constructs, Feng et al. (2018) found that a 25% and a 50% increase in eye size significantly decreased how often a face was preferred, $\eta_p^2 = 0.35$. Seyama and Nagayama (2009) found that a 50% increase in eye size decreased pleasantness significantly, F(5, 195) = 49.73, $\eta_p^2 = 0.56$, p < 0.001.
In preparation for this experiment, 30 participants rated a virtual physician on Eeriness. The virtual physician was presented at three levels of realism in 20-s video clips. At the lowest level, the eyes were enlarged 50%. The effect of realism on Eeriness was significant and large, F(2, 58) = 15.34, $\eta_p^2 = 0.35$, p < 0.001. In Dai and MacDorman (2018), bedside manner had a significant effect on Warmth, F(1, 214) = 193.99, $\eta_p^2 = 0.48$, p < 0.001, and a significant effect on Adherence Intention, F(1, 730) = 282.86, $\eta_p^2 = 0.28$, p < 0.001. Based on these results, we assumed that the effect of increasing eye size 50% on pretest Eeriness and Adherence Intention would be similar in magnitude to the effect of bedside manner on Warmth and Adherence Intention. For a 0.05 alpha level and an effect size set conservatively at $\eta_p^2 = 0.15$, a two-way between-subjects ANOVA with two levels for the first factor and five for the second factor would require 36 participants in each of the 10 groups to achieve power of at least 0.90 (bedside manner: n = 180, power > 0.99; depiction: n = 72, power > 0.91).
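As a rough illustration of the arithmetic behind these figures, the sketch below converts the conservative partial eta squared into Cohen's f, the effect size that power software commonly expects, and tallies the per-level sample sizes implied by 36 participants per cell. The function names are illustrative assumptions, and the power values reported above are not recomputed here.

```python
import math

def cohens_f(eta_sq_partial: float) -> float:
    """Convert partial eta squared to Cohen's f: f = sqrt(eta2 / (1 - eta2))."""
    if not 0.0 <= eta_sq_partial < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_sq_partial / (1.0 - eta_sq_partial))

def per_level_n(n_per_cell: int, levels_a: int, levels_b: int) -> dict:
    """Sample sizes implied by a balanced two-way between-subjects design."""
    return {
        "total": n_per_cell * levels_a * levels_b,
        "per_level_a": n_per_cell * levels_b,  # e.g., each bedside manner level
        "per_level_b": n_per_cell * levels_a,  # e.g., each depiction level
    }

f = cohens_f(0.15)              # conservative effect size from the text
sizes = per_level_n(36, 2, 5)   # 2 (bedside manner) x 5 (depiction)
print(round(f, 2), sizes)
```

Under these assumptions, the per-level counts come out to 180 for bedside manner and 72 for depiction, matching the values given in parentheses above.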

Research Design
The experiment adopted a 2 × 5 between-groups posttest design, although the doctor's perceived Eeriness was also pretested. The 10 treatment groups had five different stylizations of the virtual doctor, who had either good or poor bedside manner.
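The 2 × 5 structure can be sketched as a Cartesian product of the two factors. The assignment function below uses simple (unblocked) randomization with a fixed seed, which is an assumption; the text states only that each participant was randomly assigned to one of 10 groups.

```python
import itertools
import random

BEDSIDE_MANNER = ["good", "poor"]
DEPICTION = ["cartoon", "bigeye", "swapped", "animation", "real"]

# The 10 treatment groups are every pairing of the two factors.
CONDITIONS = list(itertools.product(BEDSIDE_MANNER, DEPICTION))

def assign(participant_ids, seed=0):
    """Assign each participant independently to one group.

    Simple randomization is assumed here; it yields unequal group sizes.
    """
    rng = random.Random(seed)
    return {pid: rng.choice(CONDITIONS) for pid in participant_ids}

assignment = assign(range(441))
print(len(CONDITIONS), assignment[0])
```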

Procedure
Participants first completed informed consent. To measure the eeriness of the virtual physician, Dr. Richards, independently from the bedside manner manipulation, participants rated the doctor on Eeriness in a short, 7-s clip. Dr. Richards was presented on the screen holding a clipboard. He wore a white lab coat over a white shirt with a red tie.
Participants were then introduced to the doctor-patient scenario and their role as patient. They began the consultation with a test result indicating higher-than-normal blood sugar.
Eight hypothetical doctor-patient interactions comprised the virtual consultation. For all but the last, the participant chose a preferred question from four options. For experimental control, the doctor provided the same response, regardless of the question selected, and the questions and responses were designed to create the impression of a logically flowing conversation. The doctor closed the consultation by providing the patient with a recommendation concerning diet and exercise in audio and text.
Next, the participant completed the indices (see Dependent Variables) and a demographics questionnaire. The indices were presented in the following order: a thought-listing task; a fully randomized block of scales composed of human realism, eeriness, warmth, and competence; cognitive perspective-taking; adherence intention; diet; and physical activity. After at least 1 week, the diet and physical activity indices were administered a second time to measure behavior change.
The experiment was conducted from October 10 to November 8, 2019. The median duration of the experiment and first survey was 21 min 39 s (25th percentile 18 min 2 s, 75th percentile 29 min 35 s). For the virtual consultation, the duration of the physician's part of the physician-patient interactions was 3 min 26 s for the physician with good bedside manner and 2 seconds longer for the physician with poor bedside manner. The average duration of the user's part was 2 min 18 s for a total virtual consultation time of 5 min 45 s.

Independent Variables
The independent variables were bedside manner (good or poor) and depiction (five levels of realism). Depiction was recoded as real (two levels) and 3-D animation (two levels) for planned contrasts. A transcript of the interactions including the bedside manner manipulation and diet and exercise recommendation is provided in Supplementary Material. All videos including the bedside manner and depiction manipulations are available at https://doi.org/10.6084/m9.figshare.13337267.

Bedside Manner
Bedside manner had two conditions (i.e., good or poor), which were designed to exhibit high warmth or low warmth, respectively. The doctor's bedside manner was depicted mainly via dialogue (40% of the total), nonverbal expressions, and gestures. The high-warmth doctor treated the patient with care, empathy, and patience, responded to questions positively, provided reassurance and emotional support, and expressed a willingness to be available. By contrast, the low-warmth doctor was rude and impatient, joked about the patient's diagnosis, and tried to end the consultation early. The low-warmth doctor showed little empathy for the patient, interacted paternalistically, and implied that the patient's questions were ill-informed. These are the same qualities encountered in real physicians with good and poor bedside manner (Hickson, 1994;Levinson et al., 1997;Schmid Mast et al., 2007).

Depiction
The depiction of the character was presented in five different rendering styles: cartoon, bigeye, swapped, animation, and real. The five styles were a video of a real person, a 3-D computer animation modeled on the video using Maya and ZBrush, a software-generated cartoon version of the animation, and face replacements of the animation generated by FaceSwap, which contained two versions (i.e., the face replacement with an animated face, and the same face with eyes enlarged 50%). Apart from character stylization, all five high-warmth conditions had the same narrative content, and all five low-warmth conditions had the same narrative content. The doctor's mouth movements were synchronized to his voice.

Dependent Variables
Indices were used to measure the dependent variables. They are discussed below and listed in the Supplementary Material. Source- and message-related thoughts were determined by two coders who labeled and counted each participant's thoughts. Dietary and exercise change were quantified by the participant who counted servings of food or hours of activity during the prior week. All other indices were composed of semantic differential scales (e.g., strongly agree-strongly disagree or creepy-ordinary). Each scale was depicted with a horizontal bar with a term on one end and its antonym on the other (see Reips and Funke, 2008; Funke and Reips, 2012). For each scale, the participant used a slider to position an indicator on the bar. The indicator's position was recorded as a decimal value between -1.00 and 1.00.

Human Realism and Eeriness
To evaluate the appearance of the message source, Dr. Richards' human realism was measured using the combined scales of the realism and humanness indices; eeriness was measured using the eeriness index's eerie subindex (Ho and MacDorman, 2017).

Warmth and Competence
Dr. Richards' perceived warmth and competence were measured using a warmth index (Ho and MacDorman, 2010) and a competence index (McCroskey and Teven, 1999).

Source- and Message-Related Thoughts
A thought-listing task was used to analyze the heuristic and systematic processing of information for persuasion (Chaiken, 1980; Pallak et al., 1983; Koh and Sundar, 2010). Chaiken and Maheswaran (1994) used the number of source-related thoughts to measure heuristic processing and the number of message-related thoughts to measure systematic processing. In the experiment, participants listed their thoughts about consultation content and Dr. Richards' appearance. They were given 3 min and 10 text input boxes to record their thoughts. Two coders, blind to the conditions, independently assigned each thought to one of seven categories: positive, neutral, or negative source-related, positive, neutral, or negative health-related, or other. An index for source-related thoughts was calculated by adding up the number of thoughts in this category. The process was repeated for message-related thoughts, which were further divided into negative and nonnegative thoughts.
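The counting step described above might be sketched as follows. The label strings are hypothetical placeholders for the seven coding categories, and the health-related categories are treated here as the message-related ones, which is an assumption about the coding scheme.

```python
from collections import Counter

# Hypothetical label names for the coding categories (assumption).
SOURCE = {"source_positive", "source_neutral", "source_negative"}
MESSAGE_NONNEGATIVE = {"message_positive", "message_neutral"}
MESSAGE_NEGATIVE = {"message_negative"}

def thought_indices(labels):
    """Count one participant's coded thoughts per index; 'other' is ignored."""
    counts = Counter(labels)
    return {
        "source_related": sum(counts[c] for c in SOURCE),
        "message_nonnegative": sum(counts[c] for c in MESSAGE_NONNEGATIVE),
        "message_negative": sum(counts[c] for c in MESSAGE_NEGATIVE),
    }

example = ["source_negative", "message_positive", "other",
           "source_neutral", "message_negative"]
print(thought_indices(example))
```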

Cognitive Perspective-Taking
Participants' ability to imagine Dr. Richards' mental state (Davis, 1983;Leslie, 1987;Frith, 2001) was measured using a cognitive perspective-taking index designed for the virtual consultation's specific clinical scenario (Dai and MacDorman, 2018). The index was composed of statements about Dr. Richards' mental state paired with the semantic differential scale strongly agree-strongly disagree. Either strongly agree or strongly disagree corresponded to the correct answer. In that sense, the index was a disguised test. This variable was included as a second measure of information processing (DeBono and Harnish, 1988).

Adherence Intention
The intention to adhere to the doctor's advice was measured using an adherence intention index designed for the clinical scenario (Dai and MacDorman, 2018).

Dietary Change and Exercise Change
Behavior change was measured by a survey on diet and physical activities, inspired by a lifestyle change survey for diabetic patients (Chong et al., 2017). Participants reported their consumption of vegetables, fruits, fatty and processed foods, and foods with added sugar for the prior week. These four items comprised a diet index. They also reported how many hours they spent walking and doing moderate and vigorous physical activities for the prior week. These three items comprised an index on physical activity. The indices were administered immediately after the experiment and at least 1 week later. The difference between their first and second administration constitutes the dietary change and exercise change indices.
The cognitive perspective-taking index was an unweighted average of its scales. To maximize variance explained, all other indices were averages of their respective scales weighted by their component loadings.
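A minimal sketch of the two averaging schemes is given below. Normalizing by the sum of the loadings and taking change as the second administration minus the first are both assumptions, since the text does not state the exact formula or sign convention; the example scores and loadings are invented for illustration.

```python
def weighted_index(scores, loadings):
    """Average of scale scores weighted by component loadings.

    Dividing by the sum of the loadings is an assumed normalization.
    """
    if not scores or len(scores) != len(loadings):
        raise ValueError("scores and loadings must be non-empty and equal in length")
    return sum(s * w for s, w in zip(scores, loadings)) / sum(loadings)

def unweighted_index(scores):
    """Plain mean, as used for the cognitive perspective-taking index."""
    return sum(scores) / len(scores)

def change_score(first, second):
    """Second administration minus first (sign convention is an assumption)."""
    return second - first

scores = [0.5, -0.25, 1.0]    # invented slider values in [-1, 1]
loadings = [0.8, 0.6, 0.6]    # invented component loadings
print(weighted_index(scores, loadings), unweighted_index(scores))
```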
Test statistics were interpreted at the significance level p ≤ 0.05. For the manipulation checks and directional hypotheses, planned contrasts were one-tailed. All other tests were two-tailed. Linear mixed-effects models were fitted by maximum-likelihood estimation. Mixed-effects models were used instead of ANOVA and ANCOVA because they account for variability among groups. Planned contrasts were orthogonal and used type III sum of squares.
For path models, the standard for acceptable global fit required meeting the following criteria: p > 0.05 for the model $\chi^2$, the root mean square error of approximation (RMSEA) $\varepsilon \le 0.08$ (MacCallum et al., 1996), the lower bound of its confidence interval $\varepsilon_L = 0$, the comparative fit index (CFI) ≥ 0.95, and the standardized root mean squared residual (SRMR) ≤ 0.08 (Hu and Bentler, 1999). The standard for acceptable local fit was a significant p-value for a parameter estimate representing a direct effect between variables.
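The global fit criteria listed above can be expressed as a single check. The class and field names are illustrative assumptions, and the example values are invented rather than taken from the fitted models.

```python
from dataclasses import dataclass

@dataclass
class FitStats:
    chi2_p: float          # p-value of the model chi-square test
    rmsea: float           # root mean square error of approximation
    rmsea_ci_lower: float  # lower bound of the RMSEA confidence interval
    cfi: float             # comparative fit index
    srmr: float            # standardized root mean squared residual

def acceptable_global_fit(fit: FitStats) -> bool:
    """Return True only if every stated criterion is met."""
    return (
        fit.chi2_p > 0.05
        and fit.rmsea <= 0.08
        and fit.rmsea_ci_lower == 0.0
        and fit.cfi >= 0.95
        and fit.srmr <= 0.08
    )

good = FitStats(chi2_p=0.31, rmsea=0.02, rmsea_ci_lower=0.0, cfi=0.99, srmr=0.03)
poor = FitStats(chi2_p=0.31, rmsea=0.02, rmsea_ci_lower=0.0, cfi=0.90, srmr=0.03)
print(acceptable_global_fit(good), acceptable_global_fit(poor))
```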

Reliability and Psychometrics of Indices
All indices were reliable (0.72 ≤ α ≤ 0.93). Item 7 was removed from the adherence intention index to increase its reliability.
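For readers unfamiliar with the reliability coefficient, a minimal sketch of Cronbach's alpha and of alpha with one item deleted follows. The data are invented, and the authors' actual software and procedure for dropping Item 7 are not specified in the text.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k lists, each holding one scale's scores for the same respondents."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(responses) for responses in zip(*items)]  # per-respondent sums
    item_variance = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance / variance(totals))

def alpha_if_item_deleted(items, index):
    """Alpha recomputed without the item at the given position."""
    return cronbach_alpha(items[:index] + items[index + 1:])

# Invented scores: the third item tracks the other two poorly.
items = [[1, 2, 3, 4], [2, 3, 4, 5], [4, 1, 3, 2]]
print(cronbach_alpha(items), alpha_if_item_deleted(items, 2))
```

In this toy example, removing the poorly aligned item raises alpha, which is the general rationale for dropping an item to increase reliability.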
In the thought-listing task, the two coders had substantial agreement in assigning one of the seven labels to each thought, κ = 0.73, agreeing on the labels for 75% of the thoughts. Labeling discrepancies were resolved through discussion.
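For readers unfamiliar with the agreement statistic, Cohen's kappa corrects the observed proportion of agreement for the agreement expected by chance from each coder's label frequencies. The sketch below is a generic illustration, assuming the reported κ is Cohen's kappa; the labels and data are hypothetical and are not the study's coding data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same nonempty item set")
    n = len(labels_a)
    # Proportion of items on which the coders chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from the product of the coders' marginal frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical labels: "s" = source-related, "m" = message-related.
coder_1 = ["s", "s", "m", "m"]
coder_2 = ["s", "m", "m", "m"]
print(cohens_kappa(coder_1, coder_2))
```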
The psychometric properties of the final indices are listed in Table 1. The initial indices and their factor loadings appear in the Supplementary Material. Correlations among independent and dependent variables (indices) are listed in Figure 2.

Manipulation Checks
A two-way MANOVA indicated a significant effect of bedside manner, Pillai's trace V = 0.49, F(4, 432) = 105.00, p < 0.001, and depiction, V = 0.19, F(16, 1740) = 5.28, p < 0.001, on the manipulation check variables. Two-way mixed-effects models confirmed that bedside manner × depiction had only nonsignificant interaction effects on Human Realism, pretest Eeriness, Eeriness, Warmth, and Competence. Mixed-effects models with planned contrasts were used for the manipulation checks. The values were calculated after index revision and weighting the means by their component loadings.
FIGURE 2 | Pearson's correlation between the dependent and independent variables. All correlations were significant except those indicated by white text on a gray background.
Frontiers in Virtual Reality | www.frontiersin.org September 2021 | Volume 2 | Article 739038
After controlling for the effect of depiction, bedside manner had a nonsignificant effect on Human Realism and pretest Eeriness, but a significant effect on posttest Eeriness, Warmth, and Competence (Table 2, n = 441). Planned contrasts revealed that the doctor with good bedside manner was rated significantly less eerie, warmer, and more competent than the doctor with poor bedside manner.
After controlling for the effect of bedside manner, depiction had a significant effect on Human Realism and pretest Eeriness, but a nonsignificant effect on posttest Eeriness, Warmth, and Competence. The fact that bedside manner, not depiction, had a significant effect on posttest Eeriness indicates that bedside manner is not only a confounding variable but also renders the effect of depiction nonsignificant.
A planned contrast revealed that the real human doctor (i.e., real) was rated significantly higher in Human Realism than the others (i.e., cartoon, bigeye, swapped, and animation). Another planned contrast revealed that 3-D animation (i.e., bigeye, swapped, and animation) was rated significantly higher in pretest Eeriness than the others (i.e., cartoon, real). Figure 3 plots pretest Eeriness by depiction. Increasing levels of realism reproduced the characteristic U-shape of the uncanny valley (Mori, 2012; Ho and MacDorman, 2017).
Women have been found to be more responsive than men to specific persuasion strategies and scenarios (MacDorman et al., 2009; Orji et al., 2015). Their satisfaction is also more likely to increase when the physician exhibits good bedside manner (Schmid Mast et al., 2007). Thus, to ensure the generalizability of our results given that the participants were 72% female, we tested the effect of participant gender on all dependent variables used in the manipulation checks and in hypothesis testing. A MANOVA revealed that the effect of gender on these variables was nonsignificant.

Hypothesis Testing
A two-way mixed-effects model confirmed that bedside manner × depiction had only nonsignificant interaction effects on Adherence Intention, number of source- and message-related thoughts, Dietary Change, and Exercise Change. Mixed-effects models with planned contrasts were used in hypothesis testing. The results are shown in Table 3.
H1 states that the physician's depiction affects the intention to adhere to his advice. After controlling for the effect of bedside manner, depiction had a nonsignificant effect on Adherence Intention. Thus, H1 was not supported.
H2a states that the high-eeriness physician elicits more source-related thoughts than the low-eeriness physician. High eeriness is operationalized as 3-D animation (i.e., bigeye, swapped, and animation) and low eeriness as cartoon and real. After controlling for the effect of bedside manner, depiction had a significant effect on the number of source-related thoughts. A planned contrast revealed that 3-D animation elicited significantly more source-related thoughts than cartoon and real (Table 3; Figure 4). Thus, H2a was supported.
H2b states that the low-eeriness physician elicits more message-related thoughts than the high-eeriness physician. After controlling for the effect of bedside manner, depiction had a nonsignificant effect on the number of message-related thoughts. Thus, H2b was unsupported.
H3 states that patients would be more likely to adhere to the advice of the high-warmth physician than the low-warmth physician. After controlling for the effect of depiction, bedside manner had a significant effect on Adherence Intention. A planned contrast revealed that Adherence Intention was significantly higher for the doctor with good bedside manner (Table 3; Figure 5). Thus, H3 was supported.
H4a states that the low-warmth physician elicits more source-related thoughts than the high-warmth physician. After controlling for the effect of depiction, bedside manner had a significant effect on the number of source-related thoughts. A planned contrast revealed that the doctor with poor bedside manner elicited significantly more source-related thoughts. Thus, H4a was supported.
H4b states that the high-warmth physician elicits more message-related thoughts than a low-warmth physician. After controlling for the effect of depiction, bedside manner had a significant effect on the number of message-related thoughts. A planned contrast revealed that the doctor with good bedside manner elicited significantly more message-related thoughts. Thus, H4b was supported.
Planned contrasts revealed 3-D animation elicited significantly more source-related thoughts, and good bedside manner significantly increased Adherence Intention, elicited significantly fewer source-related thoughts, elicited significantly more message-related thoughts, and increased Exercise Change.
H5a states that the consultation with the high-warmth physician will improve diet more than the consultation with the low-warmth physician, and H5b states that it will also improve physical activity more than the consultation with the low-warmth physician. To operationalize behavior change, diet and exercise activity were measured immediately after the consultation and at least 1 week later. After controlling for the effect of depiction, bedside manner had a nonsignificant effect on Dietary Change (Table 3). However, after controlling for the effect of depiction, bedside manner had a significant effect on Exercise Change. A planned contrast revealed that good bedside manner increased Exercise Change more than poor bedside manner. Thus, although H5a was not supported, H5b was supported.
H6a states that those reporting greater adherence intention after the virtual consultation will improve their diet more than others, and H6b states that they will also increase their physical activity more than others. After controlling for the effect of bedside manner and depiction, Adherence Intention had a nonsignificant effect on Dietary Change and Exercise Change. Thus, neither H6a nor H6b was supported.

Regression Analysis
A regression analysis revealed that 3-D animation was a nonsignificant predictor of Adherence Intention and Exercise Change. Bedside manner was a significant predictor of Adherence Intention with a medium-to-large effect size, β = 0.48, t(439) = 11.39, p < 0.001, and explained a significant proportion of the variance, R² = 0.23, adj. R² = 0.23, F(1, 439) = 129.67. Bedside manner was a significant predictor of Exercise Change with a small effect size, β = 0.11, t(326) = 2.05, p = 0.042, and explained a significant proportion of the variance, R² = 0.01, adj. R² = 0.01, F(1, 326) = 4.18.
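The reported statistics for bedside manner can be cross-checked against each other. In a regression with a single predictor, the standardized coefficient equals the correlation, so R² is the square of β, and the model F statistic is the square of the predictor's t statistic. The sketch below assumes each reported model had bedside manner as its only predictor, as the F degrees of freedom suggest; the function name is hypothetical, and small discrepancies are expected because the inputs are rounded.

```python
def implied_r2_and_f(beta, t):
    """For a one-predictor regression: R^2 = beta^2 and F = t^2."""
    return beta ** 2, t ** 2


# Values as reported in the text (rounded to two decimals).
reported = {
    "Adherence Intention": {"beta": 0.48, "t": 11.39, "r2": 0.23, "f": 129.67},
    "Exercise Change": {"beta": 0.11, "t": 2.05, "r2": 0.01, "f": 4.18},
}

for outcome, stats in reported.items():
    r2, f = implied_r2_and_f(stats["beta"], stats["t"])
    print(
        f"{outcome}: implied R2 = {r2:.4f} (reported {stats['r2']}), "
        f"implied F = {f:.2f} (reported {stats['f']})"
    )
```

The implied values agree with the reported ones to within rounding error, which is a quick sanity check that the coefficients, t statistics, and F statistics describe the same models.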

Path Analysis
We next develop path models to address our overarching research question: How do a physician's bedside manner and depiction influence adherence intention and behavior change through the stereotype content model and heuristic-systematic model?
The stereotype content model predicts that adherence intention and behavior change will be greater for the doctor perceived as having more interpersonal warmth because warmth indicates a willingness to help. Moreover, the uncanny valley effect predicts that a 3-D animated doctor elicits the perception of greater eeriness and less warmth than a real or cartoon doctor, which could decrease adherence intention and behavior change, owing to an aversive response. Hypothesis testing confirmed that over the course of a week, the doctor with good bedside manner increased participants' physical activity significantly more than the doctor with poor bedside manner. In a path model, a direct effect is a hypothesized directional relation between two variables, represented by an arrow. The relation's strength is a parameter estimated by model identification. If no relation is hypothesized, there is no arrow, and the parameter is fixed to zero. By the parsimony principle, model specification begins with the simplest model with the highest priority effect and proceeds by adding the next highest priority effects incrementally (Kline, 2016).
We start with Exercise Change, the effect we seek to predict. Among the dependent variables, only cognitive perspective-taking had a significant direct effect on Exercise Change. Adherence Intention and Warmth were added in Model 1, followed by Competence in Model 2, and Warmth's direct effect on Competence in Model 3. In Model 4, our final SCM model, the independent variables bedside manner and 3-D animation had significant direct effects on Warmth (Figure 6).
The heuristic-systematic model predicts that the high-eeriness doctor is processed more heuristically, eliciting more source-related thoughts owing to its visual novelty. By contrast, the low-eeriness doctor is processed more systematically, eliciting more message-related thoughts. Thus, the low-eeriness doctor is predicted to promote adherence more because persuasion increases more and endures longer with the number and favorability of message-related thoughts (Greenwald, 1968). As before, we proceed by adding significant direct effects, this time to Model 4. Model 7, our final hybrid model, had acceptable local and global fit (Figure 8; Table 4).
Based on the heuristic-systematic model, we found that number of source-related thoughts had a negative direct effect on Adherence Intention. We wanted to explore these effects in more detail but could not add more free parameters given the sample size. The N:q rule stipulates that the sample size should exceed the number of free parameters by a factor of at least 20 (Jackson, 2003). Given 441 participants, this set the upper bound on the number of free parameters at 22. So, we removed variables related to the stereotype content model, and instead focused on the heuristic-systematic model, dividing source- and message-related thoughts into negative or nonnegative ones. Model 11 was our final model using HSM variables (Figure 9). The model had acceptable local and global fit (Table 4). Model 11 shows that good bedside manner had a strong negative direct effect on number of negative source-related thoughts, which had a negative direct effect on Adherence Intention.
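The parameter ceiling follows directly from the N:q rule: with a ratio of 20 cases per free parameter, 441 participants allow at most 441 / 20 = 22.05, that is, 22 free parameters. A minimal sketch, with hypothetical function names:

```python
def max_free_parameters(sample_size, cases_per_parameter=20):
    """Largest number of free parameters q satisfying N >= ratio * q."""
    return sample_size // cases_per_parameter


def satisfies_nq_rule(sample_size, free_parameters, cases_per_parameter=20):
    """True if the model stays within the N:q ceiling."""
    return sample_size >= cases_per_parameter * free_parameters


print(max_free_parameters(441))    # ceiling for this study's sample size
print(satisfies_nq_rule(441, 22))  # a model with 22 free parameters
print(satisfies_nq_rule(441, 23))  # one parameter too many
```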

DISCUSSION
This research investigates how a character's warmth and eeriness in a virtual clinical scenario impact adherence intention and behavior change. The 3-D computer-animated versions of the doctor in the virtual consultation had a nonsignificant effect on adherence intention (H1) and change in physical activity as compared with the real and cartoon versions; however, the doctor's bedside manner significantly increased adherence intention (H3) and physical activity (H5b). These results indicate that a good bedside manner, which is perceived as warm and competent, has a much stronger effect on persuasion than the uncanny valley effect.
In the path analysis of Model 4, derived from the stereotype content model's variables, 3-D animation slightly decreased warmth, while good bedside manner increased warmth significantly with a large effect size. This in turn increased perceived competence, adherence intention, cognitive perspective-taking, and physical activity (Figure 6). In Model 11, derived from the heuristic-systematic model's variables, 3-D animation significantly increased negative source-related thoughts with a small effect size, which significantly decreased adherence intention, while good bedside manner significantly increased adherence intention (Figure 9).
We found roughly comparable adherence intention and exercise change in a virtual consultation using a 3-D computer animation as using a cartoon or real human, even though the 3-D rendering styles elicited the uncanny valley effect. The effect of the doctor's bedside manner washed out that of the uncanny valley to such an extent that the effect of depiction on eeriness was nonsignificant in the posttest. This bodes well for the acceptance of virtual characters in clinical settings as compared with videos of real humans. Virtual characters have the added advantage of efficiency and controllability (Johnson et al., 2000; Kenny et al., 2007; Bickmore et al., 2010b). Interactions can be scripted and subsequently enhanced by artificial intelligence (AI) systems like IBM Watson Health. It would be worthwhile to determine a virtual character's effect on persuasion in other clinical scenarios.
Adherence intention, however, was not an ideal predictor of exercise change. We found that cognitive perspective-taking instead directly predicted exercise change. Cognitive perspective-taking measured participants' ability to infer the doctor's mental state, including his thoughts, feelings, and beliefs. Most behavior change theories, including the most popular ones, share the idea that intention determines behavior. These include the theory of planned behavior, the theory of reasoned action, theories of attitude-behavior relations, goal theories, models of health behavior, and the technology acceptance model and its extensions (Ajzen, 1991; Eagly and Chaiken, 1993; Austin and Vancouver, 1996; Conner and Norman, 1996; Gollwitzer and Moskowitz, 1996; Abraham et al., 1998; Maddux, 1999; Hale et al., 2002; Webb and Sheeran, 2006; Tamilmani et al., 2020). In the path models, however, adherence intention only predicted exercise change indirectly, mediated by cognitive perspective-taking. SCM variables like warmth and competence were far more predictive of adherence intention than HSM variables like the number of source- and message-related thoughts. This is surprising because SCM is not a theory of persuasion.
FIGURE 6 | Model 4: Path model of Exercise Change with SCM variables. The blue arrow indicates a positive direct effect, and the red arrow indicates a negative direct effect. Next to each arrow, the standardized estimate of the parameter (β, γ, ζ, φ, ψ) indicates the strength of the direct effect, and the number of asterisks indicates its significance level: *p ≤ 0.05, **p ≤ 0.01, and ***p ≤ 0.001. The independent variables bedside manner (good or poor) and 3-D animation (two levels: either bigeye, swapped, and animation or cartoon and real) are the exogenous variables. The model had acceptable local and global fit (Table 4).
The doctor with poor bedside manner elicited significantly more source-related thoughts than the doctor with good bedside manner (H4a), while the doctor with good bedside manner elicited significantly more message-related thoughts (H4b; Table 3). This pattern also emerged in the path analysis (Figures 8, 9; Table 4). These results run counter to findings in the persuasion literature. HSM predicts that interpersonal warmth prompts heuristic processing, indicated by an increase in source-related thoughts (Chaiken, 1980, 1987; DeBono and Harnish, 1988; Wood and Kallgren, 1988; Horcajo et al., 2010) and that less warm, attractive, and competent sources prompt systematic processing (Chaiken, 1979; Pallak et al., 1983; Brownlow, 1992; Messner et al., 2008; Koh and Sundar, 2010). However, in a clinical setting, patients may expect good bedside manner; thus, a doctor who breaks this expectation could draw attention. Likewise, patients may not expect the doctor to appear eerie; thus, an eerie appearance could also draw attention.
The doctor's interpersonal warmth increased his perceived competence and shifted thoughts from the source to the message, contrary to most HSM studies. In HSM, both warmth and competence are deemed heuristic cues, which prompt people to make decisions by rules of thumb learned from experience, such as trusting a source that looks and sounds authoritative (Abelson, 1976; Lin et al., 2016; Lin and Spence, 2018). The effects of heuristic cues tend to be fleeting because they are unlikely to elicit systematic processing of the message. They may increase adherence intention temporarily but without a lasting effect on behavior change. This aligns with Model 11, which shows that message-related thoughts have a stronger effect on adherence intention and exercise change than source-related thoughts.
In the medical literature, however, there is a close link between warmth and competence. When patients perceive their physicians as friendly and helpful, they also perceive them as competent, and they are less likely to sue them for malpractice (Charles et al., 1985; Shapiro et al., 1989; Hickson, 1994; Levinson et al., 1997). The physician's warmth has a major influence on the patient-physician relationship. It prevents patient dissatisfaction, mistrust, and nonadherence (Cousin et al., 2013). Determining how SCM variables influence heuristic and systematic processing is crucial to the study of adherence because systematic processing tends to result in more enduring changes in attitude and intention, as well as changes in behavior that are more resistant to relapse (Cacioppo and Petty, 1984).
The results confirmed our earlier suspicion that interpersonal warmth has a confounding effect on measuring eeriness (Dai and MacDorman, 2018). Depiction affected pretest eeriness, but not posttest eeriness, which was instead affected by the doctor's bedside manner, even when controlling for this variable. The manipulation check of pretest eeriness revealed that the 3-D computer-animated doctor, rendered in three different styles, was eerier than the real and cartoon doctors. Thus, eeriness should be measured in a pretest before an interactive scenario that manipulates warmth instead of in a posttest with other dependent variables.
This study builds on past research on how a relational agent's characteristics influence credibility and trust in health counseling (Bickmore et al., 2005; Bickmore et al., 2010a; Bickmore and Gruber, 2010; Sillice et al., 2018) and pedagogy (Baylor and Kim, 2005). However, we varied warmth and realism systematically by using the same character interacting in the same way while controlling for other aspects of the scenario. By eliminating the effect of using virtual agents based on different characters, we could isolate the effect of warmth and realism on adherence.
Virtual consultations offer a cost-effective way to improve patient adherence and health literacy (Bickmore et al., 2009b, 2010c). Recent advances in AI, computer modeling and animation, and immersive games will enable the rapid development and deployment of virtual consultations via the Internet to computers and mobile devices. The advantages of AI-enabled virtual consultations include the ease of applying them to different conditions and treatments, speed of revision, and internationalization and localization for language, culture, and demographic group (DeSmet et al., 2014).

Limitations and Future Work
Concerning the effect of depiction on Adherence Intention, the study was underpowered. MacDorman et al. (2009) found that a 50% increase in eye size significantly increased Eeriness with a large effect size, ηp² ≈ 0.65. However, in this study, the manipulation only had a small, though significant, effect on pretest Eeriness, p = 0.019, ηp² = 0.03. For a 0.05 alpha level and this study's effect size of ηp² = 0.03, the power for depiction was 0.08. With such a small effect size, achieving a power of 0.80 would require 670 participants in each of the 10 groups. As noted in Methods, the effect size was large when participants rated the videos in preparation for this study. One explanation is that they were aware of comparing five different styles of depictions, whereas in the present study, participants were only aware of their own condition.
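To give a sense of scale for the two effect sizes, partial eta squared can be converted to Cohen's f, the effect size metric that ANOVA power routines commonly take as input, using f = sqrt(ηp² / (1 − ηp²)). The sketch below only illustrates that standard conversion and the total sample implied by the stated per-group requirement. It does not reproduce the power calculation itself, and whether the authors used this conversion is an assumption.

```python
import math


def cohens_f_from_partial_eta_squared(eta_p2):
    """Convert partial eta squared to Cohen's f."""
    if not 0 <= eta_p2 < 1:
        raise ValueError("partial eta squared must be in [0, 1)")
    return math.sqrt(eta_p2 / (1 - eta_p2))


# Effect sizes quoted in the text.
f_prior = cohens_f_from_partial_eta_squared(0.65)    # MacDorman et al. (2009)
f_current = cohens_f_from_partial_eta_squared(0.03)  # present study
print(f"prior study f = {f_prior:.2f}, present study f = {f_current:.2f}")

# Total sample implied by 670 participants in each of the 10 groups.
per_group, groups = 670, 10
print(per_group * groups)
```

On this metric, the present study's effect is roughly an eighth the size of the earlier one, which is consistent with the very large sample the text says would be needed for adequate power.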
Cognitive perspective-taking was a better predictor of exercise change than adherence intention. It is unclear why this should be the case. Was understanding the situation from the doctor's perspective predictive, or was this variable simply a correlate for understanding the message? Further research is needed. In addition, behavior change only occurred for physical activity, not diet. This pattern, however, is not uncommon in the literature (e.g., Anderson et al., 2010).
The use of self-reported scales to measure behavior change in participants could be inaccurate for various reasons, such as the fabricated purpose of the virtual consultation, the participants' poor recollection of it, or their tendency to respond in a socially desirable way. Exercise change can be measured more accurately by taking a different approach, such as using a monitoring device for physical activity. These include accelerometers, pedometers, and heart rate monitors (Bravata et al., 2007; Rogers et al., 2014; Stonerock and Blumenthal, 2017; Jelen et al., 2020).
Although patients' emotional responses to their health conditions can influence their attitudes and intention to follow health-related advice and their actual behavior change (Dillard and Shen, 2005; Krakow et al., 2018), variables like fear and anger were not measured in the current study. Emotions like fear, anxiety, and disgust are also known to correlate with the uncanny valley effect. Future research could examine how these variables affect the persuasiveness of virtual agents.

Contributions to the Field
A major finding in the literature on persuasion is that a message will be less convincing if its source is evaluated negatively. Likewise, a major finding of the literature on the uncanny valley effect is that virtual characters that resemble humans will be evaluated negatively because of their eerie appearance. Therefore, it stands to reason that a human-looking virtual character should be less convincing than a real human. However, this study found no significant difference. Instead, it found that, in the context of a virtual doctor's consultation, good bedside manner increased the intention to adhere to health advice and actual physical activity. The study also showed how bedside manner, a form of interpersonal warmth, had such an overriding effect on perceptions of the virtual character that a standard measure of eeriness was rendered invalid for use in a posttest. Although in the persuasion literature warmth typically elicits more thoughts about the message source, in the virtual consultation, the doctor with poor bedside manner or an eerie depiction elicited more source-related thoughts. Finally, contrary to the literature on behavior change, adherence intention was a poor predictor of exercise change compared with the ability to take the doctor's perspective.

CONCLUSION
In 1970, Mori (2012) proposed that human replicas, like android robots, could appear eerie. This effect, known as the uncanny valley, has been confirmed by a meta-analysis (Diel et al., 2022). The present study reproduced this effect by using a 3-D computer-animated doctor in a virtual consultation with three different rendering styles. However, the impact of the uncanny valley on adherence intention and exercise change was negligible compared with bedside manner. Bedside manner also had a pronounced effect on two SCM variables, warmth and competence, and on the HSM variables number and valence of source- and message-related thoughts. Thus, although virtual characters can elicit the uncanny valley effect, in this study they were comparable to a video of a real human in their effect on adherence intention and exercise change.
The present study provides empirical findings that extend the theory and practice of persuasion to virtual clinical settings. SCM variables like warmth and competence were found to be important dimensions in designing persuasive computer-animated characters. Contrary to the literature, a physician with good bedside manner prompted greater systematic processing in HSM, which increased adherence intention and exercise change. Physician warmth predicted competence. Surprisingly, both warmth and competence were better predictors of adherence intention than HSM variables. Cognitive perspective-taking performed better in predicting exercise change than adherence intention. Depiction had much less of an impact on persuasion than the physician's warmth and competence.

DATA AVAILABILITY STATEMENT
The analysis was performed in the R statistical computing environment (packages: effectsize, jmv, lavaan, lavaanPlot, psy, psych, pwr2, multcomp, nlme, performance). The dataset and R scripts for the analyses are available as Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Indiana University's Office of Research Administration. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
ZD and KM conceptualized and designed the experiments, and ZD conducted them. ZD and KM analyzed the data, prepared the tables and figures, and wrote and revised the manuscript. ZD reviewed the literature on persuasion and adherence. KM wrote the final R scripts and supervised the research.

FUNDING
The study was supported by the National Institutes of Health (P20 GM066402/GM/NIGMS NIH HHS/United States), an IUPUI Signature Center grant, Research Investment Funds grant, and Open Access Fund.