Embodied Virtual Patients as a Simulation-Based Framework for Training Clinician-Patient Communication Skills: An Overview of Their Use in Psychiatric and Geriatric Care

Clinician-patient communication is essential to successful care and treatment. However, health training programs do not provide sufficient clinical exposure to practice communication skills that are pivotal when interacting with patients exhibiting mental health or age-related disorders. Recently, virtual reality has been used to develop simulation and training tools, in particular through embodied virtual patients (VP) offering the opportunity to engage in face-to-face human-like interactions. In this article, we overview recent developments in the literature on the use of VP-simulation tools for training communicative skills in psychiatry and geriatrics, fields in which patients have challenging social communication difficulties. We begin by highlighting the importance of verbal and non-verbal communication, arguing that clinical interactions are an interpersonal process where the patient’s and the clinician’s behavior mutually shape each other and are central to the therapeutic alliance. We also highlight the different simulation tools available to train healthcare professionals to interact with patients. Then, after clarifying what communication with a VP is about, we propose an overview of the most significant VP applications to highlight: 1) in what context and for what purpose VP simulation tools are used in psychiatry (e.g. depression, suicide risk, PTSD) and geriatrics (e.g., care needs, dementia), 2) how VP are conceptualized, 3) how trainee behaviors are assessed. We complete this overview with the presentation of VirtuAlz, our tool designed to train health care professionals in the social skills required to interact with patients with dementia. Finally, we propose recommendations, best practices and uses for the design, conduct and evaluation of VP training sessions.


INTRODUCTION
"Medicine is an art whose magic and creative ability have long been recognized as residing in the interpersonal aspects of patientphysician relationship" (Hall et al., 1981).Creating and maintaining an effective and trustworthy clinician-patient relationship is fundamental for providing high-quality care (Ha and Longnecker, 2010).More specifically, the professional practice of health providers includes important verbal and nonverbal communication skills, especially during face-to-face interaction in order to promote therapeutic alliance (Zolnierek and DiMatteo, 2009) and patient's satisfaction (Brown et al., 1999).Communication skills refer to the ability to convey information to another person effectively (Beaulieu et al., 2011).Thus, they refer not only to what is said but how it is said through non-verbal behavior, including tone of voice (prosody), active listening, empathy, gestures, body language, and facial expressions used when interacting with another person.Training in communication techniques for health professionals has become a major issue in health education.However, health training programs do not provide sufficient clinical exposure and supervision to acquire these essential skills (Brown et al., 1999).Recently, virtual reality (VR) has been used to develop simulation and training tools (Pottle, 2019), notably through embodied virtual agents (Lee et al., 2020), in order to respect the fundamental ethical principle of "never the first time on a patient" (Granry and Moll, 2012).

Clinician-Patient Communication: Why and How?
Interactions between a clinician and a patient can vary greatly depending on the clinical context (anesthesia, surgery, nursing, neurology or psychiatry), age (pediatrics, geriatrics) or state of the patient (comprehension capacity, cognitive or psychological profile). In any clinical context, successful interactions are generally associated with better adherence to care or treatment (Catty, 2004), medical outcome (Ruberton et al., 2016), enhanced mutual trust (Khullar, 2019) and patient satisfaction (Williams et al., 1998). Conversely, poorer or negative interactions can lead to medical non-adherence (Piette et al., 2005), patient distrust (Hawley, 2015) or clinician burnout (Chang et al., 2018). Successful interactions are driven by verbal and non-verbal communication. Verbal communication serves to evoke a reality and includes the content of what is said (intonation, choice of words) or even active listening (Kee et al., 2018). Non-verbal communication, whether intentional or unintentional, enables information to be shared without using speech, and includes social signals such as facial expressions, eye contact, head movements, posture, touch, interpersonal distance or tone of voice (for a review see Hall et al., 2019). While verbal communication is undoubtedly important during medical consultations and care (Conigliaro, 2007), non-verbal communication also plays a crucial role as it can reinforce or contradict verbal communication and thereby be instrumental to patient satisfaction or treatment outcome (Ambady et al., 2002; Lorié et al., 2017).
Clinicians should be concerned about their own non-verbal behavior as it influences the success of the consultation (Strasser et al., 2005; Roter et al., 2006). Specifically, maintaining eye contact, smiling, keeping an adequate distance, adopting a direct body orientation with legs and arms uncrossed, and arm symmetry are generally associated with greater patient satisfaction (Beck et al., 2002). In addition, communicating positive messages and signs of empathy to patients can lead to better therapy outcomes, even in patients with serious illnesses (Hojat et al., 2011; Howick et al., 2018). Such elements may also lead the patient to judge the clinician as being warm and competent (Howe et al., 2019). Clinicians should also be skilled at identifying patients' non-verbal behaviors, typically for diagnostic purposes, such as using facial or vocal cues to assess patient pain (Ruben et al., 2018) or mood changes in severe depression (Ellgring and Scherer, 1996; Douglas and Porter, 2010).
In the psychiatric or geriatric field, clinicians are expected to interpret specific non-verbal cues associated with patients' psychological and behavioral symptoms, including signs such as emotional blunting, anxiety, apathy or aggression (Lehman et al., 2004; Templier et al., 2015). These behavioral symptoms are known to be difficult for clinicians to address and may even lead them to experience stress and burnout (O'Connor et al., 2018; Isik et al., 2019).
The dynamic non-verbal interaction between clinician and patient, which refers to the way in which the behavior of each interlocutor shapes and influences that of the other, is a crucial element in clinical practice (for a review, see Henry et al., 2012). Most research on non-verbal behavior in clinician-patient communication has been observational, focusing on examining associations with the collaborative relationship between clinician and patient (i.e., the therapeutic alliance) or with treatment outcomes. For instance, during a routine consultation, Lavelle et al. (2015) reported that when the patient displays pro-social behaviors that initiate or maintain interaction (i.e., direct gaze, smiling, nodding, using hand gestures), this in turn leads the clinician to display similar pro-social behavior and results in a better therapeutic relationship. Other studies that analyzed the vocal behavior of clinician-patient dyads from audio recordings have shown that synchronization between clinician and patient in prosody (Imel et al., 2014) or silence (Tomicic et al., 2017) plays a key role in clinical interaction and is associated with a better therapeutic alliance. This behavioral synchrony has also been evidenced with physiological measures, such as levels of electrodermal activity (Marci et al., 2007; Bar-Kalifa et al., 2019) or heart rate (Kodama et al., 2018), suggesting that in clinician-patient dyads, both partners tend to experience concomitant emotional activation. Interestingly, other methods have focused on assessing the relationship between clinicians and patients through automated objective videotape analysis algorithms. Ramseyer and Tschacher (2011, 2014), for instance, reported that in face-to-face psychotherapy, greater coordination of the patient's and clinician's body and head movements is associated with more positive therapeutic relationships and greater patient self-efficacy. Hence, the objective characterization of the dynamics of the clinician's and patient's movements provides a quantification of the nonverbal synchrony (see Delaherche et al., 2012). More recently, a two-person brain imaging interactive study (fMRI hyperscanning) has identified a potential brain-behavioral mechanism supporting the clinician-patient relationship. It showed that the mirroring of facial expression and brain-to-brain concordance in the temporo-parietal junction (TPJ) were significantly associated with patient analgesia and therapeutic alliance (Ellingsen et al., 2020).
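To make the idea of quantifying nonverbal synchrony more concrete, the following Python sketch correlates two movement time series at several time lags. It is only an illustration under stated assumptions: the function names (`pearson`, `lagged_correlations`), the toy movement values and the choice of a plain lagged Pearson correlation are ours, and the sketch does not reproduce the specific algorithms used in the studies cited above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlations(clinician, patient, max_lag):
    """Correlate the two series at every lag in [-max_lag, max_lag].

    A positive lag means the patient's movement follows the clinician's
    movement by `lag` samples; a negative lag means the patient leads.
    """
    n = len(clinician)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = clinician[:n - lag], patient[lag:]
        else:
            a, b = clinician[-lag:], patient[:n + lag]
        result[lag] = pearson(a, b)
    return result


# Hypothetical movement-energy values (one value per video frame).
clinician_motion = [0, 1, 0, 2, 0, 3, 0, 1, 0]
patient_motion = [0, 0, 1, 0, 2, 0, 3, 0, 1]  # same pattern, one frame later

correlations = lagged_correlations(clinician_motion, patient_motion, max_lag=2)
best_lag = max(correlations, key=correlations.get)
```

In this toy example the highest correlation is found at a lag of one frame, which is how a "patient follows clinician" pattern would appear in such an analysis.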
Effective communication skills are an essential component of successful interactions between clinicians and patients. However, patients may exhibit inappropriate behaviors such as aggression, isolating themselves or refusing to respond to those who try to communicate with them, or refusing to take medication or follow treatments. It therefore appears crucial to train clinicians on how to communicate with patients from a person-centered care perspective (Del Piccolo and Goss, 2012), which means respecting and responding to the needs and values of each patient (Lewin et al., 2001; Hardman and Howick, 2019). Yet, this raises the question of whether non-verbal communication skills can be taught. In fact, it is not so much about developing specific non-verbal communication skills but more about using non-verbal communication to develop engagement, reciprocity, and synchronization in order to create a genuine therapeutic alliance that bonds the clinician and the patient (Shattell et al., 2007; McGilton et al., 2009). In any case, training and evaluating non-verbal communication and its links with clinical outcomes is challenging, often intrusive (e.g., real-time observation), and can require extensive resources (e.g., interactions involving actors representing standardized patients; Henry et al., 2012).

Medical Simulation in Healthcare Education
Medical simulation, which is increasingly being promoted by health authorities in many countries (Forbes and Kennedy, 2009; Granry and Moll, 2012; Alinier and Platt, 2014), refers to the use of standardized devices or virtual reality tools to emulate a clinical context to teach or train a health professional in clinical, therapeutic or diagnostic procedures (Ker and Bradley, 2010). The purpose of medical simulation is to reproduce real world clinical scenarios in a standardized, safe and reproducible context, which facilitates the immersion of trainees during their initial or continuing education (Gaba, 2004). While medical simulation cannot substitute for clinical experience, it offers the opportunity to receive feedback and gain confidence without going through the real clinical event or remaining 'far away from the patient' the first few times of practice (Okuda et al., 2009; Wolf et al., 2011).
In the field of healthcare, a variety of simulation techniques (see Figure 1) are thus suitable for both novice and expert trainees for psychomotor, cognitive, affective or communicative learning tasks (Munshi et al., 2015). 'Procedural simulation' conducted using realistic part- or whole-body manikins is the most traditional (Lapkin et al., 2010) and is mostly used to facilitate the learning of psychomotor skills such as surgery gestures or nursing care (see Rivière et al., 2018). 'Standardized simulation' is structured in the form of role-play with well-trained actors to simulate clinical scenarios or to portray a patient with a specific health concern (i.e., human standardized patient), providing in-depth experience of clinical reasoning, decision-making or communication techniques in various situations, including crisis intervention (Brender et al., 2005; Keltner et al., 2011). However, the use of standardized simulation is limited due to the high costs of recruiting and training patient actors, especially when actors need to play an adolescent, an elderly person or a person with mental health issues (Keiser and Turkelson, 2017). In addition, this type of simulation can also lead users to feel uncomfortable or apprehensive about acting in front of their peers (Albright et al., 2016). Finally, 'virtual reality' (VR), which has developed extensively over the last two decades in the field of healthcare, refers to a computer-screen-based simulation that offers a multisensory and immersive interactive experience in a safe environment (Mantovani et al., 2003; Rizzo et al., 2017; Riva and Serino, 2020). These techniques can be used to implement applications such as 'serious games' in which the trainee is confronted with virtual situations drawn from real-life events, allowing them to develop clinical reflexes (e.g., from the discovery of patient files to the administration of medication) (Wang et al., 2016) and that can incorporate the notion of feedback or scoring (Stuckless et al., 2014). Such tools may take the form of virtual world platforms such as Second Life® (see Irwin and Coutts, 2015) or The Sims™ (Arts, 2009), and allow for the creation of virtual hospital units (Aebersold et al., 2012) and interactions with virtual patients. Virtual simulation can also be implemented in the form of conversational agents such as 'chatbots' that can interact with users by simulating a human conversation through text or voice via smartphones or computers, and are able to interpret the user's responses. Chatbots have shown their potential to promote clinician-patient communication (Friedman et al., 1977; Madhu et al., 2017; Jagtap et al., 2021).
A more recent breakthrough in the field of healthcare is the use of Embodied Conversational Agents or ECAs. One approach, especially in the field of mental health (see Provoost et al., 2017), has been to use ECAs for diagnostic or remediation purposes, either as partners of social interaction (Georgescu et al., 2014; Tanaka et al., 2017b; Grossard et al., 2020), virtual coaches motivating the user (Torres et al., 2018; Ali et al., 2021), or even for virtual clinical interviews with real patients suffering from depression, post-traumatic stress disorder or dementia (Stratou et al., 2015; Philip et al., 2017; Mirheidari et al., 2019). In addition, ECAs have also come to be used as 'virtual patients' (VPs), i.e., representing a patient alone or in a virtual environment and offering the trainee (clinician or student) the possibility to engage in a human-like face-to-face interaction (see Combs and Combs, 2019). As such, medical educational strategies are increasingly shifting toward the use of VP simulations, as they are more scalable and reproducible, being available at all times and places, while providing learning outcomes comparable to standardized clinical learning environments (Consorti et al., 2012; Quail et al., 2016), including improved skills when interacting with real patients (Cook and Triola, 2009). It has also been shown that people show empathetic responses (Deladisma et al., 2007) and tend to show more willingness to disclose information when interacting with virtual compared to real humans, possibly because of this "soothing" effect since "after all, it is just a computer" (Lucas et al., 2014).

Aim of the Overview
The benefit of VPs for the training of so-called soft or nontechnical skills, including communication or decision-making, has been surveyed elsewhere. One of the first reviews to show a benefit of VPs was on clinical reasoning rather than on communication itself (Cook et al., 2010). Another systematic review, which did not cover communication skills, reported that simulation using VPs can enhance clinician empathy among healthcare students (Bearman et al., 2015). An integrative review in the field of nursing demonstrated the value of VPs for developing decision-making, communication, or teamwork skills (Peddle et al., 2016). More recently, a systematic review in the context of pharmacist-patient interactions also showed the benefit of VPs in developing communication or counseling skills (Richardson et al., 2020). In addition, Lee et al. (2020) conducted a systematic review focusing on the design and evaluation characteristics allowing for effective medical communication skills education based on VP simulation. None of these reviews, however, covered studies on training communicative skills with embodied VPs displaying psychiatric or geriatric disorders.
It should be pointed out that communicating with patients with psychiatric disorders (e.g., depression, schizophrenia, personality disorders) or age-related and degenerative disorders (e.g., dementia) is considered very challenging because of the behavioral, thinking or language disturbances that occur with the illness (Hartley et al., 2020). Yet this dimension is not sufficiently taken into account in the training of clinicians, in particular non-verbal communication, to which these patients are very sensitive thanks to their partially preserved capacity to integrate multimodal social signals (Maurage and Campanella, 2013; Giannitelli et al., 2015; Templier et al., 2015; Xavier et al., 2015).
The aim of the present article is to highlight the relevance of technology-assisted education through VPs in addressing the specific challenges facing the training of clinician-patient communication in psychiatric and geriatric care education. We start by explaining what communicating with a VP is about, and by describing the role of VPs in training communication skills in psychiatry and geriatrics by presenting important studies in this field (see Table 1). Our intention is to provide a roadmap for those interested in learning more about the use of VPs as a simulation framework for training clinician-patient communication skills, including their strengths and weaknesses, in psychiatric and geriatric care education. To that purpose, the overview is driven by highlighting several key features of VP simulation tools (see Figure 2) related to the VP itself (e.g., predominant competencies, underlying simulation model, tool evaluation) and the user (e.g., target competencies, underlying clinical situation, user evaluation).

ROLE AND INTEREST OF EMBODIED VIRTUAL PATIENTS IN HEALTHCARE

Communication With an Embodied Virtual Patient: What Is This About?
Virtual Reality-based technologies are increasingly used in the field of healthcare and scientific research for simulating cognitive and socio-emotional skills (Riva and Serino, 2020) or studying human social interaction (Pan and Hamilton, 2018). Thus, it is now possible to interact and communicate with a virtual partner, such as embodied conversational agents (ECA) (Cassell et al., 2000; Loveys et al., 2020; Pavic et al., 2020), and thanks to the ability of ECAs to simulate and mimic human behavior, users tend to interact with them as with a real person (Gratch et al., 2013) and to assign them mental states (Callejas et al., 2014). In the context of healthcare simulation training, embodied VPs are typically computer-based programs using ECAs and simulating real patients and emulating a clinical encounter (see Cook and Triola, 2009; Su and Chang, 2021). The challenge is then to provide enough realism and reliability to make the learner's experience sufficiently relevant and useful (see Talbot and Rizzo, 2019). One approach is to offer case-based training whereby the interaction with the VP unfolds as a function of the trainee's responses (see Staccini and Fournier, 2019; Staccini, 2021). VPs can be fully autonomous and engage in brief interactions with the user, under no human control, or can be controlled by a human operator, through a Wizard-of-Oz (WoZ) procedure, which controls the verbal or non-verbal responses of the virtual patient during the interaction (Fraser and Gilbert, 1991; Riek, 2012). Whether automated or human-controlled, computer systems are generally equipped to record the user's non-verbal behaviors (e.g., voice, gestures, body movements, gaze, or facial expressions) and use them to modulate the evolution of the interaction, giving a high degree of realism and social presence (Fox et al., 2015). In the context of communication training, the overall value of using VPs in the field of healthcare is that they offer the opportunity to practice communicating with patients in a stress-free context (Elzubeir et al., 2010) where mistakes or bad decisions are inconsequential (e.g., breaking bad news, Carrard et al., 2020). This type of virtual environment also gives the trainee the opportunity of self-observation, allowing for the identification of the most effective communication practices and thereby increasing self-confidence (Baumann-Birkbeck et al., 2017). Such a tool may finally be supplemented by an evaluation or coding of the trainee's verbal or non-verbal behaviors, via external judges or by computer-based automatic analyses.

Table 1 | Overview of annotated attributes in our selected list of articles presented in Section 2. The annotation procedure captures the user and the VP characteristics in simulation tools for clinician-patient communication in the field of psychiatry and geriatrics.

VP-Simulation for Clinician-Patient Communication in Psychiatry
Psychiatry is a medical specialty that focuses on the diagnosis, treatment, and prevention of mental, emotional, and behavioral disorders. In recent years, the prevalence of disorders such as anxiety, depression, and stress has increased significantly (see Steel et al., 2014), and was magnified further during the COVID-19 crisis (Castelli et al., 2020; Salari et al., 2020). Other chronic and severe mental disorders, such as schizophrenia, involve social communication difficulties (Burns, 2006) that can be difficult for the clinician to address (McCabe et al., 2013). These difficulties, associated with the stigmatization of patients (Hinshaw and Stier, 2008), as well as stress, burnout, or job dissatisfaction of mental health professionals (Rössler, 2012), have motivated paradigm shifts in medical communication and education approaches. It has been shown, for instance, that ineffective psychiatrist-patient communication is associated with poorer patient outcomes and experiences (Schneider et al., 2004). However, although there are several guides on how to communicate with psychiatric patients (Priebe et al., 2011), studies of non-verbal behaviors in psychiatry are limited (Cruz et al., 2011) and efforts have to be made to improve clinician-patient communication in this field. In this context, training practices based on clinical simulation, including VP-simulation, have the potential to provide health care professionals with safe and controlled tools that can help them develop the necessary skills to care for or communicate with patients experiencing mental disorders (McNaughton et al., 2008; Piot et al., 2020).
One area where VP simulation has proven successful is in training clinicians to interact with people suffering from severe depression or at risk for suicide, as it allows for repeated practice with challenging scenarios that clinicians may face. One of the first use cases of VPs in the field of psychiatry was designed as a web-based platform to train clinicians to assess suicide risk in youth (Carpenter et al., 2012). The Suicidal Avatars for Mental Health Training (SAMHT) platform allows the user to face a VP who can move and speak, and who is embodied as a child suffering from depressive symptoms. The user converses with the VP by selecting questions related to suicide risk from a predetermined text-based list; the selected question is spoken by a synthesized voice (as if the user were asking it), and the VP responds through an intelligent decision tree. One of the major limitations of this study, which appears to be a proof of concept, is that it did not consider the educational value of the tool nor the trainees' behaviors. More recently, another work (O'Brien et al., 2019) tested the acceptability and benefit of VP simulation, using the PeopleSim® technology, to train mental health practitioners to interact effectively with individuals at risk for suicide. The VP embodies a fictive 20-year-old female patient who states that she has thought about committing suicide. In the scenario, the patient comes to the clinic to be evaluated on her ability to return to work. The trainee's goal is to encourage the patient to share details of her thoughts to assess the immediate risk of suicide, while empathizing with the patient's thoughts. The proposed simulation consists of a unidirectional interaction with the VP, via a text-based interface incorporating pre-recorded videos of the patient, in which the trainee selects from a list the question to ask the patient, who answers via a chat window. Based on the trainee's choices, an on-screen virtual instructor also provides text-based advice or feedback to help trainees improve their skills. After the conversation, the 20 participants who took part in this pilot study found the training experience to be satisfactory (feasibility and acceptability), especially in terms of training communication skills. Finally, the trainees' knowledge (assessed using pre-post training questionnaires) showed a significant improvement. A limitation of this work is that with the pre-recorded videos of the VP, the interaction lacks naturalness, and the verbal and non-verbal behavior of the trainee is not taken into account.
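The list-based interaction style described above, in which the trainee picks a question and the VP's reply depends on the current branch of a scripted tree, can be sketched in a few lines of Python. This is a hypothetical illustration only: the `SCRIPT` content, the node names and the `run_interview` function are invented for the example and are not taken from SAMHT or PeopleSim®.

```python
# Hypothetical scripted dialogue: each node stores the VP's reply and the
# questions the trainee may select next (question text -> next node id).
SCRIPT = {
    "start": {
        "vp_says": "Hello.",
        "options": {
            "How have you been sleeping lately?": "sleep",
            "How is your mood these days?": "mood",
        },
    },
    "sleep": {
        "vp_says": "Not well. I wake up very early.",
        "options": {"How is your mood these days?": "mood"},
    },
    "mood": {
        "vp_says": "I feel low most of the time.",
        "options": {},  # leaf node: the scripted interview ends here
    },
}


def run_interview(script, choices, start="start"):
    """Replay a sequence of selected questions and return the transcript."""
    node_id = start
    transcript = [("VP", script[node_id]["vp_says"])]
    for question in choices:
        options = script[node_id]["options"]
        if question not in options:
            raise ValueError(f"Question not available at node '{node_id}': {question}")
        node_id = options[question]
        transcript.append(("Trainee", question))
        transcript.append(("VP", script[node_id]["vp_says"]))
    return transcript


transcript = run_interview(
    SCRIPT,
    ["How have you been sleeping lately?", "How is your mood these days?"],
)
```

Because every possible path is authored in advance, such a design is easy to standardize and score, but it also explains the limited naturalness noted above: the trainee can only ask what the list offers.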
Using The Sims video game, other studies have focused on the use of VPs to enhance the empathy and interpersonal communication skills of medical students during interactions with a 21-year-old virtual patient (Ms. Cynthia Young) suffering from major depressive episodes (Shah et al., 2012; Cordar et al., 2014). The interaction between the trainee and the VP was bidirectional (i.e., each can ask the other participant questions according to a predefined script) and relied on a text-based interface incorporating images of the patient (Shah et al., 2012) or 90-s cutscenes depicting short moments in the VP's life to provide a backstory (Cordar et al., 2014). These studies focused mainly on the evaluation of the educational tool by the students via questionnaires. Trainees expressed overall satisfaction with the interaction with the VP, found the tool easy to use, and considered that it could represent a good educational tool. The empathy of 35 trainees was evaluated (Cordar et al., 2014) from text-based transcripts only (with or without VP backstory) and users' verbal or non-verbal behaviors were not considered. Results suggested that VPs with a backstory, showing how the patient may be affected by their illness, are an effective training tool for interpersonal communication skills, such as empathy. A limitation of these studies is that, in the limited environment of 3D virtual world games, VPs are not able to converse in a realistic manner, nor respond to dialogues with human-like non-verbal behaviors or emotions.
A further area where VP simulation has proven effective is in training to interact with people with post-traumatic stress disorder (PTSD). PTSD is a psychiatric disorder related to trauma caused by exposure to a traumatic or stressful event (e.g., war, natural disasters, terrorist attacks, assaults) usually resulting in persistent feelings of anxiety and psychological distress, and potentially leading to altered social-emotional behaviors or depression (Zoellner et al., 2014). Several studies in this area (e.g., Parsons et al., 2008; Kenny et al., 2009a,b) have focused on improving the interviewing or communication skills of mental health students by having them interact with a VP embodied in Justina, a female adolescent suffering from PTSD following an assault. Interactions were designed as a 15-min clinical interview in which 15 novices in mental health (Parsons et al., 2008; Kenny et al., 2009a) or 15 novice and nine expert mental health clinicians (Kenny et al., 2009b) were asked to assess a patient's initial diagnosis of PTSD. The VP, displayed on a computer screen, spoke in natural language with the user.
Interestingly, the user's speech was recorded and transcribed in real-time so that it could be interpreted by a statistical question-answer system (i.e., based on a real clinical corpus) to generate the VP's verbal behavior. The transcription of the whole dialogue session was also recorded and annotated in terms of whether clinicians asked the appropriate questions needed to determine whether the patient reported symptoms that met the criteria for PTSD. Overall, results of these studies (see also Rizzo and Shilling, 2017) showed a good evaluation of the credibility of the tool by users, despite encountering some problems with voice recognition. In addition, novices asked the VP more questions about general matters whereas experts were better able to ask the specific questions needed to make a differential diagnosis. While this approach is promising, it focused on natural language, and the fact that non-verbal behaviors such as users' facial expressions were recorded but not yet exploited in the study somewhat limits its impact.
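As a rough illustration of how a transcribed utterance might be mapped to a VP reply, the Python sketch below retrieves the answer attached to the most similar question in a small corpus. It is a deliberately simplified stand-in, not the statistical question-answer system used in the studies above: the bag-of-words cosine similarity, the `CORPUS` entries, the `threshold` value and the function names are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical (question, VP answer) pairs standing in for a clinical corpus.
CORPUS = [
    ("do you have trouble sleeping", "I keep having nightmares."),
    ("how do you feel at school", "I can't concentrate in class."),
]


def vp_reply(utterance, corpus, threshold=0.3, fallback="I don't understand."):
    """Return the answer paired with the most similar known question.

    If no question is similar enough (score below `threshold`), the VP falls
    back to a generic reply, e.g., after a speech recognition error.
    """
    query = bag_of_words(utterance)
    best_score, best_answer = max(
        (cosine(query, bag_of_words(question)), answer) for question, answer in corpus
    )
    return best_answer if best_score >= threshold else fallback


reply = vp_reply("Are you having trouble sleeping at night?", CORPUS)
off_topic = vp_reply("What is the weather?", CORPUS)
```

The fallback branch hints at why transcription errors matter in speech-driven tools: a badly recognized question shares few words with the corpus and triggers a generic, less credible reply.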
Another application of VPs concerns the training of health care professionals in the field of transcultural psychiatry and refers to clinicians' interactions with patients from diverse ethnic backgrounds. Pantziaras et al. (2014, 2015) developed a VP system called Refugee Trauma Simulation (RT-SIM) portraying a refugee (Mrs K) exhibiting severe symptoms of PTSD. The interaction, based on a predetermined question-answer scenario, was designed as a virtual psychiatric interview (up to 45 min) requiring 32 residents in psychiatry to provide a differential diagnosis and treatment program of a refugee with PTSD. The VP tool aimed at providing knowledge and training on identifying PTSD symptoms, clinical management, and communication skills. The VP displayed on a computer screen was depicted using pre-recorded video clips and used pre-recorded sentences that were played according to the questions chosen by trainees on a list. An interesting element of this tool was the inclusion of an automated feedback module covering the VP's perspective on the consultation, the clinical aspects of PTSD diagnosis/management, and the trainee's communication skills. This feedback was given in the form of a video presentation of the VP speaking directly to the trainee (e.g., perceived level of empathy, relevance of questions asked to the problems encountered). Assessments included both tool evaluation (credibility, usability, and effectiveness) by the trainees and trainee evaluation (self-report of emotional reactions, questionnaire-based assessment of PTSD knowledge acquisition, and assessment of communication skills). Overall, the results showed that the tool was positively evaluated by the trainees in terms of ease of use and effectiveness. In addition, the training session had a significant impact on the improvement of knowledge (pre- and post-interaction). Of note, the follow-up evaluation, several weeks after the interaction, showed that the knowledge gain decreased over time, suggesting that a single training session with the VP does not seem to be sufficient to ensure long-term learning. A limitation of the study is that the article did not provide any details about how communication skills were automatically analyzed by the system and evaluated.

FIGURE 2 | Key features of healthcare simulation tools using embodied virtual patients (VPs), applicable to the user and the VP. On the left, those concerning the user (e.g., clinician, student), including the predominant competencies targeted for training, the type of underlying situation simulated, as well as the user evaluation; on the right, those concerning the embodied virtual patient, including the predominant competencies of the VP, the underlying simulation model, and the tool evaluation by the user. WoZ: Wizard-of-Oz.
More recently, a study focused on improving the empathic communication skills of medical students during realistic psychiatric interviews with a middle-aged VP with affective disorders (Dupuy et al., 2020). Interestingly, the simulation was not exclusively text-based or speech-based as in previous studies, but also addressed the VP's non-verbal behavior as well as the trainees' non-verbal and empathic behavior. The interaction, based on a predetermined question-answer scenario, was designed as a 35-min psychiatric interview requiring 35 students (medical, psychiatric) to conduct interviews with a middle-aged VP with major depressive disorder and to extract semiology (i.e., clinical manifestations). The VP was displayed on a large human-sized screen and could interact with the user in natural language, allowing the user's speech to be interpreted via a speech recognition system. The strength of this study is that it emphasized not only the non-verbal behaviors of the VP (prosody, gestures, movements) but also those of the trainees (recording of facial expressions, with manual annotation of the videos and automatic analysis by an emotion recognition software). The tool was positively evaluated, in terms of usefulness, realism and credibility, during a debriefing (i.e., semi-structured interview) with the trainees. The results also suggest that the trainees maintained a neutral face during the interview, a finding interpreted by the authors as a form of empathy and the ability to maintain a certain emotional distance. Overall, this work highlights the added value of automatic facial expression recognition in psychiatry training.
A last interesting proposal consists in training health professionals to break bad news to patients, which requires fine skills in psychology, non-verbal behaviors, and empathy. Although this work is not strictly in the field of psychiatry, it seemed relevant to all those who wish to learn more about the use of VPs for training clinician-patient communication skills, as breaking bad news is a frequent and challenging task for clinicians in most clinical specialties (Fallowfield and Jenkins, 2004). In psychiatry, this may include conversations about the irreversible cognitive impairment of schizophrenia in a young adult (Cleary et al., 2009) and, in geriatrics, conversations about death issues (Lenherr et al., 2012). Hence, breaking bad news with empathy and being involved in the struggle that follows can make a significant difference. In one work (Ochs et al., 2017, 2019), the scenario required the physician to explain to a patient that a complication had occurred during her operation, requiring a second operation in a day. The stated objective was to train the physician to verify that their verbal and non-verbal communication had the right impact on the patient. The 22 participants in this study (7 expert physicians, 12 naive students) interacted in natural language with the VP that communicated through verbal (e.g., questions) and non-verbal (e.g., nods, smiles) behaviors depending on the trainee's behaviors. To overcome the limitations of voice-based tools controlled via speech-to-text modules only (e.g., misunderstandings, frustration due to transcription errors), the authors preferred a Wizard-of-Oz (WoZ) procedure, where a human operator observes the user and updates the VP response accordingly. The strength of this study is that it displayed the VP in different formats, including a computer screen and two previously unexplored immersive virtual reality systems: a Head-Mounted Display (HMD) and a 3D immersive room with wall projection (CAVE). The objective was to analyze the effect of the immersion format on the users' feeling of presence. The results of the study showed that the CAVE seems to improve the trainee's experience (feeling of presence, perception of the virtual patient) and the credibility of the tool compared to the HMD and the computer screen. The results also showed that the immersive room (CAVE) is particularly suitable for physicians (i.e., more engagement) compared to naive participants, suggesting the potential effect of the "familiar" context on the interactive experience. Finally, although this approach is promising, especially via the fully immersive approach, the fact that the non-verbal behaviors (gestures, movement, facial expressions) of the users were recorded via sensors (i.e., Kinect) but not yet exploited limits its potential somewhat. In another recent study on the same topic (O'Rourke et al., 2020), 60 medical students were asked to deliver bad news to the spouse of a patient experiencing a medical error. The interest of this study was to compare the interaction in terms of communication skills with an embodied VP or a standardized patient (SP) played by a paid actor. The VP, displayed on a large human-sized screen and controlled by a human operator in a WoZ-like procedure, communicated through verbal and non-verbal behavior (head movements, facial expressions, and gestures). When users rated the credibility of the tool, the interaction with the VP was found to be less authentic than the interaction with the SP. The novelty of this study was to assess communication performance (e.g., "Used appropriate eye contact") as well as trainees' pre-post interaction experience on an emotional level through subjective questionnaires and salivary cortisol concentration. Interestingly, the results of this study suggest that trainees' task performance and emotional reactions do not differ whether they interact with an SP or a VP.

VP-Simulation for Clinician-Patient Communication in Geriatrics
Geriatrics is a medical specialty devoted to the health of elderly people through the prevention and treatment of diseases and disabilities. The population over the age of 65 is gradually increasing and, due to the general decline in their physical or cognitive health (Gill et al., 1996), has substantial needs in terms of health care (Hashimoto and Tabata, 2010). Elderly patients often present with complex pathologies that require extensive explanation by clinicians (see Ambady et al., 2002). In addition, they tend to be relatively passive in their interactions, probably for generational reasons or for fear of being perceived as disrespectful (Gorawara-Bhat et al., 2007). Furthermore, as the population ages, an increasing number of people are at risk of developing Alzheimer's disease or related dementia (Baumgart et al., 2015). Alzheimer's disease is characterized by a progressive decline in cognitive resources (e.g., memory, language, judgement, attention, etc.) and is associated with behavioral disorders (e.g., aggression, agitation, withdrawal or resistance to care) leading to interpersonal problems and contributing to the patient's loss of autonomy (Orange, 2001; Chaby and Narme, 2009). This leads to difficulties for the patient in expressing their needs or feelings, even though they may rely on non-verbal cues (prosody, facial expressions, gestures, etc.) over time (Alsawy et al., 2020), which implies several challenges for the clinician to communicate with them adequately (van Manen et al., 2020).
In geriatric practice, although the impact of clinicians' non-verbal behaviors on elderly patient outcomes is important as well (Ishikawa et al., 2006), methods to study clinicians' non-verbal behaviors are limited (Collins et al., 2011). A few studies have attempted to use behavioral methods through observation of video recordings to code the non-verbal Dimension in physician-Elderly Patient transactions (NDEPT), including the analysis of the physician's body language during the interaction such as posture, eye contact, facial expressions, and social touch (Gorawara-Bhat et al., 2007; Gorawara-Bhat and Cook, 2011; Stepanikova et al., 2012; Gorawara-Bhat et al., 2013). Interestingly, these studies found that eye contact was the most frequently used non-verbal behavior by clinicians when communicating with elderly patients.
In this context, it becomes crucial to increase the competence of the entire healthcare workforce to communicate and maintain relationships with this geriatric population. Unfortunately, this dimension is not sufficiently considered in the training of clinicians, although several studies have already shown the positive effects of sensitizing clinicians to the use of non-verbal (Magai et al., 2002; Machiels et al., 2017) and empathetic communication (Brown et al., 2020). While VP simulation in geriatrics education is still in its infancy, it has the potential to address several challenges, including reduced access to real patients and the need to provide safe settings in which trainees can learn or practice their clinical skills (Tan et al., 2010).
To find a way to bridge the gap and improve the relationship between health care professionals and older adults with dementia, a first attempt was to use immersive technologies that virtually expose the clinician to what an older adult with dementia experiences. For instance, the Virtual Dementia Tour® (Beville, 2002; Slater et al., 2019) or myShoes (Adefila et al., 2016) projects allow clinicians to "embody" an elderly patient in a nursing home, to experience the physical and sensory difficulties, as well as the memory loss, feelings, and frustrations associated with dementia-related problems. While not directly involving training through patient interaction, these virtual immersion techniques that allow clinicians to put themselves in the patient's shoes have been shown to improve clinicians' empathy and non-verbal behaviors toward elderly patients in real clinical practice (see Campbell et al., 2021).
One of the first use cases for VPs in geriatrics was designed as a web-based platform to offer clinicians the opportunity of interacting with elderly patients in clinical care encounters (Orton and Mulhausen, 2008). The GeriaSims platform allows the user to interact with a VP embodied as an elderly person and displayed as an image or multimedia clip. The interaction can last one to two hours, including questions about history, physical examination, or choices about treatment. Several scenario topics (i.e., modules) are proposed, including cognitive and behavioral disorders in dementia, medication management, primary care, palliative care, or falls (e.g., Ruiz and Leipzig, 2008). Prior to the interaction, the trainee has access to the patient's backstory, and the objectives of the encounter (which can be diagnostic or therapeutic) are indicated. The user converses with the VP by selecting questions from a predetermined text list. A virtual mentor is also available for assistance or guidance. At the end of the interaction, the tool was evaluated with questionnaires in terms of usability and effectiveness of learning. The authors report that the tool was rated by 287 trainees as easy to use and an effective way to achieve the targeted objective. An advantage of such a web-based tool is its accessibility at any time and place and thus the flexibility in planning training, the lack of which can be a drawback of other VP simulation tools. One of the limitations, however, is that the VP is not able to converse in a realistic manner, nor respond to dialogues with non-verbal behaviors or human-like emotions. In addition, the trainee's verbal and non-verbal behaviors are not assessed.
Another particularly innovative aspect of VP simulation in geriatrics is the training of communication skills between members of geriatric care teams. These interprofessional approaches are known to improve care efficiency and patient health outcomes (Curran et al., 2007). In this regard, a recent study proposed an interprofessional virtual visit scenario with multiple healthcare professionals at the bedside of Mr. Jin, an 80-year-old man with pain and fever after surgery (Liaw et al., 2019). To provide a backstory, the patient's medical file is displayed on the screen before the interaction begins. Here, the various clinicians (physician, nurse, physical therapist, social worker, etc.) are integrated into the virtual environment via an avatar representing them and displayed on the screen at the same time as the VP. The VP can communicate through verbal (synthesized voice) and non-verbal behavior (e.g., facial expressions, body movements, breathing noise, moaning) in response to the behavior of the clinicians, who can also communicate with each other. The tool was positively evaluated by the 29 trainees participating in the study in terms of usability and effectiveness, with a moderate evaluation of the feeling of presence. The interprofessional attitude of the trainees was also assessed by judges using a questionnaire. Overall, the results of this study show the feasibility of using a 3D environment simulation including a VP to foster social interactions and collaborative practices between multiple healthcare professionals to facilitate the sharing of information about the elderly patient. Note, however, that the non-verbal behavior of the participants was not recorded or analyzed. Interestingly, this work was complemented by a study on the transferability (5 months after training) of virtual simulation learning to clinical practice (Liaw et al., 2020). Although the assessment of transferability to clinical practice was based on students' subjective perceptions via focus groups, the results indicated that learning transferred to clinical practice and showed how working together with different healthcare professions could ensure more holistic care of a patient.
Interestingly, a study by Robinson et al. (2020) focused specifically on training communication skills of 82 speech pathology students in realistic conversations with an elderly VP with behavioral symptoms of dementia and resident of a nursing home (for a full description of the tool, see Quail et al., 2016). The 15-min interaction was based on a predetermined scenario of verbal (e.g., comprehension difficulties, word search, confusion) and non-verbal (e.g., crying, shrugging, chuckling) responses that were representative of dementia. Trainees were instructed to have a conversation with the VP to identify any problems he might be experiencing. Following this, a 15-min feedback session with a clinical educator was offered to the trainees, aimed at encouraging the trainee to engage in self-reflection, followed by a second 15-min interaction with the elderly patient. The VP was displayed on a large human-sized screen and could interact with the user in natural language, with the user's speech interpreted via a WoZ operator. The strength of this study is that it emphasized, for each clinical encounter, the analysis of the trainee's verbal and non-verbal behavior based on speech transcription (e.g., "demonstrates awareness of how his/her responses are affecting the communication partner") and video annotation (e.g., "maintains appropriate eye contact"). This was coupled with a self-rating by the trainees of their communication skills. Findings revealed an improvement in students' communication skills in the second interaction, confirmed by the improved self-ratings. However, it is not possible here to distinguish the benefit of the simulation on the verbal vs. non-verbal level. Although not directly focused on clinician-patient communication training, another related study deserves to be mentioned (Szilas et al., 2019) as it reports the preliminary implementation of an embodied VP simulation tool to support interactions between family caregivers of patients with Alzheimer's disease and the patient (i.e., a 65-year-old apathetic woman with early-stage Alzheimer's disease).

VirtuAlz: A VP Tool for Training Clinicians to Communicate With People With Alzheimer's Disease
As mentioned above, the development of simulation tools for training clinicians to communicate with people with dementia is still very limited. Here, we present a VP tool called VirtuAlz that was designed for geriatric health professionals to sensitize them to the basic communication skills needed to interact with elderly patients with Alzheimer's disease. The tool, including the behavior of the Alzheimer's VP, is based on real clinical cases (e.g., medication administration, patient's wandering) derived from a field observation conducted in a geriatric service and an analysis of communication training needs in this field (Becerril-Ortega et al., 2022).
A first 3D prototype including a hospital setting and an 89-year-old virtual patient was modeled and displayed on a large human-sized screen (Figure 3). To provide a backstory, a patient file is displayed on the screen before the interaction starts. It includes medical information about the patient (i.e., name, age, diagnosis and medical history), a description of the context before the interaction (e.g., restless night, refusal to eat), and the trainee's objective for the current scenario (e.g., stimulate the patient and ensure that the medication is taken).
Particular attention was paid to generating the predominant competencies of the VP, the implementation of the underlying technology through a WoZ simulation, the evaluation of the tool by trainees, and the automatic monitoring of the users' behavior (see Figure 2). Concretely, the VP could produce verbal (synthesized voice) and non-verbal (body and head movements, gaze direction, facial expressions) behaviors that mimicked an elderly patient with signs of Alzheimer's disease (apathy, memory loss, agitation, aggression, or refusal of care). The trainee could interact in natural language with the VP. In our simulation approach (Figure 3), the WoZ operator (a geriatric expert) selects the verbal and non-verbal behaviors (facial expressions, posture, etc.) to be generated in real time on the VP, based on the verbal and non-verbal behavior of the trainee, who was expected to act as in a role-play with a real patient (Benamara et al., 2020). Each action performed by the WoZ to control the VP behavior is recorded and logged. After each session, the educational tool is assessed with questionnaires in terms of system usability, acceptability, VP realism and effectiveness of the educational tool.
In addition, a key aspect of the VirtuAlz platform relates to the automatic evaluation of the trainee's non-verbal behaviors, which are captured by a front-facing camera during the interaction with the VP (lasting 6 min on average). The corpus collected was composed of 29 videos of clinician-VP interactions (each video involved a different clinician, practicing at the geriatric hospital as a physician, psychologist, nurse, or health care provider). We focus our analysis on non-verbal features of the trainee using automated feature extraction. Several non-verbal cues have been shown to capture relevant socio-affective states in similar settings with children (e.g., Delaherche et al., 2013; Avril et al., 2014; Anzalone et al., 2019) or adults (e.g., Aigrain et al., 2016). The main difference with this line of research is that VirtuAlz focuses on training clinicians to communicate with elderly patients with signs of Alzheimer's disease. Based on the literature mentioned in Section 2.3, we decided to focus on the analysis of the clinician's non-verbal behaviors by considering the following non-verbal cues: body posture (i.e., body openness, computed as the distance between the wrists and shoulders of the clinician), proxemics (i.e., physical proximity with the VP), facial expressions based on facial Action Units analysis (i.e., smile with mouth corners pulled, AU14; mouth corners depressed, AU15; frowning with eyebrows lowered and drawn together, AU4; eyebrow/eye raising, AU1/AU5), and self-touching on the head (i.e., hand position in the zone of the head; see Aigrain et al., 2016). The facial Action Units (AUs) are automatically extracted by OpenFace (Baltrusaitis et al., 2018), while the features related to self-touching on the head, proxemics and body openness are computed from the body and hand pose estimation using OpenPose (Cao et al., 2021). This set of non-verbal cues is then transformed into symbols and evaluated throughout the interaction (for more details see Zagdoun et al., 2021) to obtain explainable measures that could be indicators of openness to others, warmth, or empathy (e.g., duration of the clinician's smiles, physical proximity or body opening during the interaction; see Mast and Hall, 2017) or of discomfort or stress exhibited by the clinician (e.g., number of times the clinician touches their head; see Harrigan, 1985). For the feedback, the symbols are contextualized by considering the behaviors of the virtual patient in order to assess the consistency of the clinician's behaviors (e.g., smiling when approaching the patient, not appearing nervous with overgesturing or excessive self-touching). Although the potential of the VirtuAlz tool has not yet been fully realized, it offers the opportunity for trainees to work on their non-verbal behavior in safe environments. Results from the automatic analysis of non-verbal behavior should also make it possible, during interviews with a clinical educator, to provide finer-grained feedback aimed at encouraging the trainee to engage in self-reflection before interacting with a real patient.
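As a rough illustration of how pose-based cues of this kind might be derived, the following Python sketch computes a body-openness score and a head self-touch flag from 2D keypoints, then summarizes per-frame flags over an interaction. It is not the VirtuAlz implementation: the keypoint names, the normalization by shoulder width, the `radius_factor` threshold and the helper names are all assumptions made for this example, and the keypoints are plain dictionaries rather than actual OpenPose output.

```python
import math

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def shoulder_width(kp):
    """Distance between the shoulders, used here as a scale reference."""
    return dist(kp["l_shoulder"], kp["r_shoulder"])

def body_openness(kp):
    """Mean wrist-to-shoulder distance, divided by shoulder width.

    Assumption: each wrist is compared with the shoulder on the same side,
    and the result is normalized so it does not depend on camera distance.
    """
    left = dist(kp["l_wrist"], kp["l_shoulder"])
    right = dist(kp["r_wrist"], kp["r_shoulder"])
    return (left + right) / (2 * shoulder_width(kp))

def touches_head(kp, radius_factor=0.6):
    """True if either wrist lies inside a circular head zone around the nose.

    Assumption: the head zone radius is a fraction of the shoulder width.
    """
    radius = radius_factor * shoulder_width(kp)
    return any(dist(kp[w], kp["nose"]) < radius for w in ("l_wrist", "r_wrist"))

def count_episodes(flags):
    """Count runs of consecutive True values (e.g., number of head touches)."""
    count, previous = 0, False
    for flag in flags:
        if flag and not previous:
            count += 1
        previous = flag
    return count

def active_seconds(flags, fps):
    """Total time in seconds during which a per-frame flag is True."""
    return sum(flags) / fps

# Hypothetical single frame: arms hanging down, hands away from the head.
frame = {
    "l_shoulder": (0.0, 0.0), "r_shoulder": (2.0, 0.0),
    "l_wrist": (0.0, 2.0), "r_wrist": (2.0, 2.0),
    "nose": (1.0, -1.0),
}
print(body_openness(frame), touches_head(frame))
```

In such a sketch, `count_episodes` would give a measure like the number of head touches, and `active_seconds` a measure like the cumulative duration of smiles, given one boolean per video frame.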

DISCUSSION
As mentioned above, the value of VP-based simulation is that it provides tools that allow clinical teams and researchers to examine and model the verbal and non-verbal behaviors of the clinician/student while manipulating either the social-emotional or cognitive behavior of the VP and the visual appearance of the graphic environments. With good experimental control, these features make VPs powerful, useful, and reliable tools for studying the social communication that is central to clinician-patient interactions. However, in the field of psychiatry and geriatrics (Table 1), research has mainly focused on modeling the VP to simulate specific clinical encounters or on the evaluation of the simulation tool itself (e.g., feasibility, credibility, usability). Among the 14 tools (17 articles) using VP simulation presented in this overview, five used simulation models (e.g., prerecorded videos associated with text-based interfaces) that do not allow for realistic human-like conversations (Carpenter et al., 2012; O'Brien et al., 2019; Shah et al., 2012; Cordar et al., 2014; Pantziaras et al., 2014, 2015). By contrast, four tools offer immersive experiences using VPs displayed on human-sized screens and interacting in natural language (Ochs et al., 2017, 2019; Dupuy et al., 2020; O'Rourke et al., 2020; Robinson et al., 2020). Note that while tools based on web interfaces or pre-recorded videos may lack realism and fluidity, tools based on natural language and speech recognition are rarely fully autonomous and currently require the intervention of a human WoZ operator. In addition, our overview reveals a lack of interest in evaluating the learner. Note that in most of the studies, the learner evaluation in terms of knowledge is considered in post-session (except in Carpenter et al., 2012; O'Brien et al., 2019; Orton and Mulhausen, 2008; Szilas et al., 2019). Interestingly, a few studies propose a follow-up of learners' knowledge several weeks after the interaction (Pantziaras et al., 2015) or an assessment of the transferability of the training to clinical practice (Liaw et al., 2020). By contrast, the learner's verbal (except Parsons et al., 2008; Kenny et al., 2009a,b; Pantziaras et al., 2014, 2015; Dupuy et al., 2020) and non-verbal (except Dupuy et al., 2020; O'Rourke et al., 2020; Liaw et al., 2020; Robinson et al., 2020) behaviors were seldom assessed. Non-verbal behaviors play a crucial role in clinician-patient relationships (Henry et al., 2012). Eye contact, physical proximity, clinician's posture leaning toward the patient, or synchronization of the movements of this dyad may be associated with the length of the visit, the patient's perceived empathy, feeling of trust, or patient self-disclosure (Lorié et al., 2017; Goldstein et al., 2020). One possible reason for this poor consideration of non-verbal behaviors is that the tools used rely primarily on human coding (speech transcript, video annotation), which is time-consuming and labor-intensive (D'Agostino and Bylund, 2011) and highly subjective due to biases of human annotation (Mast and Cousin, 2013). A precursor to developing adequate simulation tools in clinical settings is being able to capture and analyze the user's non-verbal behaviors automatically so that they can be linked in real time to the patient's behavior or used for later feedback (e.g., focus group, exchange with a tutor). Such automatic methods (without requiring intensive human coding or specialized training) have already been validated for analyzing the clinician-patient relationship (e.g., Hart et al., 2016; Tan et al., 2020). More generally, we suggest that automated analysis of clinician-patient interaction could offer a high temporal resolution and fine-grained analysis (sometimes invisible to the clinician's or tutor's eye) to provide feedback to clinicians or students on key aspects of their communication. The link between such fine-grained analysis of interaction and the learning gain of clinicians has yet to be investigated. A promising direction is to consider the concept of productive engagement as the level of engagement that maximizes learning (Nasir et al., 2021). Lastly, these educational tools should consider the ethical issues surrounding virtual reality research (e.g., risks related to information overload, intensification of arousal with virtual environments and re-entry into the real world, Behr et al., 2005) or human-computer interaction (usefulness in the light of the purpose, Grinbaum et al., 2017; Wullenkord and Eyssel, 2020), and also protection of the user's data and privacy (e.g., audio/video recording of the learner, see Parsons, 2021).

TOWARD BEST PRACTICES AND USES
Importantly, when developing a VP simulation tool, the pedagogical context has a strong impact on the choice of technologies needed to develop the modules for simulating the VP's verbal and non-verbal behaviors. Hence, a simulation tool that focuses on training non-verbal communication skills requires technologies to detect and interpret the trainee's non-verbal behavior (Hoque et al., 2013).
It should also be considered that one of the most challenging parts of any VP application is the development of modules for simulating verbal and non-verbal behaviors. According to the context, they may only require a small subset of non-verbal behaviors such as posture and gestures, without necessarily focusing on facial expressions or gaze (Ochs et al., 2018). However, if the prime pedagogical objective is rather directed to the verbal content of communication or decision-making, a chatbot or a text-based interface may be suitable (Tanaka et al., 2017a).
At the same time, the choices regarding the technologies to be used are dependent on the resources and constraints of the project. Computer screen simulation can provide a multisensory and immersive interactive experience, with the possibility to design a VP that communicates verbally and non-verbally in real time. Alternatively, if the simulation needs to run on smartphones or tablets, technical resources are limited, and heavy computations cannot always be performed (e.g., real-time detection of non-verbal behavior). In such cases, the simulation will rely instead on text and graphical menu interfaces and use predefined animations to control the VP's behavior (Philip et al., 2020).
FIGURE 3 | Overall architecture of the VirtuAlz platform. Here a user (i.e., clinician) is instructed to interact in a face-to-face clinical encounter with Ms. Dupont, an elderly virtual patient (VP) with Alzheimer's disease symptoms. The 10-min interaction concerns taking medication, and the VP exhibits behaviors such as agitation, aggression, apathy, or refusal of medication that the clinician must address (e.g., stimulate or pacify the patient) in order to persuade them to take their medication. The interface with the VP, including the hospital room, is displayed on a large screen (55″). The VP is controlled by a Wizard-of-Oz (WoZ) operator, and the platform is equipped to record the user's non-verbal behaviors (i.e., through a video camera and microphone) to modulate the evolution of the interaction.
It also has to be mentioned that several different technologies are available for simulating verbal and non-verbal VP behaviors. In most simulation systems in medical training, a decision tree is used to guide the interaction and trigger the VP's reactions depending on the trainee's choices and/or behavior. The possible VP reactions depend on the specific needs of the simulation, and can display specific characteristics of an agent, such as emotional states or pathological symptoms (Rizzo and Shilling, 2017). These reactions can be triggered from a predefined set of combinations of verbal and non-verbal behaviors, following the unfolding of the scenario script. The VP's reactions can also be selected in real time by a human experimenter using a Wizard-of-Oz setup according to the specific goals of the training. Finally, VP reactions can be generated automatically using computational models that take into account a set of inputs provided by the user (for a review, see Wang and Ruiz, 2021). From a technical point of view, this latter technique is the most challenging to design. A first category of simulation is based on theories of cognitive psychology and allows emulating emotional states or a personality appropriate for a given context (Jones and Sabouret, 2013; DeVault et al., 2014), or even pathological symptoms (Benamara et al., 2022). A second category of simulation focuses on the verbal content of the interaction and allows interaction with the VP through natural language. In this context, speech-based facial animation can be synthesized from keywords or automatically with machine learning methods. Note that most systems are hybrid, incorporating both emotional models and natural language-based techniques (Vougioukas et al., 2020). In the future, additional non-verbal behaviors could be explored to improve the simulation of social interaction in medical environments. For instance, in medical care, the use of touch plays a crucial role in communicating with patients (Kim and Buschmann, 1999). Thus, the development of haptic interfaces should provide new ways to train social skills related to social touch (for a review, see Pelachaud et al., 2021).
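To make the scripted, decision-tree style of simulation mentioned above more concrete, the following Python sketch shows one possible way to encode a branching scenario in which each trainee choice triggers a predefined combination of verbal and non-verbal VP behaviors. It is a toy example and not taken from any of the reviewed tools: the `Node` class, the `run_script` function, the node names and the VP lines are all invented for illustration, and the script is stored as a small graph so that branches can rejoin.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One scripted VP reaction plus the trainee options that follow it."""
    vp_speech: str                  # verbal behavior (hypothetical line)
    vp_nonverbal: str               # non-verbal behavior to animate
    choices: dict = field(default_factory=dict)  # trainee option -> next node id

# Hypothetical medication scenario, invented for this example.
SCENARIO = {
    "start": Node("I don't want any pills.", "turns head away",
                  {"insist": "agitated", "reassure": "calmer"}),
    "agitated": Node("Leave me alone!", "frowns, raises voice",
                     {"reassure": "calmer"}),
    "calmer": Node("All right... what is it for?", "looks at the clinician",
                   {"explain": "accepts"}),
    "accepts": Node("Fine, I will take it.", "nods"),
}

def run_script(scenario, start, trainee_choices):
    """Follow the trainee's choices through the script; return visited node ids."""
    node_id = start
    visited = [node_id]
    for choice in trainee_choices:
        options = scenario[node_id].choices
        if choice not in options:
            raise ValueError(f"'{choice}' is not available at node '{node_id}'")
        node_id = options[choice]
        visited.append(node_id)
    return visited

# Replay one possible session and show the VP reaction at each step.
for step in run_script(SCENARIO, "start", ["insist", "reassure", "explain"]):
    node = SCENARIO[step]
    print(f"{step}: {node.vp_speech} [{node.vp_nonverbal}]")
```

In a WoZ setup, the same structure could be driven by the operator selecting the option that best matches the trainee's behavior, whereas a fully automatic system would have to infer that option from speech and non-verbal input.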
In conclusion, the survey of the literature enables us to propose concrete recommendations for each step of the VP simulation process, including the development of the VP, conduct and evaluation of the training sessions. Designers will be expected to include healthcare professionals in all stages of the VP design in order to develop and implement a VP tailored to their needs and to define ways to improve the tool.
• Use cases-Determine clinical cases that are challenging for clinician/student and patient interactions, and identify learner populations that might be interested in or benefit from training.• Needs-Examine the needs of learners, particularly in difficult situations, by reviewing the literature and conducting field observations in clinical units, interviews and questionnaires.• VP-Design a VP adapted to the needs and constraints of the project.On smartphones/tablets, the VP should be based on predefined animations.On computer screen the VP should be able to communicate autonomously verbally (e.g., question-answer) and non-verbally (e.g., facial expressions, gaze, body movements, gestures, prosody), to promote immersion (taking into account screen size) and human-like interactions.
• Scenario-Develop a narrative scenario, which may include problem-solving and allow multiple training sessions.A non-linear navigation structure (in which the learner's decisions shape the VP's behavior) will ensure flexibility and learner interactivity with the VP.• Feedback-Provide feedback with messages, scores, visual representation of the VP's emotional state.Gamification with leaderboards could help make the learning experience fun and engaging.
• Choices and tradeoffs: Review the available technology solutions for VP simulation and choose the one that seems to best meet the needs of the learners in terms of accessibility, usability, training needs, cost, data security and privacy, technical assistance requirements and sustainability. Authoring tools, if available, could help design, prototype, and deploy VPs in a variety of use cases.
• Feasibility/Usability testing: Test the feasibility of the system with learners, conduct usability testing, and define necessary adaptations of the tool.
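As an illustration of the non-linear navigation structure recommended above, the following minimal Python sketch represents a scenario as a graph in which each learner choice determines the VP's next utterance and emotional state, and accumulates a score that could feed the feedback step. It is not taken from any of the reviewed systems: the `Node` class, the `run_session` function, the dialogue lines, the emotion labels and the scoring values are all hypothetical and were invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One VP turn: what the VP says, its emotional state, and learner options."""
    vp_line: str
    vp_emotion: str
    # Maps a learner choice label to (next node id, score change).
    choices: dict[str, tuple[str, int]] = field(default_factory=dict)


# Hypothetical scenario content, invented for illustration only.
SCENARIO = {
    "start": Node(
        "I don't want to take this pill.",
        "anxious",
        {"insist": ("refusal", -1), "ask_why": ("explains", 1)},
    ),
    "refusal": Node("Leave me alone!", "agitated", {"apologize": ("explains", 1)}),
    "explains": Node("It makes me feel dizzy.", "calmer", {"reassure": ("accepts", 1)}),
    "accepts": Node("All right, I will try.", "calm"),
}


def run_session(scenario, learner_choices, start="start"):
    """Follow the learner's choices through the graph.

    Returns the list of visited node ids, the cumulative score,
    and the VP's emotional state at the last visited node.
    """
    node_id, score, path = start, 0, [start]
    for choice in learner_choices:
        node = scenario[node_id]
        if choice not in node.choices:
            raise ValueError(f"{choice!r} is not available at node {node_id!r}")
        node_id, delta = node.choices[choice]
        score += delta
        path.append(node_id)
    return path, score, scenario[node_id].vp_emotion


if __name__ == "__main__":
    # Two learners reach the same outcome by different routes and scores.
    print(run_session(SCENARIO, ["ask_why", "reassure"]))
    print(run_session(SCENARIO, ["insist", "apologize", "reassure"]))
```

Because the path and score depend on the sequence of choices, the same scenario can be replayed over several sessions with different outcomes, and the recorded path could be revisited during debriefing.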
Conduct and Assessment of training sessions.
• Training plan: Establish a training program (where and when to use VP simulation) that is tailored to the needs and abilities of the clinicians or students who will be involved in the learning.
• Tutorial: Offer specific training on the use of the tool to learners interested in participating in the learning experience. Educational material adapted to this objective (e.g., tutorials) can help learners understand how to use the tool.
• Debriefing: Follow each VP learning session with a debriefing led by a tutor, so that learners can discuss the use case and ask questions.

FIGURE 1 | Simple classification of healthcare simulations. Procedural simulation is based on manikins that represent a full or partial human body; standardized simulation is based on role-playing with patients played by actors in real clinical situations or through videos; virtual reality simulation is based on computer technology to create virtual worlds that the user can interact with (e.g., serious games, conversational chatbots or Embodied Virtual Agents).

• Training effectiveness: Evaluate the learning gain pre- and post-session with questionnaires.
• Non-verbal communication: Provide tools to measure and interpret non-verbal characteristics of clinician-patient communication. An automated or semi-automated tool (commercial software) for measuring and analyzing non-verbal communication can provide additional value over manual annotation.
• Follow-up: Define a way to follow up on the VP learning at the individual and institutional level (university, hospital) to identify necessary modifications.
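The evaluation of learning gain from pre- and post-session questionnaires can be sketched as follows. This is a minimal example under stated assumptions, not a procedure prescribed by the reviewed studies: it assumes that each learner completes the same questionnaire before and after the session and that the questionnaire has a known maximum score. The function name `learning_gains` and the example scores are hypothetical. In addition to the raw gain (post minus pre), the sketch reports a normalized gain, (post - pre) / (max - pre), a common way of expressing improvement relative to the room left for improvement; its use here is our assumption.

```python
from statistics import mean


def learning_gains(pre, post, max_score):
    """Summarize pre/post questionnaire scores for a group of learners.

    pre, post: lists of scores, one entry per learner, in the same order.
    max_score: highest possible questionnaire score (assumed known).
    """
    if len(pre) != len(post) or not pre:
        raise ValueError("pre and post must be non-empty and of equal length")

    raw = [b - a for a, b in zip(pre, post)]

    # Normalized gain is undefined for learners already at the maximum
    # before the session, so those learners are left out of that average.
    normalized = [
        (b - a) / (max_score - a) for a, b in zip(pre, post) if a < max_score
    ]

    return {
        "mean_raw_gain": mean(raw),
        "mean_normalized_gain": mean(normalized) if normalized else None,
    }


if __name__ == "__main__":
    # Hypothetical scores on a 20-point questionnaire.
    print(learning_gains(pre=[10, 12, 15], post=[14, 16, 15], max_score=20))
```

Such summary values only describe the group; deciding whether a gain is meaningful would still require an appropriate study design and statistical analysis.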