Automatic Thoughts and Facial Expressions in Cognitive Restructuring With Virtual Agents

Cognitive restructuring is a well-established mental health technique for amending automatic thoughts, which are distorted and biased beliefs about a situation, into objective and balanced thoughts. Since virtual agents can be used anytime and anywhere, they are expected to perform cognitive restructuring without being influenced by medical infrastructure or patients' stigma toward mental illness. Unfortunately, since the quantitative analysis of human-agent interaction is still insufficient, the effect on the user's cognitive state remains unclear. We collected interaction data between virtual agents and users to observe the mood improvements associated with changes in automatic thoughts that occur in user cognition and addressed the following two points: (1) implementation of a virtual agent that helps a user identify and evaluate automatic thoughts; (2) identification of the relationship between a user's facial expressions and the extent of the mood improvement subjectively felt by users during the human-agent interaction. We focus on these points because cognitive restructuring by a human therapist starts by identifying automatic thoughts and seeking sufficient evidence to find balanced thoughts (evaluation of automatic thoughts). Therapists also use such non-verbal behaviors as facial expressions to detect changes in a user's mood, which is an important indicator for guidance. Based on the results of this analysis, we provide a technical guidance framework that fully automates the identification and evaluation of automatic thoughts to achieve a virtual agent that can interact with users by taking into account their verbal and non-verbal behaviors in face-to-face situations. This research supports the possibility of improving the effectiveness of mental health care in cognitive restructuring using virtual agents.

1 Nara Institute of Science and Technology, Nara, Japan; 2 Osaka University, Osaka, Japan

INTRODUCTION
Cognitive restructuring, an established therapeutic technique that reduces the effects of negative thoughts, is a cognitive behavior therapy (CBT) method (Beck and Beck, 2011). Some proposed virtual agents employ CBT methodology with mixed degrees of effectiveness (Laranjo et al., 2018; Montenegro et al., 2019). Several types of agent dialogues have also been proposed, such as text-based dialogues in the style of messaging apps (Fitzpatrick et al., 2017; Ly et al., 2017; Fulmer et al., 2018; Inkster et al., 2018; Suganuma et al., 2018) and virtual agents represented by animated computer characters (Ring et al., 2016; Kimani et al., 2019). Virtual agents have many advantages because they can provide face-to-face multimodal interactions like CBT with human therapists. In actual psychotherapy, the therapist understands the patient's condition not only through words but also through such non-verbal communication as facial expressions, speaking rhythms, and gestures (Koole and Tschacher, 2016). Patients' facial expressions are also discussed in the clinical CBT literature (Beck and Beck, 2011). If a virtual agent, like a psychiatrist, can recognize both verbal and non-verbal behaviors, it might approach the level of cognitive restructuring that humans do with each other. Therefore, dialogues with virtual agents can facilitate healthcare dialogues using non-verbal communication like facial expressions and voices, similar to human dialogues. According to a 2019 survey on the use of virtual agents in health contexts (Montenegro et al., 2019), some studies refer to medical datasets to study non-verbal behavior. For example, virtual agents have been proposed to provide social skills training based on the analysis of non-verbal behavior in autism spectrum disorders (Tanaka et al., 2017). Such cases remain rare, however, and to the best of our knowledge, no agent has been created based on an analysis of the non-verbal features of cognitive restructuring.
In addition to virtual agents, other CBT-based technical approaches have been proposed. Several studies have begun to report the adaptation of CBT to robot-to-human interactions. Dino et al. (2019) conducted a cohort study that provided CBT-based interactions to a small number of participants. Akiyoshi et al. (2021) applied CBT theory to promote self-disclosure in robot-to-human interactions. Human-to-human CBT within virtual reality has also been studied. Wallach et al. (2009) performed a randomized clinical trial of CBT in virtual reality. Lindner et al. (2019) presented a perspective on the usefulness and challenges of virtual reality in CBT.
Cognitive restructuring identifies the biased and useless ideas mistakenly used by patients to solve their current problems (Beck and Beck, 2011). In cognitive restructuring, a patient evaluates the accuracy of her automatic thoughts against the facts. The therapist asks fact-finding questions and encourages this evaluation of automatic thoughts. Worksheets called thought records are often used in cognitive restructuring for personal use and allow therapists to reconstruct cognition. After the original thought record concept was proposed (Beck and Beck, 2011), Greenberger's updated version (Greenberger and Padesky, 2015) also became widely used. An important advance in Greenberger's thought records was the inclusion of Socratic questioning for evaluating a patient's expression of automatic thoughts. The questions are taken from Beck and Beck (2011) and can be widely applied to general automatic thoughts.
CBT can treat such mental health conditions as anxiety and depression in members of the general population. In this paper, our system is intended for daily mental health care for the general population. Our final goal is to contribute to a mental health care system that provides cognitive behavior therapy to members of the general population before the onset of illness. In our data collection, the participants were graduate/undergraduate students who had not been diagnosed with depressive disorders. We analyzed human-agent interactions and focused on the concept of automatic thoughts as used in cognitive restructuring. Automatic thoughts are interpretations that occur instinctively, depending on the situation. In the theory of cognitive restructuring, a situation and the reactions to it, such as moods, are not directly linked; automatic thoughts mediate them. In other words, the situation itself does not cause negative moods. Negative moods are caused by automatic thoughts that surface in specific situations. Cognitive restructuring improves a person's moods by confirming the bias of automatic thoughts concerning the situation and correcting them. In cognitive restructuring, the therapist first makes the patient aware of (identifies) automatic thoughts and then considers whether they are negative/biased or factual. If an automatic thought is negative/biased, effective cognitive restructuring can lead to valid/factual thoughts. Patients suffering from biased thoughts can improve their mood by eliciting new thoughts.
Since the reliable identification of automatic thoughts by patients/users is essential for evaluating them, we propose a system that helps users identify their automatic thoughts. When the virtual agent asks a user for automatic thoughts, it automatically determines whether the answers are automatic thoughts, and if the user's own identification fails, it guides the user toward a successful identification.
The objective of this study is to observe the mood improvements associated with changes in automatic thoughts that occur in user cognition. We help users reliably identify automatic thoughts and investigate which questions in cognitive restructuring affect the user's moods during human-agent interaction. We hypothesize that an item called the evaluation of automatic thoughts drives the user's cognitive transformation and improves their negative moods.
This work makes the following two contributions:
1. We implemented a virtual agent that helps users identify and evaluate automatic thoughts.
2. We analyzed the relationship between user facial expressions and the extent of mood improvement subjectively felt by users during the human-agent interaction.
This paper is an extended version of our previously published works (Shidara et al., 2020, 2021). It synthesizes our prior works and includes some extensions and detailed explanations. Specifically, we extend the previous works by applying machine learning models to classify users' utterances about an automatic thought in Section 2.2. In addition, we added new participants in Section 3 to increase the reliability of the experiment on the relationship between the evaluation of automatic thoughts and mood improvement.

IDENTIFICATION OF AUTOMATIC THOUGHTS AND RELATIONSHIPS BETWEEN MOOD CHANGES AND FACIAL EXPRESSIONS
This section describes our data collection with a virtual agent system. Our data include the text, video, and speech of the users. We implemented an automatic thought classifier using the text data and analyzed the facial expressions in the human-agent interactions. The system's interface consists of five parts: a camera, a microphone, a keyboard, a speaker, and a display. The camera is used only to record the user's face, and the microphone is used only for voice recording. The display shows the virtual agent from the chest up. The facial expressions and postures of the virtual agent are its default settings. An interaction is composed of alternating questions from the virtual agent and answers from one user. When the user presses a key, the virtual agent's question is played from the speaker. At this time, the lip-sync manager generates lip movements that match the virtual agent's voice. After listening to the virtual agent's question, the user answers it. After completing her answer, she presses a key on the keyboard for the next question.
After the data collection, we analyzed the mood strength, automatic thoughts, and changes in the users' facial expressions during their interactions with the virtual agent. We focused on reducing the variables involved in the interaction analysis as much as possible. In our data collection, the system asked fixed questions without commenting on the user responses.
We created a scenario based on cognitive restructuring (Beck and Beck, 2011). Table 1 shows the system-side scenario, which was created under the supervision of a psychiatrist (the sixth author). We implemented this scenario in a virtual agent module with MMDAgent (Lee et al., 2013), a toolkit for building embodied conversational agents that allows voice dialogues. We used the default parameters for facial expressions, body posture, speaking speed, and voice pitch. The virtual agent outputs its questions as spoken language, and the user replies in spoken natural language through a headset microphone.

Participants in Data Collection
We recruited 23 undergraduate/graduate students as users (10 females and 13 males). The research ethics committee of the Nara Institute of Science and Technology reviewed and approved this experiment (reference number: 2019-I-24-2). The first author explained the experiment to all users, and written informed consent was obtained from each of them beforehand. We also confirmed that they had no severe mental illness based on the Kessler Psychological Distress Scale (K6) (Kessler et al., 2002). K6 is an inventory that investigates the tendency toward mental illness without distinguishing between depression and anxiety disorders. K6 scores range from 0 to 24, and the cutoff for inclusion in our experiment was 13. All of the users' K6 scores were lower than the cutoff [mean (M) = 7.18, standard deviation (SD) = 4.22].

Analysis of Automatic Thoughts
In Table 1, Q4 identifies automatic thoughts: the user's answer to Q4 is her automatic thought. After collecting the data, we extracted each user's answer to Q4, and a psychiatrist with clinical experience in cognitive restructuring labeled whether it corresponded to an automatic thought. We identified the following three patterns in which the identification of automatic thoughts failed:
• Users confused automatic thoughts with facts, e.g., "I can't decide what kind of job to look for."
• Users confused their mood with automatic thoughts, e.g., "I don't want to do anything."
• Users evaded the question with such responses as "Nothing" or "I don't understand."
If the user's answer to Q4 included an automatic thought, its identification was deemed successful even if the user also mentioned such collateral aspects as mood and situation. In the labeling, the psychiatrist determined that eight of the 23 users failed to identify their automatic thoughts. To address such failures, this study implemented a novel classification model. Based on the methodology of cognitive restructuring, we aim to build basic technology that enables virtual agents to shape automatic thoughts. Human therapists use a variety of questions to help patients who struggle to identify automatic thoughts. Once such thoughts are identified through questioning, the next step, their evaluation, can be taken.
Evaluating an automatic thought means judging whether the automatic thought identified by the patient is based on negatively distorted cognition or on factual validity. If an automatic thought is distorted, it can be evaluated and modified to suit the particular situation. Changing distorted thoughts to balanced thoughts generally improves a person's mood. On the other hand, cognitive restructuring does not improve the mood when an automatic thought is not negatively distorted; in that case, action must instead be taken to solve the actual problem. Both therapists and patients can determine whether an automatic thought is distorted only after undergoing cognitive restructuring.
Since the reliable identification of automatic thoughts by patients/users is essential for evaluating them, we propose a system that helps virtual agents identify the automatic thoughts of users. When the virtual agent asks users for automatic thoughts, this system automatically classifies whether their answers are automatic thoughts, and if a user's own identification fails, it guides her toward a successful identification. To realize this system, we built a classification model for sentences containing automatic thoughts. Using the data collected by the virtual agent from the 23 participants to build the classification model, we achieved a practically useful classification performance.
We also added to the training data example sentences (Table 2) about automatic thoughts from a self-help book (Greenberger and Padesky, 2015). This list of example sentences is a worksheet with which general readers can acquire skills for identifying automatic thoughts. Its 33 sentences are labeled as situations, moods, or thoughts. In CBT, situations and moods are concepts distinct from thoughts, although distinguishing among them is difficult for those unfamiliar with them. We treated thought as a successful identification and situation and mood as unsuccessful ones. Table 3 shows an overview of the two datasets used to build the model.
The classifier algorithm was a support vector machine (linear kernel). As feature extraction methods, this paper's classification model uses Term Frequency-Inverse Document Frequency (TF-IDF) and the distributed representation of Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2018). TF-IDF calculates word importance by considering both the common and uncommon words in the dataset. When calculating TF-IDF, we used a morphological analyzer called MeCab to divide the text. In vectorization, only the vocabulary in the training data is counted; unknown words are ignored. BERT is a pre-trained language model for language comprehension. In this study, we tokenized the raw text and input the token sequence directly into the model. During tokenization, special tokens are placed at the beginning and end of each sentence: a token named [CLS] is placed at the beginning, and [SEP] is placed at the end. These tokens are added to all input sentences. The sequence with the [CLS] token at its beginning is input to BERT, and the output vector corresponding to the [CLS] token is used as the feature vector. The BERT hidden vector has 768 dimensions. Table 4 shows the classification results. The best F1-score was 0.88 when we used the sentences from Greenberger and Padesky (2015) as training data and TF-IDF as the features.
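As a concrete illustration of the TF-IDF variant of this pipeline, the sketch below trains a linear-kernel SVM on TF-IDF features with scikit-learn. The toy English sentences and labels are invented stand-ins for the (Japanese) training data described above, not the study's dataset, and MeCab tokenization is omitted since English splits on whitespace.

```python
# Illustrative sketch (not the authors' exact pipeline): linear-kernel SVM
# over TF-IDF features. The toy sentences are hypothetical stand-ins.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

train_texts = [
    "I am a failure and will never succeed",      # thought (successful identification)
    "My boss criticized my report this morning",  # situation (unsuccessful)
    "I feel anxious and depressed",               # mood (unsuccessful)
    "Nobody would want to work with me",          # thought
]
train_labels = [1, 0, 0, 1]  # 1 = automatic thought, 0 = situation/mood

# TF-IDF counts only vocabulary seen in training; unknown words are ignored,
# mirroring the vectorization described in the text.
clf = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
clf.fit(train_texts, train_labels)

pred = clf.predict(["I will never succeed at anything"])
print(pred[0])
```

In the study itself, the model was evaluated with the F1-score across the two datasets in Table 3; this fragment only shows the training and prediction interface.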

Mood Scores and Mood Changes
We obtained the mood scores of the participants by asking them twice about the intensity of their negative moods: the first at Q3 and the second at Q14, both shown in Table 1. We analyzed the changes in their moods and the corresponding changes in their facial expressions between the two ratings. The users described their moods with such labels as anxious, depressed, sad, inferior, and fatigued. In this paper, we uniformly lumped all such feelings under the rubric of negative moods and evaluated them only by mood scores to focus on the changes themselves. Based on related work (Persons and Burns, 1985), we calculated the mood changes using the following formula:

Mood change = [(mood score at beginning) − (mood score at end)] / (mood score at beginning). (1)

In this research, we implemented a virtual agent that automatically interacted with users and collected experimental data. We used the mood scores to evaluate the dialogues. We did not use a standardized rating scale like K6 because our experiment consisted of just one session. Rating scales such as K6 are unsuitable for assessing interactions in a single session because they assume re-measurement at regular intervals; for example, K6 requires a one-month interval. Instead, a mood score is used in actual cognitive restructuring to measure mood improvement within a single session. Even though it is not a standardized measure, it is the most reasonable way to analyze one interaction session. Furthermore, a previous study (Persons and Burns, 1985) analyzed the cognitive restructuring of human therapists and patients and used mood scores as an evaluation index for a session. In our study, we calculated mood changes, the amount of change in the mood scores before and after a session, by referring to Persons and Burns (1985), and used them as an evaluation index.
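Equation (1) amounts to a one-line relative-change computation; the scores in the example are hypothetical, not taken from the study.

```python
# Minimal sketch of Equation (1): relative mood change between the two
# self-reported mood scores (Q3 at the beginning, Q14 at the end).
def mood_change(score_begin: float, score_end: float) -> float:
    """Relative improvement; positive values mean the negative mood weakened."""
    return (score_begin - score_end) / score_begin

# Example: a user reports 80 before the session and 50 after it.
print(mood_change(80, 50))  # 0.375
```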
First, the users read a publicly available explanation of CBT that is designed for both clinical and general audiences. Then the first author explained how to use the system. Data were collected using a laptop PC (HP ProBook) in a quiet room where each user sat alone. We recorded their facial expressions with the laptop's built-in camera. The completion time fluctuated based on the amount of speaking done by the user.

Extracting Facial Action Units
Based on the Facial Action Coding System (FACS) (Ekman and Friesen, 1978), a quantitative method for describing facial movements, our analysis used action units (AUs), which are partial units of facial expressions. AUs were automatically extracted with OpenFace (Baltrušaitis et al., 2016), a facial-expression analysis tool. We analyzed the 17 AUs listed in Table 5. The extracted AUs were represented as intensity values on a continuous scale from 0 to 5. We extracted the 17 types of AUs for each facial part from the video and processed each AU independently, as follows:
a) The AUs were extracted from the videos.
b) OpenFace calculated the reliability of face recognition for each frame; frames with less than 70% reliability were removed.
c) The virtual agent and user turns were separated, and the virtual agent turns were ignored.
d) The mean AU intensity over the frames of each user turn was calculated. Since there are 14 user turns (Table 1), we obtained 14 mean values per AU per user.

Change of AU = (AU at the beginning) − (AU at the end). (2)

We analyzed the correlation between the mood changes and the AU changes to clarify the relationship between these relative fluctuations. The AU values at the beginning and the end in Equation (2) are relativized within the individual; therefore, the AU change in Equation (2) is a relative scale.
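The per-AU processing steps and Equation (2) can be sketched with pandas as follows. The column and turn labels are assumptions of this sketch that loosely mimic OpenFace's per-frame output, and the frame values are synthetic.

```python
# Sketch of the per-AU processing steps (a)-(d) on a synthetic frame table.
# "AU01_r" mimics OpenFace's intensity columns; turn labels are assumptions.
import pandas as pd

frames = pd.DataFrame({
    "confidence": [0.95, 0.60, 0.90, 0.92, 0.88],
    "turn":       ["user1", "user1", "agent", "user14", "user14"],
    "AU01_r":     [1.0, 4.0, 2.0, 3.0, 1.0],
})

# (b) drop frames with face-tracking confidence below 70%
frames = frames[frames["confidence"] >= 0.70]
# (c) keep only user turns, ignoring the virtual agent's turns
frames = frames[frames["turn"].str.startswith("user")]
# (d) mean AU intensity per user turn
per_turn = frames.groupby("turn")["AU01_r"].mean()

# Equation (2): AU value at the first turn minus the value at the last turn
au_change = per_turn["user1"] - per_turn["user14"]
print(au_change)
```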
A positive correlation indicates that the greater the mood change, the more that AU's expression decreased from the beginning to the end of the session; a negative correlation indicates that the greater the mood change, the more that AU's expression increased.

EVALUATING QUESTIONS FROM THE VIRTUAL AGENT
This section describes our experiment using a virtual agent system. We investigated the effect of the evaluation of automatic thoughts in human-agent interaction and analyzed the relationship between the number of questions that users felt useful and their mood changes.
In the experiments in this section, we improved the system configuration and the scenario of the virtual agent. We employed a 3D animated virtual agent named Shibata (Figure 2) built on the virtual agent platform Greta (Niewiadomski et al., 2009), which we modified in-house for Japanese text-to-speech, lip-synching, and Japanese-style animation. This virtual agent is autonomous and outputs its questions by voice; the user answers in natural language through a headset microphone. Figure 3 shows the configuration of our virtual agent. Its interface consists of a camera, a microphone, a speaker, and a display. The camera is used only for recording facial expressions. The microphone is used for voice recording, turn changes, and question selection. An interaction alternates one question from the virtual agent with one user answer. After the virtual agent asks a question, a speech recognition API is activated while the user answers. When the user stops talking for a certain period, the speech recognition API automatically recognizes the end of the turn. After obtaining the user's answer, the system selects the next question from the scenario based on the question management rule. The question text is played from the speaker as the voice of the virtual agent through text-to-speech. After listening to the virtual agent's question, the user answers it. The display shows the virtual agent from the chest up. The facial expressions and postures of the virtual agent are its default settings; it performs no communicative facial expressions or gestures. The lip model and lip blender generate lip movements that match the voice of the virtual agent. Table 6 shows our experimental dialogue scenario. Patients sometimes fail to identify automatic thoughts in cognitive restructuring because this concept is not conventionally recognized (Beck and Beck, 2011).
We began by examining the percentage of users who fail to identify automatic thoughts in cognitive restructuring with virtual agents, and then automated the guidance toward identifying a thought with the following procedure. Dialogue control with the classifier is performed at Q6 of Table 6.

Questioning to Identify Automatic Thoughts
The experimental scenario (Table 6) was created based on the data collection scenario (Table 1). The most crucial difference is that the experimental scenario was modified to enable a comparison with and without "questions for evaluating automatic thoughts." In Table 1, words of concern (e.g., Q12 and Q13) were included between Q5 (identification of an automatic thought) and Q14 (mood score after change). In contrast, Table 6 includes only questions for evaluating an automatic thought between Q5 and Q14. The questions for evaluating an automatic thought in Table 6 were all taken from the same source (Beck and Beck, 2011).
1. The classifier determines whether the user identified a thought.
2. If successful: move to the next item.
3. If unsuccessful: ask a question that guides identification of the thought and ask the user to answer again.
4. Judge the new answer with the automatic thought classifier; if it is unsuccessful again, provide another hint and repeat.
If the user fails to identify the automatic thought, the virtual agent provides a hint and asks the user to answer again, e.g., "So, what did you think about yourself in that situation?" Six hints are available, so up to six unsuccessful attempts to identify the automatic thought can be handled. The six hints are given to all users in the same order. If a user fails seven times, the dialogue proceeds to the next item. The hints are taken from Beck and Beck (2011).
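The hint-driven guidance loop above can be sketched as follows. Here `ask_user`, `classify`, and the hint list are hypothetical stand-ins for the speech interface, the SVM classifier, and the six hints from Beck and Beck (2011); the real system drives this loop through speech recognition rather than function calls.

```python
# Hedged sketch of the guidance loop (steps 1-4 above): re-ask with hints
# until the classifier accepts the answer or the hints run out.
def guide_identification(ask_user, classify, hints):
    answer = ask_user("What went through your mind in that situation?")
    for hint in hints:            # one retry per hint, in a fixed order
        if classify(answer):      # successful: move to the next item
            return answer
        answer = ask_user(hint)   # unsuccessful: give a hint, re-ask
    # after exhausting all hints, proceed to the next item regardless
    return answer if classify(answer) else None
```

With the study's six hints, this allows the seven attempts described above (the initial answer plus six hinted retries) before the dialogue moves on.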

Questioning to Evaluate and Modify Automatic Thoughts
We evaluated automatic thoughts using the nine questions (Q6-Q14) shown in Table 6. We prepared two dialogue scenarios: one that asked all of the questions and another that omitted questions Q7-Q13. Our experimental hypothesis is that evaluating automatic thoughts improves a user's mood. We prepared these seven questions to guide the evaluation of automatic thoughts and asked them in a round-robin style, assuming that at least one of them would be effective for each user. To test the hypothesis, we compared two scenarios (Figure 4): Group A received all of the questions, and Group B received the scenario with Q7-Q13 omitted.

FIGURE 3 | Configuration diagram of our virtual agent in the experiment.
Frontiers in Computer Science | www.frontiersin.org

Participants
We recruited 32 graduate students as users: 19 in Group A and 13 in Group B. The research ethics committee of the Nara Institute of Science and Technology reviewed and approved this experiment (reference number: 2019-I-24-2). Written informed consent was obtained from all the users. We also confirmed with K6 that none showed a tendency toward severe mental illness. The K6 score for Group A was M = 4.74, SD = 4.48, and for Group B it was M = 4.46, SD = 2.57. There was no significant difference in the K6 scores between the two groups.

Procedure
The same procedure was applied to both groups. First, the users read a publicly available leaflet that explains CBT and is designed for both clinical and general audiences. Then they interacted with the virtual agent.

RESULTS

Figure 5 shows the improvement of the negative moods in both groups. The questioning with which the automatic thoughts were evaluated significantly affected the reduction of the negative mood scores (p = 0.036, Hedge's g = 0.769). In Group A, the mood score at the start was M = 55.0, SD = 23.3; the mood score at the end was M = 34.1, SD = 18.6; and the mood change was M = 0.38, SD = 0.22. In Group B, the mood score at the start was M = 42.8, SD = 26.5; the mood score at the end was M = 32.2, SD = 18.0; and the mood change was M = 0.13, SD = 0.32. We found no significant difference in the mood intensity between the two groups at either the beginning or the end of the study.

Questions Deemed Helpful by Participants
At the end of each dialogue, the users in Group A were given a questionnaire containing the virtual agent's questions (Q6-Q14 in Table 6) and rated each question as either helpful or unhelpful for discovering new thoughts.
FIGURE 5 | Boxplot of changes in mood scores between Groups A and B. The symbol * means the significance level α = 0.05 in the unpaired t-test.
FIGURE 6 | Distribution of helpful questions (n = 19, multiple answers possible).

Figure 6 shows the distribution of the helpful questions. Q12 and Q13 were rated the most helpful by the users; Q9 and Q10 were rated the least helpful. One reason for this result may be the difference in the intentions of the questions: Q9 and Q10 asked the users to dig deeper into their automatic thoughts, whereas Q12 and Q13 examined automatic thoughts from a new perspective. Perhaps questions that change the user's perspective contribute more to modifying automatic thoughts.

Correlation Between Mood Scores and Number of Helpful Questions
We investigated the correlation between the mood changes and the number of helpful questions for Group A in the evaluation of automatic thoughts. Immediately after each interaction, the users filled out a questionnaire and answered whether they felt each question was helpful. We calculated the correlation between the number of helpful questions for each user and the mood changes using Spearman's rank correlation coefficient. The results showed a significant positive correlation between the number of helpful questions and the mood changes (ρ = 0.63, p = 0.0035; Figure 7).
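The analysis above can be reproduced in outline with a plain-Python Spearman's ρ, computed as the Pearson correlation of the ranks (no tie handling, for brevity). The per-user counts and mood changes below are illustrative values, not the study's data.

```python
# Sketch of the reported analysis: Spearman's rank correlation between the
# number of questions each user rated helpful and that user's mood change.
def ranks(xs):
    """Rank positions 1..n of each value (assumes no ties)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

helpful_counts = [1, 2, 3, 5, 6, 7, 4]   # questions rated helpful per user
mood_changes   = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.30]
print(spearman_rho(helpful_counts, mood_changes))  # ≈ 1.0 for this monotone toy data
```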

DISCUSSION
The dialogues of the virtual agent using Table 1 significantly improved the subjects' negative moods. Our work is the first to analyze how facial movements, which are related to mood changes, are affected by interaction with a virtual agent. In addition, we identified the AUs that are probably influenced by mood changes. Our finding is expected to contribute to research on virtual agents that recognize facial movements and the basic research of human facial expression analysis. The correlation of facial action units and mood changes implies that cognitive restructuring care with a virtual agent can change users' moods, and such changes appear as facial expressions in proportion to mood improvements.
However, since it remains unclear whether virtual agents are better suited to promoting the expression of moods, analyzing facial expressions across different dialogue styles is a step for our future work. Since interpreting the reasons for these facial movements was complicated, we also reviewed the recorded videos. Those who displayed relatively large mood changes appeared to think more deeply when answering the mood score at Q14 than at Q3 in Table 1. We therefore assumed that a contemplative attitude was reflected in raised eyelids and closed mouths. Although we analyzed individual AUs, the information was insufficient to conclude that their changes were due to mood changes. To recognize mood changes more reliably, both facial expressions and such multimodal behavioral indicators as voice and gestures must be analyzed. We have not yet fully ruled out that the correlation between the mood and AU changes is due to factors other than mood.
This study also showed that using questions to evaluate automatic thoughts improved users' moods. Previous studies on CBT interactions (e.g., Fitzpatrick et al., 2017) generally investigated users' moods and depressive tendencies by comparing users with and without virtual agents. However, the factors that influence the effectiveness of such dialogues have not been sufficiently examined. We found that questioning that evaluates automatic thoughts is an important factor in improving negative moods. Furthermore, the more questions were helpful to the users, the more their moods improved. Asking questions seems to provide new information to users; if this information gathering worked well, the users' moods probably improved. Therefore, our results suggest the effectiveness of a virtual agent that guides the modification of automatic thoughts.
Finally, our experiment with a virtual agent that helps users identify automatic thoughts showed that the evaluation questions improved moods. Among the questions for evaluating automatic thoughts, those intended to find different viewpoints tended to be rated as more helpful for changing thoughts, and the more questions a user found helpful, the more that user's mood improved. Our results suggest that a virtual agent's effectiveness can be improved by predicting helpful questions.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not readily available because the dataset created in this study is private. Requests to access the datasets should be directed to Kazuhiro Shidara, shidara.kazuhiro.sc5@is.naist.jp.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the research ethics committee of the Nara Institute of Science and Technology (reference number: 2019-I-24-2). The patients/participants provided their written informed consent to participate in this study.