<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="review-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Artif. Intell.</journal-id>
<journal-title>Frontiers in Artificial Intelligence</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Artif. Intell.</abbrev-journal-title>
<issn pub-type="epub">2624-8212</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/frai.2023.1229805</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Artificial Intelligence</subject>
<subj-group>
<subject>Review</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A review of the explainability and safety of conversational agents for mental health to identify avenues for improvement</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Sarkar</surname> <given-names>Surjodeep</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2325772/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Gaur</surname> <given-names>Manas</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2077752/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Chen</surname> <given-names>Lujie Karen</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1535140/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Garg</surname> <given-names>Muskan</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Srivastava</surname> <given-names>Biplav</given-names></name>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1061803/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County</institution>, <addr-line>Baltimore, MD</addr-line>, <country>United States</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Information Systems, University of Maryland, Baltimore County</institution>, <addr-line>Baltimore, MD</addr-line>, <country>United States</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of AI &#x00026; Informatics, Mayo Clinic</institution>, <addr-line>Rochester, MN</addr-line>, <country>United States</country></aff>
<aff id="aff4"><sup>4</sup><institution>AI Institute, University of South Carolina</institution>, <addr-line>Columbia, SC</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Devendra Singh Dhami, Darmstadt University of Technology, Germany</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Shaina Raza, University of Toronto, Canada; Chiradeep Roy, Adobe Systems, United States; Nelson Rangel-Valdez, Instituto Tecnol&#x000F3;gico de Ciudad Madero, Mexico</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Surjodeep Sarkar <email>ssarkar1&#x00040;umbc.edu</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>12</day>
<month>10</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>6</volume>
<elocation-id>1229805</elocation-id>
<history>
<date date-type="received">
<day>27</day>
<month>05</month>
<year>2023</year>
</date>
<date date-type="accepted">
<day>29</day>
<month>08</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2023 Sarkar, Gaur, Chen, Garg and Srivastava.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Sarkar, Gaur, Chen, Garg and Srivastava</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license></permissions>
<abstract>
<p>Virtual Mental Health Assistants (VMHAs) continuously evolve to support the overloaded global healthcare system, which receives approximately 60 million primary care visits and 6 million emergency room visits annually. These systems, developed by clinical psychologists, psychiatrists, and AI researchers, are designed to aid in Cognitive Behavioral Therapy (CBT). The main focus of VMHAs is to provide relevant information to mental health professionals (MHPs) and engage in meaningful conversations to support individuals with mental health conditions. However, certain gaps prevent VMHAs from fully delivering on their promise during active communications. One of the gaps is their inability to explain their decisions to patients and MHPs, making conversations less trustworthy. Additionally, VMHAs can be prone to providing unsafe responses to patient queries, further undermining their reliability. In this review, we assess the current state of VMHAs on the grounds of user-level explainability and safety, a set of desired properties for the broader adoption of VMHAs. This includes the examination of ChatGPT, a conversational agent built on the GPT-3.5 and GPT-4 models, which has been proposed for use in providing mental health services. By harnessing the collaborative and impactful contributions of AI, natural language processing, and the MHP community, the review identifies opportunities for technological progress in VMHAs to ensure their capabilities include explainable and safe behaviors. It also emphasizes the importance of measures to guarantee that these advancements align with the promise of fostering trustworthy conversations.</p></abstract>
<kwd-group>
<kwd>explainable AI</kwd>
<kwd>safety</kwd>
<kwd>conversational AI</kwd>
<kwd>evaluation metrics</kwd>
<kwd>knowledge-infused learning</kwd>
<kwd>mental health</kwd>
</kwd-group>
<counts>
<fig-count count="4"/>
<table-count count="2"/>
<equation-count count="0"/>
<ref-count count="142"/>
<page-count count="14"/>
<word-count count="11618"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Machine Learning and Artificial Intelligence</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Mental illness is a global concern, constituting a significant cause of distress in people&#x00027;s lives and impacting society&#x00027;s health and well-being, thereby projecting serious challenges for mental health professionals (MHPs) (Zhang et al., <xref ref-type="bibr" rid="B140">2022</xref>). According to the National Survey on Drug Use and Health, nearly one in five US adults lives with a mental illness (52.9 million in 2020) (SAMHSA, <xref ref-type="bibr" rid="B108">2020</xref>). The reports released in August 2021 indicate that <italic>1.6 million people</italic> in England were on waiting lists to seek professional help with mental healthcare (Campbell, <xref ref-type="bibr" rid="B12">2021</xref>). The disproportionate increase in the number of patients in comparison to MHPs made it necessary to employ various methods for informative healthcare. These methods included (a) public health forums such as Dialogue4Health, (b) online communities such as the r/depression subreddit on Reddit, (c) Talklife (Kruzan, <xref ref-type="bibr" rid="B68">2019</xref>), and (d) Virtual Mental Health Assistants (VMHAs) (Fitzpatrick et al., <xref ref-type="bibr" rid="B35">2017</xref>). By operating anonymously, these platforms (a, b, c) effectively eliminated the psychological stigma associated with seeking help, which had previously deterred patients from consulting an MHP (Hyman, <xref ref-type="bibr" rid="B56">2008</xref>). Furthermore, the absence of alternative sources for interpersonal interactions led to the necessity of developing Virtual Mental Health Assistants (VMHAs) (Seitz et al., <xref ref-type="bibr" rid="B109">2022</xref>).</p>
<p><bold>VMHAs</bold>: Virtual Mental Health Assistants (VMHAs) are AI-based agents designed to provide emotional support and assist in mental health-related conversations. Their primary objective is to engage in organized conversation flows to assess users&#x00027; mental health issues and gather details about the causes, symptoms, treatment options, and relevant medications. The information collected is subsequently shared with MHPs, to provide insights into the user&#x00027;s condition (Hartmann et al., <xref ref-type="bibr" rid="B50">2019</xref>). VMHAs are a valuable and distinct addition to the mental health support landscape, offering several advantages, including scalability, over conventional methods such as public health forums, online communities, and platforms such as Talklife. VMHAs can provide personalized support (Abd-Alrazaq et al., <xref ref-type="bibr" rid="B1">2021</xref>), real-time assistance (Zielasek et al., <xref ref-type="bibr" rid="B141">2022</xref>), anonymity and privacy (Sweeney et al., <xref ref-type="bibr" rid="B122">2021</xref>), complement human support with continuous availability (Ahmad et al., <xref ref-type="bibr" rid="B2">2022</xref>), and patient health-generated data-driven insight (Sheth et al., <xref ref-type="bibr" rid="B114">2019</xref>).</p>
<p>Despite the proliferation of research at the intersection of clinical psychology, AI, and NLP, VMHAs missed an opportunity to serve as life-saving contextualized, personalized, and reliable decision support during the <italic>Apollo</italic> moment of COVID-19 (Czeisler et al., <xref ref-type="bibr" rid="B22">2020</xref>; Srivastava, <xref ref-type="bibr" rid="B118">2021</xref>). During this critical period, which spanned COVID-19&#x00027;s first and second waves, VMHAs could have assisted users in sharing their conditions, reducing their stress levels, and enabling MHPs to provide high-quality care. However, operating only as simple information agents that suggest meditation, relaxation exercises, or positive affirmations, they fell short of bridging the gap between monitoring the mental health of individuals and the need for in-person visits. As a result, trust in the use of VMHAs diminished.</p>
<p><bold>Trustworthiness in VMHAs</bold>: In human interactions, <italic>Trust</italic> is built through consistent and reliable behavior, open communication, and mutual understanding. It involves a willingness to rely on someone or something based on their perceived competence, integrity, and reliability. Trustworthiness is often established and reinforced over time through interactions and experiences. In the context of AI, trustworthiness takes on new dimensions and considerations. Trustworthiness has traditionally been studied within human interactions. However, as the collaboration between AI systems and humans intensifies, trustworthiness is gaining greater significance in the AI context, particularly in sensitive domains such as mental health. To this end, growing concerns about (misplaced) <italic>trust</italic> in <italic>VMHAs</italic> for <italic>Social Media</italic> (tackling mental health) hamper the adoption of AI techniques during emergencies such as COVID-19 (Srivastava, <xref ref-type="bibr" rid="B118">2021</xref>). This inadequacy has prompted the community to develop a question-answering dataset for mental health during COVID-19, aiming to train more advanced VMHAs (Raza et al., <xref ref-type="bibr" rid="B98">2022</xref>). The recent surge in the use of ChatGPT, in particular for mental health, raises the concern that it provides crucial personalized advice without clinical explanation, which can hurt users&#x00027; <italic>safety</italic>, and thus <italic>trust</italic> (Sallam, <xref ref-type="bibr" rid="B107">2023</xref>). Varshney (<xref ref-type="bibr" rid="B126">2021</xref>) identifies support for human interaction and explainable alignment with human values as essential for trust in AI systems. 
To holistically contribute toward <italic>trustworthy</italic> behavior in a conversational approach in mental health, there is a need to critically examine VMHAs, as a prospective tool to handle safety and explainability.</p>
<p>This is the first comprehensive examination of VMHAs, focusing on their application from the perspective of end-users, including mental health professionals and patients, who look for both understandable outcomes and secure interactions. The review addresses five main research questions as follows: (i) Defining the concepts of explainability and safety in VMHAs. (ii) Assessing the current capabilities and limitations of VMHAs. (iii) Analyzing the current state of AI and the challenges in supporting VMHAs. (iv) Exploring potential functionalities in VMHAs that patients seek as alternatives to existing solutions. (v) Identifying necessary evaluation changes regarding explainability, safety, and trust. <xref ref-type="fig" rid="F1">Figure 1</xref> visually presents the scope of the review, explicitly designed to emphasize the generative capabilities of current AI models, exemplified by the remarkable ChatGPT. However, this progress was made without keeping in sight two concerns related to safety and explainability: fabrication and hallucination. While these problems already exist in smaller language models, they are even more pronounced in larger ones. This concern motivated us to create a functional taxonomy for language models, with two distinct directions of focus: (a) <italic>Low-level abstraction</italic>, which centers around analyzing linguistic cues in the data, and (b) <italic>High-level abstraction</italic>, which concentrates on addressing the end-user&#x00027;s primary interests. Research in category (a) has been extensively conducted on social media. However, there is a lack of focus on active communication, which is precisely the area of interest in this survey. As for high-level abstraction, current approaches such as LIME (Ribeiro et al., <xref ref-type="bibr" rid="B100">2016</xref>) have been employed, but it is crucial to explore further, considering the different types of users.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Functional taxonomy of mental health conversations. The blocks with black outlines define the scope of this review, and the dotted red line highlights the growing emphasis on question/response generation in mental health conversations between VMHAs and users with mental health conditions. A high-level discourse analysis demands focus on user-level explainability and safety, whereas a low-level analysis focuses on achieving clinically grounded active communications. The light gray blocks and text present the work in the past and are referred to in the review.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1229805-g0001.tif"/>
</fig>
<p>Achieving these goals in VMHAs demands incorporating clinical knowledge, such as clinical practice guidelines and well-defined evaluation criteria. For instance, <xref ref-type="fig" rid="F2">Figure 2</xref> shows contextualization in a VMHA while generating questions and responses. Furthermore, it requires VMHAs to engage in <italic>active communication</italic>, which is needed to motivate users to keep using VMHA services. MHPs and government entities have advocated this functionality as a way to address the growing patient population and the limited number of healthcare providers (Cheng and Jiang, <xref ref-type="bibr" rid="B14">2020</xref>).</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p><bold>(Left)</bold> The results achieved by current VMHAs such as Woebot, Wysa, and general-purpose chatbots such as ChatGPT. <bold>(Right)</bold> An example of an ideal VMHA: a knowledge-driven conversational agent designed for mental health support. This new VMHA utilizes questions based on the Patient Health Questionnaire-9 (PHQ-9) to facilitate a smooth and meaningful conversation about mental health. By incorporating clinical knowledge, the agent can identify signs of mental disturbance in the user and notify MHPs appropriately.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1229805-g0002.tif"/>
</fig></sec>
<sec id="s2">
<title>2. Scope of survey</title>
<p>Previous data-driven research in mental health has examined social media to identify fine-grained cues informing the mental health conditions of an individual and, in turn, has developed datasets (Uban et al., <xref ref-type="bibr" rid="B125">2021</xref>). These datasets capture authentic conversations from the real world and can be used in training VMHAs to screen users&#x00027; mental health conditions. The current datasets typically have a foundation in psychology but are crowd-sourced rather than explicitly derived from clinically grounded guidelines of psychiatrists. We argue that if semantic enhancement of VMHAs with clinical knowledge and associated guidelines remains under-explored, VMHAs may miss the hidden mental states in a given narrative, which are an essential component of question generation (Gaur et al., <xref ref-type="bibr" rid="B39">2022a</xref>; Gupta et al., <xref ref-type="bibr" rid="B46">2022</xref>). To ensure that VMHAs are both safe and understandable, these datasets need to be semantically enhanced with clinically grounded knowledge [e.g., MedChatbot (Kazi et al., <xref ref-type="bibr" rid="B61">2012</xref>)] or clinical practice guidelines [e.g., Patient Health Questionnaire (PHQ-9) (Kroenke et al., <xref ref-type="bibr" rid="B67">2001</xref>)]. In this section, we explore the state of research in explainability and safety in conversational systems to ensure trust (Hoffman et al., <xref ref-type="bibr" rid="B54">2018</xref>).</p>
<sec>
<title>2.1. Explanation</title>
<p>Conversations in AI are possible with large language models (LLMs) [e.g., GPT-3 (Floridi and Chiriatti, <xref ref-type="bibr" rid="B36">2020</xref>), ChatGPT (Leiter et al., <xref ref-type="bibr" rid="B72">2023</xref>)], which are established as state-of-the-art models for developing intelligent agents that chat with users by generating human-like questions or responses. In most instances, the output generated by LLMs tends to be grammatically accurate, but it often lacks factual accuracy or clarity. To this end, Bommasani et al. (<xref ref-type="bibr" rid="B9">2021</xref>) report hallucination and harmful question generation as unexpected behaviors of such LLMs, which other authors refer to as black box models (Rai, <xref ref-type="bibr" rid="B96">2020</xref>). Bommasani et al. (<xref ref-type="bibr" rid="B9">2021</xref>) further characterize <italic>hallucination</italic> as generated content that <italic>deviates</italic> significantly from the subject matter or is unreasonable. Recently, <italic>Replika</italic>, a VMHA augmented with GPT-3, provided meditative suggestions to a user expressing self-harm tendencies (Ineqe, <xref ref-type="bibr" rid="B57">2022</xref>). The absence of any link to a factual knowledge source that can help LLMs reason about their generations introduces what is known as the &#x0201C;<italic>black box</italic>&#x0201D; effect (Rudin, <xref ref-type="bibr" rid="B106">2019</xref>). The consequences of the black box effect in LLMs are more concerning than their utility, particularly in mental health. For example, <xref ref-type="fig" rid="F3">Figure 3</xref> presents a scenario where ChatGPT advises the user about <italic>toxicity in drugs</italic>, which may have a negative consequence. The above analysis supports the critical need for an explainable approach to the decision-making mechanism of VMHAs. 
According to Weick (<xref ref-type="bibr" rid="B130">1995</xref>), explanations are human-centered sentences that signify the reason or justification behind an action and are understandable to a human expert. While there are various types of explanations, it is essential to focus on user-level explainability (Bhatt et al., <xref ref-type="bibr" rid="B7">2020</xref>; Longo et al., <xref ref-type="bibr" rid="B80">2020</xref>) rather than system-level explainability, which is exemplified by LIME (Ribeiro et al., <xref ref-type="bibr" rid="B100">2016</xref>), SHAP (Lundberg and Lee, <xref ref-type="bibr" rid="B81">2017</xref>), and Integrated Gradients (Sundararajan et al., <xref ref-type="bibr" rid="B121">2017</xref>). The users interacting with VMHAs may need more systematic information than just the decision itself. Thus, this survey focuses more on &#x0201C;<italic>User-level Explainability</italic>&#x0201D;.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>A conversational scenario in which a user asks a query with multiple symptoms. Left is a set of generated questions obtained by repeatedly prompting ChatGPT. Right is a generation from ALLEVIATE, a knowledge-infused (KI) conversational agent with access to PHQ-9 and clinical knowledge from Mayo Clinic.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1229805-g0003.tif"/>
</fig>
<p><bold>User-level explainability (UsEx)</bold>: The sensitive nature of VMHAs makes <italic>safety</italic> a significant concern for conversational systems, because an unsafe response may trigger a negative consequence. For instance, <xref ref-type="fig" rid="F2">Figure 2</xref> presents a real-world query from a user, which was common during the COVID-19 recession. In response to the query, the existing VMHAs Woebot (Fitzpatrick et al., <xref ref-type="bibr" rid="B35">2017</xref>), Wysa (Inkster et al., <xref ref-type="bibr" rid="B58">2018</xref>), and ChatGPT (Leiter et al., <xref ref-type="bibr" rid="B72">2023</xref>) initiated a responsive conversation without focusing on the context (e.g., connecting mental health with its symptoms). As a result, we found assumptive questions (e.g., about anxiety) and responses from Wysa, Woebot, and ChatGPT with no association with a clinical reference or clinical support. On the other hand, the desired VMHA (a) should capture the relationship between the user query and expert questionnaires and (b) should tailor the response to reflect the user&#x00027;s concerns (e.g., <italic>frustrating</italic> and <italic>disheartening</italic>) about the <italic>long-term unemployment</italic>, which is linked to <italic>mental health</italic> and <italic>immediate user help</italic>.</p>
<boxed-text id="Box1">
<title>User-level Explainability</title>
<p>UsEx refers to an AI system&#x00027;s ability to explain its outputs to users when requested. The explanations are given once the AI system has made its decisions or predictions. They are intended to assist users in comprehending the logic behind the decisions.</p>
</boxed-text>
<p>UsEx goes beyond simply providing a justification or reason for the AI&#x00027;s output; it aims to provide traceable links to real-world entities and definitions (Gaur et al., <xref ref-type="bibr" rid="B39">2022a</xref>).</p></sec>
<sec>
<title>2.2. Safety</title>
<p>VMHAs must prioritize safety while remaining comprehensible to avoid undesirable outcomes. One way to accomplish this is by modifying VMHA functionality to meet the standards outlined by MHPs (Koulouri et al., <xref ref-type="bibr" rid="B65">2022</xref>). <xref ref-type="fig" rid="F3">Figure 3</xref> displays a conversation excerpt exemplifying how a VMHA, equipped with access to clinical practice guidelines such as PHQ-9, not only generates safe follow-up questions but also establishes connections between the generated questions and those in PHQ-9, showcasing UsEx. Such guidelines act as standards that enable VMHAs to exercise control over content generation, preventing the generation of false or unsafe information. Several instances of unsafe chatbot behavior have surfaced, such as:</p>
<list list-type="bullet">
<list-item><p>Generating offensive content, also known as the <italic>Instigator (Tay) Effect</italic>. It describes the tendency of a conversational agent to behave like the Microsoft Tay chatbot (Wolf et al., <xref ref-type="bibr" rid="B135">2017</xref>), which began producing racist content after learning from the internet.</p></list-item>
<list-item><p>The <italic>YEA-SAYER (ELIZA)</italic> effect is defined as an agreeable response from a conversational agent to an offensive input from the user (Dinan et al., <xref ref-type="bibr" rid="B27">2022</xref>). People have been shown to be particularly forthcoming about their mental health problems while interacting with conversational agents, which may increase the danger of &#x0201C;<italic>agreeing with those user utterances that imply self-harm</italic>&#x0201D;.</p></list-item>
<list-item><p>The <italic>Imposter</italic> effect applies to VMHAs that tend to respond <italic>inappropriately</italic> in sensitive scenarios (Dinan et al., <xref ref-type="bibr" rid="B29">2021</xref>). To overcome the imposter effect, DeepMind designed <italic>Sparrow</italic>, a conversational agent that responsibly leverages live Google search to talk with users (Gupta et al., <xref ref-type="bibr" rid="B45">2022</xref>). The agent generates answers by following <italic>23 rules</italic> determined by researchers, such as <italic>not offering financial advice, not making threatening statements</italic>, and <italic>not claiming to be a person</italic>.</p></list-item>
</list>
<p>In mental health, clinical specifications can serve as a substitute for rules to confirm that the AI model is functioning within <italic>safe limits</italic>. Sources for such specifications, other than PHQ-9, are as follows: Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) (Donnelly et al., <xref ref-type="bibr" rid="B31">2006</xref>), International Classification of Diseases (ICD-10) (Quan et al., <xref ref-type="bibr" rid="B94">2005</xref>), Diagnostic and Statistical Manual of Mental Disorders (DSM-5) (Regier et al., <xref ref-type="bibr" rid="B99">2013</xref>), Structured Clinical Interview for DSM-5 (SCID) (First, <xref ref-type="bibr" rid="B34">2014</xref>), and clinical questionnaire-guided lexicons. Hennemann et al. (<xref ref-type="bibr" rid="B53">2022</xref>) perform a comparative study on psychotherapy of outpatients in mental health, in which the AI model used to build a VMHA aligns with clinical guidelines so that domain experts can easily understand it through UsEx.</p></sec></sec>
<sec id="s3">
<title>3. Knowledge-infused learning for mental health conversations</title>
<p>Machine-readable knowledge, also referred to as Knowledge Graphs (KGs), is categorized into five forms as follows: (a) lexical and linguistic, (b) general-purpose [e.g., Wikipedia, Wikidata (Vrande&#x0010D;i&#x00107; and Kr&#x000F6;tzsch, <xref ref-type="bibr" rid="B127">2014</xref>)], (c) commonsense [e.g., ConceptNet (Speer et al., <xref ref-type="bibr" rid="B117">2017</xref>)], (d) domain-specific [Unified Medical Language System (Bodenreider, <xref ref-type="bibr" rid="B8">2004</xref>)], and (e) procedural or process-oriented (Sheth et al., <xref ref-type="bibr" rid="B113">2022</xref>). Such knowledge can help AI focus on context and perform actions connected to the knowledge used.</p>
<boxed-text id="Box2">
<title>Knowledge-Infused Learning (KIL)</title>
<p>KIL is a paradigm within the field of AI that aims to address the limitations of current black-box AI systems by incorporating broader forms of knowledge into the learning process. The concept of KIL involves injecting external knowledge, such as domain-specific rules, ontologies, or expert knowledge, into the learning process to enhance the AI model&#x00027;s performance and achieve UsEx and safety.</p>
</boxed-text>
<p>We categorize the KIL-driven efforts at the intersection of conversational AI and mental health into two categories as follows:</p>
<sec>
<title>3.1. Knowledge graph-guided conversations</title>
<p>Question answering using KGs is seeing tremendous interest from the AI and NLP communities through various technological improvements in query understanding, query rewriting, knowledge retrieval, question generation, response shaping, and others (Wang et al., <xref ref-type="bibr" rid="B129">2017</xref>). For example, the HEAL KG developed by Welivita and Pu (<xref ref-type="bibr" rid="B133">2022b</xref>) allows LLMs to enhance their empathetic responses by incorporating empathy, expectations, affect, stressors, and feedback types from distressing conversations. By leveraging HEAL, the model identifies a suitable phrase from the user&#x00027;s query, effectively tailoring its response. EmoKG is another KG that connects BioPortal, SNOMED-CT, RxNORM, MedDRA, and emotion ontologies to have a conversation with a user and boost their mental health through food recommendations (Gyrard and Boudaoud, <xref ref-type="bibr" rid="B47">2022</xref>). Similarly, Cao et al. (<xref ref-type="bibr" rid="B13">2020</xref>) developed a suicide KG to train conversational agents capable of detecting whether the user involved in the interaction shows signs of suicidal tendencies (e.g., relationship issues, family problems) or exhibits suicide risk indicators (e.g., suicidal thoughts, behaviors, or attempts) before providing a response or asking further questions. As the conversation unfolds, it becomes necessary to continually update the KG to ensure safety, which holds particular significance in VMHAs. Patients may experience varying levels of mental health conditions due to comorbidities and the evolving severity of their condition. Additionally, contextual dynamics may shift during multiple conversations with healthcare providers. 
Nevertheless, the augmentation of KG demands designing new metrics to examine the safety and user-level explainability through proxy measures such as logical coherence, semantic relations, and others (shown in Section 6.1 and Gaur et al., <xref ref-type="bibr" rid="B40">2022b</xref>).</p></sec>
<sec>
<title>3.2. Lexicon or process-guided conversations</title>
<p>Lexicons in mental health resolve ambiguities in human language. For instance, the two sentences &#x0201C;I am feeling on edge&#x0201D; and &#x0201C;I am feeling anxious&#x0201D; are similar in meaning; a lexicon that lists &#x0201C;Anxiety&#x0201D; as a category and &#x0201C;feeling on edge&#x0201D; as one of its concepts makes this link explicit. Yazdavar et al. (<xref ref-type="bibr" rid="B138">2017</xref>) created a PHQ-9 lexicon to clinically study realistic mental health conversations on social media. Roy et al. (<xref ref-type="bibr" rid="B105">2022a</xref>) leveraged PHQ-9 and SNOMED-CT lexicons to train a question-generating agent for paraphrasing questions in PHQ-9 to introduce <italic>Diversity in Generation</italic> <bold>(DiG)</bold> (Limsopatham and Collier, <xref ref-type="bibr" rid="B75">2016</xref>).</p>
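<p>As a minimal sketch of the category-concept lookup described above, the Python snippet below maps an utterance to lexicon categories by phrase matching. Only the &#x0201C;Anxiety&#x0201D; and &#x0201C;feeling on edge&#x0201D; pairing comes from the text; the remaining entries, the name LEXICON, and the function match_categories are hypothetical and do not reproduce the PHQ-9 lexicon of Yazdavar et al. or any other published resource.</p>
<preformat>
```python
# Hypothetical mini-lexicon: each category maps to concept phrases.
# Only the "Anxiety" / "feeling on edge" pair comes from the text;
# every other entry is an illustrative assumption.
LEXICON = {
    "Anxiety": ["feeling on edge", "feeling anxious"],
    "Depression": ["feeling down", "feeling hopeless"],
}


def match_categories(utterance, lexicon=LEXICON):
    """Return the sorted categories whose concept phrases occur in the utterance."""
    text = utterance.lower()
    return sorted(
        category
        for category, concepts in lexicon.items()
        if any(concept in text for concept in concepts)
    )


# Both surface forms resolve to the same category.
print(match_categories("I am feeling on edge."))
print(match_categories("I am feeling anxious."))
```
</preformat>
<p>Plain substring matching is used here only to keep the sketch short; a realistic system would likely need normalization, negation handling, and clinically curated concept lists.</p>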
<p>Using DiG, a VMHA can rephrase its questions to obtain a meaningful response from the user while maintaining engagement. The risk of user disengagement arises if the chatbot asks redundant questions or provides repetitive responses. Ensuring diversity in generation poses a natural challenge in open-domain conversations, but it becomes an unavoidable aspect in domain-specific conversations for VMHAs. One effective approach to address this issue is utilizing clinical practice guidelines and employing a fine-tuned LLM specifically designed for paraphrasing, enabling the generation of multiple varied questions (Roy et al., <xref ref-type="bibr" rid="B105">2022a</xref>).</p>
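<p>The redundancy concern can be pictured with a small sketch. The Python code below keeps a list of questions already asked and selects the first paraphrase that is not a near-duplicate of any of them. Token-level Jaccard overlap is used here only as a simple stand-in for a learned similarity measure; the function names, the example questions, and the threshold of 0.6 are all hypothetical choices and are not part of the cited work.</p>
<preformat><![CDATA[
```python
import re


def tokens(text):
    """Lowercase word tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def next_question(paraphrases, asked, threshold=0.6):
    """Pick the first paraphrase that is not a near-duplicate of an asked question.

    Returns None when every candidate is too similar to something already asked.
    """
    for candidate in paraphrases:
        if all(jaccard(candidate, previous) < threshold for previous in asked):
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical paraphrases of a single screening question.
    paraphrases = [
        "How often have you felt down or hopeless?",
        "How often have you felt down or hopeless lately?",
        "Have low moods been getting in the way of your day?",
    ]
    asked = ["How often have you felt down or hopeless?"]
    print(next_question(paraphrases, asked))
```
]]></preformat>
<p>In this toy setting, the first two paraphrases are rejected as near-duplicates of the question already asked, and the third is selected. A deployed VMHA would presumably replace the overlap measure with a semantic one and would also track whether the user actually answered the earlier question.</p>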
<p><italic>Clinical specifications</italic><xref ref-type="fn" rid="fn0001"><sup>1</sup></xref> include questionnaires such as PHQ-9 (depression), the Columbia Suicide Severity Rating Scale [C-SSRS; suicide (Posner et al., <xref ref-type="bibr" rid="B90">2008</xref>)], and Generalized Anxiety Disorder (GAD-7) (Coda-Forno et al., <xref ref-type="bibr" rid="B18">2023</xref>). They provide a sequence of questions that clinicians follow to interview individuals with mental health conditions. Such questions are safe and medically adapted. Noble et al. (<xref ref-type="bibr" rid="B86">2022</xref>) developed MIRA, a VMHA with knowledge of clinical specifications that responds meaningfully to queries on mental health issues and interpersonal needs during COVID-19. Miner et al. (<xref ref-type="bibr" rid="B85">2016</xref>) leveraged Relational Frame Theory (RFT), a form of procedural knowledge in clinical psychology, to capture events between conversations and label them as positive or negative. Furthermore, Chung et al. (<xref ref-type="bibr" rid="B15">2021</xref>) developed a KakaoTalk-based chatbot with a prenatal and postnatal care knowledge database of Korean clinical assessment questionnaires and responses, which enables the VMHA to conduct thoughtful and contextual conversations with users. As a rule of thumb, to facilitate DiG, VMHAs should perform the following steps: (a) identify whether the question asked received an appropriate response from the user, to avoid asking the same question again; (b) identify all the similar questions and similar responses that could be generated by a chatbot or received from the user; and (c) maintain a procedural mapping of questions and responses to minimize redundancy. 
Recently, techniques such as reinforcement learning (Gaur et al., <xref ref-type="bibr" rid="B40">2022b</xref>), conceptual flow-based question generation (Zhang et al., <xref ref-type="bibr" rid="B139">2019</xref>; Sheth et al., <xref ref-type="bibr" rid="B112">2021</xref>), and use of non-conversational context (Su et al., <xref ref-type="bibr" rid="B120">2020</xref>) (similar to the use of clinical practice guidelines) have been proposed.</p></sec></sec>
<sec id="s4">
<title>4. Safe and explainable language models in mental health</title>
<p>Safety in conversational AI has been a persistent concern, particularly for conversational language models such as Blenderbot and DialoGPT, as well as widely used conversational agents such as Xiaoice, Tay, and Siri. This concern was evident during the inaugural <italic>workshop on safety in conversational AI</italic> (Dinan, <xref ref-type="bibr" rid="B28">2020</xref>). Approximately 70% of workshop attendees doubted the ability of present-day conversational systems that rely on language models to produce safe responses (Dinan, <xref ref-type="bibr" rid="B28">2020</xref>). Subsequently, Xu et al. (<xref ref-type="bibr" rid="B137">2020</xref>) introduced the <italic>Bot-Adversarial Dialogue</italic> and <italic>Bot Baked In</italic> methods to improve <italic>safety</italic> in conversational systems. The study was performed on <italic>Blenderbot</italic>, which had received mixed opinions on safety, and <italic>DialoGPT</italic>, with the aim of enabling AI models to detect unsafe/safe utterances, avoid sensitive topics, and provide gender-neutral responses. The study utilizes knowledge from Wikipedia (for offensive words) and knowledge-powered methods to train conversational agents (Dinan et al., <xref ref-type="bibr" rid="B30">2018</xref>). Roy et al. (<xref ref-type="bibr" rid="B105">2022a</xref>) developed safety lexicons from PHQ-9 and GAD-7 for safe and explainable functioning of language models. The study showed an 85% improvement in safety across sequence-to-sequence and attention-based language models. In addition, explainability saw an uptick of 23% in terms of safety across the same language models. Similar results were noticed when PHQ-9 was used in explainable training of language models (Zirikly and Dredze, <xref ref-type="bibr" rid="B142">2022</xref>). Given these circumstances, VMHAs can efficiently integrate with clinical practice guidelines such as PHQ-9 and GAD-7, utilizing reinforcement learning. 
Techniques such as <italic>policy gradient-based learning</italic> can enhance the capability of chat systems in ensuring safe message generation. This can be achieved by employing specialized datasets for response reformation (Sharma et al., <xref ref-type="bibr" rid="B110">2021</xref>) or by utilizing tree-based rewards informed by procedural knowledge in the mental health field as suggested in the study by Roy et al. (<xref ref-type="bibr" rid="B103">2022b</xref>). By incorporating such knowledge, the decision-making ability of AI can be enhanced and better equipped to generate explanations that are more comprehensible to humans (Joyce et al., <xref ref-type="bibr" rid="B59">2023</xref>).</p>
<p><xref ref-type="fig" rid="F4">Figure 4</xref> presents a user-level explainability scenario, where (a) shows an explanation generated using GPT 3.5 but with specific words/phrases identified using knowledge, and (b) illustrates the explanation generated solely by GPT 3.5&#x00027;s own capabilities. In <xref ref-type="fig" rid="F4">Figure 4</xref>(a), the process generates two symbolic questions based on the relationship between pregnancy, symptoms, and causes found in the clinical knowledge sources UMLS and RxNorm. This approach utilizes clinical named entity recognition (Kocaman and Talby, <xref ref-type="bibr" rid="B64">2022</xref>) and neural keyphrase extraction (Kitaev and Klein, <xref ref-type="bibr" rid="B63">2018</xref>; Kulkarni et al., <xref ref-type="bibr" rid="B69">2022</xref>) to identify the highlighted phrases. These extracted phrases are then provided as prompts to GPT 3.5 along with the user&#x00027;s post, and the model is asked to produce an explanation. We used LangChain&#x00027;s prompting template to demonstrate user-level explainability (Harrison, <xref ref-type="bibr" rid="B49">2023</xref>).</p>
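<p>To illustrate the prompting step, the following Python sketch assembles a prompt from a user post and a list of extracted keyphrases. It uses plain string formatting rather than any particular prompting library, and the prompt wording, the function name <monospace>build_explanation_prompt</monospace>, and the example inputs are assumptions for illustration; they do not reproduce the template used to produce <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
<preformat><![CDATA[
```python
def build_explanation_prompt(user_post, keyphrases):
    """Assemble a prompt that asks a model to explain its reading of a post,
    using the supplied clinically relevant phrases as anchors.

    The wording of the prompt is hypothetical.
    """
    phrase_list = "; ".join(keyphrases)
    return (
        "You are assisting a mental health professional.\n"
        f"User post: {user_post}\n"
        f"Clinically relevant phrases: {phrase_list}\n"
        "Explain, in plain language, what the post suggests, "
        "referring to the phrases above."
    )


if __name__ == "__main__":
    # Hypothetical post and keyphrases, loosely modeled on the Figure 4 scenario.
    prompt = build_explanation_prompt(
        "I feel sick every morning and I am anxious all the time.",
        ["pregnancy", "morning sickness", "nausea"],
    )
    print(prompt)
```
]]></preformat>
<p>The resulting string would then be sent to the language model; keeping the keyphrases in a separate, clearly labeled line makes it easier to check afterwards which phrases the explanation actually drew on.</p>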
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>GPT 3.5 provides user-level explainability when prompted with clinically-relevant words and keyphrases such as <italic>pregnancy, morning sickness, vomiting, nausea</italic>, and <italic>anxiety caused by tranquilizers during pregnancy</italic>. Without these specific keyphrases, GPT 3.5 may produce incorrect inferences [shown in (b)]. When these keyphrases are used as prompts, the explanation provided by GPT 3.5 in (a) becomes more concise compared with the explanation in (b) generated without such prompting. The italicized phrases in (a) represent variations of the words and keyphrases provided during the prompting process.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-06-1229805-g0004.tif"/>
</fig></sec>
<sec id="s5">
<title>5. Virtual mental health assistants</title>
<p>With the historical evolution of VMHAs (see <bold>Table 2</bold>) from behavioral health coaching (Ginger, <xref ref-type="bibr" rid="B43">2011</xref>) to KG-based intellectual VMHAs such as ALLEVIATE (Roy et al., <xref ref-type="bibr" rid="B104">2023</xref>), we examine the possibilities of new research directions to facilitate the expression of empathy in active communications (Sharma et al., <xref ref-type="bibr" rid="B111">2023</xref>). Existing studies suggest the risk of oversimplification of mental conditions and therapeutic approaches without considering latent or external contextual knowledge (Cirillo et al., <xref ref-type="bibr" rid="B16">2020</xref>). Thinking beyond the low-level analysis of classification and prediction, the high-level analysis of VMHAs would enrich the user-level (UL) experience and knowledge of MHPs (Roy et al., <xref ref-type="bibr" rid="B104">2023</xref>).</p>
<p>It is important to note that while LLMs have potential benefits, our observations suggest that VMHAs may not fully understand issues related to behavioral and emotional instability, self-harm tendencies, and the user&#x00027;s underlying psychological state. VMHAs (as exemplified in <xref ref-type="fig" rid="F2">Figures 2</xref>, <xref ref-type="fig" rid="F3">3</xref>) generate incoherent and unsafe responses when a user tries to seek a response for clinically relevant questions or vice-versa.</p>
<sec>
<title>5.1. Woebot and Wysa</title>
<p>Woebot and Wysa are two digital mental health applications. Woebot is an <italic>Automated Coach</italic> designed to provide a coach-like experience without human intervention, promoting good thinking hygiene through lessons, exercises, and videos rooted in Cognitive Behavioral Therapy (CBT) (Fitzpatrick et al., <xref ref-type="bibr" rid="B35">2017</xref>; Grigoruta, <xref ref-type="bibr" rid="B44">2018</xref>). On the other hand, Wysa uses a CBT conversational agent to engage in empathetic and therapeutic conversations and activities, aiming to help users with various mental health problems (Inkster et al., <xref ref-type="bibr" rid="B58">2018</xref>). Through question-answering mechanisms, Wysa recommends relaxing activities to improve mental well-being. Both apps operate in the growing digital mental health industry.</p>
<p>Narrowing down our investigation to context-based user-level (UL; <xref ref-type="fig" rid="F1">Figure 1</xref>) analysis, the findings about Woebot and Wysa suggest that they observe and track various aspects of human behavior, including gratitude, mindfulness, and frequent mood changes throughout the day. Moreover, researchers have made significant contributions in assessing the <italic>trustworthiness</italic> of Woebot and Wysa through ethical research protocols, which is crucial given the sensitive nature of VMHAs (Powell, <xref ref-type="bibr" rid="B92">2019</xref>). The absence of ethical considerations in Woebot and Wysa becomes evident in their responses to emergencies such as immediate harm or suicidal ideation, where they lack clinical grounding and contextual awareness (Koutsouleris et al., <xref ref-type="bibr" rid="B66">2022</xref>). To address this issue, developing VMHAs that are safe and explainable is paramount. Such enhancements will allow these agents to understand subtle cues better and, as a result, become more accountable in their interactions. For example, a well-informed dialog agent aware of a user&#x00027;s depression may exercise caution and avoid discussing topics that could exacerbate the user&#x00027;s mental health condition (Henderson et al., <xref ref-type="bibr" rid="B51">2018</xref>). To achieve the desired characteristics in VMHAs such as Woebot and Wysa, we suggest relevant datasets for contextual awareness, explainability, and clinical grounding for conscious decision-making during sensitive scenarios [see <xref ref-type="table" rid="T1">Table 1</xref>, where the datasets are examined using the FAIR principles (META, <xref ref-type="bibr" rid="B84">2017</xref>)]. 
Furthermore, we suggest safe and explainable behavior metrics, specifically to assess how well VMHAs respond to emergencies, handle sensitive information, and avoid harmful interactions (Brocki et al., <xref ref-type="bibr" rid="B10">2023</xref>).</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Lists of conversational datasets created with support from MHPs, crisis counselors, nurse practitioners, or trained annotators.</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919497;color:#ffffff">
<th valign="top" align="left" colspan="2"><bold>Datasets</bold></th>
<th valign="top" align="left"><bold>Safety</bold></th>
<th valign="top" align="center"><bold>UsEx</bold></th>
<th valign="top" align="center" colspan="2"><bold>KI</bold></th>
<th valign="top" align="center"><bold>DiG</bold></th>
<th valign="top" align="center" colspan="4"><bold>FAIR Principle</bold></th>
</tr>
<tr style="background-color:#919497;color:#ffffff">
<th valign="top" align="center" colspan="2"></th>
<th/>
<th/>
<th valign="top" align="center"><bold>PK</bold></th>
<th valign="top" align="center"><bold>MK</bold></th>
<th/>
<th valign="top" align="center"><bold>F</bold></th>
<th valign="top" align="center"><bold>A</bold></th>
<th valign="top" align="center"><bold>I</bold></th>
<th valign="top" align="center"><bold>R</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">CounselChat (<xref ref-type="bibr" rid="B21">2015</xref>)</td>
<td valign="top" align="left">CounselChat</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02020;</td>
</tr> <tr>
<td valign="top" align="left">Huang (<xref ref-type="bibr" rid="B55">2015</xref>)</td>
<td valign="top" align="left">CC</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02020;</td>
</tr> <tr>
<td valign="top" align="left">Althoff et al. (<xref ref-type="bibr" rid="B3">2016</xref>)</td>
<td valign="top" align="left">SNAP Counseling</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
</tr> <tr>
<td valign="top" align="left">Rashkin et al. (<xref ref-type="bibr" rid="B97">2018</xref>)</td>
<td valign="top" align="left">Empathetic Dialogues</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
</tr> <tr>
<td valign="top" align="left">Demasi et al. (<xref ref-type="bibr" rid="B25">2019</xref>)</td>
<td valign="top" align="left">Roleplay</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
</tr> <tr>
<td valign="top" align="left">Liang et al. (<xref ref-type="bibr" rid="B73">2021</xref>)</td>
<td valign="top" align="left">CC-44</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02020;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02020;</td>
</tr> <tr>
<td valign="top" align="left">Gupta et al. (<xref ref-type="bibr" rid="B46">2022</xref>)</td>
<td valign="top" align="left">PRIMATE</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
</tr> <tr>
<td valign="top" align="left">Roy et al. (<xref ref-type="bibr" rid="B105">2022a</xref>)</td>
<td valign="top" align="left">ProKnow-data</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
</tr>
<tr>
<td valign="top" align="left">Welivita and Pu (<xref ref-type="bibr" rid="B132">2022a</xref>)</td>
<td valign="top" align="left">MITI</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>We have not included datasets created using crowdsource workers without proper annotation guidelines.</p>
<p>KI, Knowledge infusion; PK, Process knowledge; MK, Medical knowledge; DiG, Diversity in generation; UsEx, User-level explainability. Here, the <italic>FAIR principles</italic> stand for F, Findability; A, Accessibility; I, Interoperability; and R, Reusability. &#x02020;: partial fulfillment of the corresponding principle.</p>
</table-wrap-foot>
</table-wrap></sec>
<sec>
<title>5.2. Limbic and ALLEVIATE</title>
<p><xref ref-type="table" rid="T2">Table 2</xref> illustrates that both Limbic and ALLEVIATE incorporate safety measures, but they differ in how they implement them. In Limbic, patient safety is treated as a spontaneous assessment of the severity of the user&#x00027;s mental health condition (a classification problem). It prioritizes patients seeking in-person clinical care (Sohail, <xref ref-type="bibr" rid="B116">2023</xref>). Harper, CEO of Limbic, suggests a further improvement to Limbic&#x00027;s safety protocol: the capability of the AI model to measure therapeutic alliance during active conversation and flag those user utterances that reflect deteriorating mental health (Rollwage et al., <xref ref-type="bibr" rid="B101">2022</xref>). On the other hand, ALLEVIATE implements safety through the use of clinical knowledge. ALLEVIATE creates a subgraph from the user&#x00027;s utterances and chatbot questions during the conversation. This subgraph is constructed by actively querying two knowledge bases: UMLS for disorders and symptoms, and RxNorm for medicine (Liu et al., <xref ref-type="bibr" rid="B78">2005</xref>). The subgraph allows the conversational AI model to perform active inferencing, which influences the generation of the next best information-seeking question by ALLEVIATE. Owing to its subgraph construction module, ALLEVIATE determines which question is best to ask the user and provides the subgraph to MHPs for a better understanding of the user&#x00027;s mental health condition. Question generation and response generation in ALLEVIATE are bound by the subgraph and the information in the backend knowledge bases, thus ensuring accountable, transparent, and safe conversation.</p>
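<p>The idea of building a conversation-specific subgraph and using it to choose the next question can be sketched as follows. This is a toy illustration under stated assumptions, not ALLEVIATE&#x00027;s implementation: the triples are invented rather than drawn from UMLS or RxNorm, concept matching is done by plain substring search, and the selection rule (ask about an unmentioned concept linked to the best-supported node) is a hypothetical simplification.</p>
<preformat><![CDATA[
```python
# Toy stand-in for a knowledge base: (head, relation, tail) triples.
# These triples are invented for illustration and are not drawn from UMLS or RxNorm.
TOY_TRIPLES = [
    ("insomnia", "symptom_of", "depression"),
    ("low mood", "symptom_of", "depression"),
    ("fatigue", "symptom_of", "depression"),
    ("fatigue", "symptom_of", "anemia"),
]


def build_subgraph(utterances, triples=TOY_TRIPLES):
    """Keep the triples whose head concept is mentioned in any utterance."""
    text = " ".join(utterances).lower()
    return [t for t in triples if t[0] in text]


def next_concept_to_ask(subgraph, triples=TOY_TRIPLES):
    """Suggest an unmentioned concept linked to the most supported tail node."""
    if not subgraph:
        return None
    support = {}
    for _, _, tail in subgraph:
        support[tail] = support.get(tail, 0) + 1
    # Ties are broken alphabetically so the result is deterministic.
    best_tail = min(support, key=lambda tail: (-support[tail], tail))
    mentioned = {head for head, _, _ in subgraph}
    for head, _, tail in triples:
        if tail == best_tail and head not in mentioned:
            return head
    return None


if __name__ == "__main__":
    subgraph = build_subgraph(["I have insomnia and constant fatigue."])
    print(subgraph)
    print(next_concept_to_ask(subgraph))
```
]]></preformat>
<p>Because every suggested question traces back to explicit triples, the same subgraph can be handed to an MHP as a record of why a question was asked, which is the property the text associates with accountable and transparent conversation.</p>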
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Prominent and in-use VMHAs with different objectives for supporting patients with mental disturbance.</p></caption> 
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919497;color:#ffffff">
<th valign="top" align="left" colspan="2"><bold>VMHA</bold></th>
<th valign="top" align="left"><bold>Objective</bold></th>
<th valign="top" align="center" colspan="2"><bold>KI</bold></th>
<th valign="top" align="center"><bold>DiG</bold></th>
<th valign="top" align="center"><bold>Safety</bold></th>
<th valign="top" align="center"><bold>UsEx</bold></th>
<th valign="top" align="center"><bold>QM</bold></th>
</tr>
<tr style="background-color:#919497;color:#ffffff">
<th valign="top" align="center" colspan="2"></th>
<th/>
<th valign="top" align="center"><bold>PK</bold></th>
<th valign="top" align="center"><bold>MK</bold></th>
<th/>
<th/>
<th/>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Ginger (<xref ref-type="bibr" rid="B43">2011</xref>)</td>
<td valign="top" align="left">Ginger</td>
<td valign="top" align="left">Behavioral Health Coaching</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">H</td>
</tr> <tr>
<td valign="top" align="left">CompanionMX (<xref ref-type="bibr" rid="B20">2011</xref>)</td>
<td valign="top" align="left">CompanionMX</td>
<td valign="top" align="left">PTSD</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">H</td>
</tr> <tr>
<td valign="top" align="left">Quartet (<xref ref-type="bibr" rid="B95">2014</xref>)</td>
<td valign="top" align="left">Quartet</td>
<td valign="top" align="left">Therapy &#x00026; Counseling</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">H</td>
</tr> <tr>
<td valign="top" align="left">Fitzpatrick et al. (<xref ref-type="bibr" rid="B35">2017</xref>)</td>
<td valign="top" align="left">Woebot</td>
<td valign="top" align="left">CBT</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">A</td>
</tr> <tr>
<td valign="top" align="left">Limbic (<xref ref-type="bibr" rid="B74">2017</xref>)</td>
<td valign="top" align="left">Limbic</td>
<td valign="top" align="left">CBT</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">H</td>
</tr> <tr>
<td valign="top" align="left">Inkster et al. (<xref ref-type="bibr" rid="B58">2018</xref>)</td>
<td valign="top" align="left">Wysa</td>
<td valign="top" align="left">CBT</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">A</td>
</tr> <tr>
<td valign="top" align="left">Fulmer et al. (<xref ref-type="bibr" rid="B38">2018</xref>)</td>
<td valign="top" align="left">Tess</td>
<td valign="top" align="left">Anxiety &#x00026; Depression</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">-</td>
</tr> <tr>
<td valign="top" align="left">Ghandeharioun et al. (<xref ref-type="bibr" rid="B41">2019</xref>)</td>
<td valign="top" align="left">EMMA</td>
<td valign="top" align="left">CBT</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">H</td>
</tr> <tr>
<td valign="top" align="left">Denecke et al. (<xref ref-type="bibr" rid="B26">2020</xref>)</td>
<td valign="top" align="left">SERMO</td>
<td valign="top" align="left">CBT</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">H</td>
</tr> <tr>
<td valign="top" align="left">Possati (<xref ref-type="bibr" rid="B91">2022</xref>)</td>
<td valign="top" align="left">Replika</td>
<td valign="top" align="left">Empathetic &#x00026; Supportive</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">A</td>
</tr> <tr>
<td valign="top" align="left">Roy et al. (<xref ref-type="bibr" rid="B104">2023</xref>)</td>
<td valign="top" align="left">ALLEVIATE</td>
<td valign="top" align="left">Depression</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02717;</td>
<td valign="top" align="center">H</td>
</tr>
<tr>
<td valign="top" align="left">Our Survey Paper</td>
<td valign="top" align="left">Desired System</td>
<td valign="top" align="left">Screening, Triaging, &#x00026; MI</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">&#x02713;</td>
<td valign="top" align="center">H,A,T</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>We performed a high-level analysis of all the VMHAs based on publicly available user reviews on forums (e.g., WebMD, AskaPatient, MedicineNet) and Reddit. For Woebot, Wysa, and ALLEVIATE, a survey of 40 participants was carried out at Prisma Health. Here we define QM, Qualitative Metrics, as H, Harmlessness; A, Adherence; T, Transparency.</p>
</table-wrap-foot>
</table-wrap>
</sec></sec>
<sec sec-type="discussion" id="s6">
<title>6. Discussion</title>
<p>The incorporation of safety, harmlessness, explainability, curation of process, and medical knowledge-based datasets and knowledge-infused learning methods in VMHAs brings forth the need for updated evaluation metrics. Traditional metrics such as accuracy, precision, and recall may not be sufficient to capture the nuances of these complex requirements. Here are some key considerations for revamping evaluation metrics.</p>
<sec>
<title>6.1. Evaluation method</title>
<p>Notable earlier studies, such as that by Walker et al. (<xref ref-type="bibr" rid="B128">1997</xref>), relied on subjective, human-in-the-loop measures to evaluate the utility of a conversational system in the general-purpose domain. Because human-based evaluation procedures are expensive, researchers have started using machine learning-based automatic quantitative metrics [e.g., BLEURT, BERTScore (Clinciu et al., <xref ref-type="bibr" rid="B17">2021</xref>), BLEU (Papineni et al., <xref ref-type="bibr" rid="B87">2002</xref>), and ROUGE (Lin, <xref ref-type="bibr" rid="B76">2004</xref>)] to evaluate the semantic similarity of the machine-translated text. Liu et al. (<xref ref-type="bibr" rid="B79">2017</xref>) highlight the disagreement of users with existing metrics, thereby lowering their expectations. In addition, most of these traditional quantitative metrics are reference-based; references are limited in availability, and it is very difficult to ensure the quality of human-written references (Bao et al., <xref ref-type="bibr" rid="B6">2022</xref>). To tackle these challenges and comprehensively assess a preferred VMHA concerning its explainability, safety, and integration of knowledge processes, it is essential to design metrics that bring VMHA systems closer to real-time applicability.</p>
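<p>The dependence of such metrics on the reference can be seen in a small example. The Python sketch below computes a unigram-overlap F1 score between a candidate and a single reference, in the spirit of ROUGE-1; it is a simplified illustration and not the official implementation of any of the metrics named above. The function name <monospace>unigram_f1</monospace> and the example sentences are hypothetical.</p>
<preformat><![CDATA[
```python
from collections import Counter
import re


def unigram_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate and a single reference."""
    cand = Counter(re.findall(r"[a-z0-9']+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9']+", reference.lower()))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "Have you been feeling nervous or on edge?"
    paraphrase = "Do you often feel anxious lately?"
    print(unigram_f1(reference, reference))
    print(unigram_f1(paraphrase, reference))
```
]]></preformat>
<p>Here an acceptable paraphrase of the reference question receives a low score simply because it shares few surface tokens with the single reference, which is one reason reference-based scores can diverge from human judgment in this setting.</p>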
<sec>
<title>6.1.1. Qualitative metrics</title>
<p>Drawing from the concerns mentioned earlier regarding VMHA on safety and explainability, we propose the following characteristics that can be qualitatively evaluated in a VMHA and strongly align with human judgment.</p>
<list list-type="bullet">
<list-item><p><bold>Adherence:</bold> Adherence, a topic extensively discussed in the healthcare field, refers to the commitment of users to specific treatment goals such as long-term therapy, physical activity, or medication (Fadhil, <xref ref-type="bibr" rid="B33">2018</xref>). Despite the AI community&#x00027;s considerable interest in evaluating health assistants&#x00027; adherence to user needs (Davis et al., <xref ref-type="bibr" rid="B23">2020</xref>), the lack of safe responses, DiG, and UsEx within VMHAs has drawn criticism and raised concerns about the impact on adherence. This situation highlights the importance of adherence as a qualitative metric in achieving more realistic and <italic>contextual</italic> VMHAs while treating patients with severe mental illnesses. Adherence to guidelines helps a VMHA maintain context and ensure safe conversation. Adherence can be thought of as aligning the question generation and response shaping process in a VMHA to external clinical knowledge such as PHQ-9. For instance, Roy et al. and Zirikly and Dredze demonstrated that under the influence of datasets grounded in clinical knowledge, the generative model of a VMHA can provide clinician-friendly explanations (Zirikly and Dredze, <xref ref-type="bibr" rid="B142">2022</xref>; Roy et al., <xref ref-type="bibr" rid="B104">2023</xref>). Another form of adherence is medication adherence in users. This involves a VMHA asking whether the user is following their prescription and taking the prescribed medication. Adherence in a VMHA can be achieved in two ways, as shown in Section 3. For <italic>adherence to guidelines</italic>, the VMHA&#x00027;s task is to leverage questions in questionnaires such as PHQ-9 as knowledge and ensure that upcoming generated questions are similar or related to CPG questions. 
This can be achieved through metrics such as BERTScore (Lee et al., <xref ref-type="bibr" rid="B71">2021</xref>), KL Divergence (Perez et al., <xref ref-type="bibr" rid="B88">2022</xref>), and others, often used in a setup that uses reinforcement learning (Trella et al., <xref ref-type="bibr" rid="B124">2022</xref>). In <italic>medication adherence</italic>, the VMHA must be given access to the patient&#x00027;s clinical notes to ensure accurate prescription adherence. The chatbot will then extract essential details such as medication names, doses, and timings, using this information to generate relevant questions. To enhance its capabilities, the VMHA will supplement the medication names with brand names from reliable sources such as MedDRA (Brown et al., <xref ref-type="bibr" rid="B11">1999</xref>). This process allows the VMHA to educate patients on following the correct medication regimen.</p></list-item>
<list-item><p><bold>Harmlessness:</bold> Conversational agents can generate harmful, unsafe, and sometimes incoherent information, which is a negative effect of generative AI (Welbl et al., <xref ref-type="bibr" rid="B131">2021</xref>). This behavior has been discussed under the term <italic>Hallucination</italic>, a benign term for making things up. Consider a woman with a history of panic attacks and anxiety during pregnancy who is using tranquilizers. She reaches out to a VMHA for advice. The <italic>next word prediction strategy</italic> of the generative AI within the VMHA suggests that &#x0201C;the fact that you are using tranquilizer medication is a step in the right direction, but it is essential to address the root cause of your anxiety as well.&#x0201D; This is a harmful statement, because tranquilizers cause anxiety during pregnancy (as shown in <xref ref-type="fig" rid="F4">Figure 4</xref>). Hallucination and its closely related concept, fabrication, are currently debated within the generative AI community. Nevertheless, it is essential to approach the issue with caution and introduce safeguards to assess the harmlessness of generated content (Peterson, <xref ref-type="bibr" rid="B89">2023</xref>).</p>
<p>So far, only rule-based and data-driven methods have been proposed to control the harmful effects of generative AI. For example, the Claude LLM from Anthropic uses what is known as a constitution, consisting of 81 rules, to measure the safety of a generated sentence before it can be shown to the end user (Bai et al., <xref ref-type="bibr" rid="B4">2022a</xref>,<xref ref-type="bibr" rid="B5">b</xref>). Amazon released the DiSafety dataset for training LLMs to distinguish between safe and unsafe generation (Meade et al., <xref ref-type="bibr" rid="B82">2023</xref>). Rules of thumb (RoTs) are another rule-based method for controlling text generation in generative AI (Kim et al., <xref ref-type="bibr" rid="B62">2022</xref>). Despite these efforts, VMHAs are still susceptible to generating harmful and untrustworthy content, as these methods are limited by size and context. In contrast, knowledge in various human-curated knowledge bases (both online and offline) is more exhaustive in terms of context. Thus, we suggest developing metrics at the intersection of data-driven generative AI and knowledge to ensure that VMHAs are always harmless.</p>
</list-item>
<list-item><p><bold>Transparency:</bold> A VMHA with transparency would allow users to inspect its attention and provide references to knowledge sources that influenced this attention. This concept is closely connected to UsEx and has undergone comprehensive evaluation by Joyce et al. (<xref ref-type="bibr" rid="B59">2023</xref>), who associate UsEx with transparency and interpretability, particularly concerning mental health. Transparency is important because of several notable bad experiences with chatbots such as Tay, ChaosGPT (Hendrycks et al., <xref ref-type="bibr" rid="B52">2023</xref>), and others. Furthermore, these bots raise an ethical concern because of their intrinsic generative AI component, which can generate false information or draw inferences from personally identifiable information, thus sacrificing user privacy (Coghlan et al., <xref ref-type="bibr" rid="B19">2023</xref>). Transparency can be achieved by either augmenting or incorporating external knowledge. The metric for transparency is still an open question. However, prior research has developed ad-hoc measures such as average knowledge capture (Roy et al., <xref ref-type="bibr" rid="B105">2022a</xref>), visualization of attention [e.g., BERTViz, Attviz (&#x00160;krlj et al., <xref ref-type="bibr" rid="B115">2020</xref>)], t-distributed Stochastic Neighbor Embedding (Tlili et al., <xref ref-type="bibr" rid="B123">2023</xref>), saliency maps (Mertes et al., <xref ref-type="bibr" rid="B83">2022</xref>), and game-theoretic transparency and transparency-specific AUC (Lee et al., <xref ref-type="bibr" rid="B70">2019</xref>).</p></list-item>
</list>
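<p>The guideline-adherence idea in the Adherence item above, keeping generated questions close to questionnaire items such as those in PHQ-9, can be illustrated with a minimal sketch. It is not the method of any cited work. A unigram-overlap F1 is swapped in as a simple lexical stand-in for an embedding-based metric such as BERTScore, only three PHQ-9 items are used, and the function names and the 0.3 threshold are hypothetical choices made for illustration.</p>
<preformat>
```python
import re

# Three PHQ-9 items used as reference knowledge (a subset, for illustration only).
PHQ9_ITEMS = [
    "Little interest or pleasure in doing things",
    "Feeling down, depressed, or hopeless",
    "Trouble falling or staying asleep, or sleeping too much",
]


def tokens(text):
    """Lowercase a string and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_f1(candidate, reference):
    """Unigram-overlap F1 between two strings (a lexical stand-in for BERTScore)."""
    cand, ref = tokens(candidate), tokens(reference)
    shared = len(cand.intersection(ref))
    if shared == 0:
        return 0.0
    precision = shared / len(cand)
    recall = shared / len(ref)
    return 2 * precision * recall / (precision + recall)


def adherence_score(question, guideline_items=PHQ9_ITEMS):
    """Return (score, item) for the guideline item closest to a generated question."""
    scored = [(overlap_f1(question, item), item) for item in guideline_items]
    return max(scored)


def is_adherent(question, threshold=0.3):
    """Flag a question as adherent if its best score meets a hypothetical threshold."""
    score, _ = adherence_score(question)
    return score >= threshold
```
</preformat>
<p>Under these assumptions, a generated question such as &#x0201C;Have you been feeling down or hopeless lately?&#x0201D; is matched to the PHQ-9 item on depressed mood, whereas an off-topic question shares no tokens with any item and would be flagged for regeneration. A deployed VMHA would presumably replace the lexical score with a semantic one.</p>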
<p>The sought-after qualities in VMHAs are comparable to those being assessed in contemporary general-purpose agents, such as GPT-3.5 and GPT-4 (Fluri et al., <xref ref-type="bibr" rid="B37">2023</xref>). However, our focus should be on creating conversational agents that prioritize responsible interaction more than their general-purpose counterparts.</p></sec>
<sec>
<title>6.1.2. KI metric</title>
<p>In this section, we provide metrics that describe <italic>DiG, safety, MK</italic>, and <italic>PK</italic> in <xref ref-type="table" rid="T2">Table 2</xref>. The symbols &#x02713; and &#x02717; indicate whether a VMHA has been tested for these KI metrics.</p>
<list list-type="bullet">
<list-item><p><bold>Safety:</bold> For conversational systems to achieve safety, it is imperative that the LLMs that form their intrinsic components exhibit safe behaviors (Henderson et al., <xref ref-type="bibr" rid="B51">2018</xref>; Perez et al., <xref ref-type="bibr" rid="B88">2022</xref>). A recent study by Roy et al. (<xref ref-type="bibr" rid="B105">2022a</xref>) introduced a safety lexicon to gauge the safety of language models within the context of mental health. Furthermore, efforts are under way to develop datasets such as ProsocialDialog (Kim et al., <xref ref-type="bibr" rid="B62">2022</xref>) and DiSafety (Meade et al., <xref ref-type="bibr" rid="B82">2023</xref>) to ensure that conversational systems can maintain safety. Nonetheless, there currently exist no mental health-specific datasets or established methods rooted in clinical principles for refining LLMs to ensure their safety.</p></list-item>
<list-item><p><bold>Logical Coherence (LC):</bold> LC is a qualitative check of the logical relationship between a user&#x00027;s input and the follow-up questions measuring <italic>PK</italic> and <italic>MK</italic>. Kane et al. (<xref ref-type="bibr" rid="B60">2020</xref>) used LC to ensure reliable output from the RoBERTa model trained on the MNLI challenge and natural language inference GLUE benchmark, hence opening new research directions toward safer models for the MedNLI dataset (Romanov and Shivade, <xref ref-type="bibr" rid="B102">2018</xref>).</p></list-item>
<list-item><p><bold>Semantic Relations (SR):</bold> SR measures the extent of similarity between the generated response and the user&#x00027;s query (Kane et al., <xref ref-type="bibr" rid="B60">2020</xref>). Stasaski and Hearst (<xref ref-type="bibr" rid="B119">2022</xref>) highlight the use of SR for the logical ordering of question generation, hence introducing diversity (<italic>DiG</italic>) and preventing models from hallucinating.</p></list-item>
</list></sec></sec>
<sec>
<title>6.2. Emerging areas of VMHAs</title>
<sec>
<title>6.2.1. Mental health triage</title>
<p>Mental health triage is a risk assessment that categorizes users by the severity and urgency of their mental disturbance before psychiatric help is suggested. A screening and triage system could fulfill more complex requirements to achieve automated triage empowered by AI. A recent surge in the use of screening mechanisms by Babylon (Daws, <xref ref-type="bibr" rid="B24">2020</xref>) and Limbic has opened new research directions toward <italic>trustworthy</italic> and <italic>safe</italic> models in the near future (Duggan, <xref ref-type="bibr" rid="B32">1972</xref>; harper, <xref ref-type="bibr" rid="B48">2023</xref>).</p></sec>
<sec>
<title>6.2.2. Motivational interviewing</title>
<p>Motivational Interviewing (MI) is a directive, user-centered counseling style for eliciting behavior change by helping clients to explore and resolve ambivalence. In contrast to the severity assessment in mental health triage, MI fosters a more interpersonal therapeutic relationship, with a possible extension of MI to the mental illness domain (Westra et al., <xref ref-type="bibr" rid="B134">2011</xref>). Wu et al. (<xref ref-type="bibr" rid="B136">2020</xref>) suggest human-like empathetic response generation in MI with support for <italic>UsEx</italic> and <italic>contextualization</italic> with clinical knowledge. Recent studies identifying interpersonal risk factors from offline text documents further support MI for active communication (Ghosh et al., <xref ref-type="bibr" rid="B42">2022</xref>).</p></sec>
<sec>
<title>6.2.3. Clinical diagnostic interviewing (CDI)</title>
<p>CDI is a direct client-centered interview between a clinician and patient without any intervention. With multiple modalities of CDI data (e.g., video, text, and audio), applications are developed in accordance with the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) to facilitate quick gathering of detailed information about the patient. In contrast to in-person sessions, which leverage both verbal and non-verbal communication, conversational agents miss the <italic>personalized</italic> and <italic>contextual</italic> information from non-verbal communication, which hinders the efficacy of VMHAs.</p></sec></sec>
<sec>
<title>6.3. Practical considerations</title>
<p>We now discuss two practical considerations with VMHAs.</p>
<p><bold>Difference in human vs. machine assistance:</bold> Creating a realistic conversational experience for VMHAs is important for user acceptance. While obtaining training data from real conversations can be challenging due to privacy concerns, some approaches can help address these issues and still provide valuable and useful outputs. A few suggestions follow:</p>
<list list-type="bullet">
<list-item><p>Simulated Conversations: Instead of relying solely on real conversations, we can generate simulated conversations that mimic the interactions between users and mental health professionals [e.g., Role Play (Demasi et al., <xref ref-type="bibr" rid="B25">2019</xref>)]. These simulated conversations can cover a wide range of scenarios and provide diverse training data for the VMHA.</p></list-item>
<list-item><p>User Feedback and Iterative Improvement: Users can be encouraged to provide feedback on the system&#x00027;s output, and developers can use that feedback to improve the VMHA&#x00027;s responses over time. This iterative process can help address gaps or shortcomings in the system&#x00027;s performance and enhance its value to users.</p></list-item>
<list-item><p>Collaboration with MHPs: Collaborating with MHPs during the development and training process can provide valuable insights and ensure that the VMHA&#x00027;s responses align with established therapeutic techniques and principles. Their expertise can contribute to creating a more realistic and useful VMHA.</p></list-item>
<list-item><p>Personalized VMHAs: In the case of personalized VMHAs, real conversations can be used to create conversation templates and assign user profiles. These conversation templates can serve as a starting point for the VMHA&#x00027;s responses, and user profiles can help customize the system&#x00027;s behavior and recommendations based on individual preferences and needs (Qian et al., <xref ref-type="bibr" rid="B93">2018</xref>).</p></list-item>
</list>
<p>While it may not be possible to replicate the experience of a human MHP entirely, these approaches can help bridge the gap and create a VMHA that provides valuable support to users in need while addressing the challenges associated with obtaining real conversation data.</p>
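<p>As one illustration of the personalized-VMHA suggestion, the pairing of conversation templates with user profiles can be sketched as follows. This is a minimal sketch under stated assumptions: the profile fields, the two templates, and the function name are all hypothetical, and a real system would derive templates from consented or simulated conversations.</p>
<preformat>
```python
from dataclasses import dataclass
from string import Template


@dataclass
class UserProfile:
    """Hypothetical user profile; the fields are illustrative assumptions."""
    name: str
    preferred_tone: str  # "formal" or "casual"
    focus_area: str      # e.g., "sleep"


# Hypothetical conversation templates, keyed by tone.
TEMPLATES = {
    "formal": Template("Hello $name. Could you tell me how your $focus has been this week?"),
    "casual": Template("Hi $name! How has your $focus been going this week?"),
}


def opening_message(profile):
    """Fill the template for the preferred tone, falling back to the formal one."""
    template = TEMPLATES.get(profile.preferred_tone, TEMPLATES["formal"])
    return template.substitute(name=profile.name, focus=profile.focus_area)
```
</preformat>
<p>Keeping templates separate from profiles means that MHPs could review the wording once while the profile only selects and fills it, which limits how much free-form generation reaches the user.</p>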
<p><bold>Perception of quality with assistance offered:</bold> A well-understood result in marketing is that people perceive the quality of a service based on the price paid for it and the word-of-mouth buzz around it (Liu and Lee, <xref ref-type="bibr" rid="B77">2016</xref>). In the case of VMHAs, it is an open question whether the help they offer will be considered inferior to that offered by professionals. More crucially, if a user perceives that help negatively, will this further aggravate their mental condition?</p></sec></sec>
<sec sec-type="conclusions" id="s7">
<title>7. Conclusion</title>
<p>In the field of mental health, there has been significant research and development focused on the use of social and clinical signals to enhance AI methodologies. This includes dataset or corpus construction to train AI models for classification, prediction, and generation tasks in mental healthcare. However, VMHAs remain distant from such translational research: grounding datasets in clinical knowledge and clinical practice guidelines, and using them to train VMHAs, has not been pursued. In this review, we shed light on this gap as critics who see the importance of clinical knowledge and clinical practice guidelines in making VMHAs explainable and safe.</p>
<p>As Geoffrey Irving, a safety researcher at DeepMind, rightly stated, &#x0201C;Dialogue is a good way to ensure Safety in AI models.&#x0201D; In line with this view, we suggest mechanisms for infusing clinical knowledge while training VMHAs and measures to ensure that the infusion happens correctly, resulting in VMHAs that exhibit safe behaviors. We also enumerate emerging areas within mental healthcare where VMHAs can be a valuable resource for improving public health surveillance.</p></sec>
<sec sec-type="author-contributions" id="s8">
<title>Author contributions</title>
<p>SS contributed to the conception and design of the study and wrote the first draft of the manuscript. All authors contributed to all aspects of the preparation and the writing of the manuscript.</p></sec>
</body>
<back>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<fn-group>
<fn id="fn0001"><p><sup>1</sup><italic>Also called clinical practice guidelines and clinical process knowledge</italic>.</p></fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Abd-Alrazaq</surname> <given-names>A. A.</given-names></name> <name><surname>Alajlani</surname> <given-names>M.</given-names></name> <name><surname>Ali</surname> <given-names>N.</given-names></name> <name><surname>Denecke</surname> <given-names>K.</given-names></name> <name><surname>Bewick</surname> <given-names>B. M.</given-names></name> <name><surname>Househ</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Perceptions and opinions of patients about mental health chatbots: scoping review</article-title>. <source>J. Med. Internet Res</source>. <volume>23</volume>, <fpage>e17828</fpage>. <pub-id pub-id-type="doi">10.2196/17828</pub-id></citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ahmad</surname> <given-names>R.</given-names></name> <name><surname>Siemon</surname> <given-names>D.</given-names></name> <name><surname>Gnewuch</surname> <given-names>U.</given-names></name> <name><surname>Robra-Bissantz</surname> <given-names>S.</given-names></name></person-group> (<year>2022</year>). <article-title>Designing personality-adaptive conversational agents for mental health care</article-title>. <source>Inf. Syst. Front</source>. <volume>24</volume>, <fpage>923</fpage>&#x02013;<lpage>943</lpage>. <pub-id pub-id-type="doi">10.1007/s10796-022-10254-9</pub-id><pub-id pub-id-type="pmid">35250365</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Althoff</surname> <given-names>T.</given-names></name> <name><surname>Clark</surname> <given-names>K.</given-names></name> <name><surname>Leskovec</surname> <given-names>J.</given-names></name></person-group> (<year>2016</year>). <article-title>Large-scale analysis of counseling conversations: an application of natural language processing to mental health</article-title>. <source>Trans. Assoc. Comput. Linguist</source>. <volume>4</volume>, <fpage>463</fpage>&#x02013;<lpage>476</lpage>. <pub-id pub-id-type="doi">10.1162/tacl_a_00111</pub-id><pub-id pub-id-type="pmid">28344978</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bai</surname> <given-names>Y.</given-names></name> <name><surname>Jones</surname> <given-names>A.</given-names></name> <name><surname>Ndousse</surname> <given-names>K.</given-names></name> <name><surname>Askell</surname> <given-names>A.</given-names></name> <name><surname>Chen</surname> <given-names>A.</given-names></name> <name><surname>DasSarma</surname> <given-names>N.</given-names></name> <etal/></person-group>. (<year>2022a</year>). <article-title>Training a helpful and harmless assistant with reinforcement learning from human feedback</article-title>. <source>arXiv [Preprint]. arXiv:2204.05862</source>.</citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bai</surname> <given-names>Y.</given-names></name> <name><surname>Kadavath</surname> <given-names>S.</given-names></name> <name><surname>Kundu</surname> <given-names>S.</given-names></name> <name><surname>Askell</surname> <given-names>A.</given-names></name> <name><surname>Kernion</surname> <given-names>J.</given-names></name> <name><surname>Jones</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2022b</year>). <article-title>Constitutional ai: harmlessness from ai feedback</article-title>. <source>arXiv [Preprint]. arXiv:2212.08073</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2212.08073</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bao</surname> <given-names>F. S.</given-names></name> <name><surname>Tu</surname> <given-names>R.</given-names></name> <name><surname>Luo</surname> <given-names>G.</given-names></name></person-group> (<year>2022</year>). <article-title>Docasref: A pilot empirical study on repurposing reference-based summary quality metrics reference-freely</article-title>. <source>arXiv [Preprint]. arXiv:2212.10013</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2212.10013</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bhatt</surname> <given-names>U.</given-names></name> <name><surname>Xiang</surname> <given-names>A.</given-names></name> <name><surname>Sharma</surname> <given-names>S.</given-names></name> <name><surname>Weller</surname> <given-names>A.</given-names></name> <name><surname>Taly</surname> <given-names>A.</given-names></name> <name><surname>Jia</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>&#x0201C;Explainable machine learning in deployment,&#x0201D;</article-title> in <source>Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>.</citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bodenreider</surname> <given-names>O.</given-names></name></person-group> (<year>2004</year>). <article-title>The unified medical language system (umls): integrating biomedical terminology</article-title>. <source>Nucleic Acids Res</source>. <volume>32</volume>, <fpage>D267</fpage>&#x02013;<lpage>D270</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkh061</pub-id><pub-id pub-id-type="pmid">14681409</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bommasani</surname> <given-names>R.</given-names></name> <name><surname>Hudson</surname> <given-names>D. A.</given-names></name> <name><surname>Adeli</surname> <given-names>E.</given-names></name> <name><surname>Altman</surname> <given-names>R.</given-names></name> <name><surname>Arora</surname> <given-names>S.</given-names></name> <name><surname>von Arx</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>On the opportunities and risks of foundation models</article-title>. <source>arXiv</source>.</citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brocki</surname> <given-names>L.</given-names></name> <name><surname>Dyer</surname> <given-names>G. C.</given-names></name> <name><surname>Gladka</surname> <given-names>A.</given-names></name> <name><surname>Chung</surname> <given-names>N. C.</given-names></name></person-group> (<year>2023</year>). <article-title>&#x0201C;Deep learning mental health dialogue system,&#x0201D;</article-title> in <source>2023 IEEE International Conference on Big Data and Smart Computing (BigComp)</source>, <fpage>395</fpage>&#x02013;<lpage>398</lpage>.</citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brown</surname> <given-names>E. G.</given-names></name> <name><surname>Wood</surname> <given-names>L.</given-names></name> <name><surname>Wood</surname> <given-names>S.</given-names></name></person-group> (<year>1999</year>). <article-title>The medical dictionary for regulatory activities (meddra)</article-title>. <source>Drug Safety</source> <volume>20</volume>, <fpage>109</fpage>&#x02013;<lpage>117</lpage>. <pub-id pub-id-type="doi">10.2165/00002018-199920020-00002</pub-id><pub-id pub-id-type="pmid">10082069</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Campbell</surname> <given-names>D.</given-names></name></person-group> (<year>2021</year>). <source>Strain on Mental Health Care Leaves 8m People Without Help, Say NHS Leaders</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.theguardian.com/society/2021/aug/29/strain-on-mental-health-care-leaves-8m-people-without-help-say-nhs-leaders">https://www.theguardian.com/society/2021/aug/29/strain-on-mental-health-care-leaves-8m-people-without-help-say-nhs-leaders</ext-link>.</citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cao</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Feng</surname> <given-names>L.</given-names></name></person-group> (<year>2020</year>). <article-title>Building and using personal knowledge graph to improve suicidal ideation detection on social media</article-title>. <source>IEEE Trans. Multimed</source>. <volume>24</volume>, <fpage>87</fpage>&#x02013;<lpage>102</lpage>. <pub-id pub-id-type="doi">10.1109/TMM.2020.3046867</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cheng</surname> <given-names>Y.</given-names></name> <name><surname>Jiang</surname> <given-names>H.</given-names></name></person-group> (<year>2020</year>). <article-title>Ai-powered mental health chatbots: Examining users&#x00027; motivations, active communicative action and engagement after mass-shooting disasters</article-title>. <source>J. Conting. Crisis Manage</source>. <volume>28</volume>, <fpage>339</fpage>&#x02013;<lpage>354</lpage>. <pub-id pub-id-type="doi">10.1111/1468-5973.12319</pub-id></citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chung</surname> <given-names>K.</given-names></name> <name><surname>Cho</surname> <given-names>H. Y.</given-names></name> <name><surname>Park</surname> <given-names>J. Y.</given-names></name></person-group> (<year>2021</year>). <article-title>A chatbot for perinatal women&#x00027;s and partners&#x00027; obstetric and mental health care: development and usability evaluation study</article-title>. <source>JMIR Medical Informatics</source> <volume>9</volume>:<fpage>e18607</fpage>. <pub-id pub-id-type="doi">10.2196/18607</pub-id><pub-id pub-id-type="pmid">33656442</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cirillo</surname> <given-names>D.</given-names></name> <name><surname>Catuara-Solarz</surname> <given-names>S.</given-names></name> <name><surname>Morey</surname> <given-names>C.</given-names></name> <name><surname>Guney</surname> <given-names>E.</given-names></name> <name><surname>Subirats</surname> <given-names>L.</given-names></name> <name><surname>Mellino</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Sex and gender differences and biases in artificial intelligence for biomedicine and healthcare</article-title>. <source>NPJ Digital Med</source>. <volume>3</volume>, <fpage>81</fpage>. <pub-id pub-id-type="doi">10.1038/s41746-020-0288-5</pub-id><pub-id pub-id-type="pmid">32529043</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Clinciu</surname> <given-names>M.</given-names></name> <name><surname>Eshghi</surname> <given-names>A.</given-names></name> <name><surname>Hastie</surname> <given-names>H.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;A study of automatic metrics for the evaluation of natural language explanations&#x0201D;</article-title> in <source>Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</source>, <fpage>2376</fpage>&#x02013;<lpage>2387</lpage>.</citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coda-Forno</surname> <given-names>J.</given-names></name> <name><surname>Witte</surname> <given-names>K.</given-names></name> <name><surname>Jagadish</surname> <given-names>A. K.</given-names></name> <name><surname>Binz</surname> <given-names>M.</given-names></name> <name><surname>Akata</surname> <given-names>Z.</given-names></name> <name><surname>Schulz</surname> <given-names>E.</given-names></name></person-group> (<year>2023</year>). <article-title>Inducing anxiety in large language models increases exploration and bias</article-title>. <source>arXiv [Preprint]. arXiv:2304.11111</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2304.11111</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coghlan</surname> <given-names>S.</given-names></name> <name><surname>Leins</surname> <given-names>K.</given-names></name> <name><surname>Sheldrick</surname> <given-names>S.</given-names></name> <name><surname>Cheong</surname> <given-names>M.</given-names></name> <name><surname>Gooding</surname> <given-names>P.</given-names></name> <name><surname>D&#x00027;Alfonso</surname> <given-names>S.</given-names></name></person-group> (<year>2023</year>). <article-title>To chat or bot to chat: Ethical issues with using chatbots in mental health</article-title>. <source>Digital Health</source> <volume>9</volume>, <fpage>20552076231183542</fpage>. <pub-id pub-id-type="doi">10.1177/20552076231183542</pub-id><pub-id pub-id-type="pmid">37377565</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="web"><person-group person-group-type="author"><collab>Companion MX.</collab></person-group> (<year>2011</year>). <source>Cogito:Emotion and Conversation AI</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://cogitocorp.com/">https://cogitocorp.com/</ext-link>.</citation>
</ref>
<ref id="B21">
<citation citation-type="book"><person-group person-group-type="author"><collab>CounselChat</collab></person-group> (<year>2015</year>). <source>Mental Health Answers from Counselors</source>. <publisher-loc>Oregon</publisher-loc>: <publisher-name>CounselChat</publisher-name>.</citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Czeisler</surname> <given-names>M. &#x000C9;</given-names></name> <name><surname>Lane</surname> <given-names>R. I.</given-names></name> <name><surname>Petrosky</surname> <given-names>E.</given-names></name> <name><surname>Wiley</surname> <given-names>J. F.</given-names></name> <name><surname>Christensen</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Mental health, substance use, and suicidal ideation during the covid-19 pandemic &#x02013; United States, June 24&#x02013;30, 2020</article-title>. <source>Morbid. Mortal. Wkly. Rep</source>. <volume>69</volume>, <fpage>1049</fpage>. <pub-id pub-id-type="doi">10.15585/mmwr.mm6932a1</pub-id><pub-id pub-id-type="pmid">32790653</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Davis</surname> <given-names>C. R.</given-names></name> <name><surname>Murphy</surname> <given-names>K. J.</given-names></name> <name><surname>Curtis</surname> <given-names>R. G.</given-names></name> <name><surname>Maher</surname> <given-names>C. A.</given-names></name></person-group> (<year>2020</year>). <article-title>A process evaluation examining the performance, adherence, and acceptability of a physical activity and diet artificial intelligence virtual health assistant</article-title>. <source>Int. J. Environ. Res. Public Health</source> <volume>17</volume>, <fpage>9137</fpage>. <pub-id pub-id-type="doi">10.3390/ijerph17239137</pub-id><pub-id pub-id-type="pmid">33297456</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Daws</surname> <given-names>R.</given-names></name></person-group> (<year>2020</year>). <source>Babylon Health Lashes Out At Doctor Who Raised AI Chatbot Safety Concerns</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.artificialintelligence-news.com/2020/02/26/babylon-health-doctor-ai-chatbot-safety-concerns/">https://www.artificialintelligence-news.com/2020/02/26/babylon-health-doctor-ai-chatbot-safety-concerns/</ext-link> (accessed September 22, 2023).</citation>
</ref>
<ref id="B25">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Demasi</surname> <given-names>O.</given-names></name> <name><surname>Hearst</surname> <given-names>M. A.</given-names></name> <name><surname>Recht</surname> <given-names>B.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Towards augmenting crisis counselor training by improving message retrieval,&#x0201D;</article-title> in <source>Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology</source>. <publisher-loc>Minneapolis, Minnesota</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>, <fpage>1</fpage>&#x02013;<lpage>11</lpage>.</citation>
</ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Denecke</surname> <given-names>K.</given-names></name> <name><surname>Vaaheesan</surname> <given-names>S.</given-names></name> <name><surname>Arulnathan</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>A mental health chatbot for regulating emotions (sermo)-concept and usability test</article-title>. <source>IEEE Trans. Emerg. Topics Comput</source>. <volume>9</volume>, <fpage>1170</fpage>&#x02013;<lpage>1182</lpage>. <pub-id pub-id-type="doi">10.1109/TETC.2020.2974478</pub-id></citation>
</ref>
<ref id="B27">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Dinan</surname> <given-names>E.</given-names></name> <name><surname>Abercrombie</surname> <given-names>G.</given-names></name> <name><surname>Bergman</surname> <given-names>A. S.</given-names></name> <name><surname>Spruit</surname> <given-names>S.</given-names></name> <name><surname>Hovy</surname> <given-names>D.</given-names></name> <name><surname>Boureau</surname> <given-names>Y.-L.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>&#x0201C;SafetyKit: first aid for measuring safety in open-domain conversational systems,&#x0201D;</article-title> in <source>Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>. <publisher-loc>Stroudsburg</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>.</citation>
</ref>
<ref id="B28">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Dinan</surname> <given-names>R.</given-names></name></person-group> (<year>2020</year>). <source>1st Safety for Conversational AI Workshop | ACL Member Portal</source>. Association for Computational Linguistics. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.aclweb.org/portal/content/1st-safety-conversational-ai-workshop-0">https://www.aclweb.org/portal/content/1st-safety-conversational-ai-workshop-0</ext-link> (accessed September 22, 2023).</citation>
</ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dinan</surname> <given-names>E.</given-names></name> <name><surname>Abercrombie</surname> <given-names>G.</given-names></name> <name><surname>Bergman</surname> <given-names>A. S.</given-names></name> <name><surname>Spruit</surname> <given-names>S.</given-names></name> <name><surname>Hovy</surname> <given-names>D.</given-names></name> <name><surname>Boureau</surname> <given-names>Y.-L.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Anticipating safety issues in e2e conversational AI: framework and tooling</article-title>. <source>arXiv [Preprint]. arXiv:2107.03451</source>.</citation>
</ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dinan</surname> <given-names>E.</given-names></name> <name><surname>Roller</surname> <given-names>S.</given-names></name> <name><surname>Shuster</surname> <given-names>K.</given-names></name> <name><surname>Fan</surname> <given-names>A.</given-names></name> <name><surname>Auli</surname> <given-names>M.</given-names></name> <name><surname>Weston</surname> <given-names>J.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Wizard of Wikipedia: knowledge-powered conversational agents,&#x0201D;</article-title> in <source>International Conference on Learning Representations (Kigali)</source>.</citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Donnelly</surname> <given-names>K.</given-names></name></person-group> (<year>2006</year>). <article-title>SNOMED-CT: the advanced terminology and coding system for eHealth</article-title>. <source>Stud. Health Technol. Inform</source>. <volume>121</volume>, <fpage>279</fpage>. <pub-id pub-id-type="pmid">17095826</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Duggan</surname> <given-names>K. Z.</given-names></name></person-group> (<year>1972</year>). <source>Limbic Mental Health E-Triage Chatbot Gets UKCA Certification</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.fdanews.com/articles/210983-limbic-mental-health-e-triage-chatbot-gets-ukca-certification">https://www.fdanews.com/articles/210983-limbic-mental-health-e-triage-chatbot-gets-ukca-certification</ext-link> (accessed September 22, 2023).</citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fadhil</surname> <given-names>A.</given-names></name></person-group> (<year>2018</year>). <article-title>A conversational interface to improve medication adherence: towards AI support in patient&#x00027;s treatment</article-title>. <source>arXiv [Preprint]. arXiv:1803.09844</source>.</citation>
</ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>First</surname> <given-names>M. B.</given-names></name></person-group> (<year>2014</year>). <article-title>Structured clinical interview for the DSM (SCID)</article-title>. <source>Encyclop. Clin. Psychol</source>. <volume>351</volume>, <fpage>1</fpage>&#x02013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.1002/9781118625392.wbecp351</pub-id></citation>
</ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fitzpatrick</surname> <given-names>K. K.</given-names></name> <name><surname>Darcy</surname> <given-names>A.</given-names></name> <name><surname>Vierhile</surname> <given-names>M.</given-names></name></person-group> (<year>2017</year>). <article-title>Delivering cognitive behavior therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (Woebot): a randomized controlled trial</article-title>. <source>JMIR Mental Health</source> <volume>4</volume>, <fpage>e7785</fpage>. <pub-id pub-id-type="doi">10.2196/mental.7785</pub-id><pub-id pub-id-type="pmid">28588005</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Floridi</surname> <given-names>L.</given-names></name> <name><surname>Chiriatti</surname> <given-names>M.</given-names></name></person-group> (<year>2020</year>). <article-title>GPT-3: its nature, scope, limits, and consequences</article-title>. <source>Minds Mach</source>. <volume>30</volume>, <fpage>681</fpage>&#x02013;<lpage>694</lpage>. <pub-id pub-id-type="doi">10.1007/s11023-020-09548-1</pub-id></citation>
</ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fluri</surname> <given-names>L.</given-names></name> <name><surname>Paleka</surname> <given-names>D.</given-names></name> <name><surname>Tram&#x000E8;r</surname> <given-names>F.</given-names></name></person-group> (<year>2023</year>). <article-title>Evaluating superhuman models with consistency checks</article-title>. <source>arXiv [Preprint]. arXiv:2306.09983</source>.</citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fulmer</surname> <given-names>R.</given-names></name> <name><surname>Joerin</surname> <given-names>A.</given-names></name> <name><surname>Gentile</surname> <given-names>B.</given-names></name> <name><surname>Lakerink</surname> <given-names>L.</given-names></name> <name><surname>Rauws</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Using psychological artificial intelligence (Tess) to relieve symptoms of depression and anxiety: randomized controlled trial</article-title>. <source>JMIR Mental Health</source> <volume>5</volume>, <fpage>e9782</fpage>. <pub-id pub-id-type="doi">10.2196/preprints.9782</pub-id><pub-id pub-id-type="pmid">30545815</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gaur</surname> <given-names>M.</given-names></name> <name><surname>Gunaratna</surname> <given-names>K.</given-names></name> <name><surname>Bhatt</surname> <given-names>S.</given-names></name> <name><surname>Sheth</surname> <given-names>A.</given-names></name></person-group> (<year>2022a</year>). <article-title>Knowledge-infused learning: a sweet spot in neuro-symbolic AI</article-title>. <source>IEEE Inter. Comp</source>. <volume>26</volume>, <fpage>5</fpage>&#x02013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1109/MIC.2022.3179759</pub-id></citation>
</ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gaur</surname> <given-names>M.</given-names></name> <name><surname>Gunaratna</surname> <given-names>K.</given-names></name> <name><surname>Srinivasan</surname> <given-names>V.</given-names></name> <name><surname>Jin</surname> <given-names>H.</given-names></name></person-group> (<year>2022b</year>). <article-title>ISEEQ: information seeking question generation using dynamic meta-information retrieval and knowledge graphs</article-title>. <source>Proc. Innov. Appl. Artif. Intell. Conf</source>. <volume>36</volume>, <fpage>10672</fpage>&#x02013;<lpage>10680</lpage>. <pub-id pub-id-type="doi">10.1609/aaai.v36i10.21312</pub-id></citation>
</ref>
<ref id="B41">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ghandeharioun</surname> <given-names>A.</given-names></name> <name><surname>McDuff</surname> <given-names>D.</given-names></name> <name><surname>Czerwinski</surname> <given-names>M.</given-names></name> <name><surname>Rowan</surname> <given-names>K.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Emma: An emotion-aware wellbeing chatbot,&#x0201D;</article-title> in <source>2019 8th International Conference on Affective Computing and Intelligent Interaction (ACII)</source>. <publisher-loc>Cambridge, UK</publisher-loc>: <publisher-name>IEEE</publisher-name>, <fpage>1</fpage>&#x02013;<lpage>7</lpage>.</citation>
</ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ghosh</surname> <given-names>S.</given-names></name> <name><surname>Ekbal</surname> <given-names>A.</given-names></name> <name><surname>Bhattacharyya</surname> <given-names>P.</given-names></name></person-group> (<year>2022</year>). <article-title>Am I no good? Towards detecting perceived burdensomeness and thwarted belongingness from suicide notes</article-title>. <source>arXiv [Preprint]. arXiv:2206.06141</source>. <pub-id pub-id-type="doi">10.24963/ijcai.2022/704</pub-id></citation>
</ref>
<ref id="B43">
<citation citation-type="web"><person-group person-group-type="author"><collab>Ginger</collab></person-group> (<year>2011</year>). <source>In-the-Moment Care for Every Emotion</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.ginger.com">https://www.ginger.com</ext-link> (accessed September 23, 2023).</citation>
</ref>
<ref id="B44">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Grigoruta</surname> <given-names>C.</given-names></name></person-group> (<year>2018</year>). <source>Why We Need Mental Health Chatbots</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://woebothealth.com/why-we-need-mental-health-chatbots">https://woebothealth.com/why-we-need-mental-health-chatbots</ext-link> (accessed September 23, 2023).</citation>
</ref>
<ref id="B45">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gupta</surname> <given-names>K.</given-names></name></person-group> (<year>2022</year>). <source>Deepmind Introduces &#x00027;Sparrow,&#x00027; An Artificial Intelligence-Powered Chatbot Developed to Build Safer Machine Learning Systems</source>. <publisher-loc>California</publisher-loc>: <publisher-name>MarkTechPost</publisher-name>.</citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gupta</surname> <given-names>S.</given-names></name> <name><surname>Agarwal</surname> <given-names>A.</given-names></name> <name><surname>Gaur</surname> <given-names>M.</given-names></name> <name><surname>Roy</surname> <given-names>K.</given-names></name> <name><surname>Narayanan</surname> <given-names>V.</given-names></name> <name><surname>Kumaraguru</surname> <given-names>P.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>Learning to automate follow-up question generation using process knowledge for depression triage on Reddit posts</article-title>. <source>arXiv</source>. <pub-id pub-id-type="doi">10.18653/v1/2022.clpsych-1.12</pub-id></citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gyrard</surname> <given-names>A.</given-names></name> <name><surname>Boudaoud</surname> <given-names>K.</given-names></name></person-group> (<year>2022</year>). <article-title>Interdisciplinary IoT and emotion knowledge graph-based recommendation system to boost mental health</article-title>. <source>Appl. Sci</source>. <volume>12</volume>, <fpage>9712</fpage>. <pub-id pub-id-type="doi">10.3390/app12199712</pub-id></citation>
</ref>
<ref id="B48">
<citation citation-type="web"><person-group person-group-type="author"><collab>Harper</collab></person-group> (<year>2023</year>). <source>Limbic Access AI Conversational Chatbot for e-triage - <italic>Digital Marketplace</italic> &#x02013; <italic>Applytosupply</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.applytosupply.digitalmarketplace.service.gov.uk/g-cloud/services/350866714426117">https://www.applytosupply.digitalmarketplace.service.gov.uk/g-cloud/services/350866714426117</ext-link> (accessed July 29, 2023).</citation>
</ref>
<ref id="B49">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Harrison</surname> <given-names>C.</given-names></name></person-group> (<year>2023</year>). <source>GitHub - <italic>Langchain-ai/langchain: Building Applications with LLMs Through Composability</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://github.com/hwchase17/langchain">https://github.com/hwchase17/langchain</ext-link> (accessed July 30, 2023).</citation>
</ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hartmann</surname> <given-names>R.</given-names></name> <name><surname>Sander</surname> <given-names>C.</given-names></name> <name><surname>Lorenz</surname> <given-names>N.</given-names></name> <name><surname>B&#x000F6;ttger</surname> <given-names>D.</given-names></name> <name><surname>Hegerl</surname> <given-names>U.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Utilization of patient-generated data collected through mobile devices: insights from a survey on attitudes toward mobile self-monitoring and self-management apps for depression</article-title>. <source>JMIR Mental Health</source> <volume>6</volume>, <fpage>e11671</fpage>. <pub-id pub-id-type="doi">10.2196/11671</pub-id><pub-id pub-id-type="pmid">30942693</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Henderson</surname> <given-names>P.</given-names></name> <name><surname>Sinha</surname> <given-names>K.</given-names></name> <name><surname>Angelard-Gontier</surname> <given-names>N.</given-names></name> <name><surname>Ke</surname> <given-names>N. R.</given-names></name> <name><surname>Fried</surname> <given-names>G.</given-names></name> <name><surname>Lowe</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>&#x0201C;Ethical challenges in data-driven dialogue systems,&#x0201D;</article-title> in <source>Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society</source>, <fpage>123</fpage>&#x02013;<lpage>129</lpage>. <pub-id pub-id-type="doi">10.1145/3278721.3278777</pub-id></citation>
</ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hendrycks</surname> <given-names>D.</given-names></name> <name><surname>Mazeika</surname> <given-names>M.</given-names></name> <name><surname>Woodside</surname> <given-names>T.</given-names></name></person-group> (<year>2023</year>). <article-title>An overview of catastrophic AI risks</article-title>. <source>arXiv</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2306.12001</pub-id></citation>
</ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hennemann</surname> <given-names>S.</given-names></name> <name><surname>Kuhn</surname> <given-names>S.</given-names></name> <name><surname>Witth&#x000F6;ft</surname> <given-names>M.</given-names></name> <name><surname>Jungmann</surname> <given-names>S. M.</given-names></name></person-group> (<year>2022</year>). <article-title>Diagnostic performance of an app-based symptom checker in mental disorders: comparative study in psychotherapy outpatients</article-title>. <source>JMIR Mental Health</source> <volume>9</volume>, <fpage>e32832</fpage>. <pub-id pub-id-type="doi">10.2196/32832</pub-id><pub-id pub-id-type="pmid">35099395</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoffman</surname> <given-names>R. R.</given-names></name> <name><surname>Mueller</surname> <given-names>S. T.</given-names></name> <name><surname>Klein</surname> <given-names>G.</given-names></name> <name><surname>Litman</surname> <given-names>J.</given-names></name></person-group> (<year>2018</year>). <article-title>Metrics for explainable AI: challenges and prospects</article-title>. <source>arXiv [Preprint]. arXiv:1812.04608</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1812.04608</pub-id></citation>
</ref>
<ref id="B55">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>R.</given-names></name></person-group> (<year>2015</year>). <source>Language Use in Teenage Crisis Intervention and the Immediate Outcome: A Machine Automated Analysis of Large Scale Text Data</source> (PhD thesis, Master&#x00027;s thesis). <publisher-loc>New York</publisher-loc>: <publisher-name>Columbia University</publisher-name>.</citation>
</ref>
<ref id="B56">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hyman</surname> <given-names>I.</given-names></name></person-group> (<year>2008</year>). <source>Self-Disclosure and its Impact on Individuals Who Receive Mental Health Services (HHS Pub. No. SMA-08-4337)</source>. <publisher-loc>Rockville, MD</publisher-loc>: <publisher-name>Center for Mental Health Services, Substance Abuse and Mental Health Services Administration</publisher-name>.</citation>
</ref>
<ref id="B57">
<citation citation-type="web"><person-group person-group-type="author"><collab>Ineqe</collab></person-group> (<year>2022</year>). <source>What You Need to Know About Replika</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://ineqe.com/2022/01/20/replika-ai-friend/">https://ineqe.com/2022/01/20/replika-ai-friend/</ext-link></citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Inkster</surname> <given-names>B.</given-names></name> <name><surname>Sarda</surname> <given-names>S.</given-names></name> <name><surname>Subramanian</surname> <given-names>V.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>An empathy-driven, conversational artificial intelligence agent (Wysa) for digital mental well-being: real-world data evaluation mixed-methods study</article-title>. <source>JMIR mHealth</source> <volume>6</volume>, <fpage>e12106</fpage>. <pub-id pub-id-type="doi">10.2196/12106</pub-id><pub-id pub-id-type="pmid">30470676</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Joyce</surname> <given-names>D. W.</given-names></name> <name><surname>Kormilitzin</surname> <given-names>A.</given-names></name> <name><surname>Smith</surname> <given-names>K. A.</given-names></name> <name><surname>Cipriani</surname> <given-names>A.</given-names></name></person-group> (<year>2023</year>). <article-title>Explainable artificial intelligence for mental health through transparency and interpretability for understandability</article-title>. <source>NPJ Digital Med</source>. <volume>6</volume>, <fpage>6</fpage>. <pub-id pub-id-type="doi">10.1038/s41746-023-00751-9</pub-id><pub-id pub-id-type="pmid">36653524</pub-id></citation></ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kane</surname> <given-names>H.</given-names></name> <name><surname>Kocyigit</surname> <given-names>M. Y.</given-names></name> <name><surname>Abdalla</surname> <given-names>A.</given-names></name> <name><surname>Ajanoh</surname> <given-names>P.</given-names></name> <name><surname>Coulibali</surname> <given-names>M.</given-names></name></person-group> (<year>2020</year>). <article-title>NUBIA: neural based interchangeability assessor for text generation</article-title>. <source>arXiv [Preprint]. arXiv:2004.14667</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2004.14667</pub-id></citation>
</ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kazi</surname> <given-names>H.</given-names></name> <name><surname>Chowdhry</surname> <given-names>B. S.</given-names></name> <name><surname>Memon</surname> <given-names>Z.</given-names></name></person-group> (<year>2012</year>). <article-title>MedChatBot: an UMLS based chatbot for medical students</article-title>. <source>Int. J. Comp. Appl</source>. <volume>55</volume>, <fpage>1</fpage>&#x02013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.5120/8844-2886</pub-id></citation>
</ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>H.</given-names></name> <name><surname>Yu</surname> <given-names>Y.</given-names></name> <name><surname>Jiang</surname> <given-names>L.</given-names></name> <name><surname>Lu</surname> <given-names>X.</given-names></name> <name><surname>Khashabi</surname> <given-names>D.</given-names></name> <name><surname>Kim</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>&#x0201C;ProsocialDialog: a prosocial backbone for conversational agents,&#x0201D;</article-title> in <source>Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing</source>, <fpage>4005</fpage>&#x02013;<lpage>4029</lpage>. <pub-id pub-id-type="doi">10.18653/v1/2022.emnlp-main.267</pub-id></citation>
</ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kitaev</surname> <given-names>N.</given-names></name> <name><surname>Klein</surname> <given-names>D.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Constituency parsing with a self-attentive encoder,&#x0201D;</article-title> in <source>Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>, 2676&#x02013;2686. <pub-id pub-id-type="doi">10.18653/v1/P18-1249</pub-id></citation>
</ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kocaman</surname> <given-names>V.</given-names></name> <name><surname>Talby</surname> <given-names>D.</given-names></name></person-group> (<year>2022</year>). <article-title>Accurate clinical and biomedical named entity recognition at scale</article-title>. <source>Softw. Impacts</source>. <volume>13</volume>, <fpage>100373</fpage>. <pub-id pub-id-type="doi">10.1016/j.simpa.2022.100373</pub-id></citation>
</ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koulouri</surname> <given-names>T.</given-names></name> <name><surname>Macredie</surname> <given-names>R. D.</given-names></name> <name><surname>Olakitan</surname> <given-names>D.</given-names></name></person-group> (<year>2022</year>). <article-title>Chatbots to support young adults&#x00027; mental health: an exploratory study of acceptability</article-title>. <source>ACM Trans. Interact. Intell. Syst</source>. <volume>12</volume>, <fpage>1</fpage>&#x02013;<lpage>39</lpage>. <pub-id pub-id-type="doi">10.1145/3485874</pub-id></citation>
</ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koutsouleris</surname> <given-names>N.</given-names></name> <name><surname>Hauser</surname> <given-names>T. U.</given-names></name> <name><surname>Skvortsova</surname> <given-names>V.</given-names></name> <name><surname>De Choudhury</surname> <given-names>M.</given-names></name></person-group> (<year>2022</year>). <article-title>From promise to practice: towards the realisation of AI-informed mental health care</article-title>. <source>Lancet Digital Health</source>. <volume>4</volume>, <fpage>e829</fpage>&#x02013;<lpage>e840</lpage>. <pub-id pub-id-type="doi">10.1016/S2589-7500(22)00153-4</pub-id><pub-id pub-id-type="pmid">36229346</pub-id></citation></ref>
<ref id="B67">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kroenke</surname> <given-names>K.</given-names></name> <name><surname>Spitzer</surname> <given-names>R. L.</given-names></name> <name><surname>Williams</surname> <given-names>J. B.</given-names></name></person-group> (<year>2001</year>). <article-title>The PHQ-9: validity of a brief depression severity measure</article-title>. <source>J. Gen. Intern. Med</source>. <volume>16</volume>, <fpage>606</fpage>&#x02013;<lpage>613</lpage>. <pub-id pub-id-type="doi">10.1046/j.1525-1497.2001.016009606.x</pub-id><pub-id pub-id-type="pmid">11556941</pub-id></citation></ref>
<ref id="B68">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kruzan</surname> <given-names>K. P.</given-names></name></person-group> (<year>2019</year>). <source>Self-Injury Support Online: Exploring Use of the Mobile Peer Support Application TalkLife</source>. <publisher-loc>Ithaca, NY</publisher-loc>: <publisher-name>Cornell University</publisher-name>.</citation>
</ref>
<ref id="B69">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kulkarni</surname> <given-names>M.</given-names></name> <name><surname>Mahata</surname> <given-names>D.</given-names></name> <name><surname>Arora</surname> <given-names>R.</given-names></name> <name><surname>Bhowmik</surname> <given-names>R.</given-names></name></person-group> (<year>2022</year>). <article-title>&#x0201C;Learning rich representation of keyphrases from text,&#x0201D;</article-title> in <source>Findings of the Association for Computational Linguistics: NAACL</source>, 891&#x02013;906. <pub-id pub-id-type="doi">10.18653/v1/2022.findings-naacl.67</pub-id></citation>
</ref>
<ref id="B70">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>G.-H.</given-names></name> <name><surname>Jin</surname> <given-names>W.</given-names></name> <name><surname>Alvarez-Melis</surname> <given-names>D.</given-names></name> <name><surname>Jaakkola</surname> <given-names>T.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Functional transparency for structured data: a game-theoretic approach,&#x0201D;</article-title> in <source>International Conference on Machine Learning</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>PMLR</publisher-name>, <fpage>3723</fpage>&#x02013;<lpage>3733</lpage>.</citation>
</ref>
<ref id="B71">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>J. S.</given-names></name> <name><surname>Liang</surname> <given-names>B.</given-names></name> <name><surname>Fong</surname> <given-names>H. H.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Restatement and question generation for counsellor chatbot,&#x0201D;</article-title> in <source>1st Workshop on Natural Language Processing for Programming (NLP4Prog)</source>. <publisher-loc>Stroudsburg</publisher-loc>: <publisher-name>Association for Computational Linguistics (ACL)</publisher-name>, <fpage>1</fpage>&#x02013;<lpage>7</lpage>.</citation>
</ref>
<ref id="B72">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leiter</surname> <given-names>C.</given-names></name> <name><surname>Zhang</surname> <given-names>R.</given-names></name> <name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Belouadi</surname> <given-names>J.</given-names></name> <name><surname>Larionov</surname> <given-names>D.</given-names></name> <name><surname>Fresen</surname> <given-names>V.</given-names></name> <etal/></person-group>. (<year>2023</year>). <source>ChatGPT: A Meta-Analysis After 2.5 Months</source>.</citation>
</ref>
<ref id="B73">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Liang</surname> <given-names>K.-H.</given-names></name> <name><surname>Lange</surname> <given-names>P.</given-names></name> <name><surname>Oh</surname> <given-names>Y. J.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Fukuoka</surname> <given-names>Y.</given-names></name> <name><surname>Yu</surname> <given-names>Z.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Evaluation of in-person counseling strategies to develop physical activity chatbot for women,&#x0201D;</article-title> in <source>Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue</source> (<publisher-loc>Singapore</publisher-loc>), <fpage>32</fpage>&#x02013;<lpage>44</lpage>.</citation>
</ref>
<ref id="B74">
<citation citation-type="web"><person-group person-group-type="author"><collab>Limbic</collab></person-group> (<year>2017</year>). <source>Enabling the Best Psychological Therapy</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://limbic.ai/">https://limbic.ai/</ext-link></citation>
</ref>
<ref id="B75">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Limsopatham</surname> <given-names>N.</given-names></name> <name><surname>Collier</surname> <given-names>N.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Normalising medical concepts in social media texts by learning semantic representation,&#x0201D;</article-title> in <source>Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (volume 1: long papers)</source>, <fpage>1014</fpage>&#x02013;<lpage>1023</lpage>.</citation>
</ref>
<ref id="B76">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>C.-Y.</given-names></name></person-group> (<year>2004</year>). <article-title>&#x0201C;ROUGE: a package for automatic evaluation of summaries,&#x0201D;</article-title> in <source>Text Summarization Branches Out</source> (<publisher-loc>Barcelona</publisher-loc>), <fpage>74</fpage>&#x02013;<lpage>81</lpage>.</citation>
</ref>
<ref id="B77">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>C.-H. S.</given-names></name> <name><surname>Lee</surname> <given-names>T.</given-names></name></person-group> (<year>2016</year>). <article-title>Service quality and price perception of service: influence on word-of-mouth and revisit intention</article-title>. <source>J. Air Transport Manage</source>. <volume>52</volume>, <fpage>42</fpage>&#x02013;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.1016/j.jairtraman.2015.12.007</pub-id></citation>
</ref>
<ref id="B78">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>S.</given-names></name> <name><surname>Ma</surname> <given-names>W.</given-names></name> <name><surname>Moore</surname> <given-names>R.</given-names></name> <name><surname>Ganesan</surname> <given-names>V.</given-names></name> <name><surname>Nelson</surname> <given-names>S.</given-names></name></person-group> (<year>2005</year>). <article-title>RxNorm: prescription for electronic drug information exchange</article-title>. <source>IT Prof</source>. <volume>7</volume>, <fpage>17</fpage>&#x02013;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.1109/MITP.2005.122</pub-id></citation>
</ref>
<ref id="B79">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>S.</given-names></name> <name><surname>Zhu</surname> <given-names>Z.</given-names></name> <name><surname>Ye</surname> <given-names>N.</given-names></name> <name><surname>Guadarrama</surname> <given-names>S.</given-names></name> <name><surname>Murphy</surname> <given-names>K.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Improved image captioning via policy gradient optimization of SPIDEr,&#x0201D;</article-title> in <source>Proceedings of the IEEE International Conference on Computer Vision</source>, <fpage>873</fpage>&#x02013;<lpage>881</lpage>.</citation>
</ref>
<ref id="B80">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Longo</surname> <given-names>L.</given-names></name> <name><surname>Goebel</surname> <given-names>R.</given-names></name> <name><surname>Lecue</surname> <given-names>F.</given-names></name> <name><surname>Kieseberg</surname> <given-names>P.</given-names></name> <name><surname>Holzinger</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Explainable artificial intelligence: concepts, applications, research challenges and visions,&#x0201D;</article-title> in <source>Machine Learning and Knowledge Extraction</source>, eds. <person-group person-group-type="editor"><name><surname>Holzinger</surname> <given-names>A.</given-names></name> <name><surname>Kieseberg</surname> <given-names>P.</given-names></name> <name><surname>Tjoa</surname> <given-names>A. M.</given-names></name> <name><surname>Weippl</surname> <given-names>E.</given-names></name></person-group>. <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>. <pub-id pub-id-type="doi">10.1007/978-3-030-57321-8_1</pub-id></citation>
</ref>
<ref id="B81">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lundberg</surname> <given-names>S. M.</given-names></name> <name><surname>Lee</surname> <given-names>S.-I.</given-names></name></person-group> (<year>2017</year>). <article-title>A unified approach to interpreting model predictions</article-title>. <source>Adv. Neural Inf. Process. Syst</source>. <volume>30</volume>. <pub-id pub-id-type="doi">10.48550/arXiv.1705.07874</pub-id></citation>
</ref>
<ref id="B82">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meade</surname> <given-names>N.</given-names></name> <name><surname>Gella</surname> <given-names>S.</given-names></name> <name><surname>Hazarika</surname> <given-names>D.</given-names></name> <name><surname>Gupta</surname> <given-names>P.</given-names></name> <name><surname>Jin</surname> <given-names>D.</given-names></name> <name><surname>Reddy</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2023</year>). <article-title>Using in-context learning to improve dialogue safety</article-title>. <source>arXiv</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2302.00871</pub-id></citation>
</ref>
<ref id="B83">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mertes</surname> <given-names>S.</given-names></name> <name><surname>Huber</surname> <given-names>T.</given-names></name> <name><surname>Weitz</surname> <given-names>K.</given-names></name> <name><surname>Heimerl</surname> <given-names>A.</given-names></name> <name><surname>Andr&#x000E9;</surname> <given-names>E.</given-names></name></person-group> (<year>2022</year>). <article-title>GANterfactual&#x02013;counterfactual explanations for medical non-experts using generative adversarial learning</article-title>. <source>Front. Artif. Intell</source>. <volume>5</volume>, <fpage>825565</fpage>. <pub-id pub-id-type="doi">10.3389/frai.2022.825565</pub-id><pub-id pub-id-type="pmid">35464995</pub-id></citation></ref>
<ref id="B84">
<citation citation-type="web"><person-group person-group-type="author"><collab>META</collab></person-group> (<year>2017</year>). <source>FAIR Principles - <italic>GO FAIR</italic></source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.go-fair.org/fair-principles/">https://www.go-fair.org/fair-principles/</ext-link> (accessed September 23, 2023).</citation>
</ref>
<ref id="B85">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Miner</surname> <given-names>A.</given-names></name> <name><surname>Chow</surname> <given-names>A.</given-names></name> <name><surname>Adler</surname> <given-names>S.</given-names></name> <name><surname>Zaitsev</surname> <given-names>I.</given-names></name> <name><surname>Tero</surname> <given-names>P.</given-names></name> <name><surname>Darcy</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>&#x0201C;Conversational agents and mental health: theory-informed assessment of language and affect,&#x0201D;</article-title> in <source>Proceedings of the Fourth International Conference on Human Agent Interaction</source>, <fpage>123</fpage>&#x02013;<lpage>130</lpage>. <pub-id pub-id-type="doi">10.1145/2974804.2974820</pub-id></citation>
</ref>
<ref id="B86">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Noble</surname> <given-names>J. M.</given-names></name> <name><surname>Zamani</surname> <given-names>A.</given-names></name> <name><surname>Gharaat</surname> <given-names>M.</given-names></name> <name><surname>Merrick</surname> <given-names>D.</given-names></name> <name><surname>Maeda</surname> <given-names>N.</given-names></name> <name><surname>Foster</surname> <given-names>A. L.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>Developing, implementing, and evaluating an artificial intelligence&#x02013;guided mental health resource navigation chatbot for health care workers and their families during and following the COVID-19 pandemic: protocol for a cross-sectional study</article-title>. <source>JMIR Res Protoc</source>. <volume>11</volume>:<fpage>e33717</fpage>. <pub-id pub-id-type="doi">10.2196/33717</pub-id><pub-id pub-id-type="pmid">35877158</pub-id></citation></ref>
<ref id="B87">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Papineni</surname> <given-names>K.</given-names></name> <name><surname>Roukos</surname> <given-names>S.</given-names></name> <name><surname>Ward</surname> <given-names>T.</given-names></name> <name><surname>Zhu</surname> <given-names>W.-J.</given-names></name></person-group> (<year>2002</year>). <article-title>&#x0201C;BLEU: a method for automatic evaluation of machine translation,&#x0201D;</article-title> in <source>Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics</source>, <fpage>311</fpage>&#x02013;<lpage>318</lpage>. <pub-id pub-id-type="doi">10.3115/1073083.1073135</pub-id></citation>
</ref>
<ref id="B88">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Perez</surname> <given-names>E.</given-names></name> <name><surname>Huang</surname> <given-names>S.</given-names></name> <name><surname>Song</surname> <given-names>F.</given-names></name> <name><surname>Cai</surname> <given-names>T.</given-names></name> <name><surname>Ring</surname> <given-names>R.</given-names></name> <name><surname>Aslanides</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>&#x0201C;Red teaming language models with language models,&#x0201D;</article-title> in <source>Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing</source>, <fpage>3419</fpage>&#x02013;<lpage>3448</lpage>. <pub-id pub-id-type="doi">10.18653/v1/2022.emnlp-main.225</pub-id></citation>
</ref>
<ref id="B89">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peterson</surname> <given-names>C.</given-names></name></person-group> (<year>2023</year>). <article-title>ChatGPT and medicine: fears, fantasy, and the future of physicians</article-title>. <source>Southwest Respir. Crit. Care Chron</source>. <volume>11</volume>, <fpage>18</fpage>&#x02013;<lpage>30</lpage>. <pub-id pub-id-type="doi">10.12746/swrccc.v11i48.1193</pub-id></citation>
</ref>
<ref id="B90">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Posner</surname> <given-names>K.</given-names></name> <name><surname>Brent</surname> <given-names>D.</given-names></name> <name><surname>Lucas</surname> <given-names>C.</given-names></name> <name><surname>Gould</surname> <given-names>M.</given-names></name> <name><surname>Stanley</surname> <given-names>B.</given-names></name> <name><surname>Brown</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2008</year>). <source>Columbia-Suicide Severity Rating Scale (C-SSRS)</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Columbia University Medical Center</publisher-name>.</citation>
</ref>
<ref id="B91">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Possati</surname> <given-names>L. M.</given-names></name></person-group> (<year>2022</year>). <article-title>Psychoanalyzing artificial intelligence: the case of Replika</article-title>. <source>AI Society</source>. <volume>38</volume>, <fpage>1725</fpage>&#x02013;<lpage>1738</lpage>. <pub-id pub-id-type="doi">10.1007/s00146-021-01379-7</pub-id></citation>
</ref>
<ref id="B92">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Powell</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). <article-title>Trust me, I&#x00027;m a chatbot: how artificial intelligence in health care fails the Turing test</article-title>. <source>J. Med. Internet Res</source>. <volume>21</volume>, <fpage>e16222</fpage>. <pub-id pub-id-type="doi">10.2196/16222</pub-id><pub-id pub-id-type="pmid">31661083</pub-id></citation></ref>
<ref id="B93">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qian</surname> <given-names>Q.</given-names></name> <name><surname>Huang</surname> <given-names>M.</given-names></name> <name><surname>Zhao</surname> <given-names>H.</given-names></name> <name><surname>Xu</surname> <given-names>J.</given-names></name> <name><surname>Zhu</surname> <given-names>X.</given-names></name></person-group> (<year>2018</year>). <article-title>Assigning personality/profile to a chatting machine for coherent conversation generation</article-title>. <source>IJCAI</source>. <volume>2018</volume>, <fpage>4279</fpage>&#x02013;<lpage>4285</lpage>. <pub-id pub-id-type="doi">10.24963/ijcai.2018/595</pub-id></citation>
</ref>
<ref id="B94">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Quan</surname> <given-names>H.</given-names></name> <name><surname>Sundararajan</surname> <given-names>V.</given-names></name> <name><surname>Halfon</surname> <given-names>P.</given-names></name> <name><surname>Fong</surname> <given-names>A.</given-names></name> <name><surname>Burnand</surname> <given-names>B.</given-names></name> <name><surname>Luthi</surname> <given-names>J.-C.</given-names></name> <etal/></person-group>. (<year>2005</year>). <article-title>Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data</article-title>. <source>Med. Care</source>. <volume>43</volume>, <fpage>1130</fpage>&#x02013;<lpage>1139</lpage>. <pub-id pub-id-type="doi">10.1097/01.mlr.0000182534.19832.83</pub-id><pub-id pub-id-type="pmid">16224307</pub-id></citation></ref>
<ref id="B95">
<citation citation-type="web"><person-group person-group-type="author"><collab>Quartet</collab></person-group> (<year>2014</year>). <source>Mental Health Care, Made Easier</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.quartethealth.com">https://www.quartethealth.com</ext-link> (accessed September 23, 2023).</citation>
</ref>
<ref id="B96">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rai</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>Explainable AI: from black box to glass box</article-title>. <source>J. Acad. Market. Sci</source>. <volume>48</volume>, <fpage>137</fpage>&#x02013;<lpage>141</lpage>. <pub-id pub-id-type="doi">10.1007/s11747-019-00710-5</pub-id></citation>
</ref>
<ref id="B97">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rashkin</surname> <given-names>H.</given-names></name> <name><surname>Smith</surname> <given-names>E. M.</given-names></name> <name><surname>Li</surname> <given-names>M.</given-names></name> <name><surname>Boureau</surname> <given-names>Y.-L.</given-names></name></person-group> (<year>2018</year>). <article-title>Towards empathetic open-domain conversation models: a new benchmark and dataset</article-title>. <source>arXiv</source>. <pub-id pub-id-type="doi">10.18653/v1/P19-1534</pub-id></citation>
</ref>
<ref id="B98">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Raza</surname> <given-names>S.</given-names></name> <name><surname>Schwartz</surname> <given-names>B.</given-names></name> <name><surname>Rosella</surname> <given-names>L. C.</given-names></name></person-group> (<year>2022</year>). <article-title>CoQUAD: a COVID-19 question answering dataset system, facilitating research, benchmarking, and practice</article-title>. <source>BMC Bioinformat</source>. <volume>23</volume>, <fpage>1</fpage>&#x02013;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.1186/s12859-022-04751-6</pub-id><pub-id pub-id-type="pmid">35655148</pub-id></citation></ref>
<ref id="B99">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Regier</surname> <given-names>D. A.</given-names></name> <name><surname>Kuhl</surname> <given-names>E. A.</given-names></name> <name><surname>Kupfer</surname> <given-names>D. J.</given-names></name></person-group> (<year>2013</year>). <article-title>The DSM-5: classification and criteria changes</article-title>. <source>World Psychiat</source>. <volume>12</volume>, <fpage>92</fpage>&#x02013;<lpage>98</lpage>. <pub-id pub-id-type="doi">10.1002/wps.20050</pub-id><pub-id pub-id-type="pmid">23737408</pub-id></citation></ref>
<ref id="B100">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ribeiro</surname> <given-names>M. T.</given-names></name> <name><surname>Singh</surname> <given-names>S.</given-names></name> <name><surname>Guestrin</surname> <given-names>C.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Why should I trust you? Explaining the predictions of any classifier,&#x0201D;</article-title> in <source>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>, <fpage>1135</fpage>&#x02013;<lpage>1144</lpage>. <pub-id pub-id-type="doi">10.1145/2939672.2939778</pub-id></citation>
</ref>
<ref id="B101">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rollwage</surname> <given-names>M.</given-names></name> <name><surname>Juchems</surname> <given-names>K.</given-names></name> <name><surname>Habicht</surname> <given-names>J.</given-names></name> <name><surname>Carrington</surname> <given-names>B.</given-names></name> <name><surname>Hauser</surname> <given-names>T.</given-names></name> <name><surname>Harper</surname> <given-names>R.</given-names></name></person-group> (<year>2022</year>). <article-title>Conversational AI facilitates mental health assessments and is associated with improved recovery rates</article-title>. <source>medRxiv</source>. <pub-id pub-id-type="doi">10.1101/2022.11.03.22281887</pub-id></citation>
</ref>
<ref id="B102">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Romanov</surname> <given-names>A.</given-names></name> <name><surname>Shivade</surname> <given-names>C.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Lessons from natural language inference in the clinical domain,&#x0201D;</article-title> in <source>Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</source> (<publisher-loc>Brussels</publisher-loc>), <fpage>1586</fpage>&#x02013;<lpage>1596</lpage>. <pub-id pub-id-type="doi">10.18653/v1/D18-1187</pub-id></citation>
</ref>
<ref id="B103">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roy</surname> <given-names>K.</given-names></name> <name><surname>Gaur</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>Q.</given-names></name> <name><surname>Sheth</surname> <given-names>A.</given-names></name></person-group> (<year>2022b</year>). <article-title>Process knowledge-infused learning for suicidality assessment on social media</article-title>. <source>arXiv</source>.</citation>
</ref>
<ref id="B104">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Roy</surname> <given-names>K.</given-names></name> <name><surname>Sheth</surname> <given-names>A.</given-names></name> <name><surname>Gaur</surname> <given-names>M.</given-names></name></person-group> (<year>2023</year>). <source>Alleviate ChatBot</source>. <publisher-loc>Baltimore</publisher-loc>: <publisher-name>UMBC Faculty Collection</publisher-name>.</citation>
</ref>
<ref id="B105">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roy</surname> <given-names>K.</given-names></name> <name><surname>Gaur</surname> <given-names>M.</given-names></name> <name><surname>Rawte</surname> <given-names>V.</given-names></name> <name><surname>Kalyan</surname> <given-names>A.</given-names></name> <name><surname>Sheth</surname> <given-names>A.</given-names></name></person-group> (<year>2022a</year>). <article-title>ProKnow: process knowledge for safety constrained and explainable question generation for mental health diagnostic assistance</article-title>. <source>Front. Big Data</source> <volume>5</volume>, <fpage>1056728</fpage>. <pub-id pub-id-type="doi">10.3389/fdata.2022.1056728</pub-id><pub-id pub-id-type="pmid">36700134</pub-id></citation></ref>
<ref id="B106">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rudin</surname> <given-names>C.</given-names></name></person-group> (<year>2019</year>). <article-title>Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</article-title>. <source>Nat. Mach. Intell</source>. <volume>1</volume>, <fpage>206</fpage>&#x02013;<lpage>215</lpage>. <pub-id pub-id-type="doi">10.1038/s42256-019-0048-x</pub-id><pub-id pub-id-type="pmid">35603010</pub-id></citation></ref>
<ref id="B107">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sallam</surname> <given-names>M.</given-names></name></person-group> (<year>2023</year>). <article-title>ChatGPT utility in healthcare education, research, and practice: systematic review on the promising perspectives and valid concerns</article-title>. <source>Healthcare</source> <volume>11</volume>, <fpage>887</fpage>. <pub-id pub-id-type="doi">10.3390/healthcare11060887</pub-id><pub-id pub-id-type="pmid">36981544</pub-id></citation></ref>
<ref id="B108">
<citation citation-type="web"><person-group person-group-type="author"><collab>SAMHSA</collab></person-group> (<year>2020</year>). <source>2020 National Survey of Drug Use and Health (NSDUH) Releases</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://www.samhsa.gov/data/release/2020-national-survey-drug-use-and-health-nsduh-releases">http://www.samhsa.gov/data/release/2020-national-survey-drug-use-and-health-nsduh-releases</ext-link>.</citation>
</ref>
<ref id="B109">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seitz</surname> <given-names>L.</given-names></name> <name><surname>Bekmeier-Feuerhahn</surname> <given-names>S.</given-names></name> <name><surname>Gohil</surname> <given-names>K.</given-names></name></person-group> (<year>2022</year>). <article-title>Can we trust a chatbot like a physician? A qualitative study on understanding the emergence of trust toward diagnostic chatbots</article-title>. <source>Int. J. Hum. Comput. Stud</source>. <volume>165</volume>, <fpage>102848</fpage>. <pub-id pub-id-type="doi">10.1016/j.ijhcs.2022.102848</pub-id></citation>
</ref>
<ref id="B110">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sharma</surname> <given-names>A.</given-names></name> <name><surname>Lin</surname> <given-names>I. W.</given-names></name> <name><surname>Miner</surname> <given-names>A. S.</given-names></name> <name><surname>Atkins</surname> <given-names>D. C.</given-names></name> <name><surname>Althoff</surname> <given-names>T.</given-names></name></person-group> (<year>2021</year>). <source>Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>. <pub-id pub-id-type="doi">10.1145/3442381.3450097</pub-id></citation>
</ref>
<ref id="B111">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sharma</surname> <given-names>A.</given-names></name> <name><surname>Lin</surname> <given-names>I. W.</given-names></name> <name><surname>Miner</surname> <given-names>A. S.</given-names></name> <name><surname>Atkins</surname> <given-names>D. C.</given-names></name> <name><surname>Althoff</surname> <given-names>T.</given-names></name></person-group> (<year>2023</year>). <article-title>Human-AI collaboration enables more empathic conversations in text-based peer-to-peer mental health support</article-title>. <source>Nat. Mach. Intell</source>. <volume>5</volume>, <fpage>46</fpage>&#x02013;<lpage>57</lpage>. <pub-id pub-id-type="doi">10.1038/s42256-022-00593-2</pub-id></citation>
</ref>
<ref id="B112">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sheth</surname> <given-names>A.</given-names></name> <name><surname>Gaur</surname> <given-names>M.</given-names></name> <name><surname>Roy</surname> <given-names>K.</given-names></name> <name><surname>Faldu</surname> <given-names>K.</given-names></name></person-group> (<year>2021</year>). <article-title>Knowledge-intensive language understanding for explainable AI</article-title>. <source>IEEE Internet Comput</source>. <volume>25</volume>, <fpage>19</fpage>&#x02013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1109/MIC.2021.3101919</pub-id></citation>
</ref>
<ref id="B113">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sheth</surname> <given-names>A.</given-names></name> <name><surname>Gaur</surname> <given-names>M.</given-names></name> <name><surname>Roy</surname> <given-names>K.</given-names></name> <name><surname>Venkataraman</surname> <given-names>R.</given-names></name> <name><surname>Khandelwal</surname> <given-names>V.</given-names></name></person-group> (<year>2022</year>). <article-title>Process knowledge-infused AI: toward user-level explainability, interpretability, and safety</article-title>. <source>IEEE Internet Comput</source>. <volume>26</volume>, <fpage>76</fpage>&#x02013;<lpage>84</lpage>. <pub-id pub-id-type="doi">10.1109/MIC.2022.3182349</pub-id></citation>
</ref>
<ref id="B114">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sheth</surname> <given-names>A.</given-names></name> <name><surname>Yip</surname> <given-names>H. Y.</given-names></name> <name><surname>Shekarpour</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>Extending patient-chatbot experience with internet-of-things and background knowledge: case studies with healthcare applications</article-title>. <source>IEEE Intell. Syst</source>. <volume>34</volume>, <fpage>24</fpage>&#x02013;<lpage>30</lpage>. <pub-id pub-id-type="doi">10.1109/MIS.2019.2905748</pub-id><pub-id pub-id-type="pmid">34690576</pub-id></citation></ref>
<ref id="B115">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>&#x00160;krlj</surname> <given-names>B.</given-names></name> <name><surname>Er&#x0017E;en</surname> <given-names>N.</given-names></name> <name><surname>Sheehan</surname> <given-names>S.</given-names></name> <name><surname>Luz</surname> <given-names>S.</given-names></name> <name><surname>Robnik-&#x00160;ikonja</surname> <given-names>M.</given-names></name> <name><surname>Pollak</surname> <given-names>S.</given-names></name></person-group> (<year>2020</year>). <article-title>AttViz: online exploration of self-attention for transparent neural language modeling</article-title>. <source>arXiv</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2005.05716</pub-id></citation>
</ref>
<ref id="B116">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Sohail</surname> <given-names>S. H.</given-names></name></person-group> (<year>2023</year>). <source>AI Mental Health Chatbot Diagnoses Disorders with 93% Accuracy</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://hitconsultant.net/2023/01/23/ai-mental-health-chatbot-diagnoses-disorders-with-93-accuracy/">https://hitconsultant.net/2023/01/23/ai-mental-health-chatbot-diagnoses-disorders-with-93-accuracy/</ext-link> (accessed July 29, 2023).</citation>
</ref>
<ref id="B117">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Speer</surname> <given-names>R.</given-names></name> <name><surname>Chin</surname> <given-names>J.</given-names></name> <name><surname>Havasi</surname> <given-names>C.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;ConceptNet 5.5: an open multilingual graph of general knowledge,&#x0201D;</article-title> in <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>, <volume>31</volume>. <pub-id pub-id-type="doi">10.1609/aaai.v31i1.11164</pub-id></citation>
</ref>
<ref id="B118">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Srivastava</surname> <given-names>B.</given-names></name></person-group> (<year>2021</year>). <article-title>Did chatbots miss their &#x0201C;Apollo moment&#x0201D;? Potential, gaps, and lessons from using collaboration assistants during COVID-19</article-title>. <source>Patterns</source> <volume>2</volume>, <fpage>100308</fpage>. <pub-id pub-id-type="doi">10.1016/j.patter.2021.100308</pub-id><pub-id pub-id-type="pmid">34430927</pub-id></citation></ref>
<ref id="B119">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stasaski</surname> <given-names>K.</given-names></name> <name><surname>Hearst</surname> <given-names>M. A.</given-names></name></person-group> (<year>2022</year>). <article-title>&#x0201C;Semantic diversity in dialogue with natural language inference,&#x0201D;</article-title> in <source>Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>, <fpage>85</fpage>&#x02013;<lpage>98</lpage>. <pub-id pub-id-type="doi">10.18653/v1/2022.naacl-main.6</pub-id></citation>
</ref>
<ref id="B120">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Su</surname> <given-names>H.</given-names></name> <name><surname>Shen</surname> <given-names>X.</given-names></name> <name><surname>Zhao</surname> <given-names>S.</given-names></name> <name><surname>Xiao</surname> <given-names>Z.</given-names></name> <name><surname>Hu</surname> <given-names>P.</given-names></name> <name><surname>Niu</surname> <given-names>C.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>&#x0201C;Diversifying dialogue generation with non-conversational text,&#x0201D;</article-title> in <source>58th Annual Meeting of the Association for Computational Linguistics</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>ACL</publisher-name>, <fpage>7087</fpage>&#x02013;<lpage>7097</lpage>.</citation>
</ref>
<ref id="B121">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sundararajan</surname> <given-names>M.</given-names></name> <name><surname>Taly</surname> <given-names>A.</given-names></name> <name><surname>Yan</surname> <given-names>Q.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Axiomatic attribution for deep networks,&#x0201D;</article-title> in <source>International Conference on Machine Learning</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>PMLR</publisher-name>, <fpage>3319</fpage>&#x02013;<lpage>3328</lpage>.</citation>
</ref>
<ref id="B122">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sweeney</surname> <given-names>C.</given-names></name> <name><surname>Potts</surname> <given-names>C.</given-names></name> <name><surname>Ennis</surname> <given-names>E.</given-names></name> <name><surname>Bond</surname> <given-names>R.</given-names></name> <name><surname>Mulvenna</surname> <given-names>M. D.</given-names></name> <name><surname>O&#x00027;Neill</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Can chatbots help support a person&#x00027;s mental health? Perceptions and views from mental healthcare professionals and experts</article-title>. <source>ACM Trans. Comp. Healthcare</source> <volume>2</volume>, <fpage>1</fpage>&#x02013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1145/3453175</pub-id></citation>
</ref>
<ref id="B123">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tlili</surname> <given-names>A.</given-names></name> <name><surname>Shehata</surname> <given-names>B.</given-names></name> <name><surname>Adarkwah</surname> <given-names>M. A.</given-names></name> <name><surname>Bozkurt</surname> <given-names>A.</given-names></name> <name><surname>Hickey</surname> <given-names>D. T.</given-names></name> <name><surname>Huang</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2023</year>). <article-title>What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education</article-title>. <source>Smart Learn. Environm</source>. <volume>10</volume>, <fpage>15</fpage>. <pub-id pub-id-type="doi">10.1186/s40561-023-00237-x</pub-id></citation>
</ref>
<ref id="B124">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Trella</surname> <given-names>A. L.</given-names></name> <name><surname>Zhang</surname> <given-names>K. W.</given-names></name> <name><surname>Nahum-Shani</surname> <given-names>I.</given-names></name> <name><surname>Shetty</surname> <given-names>V.</given-names></name> <name><surname>Doshi-Velez</surname> <given-names>F.</given-names></name> <name><surname>Murphy</surname> <given-names>S. A.</given-names></name></person-group> (<year>2022</year>). <article-title>Designing reinforcement learning algorithms for digital interventions: pre-implementation guidelines</article-title>. <source>Algorithms</source> <volume>15</volume>, <fpage>255</fpage>. <pub-id pub-id-type="doi">10.3390/a15080255</pub-id><pub-id pub-id-type="pmid">36713810</pub-id></citation></ref>
<ref id="B125">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Uban</surname> <given-names>A.-S.</given-names></name> <name><surname>Chulvi</surname> <given-names>B.</given-names></name> <name><surname>Rosso</surname> <given-names>P.</given-names></name></person-group> (<year>2021</year>). <article-title>An emotion and cognitive based analysis of mental health disorders from social media data</article-title>. <source>Future Generat. Computer Syst</source>. <volume>124</volume>, <fpage>480</fpage>&#x02013;<lpage>494</lpage>. <pub-id pub-id-type="doi">10.1016/j.future.2021.05.032</pub-id></citation>
</ref>
<ref id="B126">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Varshney</surname> <given-names>K. R.</given-names></name></person-group> (<year>2021</year>). <source>Trustworthy Machine Learning</source>. Chappaqua, NY.</citation>
</ref>
<ref id="B127">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vrande&#x0010D;i&#x00107;</surname> <given-names>D.</given-names></name> <name><surname>Kr&#x000F6;tzsch</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>Wikidata: a free collaborative knowledgebase</article-title>. <source>Commun. ACM</source>. <volume>57</volume>, <fpage>78</fpage>&#x02013;<lpage>85</lpage>. <pub-id pub-id-type="doi">10.1145/2629489</pub-id></citation>
</ref>
<ref id="B128">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Walker</surname> <given-names>M. A.</given-names></name> <name><surname>Litman</surname> <given-names>D. J.</given-names></name> <name><surname>Kamm</surname> <given-names>C. A.</given-names></name> <name><surname>Abella</surname> <given-names>A.</given-names></name></person-group> (<year>1997</year>). <article-title>&#x0201C;PARADISE: a framework for evaluating spoken dialogue agents,&#x0201D;</article-title> in <source>35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics</source> (<publisher-loc>Madrid</publisher-loc>).</citation>
</ref>
<ref id="B129">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Mao</surname> <given-names>Z.</given-names></name> <name><surname>Wang</surname> <given-names>B.</given-names></name> <name><surname>Guo</surname> <given-names>L.</given-names></name></person-group> (<year>2017</year>). <article-title>Knowledge graph embedding: a survey of approaches and applications</article-title>. <source>IEEE Trans. Knowl. Data Eng</source>. <volume>29</volume>, <fpage>2724</fpage>&#x02013;<lpage>2743</lpage>. <pub-id pub-id-type="doi">10.1109/TKDE.2017.2754499</pub-id></citation>
</ref>
<ref id="B130">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Weick</surname> <given-names>K. E.</given-names></name></person-group> (<year>1995</year>). <source>Sensemaking in Organizations</source>. <publisher-loc>Newbury Park</publisher-loc>: <publisher-name>Sage</publisher-name>.</citation>
</ref>
<ref id="B131">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Welbl</surname> <given-names>J.</given-names></name> <name><surname>Glaese</surname> <given-names>A.</given-names></name> <name><surname>Uesato</surname> <given-names>J.</given-names></name> <name><surname>Dathathri</surname> <given-names>S.</given-names></name> <name><surname>Mellor</surname> <given-names>J.</given-names></name> <name><surname>Hendricks</surname> <given-names>L. A.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>&#x0201C;Challenges in detoxifying language models,&#x0201D;</article-title> in <source>Findings of the Association for Computational Linguistics: EMNLP 2021</source> (<publisher-loc>Punta Cana</publisher-loc>).</citation>
</ref>
<ref id="B132">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Welivita</surname> <given-names>A.</given-names></name> <name><surname>Pu</surname> <given-names>P.</given-names></name></person-group> (<year>2022a</year>). <article-title>&#x0201C;Curating a large-scale motivational interviewing dataset using peer support forums,&#x0201D;</article-title> in <source>Proceedings of the 29th International Conference on Computational Linguistics</source>. <publisher-loc>Gyeongju, Republic of Korea</publisher-loc>: <publisher-name>International Committee on Computational Linguistics</publisher-name>, <fpage>3315</fpage>&#x02013;<lpage>3330</lpage>.</citation>
</ref>
<ref id="B133">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Welivita</surname> <given-names>A.</given-names></name> <name><surname>Pu</surname> <given-names>P.</given-names></name></person-group> (<year>2022b</year>). <article-title>HEAL: a knowledge graph for distress management conversations</article-title>. <source>Proc. AAAI Conf. Artificial Intell</source>. <volume>36</volume>, <fpage>11459</fpage>&#x02013;<lpage>11467</lpage>. <pub-id pub-id-type="doi">10.1609/aaai.v36i10.21398</pub-id></citation>
</ref>
<ref id="B134">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Westra</surname> <given-names>H. A.</given-names></name> <name><surname>Aviram</surname> <given-names>A.</given-names></name> <name><surname>Doell</surname> <given-names>F. K.</given-names></name></person-group> (<year>2011</year>). <article-title>Extending motivational interviewing to the treatment of major mental health problems: current directions and evidence</article-title>. <source>Canadian J. Psychiat</source>. <volume>56</volume>, <fpage>643</fpage>&#x02013;<lpage>650</lpage>. <pub-id pub-id-type="doi">10.1177/070674371105601102</pub-id></citation>
</ref>
<ref id="B135">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wolf</surname> <given-names>M. J.</given-names></name> <name><surname>Miller</surname> <given-names>K.</given-names></name> <name><surname>Grodzinsky</surname> <given-names>F. S.</given-names></name></person-group> (<year>2017</year>). <article-title>Why we should have seen that coming: comments on Microsoft&#x00027;s Tay &#x0201C;experiment,&#x0201D; and wider implications</article-title>. <source>ACM SIGCAS Comp. Soc</source>. <volume>47</volume>, <fpage>54</fpage>&#x02013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.1145/3144592.3144598</pub-id></citation>
</ref>
<ref id="B136">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Z.</given-names></name> <name><surname>Helaoui</surname> <given-names>R.</given-names></name> <name><surname>Kumar</surname> <given-names>V.</given-names></name> <name><surname>Reforgiato Recupero</surname> <given-names>D.</given-names></name> <name><surname>Riboni</surname> <given-names>D.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Towards detecting need for empathetic response in motivational interviewing,&#x0201D;</article-title> in <source>Companion Publication of the 2020 International Conference on Multimodal Interaction</source>, <fpage>497</fpage>&#x02013;<lpage>502</lpage>.</citation>
</ref>
<ref id="B137">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>J.</given-names></name> <name><surname>Ju</surname> <given-names>D.</given-names></name> <name><surname>Li</surname> <given-names>M.</given-names></name> <name><surname>Boureau</surname> <given-names>Y.-L.</given-names></name> <name><surname>Weston</surname> <given-names>J.</given-names></name> <name><surname>Dinan</surname> <given-names>E.</given-names></name></person-group> (<year>2020</year>). <article-title>Recipes for safety in open-domain chatbots</article-title>. <source>arXiv</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2010.07079</pub-id></citation>
</ref>
<ref id="B138">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yazdavar</surname> <given-names>A. H.</given-names></name> <name><surname>Al-Olimat</surname> <given-names>H. S.</given-names></name> <name><surname>Ebrahimi</surname> <given-names>M.</given-names></name> <name><surname>Bajaj</surname> <given-names>G.</given-names></name> <name><surname>Banerjee</surname> <given-names>T.</given-names></name> <name><surname>Thirunarayan</surname> <given-names>K.</given-names></name> <etal/></person-group>. (<year>2017</year>). <source>Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>. <pub-id pub-id-type="pmid">29707701</pub-id></citation></ref>
<ref id="B139">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Liu</surname> <given-names>Z.</given-names></name> <name><surname>Xiong</surname> <given-names>C.</given-names></name> <name><surname>Liu</surname> <given-names>Z.</given-names></name></person-group> (<year>2019</year>). <source>Conversation generation with concept flow</source>.</citation>
</ref>
<ref id="B140">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>T.</given-names></name> <name><surname>Schoene</surname> <given-names>A. M.</given-names></name> <name><surname>Ji</surname> <given-names>S.</given-names></name> <name><surname>Ananiadou</surname> <given-names>S.</given-names></name></person-group> (<year>2022</year>). <article-title>Natural language processing applied to mental illness detection: a narrative review</article-title>. <source>NPJ Digital Med</source>. <volume>5</volume>, <fpage>46</fpage>. <pub-id pub-id-type="doi">10.1038/s41746-022-00589-7</pub-id></citation>
</ref>
<ref id="B141">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zielasek</surname> <given-names>J.</given-names></name> <name><surname>Reinhardt</surname> <given-names>I.</given-names></name> <name><surname>Schmidt</surname> <given-names>L.</given-names></name> <name><surname>Gouzoulis-Mayfrank</surname> <given-names>E.</given-names></name></person-group> (<year>2022</year>). <article-title>Adapting and implementing apps for mental healthcare</article-title>. <source>Curr. Psychiatry Rep</source>. <volume>24</volume>, <fpage>407</fpage>&#x02013;<lpage>417</lpage>. <pub-id pub-id-type="doi">10.1007/s11920-022-01350-3</pub-id><pub-id pub-id-type="pmid">35835898</pub-id></citation></ref>
<ref id="B142">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zirikly</surname> <given-names>A.</given-names></name> <name><surname>Dredze</surname> <given-names>M.</given-names></name></person-group> (<year>2022</year>). <article-title>&#x0201C;Explaining models of mental health via clinically grounded auxiliary tasks,&#x0201D;</article-title> in <source>Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology</source>, <fpage>30</fpage>&#x02013;<lpage>39</lpage>.</citation>
</ref>
</ref-list>
</back>
</article> 