<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychiatry</journal-id>
<journal-title>Frontiers in Psychiatry</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychiatry</abbrev-journal-title>
<issn pub-type="epub">1664-0640</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyt.2023.1110527</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychiatry</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A machine learning approach to identifying suicide risk among text-based crisis counseling encounters</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Broadbent</surname> <given-names>Meghan</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2204054/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Medina Grespan</surname> <given-names>Mattia</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Axford</surname> <given-names>Katherine</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2068101/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname> <given-names>Xinyao</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Srikumar</surname> <given-names>Vivek</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Kious</surname> <given-names>Brent</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Imel</surname> <given-names>Zac</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Educational Psychology, The University of Utah</institution>, <addr-line>Salt Lake City, UT</addr-line>, <country>United States</country></aff>
<aff id="aff2"><sup>2</sup><institution>Kahlert School of Computing, The University of Utah</institution>, <addr-line>Salt Lake City, UT</addr-line>, <country>United States</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Psychiatry, The University of Utah</institution>, <addr-line>Salt Lake City, UT</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Brian Schwartz, University of Trier, Germany</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Mila Hall, University of Giessen, Germany; Tope Oyelade, University College London, United Kingdom; Batyrkhan Omarov, Al-Farabi Kazakh National University, Kazakhstan</p></fn>
<corresp id="c001">&#x002A;Correspondence: Katherine Axford, <email>kate.axford@utah.edu</email></corresp>
<fn fn-type="other" id="fn004"><p>This article was submitted to Psychological Therapy and Psychosomatics, a section of the journal Frontiers in Psychiatry</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>23</day>
<month>03</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>14</volume>
<elocation-id>1110527</elocation-id>
<history>
<date date-type="received">
<day>28</day>
<month>11</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>23</day>
<month>02</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2023 Broadbent, Medina Grespan, Axford, Zhang, Srikumar, Kious and Imel.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Broadbent, Medina Grespan, Axford, Zhang, Srikumar, Kious and Imel</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<sec>
<title>Introduction</title>
<p>With the increasing utilization of text-based suicide crisis counseling, new means of identifying at-risk clients must be explored. Natural language processing (NLP) holds promise for evaluating the content of crisis counseling; here we use a data-driven approach to evaluate NLP methods in identifying client suicide risk.</p>
</sec>
<sec>
<title>Methods</title>
<p>De-identified crisis counseling data from a regional text-based crisis encounter and mobile tipline application were used to evaluate two modeling approaches in classifying client suicide risk levels. A manual evaluation of model errors and system behavior was conducted.</p>
</sec>
<sec>
<title>Results</title>
<p>The neural model outperformed a term frequency-inverse document frequency (tf-idf) model in the false-negative rate. While 75% of the neural model&#x2019;s false negative encounters had some discussion of suicidality, 62.5% saw a resolution of the client&#x2019;s initial concerns. Similarly, the neural model detected signals of suicidality in 60.6% of false-positive encounters.</p>
</sec>
<sec>
<title>Discussion</title>
<p>The neural model demonstrated greater sensitivity in the detection of client suicide risk. A manual assessment of errors and model performance reflected these same findings, detecting higher levels of risk in many of the false-positive encounters and lower levels of risk in many of the false negatives. NLP-based models can detect the suicide risk of text-based crisis encounters from the encounter&#x2019;s content.</p>
</sec>
</abstract>
<kwd-group>
<kwd>machine learning</kwd>
<kwd>suicide</kwd>
<kwd>crisis text-line</kwd>
<kwd>text content</kwd>
<kwd>natural language processing</kwd>
</kwd-group>
<counts>
<fig-count count="5"/>
<table-count count="4"/>
<equation-count count="0"/>
<ref-count count="57"/>
<page-count count="10"/>
<word-count count="6902"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1" sec-type="intro">
<title>1. Introduction</title>
<p>Suicide and crisis hotlines can be an effective, inexpensive, and accessible resource for people in crisis in need of confidential support (<xref ref-type="bibr" rid="B1">1</xref>, <xref ref-type="bibr" rid="B2">2</xref>). These interventions can help deescalate and reduce feelings of distress and hopelessness (<xref ref-type="bibr" rid="B2">2</xref>, <xref ref-type="bibr" rid="B3">3</xref>) as well as decrease the likelihood of attempting suicide (<xref ref-type="bibr" rid="B4">4</xref>, <xref ref-type="bibr" rid="B5">5</xref>). In the last decade, text-messaging has become a dominant form of communication generally and in mental health care, especially among youth who may prefer text-messaging as a more immediate, private, and familiar modality (<xref ref-type="bibr" rid="B2">2</xref>, <xref ref-type="bibr" rid="B6">6</xref>&#x2013;<xref ref-type="bibr" rid="B12">12</xref>). As such, text-based crisis counseling has begun to supplement phone-based conversations with tens of thousands of crisis messages sent each day (<xref ref-type="bibr" rid="B13">13</xref>). While the reach of text messaging is impressive, this volume of care places a tremendous burden on crisis systems to train and support counselors who are in short supply and at risk of burnout (<xref ref-type="bibr" rid="B14">14</xref>&#x2013;<xref ref-type="bibr" rid="B19">19</xref>). Ensuring the consistent provision of high-quality crisis services is critical to their effectiveness (<xref ref-type="bibr" rid="B20">20</xref>, <xref ref-type="bibr" rid="B21">21</xref>), and one such way to support this is through accurate and consistent evaluation of the level of risk in a given crisis conversation.</p>
<p>Current guidelines on crisis counseling and suicide risk assessment suggest a minimum of three questions to evaluate risk of suicide (<xref ref-type="bibr" rid="B20">20</xref>, <xref ref-type="bibr" rid="B21">21</xref>). However, adherence to these guidelines in crisis services can vary dramatically, with some studies showing that crisis clients are often not asked about suicidal ideation (<xref ref-type="bibr" rid="B3">3</xref>, <xref ref-type="bibr" rid="B22">22</xref>), and one study in particular finding no assessment of suicide risk in over half of all telephone crisis calls (<xref ref-type="bibr" rid="B23">23</xref>). That said, counselors in crisis settings, especially text-based crisis settings, often navigate these complex conversations with multiple clients simultaneously. Although counselors, who often have relatively minimal training, are expected to report on risk after a conversation as a means of quality assurance, they face many conflicting demands (e.g., responding to the next texter promptly and empathically vs. generating thorough documentation). Given broad concerns about the accuracy of documentation in medical records (<xref ref-type="bibr" rid="B24">24</xref>), it is possible that they overlook risk in a conversation or simply forget to document risk in the midst of other competing tasks. It is imperative that we explore new means of supporting counselors in the identification of at-risk clients.</p>
<p>Natural language processing (NLP) is one tool that holds significant promise for developing scalable methods for evaluating the content of crisis counseling (<xref ref-type="bibr" rid="B25">25</xref>). A subfield of artificial intelligence and linguistics, NLP methods enable a computerized approach to the learning, interpretation, processing, and analysis of human language in written or spoken form (<xref ref-type="bibr" rid="B26">26</xref>&#x2013;<xref ref-type="bibr" rid="B28">28</xref>). Modern NLP methods rely on a family of machine learning models called neural networks that are trained to encode linguistic information from input text data for a specific task such as text classification, text summarization, and text translation (<xref ref-type="bibr" rid="B29">29</xref>&#x2013;<xref ref-type="bibr" rid="B31">31</xref>). There have been notable efforts in recent years to deploy NLP methods in psychotherapy and mental health research (<xref ref-type="bibr" rid="B32">32</xref>, <xref ref-type="bibr" rid="B33">33</xref>), with numerous studies showing potential success in identifying and predicting instances of suicidality across a variety of text-based sources including clinical records, discharge notes, patient-therapist dialogues, and social media posts (<xref ref-type="bibr" rid="B34">34</xref>&#x2013;<xref ref-type="bibr" rid="B40">40</xref>). While this evidence suggests risk in text-based mental health counseling can be estimated using NLP-methods, research evaluating risk from clinical dialogues has focused on general asynchronous counseling environments where the risk of suicide is lower than in crisis counseling (<xref ref-type="bibr" rid="B33">33</xref>). Most recently, one study used domain knowledge to encode the content of the conversations for risk assessment (<xref ref-type="bibr" rid="B41">41</xref>). 
This work is promising but relies on the necessarily incomplete theoretical frameworks of experts, rather than a data-driven approach to learning associations between text and suicide risk, likely reducing generalizability and performance. There is no published application of modern transformer-based language models to risk identification in crisis counseling.</p>
<p>In this study, we build upon recent advancements in NLP using a data-driven approach to train and test NLP methods on naturalistic crisis counseling data in identifying the presence of suicide risk. Specifically, we present a modern transformer-based neural architecture powered by state-of-the-art Robustly Optimized BERT Pre-training Approach (RoBERTa) embeddings trained on a large set of labeled crisis conversations from a regional crisis counseling app (<xref ref-type="bibr" rid="B42">42</xref>). Neural network models can learn spurious correlations from artifacts of the collected data (<xref ref-type="bibr" rid="B43">43</xref>, <xref ref-type="bibr" rid="B44">44</xref>). Accordingly, we also conducted a thorough analysis of model errors and system behavior, including a manual evaluation of encounters associated with model errors and an examination of cumulative risk throughout a crisis counseling dialogue.</p>
</sec>
<sec id="S2" sec-type="materials|methods">
<title>2. Materials and methods</title>
<sec id="S2.SS1">
<title>2.1. Data</title>
<p>This retrospective study utilized de-identified data from 5,992 crisis counseling encounters (totaling 273,804 messages) collected from SafeUT, a regional text-based crisis encounter and mobile tip line app (see <xref ref-type="table" rid="T1">Table 1</xref> for a data summary). The SafeUT counselors are licensed or license-eligible clinical social workers with a background in crisis counseling. SafeUT counselors receive additional training in suicide risk assessment and safety planning. The study sample included crisis encounters from clients of any age located in Utah, Idaho, and Nevada who utilized the service between June 2020 and April 2021. Mobile tips, a system for notifying schools and educators about potentially at-risk student peers, were excluded from the study sample. Institutional Review Board (IRB) approval was obtained for this study. SafeUT does not systematically collect potentially identifiable information, and text messages were scrubbed of incidental identifying information prior to analysis.</p>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Data summary of crisis counseling encounters.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" style="color:#ffffff;background-color: #7f8080;">Encounter measure</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Minimum</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Median</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Mean (SD)</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Maximum</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Duration (minutes)</td>
<td valign="top" align="center">0.014</td>
<td valign="top" align="center">0.950</td>
<td valign="top" align="center">1.106 (0.776)</td>
<td valign="top" align="center">9.605</td>
</tr>
<tr>
<td valign="top" align="left">Number of counselors</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1.404 (0.574)</td>
<td valign="top" align="center">5</td>
</tr>
<tr>
<td valign="top" align="left">Counselor messages</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">17</td>
<td valign="top" align="center">20.973 (14.568)</td>
<td valign="top" align="center">159</td>
</tr>
<tr>
<td valign="top" align="left">Client messages</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">20</td>
<td valign="top" align="center">24.840 (19.427)</td>
<td valign="top" align="center">287</td>
</tr>
</tbody>
</table></table-wrap>
</sec>
<sec id="S2.SS2">
<title>2.2. Measure</title>
<p>Dispositions of each crisis counseling encounter, labeled by the SafeUT counselors, were used to measure the level of client risk. Counselor-generated dispositions cover the topics discussed, services provided, type and level of action needed, client perceptions of the crisis counseling interaction, and the degree of client suicide risk perceived by the counselor. For the latter, counselors are asked to follow the Suicide Risk Assessment Standards of the National Suicide Prevention Lifeline (<xref ref-type="bibr" rid="B45">45</xref>). This guideline recommends that crisis workers ask a minimum number of suicide status prompt questions (see <xref ref-type="supplementary-material" rid="DS1">Supplementary Appendix I</xref> for more details). Importantly, crisis workers are instructed to mark the degree of risk in a way that reflects the whole encounter in aggregate.</p>
<p>Counselors assign suicide risk labels (i.e., dispositions) to each crisis counseling encounter, choosing among four options: low-risk, moderate-risk, high-risk, and emergency referral (mobile crisis outreach team response, active rescue by law enforcement or paramedics, school contact). Counselors evaluate suicide risk level, based on clients&#x2019; self-report, using their clinical judgment with respect to the intensity of reported suicidal ideation and other clinical risk factors (such as access to lethal means) that are endorsed by the client. For the purposes of this study, we collapsed these ratings into a binary label classifying risk as either &#x201C;lower risk&#x201D; or &#x201C;higher risk&#x201D; (where &#x201C;higher risk&#x201D; included all categories except low-risk) to form more even groups based on sample size and to allow for a logistic regression analysis. Overall, 85.3% of all crisis counseling encounters were categorized as &#x201C;lower risk,&#x201D; followed by 9.25% as moderate-risk encounters, 3.47% as high-risk encounters, and 1.95% as emergency-referral encounters (resulting in 14.67% of all encounters categorized as having &#x201C;higher risk&#x201D;; <xref ref-type="table" rid="T2">Table 2</xref>).</p>
<table-wrap position="float" id="T2">
<label>TABLE 2</label>
<caption><p>Categories of risk.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" colspan="2" style="color:#ffffff;background-color: #7f8080;">Category</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;"><italic>n</italic></td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">%</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="2">Lower risk</td>
<td valign="top" align="center">5,113</td>
<td valign="top" align="center">85.3</td>
</tr>
<tr>
<td valign="top" align="left" rowspan="3">Higher risk<xref ref-type="table-fn" rid="t2fns1">&#x002A;</xref></td>
<td valign="top" align="center">Moderate risk</td>
<td valign="top" align="center">554</td>
<td valign="top" align="center">9.25</td>
</tr>
<tr>
<td valign="top" align="center">High risk</td>
<td valign="top" align="center">208</td>
<td valign="top" align="center">3.47</td>
</tr>
<tr>
<td valign="top" align="center">Emergency referral</td>
<td valign="top" align="center">117</td>
<td valign="top" align="center">1.95</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="t2fns1"><p>&#x002A;<italic>n</italic> = 879 (14.67% of all crisis counseling encounters).</p></fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="S2.SS3">
<title>2.3. Model training and analysis</title>
<p>Two modeling approaches were evaluated for the classification of risk level in crisis encounters: a neural network model and, as a baseline comparison, a term frequency-inverse document frequency (tf-idf) weighted logistic regression model. Existing counselor-generated risk dispositions were used to train both models in classifying the level of risk for each crisis counseling encounter. Both models were evaluated using the receiver operating characteristic area under the curve (ROC AUC). The higher the AUC score the better the model classification, with an AUC of 1 suggesting a perfect (but likely over-fitted) classifier, an AUC of 0.5 suggesting a random-chance classifier, and an AUC of 0.8 or higher suggesting a good classifier (<xref ref-type="bibr" rid="B46">46</xref>). Other evaluation measures included sensitivity (the proportion of higher risk encounters correctly classified by the model), specificity (the proportion of lower risk encounters correctly classified by the model), precision (the proportion of the model&#x2019;s higher risk predictions that were correct), false-negative rate (1-sensitivity), and false-positive rate (1-specificity).</p>
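<p>As an illustration only, the evaluation measures defined above can be computed from the four cells of a confusion matrix. The following minimal Python sketch is not the code used in this study; the class name, function names, and toy labels are hypothetical, and &#x201C;higher risk&#x201D; is assumed to be the positive class (coded 1).</p>
<preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Outcome counts for a binary classifier (1 = higher risk, 0 = lower risk)."""
    tp: int  # higher-risk encounters predicted as higher risk
    fp: int  # lower-risk encounters predicted as higher risk
    tn: int  # lower-risk encounters predicted as lower risk
    fn: int  # higher-risk encounters predicted as lower risk


def count_outcomes(y_true, y_pred):
    """Tally outcomes from two parallel sequences of 0/1 labels."""
    counts = ConfusionCounts(tp=0, fp=0, tn=0, fn=0)
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts.tp += 1
        elif truth == 0 and pred == 1:
            counts.fp += 1
        elif truth == 0 and pred == 0:
            counts.tn += 1
        else:  # truth == 1 and pred == 0
            counts.fn += 1
    return counts


def evaluation_measures(c):
    """Return the measures named in the text.

    Assumes at least one true positive-class case, one true negative-class
    case, and one positive prediction, so that no denominator is zero.
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "false_negative_rate": 1 - sensitivity,
        "false_positive_rate": 1 - specificity,
    }


# Toy labels invented for this illustration (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
measures = evaluation_measures(count_outcomes(y_true, y_pred))
```
]]></preformat>
<p>Because the false-negative rate is defined as the complement of sensitivity, any gain in sensitivity translates directly into an equal reduction in missed higher risk encounters.</p>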
<p>Both the neural network model and tf-idf model received a crisis counseling encounter as input and output a probability distribution over &#x201C;lower risk&#x201D; vs. &#x201C;higher risk.&#x201D; The models were trained on 80% of the data, and the remaining 20% was divided evenly into two hold-out sets, each comprising 10% of the full dataset, for development and testing, respectively. This partition preserved the same distribution of the labels across the three datasets.</p>
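<p>A label-preserving 80/10/10 partition of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function name, the rounding rule, and the random seed are hypothetical, and only the two class counts are taken from Table 2.</p>
<preformat><![CDATA[
```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Return (train, dev, test) lists of indices.

    Indices are shuffled and divided within each label group, so every
    subset keeps approximately the same label proportions as the whole.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train, dev, test = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_dev = round(fractions[1] * len(indices))
        train.extend(indices[:n_train])
        dev.extend(indices[n_train:n_train + n_dev])
        test.extend(indices[n_train + n_dev:])  # whatever remains
    return train, dev, test


# Class counts from Table 2: 5,113 lower risk (0) and 879 higher risk (1).
labels = [0] * 5113 + [1] * 879
train, dev, test = stratified_split(labels)
```
]]></preformat>
<p>Stratifying in this way matters when one class is rare, as here, because a purely random split could leave the development or test set with too few higher risk encounters to estimate sensitivity reliably.</p>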
</sec>
<sec id="S2.SS4">
<title>2.4. Neural network</title>
<p>The neural network model utilized a machine learning transformer architecture. Transformers are a family of neural networks designed to process sequential data using self-attention, a mechanism allowing the network to extract and use information from arbitrarily large input contexts efficiently (<xref ref-type="bibr" rid="B47">47</xref>). The initial component in the transformer architecture (i.e., encoder) is particularly useful in natural language processing as it takes a string of text (sequence of words) and returns a sequence of numerical representations of the input corresponding to each word in the input text (<xref ref-type="bibr" rid="B47">47</xref>). These numerical representations (i.e., word embeddings) contain the semantic and grammatical meaning learned from context through the transformer&#x2019;s self-attention process (<xref ref-type="bibr" rid="B47">47</xref>). Current state-of-the-art approaches in many NLP-tasks are based on RoBERTa embeddings, which are contextualized word embeddings obtained from stacking multiple transformer encoder blocks or layers pre-trained on large corpora of text (<xref ref-type="bibr" rid="B42">42</xref>, <xref ref-type="bibr" rid="B48">48</xref>).</p>
<p>In this study, RoBERTa word embeddings were aggregated into a single sentence embedding for each message in each crisis counseling encounter.<sup><xref ref-type="fn" rid="footnote1">1</xref></sup> Importantly, to adapt the original general-purpose RoBERTa embeddings to the domain of crisis counseling (<xref ref-type="bibr" rid="B49">49</xref>), we continued pretraining the model using 120,000 encounters from the SafeUT app (almost 2.5 million messages). The originator of each message (client or counselor) was prepended to each message in an encounter, whereby the concatenation of each pair of consecutive messages in an encounter was provided as inputs to the language model (e.g., back-to-back messages were provided as a single message input). To obtain the embedding representation of an encounter, we averaged the output of 6 transformer encoder layers. Lastly, the encounter embedding was passed through a linear neural layer (a neural network with just one layer of nodes) for binary classification [<xref ref-type="fig" rid="F1">Figure 1</xref>; (<xref ref-type="bibr" rid="B47">47</xref>)]. To measure the stability of the model, the training process was repeated and averaged across five different random seeds. We used the model with the best AUC performance in the development set (out of the five seeds) for error and system behavior analysis in later sections.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Architecture of the neural network model, illustrated with a five-message encounter as input. B, beginning token; C, client; M, message; MLP, multilayer perceptron; T, therapist. Special tokens (#C) and (#T) are prepended to each client and counselor message, respectively, to inform the model about the originator of each message in the conversation. We preprocessed each encounter by obtaining the aggregated RoBERTa sentence embeddings of every pair of consecutive message-spans in the encounter. Each obtained sentence embedding sequence is passed through a transformer-encoder block, whose output is aggregated to obtain an encounter embedding representation. The encounter vector is then fed to a final neural linear layer to obtain the binary confidence estimates for classification.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyt-14-1110527-g001.tif"/>
</fig>
</sec>
<sec id="S2.SS5">
<title>2.5. Tf-idf logistic regression</title>
<p>In this comparison approach, we vectorized the data using a term frequency-inverse document frequency (tf-idf) statistic, combining unigrams (individual words) and bigrams (pairs of consecutive words) from the messages in each encounter. A binary logistic regression classifier was trained on the vectorized data until convergence was achieved.</p>
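<p>The vectorization step can be illustrated with a short Python sketch. It is an assumption-laden example rather than the study&#x2019;s implementation: the whitespace tokenizer, the smoothed inverse document frequency formula, the length normalization, the function names, and the toy messages are all choices made here for illustration, and the logistic regression step is omitted.</p>
<preformat><![CDATA[
```python
import math
from collections import Counter


def ngrams(text):
    """Lowercased unigrams plus bigrams from simple whitespace tokenization."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def tfidf_vectors(documents):
    """Map each document to a {term: weight} dictionary of tf-idf weights.

    Assumed variant: raw term counts multiplied by a smoothed idf,
    ln((1 + N) / (1 + df)) + 1, then scaled to unit Euclidean length.
    """
    term_counts = [Counter(ngrams(doc)) for doc in documents]
    # Iterating a Counter yields each distinct term once per document,
    # so this tallies document frequency rather than total occurrences.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    n_docs = len(documents)

    vectors = []
    for counts in term_counts:
        weights = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


# Toy messages invented for this illustration (not study data).
documents = [
    "i cannot sleep tonight",
    "i cannot focus today",
    "thanks for the help",
]
vectors = tfidf_vectors(documents)
```
]]></preformat>
<p>In this toy example, a term that appears in only one document (such as &#x201C;sleep&#x201D;) receives more weight than a term shared across documents (such as &#x201C;cannot&#x201D;), which is the property that lets a linear classifier focus on distinctive vocabulary.</p>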
</sec>
<sec id="S2.SS6">
<title>2.6. Error assessment</title>
<p>A manual evaluation of model errors was conducted to assess features of crisis counseling encounters falsely categorized as &#x201C;higher risk&#x201D; or &#x201C;lower risk&#x201D; by the neural model to better understand the behavior of the model and its prediction of risk. A team of four human reviewers assessed each encounter erroneously categorized by the neural model for indicators of suicidality, non-suicidal self-injury, abuse, emergency service triage, mobile tips, social service involvement, discussion of therapy services, and client drop-off (i.e., when the client stops responding to the encounter). The degree of resolution of the client&#x2019;s complaint was also evaluated, with resolution indicating a near total reduction of a client&#x2019;s risk or distress and/or de-escalation of client crisis at encounter end. Partial resolution similarly indicates some reduction of client distress with moderate to low client risk at end of the encounter. No resolution indicates minimal to no reduction of client risk or distress with the client remaining at risk at end of the encounter. Lastly, the team of reviewers made a final determination of whether the encounter should be labeled higher risk or lower risk based on the context and indicators of the encounter.</p>
<p>Moreover, the dynamics of the risk probability continuum within a crisis conversation were evaluated to better understand the cumulative signal of risk captured by the neural model as counselor-client dialogue develops. Three crisis counseling encounters were selected at random to illustrate the neural model&#x2019;s performance in predicting actual higher risk (true positives), falsely predicting higher risk (false positives), and falsely predicting lower risk (false negatives). True negative encounters were excluded from this illustration as these were mostly flat lines indicating minimal to no signal of risk throughout a counselor-client dialogue. This secondary evaluation of model errors allowed for a deeper inspection of model behavior and content that drove the model predictions.</p>
</sec>
</sec>
<sec id="S3" sec-type="results">
<title>3. Results</title>
<p>The neural model achieved an average AUC of 90.37% (<xref ref-type="table" rid="T3">Table 3</xref>). Precision, specificity, sensitivity, and other model metrics are reported in <xref ref-type="table" rid="T3">Table 3</xref>. In comparison with the neural model, the tf-idf model performed slightly worse, with an AUC of 88.17% (<xref ref-type="fig" rid="F2">Figure 2</xref> and <xref ref-type="table" rid="T3">Table 3</xref>).</p>
<table-wrap position="float" id="T3">
<label>TABLE 3</label>
<caption><p>Model performance.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" style="color:#ffffff;background-color: #7f8080;"></td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Neural model (SD)</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">Tf-idf model</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ROC AUC</td>
<td valign="top" align="center">90.37 (0.18)</td>
<td valign="top" align="center">88.18</td>
</tr>
<tr>
<td valign="top" align="left">Specificity</td>
<td valign="top" align="center">92.89 (0.46)</td>
<td valign="top" align="center">97.84</td>
</tr>
<tr>
<td valign="top" align="left">Sensitivity</td>
<td valign="top" align="center">62.02 (0.54)</td>
<td valign="top" align="center">39.52</td>
</tr>
<tr>
<td valign="top" align="left">Precision</td>
<td valign="top" align="center">59.52 (1.80)</td>
<td valign="top" align="center">75.56</td>
</tr>
<tr>
<td valign="top" align="left">FNR</td>
<td valign="top" align="center">37.98 (0.56)</td>
<td valign="top" align="center">60.47</td>
</tr>
<tr>
<td valign="top" align="left">FPR</td>
<td valign="top" align="center">7.11 (0.46)</td>
<td valign="top" align="center">2.16</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn><p>FNR, false negative rate; FPR, false positive rate; SD, standard deviation. Average test performance across training with five different random seeds is shown in the neural model column.</p></fn>
</table-wrap-foot>
</table-wrap>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Comparison of model ROC AUCs. AUC, area under the curve; ROC, receiver operating characteristic. ROC curves of the tf-idf weighted logistic regression model (blue) and the neural model with the best performance in the development set out of the five runs (orange).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyt-14-1110527-g002.tif"/>
</fig>
<p>Overall, the ROC curves and AUC scores of the two models are similar, yet there are important differences. The false-negative rate of the neural model was relatively high at 37.98%, but it represents a 22.49-percentage-point improvement in classifying risk compared to the tf-idf model&#x2019;s 60.47% false-negative rate (<xref ref-type="table" rid="T3">Table 3</xref>). Correspondingly, the neural model had roughly 1.6 times the sensitivity of the tf-idf model (62.02 vs. 39.52%, respectively; <xref ref-type="table" rid="T3">Table 3</xref>). The false-positive rate was low for both models, though higher for the neural model (7.11 vs. 2.16%), reflecting the increased sensitivity of the neural model compared to the tf-idf model. Lastly, the tf-idf model demonstrated a higher positive predictive value (precision) and slightly higher specificity compared to the neural model (<xref ref-type="table" rid="T3">Table 3</xref>), highlighting the tf-idf model&#x2019;s tendency to classify crisis counseling encounters as lower risk compared to the neural model.</p>
<sec id="S3.SS1">
<title>3.1. Manual assessment of errors</title>
<p>Based on counselor dispositions, the neural model incorrectly categorized a total of 32 encounters as lower risk (false negatives, i.e., the model labeled the encounter as lower risk when the human counselor had rated it as having significant risk) and 33 encounters as higher risk (false positives, i.e., the model predicted significant risk in the encounter when the human counselor had rated it as lower risk) (<xref ref-type="table" rid="T4">Table 4</xref>; see <xref ref-type="supplementary-material" rid="DS1">Supplementary Appendix II</xref> for Tables 5, 6 detailing <italic>ad hoc</italic> human evaluation).</p>
<table-wrap position="float" id="T4">
<label>TABLE 4</label>
<caption><p>Summary of manual error assessment.</p></caption>
<table cellspacing="5" cellpadding="5" frame="box" rules="all">
<thead>
<tr>
<td valign="top" align="left" colspan="2" style="color:#ffffff;background-color: #7f8080;">Measure<xref ref-type="table-fn" rid="t4fns1">&#x002A;</xref></td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">False negatives<break/> <italic>N</italic> = 32<break/> encounters<break/> <italic>n</italic> (%)</td>
<td valign="top" align="center" style="color:#ffffff;background-color: #7f8080;">False positives<break/> <italic>N</italic> = 33<break/> encounters<break/> <italic>n</italic> (%)</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="2">Higher risk<xref ref-type="table-fn" rid="t4fna"><sup>a</sup></xref></td>
<td valign="top" align="center">17 (53.1)</td>
<td valign="top" align="center">13 (39.4)</td>
</tr>
<tr>
<td valign="top" align="left" colspan="2">Suicidality</td>
<td valign="top" align="center">24 (75.0)</td>
<td valign="top" align="center">20 (60.6)</td>
</tr>
<tr>
<td valign="top" align="left" colspan="2">NSSI<xref ref-type="table-fn" rid="t4fnb"><sup>b</sup></xref></td>
<td valign="top" align="center">10 (31.3)</td>
<td valign="top" align="center">9 (27.3)</td>
</tr>
<tr>
<td valign="top" align="left" colspan="2">Abuse</td>
<td valign="top" align="center">6 (18.8)</td>
<td valign="top" align="center">11 (33.3)</td>
</tr>
<tr>
<td valign="top" align="left" colspan="2">Emergency triage<xref ref-type="table-fn" rid="t4fnc"><sup>c</sup></xref></td>
<td valign="top" align="center">6 (18.8)</td>
<td valign="top" align="center">12 (36.4)</td>
</tr>
<tr>
<td valign="top" align="left" colspan="2">Mobile tips</td>
<td valign="top" align="center">10 (31.3)</td>
<td valign="top" align="center">4 (12.1)</td>
</tr>
<tr>
<td valign="top" align="left" colspan="2">Social services</td>
<td valign="top" align="center">6 (18.8)</td>
<td valign="top" align="center">3 (9.1)</td>
</tr>
<tr>
<td valign="top" align="left" colspan="2">External therapy services</td>
<td valign="top" align="center">15 (46.9)</td>
<td valign="top" align="center">16 (48.5)</td>
</tr>
<tr>
<td valign="top" align="left" colspan="2">Client drop off<xref ref-type="table-fn" rid="t4fnd"><sup>d</sup></xref></td>
<td valign="top" align="center">11 (34.4)</td>
<td valign="top" align="center">14 (42.4)</td>
</tr>
<tr>
<td valign="top" align="left">Complaint<break/> resolution<xref ref-type="table-fn" rid="t4fne"><sup>e</sup></xref><break/></td>
<td valign="top" align="left">Yes<break/> Partial<break/> No</td>
<td valign="top" align="center">20 (62.5)<break/> 4 (12.5)<break/> 8 (25.0)</td>
<td valign="top" align="center">14 (42.4)<break/> 11 (33.3)<break/> 8 (24.2)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="t4fna"><p><sup>a</sup>Higher risk determination based on team assessment of encounters.</p></fn>
<fn id="t4fnb"><p><sup>b</sup>Non-suicidal self-injury (NSSI) includes self-harm without intent to die, such as cutting.</p></fn>
<fn id="t4fnc"><p><sup>c</sup>Emergency triage includes triaging to emergency responders, hospital emergency rooms, and mobile crisis outreach teams.</p></fn>
<fn id="t4fnd"><p><sup>d</sup>Client drop-off indicates the client stopped responding to the counselor.</p></fn>
<fn id="t4fne"><p><sup>e</sup>Complaint resolution indicates a reduction in client risk or distress and/or de-escalation of a client crisis.</p></fn>
<fn id="t4fns1"><p>&#x002A;See <xref ref-type="supplementary-material" rid="DS1">Supplementary Appendix II</xref> for more details on encounter summaries and indicators.</p></fn>
</table-wrap-foot>
</table-wrap>
<p>A manual assessment of each falsely categorized encounter revealed that of the 32 encounters where the model rated risk lower (when a human counselor had rated it higher risk), 46.9% were considered lower risk by the team assessment (<xref ref-type="table" rid="T4">Table 4</xref>). Moreover, the majority of these false-negative encounters had no discussion of other concerning topics such as non-suicidal self-injury, abuse, or involvement of first responders or crisis services (68.7, 81.2, and 81.2%, respectively). While 75% of false-negative encounters had some discussion of suicidality, 62.5% also saw a resolution of the client&#x2019;s initial concerns and reasons for using the service, suggesting some potential appropriateness in the neural model&#x2019;s assessment. On the other hand, of the 33 encounters categorized as higher risk by the model but lower risk by the counselor (i.e., false positives), 39.4% were determined to be higher risk by the team of reviewers, and 60.6% included discussions of suicidality. A substantial minority of these encounters also included concerns with non-suicidal self-injury, abuse, or involvement of first responders or crisis services (27.3, 33.3, and 36.4%, respectively). Furthermore, the majority of these encounters saw only partial or no resolution of the client&#x2019;s concerns using the service (<xref ref-type="table" rid="T4">Table 4</xref>).</p>
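<p>The percentages discussed here follow directly from the counts in Table 4. The sketch below recomputes a few of them; the helper name and the choice of half-up rounding to one decimal place are assumptions made for illustration, chosen because they reproduce the tabulated values.</p>

<preformat>
```python
from decimal import Decimal, ROUND_HALF_UP


def percent(n: int, total: int) -> float:
    """Return n/total as a percentage, rounded half-up to one decimal place."""
    value = Decimal(n) * 100 / Decimal(total)
    return float(value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


# Counts copied from Table 4: (false negatives out of 32, false positives out of 33).
N_FN, N_FP = 32, 33
table4_counts = {
    "Higher risk": (17, 13),
    "Suicidality": (24, 20),
    "NSSI": (10, 9),
    "Abuse": (6, 11),
    "Emergency triage": (6, 12),
}

for measure, (fn_count, fp_count) in table4_counts.items():
    print(measure, percent(fn_count, N_FN), percent(fp_count, N_FP))

# Share of false positives that the review team did not rate as higher risk.
fp_lower_risk = percent(N_FP - 13, N_FP)
print("False positives rated lower risk by the team:", fp_lower_risk)
```
</preformat>

<p>With these counts, the team rated 13 of the 33 false positives as higher risk, so the remaining 20 were rated lower risk.</p>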
<p>One potential interpretation of these findings is that the models learned appropriate indicators of risk, making them robust to the inherent inconsistency and noise of human counselor labeling. As noted in the introduction, crisis counselors are responding to multiple high-stress situations and their ratings may not be without error. Furthermore, counselors have access to historical information from clients, such as prior utilization of the SafeUT app, that may also affect the risk assessment.</p>
</sec>
<sec id="S3.SS2">
<title>3.2. Dialogue risk curves</title>
<p>To better understand the neural model&#x2019;s performance, the cumulative probability of higher risk at each message, and each message&#x2019;s contribution to the encounter, was evaluated. The probability of higher risk throughout an encounter was visualized to show how the risk assessed by the neural model changes over the course of an encounter and where the neural model picks up on signals of risk. Three crisis counseling encounters were selected at random to illustrate the neural model&#x2019;s performance in predicting actual higher risk (true positives; <xref ref-type="fig" rid="F3">Figure 3</xref>), falsely predicting higher risk (false positives; <xref ref-type="fig" rid="F4">Figure 4</xref>), and falsely predicting lower risk (false negatives; <xref ref-type="fig" rid="F5">Figure 5</xref>).</p>
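<p>One plausible way to trace such a curve is to score every growing prefix of an encounter and record the predicted probability after each message. The sketch below assumes this prefix-scoring reading; it is not the study&#x2019;s implementation. The function names, the placeholder scorer, and the RISK_CUE token are all hypothetical stand-ins for a trained classifier and real message content.</p>

<preformat>
```python
from typing import Callable, List, Sequence


def dialogue_risk_curve(
    messages: Sequence[str],
    score_prefix: Callable[[Sequence[str]], float],
) -> List[float]:
    """Score each growing prefix: messages[:1], messages[:2], and so on."""
    return [score_prefix(messages[: i + 1]) for i in range(len(messages))]


def placeholder_scorer(prefix: Sequence[str]) -> float:
    """Hypothetical stand-in for a trained classifier (not the neural model).

    Counts messages containing a placeholder cue token and maps the count
    into the interval [0, 1).
    """
    cues = sum("RISK_CUE" in message for message in prefix)
    return cues / (cues + 1)


# Fictional encounter: S = counselor, C = client.
encounter = [
    "C: hello",
    "S: hi, how can I help?",
    "C: RISK_CUE",
    "S: thank you for telling me",
    "C: RISK_CUE",
]

curve = dialogue_risk_curve(encounter, placeholder_scorer)
for turn, probability in enumerate(curve, start=1):
    print(turn, round(probability, 3))
```
</preformat>

<p>Plotting the resulting values against the message index gives one point per message, which is the general shape of the curves shown in the figures that follow.</p>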
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Dialogue risk curves: Model rating = higher risk, Human rating = higher risk (True positive). S, SafeUT counselor; C, SafeUT client; MCOT, Mobile Crisis Outreach Team. Crisis counseling encounter has been fictionalized to maintain confidentiality.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyt-14-1110527-g003.tif"/>
</fig>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Dialogue risk curves: Model rating = higher risk, Human rating = lower risk (False positive). S, SafeUT counselor; C, SafeUT client. Crisis counseling encounter has been fictionalized to maintain confidentiality.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyt-14-1110527-g004.tif"/>
</fig>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Dialogue risk curves: Model rating = lower risk, Human rating = higher risk (False negative). S, SafeUT counselor; C, SafeUT client; DCFS, Division of Child and Family Services. Crisis counseling encounter has been fictionalized to maintain confidentiality.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyt-14-1110527-g005.tif"/>
</fig>
<p>Overall, the neural model appears to appropriately detect and predict both logical signals of risk and fluctuations in the degree of risk throughout a crisis counseling encounter. In the example where the neural model accurately predicts higher risk, a sudden increase in predicted higher risk is seen when a client states &#x201C;I want to kill myself&#x201D; (<xref ref-type="fig" rid="F3">Figure 3</xref>). Moreover, the neural model seems to capture the higher risk associated with a distressed client who suddenly stops responding to the counselor. Interestingly, the neural model demonstrates an ability to capture the fluctuations in risk even in cases where it inaccurately predicts the level of risk (<xref ref-type="fig" rid="F4">Figures 4</xref>, <xref ref-type="fig" rid="F5">5</xref>). While the counselor disposition in <xref ref-type="fig" rid="F4">Figure 4</xref> did not indicate higher risk, the neural model picks up on signals of higher risk when the client confirms suicidality with intent and then stops responding to the counselor; this might suggest that the neural model is able to aid a counselor in detecting higher risk. Similar to the examples in <xref ref-type="fig" rid="F3">Figures 3</xref>, <xref ref-type="fig" rid="F4">4</xref>, the neural model demonstrates a sensible prediction of risk for the false-negative encounter in <xref ref-type="fig" rid="F5">Figure 5</xref> (where the counselor indicated higher risk and the neural model assigned lower risk). The neural model reflects a higher risk throughout this encounter when suicidality and active distress are present (<xref ref-type="fig" rid="F5">Figure 5</xref>). However, the neural model also seems to detect the reduced risk associated with casual conversation and the overall easing of the client&#x2019;s distress at the end of the encounter.</p>
</sec>
</sec>
<sec id="S4" sec-type="discussion">
<title>4. Discussion</title>
<p>The current study illustrates the development and predictive utility of an NLP algorithm to detect and classify the level of client suicide risk in a text-based crisis counseling environment. We examined a large sample of anonymous clients engaged in crisis counseling through the SafeUT platform, analyzing the content of crisis counseling text messages with SafeUT counselors.</p>
<p>Overall, the neural model yielded an excellent ROC AUC score of 0.9037. While the false-positive and false-negative rates were higher than ideal (7.11 and 37.98%, respectively), a manual assessment of errors and evaluation of model performance throughout an encounter revealed the neural model was able to detect legitimate signals of higher risk in many of the false-positive encounters as well as lower levels of risk in many of the false negatives. This suggests that the neural model detected signals of risk even when learning from imperfect data. These findings add further support to the use of NLP methods as a potentially effective tool for aiding counselors in evaluating client risk in a text-based crisis counseling environment. This has important practice implications, not only for improving crisis counseling services but also for informing training and best practices for mental health counselors providing those services. In particular, eventual applications of systems like the one evaluated in this article could be used in parallel with, not as a replacement for, provider assessment of risk. System detection of valid indicators of risk not identified by providers could provide an important stopgap in documenting key clinical processes when busy providers might be distracted by the next important clinical need.</p>
<p>Clear distinctions exist between our work and more recent efforts in automatic suicide risk detection in mental health counseling. Other studies have relied primarily on telemedicine psychotherapy dyads data including non-crisis interventions, making the base rate of suicide-related content dramatically different from the one found in an exclusive crisis counseling service. The work that pioneered this line of research reports models trained only with client messages from asynchronous encounters, hindering their system from learning crucial aspects of the task such as real-time risk dynamics of counseling, and the effect of counselor messages within the conversation (<xref ref-type="bibr" rid="B33">33</xref>). A more recent paper relies on data obtained from a similar text-based counseling platform, encoding entire encounters including counselor and client messages, as we do in our study (<xref ref-type="bibr" rid="B41">41</xref>). The authors introduce an interesting model with a knowledge-aware encoding layer obtained from a knowledge graph constructed by mental health experts, demonstrating its efficacy through ablation studies. A downside of this kind of handcrafted database approach is the quality analysis, namely how to evaluate the completeness and correctness of the graph (<xref ref-type="bibr" rid="B50">50</xref>). In contrast, we propose a purely data-driven system where any existing relationships between concept words are extracted directly from the semantic and syntactic information present in the data. We argue that our approach is more robust and flexible when translated to other clinical fields, as it would only need the new dataset (the more data the better) without the need to rebuild a knowledge database that could depend not only on the specific domain expertise, but also on important aspects like cultural background of the study, data coming from a different language, or domain knowledge availability.</p>
<p>A contribution we consider unique in this article is that we provide detailed descriptive analyses of model results to evaluate the disagreements between the model and counselors who originally labeled the interactions. We had four clinical scientists read and relabel each encounter that had a predicted label different from the original counselor-generated label. From these experiments, we observed that for a significant number of encounters, the experts agreed with the neural model. This may indicate that our model is generalizing beyond the noise usually present in the labeling of clinical assessment datasets like this one. We extended our error analysis by creating dialog risk plots for such encounters, observing that our model captures risk in sync with the dynamics of the conversation. Although more analysis is needed, these results suggest that it is possible to obtain message-level supervision from encounter-level risk disposition labeling. We consider this observation a promising opportunity for future work.</p>
<sec id="S4.SS1">
<title>4.1. Limitations</title>
<p>While this study examined suicide risk with reasonably high performance, it relied on a single crisis counseling encounter source for data. Moreover, the true risk of suicidality could not be reliably determined and instead relied on counselor-provided dispositions. These dispositions, used for model training, depend on accurate labeling from the counselors. Similarly, counselors may have access to historical information, such as prior utilization of the SafeUT app. It is possible that counselors do not adhere to the same standards for disposition selection or consideration of historical client information, potentially biasing model training results. This is further suggested by the manual assessment of errors and evaluation of dialogue arcs, with some &#x201C;lower risk&#x201D; crisis counseling encounters (deemed &#x201C;lower risk&#x201D; by the counselor dispositions and classified as having &#x201C;higher risk&#x201D; by the neural model) found to contain suicide-related content or risk-associated discourse (e.g., active self-harm, abuse, and requests for emergency responders). As such, future research should utilize human coding to evaluate and establish a baseline for suicidality (both client expression and counselor assessment). It is important to point out that in spite of the advances brought by modern NLP methods to the mental health community, further studies need to be done to assess the robustness of these systems (<xref ref-type="bibr" rid="B51">51</xref>&#x2013;<xref ref-type="bibr" rid="B53">53</xref>). Lastly, client demographic data was not available for this study. Large language models have been shown to underperform when utilized by populations these models were not trained on (<xref ref-type="bibr" rid="B54">54</xref>&#x2013;<xref ref-type="bibr" rid="B57">57</xref>). As such, it is possible these study findings may not generalize to populations whose racial, ethnic, or cultural demographics differ from the population in this study.</p>
</sec>
</sec>
<sec id="S5" sec-type="conclusion">
<title>5. Conclusion</title>
<p>We observed that NLP-based models are capable of detecting suicide risk at the conversation level in text-based crisis encounters, suggesting important practice implications. Our results show strong AUC performance, and improved sensitivity over a tf-idf baseline, from a modern neural transformer-based architecture. Manual analysis indicates that these models can learn appropriate indicators of risk, making them robust to the inherent noise of the labels in real-time crisis encounter services. Furthermore, the dialogue risk curves are a novel demonstration of how risk prediction fluctuates at the message level, capturing the dynamics of risk throughout the different stages of real crisis counseling conversations.</p>
</sec>
<sec id="S6" sec-type="data-availability">
<title>Data availability statement</title>
<p>The datasets presented in this article are not readily available. Due to privacy and ethical concerns, the data cannot be made public. Requests to access the datasets should be directed to KA, <email>kate.axford@utah.edu</email>.</p>
</sec>
<sec id="S7" sec-type="ethics-statement">
<title>Ethics statement</title>
<p>The studies involving human participants were reviewed and approved by the University of Utah Institutional Review Board. Written informed consent from the participants&#x2019; legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.</p>
</sec>
<sec id="S8" sec-type="author-contributions">
<title>Author contributions</title>
<p>MB and MM contributed to the data preparation and analysis. MB, MM, and KA took the lead in the writing and preparation of the manuscript. MM, XZ, and VS designed the model. VS, BK, and ZI provided the feedback and insight that facilitated the analysis and manuscript. All authors contributed to the overall end product, contributed to the article, and approved the submitted version.</p>
</sec>
</body>
<back>
<sec id="S9" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>ZI is co-founder and minority equity stakeholder of a technology company, Lyssn.io that is focused on developing computational models that quantify aspects of patient-provider interactions in psychotherapy. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="S10" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="S11" sec-type="supplementary-material">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fpsyt.2023.1110527/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fpsyt.2023.1110527/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.docx" id="DS1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<fn-group>
<fn id="footnote1">
<label>1</label>
<p>The size of the encounters and hardware limitations prevented us from fine-tuning the RoBERTa model.</p></fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1"><label>1.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoffberg</surname> <given-names>AS</given-names></name> <name><surname>Stearns-Yoder</surname> <given-names>KA</given-names></name> <name><surname>Brenner</surname> <given-names>LA</given-names></name></person-group>. <article-title>The effectiveness of crisis line services: a systematic review.</article-title> <source><italic>Front Publ Health.</italic></source> (<year>2019</year>) <volume>7</volume>:<issue>399</issue>. <pub-id pub-id-type="doi">10.3389/fpubh.2019.00399</pub-id> <pub-id pub-id-type="pmid">32010655</pub-id></citation></ref>
<ref id="B2"><label>2.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Evans</surname> <given-names>WP</given-names></name> <name><surname>Davidson</surname> <given-names>L</given-names></name> <name><surname>Sicafuse</surname> <given-names>L</given-names></name></person-group>. <article-title>Someone to listen: increasing youth help-seeking behavior through a text-based crisis line for Youth.</article-title> <source><italic>J Commun Psychol.</italic></source> (<year>2013</year>) <volume>41</volume>:<fpage>471</fpage>&#x2013;<lpage>87</lpage>. <pub-id pub-id-type="doi">10.1002/jcop.21551</pub-id></citation></ref>
<ref id="B3"><label>3.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kalafat</surname> <given-names>J</given-names></name> <name><surname>Gould</surname> <given-names>MS</given-names></name> <name><surname>Munfakh</surname> <given-names>J</given-names></name> <name><surname>Kleinman</surname> <given-names>M</given-names></name></person-group>. <article-title>An evaluation of crisis hotline outcomes: Part 1: Non-suicidal crisis callers.</article-title> <source><italic>Suic Life Threat Behav.</italic></source> (<year>2007</year>) <volume>37</volume>:<fpage>322</fpage>&#x2013;<lpage>37</lpage>. <pub-id pub-id-type="doi">10.1521/suli.2007.37.3.322</pub-id> <pub-id pub-id-type="pmid">17579544</pub-id></citation></ref>
<ref id="B4"><label>4.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gould</surname> <given-names>MS</given-names></name> <name><surname>Chowdhury</surname> <given-names>S</given-names></name> <name><surname>Lake</surname> <given-names>AM</given-names></name> <name><surname>Galfalvy</surname> <given-names>H</given-names></name> <name><surname>Kleinman</surname> <given-names>M</given-names></name> <name><surname>Kuchuk</surname> <given-names>M</given-names></name><etal/></person-group> <article-title>National Suicide Prevention Lifeline crisis chat interventions: evaluation of chatters&#x2019; perceptions of effectiveness.</article-title> <source><italic>Suic Life Threat Behav.</italic></source> (<year>2021</year>) <volume>51</volume>:<fpage>1126</fpage>&#x2013;<lpage>37</lpage>. <pub-id pub-id-type="doi">10.1111/sltb.12795</pub-id> <pub-id pub-id-type="pmid">34331471</pub-id></citation></ref>
<ref id="B5"><label>5.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gould</surname> <given-names>MS</given-names></name> <name><surname>Kalafat</surname> <given-names>J</given-names></name> <name><surname>Munfakh</surname> <given-names>J</given-names></name> <name><surname>Kleinman</surname> <given-names>M</given-names></name></person-group>. <article-title>An evaluation of crisis hotline outcomes: Part 2: suicidal callers.</article-title> <source><italic>Suic Life Threat Behav.</italic></source> (<year>2007</year>) <volume>37</volume>:<fpage>338</fpage>&#x2013;<lpage>52</lpage>.</citation></ref>
<ref id="B6"><label>6.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fukkink</surname> <given-names>RG</given-names></name> <name><surname>Hermanns</surname> <given-names>JMA</given-names></name></person-group>. <article-title>Counseling children at a helpline: Chatting or calling?</article-title> <source><italic>J Commun Psychol.</italic></source> (<year>2009</year>) <volume>37</volume>:<fpage>939</fpage>&#x2013;<lpage>48</lpage>. <pub-id pub-id-type="doi">10.1002/jcop.20340</pub-id></citation></ref>
<ref id="B7"><label>7.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gatti</surname> <given-names>FM</given-names></name> <name><surname>Brivio</surname> <given-names>E</given-names></name> <name><surname>Calciano</surname> <given-names>S</given-names></name></person-group>. <article-title>&#x201C;Hello! I know you help people here, right?&#x201D;: a qualitative study of young people&#x2019;s acted motivations in text-based counseling.</article-title> <source><italic>Child Y Serv Rev.</italic></source> (<year>2016</year>) <volume>71</volume>:<fpage>27</fpage>&#x2013;<lpage>35</lpage>. <pub-id pub-id-type="doi">10.1016/j.childyouth.2016.10.029</pub-id></citation></ref>
<ref id="B8"><label>8.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gibson</surname> <given-names>K</given-names></name> <name><surname>Cartwright</surname> <given-names>C</given-names></name></person-group>. <article-title>Young people&#x2019;s experiences of mobile phone text counselling: Balancing connection and control.</article-title> <source><italic>Child Y Serv Rev.</italic></source> (<year>2014</year>) <volume>43</volume>:<fpage>96</fpage>&#x2013;<lpage>104</lpage>. <pub-id pub-id-type="doi">10.1016/j.childyouth.2014.05.010</pub-id></citation></ref>
<ref id="B9"><label>9.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lenhart</surname> <given-names>A</given-names></name> <name><surname>Ling</surname> <given-names>R</given-names></name> <name><surname>Campbell</surname> <given-names>S</given-names></name> <name><surname>Purcell</surname> <given-names>K.</given-names></name></person-group> <source><italic>Teens and mobile phones.</italic></source> <publisher-loc>Washington, DC</publisher-loc>: <publisher-name>Pew Research Center</publisher-name> (<year>2010</year>).</citation></ref>
<ref id="B10"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mokkenstorm</surname> <given-names>JK</given-names></name> <name><surname>Eikelenboom</surname> <given-names>M</given-names></name> <name><surname>Huisman</surname> <given-names>A</given-names></name> <name><surname>Wiebenga</surname> <given-names>J</given-names></name> <name><surname>Glissen</surname> <given-names>R</given-names></name> <name><surname>Kerkhof</surname> <given-names>JFM</given-names></name><etal/></person-group> <article-title>Evaluation of the 113 online suicide prevention crisis chat service: outcomes, helper behaviors and comparison to telephone hotlines.</article-title> <source><italic>Suic Life Threat Behav.</italic></source> (<year>2017</year>) <volume>47</volume>:<fpage>282</fpage>&#x2013;<lpage>96</lpage>. <pub-id pub-id-type="doi">10.1111/sltb.12286</pub-id> <pub-id pub-id-type="pmid">27539122</pub-id></citation></ref>
<ref id="B11"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haner</surname> <given-names>D</given-names></name> <name><surname>Pepler</surname> <given-names>D</given-names></name></person-group>. <article-title>&#x201C;Live Chat&#x201D; clients at kids help phone: Individual characteristics and problem topics.</article-title> <source><italic>J Canad Acad Child Adolesc Psych.</italic></source> (<year>2016</year>) <volume>25</volume>: <fpage>138</fpage>&#x2013;<lpage>44</lpage>. <pub-id pub-id-type="pmid">27924143</pub-id></citation></ref>
<ref id="B12"><label>12.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fukkink</surname> <given-names>RG</given-names></name> <name><surname>Hermanns</surname> <given-names>JMA</given-names></name></person-group>. <article-title>Children&#x2019;s experiences with chat support and telephone support.</article-title> <source><italic>J Child Psychol Psych.</italic></source> (<year>2009</year>) <volume>50</volume>:<fpage>759</fpage>&#x2013;<lpage>66</lpage>. <pub-id pub-id-type="doi">10.1111/j.1469-7610.2008.02024.x</pub-id> <pub-id pub-id-type="pmid">19207634</pub-id></citation></ref>
<ref id="B13"><label>13.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lublin</surname> <given-names>N.</given-names></name></person-group> <source><italic>Notes from Nancy and Bob: 100 Million messages.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Crisis Text Line</publisher-name> (<year>2009</year>).</citation></ref>
<ref id="B14"><label>14.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Morse</surname> <given-names>G</given-names></name> <name><surname>Salyers</surname> <given-names>MP</given-names></name> <name><surname>Rollins</surname> <given-names>AL</given-names></name> <name><surname>Monroe-DeVita</surname> <given-names>M</given-names></name> <name><surname>Pfahler</surname> <given-names>C</given-names></name></person-group>. <article-title>Burnout in mental health services: a review of the problem and its remediation.</article-title> <source><italic>Administr Policy Mental Health.</italic></source> (<year>2011</year>) <volume>39</volume>:<fpage>341</fpage>&#x2013;<lpage>52</lpage>. <pub-id pub-id-type="doi">10.1007/s10488-011-0352-1</pub-id> <pub-id pub-id-type="pmid">21533847</pub-id></citation></ref>
<ref id="B15"><label>15.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bowden</surname> <given-names>GE</given-names></name> <name><surname>Smith</surname> <given-names>JC</given-names></name> <name><surname>Parker</surname> <given-names>PA</given-names></name> <name><surname>Boxall</surname> <given-names>MJ</given-names></name></person-group>. <article-title>Working on the edge: stresses and rewards of work in a front-line mental health service.</article-title> <source><italic>Clin Psychol Psychother.</italic></source> (<year>2015</year>) <volume>22</volume>:<fpage>488</fpage>&#x2013;<lpage>501</lpage>. <pub-id pub-id-type="doi">10.1002/cpp.1912</pub-id> <pub-id pub-id-type="pmid">25044605</pub-id></citation></ref>
<ref id="B16"><label>16.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kitchingman</surname> <given-names>TA</given-names></name> <name><surname>Caputi</surname> <given-names>P</given-names></name> <name><surname>Woodward</surname> <given-names>A</given-names></name> <name><surname>Wilson</surname> <given-names>CJ</given-names></name> <name><surname>Wilson</surname> <given-names>I</given-names></name></person-group>. <article-title>The impact of their role on telephone crisis support workers&#x2019; psychological wellbeing and functioning: Quantitative findings from a mixed methods investigation.</article-title> <source><italic>PLoS One.</italic></source> (<year>2018</year>) <volume>13</volume>:<issue>e0207645</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0207645</pub-id> <pub-id pub-id-type="pmid">30566435</pub-id></citation></ref>
<ref id="B17"><label>17.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Willems</surname> <given-names>R</given-names></name> <name><surname>Drossaert</surname> <given-names>C</given-names></name> <name><surname>Vuijk</surname> <given-names>P</given-names></name> <name><surname>Bohlmeijer</surname> <given-names>E</given-names></name></person-group>. <article-title>Impact of crisis line volunteering on mental wellbeing and the associated factors: a systematic review.</article-title> <source><italic>Int J Environ Res Publ Health.</italic></source> (<year>2020</year>) <volume>17</volume>:<issue>1641</issue>. <pub-id pub-id-type="doi">10.3390/ijerph17051641</pub-id> <pub-id pub-id-type="pmid">32138360</pub-id></citation></ref>
<ref id="B18"><label>18.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>V</given-names></name></person-group>. <source><italic>As calls to the suicide prevention lifeline surge, under-resourced centers struggle to keep up.</italic></source> <publisher-loc>Arlington, VA</publisher-loc>: <publisher-name>PBS</publisher-name> (<year>2018</year>).</citation></ref>
<ref id="B19"><label>19.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Beidas</surname> <given-names>RS</given-names></name> <name><surname>Marcus</surname> <given-names>S</given-names></name> <name><surname>Wolk</surname> <given-names>CB</given-names></name> <name><surname>Powell</surname> <given-names>B</given-names></name> <name><surname>Aarons</surname> <given-names>GA</given-names></name> <name><surname>Evans</surname> <given-names>AC</given-names></name><etal/></person-group> <article-title>A prospective examination of clinician and supervisor turnover within the context of implementation of evidence-based practices in a publicly-funded mental health system.</article-title> <source><italic>Administr Policy Mental Health Mental Health Ser Res.</italic></source> (<year>2015</year>) <volume>43</volume>:<fpage>640</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1007/s10488-015-0673-6</pub-id> <pub-id pub-id-type="pmid">26179469</pub-id></citation></ref>
<ref id="B20"><label>20.</label><citation citation-type="journal"><collab>World Health Organization [WHO].</collab> <source><italic>Preventing Suicide: A Resource for Establishing a Crisis Line.</italic></source> <publisher-loc>Geneva</publisher-loc>: <publisher-name>World Health Organization</publisher-name> (<year>2018</year>).</citation></ref>
<ref id="B21"><label>21.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Joiner</surname> <given-names>T</given-names></name> <name><surname>Kalafat</surname> <given-names>J</given-names></name> <name><surname>Draper</surname> <given-names>J</given-names></name> <name><surname>Stokes</surname> <given-names>H</given-names></name> <name><surname>Knudson</surname> <given-names>M</given-names></name> <name><surname>Berman</surname> <given-names>AL</given-names></name><etal/></person-group> <article-title>Establishing standards for the assessment of suicide risk among callers to the National Suicide Prevention Lifeline.</article-title> <source><italic>Suic Life Threat Behav.</italic></source> (<year>2007</year>) <volume>37</volume>:<fpage>353</fpage>&#x2013;<lpage>65</lpage>. <pub-id pub-id-type="doi">10.1521/suli.2007.37.3.353</pub-id> <pub-id pub-id-type="pmid">17579546</pub-id></citation></ref>
<ref id="B22"><label>22.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ramchand</surname> <given-names>R</given-names></name> <name><surname>Jaycox</surname> <given-names>L</given-names></name> <name><surname>Ebener</surname> <given-names>P</given-names></name> <name><surname>Gilbert</surname> <given-names>ML</given-names></name> <name><surname>Barnes-Proby</surname> <given-names>D</given-names></name> <name><surname>Goutam</surname> <given-names>P</given-names></name></person-group>. <article-title>Characteristics and proximal outcomes of calls made to suicide crisis hotlines in California.</article-title> <source><italic>Crisis.</italic></source> (<year>2017</year>) <volume>38</volume>:<fpage>26</fpage>&#x2013;<lpage>35</lpage>. <pub-id pub-id-type="doi">10.1027/0227-5910/a000401</pub-id> <pub-id pub-id-type="pmid">27338290</pub-id></citation></ref>
<ref id="B23"><label>23.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mishara</surname> <given-names>BL</given-names></name> <name><surname>Chagnon</surname> <given-names>F</given-names></name> <name><surname>Daigle</surname> <given-names>M</given-names></name> <name><surname>Balan</surname> <given-names>B</given-names></name> <name><surname>Raymond</surname> <given-names>S</given-names></name> <name><surname>Marcoux</surname> <given-names>I</given-names></name><etal/></person-group> <article-title>Comparing models of helper behavior to actual practice in telephone crisis intervention: A silent monitoring study of calls to the U.S. 1-800-SUICIDE Network.</article-title> <source><italic>Suic Life Threat Behav.</italic></source> (<year>2007</year>) <volume>37</volume>:<fpage>291</fpage>&#x2013;<lpage>307</lpage>. <pub-id pub-id-type="doi">10.1521/suli.2007.37.3.291</pub-id> <pub-id pub-id-type="pmid">17579542</pub-id></citation></ref>
<ref id="B24"><label>24.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>MD</given-names></name> <name><surname>Khanna</surname> <given-names>R</given-names></name> <name><surname>Najafi</surname> <given-names>N</given-names></name></person-group>. <article-title>Characterizing the source of text in electronic health record progress notes.</article-title> <source><italic>JAMA Intern Med.</italic></source> (<year>2017</year>) <volume>177</volume>:<issue>1212</issue>. <pub-id pub-id-type="doi">10.1001/jamainternmed.2017.1548</pub-id> <pub-id pub-id-type="pmid">28558106</pub-id></citation></ref>
<ref id="B25"><label>25.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Imel</surname> <given-names>ZE</given-names></name> <name><surname>Steyvers</surname> <given-names>M</given-names></name> <name><surname>Atkins</surname> <given-names>DC</given-names></name></person-group>. <article-title>Computational psychotherapy research: Scaling up the evaluation of patient&#x2013;provider interactions.</article-title> <source><italic>Psychotherapy.</italic></source> (<year>2015</year>) <volume>52</volume>:<fpage>19</fpage>&#x2013;<lpage>30</lpage>. <pub-id pub-id-type="doi">10.1037/a0036841</pub-id> <pub-id pub-id-type="pmid">24866972</pub-id></citation></ref>
<ref id="B26"><label>26.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chowdhury</surname> <given-names>GG</given-names></name></person-group>. <article-title>Natural language processing.</article-title> <source><italic>Annu Rev Inform Sci Technol.</italic></source> (<year>2003</year>) <volume>37</volume>:<fpage>51</fpage>&#x2013;<lpage>89</lpage>. <pub-id pub-id-type="doi">10.1002/aris.1440370103</pub-id></citation></ref>
<ref id="B27"><label>27.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bernert</surname> <given-names>RA</given-names></name> <name><surname>Hilberg</surname> <given-names>AM</given-names></name> <name><surname>Melia</surname> <given-names>R</given-names></name> <name><surname>Kim</surname> <given-names>JP</given-names></name> <name><surname>Shah</surname> <given-names>NH</given-names></name> <name><surname>Abnousi</surname> <given-names>F</given-names></name></person-group>. <article-title>Artificial intelligence and suicide prevention: A systematic review of machine learning investigations.</article-title> <source><italic>Int J Environ Res Publ Health.</italic></source> (<year>2020</year>) <volume>17</volume>:<issue>5929</issue>. <pub-id pub-id-type="doi">10.3390/ijerph17165929</pub-id> <pub-id pub-id-type="pmid">32824149</pub-id></citation></ref>
<ref id="B28"><label>28.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fonseka</surname> <given-names>TM</given-names></name> <name><surname>Bhat</surname> <given-names>V</given-names></name> <name><surname>Kennedy</surname> <given-names>SH</given-names></name></person-group>. <article-title>The utility of artificial intelligence in suicide risk prediction and the management of suicidal behaviors.</article-title> <source><italic>Austral N Zeal J Psychiatry.</italic></source> (<year>2019</year>) <volume>53</volume>:<fpage>954</fpage>&#x2013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.1177/0004867419864428</pub-id> <pub-id pub-id-type="pmid">31347389</pub-id></citation></ref>
<ref id="B29"><label>29.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Young</surname> <given-names>T</given-names></name> <name><surname>Hazarika</surname> <given-names>D</given-names></name> <name><surname>Poria</surname> <given-names>S</given-names></name> <name><surname>Cambria</surname> <given-names>E</given-names></name></person-group>. <article-title>Recent trends in deep learning based Natural Language Processing [review article].</article-title> <source><italic>IEEE Comput Intellig Magaz.</italic></source> (<year>2018</year>) <volume>13</volume>:<fpage>55</fpage>&#x2013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.1109/mci.2018.2840738</pub-id></citation></ref>
<ref id="B30"><label>30.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goldberg</surname> <given-names>Y</given-names></name></person-group>. <article-title>A Primer on neural network models for Natural Language Processing.</article-title> <source><italic>arXiv</italic></source> [<comment>Preprint</comment>]. (<year>2015</year>) <pub-id pub-id-type="doi">10.48550/arXiv.1510.00726</pub-id></citation></ref>
<ref id="B31"><label>31.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>Y</given-names></name></person-group>. <article-title>Convolutional neural networks for sentence classification.</article-title> <source><italic>Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP).</italic></source> <publisher-loc>Doha</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name> (<year>2014</year>). <pub-id pub-id-type="doi">10.3115/v1/d14-1181</pub-id></citation></ref>
<ref id="B32"><label>32.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ji</surname> <given-names>S</given-names></name> <name><surname>Pan</surname> <given-names>S</given-names></name> <name><surname>Li</surname> <given-names>X</given-names></name> <name><surname>Cambria</surname> <given-names>E</given-names></name> <name><surname>Long</surname> <given-names>G</given-names></name> <name><surname>Huang</surname> <given-names>Z</given-names></name></person-group>. <article-title>Suicidal ideation detection: A review of machine learning methods and applications.</article-title> <source><italic>IEEE Transac Comput Soc Syst.</italic></source> (<year>2021</year>) <volume>8</volume>:<fpage>214</fpage>&#x2013;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.1109/tcss.2020.3021467</pub-id></citation></ref>
<ref id="B33"><label>33.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bantilan</surname> <given-names>N</given-names></name> <name><surname>Malgaroli</surname> <given-names>M</given-names></name> <name><surname>Ray</surname> <given-names>B</given-names></name> <name><surname>Hull</surname> <given-names>TD</given-names></name></person-group>. <article-title>Just in Time Crisis response: Suicide alert system for telemedicine psychotherapy settings.</article-title> <source><italic>Psychother Res.</italic></source> (<year>2021</year>) <volume>31</volume>:<fpage>289</fpage>&#x2013;<lpage>99</lpage>. <pub-id pub-id-type="doi">10.1080/10503307.2020.1781952</pub-id> <pub-id pub-id-type="pmid">32558625</pub-id></citation></ref>
<ref id="B34"><label>34.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cusick</surname> <given-names>M</given-names></name> <name><surname>Adekkanattu</surname> <given-names>P</given-names></name> <name><surname>Campion</surname> <given-names>TR</given-names> <suffix>Jr</suffix></name> <name><surname>Sholle</surname> <given-names>ET</given-names></name> <name><surname>Myers</surname> <given-names>A</given-names></name> <name><surname>Banerjee</surname> <given-names>S</given-names></name><etal/></person-group> <article-title>Using weak supervision and deep learning to classify clinical notes for identification of current suicidal ideation.</article-title> <source><italic>J Psychiatr Res.</italic></source> (<year>2021</year>) <volume>136</volume>:<fpage>95</fpage>&#x2013;<lpage>102</lpage>. <pub-id pub-id-type="doi">10.1016/j.jpsychires.2021.01.052</pub-id> <pub-id pub-id-type="pmid">33581461</pub-id></citation></ref>
<ref id="B35"><label>35.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ophir</surname> <given-names>Y</given-names></name> <name><surname>Tikochinski</surname> <given-names>R</given-names></name> <name><surname>Asterhan</surname> <given-names>C</given-names></name> <name><surname>Sisso</surname> <given-names>I</given-names></name> <name><surname>Reichart</surname> <given-names>R</given-names></name></person-group>. <article-title>Deep neural networks detect suicide risk from textual Facebook posts.</article-title> <source><italic>Sci Rep.</italic></source> (<year>2020</year>) <volume>10</volume>:<issue>16685</issue>. <pub-id pub-id-type="doi">10.31234/osf.io/k47hr</pub-id></citation></ref>
<ref id="B36"><label>36.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Simon</surname> <given-names>GE</given-names></name> <name><surname>Shortreed</surname> <given-names>SM</given-names></name> <name><surname>Coley</surname> <given-names>RY</given-names></name></person-group>. <article-title>Positive predictive values and potential success of suicide prediction models.</article-title> <source><italic>JAMA Psychiatry.</italic></source> (<year>2019</year>) <volume>76</volume>:<fpage>868</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1001/jamapsychiatry.2019.1516</pub-id> <pub-id pub-id-type="pmid">31241735</pub-id></citation></ref>
<ref id="B37"><label>37.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Velupillai</surname> <given-names>S</given-names></name> <name><surname>Epstein</surname> <given-names>S</given-names></name> <name><surname>Bittar</surname> <given-names>A</given-names></name> <name><surname>Stephenson</surname> <given-names>T</given-names></name> <name><surname>Dutta</surname> <given-names>R</given-names></name> <name><surname>Downs</surname> <given-names>J</given-names></name></person-group>. <article-title>Identifying suicidal adolescents from mental health records using natural language processing.</article-title> <source><italic>Stud Health Technol Inform.</italic></source> (<year>2019</year>) <volume>264</volume>:<fpage>413</fpage>&#x2013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.3233/SHTI190254</pub-id> <pub-id pub-id-type="pmid">31437956</pub-id></citation></ref>
<ref id="B38"><label>38.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Downs</surname> <given-names>J</given-names></name> <name><surname>Velupillai</surname> <given-names>S</given-names></name> <name><surname>George</surname> <given-names>G</given-names></name> <name><surname>Holden</surname> <given-names>R</given-names></name> <name><surname>Kikoler</surname> <given-names>M</given-names></name> <name><surname>Dean</surname> <given-names>H</given-names></name><etal/></person-group> <article-title>Detection of suicidality in adolescents with autism spectrum disorders: Developing a natural language processing approach for use in electronic health records.</article-title> <source><italic>AMIA.</italic></source> (<year>2017</year>) <volume>2017</volume>:<fpage>641</fpage>&#x2013;<lpage>9</lpage>.</citation></ref>
<ref id="B39"><label>39.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fernandes</surname> <given-names>AC</given-names></name> <name><surname>Dutta</surname> <given-names>R</given-names></name> <name><surname>Velupillai</surname> <given-names>S</given-names></name> <name><surname>Sanyal</surname> <given-names>J</given-names></name> <name><surname>Stewart</surname> <given-names>R</given-names></name> <name><surname>Chandran</surname> <given-names>D</given-names></name></person-group>. <article-title>Identifying suicide ideation and suicidal attempts in a psychiatric clinical research database using natural language processing.</article-title> <source><italic>Sci Rep.</italic></source> (<year>2018</year>) <volume>8</volume>:<issue>7426</issue>. <pub-id pub-id-type="doi">10.1038/s41598-018-25773-2</pub-id> <pub-id pub-id-type="pmid">29743531</pub-id></citation></ref>
<ref id="B40"><label>40.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coppersmith</surname> <given-names>G</given-names></name> <name><surname>Leary</surname> <given-names>R</given-names></name> <name><surname>Crutchley</surname> <given-names>P</given-names></name> <name><surname>Fine</surname> <given-names>A</given-names></name></person-group>. <article-title>Natural language processing of social media as screening for suicide risk.</article-title> <source><italic>Biomed Inform Insights.</italic></source> (<year>2018</year>) <volume>10</volume>:<issue>1178222618792860</issue>. <pub-id pub-id-type="doi">10.1177/1178222618792860</pub-id> <pub-id pub-id-type="pmid">30158822</pub-id></citation></ref>
<ref id="B41"><label>41.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Z</given-names></name> <name><surname>Xu</surname> <given-names>Y</given-names></name> <name><surname>Cheung</surname> <given-names>F</given-names></name> <name><surname>Cheng</surname> <given-names>M</given-names></name> <name><surname>Lung</surname> <given-names>D</given-names></name> <name><surname>Law</surname> <given-names>YW</given-names></name><etal/></person-group> <article-title>Detecting suicide risk using knowledge-aware natural language processing and counseling service data.</article-title> <source><italic>Soc Sci Med.</italic></source> (<year>2021</year>) <volume>283</volume>:<issue>114176</issue>. <pub-id pub-id-type="doi">10.1016/j.socscimed.2021.114176</pub-id> <pub-id pub-id-type="pmid">34214846</pub-id></citation></ref>
<ref id="B42"><label>42.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>Y</given-names></name> <name><surname>Ott</surname> <given-names>M</given-names></name> <name><surname>Goyal</surname> <given-names>N</given-names></name><etal/></person-group> <article-title>RoBERTa: a robustly optimized BERT pretraining approach.</article-title> <source><italic>arXiv</italic></source> [<comment>Preprint</comment>]. (<year>2019</year>) <pub-id pub-id-type="doi">10.48550/arXiv.1907.11692</pub-id></citation></ref>
<ref id="B43"><label>43.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>Y</given-names></name> <name><surname>Chaudhuri</surname> <given-names>K</given-names></name></person-group>. <article-title>Understanding rare spurious correlations in neural networks.</article-title> <source><italic>arXiv</italic></source> [<comment>Preprint</comment>]. (<year>2022</year>) <pub-id pub-id-type="doi">10.48550/arXiv.2202.05189</pub-id></citation></ref>
<ref id="B44"><label>44.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gupta</surname> <given-names>A</given-names></name> <name><surname>Kvernadze</surname> <given-names>G</given-names></name> <name><surname>Srikumar</surname> <given-names>V</given-names></name></person-group>. <article-title>BERT and family eat word salad: Experiments with text understanding.</article-title> <source><italic>Proceedings of the AAAI conference on artificial intelligence.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name> (<year>2021</year>). <fpage>12946</fpage>&#x2013;<lpage>54</lpage>.</citation></ref>
<ref id="B45"><label>45.</label><citation citation-type="journal"><collab>National Suicide Prevention Lifeline.</collab> <source><italic>Best practices.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>National Suicide Prevention Lifeline</publisher-name> (<year>2007</year>).</citation></ref>
<ref id="B46"><label>46.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hosmer</surname> <given-names>DW</given-names> <suffix>Jr</suffix></name> <name><surname>Lemeshow</surname> <given-names>S</given-names></name> <name><surname>Sturdivant</surname> <given-names>RX</given-names></name></person-group>. <source><italic>Applied logistic regression.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Wiley</publisher-name> (<year>2013</year>).</citation></ref>
<ref id="B47"><label>47.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vaswani</surname> <given-names>A</given-names></name> <name><surname>Shazeer</surname> <given-names>N</given-names></name> <name><surname>Parmar</surname> <given-names>N</given-names></name> <name><surname>Uszkoreit</surname> <given-names>J</given-names></name> <name><surname>Jones</surname> <given-names>L</given-names></name> <name><surname>Gomez</surname> <given-names>AN</given-names></name><etal/></person-group> <article-title>Attention is all you need.</article-title> <source><italic>arXiv</italic></source> [<comment>Preprint</comment>]. (<year>2017</year>) <pub-id pub-id-type="doi">10.48550/arXiv.1706.03762</pub-id></citation></ref>
<ref id="B48"><label>48.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Devlin</surname> <given-names>J</given-names></name> <name><surname>Chang</surname> <given-names>M</given-names></name> <name><surname>Lee</surname> <given-names>K</given-names></name> <name><surname>Toutanova</surname> <given-names>K</given-names></name></person-group>. <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding.</article-title> <source><italic>NAACL-HLT.</italic></source> (<year>2019</year>) <volume>2019</volume>:<fpage>4171</fpage>&#x2013;<lpage>86</lpage>.</citation></ref>
<ref id="B49"><label>49.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gururangan</surname> <given-names>S</given-names></name> <name><surname>Marasovi&#x0107;</surname> <given-names>A</given-names></name> <name><surname>Swayamdipta</surname> <given-names>S</given-names></name> <name><surname>Lo</surname> <given-names>K</given-names></name> <name><surname>Beltagy</surname> <given-names>I</given-names></name> <name><surname>Downey</surname> <given-names>D</given-names></name><etal/></person-group> <article-title>Don&#x2019;t stop pretraining: Adapt language models to domains and tasks.</article-title> <source><italic>Proceedings of the 58th annual meeting of the association for computational linguistics.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name> (<year>2020</year>). <fpage>8342</fpage>&#x2013;<lpage>60</lpage>.</citation></ref>
<ref id="B50"><label>50.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Issa</surname> <given-names>S</given-names></name> <name><surname>Adekunle</surname> <given-names>O</given-names></name> <name><surname>Hamdi</surname> <given-names>F</given-names></name> <name><surname>Cherfi</surname> <given-names>SS</given-names></name> <name><surname>Dumontier</surname> <given-names>M</given-names></name> <name><surname>Zaveri</surname> <given-names>A</given-names></name></person-group>. <article-title>Knowledge graph completeness: a systematic literature review.</article-title> <source><italic>IEEE Access.</italic></source> (<year>2021</year>) <volume>9</volume>:<fpage>31322</fpage>&#x2013;<lpage>39</lpage>.</citation></ref>
<ref id="B51"><label>51.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bender</surname> <given-names>EM</given-names></name> <name><surname>Koller</surname> <given-names>A</given-names></name></person-group>. <article-title>Climbing towards NLU: On meaning, form, and understanding in the age of data.</article-title> <source><italic>Proceedings of the 58th annual meeting of the association for computational linguistics.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name> (<year>2020</year>). <fpage>5185</fpage>&#x2013;<lpage>98</lpage>.</citation></ref>
<ref id="B52"><label>52.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ribeiro</surname> <given-names>MT</given-names></name> <name><surname>Wu</surname> <given-names>T</given-names></name> <name><surname>Guestrin</surname> <given-names>C</given-names></name> <name><surname>Singh</surname> <given-names>S</given-names></name></person-group>. <article-title>Beyond accuracy: Behavioral testing of NLP models with a checklist.</article-title> <source><italic>Proceedings of the thirtieth international joint conference on artificial intelligence.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name> (<year>2021</year>). <fpage>4824</fpage>&#x2013;<lpage>8</lpage>.</citation></ref>
<ref id="B53"><label>53.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rogers</surname> <given-names>A</given-names></name> <name><surname>Kovaleva</surname> <given-names>O</given-names></name> <name><surname>Rumshisky</surname> <given-names>A</given-names></name></person-group>. <article-title>A primer in BERTology: What we know about how BERT works.</article-title> <source><italic>Transact Assoc Comput Ling.</italic></source> (<year>2020</year>) <volume>8</volume>:<fpage>842</fpage>&#x2013;<lpage>66</lpage>. <pub-id pub-id-type="doi">10.1162/tacl_a_00349</pub-id></citation></ref>
<ref id="B54"><label>54.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bender</surname> <given-names>EM</given-names></name> <name><surname>Gebru</surname> <given-names>T</given-names></name> <name><surname>McMillan-Major</surname> <given-names>A</given-names></name> <name><surname>Shmitchell</surname> <given-names>S</given-names></name></person-group>. <article-title>On the dangers of stochastic parrots: Can language models be too big?</article-title> <source><italic>Proceedings of the 2021 ACM conference on fairness, accountability, and transparency.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name> (<year>2021</year>). <pub-id pub-id-type="doi">10.1145/3442188.3445922</pub-id></citation></ref>
<ref id="B55"><label>55.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blodgett</surname> <given-names>SL</given-names></name> <name><surname>Barocas</surname> <given-names>S</given-names></name> <name><surname>Daum&#x00E9;</surname> <given-names>H</given-names> <suffix>III</suffix></name> <name><surname>Wallach</surname> <given-names>H</given-names></name></person-group>. <article-title>Language (technology) is power: A critical survey of &#x201C;bias&#x201D; in NLP.</article-title> <source><italic>Proceedings of the 58th annual meeting of the association for computational linguistics.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name> (<year>2020</year>). <pub-id pub-id-type="doi">10.18653/v1/2020.acl-main.485</pub-id> <pub-id pub-id-type="pmid">36568019</pub-id></citation></ref>
<ref id="B56"><label>56.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dev</surname> <given-names>S</given-names></name> <name><surname>Li</surname> <given-names>T</given-names></name> <name><surname>Phillips</surname> <given-names>JM</given-names></name> <name><surname>Srikumar</surname> <given-names>V</given-names></name></person-group>. <article-title>On measuring and mitigating biased inferences of word embeddings.</article-title> <source><italic>Proc AAAI Conf Artif Intellig.</italic></source> (<year>2020</year>) <volume>34</volume>:<issue>05</issue>. <pub-id pub-id-type="doi">10.1609/aaai.v34i05.6267</pub-id></citation></ref>
<ref id="B57"><label>57.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shah</surname> <given-names>DS</given-names></name> <name><surname>Schwartz</surname> <given-names>HA</given-names></name> <name><surname>Hovy</surname> <given-names>D</given-names></name></person-group>. <article-title>Predictive biases in natural language processing models: A conceptual framework and overview.</article-title> <source><italic>Proceedings of the 58th annual meeting of the association for computational linguistics.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name> (<year>2020</year>). <fpage>5248</fpage>&#x2013;<lpage>64</lpage>.</citation></ref>
</ref-list>
</back>
</article>
