MINI REVIEW article

Front. Comput. Sci., 02 August 2023
Sec. Computer Vision
Volume 5 - 2023 | https://doi.org/10.3389/fcomp.2023.1168712

The future of automated capture of social kinesic signals for psychiatric purposes

Judee K. Burgoon1*, Aaron C. Elkins2, Douglas Derrick3, Bradley Walls4 and Dimitris Metaxas5
  • 1Center for the Management of Information, University of Arizona, Tucson, AZ, United States
  • 2Department of Management Information, San Diego State University, San Diego, CA, United States
  • 3College of Information Science and Technology, University of Nebraska at Omaha, Omaha, NE, United States
  • 4Discern Science International, Tucson, AZ, United States
  • 5Center for Computational Biomedicine Imaging and Modeling (CBIM), Rutgers University, New Brunswick, NJ, United States

This article considers how computer vision can be enlisted for biomedical applications, specifically the measurement, data analytics and treatment of psychiatric disorders. Often, youngsters are too afraid or embarrassed to disclose their emotional and mental problems to human therapists. An AI system can be used not only to collect data in a non-threatening, ongoing manner and to record a patient's psychophysiological state over time, but also to analyze and output periodic results. It may thus be an efficient and effective means for therapists to plan treatments. We report on various tools for analyzing social kinesic signals for emotional and physiological states. Only one, AVATAR (and its predecessor SPECIES), both records a patient's state and outputs an analysis that flags problem areas for therapists. In this way, automated tools can augment human observation and judgment.

Introduction

Tools for detecting human emotional and cognitive states have advanced rapidly in recent years. Tools developed for one purpose have shown utility in additional arenas, thus serving a multiplicity of purposes. That is the case with tools that originated in the field of fraud and deception detection. Noncontact tools meant to passively and surreptitiously detect states of cognitive and emotional arousal may also register disruptions in one's mental, emotional and physiological state. Here, we demonstrate the application in the case of psychiatric disorders, such as bipolar disorder, anxiety, depression and suicidal tendencies. All of these disorders have linkages to arousal, anxiety and/or hidden emotional states. Drawing upon our research on deception and fraud detection, we demonstrate how cumulative signals from various sensors can be aggregated to correlate with, and predict, psychiatric states, which may aid in delivering useful treatment recommendations.

Background and foundations

The scope of human-computer interaction has been an ever-widening one, encompassing such domains as information technology design, entertainment technologies, cooperative work, medical care delivery, personality assessment and more (Salah et al., 2011) that relate to all manner of human intelligences, such as emotional intelligence, linguistic intelligence, logic and interpersonal intelligence (Salovey and Mayer, 1990; Gardner, 2011). This vast panorama exceeds our purview. Our goal in the current article is a more modest one, to take one slice out of the pie to propose augmenting human judgment with computer technology to detect and treat psychiatric disorders.

Many tools have been developed to assess humans' mental states. For example, VlogSense automatically measures and analyzes nonverbal conversational behavior shown while viewers watch YouTube videos (Biel et al., 2011). Computer vision measures such nonverbal behaviors as voice, gaze, facial expressions and head pose to assess team collaboration and personality traits (Jayagopi and Gatica-Perez, 2010; Jacques Junior et al., 2022). SPECIES (Special-Purpose, Embodied Conversational Intelligence with Environmental Sensors; Derrick, 2011) combines sensors located in an interviewing kiosk to conduct interviews that detect respondents' veracity. Wearables have been designed to give public speakers feedback about the effectiveness of the non-verbal facets of their presentations (Mihoub and Lefebvre, 2019). Using a tripartite system of computer vision for gaze estimation, a taxonomy to tag the implicit semantics of gaze patterns and machine learning to correlate the semantics with the gaze behavior, Okada et al. (2019) found that social gaze distinctly recognized group leaders. Computer scientists are also applying computer vision approaches to extract personality impressions from faces, postures and other kinesic behaviors (Jacques Junior et al., 2022). CogStack uses Electronic Health Records to alert when patients are at risk for a psychotic episode (Wang et al., 2020). Mental illness can be diagnosed from social media posts (Zhang et al., 2023).

In the foregoing examples, most systems deal with one-way transmission of signals by the patient or cooperative discourse between two parties. When it comes to dealing with therapeutic and non-cooperative discourse, however, it is more difficult to model communication because a patient or interviewee may be managing their behavior so as to mask undesirable past behavior or current troubling mental states. The interaction can better be likened to a legal context in which an interrogating attorney questioning a suspect (Keatley, 2020) must discern which behaviors can be believed and which are deceptive. Such discourse is regarded as adversarial or non-cooperative.

For over a century, the preferred technology used to assess noncooperative discourse has been the polygraph. It has been regarded as the gold standard for gauging when deception is or is not indicated (Vrij, 2008), even though the polygraph is not a lie detector per se. Rather, deceit is inferred from respiratory, cardiac and skin conductance responses that measure arousal and thus predict an individual's truthfulness (Grubin and Madsen, 2005). Even if psychiatric issues were related only to arousal, a device like the polygraph would still be infeasible as a psychiatric aid for several reasons. First and most problematic, the polygraph is a contact tool; in other words, the patient must be connected to the device. For most patients, being hooked up to the polygraph is intimidating; patients have various fears about it, such as that it will deliver an electric shock or reveal something about their physiological state that they do not want to divulge. Second, the set-up is time-consuming, as each of the behavioral sensors must be properly secured to the interviewee and calibrated to measure the optimal amplitude. A pneumograph around the chest measures respiration, a cardiosphygmograph around the arm measures blood pressure and pulse, and various leads to the fingers measure palmar sweat (skin conductance). All of this calibration takes a significant amount of time per patient. Third, the standard interview protocol for a properly done polygraph is itself time-consuming. It begins with a pre-test that includes detailed definitions of the question terminology and an explanation of the process to be followed. This is followed by the main set of interview questions and then a post-test during which the questions may be repeated. Fourth, the instruction-giving is done by the examiner, who may unconsciously introduce bias by vocal tone, tempo and word choice (Mitchell et al., 2005). Fifth, the presence of a human conducting the interview introduces the interviewee's fear of evaluation by the examiner. Patients become embarrassed when having to address sensitive topics. Finally, an expert must be trained to review and interpret the results. Subjective interpretation always introduces the potential for variability in judgment across patients and across time.

In sum, completion of a polygraph for each individual patient absorbs extensive time and labor. And the end result is only an assessment of the physiological aspects of arousal, excluding cognitive arousal, emotional distress, depression or veracity. Its accuracy is quite variable, being the highest when judging single-incident, past-tense crimes and lowest when judging future intentions and repetitive proclivities (National Research Council, 2003), such as recurrent bouts of depression or habitual lying.

The shortcomings of the polygraph highlight some of the criteria of a system for gauging psychiatric disorders. Ideally, it should be noncontact; the patient should be free of any cuffs, wires or other connectors, which in addition to removing “scary” wires and connectors also gives the individual freedom of movement and freedom to gesture. An ideal system should entail brief, straightforward instructions to the patient, brevity being one of its hallmarks. It should be valid on its face (measuring what it is meant to measure) and reliable (producing the same results on subsequent administrations), while minimizing fatigue and boredom. Finally, computerized analysis of results would obviate the need for human, and possibly biased and unreliable, interpretation.

One category of computer-based tool used by mental health clinicians is the neuropsychological test conducted with a computer or tablet. One popular tool is the Cambridge Neuropsychological Test Automated Battery (CANTAB), which measures the correctness of, and reaction time to, a series of computerized tests meant to assess visual memory, attention, working memory and planning (Fray et al., 1996; Smith et al., 2013). The patient sits in front of a computer with a touchscreen and is instructed to respond to the tasks presented on the screen. For example, one task called the Affective Go/No-go presents the patient with words differentiated by their valence (i.e., positive or negative), and the patient must identify the valence of each word. The patient's omission and commission errors as well as response delays are recorded and used to evaluate, diagnose, and support research in neuropsychological phenomena, such as correlating performance on CANTAB with fMRI data. Similar to the polygraph, this tool requires physical contact as well as human administration and interpretation of the results. It has an advantage over traditional interviews in that it has higher face validity: the tests and questions directly measure performance rather than asking for a subjective evaluation.
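The scoring logic described for a go/no-go style task can be sketched in a few lines of Python. This is only an illustrative sketch, not CANTAB's actual scoring procedure: the trial format, the sample data and the function name `score_go_no_go` are hypothetical.

```python
from statistics import mean

# Hypothetical trials: (is_target, responded, reaction_time_ms or None).
trials = [
    (True, True, 520),
    (True, False, None),   # omission: a target word was missed
    (False, True, 430),    # commission: a response to a distractor word
    (False, False, None),
    (True, True, 610),
]

def score_go_no_go(trials):
    """Count omission and commission errors and average the reaction
    time of correct responses to targets."""
    omissions = sum(1 for target, responded, _ in trials if target and not responded)
    commissions = sum(1 for target, responded, _ in trials if not target and responded)
    correct_rts = [rt for target, responded, rt in trials if target and responded]
    return {
        "omissions": omissions,
        "commissions": commissions,
        "mean_correct_rt_ms": mean(correct_rts) if correct_rts else None,
    }

result = score_go_no_go(trials)
```

Scores of this kind are performance measures rather than self-reports, which is the property the paragraph above credits with higher face validity.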

The AVATAR, or Automated Virtual Agent for Truth-Assessment in Real-time, was developed with such criteria in mind (Patton, 2008; Derrick, 2011; Nunamaker et al., 2011; Burgoon and Nunamaker, 2013; Elkins et al., 2013, 2014; Twitchell et al., 2013). It originated in the field of credibility assessment, marrying sensors that measure signals of credibility with interviews conducted by a virtual agent. Studies employing automated interview systems such as the AVATAR have found that individuals being interviewed by a fully automated virtual agent feel less concerned about being evaluated and freely disclose more sadness, such as is associated with depression and suicidal tendencies, compared to interviews where they believed a virtual avatar was being operated by a human (Lucas et al., 2014; Rizzo et al., 2016). These results are part of a growing body of research suggesting that virtual human interactions reduce stigma by providing a safe context in which users may reveal sensitive information compared to situations where users anticipate negative judgments from a human interviewer. Additionally, automatic behavior detection seems to provide a more accurate window into the emotional state of the user than does self-report.

The AVATAR is designed to mimic human communication. A virtual interviewer that has the head and torso of a human conducts the interview while its various sensors register the interviewee's head pose, eye and facial movement, posture and gestures. (It also registers such features of the voice as pitch, loudness, tempo and fluency, but our interest here is in the kinesic, or nonverbal visual, movement features.) This allows it to sense visual signals from the interviewee, interpret those signals, and in turn, translate those signals to produce messages. Among non-verbal signals, kinesic visual cues account for the most variance, followed by vocalics (Burgoon et al., 2022b) in creating first impressions, conveying emotions, managing social interactions and persuading others. Ideally, most or all non-verbal signals can be captured unobtrusively so that measuring instruments are not distracting.

Materials and methods

Starting from the top of the interviewee, the AVATAR analyzes the head and face, the former for purposes of detecting orientation toward the interlocutor and the latter for purposes of detecting emotional states and relational messages. Tools that measure the face such as OpenFace (Baltrušaitis et al., 2016) also often measure the pitch, roll and yaw of the head. Pitch is the forward and backward movement of head pose, such as when nodding “yes” or when hanging the head forward and downward to convey emotional sadness. Yaw is the left and right turning, such as when shaking the head “no.” Roll is tilting the head sideways, as when listening. The head tilt is a common gesture to signal subordination; in the animal kingdom, it mimics exposure of the jugular vein of a vanquished foe as a substitute for an actual kill. The canting of the head sideways and downward can signal depression and emotional distress, despite the patient's words saying otherwise. Likewise, orienting the head and body indirectly toward an interlocutor can convey weakness and anxiety or lack of openness and rapport with an interlocutor. It can communicate “shutting down.” Contrariwise, sitting upright and facing an interlocutor straight on communicates directness and composure.
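As a rough illustration of how per-frame head-pose angles might be summarized into the cues just described, the Python sketch below reads head rotations and reports how often the head is both pitched downward and canted sideways. It assumes OpenFace-style CSV columns `pose_Rx`, `pose_Ry` and `pose_Rz` (radians; rotation about the x, y and z axes, i.e., nodding, turning and sideways tilting, respectively). The sample rows, the assumption that positive `pose_Rx` means "head down," the 10-degree thresholds and the function name `summarize_head_pose` are hypothetical choices for illustration, not part of the AVATAR.

```python
import csv
import io
import math

# Hypothetical sample in the style of an OpenFace output file (angles in radians).
SAMPLE_CSV = """frame,pose_Rx,pose_Ry,pose_Rz
1,0.30,0.02,0.25
2,0.28,0.01,0.22
3,0.05,0.00,0.01
4,0.32,-0.03,0.24
"""

def summarize_head_pose(csv_text, pitch_down_deg=10.0, tilt_deg=10.0):
    """Return mean angles (degrees) and the share of frames in which the
    head is pitched down and tilted sideways beyond the thresholds.

    Assumption: positive pose_Rx means the head is pitched downward.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("no frames found")
    nod = [math.degrees(float(r["pose_Rx"])) for r in rows]    # about x
    turn = [math.degrees(float(r["pose_Ry"])) for r in rows]   # about y
    tilt = [math.degrees(float(r["pose_Rz"])) for r in rows]   # about z
    canted = [p > pitch_down_deg and abs(t) > tilt_deg for p, t in zip(nod, tilt)]
    n = len(rows)
    return {
        "mean_nod_deg": sum(nod) / n,
        "mean_turn_deg": sum(turn) / n,
        "mean_tilt_deg": sum(tilt) / n,
        "canted_down_share": sum(canted) / n,
    }

summary = summarize_head_pose(SAMPLE_CSV)
```

A high `canted_down_share` over an interview segment would only be a candidate cue; any threshold would need to be validated against clinical judgments before use.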

Additionally, many combinations of facial features convey specific emotions (Walls, 2020). Several software tools measure facial feature actions, the most frequently used being OpenFace (Baltrušaitis et al., 2016). Several landmarks are located on the face and computer vision links them, like a dot-to-dot puzzle, to measure different expressions (e.g., eyebrow raise, mouth tightener) and combinations that together express emotions (e.g., anger, fear). These expressions are represented as facial action units (AUs). AUs related to emotional distress would include sadness depicted around the eyes, laxity in the cheek region, and downward turn of the lips. Anxiety would be shown through tightened forehead muscles, with eyebrows tightened above the bridge of the nose, crow's-feet in the outward corners of the eyes, pursed lips, and downturned lips (Porter et al., 2012; Ten Brinke and Porter, 2012).
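To make the idea of combining AUs concrete, the following Python sketch scores each video frame for a sadness-like configuration. It assumes OpenFace-style intensity columns on a 0 to 5 scale (`AU01_r`, `AU04_r`, `AU15_r`) and uses the combination of inner brow raiser, brow lowerer and lip corner depressor as an assumed sadness prototype. The sample frames, the threshold of 1.0 and the function names are hypothetical.

```python
# Hypothetical per-frame AU intensities on a 0-5 scale (OpenFace-style names).
frames = [
    {"AU01_r": 2.1, "AU04_r": 1.8, "AU15_r": 1.5, "AU12_r": 0.0},
    {"AU01_r": 0.2, "AU04_r": 0.1, "AU15_r": 0.0, "AU12_r": 2.4},
    {"AU01_r": 1.6, "AU04_r": 1.2, "AU15_r": 1.1, "AU12_r": 0.3},
]

# Assumed prototype: inner brow raiser + brow lowerer + lip corner depressor.
SADNESS_AUS = ("AU01_r", "AU04_r", "AU15_r")

def sadness_score(frame, aus=SADNESS_AUS):
    """Score a frame by its weakest prototype AU, so that every AU in the
    combination must be active for the score to be high."""
    return min(frame.get(au, 0.0) for au in aus)

def share_of_frames_flagged(frames, threshold=1.0):
    """Fraction of frames whose sadness score reaches the threshold."""
    if not frames:
        return 0.0
    flagged = [sadness_score(f) >= threshold for f in frames]
    return sum(flagged) / len(frames)

scores = [sadness_score(f) for f in frames]
flagged_share = share_of_frames_flagged(frames)
```

Taking the minimum is one simple design choice among several; a weighted sum or a trained classifier over the same AU columns would be an equally plausible alternative.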

Also part of the face and head region are the eyes, the analysis of which is known as oculometrics. Eye trackers such as Tobii and EyeDetect (Cantoni et al., 2018) are used to track blinking, gaze direction, eye saccades and pupil dilation (Proudfoot et al., 2016). Depression and emotional distress are often signaled by suppression of blinking, gaze averted away from the interlocutor, and constricted (rather than dilated) pupils (Burgoon et al., 2017; Ceh et al., 2021). Masked (concealed) emotions are associated with more inconsistent expressions and a faster blink rate; neutralized (weakened) emotions instead show a decreased blink rate (Porter and Ten Brinke, 2008). Blinking and eye movements can predict vigilance during an interaction or task (Langhals et al., 2013).
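Blink rate, one of the oculometric measures mentioned above, can be derived from a per-frame record of whether the eyes are closed. The Python sketch below is a minimal illustration under that assumption; the boolean input format, the synthetic clip and the function names are hypothetical and do not reflect the output format of any particular eye tracker.

```python
def count_blinks(eye_closed):
    """Count blinks as transitions from open (False) to closed (True)."""
    blinks = 0
    previously_closed = False
    for closed in eye_closed:
        if closed and not previously_closed:
            blinks += 1
        previously_closed = closed
    return blinks

def blink_rate_per_minute(eye_closed, fps):
    """Blinks per minute for a per-frame eye-closure sequence."""
    if fps <= 0 or not eye_closed:
        raise ValueError("need a positive frame rate and at least one frame")
    duration_min = len(eye_closed) / fps / 60.0
    return count_blinks(eye_closed) / duration_min

# Hypothetical 10-second clip at 30 frames per second containing three blinks.
clip = [False] * 300
for start in (40, 150, 260):
    for i in range(start, start + 5):
        clip[i] = True

rate = blink_rate_per_minute(clip, fps=30)
```

Comparing such a rate against the same person's baseline, rather than against a fixed norm, would be one way to look for the suppressed or accelerated blinking described above.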

Moving to the torso, there are motion capture systems such as OpenPose and Kinect for measuring posture and gestures. A slumped posture, often with an averted gaze, is commonly associated with depression or anxiety. Kinect and similar commercial tools can be used to capture limb and gestural patterns. Alternatively, in contrast to the traditional method of manual coding, gestures can now be analyzed through computer measurement. An approach called Blob Analysis, for example, forms bounding boxes around hands, arms and shoulders. Ellipses are formed within the boxes and the x and y coordinates of the ellipses are then calculated. From these, concurrent and sequential nonverbal communication patterns can be calculated. For example, Meservy et al. (2005a,b) created measures of gestural location, expansiveness and velocity from the pixels on the screen. Gestural animation, shown by more expansiveness and faster velocity, is associated with emotional stability and positivity, whereas more gestural restrictedness and rigidity would likely be associated with emotional distress (Twyman et al., 2014; Pentland et al., 2017). Analysis of torso and gestures can be extended to dyads by examining the synchrony of behavior between sender and receiver over time (Dunbar, 2022).
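Once the x and y coordinates of the hand ellipses are available for each frame, measures such as velocity and expansiveness reduce to simple geometry. The Python sketch below is one possible operationalization, not the definitions used by Meservy et al.: velocity is taken as the mean frame-to-frame displacement of one hand, and expansiveness as the mean distance between the two hands. The coordinate tracks and function names are hypothetical.

```python
import math

def mean_velocity(track, fps):
    """Mean speed of one tracked point, in pixels per second."""
    if len(track) < 2:
        return 0.0
    steps = [math.dist(a, b) for a, b in zip(track, track[1:])]
    return sum(steps) / len(steps) * fps

def expansiveness(left_track, right_track):
    """Mean distance between the two hands, in pixels."""
    gaps = [math.dist(l, r) for l, r in zip(left_track, right_track)]
    if not gaps:
        raise ValueError("need at least one frame for both hands")
    return sum(gaps) / len(gaps)

# Hypothetical (x, y) ellipse centers for each hand over four frames.
left_hand = [(100, 300), (97, 296), (94, 292), (91, 288)]
right_hand = [(200, 300), (203, 296), (206, 292), (209, 288)]

left_speed = mean_velocity(left_hand, fps=30)
spread = expansiveness(left_hand, right_hand)
```

Pixel-based values depend on camera distance and resolution, so in practice they would likely need to be normalized, for example by shoulder width, before comparing across patients.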

The analysis tools: putting it all together

The emergence of automated AI tools has naturally led not only to collections of multiple signals from multiple modalities, but also to the development of methods to analyze such signals in simultaneous and serial combinations. One such system, HireVue, is AI-driven software that combines facial affect, eye contact, vocal patterns and word choice to screen video and audio for potential employees. Other companies are Yobs Technologies, Talview Behavioral Insights and VCV.AI (Hinkle, 2020). In all of these, nonverbal behaviors and personality inventories play a large role among the metrics that are combined to predict which applicants will make good employees. For criminal investigations, multiple kinesic and vocal signals together produce a robust system for discriminating the “bad guys” from the “good guys.”

One advance in analysis is the use of time series methods such as Recurrence Quantification Analysis and Multiscale Entropy (Duran et al., 2013) to measure dynamic movements. Multivariate analyses, machine learning and deep learning have all been used (Ding et al., 2019; Stathopoulos et al., 2021; Burgoon et al., 2022b). In Stathopoulos et al. (2021), the authors created a machine learning-based system that detects deceptive behavior in videos using facial Action Unit (AU) intensities as input. With the help of this system, the authors discovered specific micro-expression patterns that are known to be correlated with deceptive behavior. These include AU45 (eye blinks), AU20 (lip stretcher), AU13 (cheek puffer), AU9 (nose wrinkler), AU10 (upper lip raiser), and AU12 (lip corner puller). They occurred in deceptive videos across genders and ethnicities. As signs of discomfort and negative affect, such behaviors might prove to be good indicators of psychiatric distress.
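The simplest quantity in Recurrence Quantification Analysis is the recurrence rate, the proportion of time-point pairs at which a signal revisits (approximately) the same value. The Python sketch below computes it for a one-dimensional movement trace. It is a deliberately reduced illustration: it omits the time-delay embedding normally used in such analyses, and the sample traces, the radius and the function name are hypothetical.

```python
def recurrence_rate(series, radius):
    """Share of off-diagonal point pairs (i, j) whose values lie within
    `radius` of each other, i.e., the density of the recurrence plot."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two samples")
    recurrent = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and abs(series[i] - series[j]) <= radius
    )
    return recurrent / (n * (n - 1))

# Hypothetical head-pitch traces (degrees): one repetitive, one drifting.
rocking = [0, 5, 0, 5, 0, 5, 0, 5]
drifting = [0, 3, 6, 9, 12, 15, 18, 21]

rr_rocking = recurrence_rate(rocking, radius=1.0)
rr_drifting = recurrence_rate(drifting, radius=1.0)
```

In this toy comparison the repetitive trace yields a much higher recurrence rate than the drifting one, which is the kind of contrast in movement dynamics such methods are meant to expose.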

An alternative approach for identifying hidden recurrent patterns among combined signals is software called THEME, developed by Magnusson (1996) (see also Magnusson, 2016; Burgoon et al., 2022a). An example would be discovering the dynamic head movement, eye gaze and gesture patterns correlated with anxiety. Burgoon et al. (2015) illustrated using this software to discover patterns of deception in group interaction. Several other methods for analyzing non-verbal dynamics can be found in Novotny and Bente (2022).

Finally, technologies such as AVATAR can make use of the cloud for data storage and security. Data that are not stored locally are no longer exposed to the theft of, or damage to, a local device.

Discussion

The integration of psychiatry, computer vision, and non-verbal communication is a significant achievement in interdisciplinary research. By combining these fields, researchers are able to create a system that accurately captures and analyzes non-verbal behaviors to aid in psychiatric treatment. The use of computer vision provides an objective and automated method of detecting non-verbal behaviors, while psychiatry offers a framework for interpreting the models' predictions.

One of the most significant benefits of this approach is that it minimizes the risk of human bias. Human observers may have their own personal biases, and their subjective judgments may be influenced by factors such as gender, race, and culture. By using automated systems to capture and analyze non-verbal behaviors, researchers can obtain more reliable and objective data. This information can be used to guide the selection of appropriate treatment options for patients. This has significant implications for the field of psychiatry, as it allows for the development of more accurate and effective diagnostic tools and treatment methods.

One of the key benefits of using AVATARs in this context is their ability to serve as both sender and receiver. AVATARs can deliver verbal and non-verbal messages while simultaneously providing sympathetic listening. This is particularly useful in psychiatric treatment, where empathy and understanding are critical components of successful therapy. AVATARs can simulate human-human interaction without the distractions that often come with face-to-face interactions, creating a more controlled and focused environment for patients to receive treatment.

While there are certainly benefits to using automated systems in psychiatric treatment, there are also limitations to be considered. One potential limitation is the need for sensors to be calibrated and synchronized. Another limitation is that some patients may be distrustful or fearful of technology and may be unwilling to use such devices. However, for those who are comfortable using technology, AVATARs may offer a promising alternative to traditional face-to-face therapy.

Author contributions

JB wrote the first drafts of the paper. AE added significant new content. All authors contributed to the article and approved the submitted version.

Funding

Development of the AVATAR was partially supported by the National Science Foundation Human and Social Dynamics Program (Grant #0725895) on Interactive Deception and its Detection through Multimodal Analysis of Interviewer-Interviewee Dynamics and several grants to the NSF Center for Identification Technology (Grant #1068026) testing aspects of deception and non-contact detection tools.

Conflict of interest

JB, AE, and DD are founders of Discern Science International. BW is a consultant to DSI.

The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Baltrušaitis, T., Robinson, P., and Morency, L. P. (2016). “Openface: An Opensource facial behavior analysis toolkit,” in 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), Piscataway, NJ: IEEE.

Biel, J.-I., Aran, O., and Gatica-Perez, D. (2011). You are known by how you vlog: personality impressions and nonverbal behavior in YouTube. Proceedings of the International AAAI Conference on Web and Social Media. 5, 446–449. doi: 10.1609/icwsm.v5i1.14160

Burgoon, J. K., Dunbar, N. E., Metzger, M., Staphopoulis, A., Metaxas, D., Nunamaker, J. F., et al. (2015). “Interactive deception in group decision-making: New insights from communication pattern analysis,” in Discovering Hidden Temporal Patterns in Behavior and Interaction: T-Pattern Detection and Analysis with THEME. New York, NY: Springer.

Burgoon, J. K., Dunbar, N. E., Metzger, M., Staphopoulis, A., Metaxas, D., Nunamaker, J. F., et al. (2022a). The Psychology of trust from Relational Messages. New York, NY: IntechOpen.

Burgoon, J. K., Magnenat-Thalmann, P. M., and Vinciarelli, A. (2017). Social Signal Processing. Cambridge: Cambridge University Press. doi: 10.1017/9781316676202

Burgoon, J. K., Manusov, V., and Guerrero, L. (2022b). Nonverbal Communication. London: Routledge.

Burgoon, J. K., and Nunamaker, J. F. (2013). Detecting deception in collaboration and negotiation. Group Decision and Negotiation. 22, 85–88.

Cantoni, V., Musci, M., Nugrahaningsih, N., and Porta, M. (2018). Gaze-based biometrics: an introduction to forensic applications. Pattern Recognit. Letters 113, 54–57. doi: 10.1016/j.patrec.2016.12.006

Ceh, S. M., Annerer-Walcher, S., Koschutnig, K., Körner, C., Fink, A., Benedek, M., et al. (2021). Neurophysiological indicators of internal attention: an fMRI–eye-tracking coregistration study. Cortex 143, 29–46. doi: 10.1016/j.cortex.2021.07.005

Derrick, D. (2011). Special-Purpose, Embodied Conversational Intelligence with Environmental Sensors (SPECIES) Agents: Implemented in an Automated Interviewing Kiosk. Doctoral Dissertation, University of Arizona.

Ding, M., Zhao, A., Lu, Z., Xiang, T., and Wen, J. R. (2019). “Face-focused cross-stream network for deception detection in videos,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Piscataway, NJ: IEEE, 7802–7811.

Dunbar, N. E. (2022). New Methods to Examine Nonverbal Synchrony in dyads. Understanding Social Behavior in Dyadic and Small Group Interactions. Bellingham, MA: Proceedings of Machine Learning Research.

Duran, N. D., Dale, R., Kello, C. T., Street, C. N., and Richardson, D. C. (2013). Exploring the movement dynamics of deception. Front. Psychol. 4, 1–16. doi: 10.3389/fpsyg.2013.00140

Elkins, A. C., Golob, E., Nunamaker, J. F., Burgoon, J. K., and Derrick, D. C. (2014). Apprising the AVATAR for Automated Border Control: Results of a European Union Field Test of AVATAR Systems for Interviewing and Passport Control [Technical]. National Center for Border Security and Immigration. p. 1–54.

Elkins, A. C., Dunbar, N. E., Adame, B., and Nunamaker, J. F. (2013). Are users threatened by credibility assessment systems?. J. Manage. Info. Syst. 29, 249–262. doi: 10.2753/MIS0742-1222290409

Fray, P. J., Robbins, T. W., and Sahakian, B. J. (1996). Neuropsychiatric applications of CANTAB. Int. J. Geriatr. Psychiatr. 4, 3 doi: 10.1002/(SICI)1099-1166(199604)11:4andlt

Gardner, H. E. (2011). Frames of Mind: The Theory of Multiple Intelligences. London: Basic Books.

Grubin, D., and Madsen, L. (2005). Lie detection and the Polygraph: a historical review. J. Foren. Psychiatr. Psychol. 16, 357–369. doi: 10.1080/14789940412331337353

Hinkle, C. (2020). The modern lie detector: AI-powered affect screening and the Employee Polygraph Protection Act (EPPA). Georgetown Law J. 109, 1201. doi: 10.3233/978-1-61499-625-5-316

Jacques Junior, J. C. S., Gucluturk, Y., Perez, M., Van Lier, R., and Escalera, S. (2022). First impressions: a survey on vision-based apparent personality trait analysis. IEEE Trans. Aff. Comput. 13, 75–95. doi: 10.1109/TAFFC.2019.2930058

Jayagopi, D. B., and Gatica-Perez, D. (2010). Mining group nonverbal conversational patterns using probabilistic topic models. IEEE Trans. Multimedia 12, 790–802. doi: 10.1109/TMM.2010.2065218

Keatley, D. A. (2020). The Timeline Toolkit: Temporal Methods for Crime Research. Ottawa: ReBSA 9 Publications.

Langhals, B. T., Burgoon, J. K., and Nunamaker Jr, J. F. (2013). Using eye-based psychophysiological cues to enhance screener vigilance. J. Cognit. Eng. Decision Making 7, 83–95. doi: 10.1177/1555343412446308

Lucas, G. M., Gratch, J., King, A., and Morency, L. P. (2014). It's only a computer: virtual humans increase willingness to disclose. Computers in Human Behav. 37, 94–100. doi: 10.1016/j.chb.2014.04.043

Magnusson, M. S. (1996). Hidden real-time patterns in intra- and inter-individual behavior: description and detection. Eur. J. Psychol. Assess. 12, 112–123. doi: 10.1027/1015-5759.12.2.112

Magnusson, M. S. (2016). Discovering Hidden Temporal Patterns in Behavior and Interaction: T-pattern Detection and Analysis With THEME. Cham: Springer.

Meservy, T. O., Jensen, M. L., Kruse, W. J., Burgoon, J. K., and Nunamaker, J. F. (2005a). “Automatic extraction of deceptive behavioral cues from video,” in Intelligence and Security Informatics: Proceedings of the Third Symposium on Intelligence and Security Informatics, ISI 2005, Atlanta, GA, USA, eds P. Kantor, G. Muresan, F. Roberts, D. Zeng, and F.-Y. Wang. Berlin: Springer-Verlag, 198–208.

Meservy, T. O., Kruse, W. J., Burgoon, J. K., and Nunamaker, J. (2005b). Deception detection through automatic, unobtrusive analysis of nonverbal behavior. IEEE Int. Syst. 20, 36–43. doi: 10.1109/MIS.2005.85

Mihoub, A., and Lefebvre, G. (2019). Wearables and social signal processing for smarter public presentations. ACM Trans. Interact. Intell. Syst. 9. doi: 10.1145/3234507

Mitchell, T. L., Haw, R. M., Pfeifer, J. E., and Meissner, C. A. (2005). Racial bias in mock juror decision-making: a meta-analytic review of defendant treatment. Law Hum. Behav. 29, 621. doi: 10.1007/s10979-005-8122-9

National Research Council (2003). Polygraph and Lie Detection. London: National Academies Press.

Novotny, E., and Bente, G. (2022). Naming signatures of perceived interpersonal synchrony. J. Nonverb. Behav. 46, 485–517. doi: 10.1007/s10919-022-00410-9

Nunamaker, J. F., Derrick, D. C., Elkins, A. C., Burgoon, J. K., and Patton, M. W. (2011). Embodied conversational agent-based kiosk for automated interviewing. J. Manage. Inf. Syst. 28, 17–48. doi: 10.2753/MIS0742-1222280102

Okada, S., Nguyen, L. S., Aran, O., and Gatica-Perez, D. (2019). Modeling dyadic and group impressions with intermodal and interperson features. ACM Transactions on Multimedia Computing, Communications and Applications 15 11. doi: 10.1145/3265754

Patton, M. (2008). Decision support for rapid assessment of truth and deception using automated assessment technologies and kiosk-based embodied conversational agents. Unpublished dissertation, University of Arizona.

Pentland, S. J., Twyman, N. W., Burgoon, J. K., Nunamaker Jr, J. F., and Diller, C. B. (2017). A video-based screening system for automated risk assessment using nuanced facial features. J. Manage. Inf. Syst. 34, 970–993. doi: 10.1080/07421222.2017.1393304

Porter, S., and Ten Brinke, L. (2008). Reading between the lies: Identifying concealed and falsified emotions in universal facial expressions. Psychol. Sci. 19, 508–514. doi: 10.1111/j.1467-9280.2008.02116.x

Porter, S., Ten Brinke, L., and Wallace, B. (2012). Secrets and lies: Involuntary leakage in deceptive facial expressions as a function of emotional intensity. J. Nonverb. Behav. 36, 23–37. doi: 10.1007/s10919-011-0120-7

Proudfoot, J. G., Jenkins, J. L., Burgoon, J. K., and Nunamaker Jr, J. F. (2016). More than meets the eye: How oculometric behaviors evolve over the course of automated deception detection interactions. J. Manag. Inf. Syst. 33, 332–360. doi: 10.1080/07421222.2016.1205929

Rizzo, A. A., Lucas, G. M., Gratch, J., Stratou, G., Morency, L. P., Chavez, K., et al. (2016). Automatic behavior analysis during a clinical interview with a virtual human. Studies Health Technol. Inf. 220, 316–322. doi: 10.3233/978-1-61499-625-5-316

Salah, A. A., Pantic, M., and Vinciarelli, A. (2011). Recent developments in social signal processing. Conference Proceedings, IEEE International Conference on Systems, Man and Cybernetics, Piscataway, NJ: IEEE, 380–385.

Salovey, P., and Mayer, J. D. (1990). Emotional intelligence. Imag. Cognit. Pers. 9, 185–211. doi: 10.2190/DUGG-P24E-52WK-6CDG

Smith, P. J., Need, A. C., Cirulli, E. T., Chiba-Falek, O., and Attix, D. K. (2013). A comparison of the Cambridge Automated Neuropsychological Test Battery (CANTAB) with “traditional” neuropsychological testing instruments. J. Clin. Exp. Neuropsychol. 35, 319–328. doi: 10.1080/13803395.2013.771618

Stathopoulos, A., Han, L., Dunbar, N., Burgoon, J. K., and Metaxas, D. (2021). “Deception detection in videos using robust facial features,” in Proceedings of the Future Technologies Conference (FTC) 2020, Vol. 3. FTC 2020. Advances in Intelligent Systems and Computing, Vol. 1290, eds K. Arai, S. Kapoor, and R. Bhatia (Cham: Springer). doi: 10.1007/978-3-030-63092-8_45

Ten Brinke, L., and Porter, S. (2012). Cry me a river: identifying the behavioral consequences of extremely high-stakes interpersonal deception. Law Hum. Behav. 36, 469. doi: 10.1037/h0093929

Twitchell, D. P., Jensen, M. L., Derrick, D. C., Burgoon, J. K., and Nunamaker, J. F. (2013). Negotiation outcome classification using language features. Group Decision Negot. 22, 135–151. doi: 10.1007/s10726-012-9301-y

Twyman, N. W., Elkins, A., Burgoon, J. K., and Nunamaker, J. F. (2014). A rigidity detection system for automated credibility assessment. J. Manage. Inf. Syst. 31, 173–201. doi: 10.2753/MIS0742-1222310108

Vrij, A. (2008). Detecting Lies and Deceit: Pitfalls and Opportunities, 2nd Edn. London: Wiley.

Walls, B. L. (2020). Using AI to Transform Behavioral Data Into Actionable Insights. Unpublished dissertation, University of Arizona.

Wang, T., Oliver, D., Msosa, Y., Colling, C., Spada, G., Roguski, Ł., et al. (2020). Implementation of a real-time psychosis risk detection and alerting system based on electronic health records using CogStack. JoVE 159, e60794. doi: 10.3791/60794

Zhang, T., Yang, K. J., and Ananiadou, S. (2023). Emotion fusion for mental illness detection from social media: a survey. Inf. Fusion 92, 231–246. doi: 10.1016/j.inffus.2022.11.031

Keywords: kinesics, computer vision, automated visual capture, psychiatric disorders, social signals

Citation: Burgoon JK, Elkins AC, Derrick D, Walls B and Metaxas D (2023) The future of automated capture of social kinesic signals for psychiatric purposes. Front. Comput. Sci. 5:1168712. doi: 10.3389/fcomp.2023.1168712

Received: 18 February 2023; Accepted: 30 May 2023;
Published: 02 August 2023.

Edited by:

Alessandro Vinciarelli, University of Glasgow, United Kingdom

Reviewed by:

Laetitia Aurelie Renier, Université de Lausanne, Switzerland

Copyright © 2023 Burgoon, Elkins, Derrick, Walls and Metaxas. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Judee K. Burgoon, judee@arizona
