Specialty Grand Challenge ARTICLE
Breaking fresh ground in human–media interaction research
- Computer Science, University of Twente, Enschede, Netherlands
Background and Current Trends
Human–media interaction research is devoted to methods and situations where humans individually or collectively interact with digital media, systems, devices, and environments. In recent decades, new sensor and actuator technology has enabled novel interaction paradigms, and the design of user interfaces has combined this technology with advances in our knowledge of human–human interaction and of human behavior in general.
Collaboration, New Spaces, Physical Engagement
Today, it is possible to design applications where physically engaged as well as mobile users, co-located or distributed, can compete and collaborate or inform others about their whereabouts and activities. Individual, competitive, or collaborative computer-supported activities may take place at home or at the office, or in various public spaces. Sensors and actuators in wearable and mobile computing devices will contribute to the expansion of possibilities for creating interfaces between the physical world and its inhabitants. These will then further extend application areas as well, such as collaborative work, passive and active recreation, education, behavior change, training, and sports (Cheok et al., 2014; Nijholt, 2014a).
Understanding the User
Research in this area concerns the perception–action cycle of understanding human behaviors and generating interactive system responses. It stems from the premise that understanding the user should inform the generation of intuitive and satisfying responses by the user’s environment and its devices. This can be achieved by automated evaluation of speech, pose, gestures, touch, facial expressions, social behavior, interaction with other humans, and bio-physical signals, as well as by answering the pivotal question of how and why people use interactive media.
System evaluation focuses on the perceptions and experiences that are engendered in the user. Design, implementation, and analysis of the systems are then investigated across different application areas and a variety of contexts.
Research in this area concerns the use of sensors that allow behavior sensing: from position and proximity sensing to vision, speech, touch, gestural, and situation recognition.
Participants can intentionally use these modalities in order to control their environment and issue commands. The smart environment and its sensors can also adapt according to social cues, which are potentially complemented with neurophysiological sensors from the body and brain to determine affective and cognitive states, as well as intentions and commands interaction participants may intend to issue.
Embedded Artificial Intelligence and Sensors
Research in affective computing aims at providing tools to recognize and react to affective user states through embedded computational and artificial intelligence. Interaction modeling is required for computer-mediated human–human interaction, for human–agent interaction, and for human–media interaction in general. Sensors and actuators underlie interaction modeling together with emotion modeling and social psychology.
Actuator interaction behavior can be provided both by traditional means and by artificial agents resembling humans, such as virtual agents (embodied conversational agents), social robots, or (semi) intelligent game avatars. In these cases, applications can require human–human-like multimodal behavior and interaction realization.
Sensors that know about us and sensors that we know about and that we can address in order to issue commands or ask for support are and will be embedded in our physical and virtually augmented physical environments. They allow us to communicate with and get support from these environments. The environments may behave and be presented in human-like and socially appropriate ways.
Empathy and humor (Nijholt, 2014b) require knowledge about a particular application.
Do we need to help the elderly in their daily activities in their home environment? Do we need to model interactions with collaborators in a serious game environment? Do we need to support communication in a home, office, meeting, or rehabilitation environment?
The novel approach to human–media interaction research requires the fusion of human–media interaction with several other disciplines. This fusion is explicit in the following topics.
The research area of Social Signal Processing aims at bringing social intelligence to computers.
For social interaction with computers (or social robots or virtual humans), social cues need to be decoded that are physically detectable from our non-verbal communication and that are beyond our conscious control (Vinciarelli et al., 2008, 2012). We can derive higher-level concepts from these physically detectable social signals, such as empathy. The field should therefore go beyond “signal processing” and turn its attention to the cognition-level processing of our data.
While we need interfaces to detect and interpret social signals, an interface from which human-like interaction behavior is requested also needs to be able to generate relevant combinations of social signals when interacting with its human partners. Obviously, these signals play an important role in turn-taking and real-time action coordination as well. In multi-party situations, a smart environment that observes multi-party social signals needs to understand them in order to distinguish roles and predict activities that it can support or help enact. Ultimately, being able to model, analyze, and synthesize social behavior is what we need in order to understand and maximize the potential of smart environments.
In Affective Computing, our understanding of human affective processes is used in the design and evaluation of interfaces that require affective natural interaction with the user. Bodily manifestations of affect, multimodal recognition of affective states, ecological and continuous emotion assessment, and computational models of human emotion processing are under investigation. Algorithms for sensing and analysis, predictive models for recognition, and affective response generation, including behavior recognition when social agents are involved, are the core research issues in this field of interaction research. Research requires corpora of spontaneous interactions and methods of emotion elicitation (Scherer et al., 2010).
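As an illustration of what multimodal recognition of affective states can involve, the following minimal sketch combines per-modality estimates by weighted averaging (often called late fusion). It is not a method taken from the cited work: the function names, the modality names, the affective state labels, and all numbers are hypothetical, and each modality is assumed to already deliver a probability distribution over the same set of states.

```python
def fuse_affect_scores(modality_scores, weights=None):
    """Weighted average of per-modality probability distributions.

    modality_scores maps a modality name (hypothetical, e.g. "face")
    to a dict of affective state -> probability.
    """
    if not modality_scores:
        raise ValueError("need at least one modality")
    if weights is None:
        # Assumption: without prior knowledge, trust all modalities equally.
        weights = {name: 1.0 for name in modality_scores}
    total = sum(weights[name] for name in modality_scores)
    fused = {}
    for name, scores in modality_scores.items():
        for state, p in scores.items():
            fused[state] = fused.get(state, 0.0) + weights[name] * p / total
    return fused


def most_likely_state(fused):
    """Return the affective state with the highest fused probability."""
    return max(fused, key=fused.get)


# Made-up example estimates from three recognizers.
observations = {
    "face": {"joy": 0.6, "anger": 0.1, "neutral": 0.3},
    "voice": {"joy": 0.3, "anger": 0.2, "neutral": 0.5},
    "physiology": {"joy": 0.5, "anger": 0.3, "neutral": 0.2},
}
fused = fuse_affect_scores(observations)
print(most_likely_state(fused))
```

Averaging at the decision level keeps each recognizer independent, so a modality can be dropped or reweighted when its sensor is unavailable or unreliable. Whether this is preferable to fusing raw features depends on the application and the available corpora.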
In Augmented Social Interaction, we digitally enhance our face-to-face interaction with other partners, including our interaction with smart environments. Digitally enhanced glasses or other wearables may provide us with information about our conversational partner. This can be factual information, collected before the interaction takes place, but it can also be information that is updated in real time, for example, about our partner’s mental state, assuming that it is available from sensors. Socially correct behavior could be suggested or even imposed.
Embodied Agents (virtual agents, virtual humans) are human-like interactive characters that communicate with humans or with each other using natural human modalities such as facial expressions, speech, and gesture. They need to be capable of real-time perception, cognition, and action. Making such characters autonomously perform a particular task in interaction with a human conversational partner is one of the aims of this research. All the research issues mentioned under the topic social signal processing are important for embodied interface agents too. That is, they need integrative social skills (understanding and responding), and social cue analyses need to be augmented with semantic and pragmatic information processing. Realistic conversational behavior also requires building up long-term social relationships with human partners. Social robotics research parallels embodied agent research, but, of course, there are some exceptions that are related to the physicality of a social robot. For example, the role of its human partner’s bodily engagement and experience needs to be taken into account when designing social robots.
Holograms are projected three-dimensional images made up of beams of light. Today, motion sensors and touch capabilities have made interaction with such images possible. Holographic objects augmented with interaction modalities (as we know from interactive computer graphics) can thus become part of smart environments, not really different from tangible objects (Bimber et al., 2005). Real-time altering of images is possible, and, clearly, holographic images can take the form of virtual humans with which we can interact. Hence, just as we want to investigate interaction with tangibles, wearables, virtual humans, and humanoid robots, we should do so for holographic displays.
Interaction modalities that use sight, sound, and touch are well-researched. This is less true for sensory modalities such as smell and taste. Scientific breakthroughs in sensor-based Smell and Taste detection and smell and taste actuators can be expected (Gutierrez-Osuna, 2004; Matsukura et al., 2013; Ranasinghe et al., 2013). “Electronic noses” (arrays of chemical sensors) using pattern recognition algorithms can distinguish different odorants, and digital descriptions can be used to synthesize odorants. Applications can be found in affective and entertainment computing and in increasing the feeling of presence in synthesized environments. Taste sensors, also known as “Electronic tongues,” have been designed to distinguish between different taste experiences. The digital simulation of taste has also been achieved by digital taste interfaces that use electrical, chemical, and thermal stimulation of the tongue. Although many technical problems still have to be resolved, we now see experiments and user evaluation of applications using smell and taste. Required are investigations where smell and taste are integrated in multimodal user interfaces.
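To make the idea of an “electronic nose” that applies pattern recognition to a chemical sensor array more concrete, the sketch below classifies a reading by its nearest reference pattern. It is only an illustrative assumption about how such a classifier might look, not the approach of the cited systems: the four-sensor array, the odorant labels, the reference values, and the function names are all hypothetical.

```python
import math

# Hypothetical reference "fingerprints": mean response of a 4-sensor
# chemical array to each known odorant (made-up illustrative values).
REFERENCE_PATTERNS = {
    "coffee": [0.82, 0.10, 0.45, 0.30],
    "lemon": [0.15, 0.90, 0.20, 0.55],
    "smoke": [0.60, 0.25, 0.95, 0.10],
}


def normalize(reading):
    """Scale a reading to unit length so that overall signal strength
    (for example, odorant concentration) matters less than its shape."""
    norm = math.sqrt(sum(v * v for v in reading))
    if norm == 0:
        raise ValueError("reading has no signal")
    return [v / norm for v in reading]


def classify_odor(reading, references=REFERENCE_PATTERNS):
    """Return the label of the reference pattern closest to the reading
    (Euclidean distance between unit-length vectors)."""
    unit = normalize(reading)
    best_label, best_dist = None, math.inf
    for label, pattern in references.items():
        dist = math.dist(unit, normalize(pattern))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# A weaker reading with roughly the same shape as the "coffee" pattern.
print(classify_odor([0.40, 0.05, 0.22, 0.16]))
```

Real sensor arrays are noisy and drift over time, so a practical system would likely need calibration and a classifier trained on many samples per odorant. The nearest-pattern rule is used here only because it is the simplest instance of the pattern recognition step.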
There is a range of interfaces that are known as Organic User Interfaces. These include smart material interfaces, reality-based interfaces, programmable matter, flexible interfaces, and smart textiles that use materials or miniature sensors embedded in materials that respond to environmental information by changing their physical properties, such as shape, size, and color (Holman and Vertegaal, 2008; Minuto and Nijholt, 2013). Smart material interfaces attempt to overcome the limitations of traditional and tangible interfaces. They focus on changing the physical reality around the user as the output of interaction and/or computation as well as being used as input device. They promote a tighter coupling between the information displayed and the display itself by using the tangible interface as the control and display at the same time – embedding the information directly inside the physical object. We need to investigate the potential of smart materials for designing and building interfaces that communicate information to the user – or allow the user to manipulate information – using different modalities.
Brain–Computer Interaction will become integrated with multimodal interaction and use unobtrusive sensor technology, naturally embedded in wearables or in socially accepted implants. It will therefore find its way into domestic and health and well-being applications, including game, entertainment, and social media applications (Marshall et al., 2013). Brain activity measurements provide information about the cognitive and affective state of an inhabitant of a sensor-equipped environment (Mühl et al., 2014). This allows the environment to adapt to this state, and it allows the user to voluntarily control the environment and its devices by manipulating this state.
However, rather than considering one individual, we can consider interacting, collaborating, or competing users in smart environments and provide the environment (and its users or players) with this information in order to improve individual or team performance or experience (Nijholt, 2014c). Brain-to-brain communication, using EEG to measure brain activity and transcranial (magnetic or direct current) stimulation to transfer it from one person to the other, has been shown to be possible and will be further investigated.
For Mobile Devices and Services, where we interact with a small device, there will be other requirements for interface design, audio, speech and gesture interaction, and the employment of gaze, head, and movement tracking. Research issues (Dunlop and Brewster, 2002) that have been identified are: designing for mobility, designing for a widespread population, designing for limited input/output facilities, designing for incomplete and varying context information, and designing for users multitasking at levels unfamiliar to most desktop users. These issues set mobile HCI apart from traditional HCI and from interaction in sensor-equipped environments that track and support a user. Obviously, hand-held devices such as smartphones have access to all the intelligence available on the web and applications can be designed according to particular users and contexts. One research issue that emerges is interoperability. How can we maintain consistency in information and its presentation when the mobile user enters a new environment that requires or allows different presentation and interaction modalities?
These trends – considered here from a technological viewpoint only – certainly require adaptation. In particular, they await developments in corpus collection and analysis, knowledge representation and reasoning, machine learning techniques, and also in user modeling, usability and user-centered design, engagement, persuasion, experience research, and evaluation. In principle, with smart environments we can create things that can move, change appearance, sense, (pro-actively) react, interact, and communicate. One all-important question that arises is who will design such environments, who will be able to configure them, and who will provide the tools to adapt them to user preferences.
Conflict of Interest Statement
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
Bimber, O., Zeidler, T., Grundhoefer, A., Wetzstein, G., Moehring, M., Knoedel, S., et al. (2005). “Interacting with augmented holograms,” in SPIE Proceedings of International Conference on Practical Holography XIX (Bellingham, WA: SPIE), 41–54.
Matsukura, H., Yoneda, T., and Ishida, H. (2013). Smelling screen: development and evaluation of an olfactory display system for presenting a virtual odor source. IEEE Trans. Vis. Comput. Graph. 19, 606–615. doi:10.1109/TVCG.2013.40
Minuto, A., and Nijholt, A. (2013). “Smart material interfaces as a methodology for interaction. A survey of SMIs’ state of the art and development,” in 2nd Workshop on Smart Material Interfaces (SMI 2013). Workshop in conjunction with 15th ACM International Conference on Multimodal Interaction (ICMI’13), Sydney, NSW.
Mühl, C., Allison, B., Nijholt, A., and Chanel, G. (2014). A survey of affective brain computer interfaces: principles, state-of-the-art, and challenges. Brain Comput. Interfaces 1, 66–84. doi:10.1080/2326263X.2014.912881
Nijholt, A. (2014a). “Towards humor modelling and facilitation in smart environments,” in Advances in Affective and Pleasurable Design, eds Y. Gu Ji and S. Choi (Krakow, Poland: AHFE Conference ©), 260–269.
Nijholt, A. (2014c). “Competing and collaborating brains: multi-brain computer interfacing,” in Brain-Computer Interfaces: Current Trends and Applications [Intelligent Systems Reference Library Series], eds A. E. Hassanien and A. T. Azar (Cham, Switzerland: Springer), 313–335.
Ranasinghe, N., Cheok, A., Nakatsu, R., and Yi-Luen Do, E. (2013). “Simulating the sensation of taste for immersive experiences,” in Proceedings of the 2013 ACM International Workshop on Immersive Media Experiences (ImmersiveMe ’13) (New York, NY: ACM), 29–34.
Keywords: human-media interaction, user interface design, organic user interfaces, smart environments, multimodal interaction, nonverbal communication, social signal processing, affective computing
Citation: Nijholt A (2014) Breaking fresh ground in human–media interaction research. Front. ICT 1:4. doi: 10.3389/fict.2014.00004
Received: 02 October 2014; Accepted: 15 October 2014; Published online: 04 November 2014.
Edited and reviewed by: Alessandro Vinciarelli, University of Glasgow, UK
Copyright: © 2014 Nijholt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.