How to Induce and Recognize Facial Expression of Emotions by Using Past Emotional Memories: A Multimodal Neuroscientific Algorithm


INTRODUCTION: FACIAL EXPRESSION PRODUCTION AND RECOGNITION AS AN ICT CHALLENGE
The production and recognition of emotional expressions play a central role in individuals' lives. Investigating emotions is especially important because it helps us understand individuals' emotional experiences and empathic mechanisms. This knowledge can drive brain-computer interfaces (BCI), through the implementation of emotional patterns in artificial intelligence tools and computers, and can support an in-depth comprehension of psychopathology (Balconi et al., 2015a).
This article aims to support the investigation of the neurophysiological correlates and characteristics associated with the production and recognition of facial expressions, considering emotional responses provoked by internal cues based on autobiographical memories, called "self-induced by memories." Indeed, as reported by Adolphs (2002), the human brain represents emotional data most effectively by connecting information across different cerebral areas, which allows emotional expressions to be identified and recognized from different stimuli, such as visual or auditory ones. The human brain represents emotional data by connecting facial, vocal, and movement expressions with individuals' past experiences. Moreover, different neuroscientific techniques, such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), and magnetoencephalography (MEG), make it possible to observe the involvement of specific cerebral regions in different emotional expressions, providing a map of emotional brain activation (Balconi and Lucchiari, 2007; Balconi and Pozzoli, 2007; Deak, 2011; Kassam et al., 2013). Specifically, neuroimaging measures are used as input to Affective Computing technologies (Frantzidis et al., 2010).
Several studies postulate the existence of discrete emotions, such as happiness, fear, anger, and sadness, from which the other emotional states would derive (Ekman, 1999). The theory of discrete emotions has been criticized by proponents of the Circumplex Model of Affect (Russell, 1980), which describes and labels emotions on the basis of two dimensions: valence and arousal. The human brain integrates multimodal information, generating a unified representation of different auditory and visual stimuli (Balconi and Carrera, 2011; Barros and Wermter, 2016).
Facial motion also plays an important role in emotion perception and recognition. By providing unique information about the direction, quality, and speed of motion, dynamic stimuli enhance coherence in the identification of affect, lead to stronger emotion judgments, and facilitate the differentiation between posed and spontaneous expressions (Krumhuber et al., 2017; Oh et al., 2018; Goh et al., 2020). In this regard, new technologies have also introduced important innovations in face detection and recognition (Canedo and Neves, 2019). For example, sensors may provide extra information and help facial recognition systems detect emotion in both static images and video sequences (Samadiani et al., 2019).
However, despite the vast theoretical differences that emerged in previous studies, it is commonly accepted that emotional states and the consequent responses to external stimuli are influenced by arousal and valence (Balconi and Carrera, 2011; Balconi and Molteni, 2016), conceptualized in different ways as tension and energy, positive and negative affect, approach and withdrawal, or valence and arousal (Russell, 1980; Eysenck, 1990; Lang et al., 1997; Watson et al., 1999). In particular, valence refers to the pleasantness or unpleasantness of individuals' emotional states, while arousal refers to individuals' perceived level of activation. Considering these two dimensions, each emotional state experienced by individuals can therefore be located in a two-dimensional model defined by the valence and arousal axes.
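The two-dimensional model can be made concrete with a small data structure. The following is only a minimal sketch: the class name `AffectPoint`, the scaling of both axes to [-1, 1], and the example coordinates are assumptions made for illustration and are not taken from the studies cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectPoint:
    """An emotional state as a point in valence-arousal space.

    Both axes are scaled to [-1.0, 1.0]; this scaling is an assumption
    made for illustration only.
    """
    valence: float  # unpleasant (-1) ... pleasant (+1)
    arousal: float  # deactivated (-1) ... activated (+1)

    def __post_init__(self):
        # Reject coordinates outside the assumed range.
        for name in ("valence", "arousal"):
            value = getattr(self, name)
            if not -1.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [-1, 1], got {value}")

    def quadrant(self) -> str:
        """Return a coarse label for the quadrant the point falls in."""
        v = "positive" if self.valence >= 0 else "negative"
        a = "high" if self.arousal >= 0 else "low"
        return f"{v} valence / {a} arousal"


# Hypothetical coordinates, chosen only to illustrate the geometry.
excited = AffectPoint(valence=0.8, arousal=0.7)
bored = AffectPoint(valence=-0.4, arousal=-0.6)
print(excited.quadrant())
print(bored.quadrant())
```

A continuous representation of this kind keeps the full position of a state on both axes, while the quadrant label offers a coarse category when one is needed.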
The emotions of individuals therefore represent overlapping experiences that are cognitively interpreted in order to identify the responses and neurophysiological changes along the valence and arousal dimensions. These are organized according to different eliciting factors, such as contexts and stimuli, autobiographical memories, semantic representations, or behavioral responses (Russell, 2003; Balconi and Vanutelli, 2016; Balconi et al., 2017).
In this perspective, we may represent emotions as communication signals, as they allow individuals to implement sensorimotor responses congruent with external stimuli by attributing meaning to internal and external information. This process addresses both the individual's body and the external environment, allowing the attribution of emotional meaning to the states experienced. Therefore, from a functional perspective, emotions are used to recognize and categorize some individual states in different social contexts.
Indeed, facial expression recognition is well established in the fields of computer vision, pattern recognition, and artificial intelligence, and has drawn extensive attention owing to its potential applications in natural human-computer interaction (HCI), human emotion analysis, interactive video, and image indexing and retrieval.
Recognizing emotional states from facial patterns and expressions allows us to understand and satisfy the user's needs, facilitating human-machine interaction, especially when only emotional states are used to communicate with others (Kanchanadevi et al., 2019; Volynets et al., 2020). This shows how the fundamental role of emotions in individuals' cognition (LeDoux, 1998) represents a challenging topic in Information and Communication Technology (ICT), relevant to meeting the high demand for machines able to assist individuals with psychological and physical disorders or with difficulties at a cognitive, social, or communicative level (Esposito and Jain, 2016).
The integration between emotional memories (EM) and facial emotional expression is also an interesting topic. In particular, the role of EM in the production and recognition of emotional facial expressions has been observed by some studies that have focused mainly on patterns of emotional recognition in specific contexts. Indeed, EM allow the use of internal emotional models, developed through individuals' past life experiences, to decode others' emotions expressed through facial mimicry patterns. This mechanism works by simulating in oneself the emotional states expressed by others (Dimberg et al., 2000; Heberlein and Atkinson, 2009; Niedenthal et al., 2010; Wearne et al., 2019).

THE TECHNOLOGY-BASED RECOGNITION OF FACIAL PATTERNS
In recent years, interest in investigating emotion through electroencephalography (EEG) has increased, thanks to the possibility this tool provides to label and recognize facial expressions. Furthermore, compared with neuroimaging techniques such as fMRI, MEG, and PET, EEG is a low-cost and easy-to-use technique, thanks to the development of current wireless EEG systems. Recent studies have observed the advantages of using EEG to investigate emotions: it measures cerebral changes in high- and low-frequency band activity and in early and late latencies with excellent temporal resolution, offering a full overview of emotional processes. Indeed, changes in brain activity depict sequentially the dynamics of variations in individuals' emotional responses, which are not fully accessible using neuroimaging techniques (Balconi et al., 2015b).
However, considering the quick temporal evolution of emotional responses and the interconnection of the different cerebral areas and neural networks involved in emotional processing, neuroimaging techniques that provide good temporal and spatial resolution could be useful for investigating emotional facial expression and recognition. In particular, functional near-infrared spectroscopy (fNIRS), a non-invasive and easy-to-use technique, provides sufficient temporal resolution to investigate event-related hemodynamic changes (Elwell et al., 1993). Indeed, in recent years, fNIRS has been used to investigate emotional responses in various contexts (Koseki et al., 2013; Balconi and Molteni, 2016). Furthermore, the portability of fNIRS, its lack of restrictions, and its replicability impose lower physical and psychological burdens on participants than other neuroimaging techniques.
Moreover, the combined use of fNIRS and EEG provides information about both the neural and the hemodynamic correlates of brain activity. In addition to electrophysiological and neuroimaging techniques, autonomic measures complement the central ones. Finally, electromyography (EMG) measures the activity of the zygomaticus major and corrugator supercilii muscles, which characterizes the facial autonomic response to emotional stimuli and represents a predictive marker of emotional behavior (Fridlund and Cacioppo, 1986).

WAYS TO ELICIT AND RECOGNIZE FACIAL EXPRESSION
As reported by different studies, several techniques are used to elicit emotional responses and expressions. Among these, the primary methods of emotional elicitation consist of movies and pictures with highly emotional content. In particular, as demonstrated by Westermann et al. (1996), watching movies proved to be the most effective procedure for eliciting positive or negative emotions. Therefore, researchers have proposed different databases containing affective video clips (Balconi et al., 2009; Chambel et al., 2011) or pictures and sounds with high emotional content to elicit emotional responses and expressions (Balconi and Pozzoli, 2005). Two of the most widely used databases of auditory and visual emotional stimuli are the International Affective Digitized Sounds (IADS) (Bradley and Lang, 2007) and the International Affective Picture System (IAPS) (Lang et al., 2008).
However, emotions can also be elicited and recognized by recalling past experiences, which makes the integration between EM and facial emotional expression relevant here as well. In particular, to elicit self-generated emotions, individuals were asked to re-experience personal life episodes with positive or negative connotations and marked by different emotions (Damasio et al., 2000; Kassam et al., 2013). The elicitation of self-generated emotions through memories could be a way to drive BCI independently.
Recently, databases that include stimuli from different modalities of emotional elicitation and expression have been proposed (Gunes and Piccardi, 2006; Grimm et al., 2008; Fanelli et al., 2010; Koelstra et al., 2012; Soleymani et al., 2012; Abadi et al., 2015; Katsigiannis and Ramzan, 2018) and implemented through the recognition of signal patterns derived from different modalities. These databases are not generalizable with respect to the modalities of emotion measurement and elicitation, since they were built around particular signal classification strategies and their results. Because of this structure, they are very useful for comparing different processing or classification strategies on the same data, but they make it impossible to conduct transversal studies or to compare data from the same subjects collected with different imaging or activation modalities. The existing databases therefore have limitations for a complete and generalizable investigation of emotional elicitation responses and mechanisms. First, these databases use partial methods for emotional investigation. Second, the different reference models and methods used in previous research are not directly comparable. Finally, these methods do not make it possible to distinguish the different cognitive components that are fundamental in emotional processing, such as memory and its contribution.
Despite the existence of different emotional elicitation techniques, creating algorithms for emotion induction and recognition appears to be difficult because emotional elicitation includes different components, such as behavioral, psychological, and cognitive ones. This requires considering data from individuals with different features, collected with different methods and techniques, and thus including a large amount of evidence from different subjects in structured databases.

NEW PERSPECTIVE TO INDUCE AND RECOGNIZE FACIAL EXPRESSION OF EMOTIONS
In light of what is reported in the previous section, the existing methods and databases for the induction and recognition of emotion should be integrated with new databases that collect different parameters using self-induced stimuli. As an example, the steps for creating a database for the induction and recognition of emotion based on self-induced stimuli, consisting of the recall of past autobiographical events from EM, are presented below. Specifically, the first step requires collecting individuals' autobiographical experiences through mnemonic recall, using semi-structured interviews about autobiographical events with positive, negative, and neutral valence. We clearly explained to the participants the scope, the experimental phases and content, and the detailed procedure of the present experiment. Explicit consent to participate (and to withdraw from the experiment at any time) was required from each participant.
The second step requires creating specific algorithms for the formulation of linguistic codes (short memory-inducing utterances) that encode the participants' autobiographical memories. The utterances were vocally reproduced and then submitted to a specific vocal analysis. Specific parameters (such as F0, speech profile, intensity, and temporal parameters, i.e., locutory duration and pauses) were checked before the experimental phase. This was done to avoid any vocal effect and to make the linguistic stimuli more neutral.
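One simple way to screen recorded utterances for vocal homogeneity is to flag recordings whose acoustic parameters deviate strongly from the rest of the set. The sketch below is an illustration under stated assumptions, not the vocal analysis used in the procedure: the function name `flag_outliers`, the z-score criterion, the threshold of 2.0, and all numeric values are hypothetical, and the parameters are assumed to have been measured beforehand with a dedicated speech analysis tool.

```python
from statistics import mean, stdev


def flag_outliers(utterances, parameter, z_threshold=2.0):
    """Return ids of utterances whose parameter deviates from the set mean.

    The z-score criterion and its threshold are assumptions made for
    illustration; they are not part of the published procedure.
    """
    values = [u[parameter] for u in utterances]
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # all utterances are identical on this parameter
    return [u["id"] for u in utterances
            if abs(u[parameter] - mu) / sigma > z_threshold]


# Hypothetical mean F0 values (Hz) for eight recorded utterances.
f0_values = [118, 120, 122, 119, 121, 120, 119, 181]
utterances = [{"id": f"u{i + 1}", "f0_hz": f0}
              for i, f0 in enumerate(f0_values)]

flagged = flag_outliers(utterances, "f0_hz")
print(flagged)  # the last utterance stands out and could be re-recorded
```

The same check could be repeated for intensity, duration, and pause length, so that each flagged utterance is re-recorded before the experimental phase.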
Finally, the third step requires evoking emotional experiences through the previously created emotional cues, coded in a personalized linguistic way, after a specific time interval. The interval we adopted was chosen on the basis of previous studies on long-lasting memory effects: we intend to work with long-term memories and to avoid potential transient effects due only to working memory. The advantage of a database of this type is that it allows the use of an automated and recognized procedure for the induction and recognition of emotion, and it allows explicit reference to fundamental processes, such as memory, in emotional recognition. Furthermore, this database could also be used in clinical settings, especially in the case of deficits or syndromes related to emotional memory, in order to investigate behavioral and neurophysiological

Table 1

Step | Procedure | Characteristics | Example
First step | Free recall based on positive, negative, and neutral past autobiographical events | This step consists of the free recall of individuals' positive, negative, and neutral past autobiographical events. These events are collected through semi-structured interviews conducted by experienced researchers. | Participants were required to freely recall specific past life events, providing certain information such as the duration of the event throughout the day, its location, and the specific time of day when it took place.
Second step | Codification of participants' autobiographical memories | This step consists of the collection of the autobiographical memories previously produced by each individual. These memories are subsequently coded by expert judges, who transpose them into linguistic codes through specific algorithms (short memory-inducing statements recorded by the experimenter). | The grammatical and structural homogeneity of these linguistic codes was verified (linguistic code consisting of subject + verb + direct object) to create 25 positive, 25 negative, and 25 neutral sentences ad hoc for each individual.
Third step | Guided recall by listening to emotional cues | This step consists of guided recall based on listening to the statements, reproduced in auditory format, 5 days after the first step. | The statements were divided into three categories concerning valence and arousal. These dimensions were assessed using a 9-point Likert scale.
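The three steps summarized above suggest a simple per-participant record for the cue database. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, the Likert scale is assumed to run from 1 to 9, and the placeholder sentences stand in for real coded memories.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional

VALENCES = ("positive", "negative", "neutral")
CUES_PER_VALENCE = 25               # 25 sentences per valence and participant
RECALL_DELAY = timedelta(days=5)    # guided recall 5 days after the interview


@dataclass
class Cue:
    sentence: str                          # subject + verb + direct object
    valence: str                           # category assigned during coding
    valence_rating: Optional[int] = None   # filled in at the third step
    arousal_rating: Optional[int] = None

    def rate(self, valence_rating: int, arousal_rating: int) -> None:
        # Assumption: the 9-point Likert scale runs from 1 to 9.
        for rating in (valence_rating, arousal_rating):
            if not 1 <= rating <= 9:
                raise ValueError("ratings use a 9-point scale (1-9)")
        self.valence_rating = valence_rating
        self.arousal_rating = arousal_rating


@dataclass
class ParticipantCues:
    participant_id: str
    interview_date: date                   # date of the first step
    cues: List[Cue] = field(default_factory=list)

    def add(self, cue: Cue) -> None:
        if cue.valence not in VALENCES:
            raise ValueError(f"unknown valence: {cue.valence}")
        self.cues.append(cue)

    def is_complete(self) -> bool:
        """True when every valence category holds exactly 25 cues."""
        counts = {v: 0 for v in VALENCES}
        for cue in self.cues:
            counts[cue.valence] += 1
        return all(n == CUES_PER_VALENCE for n in counts.values())

    def recall_date(self) -> date:
        """Date of the guided recall session (third step)."""
        return self.interview_date + RECALL_DELAY


# Usage with placeholder sentences (hypothetical participant and date).
record = ParticipantCues("P01", date(2024, 1, 10))
for valence in VALENCES:
    for i in range(CUES_PER_VALENCE):
        record.add(Cue(f"placeholder {valence} sentence {i}", valence))
print(record.is_complete(), record.recall_date())
```

Keeping the cues, their ratings, and the interview date in one record makes it straightforward to check that a participant's set is complete before the guided recall session is scheduled.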

DISCUSSION: A NEW PROCEDURE TO INDUCE AND RECOGNIZE FACIAL EXPRESSION: THE ROLE OF AUTOBIOGRAPHICAL MEMORIES
Therefore, given the importance of emotions and their various pathways, it is necessary to design a structured multimodal database, based on memories of past experiences, able to induce facial expressions modulated by specific valence and arousal. As anticipated in the previous section, we suggest some ways to create a specific procedure and dataset to induce emotional responses and to enable better recognition of facial expressions.
In the first place, the EM must be collected and chosen according to a positive, negative, or neutral valence. Three specific steps are required to create the EM database. The first step includes the free recall of positive, negative, and neutral past autobiographical events, collected through a semi-structured interview administered to participants by an expert researcher. Specifically, autobiographical memories were recalled freely, but participants were asked to provide certain information regarding each event, describing the specific moment, the duration, and the place of the event in their life.
The second step includes the codification of participants' autobiographical memories by expert judges, who report them as sentences, using a specific algorithm to encode the memories into a linguistic code (brief memory-inducing utterances, bu) able to elicit past autobiographical events with positive, negative, and neutral connotations. The homogeneity of these linguistic codes was verified at a grammatical and structural level. In this way, 25 sentences were created for each valence (positive, negative, and neutral). Finally, the third step requires guided recall by listening to the previously created emotional cues after a specific time interval of 5 days from the first step. These guided recalls are based on the initial memory, are coded in a personalized way, and are transposed in a communicable and objective way using a linguistic code (see Table 1).
The significant effect induced by subjective memories is the crucial point of this dataset: it was created to induce specific emotional responses directly evoked by subjective recall. Since we verified each cue's exact significance (personal memory and then utterance), we are confident that this dataset can induce specific psychological responses in a subject. This procedure may overcome the limitation of previous datasets, in which "impersonal" cues are used to elicit an emotional experience or an emotional recognition process. In that case, we cannot be sure that the "impersonal" dataset is really able to provoke exactly that emotion. Only by using personal cues, related to subjective experience (the participants' own memories), can we be sure that the real emotional meaning is intrinsic to the cues. This emotional meaning has high psychological power in terms of emotion induction, since EM were previously demonstrated to be the most significant events able to induce emotions in a subject.
The neurophysiological correlates of facial expression associated with individuals' emotional responses elicited by EM could be collected with the simultaneous use of EEG, fNIRS, and autonomic measures. Associating EM with facial expressions has a potentially high impact in facilitating facial recognition.
These measures are suggested to support a multimodal acquisition that provides a picture of both central and peripheral components.
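A simultaneous multimodal acquisition implies that signals sampled at different rates must be aligned to the same cue onset. The sketch below illustrates one way to cut time-locked epochs from each stream; the function name `epoch`, the sampling rates, the window lengths, and the synthetic signals are all assumptions made for illustration and do not describe a specific recording setup.

```python
def epoch(samples, sampling_rate_hz, onset_s, pre_s, post_s):
    """Return the samples from onset - pre_s up to onset + post_s (end exclusive)."""
    start = round((onset_s - pre_s) * sampling_rate_hz)
    stop = round((onset_s + post_s) * sampling_rate_hz)
    if start < 0 or stop > len(samples):
        raise ValueError("epoch window falls outside the recording")
    return samples[start:stop]


# Hypothetical 10-second recordings; sampling rates are assumptions.
streams = {
    "eeg": (list(range(5000)), 500),        # 500 Hz
    "fnirs": (list(range(100)), 10),        # 10 Hz
    "autonomic": (list(range(1000)), 100),  # 100 Hz
}

cue_onset_s = 4.0  # hypothetical onset of one auditory cue
epochs = {name: epoch(data, rate, cue_onset_s, pre_s=1.0, post_s=2.0)
          for name, (data, rate) in streams.items()}

for name, segment in epochs.items():
    print(name, len(segment))
```

Each epoch covers the same three-second window around the cue, so central (EEG, fNIRS) and peripheral (autonomic) responses to the same memory cue can be compared on a common time axis.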
In addition, this procedure allows possible future clinical applications for subjects with specific consciousness impairments, such as patients with disorders of consciousness (DOC) or locked-in syndrome, for whom the use of memories could be a valid alternative when emotional states cannot be communicated through language. This is a possible future development for BCI based on EM and thoughts, which are supposed to be able to elicit specific physiological activation in the absence of explicit communication (when impaired). For this reason, this tool could become a valid way to improve the quality of life of specific categories of patients.