ORIGINAL RESEARCH article
Psychometric Properties and Validation of the EMOTICOM Test Battery in a Healthy Danish Population
- 1Neurobiology Research Unit, The Neuroscience Centre, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark
- 2Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- 3Neuroscience and Psychiatric Unit, The University of Manchester, Manchester, United Kingdom
- 4Department of Public Health and Center for Healthy Aging, University of Copenhagen, Copenhagen, Denmark
- 5Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom
- 6Behavioural and Clinical Neuroscience Institute, University of Cambridge, Cambridge, United Kingdom
Disruptions in hot cognition, i.e., the processing of emotionally salient information, are prevalent in most neuropsychiatric disorders and constitute a potential treatment target. EMOTICOM is the first comprehensive neuropsychological test battery developed specifically to assess hot cognition. The aim of the study was to validate and establish a Danish language version and reference data for the EMOTICOM test battery. To evaluate the psychometric properties of 11 EMOTICOM tasks, we collected data from 100 healthy Danish participants (50 males, 50 females) including retest data from 49 participants. We assessed test–retest reliability, floor and ceiling effects, task-intercorrelations, and correlations between task performance and relevant demographic and descriptive factors. We found that test–retest reliability varied from poor to excellent while some tasks exhibited floor or ceiling effects. Intercorrelations among EMOTICOM task outcomes were low, indicating that the tasks capture different cognitive constructs. EMOTICOM task performance was largely independent of age, sex, education, and IQ as well as current mood, personality, and self-reported motivation and diligence during task completion. Overall, many of the EMOTICOM tasks were found to be useful and objective measures of hot cognition although select tasks may benefit from modifications to avoid floor and ceiling effects in healthy individuals.
Hot cognition describes cognitive processing of emotionally salient information (Roiser and Sahakian, 2013). Examples of hot cognitive domains include basic emotion processing, motivation and reward driven behaviors as well as social cognition, i.e., the ability to understand and participate in social transactions. Importantly, disruptions in hot cognitive processes have been identified as core features in a wide range of neuropsychiatric disorders such as mood disorders (Elliott et al., 2011), anxiety disorders (Plana et al., 2014), schizophrenia (Ventura et al., 2013), Attention Deficit and Hyperactivity Disorder (ADHD) (Umemoto et al., 2014), and autism (Harms et al., 2010). In particular, negative affective biases, i.e., the preferential processing of negative information over positive information, have consistently been shown in patients with mood disorders (Elliott et al., 2011; Hjordt et al., 2017), anxiety disorders (Mogg et al., 1995), substance abuse disorders (Ersche and Sahakian, 2007), and eating disorders (Lovell et al., 1997). Notably, one study found mood-congruent attentional biases in bipolar disorder where patients in the depressed state showed enhanced processing of negative information while patients in the manic state showed enhanced processing of positive information (García-Blanco et al., 2013). In contrast, healthy individuals typically show no or a slight positive affective bias (Pool et al., 2016). Meanwhile, impairments in motivation and reward-driven behaviors have been observed in psychopathological conditions including aggression (Kuin et al., 2015), traumatic brain injury (Newcombe et al., 2011), and ADHD (Umemoto et al., 2014) while differences in neural response to rewards and loss and disruptions in reinforcement learning have been linked to schizophrenia and major depressive disorder (MDD) (Chen et al., 2015; Hagele et al., 2015). 
Disturbances in social cognition including mentalization, i.e., the ability to infer the mental states of others, are central features of disorders such as autism and schizophrenia (Chung et al., 2014) and impairment in moral judgment has been reported for psychopathic individuals (Cardinale and Marsh, 2015), autism (Brewer et al., 2015), and patients suffering from ventromedial prefrontal cortex lesions (Cameron et al., 2018). In addition, self-blaming moral emotions such as guilt and shame have been shown to be exacerbated in MDD (Green et al., 2013) and anxiety disorders (Hedman et al., 2013). In healthy individuals, differences in hot cognitive processes have been linked to pharmacological interventions such as oxytocin (Leppanen et al., 2017) and serotonergic manipulations (Merens et al., 2007). Sub-clinical symptoms of depression and anxiety (Routledge et al., 2018), as well as natural sex hormone fluctuations in women (Osorio et al., 2018), also produce changes in hot cognition.
In summary, hot cognitive processes are relevant in a wide range of contexts across both normal and disturbed mental functioning. Notably, hot cognition has been proposed as an early predictor for treatment response in MDD (Harmer and Cowen, 2013; Park et al., 2018) as well as a promising target for therapeutic intervention (Roiser et al., 2012). Yet, despite growing recognition of their importance, scientists have so far lacked a validated and comprehensive set of tools capable of assessing hot cognitive processes in a standardized manner. Therefore, a group of researchers from Britain recently developed a novel 3-h computerized neuropsychological test battery called EMOTICOM (Bland et al., 2016). The EMOTICOM battery comprises 16 novel, adapted, and existing tasks designed to capture cognitive functions from four hot cognitive domains: (1) Emotion Processing, (2) Motivation and Reward, (3) Impulsivity, and (4) Social Cognition. The British developers validated the EMOTICOM battery in a cohort of 200 healthy participants (Bland et al., 2016). Here, we assess the psychometric properties of a shortened version of EMOTICOM in a Danish cohort of 100 healthy participants and provide reference data for research and clinical use of the test battery in Danish. In the British validation, test–retest reliability of the EMOTICOM battery was assessed after a relatively short time interval (5–10 days). In the present study we chose to collect retest data after 3–5 weeks in order to provide a reference for longitudinal studies investigating the effects of treatment or interventions over weeks or months. We also supplement the original British study findings by relating performance on the EMOTICOM tasks in the shortened Danish battery to relevant factors such as personality, mood, and self-reported levels of motivation and diligence during task completion.
Materials and Methods
One hundred healthy Danish participants between 18 and 48 years of age (males, n = 50; females, n = 50) were recruited from a previously established database of healthy volunteers (Knudsen et al., 2016) or through internet advertisements and flyers posted around the greater Copenhagen area. Exclusion criteria for the study included history of psychiatric disorders, significant somatic illness, brain trauma, use of psychotropic medication, significant lifetime history of drug abuse, pregnancy or breastfeeding, and non-fluency in Danish. The study was approved by the Danish Data Protection Agency (protocol RH-2015-255) and written informed consent was obtained from all participants.
Upon inclusion, participants were randomized into single test or retest groups. Three participants originally randomized into the retest group dropped out after completing the first test session: one because of a family emergency and two without disclosing a reason. To accommodate these dropouts, two unused single-test slots in the randomization system were converted into retest slots, while the last dropout happened too late in the data collection process to be replaced. Thus, 51 participants completed a single test session while 49 participants completed retest sessions after 3–5 weeks (time between test–retest: 27.4 ± 4.8 days, mean ± SD). Intelligence quotient (IQ) was assessed with the Reynolds Intellectual Screening Test (RIST) using the verbal subtest ‘Guess What?’ and the non-verbal subtest ‘Odd-Item Out’ (Reynolds, 2011). Level of education was indexed with the Online Stimulant and Family History Assessment Module (OS-FHAM) questionnaire using a five-point Likert scale from 1 (no vocational degree) to 5 (>4 years of higher learning at university level). Personality was assessed with the NEO Personality Inventory Revised (NEO PI-R, n = 93) and the NEO Personality Inventory-3 (NEO PI-3, n = 6) (Costa and McCrae, 2005). Mood was assessed with the Profile of Mood States (POMS) (McNair and Heuchert, 2007) immediately before each test session. All test sessions took place in standardized testing rooms and were conducted by a team of five trained neuropsychological testers at the Neurobiology Research Unit, Copenhagen University Hospital Rigshospitalet.
In addition to a flat fee of 200 Danish kroner, participants had the opportunity to win money based on their performance in six EMOTICOM tasks that included monetary reward. For these six tasks, participants were instructed to rate their performance during the task in terms of motivation and diligence, i.e., the degree to which they had ‘done their best.’ Participants were also encouraged to write down any thoughts or suggestions regarding the overall test experience or any specific task, and a brief unstructured interview was held at the end of each session. The order of tasks within the EMOTICOM battery was randomized to control for any potential effects of test order.
Translation and Implementation of EMOTICOM in Danish
Three native Danish speakers independently translated the full EMOTICOM test battery into Danish. Following a consensus meeting supervised by trained test psychologists, a single version was agreed upon. The consensus version was then back-translated into English by a natural English-Danish bilingual individual and sent to the original test developers for approval. The final Danish translation was implemented using the open source software PsychoPy. All monetary rewards were converted from British pounds to equivalent sums in Danish kroner.
The EMOTICOM Test Battery
Out of the original 16 tasks in the full EMOTICOM test battery, 11 were selected for translation and implementation in the Danish version. Two tasks, The Four-choice Serial Reaction Time Task and The Discounting Task, were not translated into Danish because the original test code was unavailable, while two others, The Emotional Memory Recognition Task and The Inference Task, were left out on the recommendation of the original British test developers, who felt these tasks warranted further improvements. Lastly, due to translation concerns (e.g., issues relating to word length, frequency, and translation ambiguity), the Word Affective Go/NoGo Task was also not implemented in the Danish validation. Therefore, only three of the original four hot cognitive domains, i.e., Emotion Processing, Motivation and Reward, and Social Cognition, were represented in the present study, while the last domain, Impulsivity, was left out. For a brief overview of the selected EMOTICOM tasks and their primary outcomes see Table 1. For a description of the full EMOTICOM battery see Bland et al. (2016).
Statistical analyses were performed using SPSS statistical software (version 25.0) and R Studio (version 3.5). Missing data included NEO personality for one participant and self-reported ratings of motivation and diligence for five participants on the Prisoner’s Dilemma and for one participant on the Ultimatum Game. Alpha levels were set at 0.01 for statistical significance in order to account for multiple comparisons.
Task Outcomes and Descriptive Statistics
Primary task outcomes for each EMOTICOM task were selected based on recommendations from the original British test developers and the existing literature. Descriptive and psychometric information on secondary outcomes can be found in the Supplementary Information. Mean, SD, median, interquartile range, range, and skewness are reported for all primary task outcomes. Floor and ceiling effects were determined as the percentage of participants who achieved minimum scores (floor effect) or maximum scores (ceiling effects) for a given task outcome. Floor or ceiling effects above 10% were considered moderate while effects above 30% were considered severe/problematic.
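As an illustration of the floor and ceiling criterion described above, the following minimal Python sketch computes the percentage of participants at the minimum and maximum score and labels the result. The function names and the example scores are hypothetical, and treating the 10% and 30% cut-offs as inclusive is an assumption made for this sketch rather than a detail confirmed by the analysis scripts.

```python
def floor_ceiling_effects(scores, min_score, max_score):
    """Return (floor %, ceiling %): share of participants at the lowest
    and highest attainable score for one task outcome."""
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == min_score) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == max_score) / n
    return floor_pct, ceiling_pct


def classify_effect(pct):
    """Label an effect size using the 10% and 30% cut-offs from the text.
    Assumption: the cut-offs are inclusive (>=)."""
    if pct >= 30.0:
        return "severe"
    if pct >= 10.0:
        return "moderate"
    if pct > 0.0:
        return "small"
    return "none"


# Hypothetical hit rates for ten participants on a 0-1 scale.
example_scores = [1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.0]
floor_pct, ceiling_pct = floor_ceiling_effects(example_scores, 0.0, 1.0)
print(floor_pct, classify_effect(floor_pct))
print(ceiling_pct, classify_effect(ceiling_pct))
```

Exact equality with the scale limits is used here because floor and ceiling scores are defined as the attainable minimum and maximum; outcomes without a bounded scale would need a different definition.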
To assess test–retest reliability, intraclass correlation coefficients (ICCs) and their 95% confidence intervals (95% CI) were calculated based on retest data from 49 participants using an absolute-agreement two-way mixed-effects model. ICC values of less than 0.40 were considered poor, values between 0.40 and 0.59 fair, values between 0.60 and 0.74 good, and values of 0.75 or greater excellent (Cicchetti, 1994). In addition, test–retest bias, i.e., percent change in scores between first and second test, was calculated as: $\text{Test–retest bias} = \frac{\text{score}_{\text{retest}} - \text{score}_{\text{test}}}{\text{score}_{\text{test}}} \times 100$.
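The reliability measures above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the SPSS procedure used in the study: it computes the single-measure, absolute-agreement ICC from two-way ANOVA mean squares (the text does not state whether single or average measures were reported), applies the Cicchetti (1994) bands, and computes the percent bias from two scores, assumed here to be group-level summary scores. All function names and example values are hypothetical.

```python
def icc_absolute_agreement(first, second):
    """Single-measure, absolute-agreement ICC for two sessions,
    computed from two-way ANOVA mean squares."""
    n, k = len(first), 2
    pairs = list(zip(first, second))
    grand = (sum(first) + sum(second)) / (n * k)
    row_means = [(a + b) / k for a, b in pairs]
    col_means = [sum(first) / n, sum(second) / n]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # participants
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # sessions
    ss_total = sum((x - grand) ** 2 for pair in pairs for x in pair)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def classify_icc(icc):
    """Reliability bands following Cicchetti (1994) as cited in the text."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"


def retest_bias_percent(score_first, score_retest):
    """Percent change from first to second session.
    Assumption: applied to summary scores such as group means."""
    return (score_retest - score_first) / score_first * 100.0


# Hypothetical data: every participant scores one point higher at retest.
session_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
session_2 = [2.0, 3.0, 4.0, 5.0, 6.0]
icc = icc_absolute_agreement(session_1, session_2)
print(round(icc, 3), classify_icc(icc))
print(retest_bias_percent(50.0, 55.0))
```

In the hypothetical example the rank order of participants is perfectly preserved, yet the ICC falls below 1 because the absolute-agreement form penalizes the systematic shift between sessions. This is the property that makes it a conservative choice for longitudinal designs.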
Task-Intercorrelations and Factor Analysis
To determine EMOTICOM’s ability to capture the three proposed underlying cognitive domains, correlation matrices conducted with Spearman’s rank correlations were used to index the shared marginal variance between tasks within the same cognitive domain, i.e., Emotion Processing, Motivation and Reward, and Social Cognition. In addition, we used an exploratory factor analysis to investigate the underlying factorial structure of the EMOTICOM test battery. The analysis was conducted using principal axis factoring with Varimax rotation. We used an eigen-value greater than 1 as criterion for extraction of factors.
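For readers unfamiliar with the rank-based correlation used here, the following Python sketch shows one common way to compute Spearman's ρ: rank each variable (assigning tied values the mean of their ranks) and take the Pearson correlation of the ranks. It is a didactic sketch with hypothetical function names and data, not the code used for the reported analyses, and it does not compute p-values.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranked data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical outcome scores for five participants on two tasks.
task_a = [1, 2, 3, 4, 5]
task_b = [5, 6, 7, 8, 7]
print(round(spearman_rho(task_a, task_b), 3))
```

Because only ranks enter the calculation, the coefficient is unaffected by the skewed, bounded score distributions that are common in this kind of task data.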
Correlations With Demographic and Descriptive Factors
Spearman’s rank correlation was used to assess the association between performance on EMOTICOM tasks and relevant demographic and descriptive factors including age, sex, education, IQ, NEO personality trait Neuroticism, and scores for self-reported mood on test days. In addition, correlations between test performance and self-reported motivation and diligence were assessed for the six EMOTICOM tasks containing a monetary reward paradigm, i.e., Reinforcement Learning Task, Monetary Incentive Reward Task, Progressive Ratio Task, Adapted Cambridge Gambling Task, Prisoner’s Dilemma, and Ultimatum Game.
Task Outcomes and Descriptive Statistics
Table 2 shows descriptive data for the 100 healthy Danish participants. Level of education was high with a majority (n = 74) of participants currently attending or having completed > 4 years of higher learning at university level. The study sample IQ of 110.36 was significantly higher than the population IQ of 100, t(99) = 14.8, p < 0.001 (Reynolds, 2011). There was no difference in Neuroticism scores between the study sample average of 76.04 and the Danish population average of 77.20, t(98) = −0.41, p = 0.68 (Skovdahl et al., 2011). Lastly, the study sample exhibited significantly lower levels of self-reported total mood disturbance (TMD) indexed with the POMS (TMD score = 1.56) compared to normative data (TMD score = 18.00), t(99) = −10.28, p < 0.001 (Nyenhuis et al., 1999).
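The comparisons with population norms above are one-sample t-tests. As a reminder of what the statistic expresses, the sketch below computes t as the difference between the sample mean and the reference mean, divided by the standard error of the sample mean, with n − 1 degrees of freedom. The function name and the example scores are hypothetical and are not the study data.

```python
import math
import statistics


def one_sample_t(sample, reference_mean):
    """Return (t, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)          # sample SD (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    t = (mean - reference_mean) / standard_error
    return t, n - 1


# Hypothetical IQ scores compared against a population mean of 100.
iq_scores = [102, 104, 106, 108, 110]
t_value, df = one_sample_t(iq_scores, 100)
print(round(t_value, 2), df)
```

The p-value would then be read from the t distribution with the returned degrees of freedom, which requires a statistics library or table and is omitted here.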
Task Outcomes and Descriptive Statistics
The majority of EMOTICOM task outcomes were skewed and 32 out of 42 outcomes had non-normal distributions. For these task outcomes, median and IQR should be used as reference instead of mean and SD. We observed small floor effects (<10%) for 4 outcomes; moderate floor effects (≥ 10%) for 1 outcome; and severe floor effects (≥30%) for 5 outcomes. In addition, we observed small ceiling effects for 15 EMOTICOM outcomes; moderate ceiling effects for 7 outcomes; and severe ceiling effects for 3 outcomes.
Table 4 shows test–retest reliability and test–retest bias for primary EMOTICOM outcomes.
Intraclass correlation coefficient scores varied across primary EMOTICOM outcomes: 7 task outcomes exhibited excellent test–retest reliability (ICC ≥ 0.75); 21 task outcomes exhibited good test–retest reliability (0.60 ≤ ICC < 0.75); 9 task outcomes exhibited fair test–retest reliability (0.40 ≤ ICC < 0.60); and 10 outcomes exhibited poor test–retest reliability (ICC < 0.40). Test–retest bias ranged from −15.32 to 32.58% across all primary EMOTICOM outcomes.
Task-Intercorrelations and Factor Analysis
Figure 1 shows the results of the correlation matrices conducted for each of the three cognitive domains: Emotion Processing, Motivation and Reward, and Social Cognition.
Figure 1. Spearman’s Rank Correlations for EMOTICOM outcomes within the three proposed cognitive domains. (I) Emotion Processing: fERT, face Emotion Recognition Task; fERT1, hit rate for happy; fERT2, hit rate for sad; fERT3, hit rate for angry; fERT4, hit rate for fearful. eERT, eyes Emotion Recognition Task; eERT1, hit rate for happy; eERT2, hit rate for sad; eERT3, hit rate for angry; eERT4, hit rate for fearful. iIM, increase Emotional Intensity Morphing Task; iIM1, detection threshold for happy; iIM2, detection threshold for sad; iIM3, detection threshold for angry; iIM4, detection threshold for fearful; iIM5, detection threshold for disgusted. dIM, decrease Intensity Morphing Task; dIM1, detection threshold for happy; dIM2, detection threshold for sad; dIM3, detection threshold for angry; dIM4, detection threshold for fearful; dIM5, detection threshold for disgusted. fAGN, Face Affective Go/NoGo Task; fAGN1, d-prime for ‘happy/neutral’; fAGN2, d-prime for ‘happy/sad’; fAGN3, d-prime for ‘neutral/happy’; fAGN4, d-prime for ‘neutral/sad’; fAGN5, d-prime for ‘sad/happy’; fAGN6, d-prime for ‘sad/neutral.’ (II) Motivation and Reward: RL, Reinforcement Learning Task; RL1, learning rate alpha for win condition; RL2, learning rate alpha for loss condition. MIR, Monetary Incentive Reward Task; MIR1, reaction time for win condition; MIR2, reaction time for loss condition. PR, Progressive Ratio Task. aCGT, adapted Cambridge Gambling Task; aCGT1, risk adjustment for win condition; aCGT2, risk adjustment for loss condition. (III) Social Cognition Domain: ME, Moral Emotions Task; ME1, guilt for agent; ME2, guilt for victim; ME3, shame for agent; ME4, shame for victim. SIP, Social Information Preference Task; SIP1, proportion thoughts; SIP2, proportion faces; SIP3, proportion facts. UG, Ultimatum Game.
Within the Emotion Processing domain, correlations between tasks were predominantly weak (−0.2 < ρ < 0.2) and statistically non-significant at the 0.01 alpha level. Only three pairs of task outcomes showed statistically significant correlations: accuracy for Anger in the face Emotional Recognition Task and d-prime for Happy/Neutral in the Face Affective Go/NoGo task (ρ = 0.30, p = 0.003); accuracy for Happy in the eyes Emotional Recognition Task and detection threshold for Happy in the decrease condition of the Emotional Intensity Morphing task (ρ = −0.36, p < 0.001); and detection threshold for Anger in the decrease condition of the Emotional Intensity Morphing task and d-prime for Happy/Neutral in the Face Affective Go/NoGo task (ρ = −0.31, p = 0.002). Meanwhile, correlations between outcomes within the same task ranged from weak to moderate for the Emotional Recognition Task (ρ = [−0.12;0.45]); from weak to strong for the Emotional Intensity Morphing task (ρ = [−0.35;0.70]); and from weak to moderate for the Face Affective Go/NoGo task (ρ = [0.13;0.35]).
Within the Motivation and Reward domain, correlations between tasks were predominantly weak (−0.2 < ρ < 0.2) and statistically non-significant at the 0.01 alpha level. Only one pair of outcomes showed a statistically significant correlation: reaction time for the win condition in the Monetary Incentive Reward task and risk adjustment for the win condition in the Adapted Cambridge Gambling Task (ρ = −0.28, p = 0.005). Correlations between outcomes within the same task were moderate for the Reinforcement Learning Task (ρ = −0.22); weak for the Monetary Incentive Reward task (ρ = 0.05); and weak for the Adapted Cambridge Gambling Task (ρ = 0.04).
Within the Social Cognition domain, correlations between tasks were predominantly weak (−0.2 < ρ < 0.2) and statistically non-significant. Only one pair of outcomes showed a statistically significant correlation: Agent Guilt rating from the Moral Emotions task and average acceptance rate from the Ultimatum Game (ρ = −0.28, p = 0.006). Correlations between outcomes within the same task ranged from weak to strong for the Moral Emotions task (ρ = [0.13;0.76]); from weak to strong for the Social Information Preference task (ρ = [−0.61; −0.17]); and were strong for the Prisoner’s Dilemma task (ρ = [0.67;0.71]).
The exploratory factor analysis indicated a 13-factor solution with a majority of factors loading onto a single task (see Supplementary Information for summary of factor loadings). The 13 factors cumulatively accounted for 70.4% of the total variance. The Kaiser-Meyer-Olkin measure of sampling adequacy was low but acceptable (KMO = 0.53) and Bartlett’s test of sphericity was significant [χ²(820) = 1807.0, p < 0.001], indicating that the data were suitable for structure detection.
Correlations With Demographic and Descriptive Factors
Table 5 shows correlations between primary EMOTICOM outcomes and various demographic and descriptive factors. A full overview of correlations between demographic and descriptive factors and all EMOTICOM outcomes can be found in Supplementary Information.
Age was negatively correlated with accuracy in recognizing angry and fearful emotions in the face version of the Emotional Face Recognition Task. Sex was correlated with risk adjustment in the win condition in the Adapted Cambridge Gambling Task (men performed better); ratings of shame in the Moral Emotions task (women rated higher); and proportion of steals against an aggressive opponent in the Prisoner’s Dilemma (men stole more). Education level showed a negative correlation with detection threshold of fearful emotions in the decrease condition of the Intensity Morphing task, while IQ and Neuroticism scores were not significantly correlated with performance on any primary outcome. Negative mood was positively correlated with accuracy in recognizing sad emotions in the face version of the Emotional Face Recognition Task, and self-rated motivation and diligence during task completion was positively correlated with breakpoint in the Progressive Ratio Task.
We here present data collected from 100 healthy participants in order to validate the EMOTICOM test battery and provide reference material for future clinical and research use in Danish populations. Overall the shortened EMOTICOM test battery exhibited mostly acceptable test–retest reliability, low task-intercorrelations indicating limited redundancy between the tasks, and independence between task performance and demographic factors. Therefore, many of the EMOTICOM tasks provide a useful objective method for measuring hot cognition. Below we discuss some task-specific considerations regarding the use of the EMOTICOM test battery in research or clinical practice.
Skewness of Data
A majority of primary EMOTICOM outcomes (76%) exhibited non-normal distributions. One explanation for this could be that our study sample is biased or that the tasks contain threshold constraints such as floor or ceiling effects which skew the distribution. The observed non-normal distributions may also reflect that the construct being assessed is not normally distributed within the general population. For example, norm data reported for emotion recognition paradigms similar to those included in the EMOTICOM test battery indicate that the performance of healthy individuals is not normally distributed within this cognitive domain (Kessels et al., 2014). Due to the skewness observed in some of the EMOTICOM tasks, we recommend using the median and interquartile ranges to gauge task performance instead of mean and SD.
Floor and Ceiling Effects
Floor and ceiling effects occur when a task is either too difficult (floor effect) or too easy (ceiling effect). They represent a serious psychometric issue because they limit the variability of the collected data and therefore the amount of useful information obtained. Several EMOTICOM tasks exhibited floor or ceiling effects: out of the 42 primary task outcomes, 16 outcomes exhibited either floor or ceiling effects above 10% (i.e., at least 10% of all participants achieved either minimum or maximum scores), including eight outcomes that exhibited severe floor or ceiling effects of 30–55%. In particular, the Face Affective Go/NoGo Task had severe ceiling effects while the Reinforcement Learning Task had severe floor effects. For the Face Affective Go/NoGo Task, this issue could potentially be mitigated by using reaction time instead of d-prime as the primary outcome, as reaction time is less vulnerable to floor and ceiling effects. Meanwhile, the presence of floor effects was particularly problematic for the Reinforcement Learning Task as a basic assumption in the algorithm used to determine the main outcome (learning rate, alpha) is that the participant performs better than chance level, i.e., that they learn the rules for choosing the best option and stop guessing randomly. In the present sample this meant that the learning rate could not be computed for 32 of the 100 participants. The difficulty of the task was corroborated by the unstructured interviews in which many participants reported they were unable to detect any patterns and kept randomly guessing throughout the task. We therefore suggest that the Reinforcement Learning Task may benefit from modifications or at least careful consideration before being applied in clinical practice or research. Other tasks including the Prisoner’s Dilemma Task and the Progressive Ratio Task also had a large proportion of participants who met our criteria for ceiling effects. However, as the purpose of these tasks is to assess different behavioral strategies (e.g., aggressive vs. cooperative), we argue that it is not meaningful to use the terms floor and ceiling effects in the conventional sense for these types of tasks even though they contain optimal strategies for maximizing monetary reward (e.g., not quitting in the Progressive Ratio Task).
In the original British validation study, test–retest reliability was assessed over a time-period of 5–10 days while we chose a retest span of 3–5 weeks. This longer timeframe is suited to inform studies that include long-term interventions or follow clinical progress over time. However, life events and mood may change considerably more over periods of weeks, as compared with days, which may influence test–retest reliability. The majority of EMOTICOM task outcomes exhibited fair to excellent test–retest reliability although notably only two tasks, the Moral Emotions task and the Ultimatum Game, had excellent test–retest coefficients of ≥ 0.75 for all primary outcomes. In addition, several tasks showed very poor reliability including the Face Affective Go/NoGo Task, Monetary Incentive Reward Task, and the Adapted Cambridge Gambling Task. It should be noted that low ICC scores can be caused by limited variance in the data which in turn may occur as a result of ceiling or floor effects (Koo and Li, 2016). For example, the low ICC scores reported for the Face Affective Go/NoGo Task may in part be explained by the severe ceiling effects exhibited by this task. Overall, tasks from the Social Cognition domain appeared to have the highest degree of reliability followed by tasks from the Emotion Processing domain, while tasks from the Motivation and Reward domain had poorer reliability. These observations were largely in accordance with the reports from the original British validation study for related outcomes from the same tasks (Bland et al., 2016). However, what may appear as poor reliability for Motivation and Reward tasks could instead reflect learning effects or adaptation in playing strategy. For instance, several participants reported deliberately prioritizing optimizing their winnings during their second session rather than ‘playing fair’ against the computer opponent. Furthermore, the reported test–retest biases were predominantly positive across most tasks, supporting the presence of a slight behavioral learning effect. It should be noted that for tasks without right/wrong answers (e.g., Moral Emotions Task and Prisoner’s Dilemma), the test–retest bias cannot be interpreted as a learning effect but could instead reflect a shift in response style or choice of strategy.
The tasks in the EMOTICOM test battery were originally chosen to capture distinct hot cognitive domains including Emotion Processing, Motivation and Reward, and Social Cognition. In order to test the extent to which each individual task loaded onto its respective domain, we mapped the shared variance for the task outcomes within the same domain in three correlation matrices. We found that there was little to no correlation between tasks from the same hot cognitive domain, indicating that the original hypothesis of task-specific domains could not be supported. This was further corroborated by the results of the exploratory factor analysis which indicated a 13-factor solution and thus did not support the proposed three-domain factorial structure. These results align with the findings from the original British validation which also failed to detect the proposed domain-specific pattern across EMOTICOM tasks (Bland et al., 2016). A possible explanation is that the proposed hot cognitive domains do not represent a single unitary cognitive construct; instead they should be seen as umbrella-terms for multiple inter-related cognitive processes. In addition, while previous studies have indicated the existence of an underlying facial expression decoding construct in the Emotion Processing domain (Hildebrandt et al., 2015), we speculate that the EMOTICOM tasks within this domain are too heterogeneous both in terms of task design and outcome scales to capture this single construct. Overall, these findings emphasize that hot cognition is a complex phenomenon made up of multifaceted cognitive constructs. As a consequence, we recommend that researchers aiming to investigate hot cognition using EMOTICOM should view the battery as a toolbox and carefully consider the exact target of their investigation before choosing the appropriate task.
Lastly, some EMOTICOM tasks exhibited very low within-task correlation, suggesting that (a) the task itself does not measure a single construct or (b) the outcomes are unreliable. This was particularly pronounced for tasks from the Motivation and Reward domain and indicates that these tasks may benefit from modifications.
With few exceptions, performance on EMOTICOM tasks was not strongly influenced by demographic factors. Age was negatively correlated with recognition of anger and fear in the face version of the Emotional Face Recognition Task but not in the eye version. Age effects on emotion recognition have previously been reported in the literature and in particular for recognition of negative emotions (Ruffman et al., 2008). Therefore, it may be advantageous to use the eye version of the Emotional Face Recognition Task in study cohorts containing middle-aged and older adults as this version appears to be less sensitive to age effects. Corroborating the original British validation study, we did not observe sex effects on tasks from the Emotion Processing domain (Bland et al., 2016), but women exhibited higher ratings of shame in the Moral Emotions Task. This fits with previous reports of sex differences in proneness to experience shame and guilt (O’Connor et al., 1994; Else-Quest et al., 2012). Women were also less likely to steal from their opponent in the Prisoner’s Dilemma task while men exhibited better risk adjustment in the Adapted Cambridge Gambling Task. Performance on EMOTICOM appeared to be largely independent of IQ and education with the single exception of a negative correlation between education level and detection of fear in the Intensity Morphing task’s decrease condition. However, it should be emphasized that the included participants were not stratified for education. This resulted in a cohort with very high education levels as well as high IQ, which limits our ability to accurately assess the potential effect of these factors on task performance. Overall, it is a strength of the EMOTICOM test battery that demographic factors do not seem to influence task performance. However, given the stratification issues described above, other studies are needed to investigate the impact of demographic factors on test performance in older as well as less well-educated cohorts.
Mood, Personality, Motivation Factors
In addition to demographic characteristics, we also looked at how other relevant factors such as trait Neuroticism and self-reported mood might influence responses on EMOTICOM tasks. Trait Neuroticism is used to index the tendency to experience negative emotions and is strongly linked to risk of developing psychopathology (Malouff et al., 2005; Ormel et al., 2013). Trait Neuroticism did not correlate significantly with any EMOTICOM outcomes while mood was positively correlated with recognition of sad faces in the face version of the Emotional Face Recognition Task only. The latter finding is in line with previous reports showing that mood can influence recognition of emotional faces. However, the effect appears to be relatively small and in most studies requires the active evocation of emotion in the participant prior to the presentation of the stimuli (Schmid and Mast, 2010). Lastly, we assessed whether self-reported motivation and diligence correlated with performance on the six tasks that offered the possibility of winning an extra sum of money. We found that self-reported motivation and diligence had little effect on performance except for motivation on the Progressive Ratio Task. This provides further validation for the Progressive Ratio Task as an objective measure of motivation. Overall, the general lack of correlations between performance on EMOTICOM tasks and trait Neuroticism, mood disturbance, and self-reported motivation and diligence indicates that EMOTICOM is not sensitive to differences in mood fluctuations or personality characteristics in healthy participants.
Comparison With British Validation Study
There are several differences between the original British validation study and the present work. For example, we chose a longer test–retest interval and included measures of mood, Neuroticism, and motivation and diligence to characterize potential influences on task performance. In addition, many of the reported task outcomes differ. We based our choice of primary outcomes for each task on consultation with the original test developers as well as standard practice in the literature. However, as most cognitive tasks do not have a single, clearly defined outcome, the ‘optimal’ choice of primary outcome may vary from study to study depending on the research question. For example, recognition of angry faces may be especially relevant in studies investigating aggression whereas recognition of fearful faces may be especially relevant for studying anxiety. We therefore endeavored to pick outcomes that we believe best capture the core cognitive function of each task and, when possible, limit the use of composite outcomes (i.e., complex outcomes created from two or more outcomes). While these choices make a direct one-to-one comparison between the two studies difficult, overall our findings align with those from the British validation study. We observed similar patterns of test–retest reliability at both task and domain level and were able to replicate the report that EMOTICOM is largely independent of demographic factors. In addition, we corroborate the original study’s rejection of a three-domain structure. As information on floor and ceiling effects was not reported in the British validation study, we cannot compare our results on this point.
EMOTICOM was initially validated in 200 volunteers by the British test developers. The purpose of this study was to replicate the original study with a smaller sample of 100 Danish participants. This is a commonly used practice for psychometric studies comparing populations with large biological, environmental, and cultural overlaps; e.g., the Danish version of the Delis-Kaplan Executive Function System (D-KEFS) test battery was validated against American norms based on data collected from 111 Danish individuals. However, the relatively small sample size of the present study does present some limitations. In particular, the reported correlations between task performance and demographic and descriptive factors should be interpreted with caution as the study may not have had sufficient power to detect weaker correlations. In addition, as the present study likely does not have a sufficiently large sample size to accurately estimate the true factorial structure of the EMOTICOM task outcomes (Beavers et al., 2013), we refrain from interpreting the meaning of individual factors derived from the analysis. Importantly, our study sample does not represent a normative sample but rather a reference sample based on well-educated individuals with high IQ. In addition, due to the high level of ethnic and cultural homogeneity in the Danish population, the present study sample could not provide any insight into potential effects of ethnicity or cultural differences on task performance. Therefore, caution should be taken when comparing the findings to other types of study groups or the general population. Also, based on the current study it cannot be ascertained whether the observed ceiling effects in healthy participants would also be present in clinical samples nor how sensitive the tasks may be to psychological or pharmacological interventions.
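To give a rough sense of what "weaker correlations" means for a sample of this size, the sketch below approximates the smallest correlation that a two-sided test could detect for a given n. It is an illustrative calculation only and not part of the reported analysis: the function name `min_detectable_r`, the significance level of 0.05, and the target power of 0.80 are assumptions chosen for the example, and no correction for multiple comparisons is applied.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_r(n, alpha=0.05, power=0.80):
    """Approximate smallest |r| detectable with a two-sided test.

    Uses the Fisher z approximation, in which atanh(r) has a standard
    error of 1 / sqrt(n - 3). The alpha and power defaults are
    illustrative assumptions, not values taken from the study.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for target power
    return tanh((z_alpha + z_power) / sqrt(n - 3))


# Retest subsample, present sample, and original British sample sizes
for n in (49, 100, 200):
    print(n, round(min_detectable_r(n), 2))
```

Under these assumptions, the threshold for n = 100 comes out near r = 0.28, so true correlations noticeably below that size could plausibly go undetected in a sample like the present one.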
So far, one study has used the EMOTICOM battery to investigate the association between paranoid thinking in healthy participants and social cognition, reporting a link between increased paranoia and likelihood of stealing from the cooperative opponent in the Prisoner’s Dilemma task (Savulich et al., 2018).
As a final note, we caution against using the rating of ‘annoyance’ from the Moral Emotions task. Based on the qualitative interviews, we discovered that some participants reported high levels of annoyance in moral scenarios where they were the agent (i.e., when they caused harm to others) because they ‘felt annoyed with themselves’ while some participants reported low levels of annoyance because they ‘did not feel annoyed with the victim or the situation.’ Since this ambiguity of interpretation was not seen in the original publication of a healthy United Kingdom sample, it may reflect cultural differences. We therefore recommend that the task instructions be modified to eliminate this ambiguity.
Conclusion
We here present reference material for performance on the hot cognition test battery EMOTICOM from a Danish cohort of healthy participants. While most tasks exhibited acceptable psychometric properties, select tasks may not be appropriate for use in healthy individuals due to issues relating to floor and ceiling effects, low test–retest reliability, and lack of within-task correlations. While these issues may be ameliorated by choosing alternate task outcomes in some cases (e.g., for the Face Affective Go/NoGo task), other tasks, in particular those from the Motivation and Reward domain, may benefit from modifications. We observed overall weak correlations between tasks within the same domain, indicating that the proposed structure of an Emotion Processing domain, a Motivation and Reward domain, and a Social Cognition domain cannot be substantiated. EMOTICOM tasks were largely independent of demographic factors such as age, sex, and education as well as IQ, personality, mood, and self-reported motivation and diligence during task completion. The present study may help guide future study designs by indicating which EMOTICOM tasks may be most appropriate for the study population planned. In conclusion, many EMOTICOM tasks provide useful, objective methods for measuring social and emotional cognition; however, future studies are needed to investigate the performance of EMOTICOM tasks in patient groups as well as their performance in intervention trials.
Data Availability Statement
For legal reasons we are not allowed to upload and share our data. The data from the study is available upon request from the CIMBI database (http://www.cimbi.dk/db).
Ethics Statement
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.
Author Contributions
DS, VF, and GK conceived and designed the study. VD and CT collected the data. PJ organized the database. VD, PJ, and AB defined and implemented the outcomes used. VD wrote the first draft of the manuscript. EM consulted on the statistical analysis, which was performed by VD. CT wrote sections of the manuscript. RE and BS consulted on the analysis and interpretation of the findings. All authors contributed to the manuscript revision, and read and approved the submitted version.
Funding
This study was supported by the Augustinus Foundation (Grant 16-0058), Rigshospitalet’s Research Council (Grant R149-A6325), and the Innovation Fund Denmark (Grant 4108-00004B). The financial supporters were not involved in the study design, collection, analysis, interpretation, or publication of data.
Conflict of Interest
AB, RE, and BS are co-inventors of the EMOTICOM test battery and BS consults for Cambridge Cognition.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
We would like to thank Nanna Hansen and Vincent Beliveau for their invaluable assistance with the implementation of the EMOTICOM test battery in Danish as well as Anne-Sofie Schneider, Sophia Armand, and Simone Pleinert for their assistance with recruitment and data collection. We are grateful to all participants who donated their time to this study.
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2019.02660/full#supplementary-material
Footnotes
- Due to scheduling conflicts, one participant completed the retest session after 6 weeks (43 days).
References
Beavers, A. S., Lounsbury, J. W., Richards, J. K., Huck, S. W., Skolits, G., and Esquivel, S. L. (2013). Practical considerations for using exploratory factor analysis in educational research. Pract. Assess. Res. Eval. 18, 1–13. doi: 10.1080/15389588.2018.1476689
Bland, A. R., Roiser, J. P., Mehta, M. A., Schei, T., Boland, H., Campbell-Meiklejohn, D. K., et al. (2016). EMOTICOM: a neuropsychological test battery to evaluate emotion, motivation, impulsivity, and social cognition. Front. Behav. Neurosci. 10:25. doi: 10.3389/fnbeh.2016.00025
Brewer, R., Marsh, A. A., Catmur, C., Cardinale, E. M., Stoycos, S., Cook, R., et al. (2015). The impact of autism spectrum disorder and alexithymia on judgments of moral acceptability. J. Abnorm. Psychol. 124, 589–595. doi: 10.1037/abn0000076
Cameron, C. D., Reber, J., Spring, V. L., and Tranel, D. (2018). Damage to the ventromedial prefrontal cortex is associated with impairments in both spontaneous and deliberative moral judgments. Neuropsychologia 111, 261–268. doi: 10.1016/j.neuropsychologia.2018.01.038
Chen, C., Takahashi, T., Nakagawa, S., Inoue, T., and Kusumi, I. (2015). Reinforcement learning in depression: a review of computational research. Neurosci. Biobehav. Rev. 55, 247–267. doi: 10.1016/j.neubiorev.2015.05.005
Chung, Y. S., Barch, D., and Strube, M. (2014). A meta-analysis of mentalizing impairments in adults with schizophrenia and autism spectrum disorder. Schizophr. Bull. 40, 602–616. doi: 10.1093/schbul/sbt048
Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol. Assess. 6, 284–290. doi: 10.1037/1040-3590.6.4.284
García-Blanco, A. C., Perea, M., and Livianos, L. (2013). Mood-congruent bias and attention shifts in the different episodes of bipolar disorder. Cogn. Emot. 27, 1114–1121. doi: 10.1080/02699931.2013.764281
Green, S., Moll, J., Deakin, J. F., Hulleman, J., and Zahn, R. (2013). Proneness to decreased negative emotions in major depressive disorder when blaming others rather than oneself. Psychopathology 46, 34–44. doi: 10.1159/000338632
Hagele, C., Schlagenhauf, F., Rapp, M., Sterzer, P., Beck, A., Bermpohl, F., et al. (2015). Dimensional psychiatry: reward dysfunction and depressive mood across psychiatric disorders. Psychopharmacology 232, 331–341. doi: 10.1007/s00213-014-3662-7
Harmer, C. J., and Cowen, P. J. (2013). ‘It’s the way that you look at it’–a cognitive neuropsychological account of SSRI action in depression. Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci. 368:20120407. doi: 10.1098/rstb.2012.0407
Harms, M. B., Martin, A., and Wallace, G. L. (2010). Facial emotion recognition in autism spectrum disorders: a review of behavioral and neuroimaging studies. Neuropsychol. Rev. 20, 290–322. doi: 10.1007/s11065-010-9138-6
Hedman, E., Ström, P., Stünkel, A., and Mörtberg, E. (2013). Shame and guilt in social anxiety disorder: effects of cognitive behavior therapy and association with social anxiety and depressive symptoms. PLoS One 8:e61713. doi: 10.1371/journal.pone.0061713
Hildebrandt, A., Sommer, W., Schacht, A., and Wilhelm, O. (2015). Perceiving and remembering emotional facial expressions — A basic facet of emotional intelligence. Intelligence 50, 52–67. doi: 10.1016/j.intell.2015.02.003
Hjordt, L. V., Stenbaek, D. S., Ozenne, B., Mc Mahon, B., Hageman, I., Hasselbalch, S. G., et al. (2017). Season-independent cognitive deficits in seasonal affective disorder and their relation to depressive symptoms. Psychiatry Res. 257:219. doi: 10.1016/j.psychres.2017.07.056
Kessels, R. P., Montagne, B., Hendriks, A. W., Perrett, D. I., and de Haan, E. H. (2014). Assessment of perception of morphed facial expressions using the emotion recognition task: normative data from healthy participants aged 8-75. J. Neuropsychol. 8, 75–93. doi: 10.1111/jnp.12009
Knudsen, G. M., Jensen, P. S., Erritzoe, D., Baare, W. F., Ettrup, A., Fisher, P. M., et al. (2016). The center for integrated molecular brain imaging (Cimbi) database. Neuroimage 124(Pt B), 1213–1219. doi: 10.1016/j.neuroimage.2015.04.025
Leppanen, J., Ng, K. W., Tchanturia, K., and Treasure, J. (2017). Meta-analysis of the effects of intranasal oxytocin on interpretation and expression of emotions. Neurosci. Biobehav. Rev. 78, 125–144. doi: 10.1016/j.neubiorev.2017.04.010
Lovell, D. M., Williams, J. M., and Hill, A. B. (1997). Selective processing of shape-related words in women with eating disorders, and those who have recovered. Br. J. Clin. Psychol. 36(Pt 3), 421–432. doi: 10.1111/j.2044-8260.1997.tb01249.x
Malouff, J., Thorsteinsson, E., and Schutte, N. (2005). The relationship between the five-factor model of personality and symptoms of clinical disorders: a meta-analysis. J. Psychopathol. Behav. Assess. 27, 101–114. doi: 10.1007/s10862-005-5384-y
Merens, W., Willem Van der Does, A. J., and Spinhoven, P. (2007). The effects of serotonin manipulations on emotional information processing and mood. J. Affect. Disord. 103, 43–62. doi: 10.1016/j.jad.2007.01.032
Newcombe, V. F., Outtrim, J. G., Chatfield, D. A., Manktelow, A., Hutchinson, P. J., Coles, J. P., et al. (2011). Parcellating the neuroanatomical basis of impaired decision-making in traumatic brain injury. Brain 134(Pt 3), 759–768. doi: 10.1093/brain/awq388
Nyenhuis, D. L., Yamamoto, C., Luchetta, T., Terrien, A., and Parmentier, A. (1999). Adult and geriatric normative data and validation of the profile of mood states. J. Clin. Psychol. 55, 79–86. doi: 10.1002/(sici)1097-4679(199901)55:1<79::aid-jclp8>3.0.co;2-7
O’Connor, L. E., Berry, J. W., Inaba, D., Weiss, J., and Morrison, A. (1994). Shame, guilt, and depression in men and women in recovery from addiction. J. Subst. Abuse Treat. 11, 503–510. doi: 10.1016/0740-5472(94)90001-9
Ormel, J., Jeronimus, B. F., Kotov, R., Riese, H., Bos, E. H., Hankin, B., et al. (2013). Neuroticism and common mental disorders: meaning and utility of a complex relationship. Clin. Psychol. Rev. 33, 686–697. doi: 10.1016/j.cpr.2013.04.003
Osorio, F. L., de Paula Cassis, J. M., Machado de Sousa, J. P., Poli-Neto, O., and Martin-Santos, R. (2018). Sex hormones and processing of facial expressions of emotion: a systematic literature review. Front. Psychol. 9:529. doi: 10.3389/fpsyg.2018.00529
Park, C., Pan, Z., Brietzke, E., Subramaniapillai, M., Rosenblat, J. D., Zuckerman, H., et al. (2018). Predicting antidepressant response using early changes in cognition: a systematic review. Behav. Brain Res. 353, 154–160. doi: 10.1016/j.bbr.2018.07.011
Plana, I., Lavoie, M. A., Battaglia, M., and Achim, A. M. (2014). A meta-analysis and scoping review of social cognition performance in social phobia, posttraumatic stress disorder and other anxiety disorders. J. Anxiety Disord. 28, 169–177. doi: 10.1016/j.janxdis.2013.09.005
Routledge, K. M., Williams, L. M., Harris, A. W. F., Schofield, P. R., Clark, C. R., and Gatt, J. M. (2018). Genetic correlations between wellbeing, depression and anxiety symptoms and behavioral responses to the emotional faces task in healthy twins. Psychiatry Res. 264, 385–393. doi: 10.1016/j.psychres.2018.03.042
Ruffman, T., Henry, J. D., Livingstone, V., and Phillips, L. H. (2008). A meta-analytic review of emotion recognition and aging: implications for neuropsychological models of aging. Neurosci. Biobehav. Rev. 32, 863–881. doi: 10.1016/j.neubiorev.2008.01.001
Savulich, G., Jeanes, H., Rossides, N., Kaur, S., Zacharia, A., Robbins, T. W., et al. (2018). Moral emotions and social economic games in paranoia. Front. Psychiatry 9:615. doi: 10.3389/fpsyt.2018.00615
Umemoto, A., Lukie, C., Kerns, K., Müller, U., and Holroyd, C. (2014). Impaired reward processing by anterior cingulate cortex in children with attention deficit hyperactivity disorder. Cogn. Affect. Behav. Neurosci. 14, 698–714. doi: 10.3758/s13415-014-0298-3
Ventura, J., Wood, R. C., Jimenez, A. M., and Hellemann, G. S. (2013). Neurocognition and symptoms identify links between facial recognition and emotion processing in schizophrenia: meta-analytic findings. Schizophr. Res. 151, 78–84. doi: 10.1016/j.schres.2013.10.015
Keywords: EMOTICOM, affective cognition, social cognition, hot cognition, psychometrics, neuropsychological test battery
Citation: Dam VH, Thystrup CK, Jensen PS, Bland AR, Mortensen EL, Elliott R, Sahakian BJ, Knudsen GM, Frokjaer VG and Stenbæk DS (2019) Psychometric Properties and Validation of the EMOTICOM Test Battery in a Healthy Danish Population. Front. Psychol. 10:2660. doi: 10.3389/fpsyg.2019.02660
Received: 12 September 2019; Accepted: 11 November 2019; Published: 03 December 2019.
Edited by: Thomas Kleinsorge, Leibniz Research Centre for Working Environment and Human Factors (IfADo), Germany
Reviewed by: Xinyang Liu, University of Oldenburg, Germany; Philip D. Harvey, University of Miami, United States
Copyright © 2019 Dam, Thystrup, Jensen, Bland, Mortensen, Elliott, Sahakian, Knudsen, Frokjaer and Stenbæk. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Dea S. Stenbæk, email@example.com