The Measurement of Emotional Intelligence: A Critical Review of the Literature and Recommendations for Researchers and Practitioners

Emotional Intelligence (EI) emerged in the 1990s as an ability-based construct analogous to general intelligence. However, over the past three decades two further, conceptually distinct forms of EI have emerged (often termed “trait EI” and “mixed model EI”) along with a large number of psychometric tools designed to measure these forms. More than 30 widely used measures of EI have now been developed. Although there is some clarity within the EI field regarding the types of EI and their respective measures, those external to the field are faced with a seemingly complex EI literature, overlapping terminology, and multiple published measures. In this paper we seek to provide guidance to researchers and practitioners seeking to utilize EI in their work. We first provide an overview of the different conceptualizations of EI. We then provide a set of recommendations for practitioners and researchers regarding the most appropriate measures of EI for a range of different purposes. We provide guidance on how both to select and to use different measures of EI. We conclude with a comprehensive review of the major measures of EI in terms of factor structure, reliability, and validity.


OVERVIEW AND PURPOSE
The purpose of this article is to review major, widely-used measures of Emotional Intelligence (EI) and make recommendations regarding their appropriate use. This article is written primarily for academics and practitioners who are not currently experts on EI but who are considering utilizing EI in their research and/or practice. For ease of reading therefore, we begin this article with an introduction to the different types of EI, followed by a brief summary of different measures of EI and their respective facets. We then provide a detailed set of recommendations for researchers and practitioners. Recommendations focus primarily on choosing between EI constructs (ability EI, trait EI, mixed models) as well as choosing between specific tests. We take into account such factors as test length, number of facets measured and whether tests are freely available. Consequently we also provide recommendations both for users willing to purchase tests and those preferring to utilize freely available measures.
In our detailed literature review, we focus on a set of widely used measures and summarize evidence for their validity, reliability, and conceptual basis. Our review includes studies that focus purely on psychometric properties of EI measures as well as studies conducted within applied settings, particularly health care settings. We include comprehensive tables summarizing key empirical studies on each measure, in terms of their research design and main findings. Our review includes measures that are academic and/or commercial as well as those that are freely available or require payment. To assist users with accessing measures, we include web links to complete EI questionnaires for freely available measures and to websites and/or example items for copyrighted measures. For readers interested in reviews relating primarily to EI constructs, theory and outcomes rather than specifically measures of EI, we recommend a number of recent high quality publications (e.g., Kun and Demetrovics, 2010; Gutiérrez-Cobo et al., 2016). Additionally, for readers interested in a review of measures without the extensive recommendations we provide here, we recommend the chapter by Siegling et al. (2015).

EARLY RESEARCH ON EMOTIONAL INTELLIGENCE
EI emerged as a major psychological construct in the early 1990s, when it was conceptualized as a set of abilities largely analogous to general intelligence. Early influential work on EI was conducted by Salovey and Mayer (1990), who defined EI as "the ability to monitor one's own and others' feelings and emotions, to discriminate among them and to use this information to guide one's thinking and actions" (p. 189). They argued that individuals high in EI had certain emotional abilities and skills related to appraising and regulating emotions in the self and others. Accordingly, it was argued that individuals high in EI could accurately perceive certain emotions in themselves and others (e.g., anger, sadness) and also regulate emotions in themselves and others in order to achieve a range of adaptive outcomes or emotional states (e.g., motivation, creative thinking).
However, despite having a clear definition and conceptual basis, early research on EI was characterized by the development of multiple measures (e.g., Bar-On, 1997a,b; Schutte et al., 1998; Mayer et al., 1999) with varying degrees of similarity (see Van Rooy et al., 2005). One cause of this proliferation was the commercial opportunities such tests offered to developers and the difficulties faced by researchers seeking to obtain copyrighted measures (see section Mixed EI for a summary of commercial measures). A further cause of this proliferation was the difficulty researchers faced in developing measures with good psychometric properties. A comprehensive discussion of this issue is beyond the scope of this article (see Petrides, 2011 for more details). However, one clear challenge faced by early EI test developers was constructing emotion-focused questions that could be scored with objective criteria. In comparison to measures of cognitive ability that have objectively right/wrong answers (e.g., mathematical problems), items designed to measure emotional abilities often rely on expert judgment to define correct answers, which is problematic for multiple reasons (Roberts et al., 2001; Maul, 2012).
A further characteristic of many early measures was their failure to discriminate between measures of typical and maximal performance. In particular, some test developers moved away from pure ability based questions and utilized self-report questions (i.e., questions asking participants to rate behavioral tendencies and/or abilities rather than objectively assessing their abilities; e.g., Schutte et al., 1998). Other measures utilized broader definitions of EI that included social effectiveness in addition to typical EI facets (see Ashkanasy and Daus, 2005; e.g., Boyatzis et al., 2000; Boyatzis and Goleman, 2007). Over time it became clear that these different measures were tapping into related, yet distinct underlying constructs. Currently, there are two popular methods of classifying EI measures. First is the distinction between trait and ability EI proposed initially by Petrides and Furnham (2000) and further clarified by Pérez et al. (2005). Second is in terms of the three EI "streams" as proposed by Ashkanasy and Daus (2005). Fortunately there is overlap between these two methods of classification as we discuss below.

METHODS OF CLASSIFYING EI
The distinction between ability EI and trait EI first proposed by Petrides and Furnham (2000) was based purely on whether the measure was a test of maximal performance (ability EI) or a self-report questionnaire (trait EI) (Petrides and Furnham, 2000;Pérez et al., 2005). According to this method of classification, Ability EI tests measure constructs related to an individual's theoretical understanding of emotions and emotional functioning, whereas trait EI questionnaires measure typical behaviors in emotion-relevant situations (e.g., when an individual is confronted with stress or an upset friend) as well as self-rated abilities. Importantly, the key aspect of this method of classification is that EI type is best defined by method of measurement: all EI measures that are based on self-report items are termed "trait EI" whereas all measures that are based on maximal performance items are termed "ability EI".
The second popular method of classifying EI measures refers to the three EI "streams" (Ashkanasy and Daus, 2005). According to this method of classification, stream 1 includes ability measures based on Mayer and Salovey's model; stream 2 includes self-report measures based on Mayer and Salovey's model; and stream 3 includes "expanded models of emotional intelligence that encompass components not included in Salovey and Mayer's definition" (p. 443). Ashkanasy and Daus (2005) noted that stream 3 had also been referred to as "mixed" models in that they comprise a mixture of personality and behavioral items. The term "mixed EI" is now frequently used in the literature to refer to EI measures that measure a combination of traits, social skills and competencies and that overlap with other personality measures (O'Boyle et al., 2011).
Prior to moving on, we note that Petrides and Furnham's (2000) trait vs. ability distinction is sufficient to categorize the vast majority of EI tests. Utilizing this system, both stream 2 (self-report) and stream 3 (self-report mixed) are simply classified as "trait" measures. Indeed as argued by Pérez et al. (2005), this method of classification is probably sufficient given that self-report measures of EI tend to correlate strongly regardless of whether they are stream 2 or stream 3 measures. However, given that the terms "stream 3" and "mixed" are so extensively used in the EI literature, we will also use them here. We are not proposing that these terms are ideal or even useful when classifying EI, but rather we wish to adopt language that is most representative of the existing literature on EI. In the following section therefore, we refer to ability EI (stream 1), trait EI (stream 2), and mixed EI (stream 3). As outlined later, decisions regarding which measure of EI to use should be based on what form of EI is relevant to a particular research project or professional application.
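The relationship between the two classification methods described above can be illustrated with a short sketch. This is only an illustration under simplifying assumptions: the class name, field names, and function names are hypothetical, and each measure is reduced to two yes/no properties (whether it is self-report, and whether it goes beyond the Mayer and Salovey model).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    """Hypothetical, simplified description of an EI measure."""
    name: str
    self_report: bool      # True for questionnaires, False for maximal-performance tests
    expanded_model: bool   # True if content goes beyond the Mayer and Salovey definition


def stream(m: Measure) -> int:
    """Stream in the sense of Ashkanasy and Daus (2005), simplified.

    Assumption: every maximal-performance test is treated as stream 1.
    """
    if not m.self_report:
        return 1
    return 3 if m.expanded_model else 2


def trait_or_ability(m: Measure) -> str:
    """Petrides and Furnham (2000): only the method of measurement matters."""
    return "trait" if m.self_report else "ability"


if __name__ == "__main__":
    examples = [
        Measure("MSCEIT", self_report=False, expanded_model=False),
        Measure("SREIT", self_report=True, expanded_model=False),
        Measure("ESCI", self_report=True, expanded_model=True),
    ]
    for m in examples:
        print(m.name, "stream", stream(m), "->", trait_or_ability(m))
```

Under these assumptions, streams 2 and 3 both collapse into "trait", which mirrors the overlap between the two systems noted above.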

ABILITY EI
For the purposes of this review, we refer to "ability" based measures as tests that utilize questions/items comparable to those found in IQ tests (see Austin, 2010). These include all tests containing ability-type items and not only those based directly on Mayer and Salovey's model. In contrast to trait based measures, ability measures do not require that participants self-report on various statements, but rather require that participants solve emotion-related problems that have answers that are deemed to be correct or incorrect (e.g., what emotion might someone feel prior to a job interview? (a) sadness, (b) excitement, (c) nervousness, (d) all of the above). Ability based measures give a good indication of individuals' ability to understand emotions and how they work. However since they are tests of maximal ability, they do not tend to predict typical behavior as well as trait based measures (see O'Connor et al., 2017). Nevertheless, ability-based measures are valid, albeit weak, predictors of a range of outcomes including work related attitudes such as job satisfaction (Miao et al., 2017) and job performance (O'Boyle et al., 2011).

TRAIT EI
In this review, we define trait based measures as those that utilize self-report items to measure overall EI and its sub dimensions. We utilize this term for measures that are self-report, and have not explicitly been termed as "mixed" or "stream 3" by others. Individuals high in various measures of trait EI have been found to have high levels of self-efficacy regarding emotion-related behaviors and tend to be competent at managing and regulating emotions in themselves and others. Also, since trait EI measures tend to measure typical behavior rather than maximal performance, they tend to provide a good prediction of actual behaviors in a range of situations (Petrides and Furnham, 2000). Recent meta-analyses have linked trait EI to a range of work attitudes such as job satisfaction and organizational commitment (Miao et al., 2017), and to job performance (O'Boyle et al., 2011).

MIXED EI
As noted earlier, although the majority of EI measures can be categorized using the terms "ability EI" and "trait EI", we adopt the term "mixed EI" in this review when this term has been explicitly used in our source articles. The term mixed EI is predominately used to refer to questionnaires that measure a combination of traits, social skills and competencies that overlap with other personality measures. Generally these measures are self-report; however, a number also utilize 360 degree forms of assessment (self-report combined with multiple peer reports from supervisors, colleagues and subordinates) (e.g., Bar-On, 1997a,b). This is particularly true for commercial measures designed to predict and improve performance in the workplace. A common aspect in many of these measures is the focus on emotional "competencies" which can theoretically be developed in individuals to enhance their professional success (see Goleman, 1995). Research on mixed measures has found them to be valid predictors of multiple emotion-related outcomes including job satisfaction, organizational commitment (Miao et al., 2017), and job performance (O'Boyle et al., 2011). Effect sizes of these relationships tend to be moderate and on par with trait-based measures.
We note that although different forms of EI have emerged (trait, ability, mixed) there are nevertheless a number of conceptual similarities in the majority of measures. In particular, the majority of EI measures are regarded as hierarchical meaning that they produce a total "EI score" for test takers along with scores on multiple facets/subscales. Additionally, the facets in ability, trait and mixed measures of EI have numerous conceptual overlaps. This is largely due to the early influential work of Mayer and Salovey. In particular, the majority of measures include facets relating to (1) perceiving emotions (in self and others), (2) regulating emotions in self, (3) regulating emotions in others, and (4) strategically utilizing emotions. Where relevant therefore, this article will compare how well different measures of EI assess the various facets common to multiple EI measures.

EMOTIONAL INTELLIGENCE SCALES
The following emotional intelligence scales were selected to be reviewed in this article because they are all widely researched general measures of EI that also measure several of the major facets common to EI measures (perceiving emotions, regulating emotions, utilizing emotions).

Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) (Mayer et al., 2002a,b) [...] (Boyatzis and Goleman, 2007)

The complete literature review of these measures is included in the Literature Review section of this article. The following section provides a set of recommendations regarding which of these measures is appropriate to use across various research and applied scenarios.

RECOMMENDATIONS REGARDING THE APPROPRIATE USE OF MEASURES
Deciding Between Measuring Trait EI, Ability EI and Mixed EI
A key decision researchers/practitioners need to make prior to incorporating EI measures into their work is whether they should utilize a trait, ability or mixed measure of EI. In general, we suggest that when researchers/practitioners are interested in emotional abilities and competencies then they should utilize measures of ability EI. In particular ability EI is important in situations where a good theoretical understanding of emotions is required. For example a manager high in ability EI is more likely to make good decisions regarding team composition. Indeed numerous studies on ability EI and decision making in professionals indicate that those high in EI tend to be competent decision makers, problem solvers and negotiators due primarily to their enhanced abilities at perceiving and understanding emotions (see Mayer et al., 2008). More generally, ability EI research also has demonstrated associations between ability EI and social competence in children (Schultz et al., 2004) and adults (Brackett et al., 2006).
We suggest that researchers/practitioners should select trait measures of EI when they are interested in measuring behavioral tendencies and/or emotional self-efficacy. This should be when ongoing, typical behavior is likely to lead to positive outcomes, rather than intermittent, maximal performance. For example, research on task-induced stress (i.e., temporary states of negative affect evoked by short term, challenging tasks) has shown trait EI to have incremental validity over other predictors (O'Connor et al., 2017). More generally, research tends to show that trait EI is a good predictor of effective coping styles in response to life stressors (e.g., Austin et al., 2010). Overall, trait EI is associated with a broad set of emotion and social related outcomes in adults and children (Mavroveli and Sánchez-Ruiz, 2011; Petrides et al., 2016). Therefore in situations characterized by ongoing stressors such as educational contexts and employment, we suggest that trait measures be used.
When both abilities and traits are important, researchers/practitioners might choose to use both ability and trait measures. Indeed some research demonstrates that both forms of EI are important stress buffers and that they exert their protective effects at different stages of the coping process: ability EI aids in the selection of coping strategies whereas trait EI predicts the implementation of such strategies once selected (Davis and Humphrey, 2014).
Finally, when researchers/practitioners are interested in a broader set of emotion-related and social-related dispositions and competencies we recommend a mixed measure. Mixed measures are particularly appropriate in the context of the workplace. This seems to be the case for two reasons: first, the tendency to frame EI as a set of competencies that can be trained (e.g., Goleman, 1995;Boyatzis and Goleman, 2007) is likely to equip workers with a positive growth mindset regarding their EI. Second, the emphasis on 360 degree forms of assessment in mixed measures provides individuals with information not only on their self-perceptions, but on how others perceive them which is also particularly useful in training situations.

Advantages and Disadvantages of Trait and Ability EI
There are numerous advantages and disadvantages of the different forms of EI that test users should factor into their decision. One disadvantage of self-report measures is that people are not always good judges of their emotion-related abilities and tendencies (Brackett et al., 2006; Sheldon et al., 2014; Boyatzis, 2018). A further disadvantage of self-report, trait based measures is their susceptibility to faking. Participants can easily come across as high in EI by answering questions in a strategic, socially desirable way. However, this is usually only an issue when test-takers believe that someone of importance (e.g., a supervisor or potential employer) will have access to their results. When it is for self-development or research, individuals are less likely to fake their answers to trait EI measures (see Tett et al., 2012). We also note that the theoretical bases of trait and mixed measures have been questioned. Some have argued for example that self-report measures of EI measure nothing fundamentally different from the Big Five (e.g., Davies et al., 1998). We will not address this issue here as it has been extensively discussed elsewhere (e.g., Bucich and MacCann, 2019). However, we emphasize that regardless of the statistical distinctiveness of self-report measures of EI, there is little question regarding their utility and predictive validity (O'Boyle et al., 2011; Miao et al., 2017).
One advantage of ability based measures is that they cannot be faked. Test-takers are told to give the answer they believe is correct, and consequently should try to obtain a score as high as possible. A further advantage is that they are often more engaging tests. Rather than simply rating agreement with statements as in trait based measures, test-takers attempt to solve emotion-related problems, solve puzzles, and rate emotions in pictures.
Overall however, there are a number of fundamental problems with ability based measures. First, many personality and intelligence theorists question the very existence of ability EI, and suggest it is nothing more than intelligence. This claim is supported by high correlations between ability EI and IQ, although some have provided evidence to the contrary (e.g., MacCann et al., 2014). Additionally, the common measures of ability EI tend to have relatively poor psychometric properties in terms of reliability and validity. Ability EI measures do not tend to strongly predict outcomes that they theoretically should predict (e.g., O'Boyle et al., 2011; Miao et al., 2017). Maul (2012) also outlines a comprehensive set of problems with the most widely used ability measure, the MSCEIT, related to consensus-based scoring, reliability, and underrepresentation of the EI construct. Also see Petrides (2011) for a comprehensive critique of ability measures.

General Recommendation for Non-experts Choosing Between Ability and Trait EI
While the distinction between trait, ability and mixed EI is important, we acknowledge that many readers will simply be looking for an overall measure of emotional functioning that can predict personal and professional effectiveness. Therefore, when potential users have no overt preference for trait or ability measures but need to decide, we strongly recommend researchers/practitioners begin with a trait-based measure of EI. Compared to ability based measures, trait based measures tend to have very good psychometric properties, do not have questionable theoretical bases and correlate moderately and meaningfully with a broad set of outcome variables. In general, we believe that trait based measures are more appropriate for most purposes than ability based measures. That being said, several adequate measures of ability EI exist and these have been reviewed in the Literature Review section. If there is a strong preference to use ability measures of EI then several good options exist as outlined later.

Choosing a Specific Measure of Trait EI
Based on our literature review we suggest that a very good, comprehensive measure of trait EI is the Trait Emotional Intelligence Questionnaire, or TEIQue (Petrides and Furnham, 2001). If users are not restricted by time or costs (commercial users need to pay, researchers do not) then the TEIQue is a very good option. The TEIQue is a widely used questionnaire that measures 4 factors and 15 facets of trait EI. It has been cited in more than 2,000 academic studies. It is regarded as a "trait" measure of EI because it is based entirely on self-report responses, and facet scores represent typical behavior rather than maximal performance. There is extensive evidence in support of its reliability and validity (Andrei et al., 2016). The four factors of the TEIQue map on to the broad EI facets present in multiple measures of EI as follows: emotionality = perceiving emotions, self-control = regulating emotions in self, sociability = regulating emotions in others, well-being = strategically utilizing emotions.
One disadvantage of the TEIQue however is that it is not freely available for commercial use. The website states that commercial or quasi-commercial use without permission is prohibited. The test can nevertheless be commercially used for a relatively small fee. The relevant webpage can be found here (http://psychometriclab.com/). A second disadvantage is that the test can be fairly easily faked due to its use of a self-report response scale. However, this is generally only an issue when individuals have a reason for faking (e.g., their score will be seen by someone else and might impact their prospects of being selected for a job) (see Tett et al., 2012). Consequently, we do not recommend the TEIQue to be used for personnel selection, but it is relevant for other professional purposes such as in EI training and executive coaching.
There are very few free measures of trait EI that have been adequately investigated. One exception is the widely used, freely available measure termed the Self-Report Emotional Intelligence Test (SREIT, Schutte et al., 1998). The SREIT has been cited more than 3,000 times. The full paper which includes all test items can be accessed here (https://www.researchgate.net/publication/247166550_Development_and_Validation_of_a_Measure_of_Emotional_Intelligence). Although it was designed to measure overall EI, subsequent research indicates that it performs better as a multidimensional scale measuring 4 distinct factors including: optimism/mood regulation, appraisal of emotions, social skills and utilization of emotions. These four scales again map closely to the broad facets present in many EI instruments as follows: optimism/mood regulation = regulating emotions in self, appraisal of emotions = perceiving emotions in self, social skills = regulating emotions in others, and utilization of emotions = strategically utilizing emotions. Please note that although one study has comprehensively critiqued the SREIT (Petrides and Furnham, 2000), it actually works well as a multidimensional measure. This was acknowledged by the authors of the critique and has been subsequently confirmed (e.g., by O'Connor and Athota, 2013).

Long vs. Short Measures of Trait EI
The TEIQue is available in long form (153 items, 15 facets, 4 factors) and short form (30 items, 4 factors/subscales). A complete description of all factors and facets can be found here (http://www.psychometriclab.com/adminsdata/files/TEIQue%20interpretations.pdf). We recommend using the short form when users are interested in measuring only the 4 broad EI factors measured by this questionnaire (self-control, well-being, sociability, emotionality). Additionally, there is much more research on the short form of the questionnaire (e.g., Cooper and Petrides, 2010) (see Table 5), and the scoring instructions for the short form are freely available for researchers. If the short form is used, it is recommended that all factors/subscales are utilized because they predict outcomes in different ways (e.g., O'Connor and Brown, 2016). The SREIT is available only as a short, 33 item measure. All subscales are regarded as equally important and should be included if possible. Again it is noted that this test is freely available and the article publishing the items specifically states "Note: the authors permit free use of the scale for research and clinical purposes." When users require a comprehensive measure of trait EI, the long form of the TEIQue is also a good option (see Table 5).
Although not as widely researched as the short version, the long version nevertheless has strong empirical support for reliability and validity. The long form is likely to be particularly useful for coaching and training purposes, because the use of 15 narrow facets allows for more focused training and intervention than measures with fewer broad facets/factors.

Choosing Between Measures of Ability EI
The most researched and supported measure of ability EI is the Mayer, Salovey, Caruso Emotional Intelligence Test (MSCEIT) (see Tables 2, 3). It has been cited in more than 1,500 academic studies. It uses a 4 branch approach to ability EI and measures ability dimensions of perceiving emotions, facilitating thought, understanding emotions and managing emotions. These scales broadly map onto the broad constructs present in many measures of EI as follows: facilitating thought = strategically utilizing emotions, perceiving emotions = perceiving emotions in self and others, understanding emotions = understanding emotions, and managing emotions = regulating emotions in self and others. However, this is a highly commercialized test and relatively expensive to use. The test is also relatively long (141 items) and time consuming to complete (30-45 min). A second, potentially more practical option includes two related tests of ability EI designed by MacCann and Roberts (2008) (see Tables 2, 7). These tests are called the Situational Test of Emotion Management (STEM) and the Situational Test of Emotional Understanding (the STEU). These tests are becoming increasingly used in academic articles; the original paper has now been cited more than 250 times. The two aspects of ability EI measured in these tests map neatly onto two of the broad EI constructs present in multiple measures of EI. Specifically, the STEM can be regarded as a measure of emotional regulation in oneself and the STEU can be regarded as a measure of emotional understanding. As indicated in Table 7, there is strong psychometric support for these tests (although the alpha for STEU is sometimes borderline/low). A further advantage of STEU is that it contains several items regarding workplace behavior, making it highly applicable for use in professional contexts.
If researchers/practitioners decide to use the STEM and STEU, additional measures might be required to measure the remaining broad EI constructs present in other tests. Although these measures could all come from relevant scales of tests reviewed in this article (see Table 1), there is a further option. Users should consider the Diagnostic Analysis of Non-verbal Accuracy scale (DANVA) which is a widely used, validated measure of perceiving emotion in others (see Nowicki and Duke, 1994 for an introduction to the DANVA). Alternatively, for those open to using a combination of ability and trait measures, users might wish to use Schutte et al.'s (1998) SREIT to assess remaining facets of EI (see Table 4). This is because it is free and captures aspects of EI not measured by STEM/STEU. These include appraisal of emotions (for perceiving emotions) and utilization of emotions (for strategically utilizing emotions), respectively. Therefore, if there is a strong preference to utilize ability based measures, the STEM, STEU, and DANVA represent some very good options worth considering. The advantage of using these over the MSCEIT is the lower cost of these measures and the reduced test time. Although the STEM, STEU, and DANVA do not seem to be freely available for commercial use, they are nevertheless appropriate for commercial use and likely to be cheaper than alternative options at this point in time.

Deciding Between Using a Single Measure or Multiple Measures
When seeking to measure EI, researchers/practitioners could choose to use (1) a single EI tool that measures overall EI along with common EI facets (i.e., perceiving emotions in self and others, regulating emotions in self and others and strategically utilizing emotions) or (2) some combination of existing scales from EI tool/s to cumulatively measure the four constructs.
The first option represents the most pragmatic and generally optimal solution because all information about the relevant facets and related measures would usually be located in a single document (e.g., test manual, journal article) or website. Additionally, if a paid test is used it would only require a single payment to a single author/institution. Furthermore, single EI tools are generally based on theoretical models of EI that have implications for training and development. For example EI facets in Goleman's (1995) model (as measured using the ESCI, Boyatzis and Goleman, 2007) are regarded as characteristics that can be trained. Therefore, if a single EI tool is selected, the theory underlying the tool could be used to model the interventions.
However, a disadvantage of the first option is that some EI measures will not contain the specific set of EI constructs researchers/practitioners are interested in assessing. This will often be the case when practitioners are seeking a comprehensive measure of EI but prefer a freely available measure. The second option specified above would solve this problem. However, the trade-off would be increased complexity and the absence of a single underlying theory that relates to the selected measures. Tables 2-8 describe facets within each measure as well as reliability and validity evidence for each facet and can be used to assist the selection of multiple measures if users choose to do this.
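When the second option is chosen, it can help to check which of the broad EI constructs a proposed combination of scales actually covers. The sketch below encodes the scale-to-construct mappings stated earlier in this article as a simple lookup. The data structure and function names are hypothetical, "total" is a placeholder label for single-scale tests, and the self/other distinction within emotion perception is collapsed for simplicity.

```python
# The four broad facets common to many EI measures, as listed in this article.
BROAD_FACETS = {
    "perceiving emotions",
    "regulating emotions in self",
    "regulating emotions in others",
    "strategically utilizing emotions",
}

# (measure, scale) -> broad constructs, following the mappings given in the text.
SCALE_TO_FACETS = {
    ("TEIQue", "emotionality"): {"perceiving emotions"},
    ("TEIQue", "self-control"): {"regulating emotions in self"},
    ("TEIQue", "sociability"): {"regulating emotions in others"},
    ("TEIQue", "well-being"): {"strategically utilizing emotions"},
    ("SREIT", "optimism/mood regulation"): {"regulating emotions in self"},
    ("SREIT", "appraisal of emotions"): {"perceiving emotions"},
    ("SREIT", "social skills"): {"regulating emotions in others"},
    ("SREIT", "utilization of emotions"): {"strategically utilizing emotions"},
    ("MSCEIT", "perceiving emotions"): {"perceiving emotions"},
    ("MSCEIT", "facilitating thought"): {"strategically utilizing emotions"},
    ("MSCEIT", "understanding emotions"): {"understanding emotions"},
    ("MSCEIT", "managing emotions"): {
        "regulating emotions in self",
        "regulating emotions in others",
    },
    ("STEM", "total"): {"regulating emotions in self"},
    ("STEU", "total"): {"understanding emotions"},
    ("DANVA", "total"): {"perceiving emotions"},
}


def missing_facets(battery):
    """Return the broad facets not covered by the chosen (measure, scale) pairs."""
    covered = set()
    for key in battery:
        covered |= SCALE_TO_FACETS[key]  # KeyError signals an unknown scale
    return BROAD_FACETS - covered


if __name__ == "__main__":
    ability_only = {("STEM", "total"), ("STEU", "total"), ("DANVA", "total")}
    print(sorted(missing_facets(ability_only)))
```

A check of this kind does not replace the reliability and validity evidence in Tables 2-8; it only flags constructs that a chosen battery leaves unmeasured.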

The Best Measure of Each Broad EI Construct (Evaluated Across all Reviewed Tests)
In some cases, researchers/practitioners will not need to measure overall EI, but instead seek to measure a single dimension of EI (e.g., emotion perception, emotion management etc.). In general, we caution against the selective use of individual EI scales and recommend that users habitually measure and control for EI facets they are not directly interested in. Nevertheless, we acknowledge that in some cases users will have to select a single measure and consequently, this section specifies a selection of what we consider the "best" measures for each construct. We do this for both free measures and those requiring payment. In order to determine which measure constitutes the "best" measure for each construct, the following criteria were applied: 1. The measure should have been used in multiple research studies published in high quality journals.
Mayer et al. (2002a,b, 2003). Cited in more than 1,500 articles. In 1997 Salovey and Mayer developed a 4 branch approach to ability EI called MEIS and since then this has been developed into the MSCEIT (Mayer et al., 2002a,b) and revised with additional versions. The revised model is a process-orientated model that emphasizes stages of development in EI, potential for growth and the contributions emotions make to intellectual growth. The scale was developed based on a review of the ability EI literature focusing on individuals' processing of emotion-related information.
Each of the four branches is measured with two objective, ability-based tasks. There are different response formats. Some tasks, such as the "picture task," use 5-point rating scales, whereas other tasks, such as the "blends task," use a multiple-choice response. For all questions however, answers can be considered correct or incorrect in a similar way to IQ tests. The facets can be defined as follows: Perceiving Emotion represents the ability to correctly identify how oneself and others are feeling. Facilitating Thought represents the ability to create emotions that impact thought processes. Understanding Emotion represents the ability to understand the causes of emotions. Managing Emotion represents the ability to create effective strategies that utilize emotions for a specific purpose.
Schutte et al. (1998). Cited in more than 3,000 articles. Schutte et al. (1998) developed a self-report EI questionnaire based on Salovey and Mayer's (1990) model. A factor analysis was conducted on 62 items using data from 346 participants from which a 33-item scale was created. The measure showed good internal consistency (Cronbach's alpha of 0.90) and test-retest reliability (r = 0.78). The scale was also tested against theoretically related constructs including alexithymia, non-verbal communication of affect, optimism, pessimism, attention to feelings, clarity of feelings, mood repair, depressed mood and impulsivity and found to have construct validity. The model however has been criticized for confusing ability and trait forms of EI (however this criticism can be applied to the development of most trait based models). Participants respond to items on a 5-point Likert-type scale ranging from 1 (strongly disagree) to 5 (strongly agree).
Consists of 33 self-report statements.
Long Form and Short Forms. Petrides and Furnham (2001). Cited in more than 2,000 articles.

TEIQue-Long Form
The TEIQue is based on trait EI theory, which conceptualizes emotional intelligence as a personality trait. It has also been described as "emotional self-efficacy." Unlike Schutte et al.'s (1998) measure, it did not originally aim to measure ability based EI with self-report questions.
Items and facets were developed by conducting a content analysis of the EI literature and available constructs (Salovey and Mayer, 1990; Goleman, 1995; Bar-On, 1997a,b).

TEIQue-Short Form
Petrides and Furnham also created a short-form questionnaire.
Bar-On Emotional Quotient Inventory (EQ-i)
Bar-On (1996, 1997a). Cited in more than 1,000 articles. Mixed position, considers EI as a mixed construct consisting of both cognitive ability and personality aspects. The scale emphasizes how the personality traits influence a person's general well-being. Bar-On's model was based on empirical research into personal factors related to EI and particularly into emotional and social elements of behavior.
The concept was theoretically developed from logically clustering variables and identifying underlying key factors claimed to determine effective and successful functioning. The EQ-i measures abilities and the potential for performance rather than performance itself; it is process-oriented, rather than outcome-oriented.
Bar-On's original 1996 report of the EQ-i is in book form. (Mayer et al., 2001).

STEM
The STEM was developed to be administered in both multiple-choice and rate-the-extent formats (i.e., test takers rate the appropriateness, strength, or extent of each alternative, rather than selecting the correct alternative). Items for STEM were developed by conducting semi-structured interviews with 50 individuals who described emotional situations they experienced in the past 2 weeks (with a total of 290 situations). These items were categorized and tested.

STEU
Roseman's (2001) emotion appraisal theory was used as the basis for item construction and scoring of the STEU such that answers could be regarded as correct or incorrect. According to this model, the 17 most common emotions can be explained by a combination of seven appraisal dimensions. The STEU comprises 42 items, each presenting an emotional situation, and participants choose which emotion the situation is most likely to elicit. Fourteen emotions were assessed in 3 separate contexts: de-contextualized, work and private life.

STEM: 44 items
Anger (18 items
Boyatzis et al. (2000). Cited in more than 1,500 articles. The ESCI is based on a mixed model of EI and regards EI as consisting of both cognitive ability and personality aspects. The model focuses heavily on predicting workplace success. The ESCI utilizes 360-degree assessment that can include self-ratings, peer ratings and supervisor ratings. Boyatzis and Goleman include a set of emotional competencies within each construct of EI. Emotional competencies are not regarded as innate talents, but rather learned capabilities that must be worked on and can be developed to achieve outstanding performance. Boyatzis and Goleman argue that individuals are born with a general emotional intelligence potential that determines their potential for learning emotional competencies. Internal consistency of the scales ranges from 0.61 to 0.85 (Conte, 2005).

Consists of 110 items
Assesses 12 competencies organized into four factors:

Cost
Note the measures reviewed above were selected based on widespread use and validation. Although other measures exist, they were not reviewed because they have either less research in general or poor psychometric support. However, if none of those reviewed above are considered appropriate, three further available measures could be considered. One relatively new measure with good preliminary support is the Genos Emotional Intelligence Inventory (Palmer et al., 2009). This is a commercial, mixed measure of EI and requires payment. A further, freely available measure is Wong's Emotional Intelligence Scale (WEIS) (trait-based; see Wong et al., 2004, 2007). A third very new measure is the Geneva Emotional Competence Test (GECo) (see Schlegel and Mortillaro, 2019). It is an ability based measure designed for the workplace that looks very promising based on early work.
The authors also suggest the study did not control for personality which may have an impact on the results.
Note two of the studies reviewed in this table utilize student samples. As specified in the inclusion criteria section we targeted non-student samples and only utilized student samples where others were not available or not appropriate.
Frontiers in Psychology | www.frontiersin.org
The study aimed to explore the emotional intelligence of nursing students and its relationship to perceived stress, coping strategies, subjective well-being, perceived nursing competency and academic performance.
1 was administered with a pen and paper questionnaire. Confirmatory factor analysis was conducted and four of the items were re-written.
Study 2: The students completed version 1.5 of the TEIQue developed in study 1. The same procedure was carried out in study 2. Internal consistency: In study 1 (TEIQue-SF), Cronbach's alpha for men was 0.89 and 0.88 for women.
In study 2 (TEIQue-SF 1.50), Cronbach's alpha for men was 0.88 and 0.87 for women. Construct validity: Each measure was tested using item response theory (IRT) which provides information about measurement precision for each item.
Gender: 95% of participants were female. Age: 34% of nurses were aged 41-50 and 31% were aged 52-60. Education: 42% bachelor level and 28% masters level.
The study assessed self-compassion and emotional intelligence using the TEIQue-SF in nurses. Nurses completed the self-report assessment online.
Internal consistency: Cronbach's alpha of 0.88 was reported for the study.
Construct validity: The study found EI was significantly related to self-compassion (r = 0.55, p < 0.0001).
Note some of the studies reviewed in this table utilize student samples. As specified in the inclusion criteria section we targeted non-student samples and only utilized student samples where others were not available or not appropriate.
Asian American, 7% African American, 3% Hispanic, and 1% Native American.
The EQ-i has been developed over 17 years by Bar-On. Numerous studies have been conducted by Bar-On testing the self-report measure to establish a valid and reliable tool. Many of his earlier works could not be located; however, information was drawn from a number of sources listed to the left.
Internal consistency: The overall internal consistency was reported at 0.97. Test-retest reliability: the average stability coefficient is 0.85 after 1 month and 0.75 after 4 months.
Predictive validity: Bar-On (2006) noted 20 predictive validity studies that have been conducted on a total of 22,971 individuals across 7 countries. The EQ-i measure was found to predict performance in social interactions, at school and work as well as impacts on physical health, psychological health, self-actualization and subjective well-being.
The average predictive validity coefficient is 0.59.

Bar-On et al. (2000) Germany
Non clinical N = 167 Sample: Helping professionals including police officers (n = 85) and child care and mental health care workers (n = 81).
Gender: 72% male and 28% female. Age: mean age was 33.2 years. Education: the average duration of education was 11.9 years.
Used the earlier version of Bar-On's EQ-i, comprising 133 items translated into German.
The study assessed occupational stress and emotional expression within different high stress helping professions, namely the police force and child care and mental health care professions. The authors examined gender, age and occupational differences.
Internal consistency: Alpha coefficients ranged from 0.66 to 0.87 for the scales.
The authors noted that there may be social desirability bias present. Specific organizational stressors were not assessed in the study, therefore organizational or occupational differences may be present. Results may not be generalizable to the wider population due to the limited sample size. Cross-sectional study: a longitudinal study is required to assess causality. Self-report measure: this study relies on subjective self-report data.

Dawda and Hart (2000) Canada
Non clinical N = 243 Sample: University students Gender: 118 men and 125 women Age: Age ranged from 17 to 47 with a mean age of 21.27 years.
Students were recruited via posters advertising an "emotions study." The aim of the research was to assess the validity and reliability of the EQ-i measure, and was undertaken as part of a larger study examining the association between psychopathy and alexithymia. Participants completed the EQ-i measure, as well as two interview-based rating scales for alexithymia, and a range of self-report measures including alexithymia, personality, affect intensity, depression and psychosomatic complaints.
Internal consistency: Cronbach's alpha for the full scale was 0.96 with coefficients ranging from 0.81 to 0.94 for the factors. Construct validity: The correlations between EI and the additional scales generally were moderate, ranging from 0.32 to 0.83. In general, people with high EQ Total scores had low levels of negative affectivity and high levels of positive affectivity; were conscientious and agreeable; had fewer difficulties identifying and describing feelings; and were not prone to somatic symptomatology or increased somatic symptoms under stress.
One concern was that the Interpersonal scale had relatively small correlations with the other EQ composite scales, as well as a different pattern of convergent and discriminant validities.
The authors were unable to explain below-normal EQ-i scores in the study; however, the low scores should not have much impact on the observed convergent/discriminant validity.
For specific aspects of EI, the authors suggest using the 15 subscale scores instead of the 5 factors, which are generally more internally consistent.
Predictive validity: Both the STEU and STEM incrementally predicted students' psychology grades, and the STEU also incrementally predicted students' overall grades.
The validation had some issues. Further validation of the measures is needed, such as against the full MSCEIT scale. The author suggests that a video or audio based version (rather than text) would also be useful to determine whether relationships of EI to intelligence are due to cognitive processing of emotional information rather than to the verbal ability required to comprehend the text-based items.
Emails providing a link to an initial survey containing self-report measures of emotional labor strategies and personality traits were sent to all 209 full-time employees. Once this was completed, a second survey was sent assessing emotion regulation (EI) knowledge (on average completed 3 weeks later). Employees were assessed on their emotional regulation knowledge (measured by STEM), as well as measures such as emotional labor strategies, voice and performance evaluation, helping and extraversion.
Internal consistency: Cronbach's alpha for the STEM was reported at 0.73.
The correlational nature of the study makes it difficult to rule out alternative explanations for the relationships or to infer causality. Additionally, because the employees were tested for their emotional regulation knowledge (STEM) after the other constructs, this may influence the causality direction or relationship. Contextual factors that may impact emotional regulation knowledge and strategies were also not measured in the study. Self-report measure.
Small sample size which could limit the generalizability.
The authors also noted that an organizational climate survey could be administered to assess whether the organizational climate affects how a registered nurse responds when faced with conflict.
The aim of the study was to determine: (1) performance of first-year pediatric residents in the delivery of bad news in a standardized patient (SP) setting; and (2)
This study presents a number of limitations.
There was a small sample size which may limit the generalizability of the findings. The low response rate (5% valid responses) may have resulted in more of a volunteer bias than is often encountered in survey research in organizations.
Further, due to the limited sample, the findings may be a function of organizational culture. Statistically speaking, the ESCI was completed by subordinates, so there could be an inflated effect due to common source.
Where multiple measures met the above criteria, they were compared on their performance on each criterion (i.e., a measure with a lot of research scored higher on the first criterion than a measure with a medium level of research). Table 1 summarizes these results. Please note that the Emotional and Social Competency Inventory (ESCI) by Boyatzis and Goleman (2007) has subscales that are also closely related to the ones listed in Table 1 (see the full technical manual at http://www.eiconsortium.org/pdf/ESCI_user_guide.pdf). The measure was developed primarily to predict and enhance performance at work and items are generally written to reflect workplace scenarios. Subscales from this test were not consistently chosen as the "best" measures because it has not had as extensive published research as the other tests. Most research using this measure has also used peer-ratings rather than self-ratings, which makes it difficult to compare with the majority of measures (this is not a weakness though). Nevertheless, it should be considered if cost is not an issue and there is a strong desire to utilize a test specifically developed for the workplace.

Qualifications and Training
Although our purpose in this paper is not to outline the necessary training or qualifications required to administer the set of tests/questionnaires reviewed, we feel it is important to make some comments on this. First, we recommend that all researchers and practitioners considering using one or more of these tests have a good understanding of the principles of psychological assessment. Users should understand the concepts of reliability, validity and the role of norms in psychological testing. There are many good introductory texts in this area (e.g., Kaplan and Saccuzzo, 2017). Furthermore, we recommend users have a good understanding of the limitations of psychological testing and assessment. When using EI measures to evaluate suitability of job applicants, these measures should form only part of the assessment process and should not be regarded as comprehensive information about applicants. Finally, some of the tests outlined in this review require specific certification and/or qualifications. Certification and/or qualification is required for administrators of the ESCI, MSCEIT, and EQ-i 2.0.
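For readers less familiar with the reliability statistics reported throughout this review, the following is a minimal sketch of how internal consistency (Cronbach's alpha) could be computed from item-level responses. It uses the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response data are hypothetical illustrations only; they are not taken from any of the measures or studies reviewed here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Return Cronbach's alpha for a list of respondents.

    `responses` is a list of rows, one per respondent, where each row
    holds that respondent's scores on the same k items.
    """
    n_respondents = len(responses)
    k = len(responses[0])
    if k < 2 or n_respondents < 2:
        raise ValueError("need at least 2 items and 2 respondents")

    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(k)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents answering 3 items on a 1-5 Likert scale.
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(scores)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together and so plausibly tap a common construct. In practice, users would typically rely on an established statistical package rather than hand-written code, and would interpret alpha alongside the other reliability and validity evidence discussed in this review.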

LITERATURE REVIEW
The final section of this article is a literature review of the 6 popular measures we have covered. We have included our review at the end of this article because we regard it as optional reading. We suggest that this section will be useful primarily for those seeking a more in depth understanding of the key studies underlying the various measures we have presented in earlier sections.
This literature review had two related aims. The first was to identify prominent EI measures used in the literature, as well as specifically in applied (e.g., health care) contexts. The emotional intelligence measures we included were those that measured both overall EI as well as more specific EI constructs common to multiple measures (e.g., those related to perceiving emotions in self and others, regulating emotions in self and others and strategically utilizing emotions). The second aim was to identify individual studies that have explored the validity and reliability of the specific emotional intelligence measures identified.

Inclusion Criteria
Four main inclusion criteria were applied to select literature: (a) focus on adult samples, (b) use of reputable, peer-reviewed journal articles, (c) use of an EI scale, and (d) where possible, use of a professional sample (e.g., health care professionals) rather than primarily student samples. The literature search therefore focused on empirical, quantitative investigations published in peer-reviewed journals. The articles reviewed therefore were generally methodologically sound and enabled a thorough analysis of some aspect of reliability or validity. We only reviewed articles published after 1990. Additionally, only papers in English were reviewed.

Sources
Papers were identified by conducting searches in the following electronic databases: PsycINFO, Medline, PubMed, CINAHL (Cumulative Index for Nursing and Allied Health Literature), EBSCOhost and Google Scholar. Individual journals, such as The Journal of Nursing Measurement and Psychological Assessment, were also scanned.

Search Terms
When searching for emotional intelligence scales and related literature, search terms included: trait emotional intelligence, ability emotional intelligence, emotional intelligence scales, mixed emotional intelligence and emotional intelligence measures. Some common EI facet titles (e.g., self-awareness, self-regulation/self-management, social awareness, and relationship management) were also entered as search terms; however, this revealed far less relevant literature than searches based on EI terms. To access studies using professionals we also used terms such as workplace, healthcare, and nursing, along with emotional intelligence.
When searching for literature on the identified scales, the name of the respective scale was included in the search term (such as TEIQue scale) and the authors' names, along with terms such as workplace, organization, health care, nurses, health care professionals, to identify specific studies with a professional employee sample that utilized the specific scale. The terms validity and reliability were also used. Additionally, a similar search was conducted on articles that had cited the original papers. This search was conducted using Google Scholar.
Table 2 summarizes the result of the first part of the literature review. It provides an overview of major Emotional Intelligence assessment measures, in terms of when they were developed, who developed them, what form of EI they measure, theoretical basis, test length and details regarding cost.
Tables 3-8 summarize research on the validity and reliability of the 6 tests included in Table 2. In these tables we summarize the methodology used in major studies assessing reliability and validity as well as the results from these studies.
Collectively, these tables indicate that all 6 of the measures we reviewed have received some support for their reliability and validity. Measures with extensive research include the MSCEIT, SREIT, TEIQue, and EQ-i, and those with less total research are the STEU/STEM and ESCI. Existing research does not indicate that these latter measures are any less valid or reliable than the others; on the contrary they are promising measures but require further tests of reliability and validity. As noted previously, these tables confirm that the tests with the strongest current evidence for construct and predictive validity are the self-report/trait EI measures (TEIQue, EQ-i, and SREIT). We note that although there is evidence for construct validity of the SREIT based on associations with theoretically related constructs (e.g., alexithymia, optimism; see Table 4), some have suggested the measure is problematic due to its use of self-report questions that primarily measure ability based constructs (see Petrides and Furnham, 2000).

CONCLUSION
In this article we have reviewed six widely used measures of EI and made recommendations regarding their appropriate use. This article was written primarily for researchers and practitioners who are not currently experts on EI and therefore we also clarified the difference between ability EI, trait EI and mixed EI. Overall, we recommend that users should use single, complete tests where possible and choose measures of EI most suitable for their purpose (i.e., choose ability EI when maximal performance is important and trait EI when typical performance is important). We also point out that, across the majority of emotion-related outcomes, trait EI tends to be a stronger predictor and consequently we suggest that new users of EI consider using a trait-based measure before assessing alternatives. The exception is in employment contexts where tests utilizing 360-degree assessment (primarily mixed measures) can also be very useful.