Original Research ARTICLE

Front. Psychol., 02 October 2018 | https://doi.org/10.3389/fpsyg.2018.01834

Using Item Response Theory for the Development of a New Short Form of the Eysenck Personality Questionnaire-Revised

Daiana Colledani*, Pasquale Anselmi and Egidio Robusto
  • Department of Philosophy, Sociology, Education and Applied Psychology, School of Psychology, University of Padova, Padova, Italy

The present work aims at developing a new version of the short form of the Eysenck Personality Questionnaire-Revised, which includes Psychoticism, Extraversion, Neuroticism, and Lie scales (48 items, 12 per scale). The work consists of two studies. In the first one, an item response theory model was estimated on the responses of 590 individuals to the full-length version of the questionnaire (100 items). The analyses allowed the selection of 48 items that discriminate well, are well distributed along the latent continuum of each trait, and show neither misfit nor differential item functioning. In the second study, the functioning of the new form of the questionnaire was evaluated in a different sample of 300 individuals. Results of the two studies show that the reliability of the four scales is better than, or equal to, that of the original forms. The new version outperforms the original one in approximating scores of the full-length questionnaire. Moreover, convergent validity coefficients and relations with clinical constructs were consistent with the literature.

Introduction

In the view of Eysenck (see Eysenck and Eysenck, 1975, 1991), the structure of personality may be effectively described by three main traits: psychoticism (P), extraversion (E), and neuroticism (N). These dimensions are also known as the “Giant Three” and represent basic, independent, and biologically founded traits. They characterize all subjects, to varying degrees, and allow for effectively describing behavioral, emotional, and individual differences among adults and young people. According to the authors, PEN traits do not represent pathological dimensions in themselves, but could lead to the development of abnormal conditions only in particular situations (Eysenck and Eysenck, 1991). In this perspective, neurosis and psychosis should be conceived as pathological exaggerations of the underlying traits of neuroticism and psychoticism (Eysenck and Eysenck, 1991; Mor, 2010).

Extraversion and neuroticism were the first two dimensions included in Eysenck's model and were conceptualized as orthogonal continua (Eysenck and Eysenck, 1964, 1991). The neuroticism dimension describes a trait opposed to emotional stability, and defines the degree to which a person is predisposed to experience negative affect (Eysenck and Eysenck, 1964, 1991; Mor, 2010). Individuals with high levels of this trait tend to be worried, apprehensive, moody, fed-up, and irritable (Eysenck and Eysenck, 1991; Eysenck and Barrett, 2013). Extraversion is the second dimension included in the model and depicts sociable, carefree, friendly, convivial, easygoing, and impulsive individuals. This trait is opposed to introversion which, in contrast, describes introspective, quiet, serious, and reserved individuals (Eysenck and Eysenck, 1975, 1991; Eysenck and Barrett, 2013). The third dimension included in Eysenck's model was psychoticism, or toughmindedness. The typical toughminded individual is hostile, aggressive, untrusting, cold, unemotional, rude, lacking in human feelings, and unfriendly. At the opposite pole of the continuum are individuals with a well-adjusted personality, who are agreeable, empathic, tolerant, conscientious, open-minded, friendly, and warm (Eysenck and Eysenck, 1975, 1991; Eysenck and Barrett, 2013).

Over the years, a series of instruments has been developed for the assessment of PEN traits on both young and adult people (e.g., Eysenck and Eysenck, 1964, 1975; Eysenck et al., 1985). These instruments also included a Lie (L) scale, which measures dissimulation and the tendency to deceive (Eysenck and Eysenck, 1964). Several contributions have been offered for the refinement of the psychometric properties of Eysenck's questionnaires, as well as for the development of brief versions (Eysenck et al., 1985; Francis and Pearson, 1988; Corulla, 1990; Francis et al., 1992; Francis, 1996). The psychometric properties and factor structure of all these instruments have been investigated in cross cultural research (e.g., Hosokawa and Ohyama, 1993; Maltby and Talley, 1998; Forrest et al., 2000; Qian et al., 2000; Scholte and De Bruyn, 2001; Aluja et al., 2003; Alexopoulos and Kalaitzidis, 2004; Dazzi et al., 2004; Francis et al., 2006; Karanci et al., 2006; Tiwari et al., 2009; Picconi et al., 2018). Unidimensionality of N and L scales has been widely supported in literature (e.g., Lajunen and Scherler, 1999; Ferrando, 2001; Ferrando and Chico, 2001; Ferrando and Anguiano-Carrasco, 2009; Dazzi, 2011). Contrasting results have been found concerning E scale: There are several studies supporting the unidimensionality of this scale (e.g., Rocklin and Revelle, 1981; Ferrando and Chico, 2001; Dazzi, 2011), but there is also some evidence suggesting the presence of two dimensions (Eysenck and Eysenck, 1963; Vidotto et al., 2008). Finally, there is large agreement in the literature that P scale comprises different facets (e.g., Howarth, 1986; Roger and Morris, 1991), which nevertheless contribute to a unique dimension (Chico and Ferrando, 1995; Dazzi, 2011).

Eysenck's instruments have been extensively employed for clinical, forensic, educational, and organizational purposes (e.g., Nyborg, 1997; Judge et al., 2000; Wood and Newton, 2003; Laidra et al., 2007; Smillie et al., 2009; Almiro et al., 2016), and all scales showed significant relations with a variety of psychologically and clinically relevant constructs and behaviors. Research, for instance, suggests that individuals with high levels of neuroticism may experience symptoms of anxiety and depression (e.g., Eysenck, 1991; Saklofske et al., 1995; del Barrio et al., 1997; Dazzi et al., 2004; Jylhä and Isometsä, 2006), and may also be more likely exposed to stress and health problems (e.g., Denney and Frisch, 1981; Huang et al., 2015; Bergomi et al., 2017). In contrast, extraversion appears to be mainly linked to adaptive social behavior, mental well-being, happiness, and life satisfaction (e.g., Lu, 1995; Mor, 2010; Gale et al., 2013). Moreover, this trait has been found to be negatively related to symptoms of anxiety and depression, to self-reported mental disorder and to health care use for psychiatric reasons (e.g., del Barrio et al., 1997; Jylhä and Isometsä, 2006). Finally, psychoticism has been often cited in relation to inappropriate social behaviors, such as unsafe sexual habits, heavy drinking, criminal behavior, dysfunctional impulsivity, gambling, and drug abuse (e.g., Barnes et al., 1984; Blaszczynski et al., 1985; Bogaert, 1993; Lodhi and Thakur, 1993; Francis, 1996; Conrad et al., 1997; Grau and Ortet, 1999; Hoyle et al., 2000; Chico et al., 2003; Heaven et al., 2004; Gudgeon et al., 2005; Colledani, 2018).

The short form of the Eysenck Personality Questionnaire-Revised (EPQ-R; Eysenck et al., 1985; Eysenck and Eysenck, 1991) includes 48 items (out of 100 of the EPQ-R), 12 per each of the four dimensions. This version of the instrument has been translated in several languages and is widely used, across different countries, for scientific and clinical purposes (Hosokawa and Ohyama, 1993; Aluja et al., 2003; Alexopoulos and Kalaitzidis, 2004; Dazzi et al., 2004; Francis et al., 2006; Tiwari et al., 2009; Sanavio et al., 2013). However, it suffers from the same drawbacks of the full-length version. In particular, P scale exhibited poor reliability with a restricted range of scores and a strong positive skewness (Bishop, 1977; Block, 1977; Claridge, 1981; Hosokawa and Ohyama, 1993; Katz and Francis, 2000; Alexopoulos and Kalaitzidis, 2004). In addition, several items showed differential item functioning (DIF) across gender (Eysenck et al., 1985; Eysenck and Eysenck, 1991; Lynn and Martin, 1997; Forrest et al., 2000; Karanci et al., 2006; Escorial and Navas, 2007), which makes the comparison between groups questionable.

A better selection of the items from the full-length version of the instrument could allow for reducing some of the aforementioned drawbacks. The present work aims at developing a new version of the short form of the EPQ-R with improved psychometric properties.

Item response theory (IRT; Bock, 1997; Thissen and Steinberg, 2009) is one of the most promising approaches to this aim. There are several successful applications of IRT for the development and validation of measurement scales (see Da Dalt et al., 2013, 2015; Balsamo et al., 2014; Anselmi et al., 2015; Zanon et al., 2016; Sotgiu et al., 2018). Moreover, compared with classical test theory, IRT was found to provide more diagnostic information useful for the development of brief scales (Spence et al., 2012; Bortolotti et al., 2013; Petrillo et al., 2015). IRT allows for identifying the items that are best at discriminating different levels of the latent trait of interest, while ensuring that the entire trait continuum is covered. Selecting these items can result in a brief version of the scale that produces scores very similar to those obtained with the full-length scale and has the same external validity (i.e., the same correlations with other constructs; Reise and Henson, 2000; Spence et al., 2012). Moreover, IRT allows for detecting items that are unclear, ambiguous, or which exhibit DIF. These items should not be included in the brief scale. Despite the advantages offered by IRT, only a few studies have employed this approach for the refinement of Eysenck's instruments (e.g., Ferrando, 2001; Ferrando and Chico, 2001; Escorial and Navas, 2007; Maij-de Meij et al., 2008). Recently, Colledani et al. (2018) used IRT for developing a new version of the abbreviated form of the Junior EPQ-R (6 items per scale). The new version outperformed the original one in several respects.

This work includes two main studies. In Study 1, a series of analyses were performed on the responses to the full-length version of the EPQ-R in order to select the 48 items (12 per each scale) with the best psychometric properties. In Study 2, the functioning of the new short form was tested in a new data sample. Reliability, validity and factor structure were examined. Relationships of the new scales with social desirability, the dimensions of the Five Factor Model (FFM), and clinically relevant constructs were verified.

Study 1

Participants

A total of 590 participants took part in the study (mean age = 36.69 years, SD = 14.16; from 18 to 75 years; 55.8% females). They were recruited from different Italian regions through convenience sampling. All participants were native Italian speakers and completed the questionnaire anonymously and voluntarily. All standards for research with human subjects were respected. Written informed consent was obtained from the participants. The project was approved retrospectively by the Ethical Committee for the Psychological Research of the University of Padova, since prospective ethics approval was not required at the time when the research was conducted (Protocol n. 2622).

Instruments

The participants were presented with the Italian version of the EPQ-R (Dazzi et al., 2004; Dazzi, 2011). The instrument consists of 100 dichotomous items (yes/no), 32 for P scale (e.g., “Should people always respect the law?,” “Do you enjoy hurting people you love?”), 23 for E scale (e.g., “Do you enjoy meeting new people?,” “Can you get a party going?”), 24 for N scale (e.g., “Would you call yourself a nervous person?,” “Are you often troubled about feelings of guilt?”), and 21 for L scale (e.g., “Are all your habits good and desirable ones?,” “Have you ever cheated at a game?”). The questionnaire was administered individually in paper-and-pencil format.

The Italian version of the questionnaire has good reliability and the four-factor structure was confirmed (α = 0.67, 0.78, 0.85, and 0.75 for P, E, N, and L scales, respectively; Dazzi et al., 2004; Dazzi, 2011). The reliability found in the current sample (α = 0.60, 0.79, 0.85, and 0.77 for P, E, N, and L scales) is in line with literature.

Studies in the Italian context have also tested the factor structure and the psychometric characteristics of the short version of the instrument (Dazzi et al., 2004). Consistent with cross-cultural findings, results supported the four-factor structure of the instrument and showed satisfactory reliability coefficients for E, N, and L scales, but a lower one for P (α = 0.37, 0.77, 0.83, and 0.70 for P, E, N, and L, respectively; Dazzi et al., 2004). The reliability found in the current sample (α = 0.40, 0.73, 0.83, and 0.73 for P, E, N, and L scales) is in line with the literature.

Analysis Strategy

The two-parameter logistic (2PL) model (see Thissen and Steinberg, 2009) was separately estimated on the responses to each of the four scales of the questionnaire. This model describes the probability that a subject endorses a certain item as a function of the latent trait level of the subject (parameter θ), the “endorsability” level of the item (i.e., the ease of providing a “yes” response to that item; parameter ε), and the capability of the item in differentiating subjects with different trait levels (parameter δ). In the case of the P scale, for instance, the greater the value of parameter θ, the greater the level of psychoticism of the subject; the greater the value of parameter ε, the greater the ease of responding “yes” to the item (i.e., of providing a response that is indicative of the presence of psychoticism); the greater the value of parameter δ, the greater the capability of the item in differentiating between subjects with different levels of psychoticism. All the analyses were run using the packages “difR” (Magis et al., 2016) and “ltm” (Rizopoulos, 2012) for the statistical environment R (R Core Team, 2016).
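
As a minimal illustration of the model described above, the sketch below computes the endorsement probability under one common easiness-based form of the 2PL, logit P = δ(θ + ε). The text does not state the exact parameterization used by the authors or by “ltm”, so this functional form, the function name `p_endorse`, and the parameter values are assumptions chosen only for illustration.

```python
import math

def p_endorse(theta, easiness, discrimination):
    """Probability of a "yes" response under an ASSUMED 2PL form:
    P = logistic(discrimination * (theta + easiness))."""
    z = discrimination * (theta + easiness)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical item parameters (not taken from the paper's tables).
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(p_endorse(theta, easiness=-1.0, discrimination=1.5), 3))
```

Under this form, a larger ε shifts the curve so that a “yes” becomes more likely at every trait level, while a larger δ makes the probability change more steeply around θ = −ε.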

The 2PL assumes unidimensionality of the scales. Confirmatory factor analyses were run on the data of each of the four scales (for a reasonable fit, CFI ≥ 0.90, RMSEA < 0.08; see Hu and Bentler, 1999; Marsh et al., 2004; Brown, 2006). These analyses confirmed the unidimensionality of N [χ²(252) = 1046.791, p ≤ 0.001; RMSEA = 0.073; CFI = 0.919] and L [χ²(189) = 532.901, p ≤ 0.001; RMSEA = 0.056; CFI = 0.900]. Fit indices of E scale were close to acceptance [χ²(230) = 808.417, p ≤ 0.001; RMSEA = 0.065; CFI = 0.890]. The unidimensional model did not fit the data of P scale [χ²(464) = 1841.233, p ≤ 0.001; RMSEA = 0.071; CFI = 0.467]. An exploratory factor analysis on this scale suggests a four-factor solution with 7 items out of 32 exhibiting cross-loadings. In line with literature (e.g., Howarth, 1986; Roger and Morris, 1991; Chico and Ferrando, 1995; Dazzi, 2011), this result confirms that P scale defines a complex and multifaceted construct.

Item Selection for the New Short Scales

DIF and item fit statistics were used to identify the items with the poorest psychometric properties, which were then excluded from the new short scales.

Three item fit statistics were used: infit, outfit (Wright and Masters, 1982), and the index suggested by Bock (1972). Infit and outfit are two χ²-based statistics, the former being effective in detecting unexpected responses to items close to a subject's trait level, the latter being effective in detecting unexpected responses to items far from the subject's trait level. In this work, items with infit and/or outfit higher than 1.4 (Wright and Linacre, 1994) were considered misfitting and not included in the new short scales. The index suggested by Bock involves grouping subjects into n categories on the basis of their latent trait level and comparing, for each group, the observed and expected proportions of subjects endorsing the item (Bock, 1972; Reise, 1990). In this work, subjects were grouped into four categories and the items which displayed a medium (0.3 ≤ Φ < 0.5) to large (Φ ≥ 0.5) effect size (Cohen, 1988) were not selected for inclusion in the new questionnaire.
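
To make the infit and outfit criteria concrete, the sketch below computes the two mean-square statistics for a single item from observed 0/1 responses and model-expected probabilities, using the usual definitions (outfit as the mean squared standardized residual, infit as the information-weighted mean square). The function names and the example data are hypothetical, and the expected probabilities are assumed to come from an already fitted model.

```python
def infit_outfit(responses, probs):
    """Mean-square infit and outfit for one item.

    responses: 0/1 answers of each subject to the item.
    probs: model-expected endorsement probabilities for the same subjects.
    """
    sq_resid = [(x - p) ** 2 for x, p in zip(responses, probs)]
    variances = [p * (1.0 - p) for p in probs]
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(responses)
    # Infit: squared residuals weighted by the response variances.
    infit = sum(sq_resid) / sum(variances)
    return infit, outfit

def is_misfitting(infit, outfit, cutoff=1.4):
    """Flag an item when infit and/or outfit exceeds the cutoff (1.4 in the text)."""
    return infit > cutoff or outfit > cutoff

# Hypothetical responses and expected probabilities for four subjects.
infit, outfit = infit_outfit([1, 0, 1, 0], [0.8, 0.2, 0.6, 0.1])
print(round(infit, 3), round(outfit, 3), is_misfitting(infit, outfit))
```

Because outfit divides each squared residual by its own variance, a single surprising response from a subject whose expected probability is near 0 or 1 inflates it strongly, which is why it is sensitive to responses far from the subject's trait level.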

Items exhibiting gender DIF were also excluded from the new questionnaire. Both uniform and non-uniform DIF were considered. The former is a systematic bias expressing a different probability of endorsing an item for the members of a specific group. The latter is a non-systematic bias which varies with the latent trait level. Females were used as the reference group. Effect sizes of uniform and non-uniform DIF were evaluated by the R² difference test (Nagelkerke, 1991; Gómez-Benito et al., 2009), with values higher than 0.035 denoting moderate DIF and values higher than 0.07 denoting strong DIF (Jodoin and Gierl, 2001; Magis et al., 2016).
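
The effect-size rule above can be sketched as follows. The code computes Nagelkerke's R² from the log-likelihoods of an intercept-only and a fitted logistic model, and then classifies the difference in R² between two nested models (for example, with and without a group term) using the 0.035 and 0.07 thresholds from the text. It assumes the log-likelihoods have already been obtained from fitted models; the function names, the example log-likelihoods, and the label "negligible" for values at or below 0.035 are illustrative choices, not the authors' code.

```python
import math

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke's R^2 from the log-likelihoods of the null and fitted models."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell

def classify_dif(delta_r2, moderate=0.035, strong=0.07):
    """Classify a Delta R^2 effect size with the thresholds given in the text."""
    if delta_r2 > strong:
        return "strong"
    if delta_r2 > moderate:
        return "moderate"
    return "negligible"

# Hypothetical log-likelihoods for two nested logistic models of one item (n = 590).
r2_without_group = nagelkerke_r2(-400.0, -350.0, 590)
r2_with_group = nagelkerke_r2(-400.0, -340.0, 590)
print(classify_dif(r2_with_group - r2_without_group))
```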

Parameters ε and δ were examined to select, among the remaining items, those that cover the entire trait continuum and have the greatest discrimination.

Assessment of the Psychometric Characteristics of the New Short Scales

Reliability and validity of the newly developed PEN-L scales were evaluated and compared with those of the original short scales. Reliability was evaluated through Cronbach's α and the test information function (TIF). TIF tells us how well the test measures the latent trait levels over the entire range of interest (Baker, 2001; Petrillo et al., 2015). The larger the value of TIF, the greater the accuracy with which the latent trait levels are measured. TIF depends on the latent trait range under consideration and on the number of items in the test (Baker, 2001). In this work, the old and new short scales had the same length (12 items), and TIF was defined on the same range of latent trait levels (−5 to 5). Validity was evaluated using a bias index and the correlation between scores obtained with full-length and short scales. The bias index was computed as the average difference (in absolute terms) between the parameters θ estimated on the full-length scales and those estimated on the short scales. Low biases suggest that the latent trait estimates obtained with the short scales approximate those of the full-length versions. In addition, the correlations between scores obtained with the full-length and short scales were computed and corrected for common items using Levy's (1967) method.
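
The two indices described above can be sketched in a few lines. The code assumes the same easiness-based 2PL form used for illustration elsewhere, logit P = δ(θ + ε), for which the information of an item is δ²P(1 − P), and it assumes that a single TIF value corresponds to the area under the test information curve on the range −5 to 5. Both assumptions, the function names, and the example values are ours; the text does not spell out these computational details.

```python
import math

def p_endorse(theta, easiness, discrimination):
    """ASSUMED 2PL form: logistic(discrimination * (theta + easiness))."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta + easiness)))

def information_at(theta, items):
    """Test information at theta: sum of delta^2 * P * (1 - P) over items."""
    total = 0.0
    for easiness, discrimination in items:
        p = p_endorse(theta, easiness, discrimination)
        total += discrimination ** 2 * p * (1.0 - p)
    return total

def tif_area(items, lo=-5.0, hi=5.0, steps=1000):
    """Trapezoidal area under the test information curve on [lo, hi]."""
    h = (hi - lo) / steps
    values = [information_at(lo + k * h, items) for k in range(steps + 1)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def bias_index(theta_full, theta_short):
    """Average absolute difference between full-length and short-form estimates."""
    diffs = [abs(f - s) for f, s in zip(theta_full, theta_short)]
    return sum(diffs) / len(diffs)

# Hypothetical (easiness, discrimination) pairs and trait estimates.
items = [(-1.0, 1.2), (0.0, 0.8), (1.5, 1.6)]
print(round(tif_area(items), 2))
print(round(bias_index([0.5, -1.0, 0.2], [0.3, -0.6, 0.2]), 2))
```

Under these assumptions each item contributes at most its discrimination δ to the area, so selecting highly discriminating items whose information falls inside the range raises the TIF value directly.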

Results

Three of the 32 items of P scale exhibited uniform and non-uniform gender DIF of moderate (Items 68 and 91) or strong (Item 12) size. Fit statistics were adequate for all the items. From the remaining 29 items, 12 were selected taking into account their parameters ε and δ. This resulted in a new short scale that differed from the original one in eight items (see Table 1). Specifically, Item 91 was replaced because it showed uniform and non-uniform gender DIF of moderate size. These modifications yielded a new scale with increased reliability (α increased from 0.40 to 0.62; TIF increased from 8.13 to 12.86) and with scores that better approximate those obtained with the full-length scale (bias decreased from 0.37 to 0.18, corrected correlation increased from 0.47 to 0.52). It is worth noting that Cronbach's α of the new 12-item scale (0.62) closely resembles that of the full 32-item scale (0.60).

Table 1. Easiness (ε) and discrimination (δ) parameters for the 32 items of the Psychoticism scale.

Regarding the 23 items of E scale, only Item 55 exhibited uniform gender DIF of moderate size and no item showed misfit. Selecting 12 items on the basis of their parameters ε and δ, we obtained a new E scale that differed from the original one in three items (see Table 2). The differences in reliability and validity between the new and original scales were small but favored the new version (α increased from 0.73 to 0.75; TIF increased from 16.62 to 16.83; bias decreased from 0.21 to 0.19; corrected correlation increased from 0.74 to 0.77).

Table 2. Easiness (ε) and discrimination (δ) parameters for the 23 items of the Extraversion scale.

Concerning N and L scales, no item exhibited gender DIF or misfit. Therefore, items were selected considering their ε and δ parameters. For both scales, the new versions differed from the original ones in two items (see Tables 3, 4). Item 35 was present in the previous version of the N scale but was not included in the new one because of its redundant content. Reliability of the new scales closely resembles that of the original versions (α = 0.83, 0.82; TIF = 20.86, 20.80 for original and new N scale, respectively; α = 0.73, 0.74; TIF = 13.86, 14.15 for original and new L scale, respectively). Concerning N scale, a slight decrease of bias was observed (from 0.22 to 0.16). The other indexes remained substantially unchanged (bias = 0.20, 0.18 for original and new L scale, respectively; corrected correlation = 0.74, 0.75 for original and new L scale, respectively; 0.83, 0.84, for original and new N scale, respectively).

Table 3. Easiness (ε) and discrimination (δ) parameters for the 24 items of the Neuroticism scale.

Table 4. Easiness (ε) and discrimination (δ) parameters for the 21 items of the Lie scale.

Discussion

This study aimed at developing a new short version of the EPQ-R with improved psychometric characteristics. IRT-based statistics allowed the identification of 48 items that show no gender DIF or misfit, discriminate well, and are well distributed along the four latent trait continua. The new version of the P scale differs from the original one in eight items (out of 12), E scale in three, and N and L in only two. The largest improvement was reached for P scale, which in literature was found to perform less well than the other three scales (e.g., Bishop, 1977; Block, 1977; Claridge, 1981). In particular, the new version is not affected by gender DIF and outperforms the original one in reliability and in approximating the scores obtained with the full-length form. The new versions of the other three scales performed as well as, or slightly better than, the original ones. Although small in size, these improvements are valuable given that they were obtained by substituting a small number of items and reducing content redundancy.

Study 2

This study aimed at investigating the functioning of the new version of the short EPQ-R on a new data set. In addition to reliability and factor structure, construct validity was evaluated by taking into account relationships with social desirability, the dimensions of the FFM, and measures of anxiety and depression.

Participants

Participants were 300 native Italian speakers aged between 18 and 65 (mean age = 29.28, SD = 10.38; 60.2% females). They were recruited from different Italian regions using convenience sampling. All participants were presented with the new version of the short EPQ-R, whereas a subsample of 158 participants (mean age = 34.73, SD = 9.88; 68.7% females) also received the other measures. Participation in the study was anonymous and voluntary, and all standards for research with human subjects were respected. Written informed consent was obtained from the participants. The project was approved retrospectively by the Ethical Committee for the Psychological Research of the University of Padova, since prospective ethics approval was not required at the time when the research was conducted (Protocol n. 2622).

Instruments

The new form of the short EPQ-R devised in Study 1 was administered to all participants.

The five traits of the FFM of personality (i.e., extraversion, agreeableness, conscientiousness, emotional stability, and openness) were measured through the Italian version (Ubbiali et al., 2013; Chiorri et al., 2016) of the Big Five Inventory (BFI; John et al., 2008). The questionnaire consists of 44 items answered on a five-point Likert scale (from 1 “Strongly disagree” to 5 “Strongly agree”; e.g., “I see myself as someone who is full of energy” for extraversion; “I see myself as someone who is helpful and unselfish with others” for agreeableness; “I see myself as someone who perseveres until the task is finished” for conscientiousness; “I see myself as someone who worries a lot” for emotional stability; “I see myself as someone who is ingenious, a deep thinker” for openness). Convincing evidence was found concerning construct validity, factor structure, gender invariance, and reliability (α from 0.75 to 0.86; Ubbiali et al., 2013; Chiorri et al., 2016; α from 0.73 to 0.83 in the current sample).

The Impression Management (IM) scale of the Italian brief version (Bobbio and Manganelli, 2011) of the Balanced Inventory of Desirable Responding (BIDR; Paulhus, 1991) was also administered. The scale comprises 8 items answered on a six-point Likert scale (from 1 “Strongly disagree” to 6 “Strongly agree”) and assesses the conscious tendency of individuals to provide positively inflated self-descriptions (e.g., “I have never dropped litter on the street”). Internal consistency of the scale ranges from 0.73 to 0.81 (Bobbio and Manganelli, 2011; in the current sample, α = 0.75).

The trait scale of the State-Trait Anxiety Inventory (STAI-Y; Spielberger et al., 1983; Pedrabissi and Santinello, 1989) was used to evaluate anxiety. The scale comprises 20 items answered on a four-point Likert scale (from 1 “Not at all” to 4 “Very much”). The instrument evaluates the tendency of people to experience general anxiety and the relatively stable predisposition to view stressful situations as threatening (e.g., “I am regretful”). The Italian version of the questionnaire showed adequate validity and reliability (α from 0.85 to 0.90; Pedrabissi and Santinello, 1989; in the current sample, α = 0.92).

Finally, the Italian version of the Patient Health Questionnaire-9 (PHQ-9; Spitzer et al., 1999; Kroenke et al., 2001) was used to evaluate depressive symptoms. The questionnaire is a self-administered instrument and assesses the nine DSM-IV (American Psychiatric Association, 2000) criteria for depression. Respondents are asked to evaluate the presence of depressive symptoms over the last 2 weeks through nine items scored on a four-point Likert scale (from 0 “Not at all” to 3 “Nearly every day”; e.g., “Feeling tired or having little energy”). This instrument showed adequate reliability (α from 0.86 to 0.89), and good sensitivity and specificity (see Kroenke et al., 2001). In the current sample, α equals 0.81.

Analysis Strategy

Reliability of the new version of the short EPQ-R was tested through Cronbach's α. Construct validity was evaluated by computing convergent validity coefficients and by analyzing the factor structure of the instrument.
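
For reference, Cronbach's α for a scale of k items can be computed from the item variances and the variance of the total score as α = k/(k − 1) × (1 − Σ item variances / total-score variance). The sketch below applies this standard formula to a small, hypothetical matrix of dichotomous responses; the function name and data are illustrative only.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-subject item-score lists (e.g., 0/1)."""
    k = len(rows[0])
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

# Hypothetical responses of five subjects to three yes/no items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
]
print(round(cronbach_alpha(data), 3))
```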

Convergent validity was evaluated considering correlations between the four PEN-L traits, the five dimensions of FFM, social desirability, and indexes of depression and trait anxiety. In line with the literature, L scores are expected to positively correlate with the IM scale of the BIDR (e.g., Gillings and Joseph, 1996), while PEN traits are expected to correlate with BFI scales, depression and trait anxiety. In particular, positive correlations are expected between E scores of the EPQ-R and the extraversion measure of the BFI, while negative correlations are expected between P scale and agreeableness and conscientiousness. Positive correlations are also expected between N scale of the EPQ-R and the neuroticism measure of the BFI (e.g., McCrae and Costa, 1985; Draycott and Kline, 1995; Saggino, 2000; Barbaranelli et al., 2003; Scholte and De Bruyn, 2004; Heaven et al., 2013). Neuroticism, in addition, is expected to positively correlate with indexes of anxiety and depression (STAI-Y; Spielberger et al., 1983; PHQ-9; Spitzer et al., 1999; Kroenke et al., 2001). In contrast, extraversion is expected to negatively correlate with these two clinical indexes.

An Exploratory Structural Equation Model (ESEM; Asparouhov and Muthén, 2009) was run to evaluate the factor structure. The ESEM framework represents an integration of confirmatory factor analysis (CFA), structural equation modeling (SEM), and exploratory factor analysis (EFA). ESEMs give access to all the common statistics of SEM/CFA but, at the same time, overcome the restrictions associated with the confirmatory approach. CFA fixes non-target loadings to zero and, therefore, it may be inadequate to handle complex and multifaceted constructs where many cross-loadings may be expected (Marsh et al., 2009, 2010, 2011, 2014). When this is the case, fit problems and upward-biased estimates of correlations between factors can be observed (Cole et al., 2007; Marsh and Hau, 2007; Marsh et al., 2010). As in EFA, ESEMs allow for the free estimation of cross-loadings between items and non-target factors. In this work, ESEM was run using Mplus7 (Muthén and Muthén, 2012) with the WLSMV estimator (weighted least squares mean and variance-adjusted). This method is recommended for binary or ordinal observed data (e.g., Flora and Curran, 2004; Brown, 2006) such as the dichotomous items of the EPQ-R. In the model, the 48 items were the indicators and four factors were modeled. The GEOMIN oblique rotation was used. To evaluate the goodness of fit of the model, several fit indexes were considered: χ², Comparative Fit Index (CFI; Bentler, 1990), Weighted Root Mean Square Residual (WRMR; Yu, 2002), and Root Mean Square Error of Approximation (RMSEA; Browne and Cudeck, 1993) with its 90% confidence interval (90% CI) and the test of close fit (CFit; Browne and Cudeck, 1993). A solution fits the data well when χ² is non-significant (p ≥ 0.05). Since this statistic is sensitive to sample size, the other fit measures were also considered. In particular, a solution fits the data well when CFI is close to 0.95 (0.90 to 0.95 for reasonable fit), WRMR is close to 1.0, and RMSEA is smaller than 0.06 (0.06 to 0.08 for reasonable fit) with CFit non-significant (see Hu and Bentler, 1999; Marsh et al., 2004; Brown, 2006).
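
The CFI and RMSEA guidelines above can be written as a small helper. Turning "close to 0.95" into a hard cut-off of 0.95, the label "poor" for values outside the stated ranges, and the function name are simplifying assumptions of this sketch; the example values are the Study 1 CFA results reported for the N and P scales.

```python
def describe_fit(cfi, rmsea):
    """Label CFI and RMSEA using the guideline ranges given in the text."""
    if cfi >= 0.95:
        cfi_level = "good"
    elif cfi >= 0.90:
        cfi_level = "reasonable"
    else:
        cfi_level = "poor"
    if rmsea < 0.06:
        rmsea_level = "good"
    elif rmsea <= 0.08:
        rmsea_level = "reasonable"
    else:
        rmsea_level = "poor"
    return cfi_level, rmsea_level

print(describe_fit(0.919, 0.073))  # N scale, Study 1 CFA
print(describe_fit(0.467, 0.071))  # P scale, Study 1 CFA
```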

Results

Cronbach's α coefficients were 0.55, 0.80, 0.81, and 0.70 for P, E, N, and L scales, respectively. These values were consistent with those of Study 1. Compared with the original version, the largest improvement was reached for P scale, as observed in Study 1.

Convergent validity coefficients are reported in Table 5. All the four PEN-L traits correlated in the expected direction with the considered constructs. E scale showed a strong positive relation with the extraversion measure of the BFI (0.727). P scale was negatively related to agreeableness (−0.323) and conscientiousness (−0.321). N scale was strongly correlated with neuroticism (0.709). Relations with anxiety and depression were also in the expected directions. N scale showed positive relations with scores of PHQ-9 (0.619) and STAI-Y (0.697), while moderate negative relations were found between these two indexes and E scale (r = −0.409, −0.405 for PHQ-9 and STAI-Y, respectively). Finally, L scale showed a strong positive correlation with the IM scale of the BIDR.

Table 5. Cronbach's αs and correlations between the four PEN-L traits, STAI-Y, PHQ-9, BIDR-IM, and the five BFI dimensions.

Results of the ESEM supported the four-factor structure of the instrument {χ²(942) = 1122.686, p < 0.001; RMSEA = 0.025 [0.019, 0.031]; CFit ≅ 1.000; CFI = 0.930; WRMR = 0.864}. The model is represented in Table 6. All items loaded on the intended factor and cross-loadings were, in general, lower than those observed on the target-factor.

Table 6. Exploratory structural equation modeling.

Discussion

The analyses performed in this study provide further evidence concerning the adequate psychometric properties of the new short form of the EPQ-R. Concerning reliability, results are in line with those of Study 1 and confirm that, compared with the original version, the largest improvement was observed for P scale. Concerning validity, both the factor structure of the instrument and its convergent validity are supported.

Final Remarks

This work aimed at developing a new and improved version of the short form of the EPQ-R. This instrument is well-known and widely used in different settings. However, some weaknesses have been pointed out, especially for P scale (e.g., Bishop, 1977; Block, 1977; Claridge, 1981). An IRT approach was used to develop the new instrument. This approach allowed for removing items with misfit or gender DIF, and for identifying the items that best discriminated between different levels of the traits, while ensuring that the respective continua were covered. As suggested in the literature, following these criteria for item selection should lead to a short scale with the same psychometric properties as the full-length instrument (Reise and Henson, 2000; Spence et al., 2012). In fact, results of this work show that the new short form of the EPQ-R approximated the scores obtained with the full-length form better than the original short version did. In addition, convergent validity of the new scale was consistent with the literature (e.g., Saklofske et al., 1995; Gillings and Joseph, 1996; del Barrio et al., 1997; Dazzi et al., 2004; Jylhä and Isometsä, 2006; Mor, 2010). The moderate to strong relationships between Eysenck's traits and clinical constructs provide further evidence of the usefulness of assessing these traits in clinical settings.
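The two selection criteria mentioned above (discrimination and coverage of the latent continuum) can be illustrated with the item information function. The sketch below assumes a two-parameter logistic (2PL) model purely for illustration; the item parameters are hypothetical and are not estimates from this study.

```python
import math

def prob_2pl(theta, a, b):
    """Probability of endorsing an item under a 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical items: (discrimination a, location b)
items = {"weak": (0.6, 0.0), "low_trait": (1.8, -1.0), "high_trait": (1.8, 1.5)}

# Information is maximal at theta = b, where it equals a^2 / 4
peaks = {name: item_information(b, a, b) for name, (a, b) in items.items()}
for name, peak in peaks.items():
    print(name, round(peak, 2))
```

In this toy example, the two highly discriminating items are far more informative than the weak one, and their different locations let them cover different regions of the trait continuum, which is the rationale for retaining items that are both discriminating and well spread out.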

A strength of the present work is that it provides a solution to some well-known drawbacks of the full-length EPQ-R and of its existing short form (Eysenck et al., 1985; Eysenck and Eysenck, 1991). The largest improvement was obtained for P scale. The new version is not affected by gender DIF and outperforms the original one for reliability and approximation of the full-length form. The new versions of the other three scales performed as well as the original ones, or slightly better. These improvements are small in size, yet notable considering that they were obtained by substituting a small number of items and reducing content redundancy.

In the present work, separate analyses have been performed on each of the four scales by using a unidimensional IRT model. An alternative could have been examining the four scales at once through a multidimensional IRT (MIRT) model (see Haberman et al., 2008; Reckase, 2009). MIRT models offer some advantages over unidimensional IRT models. They could allow for a better understanding of the traits measured by an instrument and of how well individual items measure each of them (Ackerman, 1994). Moreover, MIRT models could provide a more precise estimation of scale reliability (Cheng et al., 2009) and item parameters (Finch, 2010). In the present work, some of these advantages are not very relevant. On the one hand, the factor structure of the EPQ-R has been widely tested and validated in the literature (e.g., Hosokawa and Ohyama, 1993; Maltby and Talley, 1998; Forrest et al., 2000; Qian et al., 2000; Scholte and De Bruyn, 2001; Aluja et al., 2003; Alexopoulos and Kalaitzidis, 2004; Dazzi et al., 2004; Francis et al., 2006; Karanci et al., 2006; Tiwari et al., 2009; Picconi et al., 2018). On the other hand, for scales whose length is analogous to that of the four EPQ-R scales (i.e., from 21 to 32 items), the unidimensional IRT models have been found to provide item parameter estimates whose precision exceeds or equals that of the estimates produced by the MIRT models (Finch, 2010). Finch (2010) investigated the precision of MIRT estimates on tests measuring a number of traits as small as two. For larger numbers of traits (e.g., the four traits of the EPQ-R), the number of parameters of a MIRT model increases considerably. Thus, the sample size of Study 1 (590 individuals) might not have been adequate for performing a multidimensional analysis.
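To give a rough sense of how the number of parameters grows, the sketch below counts item parameters under two assumed parameterizations: a unidimensional model with one slope and one intercept per item, and an exploratory compensatory MIRT model with one slope per item and dimension, one intercept per item, and the usual D(D − 1)/2 identification constraints for orthogonal factors. Both parameterizations and the function names are assumptions made for illustration; the counts are not taken from the original analyses.

```python
def params_unidimensional(n_items):
    """One slope and one intercept per item."""
    return 2 * n_items

def params_exploratory_mirt(n_items, n_dims):
    """Slopes on every dimension plus intercepts, minus identification constraints."""
    slopes = n_items * n_dims
    intercepts = n_items
    constraints = n_dims * (n_dims - 1) // 2
    return slopes + intercepts - constraints

# Longest single EPQ-R scale (32 items) analyzed on its own
longest_scale = params_unidimensional(32)
# All 100 items of the full-length questionnaire analyzed jointly on four traits
joint_model = params_exploratory_mirt(100, 4)
print(longest_scale, joint_model)  # 64 494
```

Under these assumptions, the joint four-dimensional analysis would require estimating several hundred parameters at once from 590 respondents, whereas each separate unidimensional analysis involves at most 64.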

Concerning P scale, despite notable improvements, reliability remains rather low. This result, however, was expected. P scale, perhaps because of its complex and clinical nature, is the most problematic and controversial scale of the instrument (e.g., Eysenck et al., 1985). Future research, therefore, should try to develop a new pool of items that is effective in capturing the multifaceted aspects of this trait.

In the present work, a new short version of the EPQ-R has been devised, which consists of 12 items for each of the four scales. An abbreviated form consisting of only 6 items per scale also exists in the literature (Francis et al., 1992). This abbreviated form suffers from the same weaknesses that have been pointed out for the other Eysenck questionnaires. Future research should try to devise a new version of the abbreviated form by using the IRT approach.

Data Availability Statement

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

Author Contributions

DC contributed to the conception and design of the study, conducted the research, performed the statistical analyses, and wrote the first draft of the manuscript. DC and PA wrote sections of the manuscript. All authors contributed to manuscript revision, read and approved the submitted version.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Ackerman, T. A. (1994). Using multidimensional item response theory to understand what items and tests are measuring. Appl. Meas. Educ. 7, 255–278. doi: 10.1207/s15324818ame0704_1

Alexopoulos, D. S., and Kalaitzidis, I. (2004). Psychometric properties of Eysenck Personality Questionnaire-Revised (EPQ-R) short scale in Greece. Pers. Indiv. Differ. 37, 1205–1220. doi: 10.1016/j.paid.2003.12.005

Almiro, P. A., Moura, O., and Simões, M. R. (2016). Psychometric properties of the European Portuguese version of the Eysenck Personality Questionnaire—Revised (EPQ-R). Pers. Indiv. Differ. 88, 88–93. doi: 10.1016/j.paid.2015.08.050

Aluja, A., Garcia, Ó., and Garcia, L. F. (2003). A psychometric analysis of the revised Eysenck Personality Questionnaire short scale. Pers. Indiv. Differ. 35, 449–460. doi: 10.1016/S0191-8869(02)00206-4

American Psychiatric Association (2000). Diagnostic and Statistical Manual of Mental Disorders (4th Ed.). Washington, DC: American Psychiatric Association.

Anselmi, P., Vidotto, G., Bettinardi, O., and Bertolotti, G. (2015). Measurement of change in health status with Rasch models. Health Qual. Life Out. 13:16. doi: 10.1186/s12955-014-0197-x

Asparouhov, T., and Muthén, B. (2009). Exploratory structural equation modeling. Struct. Equ. Modeling 16, 397–438. doi: 10.1080/10705510903008204

Baker, F. B. (2001). The Basics of Item Response Theory (2nd Ed.). Maryland: ERIC Clearinghouse on Assessment and Evaluation.

Balsamo, M., Giampaglia, G., and Saggino, A. (2014). Building a new Rasch-based self-report inventory of depression. Neuropsych. Dis. Treat. 10, 153–165. doi: 10.2147/NDT.S53425

Barbaranelli, C., Caprara, G. V., Rabasca, A., and Pastorelli, C. (2003). A questionnaire for measuring the Big Five in late childhood. Pers. Indiv. Differ. 34, 645–664. doi: 10.1016/S0191-8869(02)00051-X

Barnes, G. E., Malamuth, N. M., and Check, J. V. (1984). Personality and sexuality. Pers. Indiv. Differ. 5, 159–172.

Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychol. Bull. 107, 238–246.

Bergomi, M., Modenese, A., Ferretti, E., Ferrari, A., Licitra, G., Vivoli, R., et al. (2017). Work-related stress and role of personality in a sample of Italian bus drivers. Work 57, 433–440. doi: 10.3233/WOR-172581

Bishop, D. V. (1977). The P scale and psychosis. J. Abnorm. Psychol. 86, 127–134.

Blaszczynski, A. P., Buhrich, N., and McConaghy, N. (1985). Pathological gamblers, heroin addicts and controls compared on the EPQ ‘Addiction Scale’. Br. J. Addict. 80, 315–319.

Block, J. (1977). P scale and psychosis: continued concerns. J. Abnorm. Psychol. 86, 431–434.

Bobbio, A., and Manganelli, A. M. (2011). Measuring social desirability responding. A short version of Paulhus' BIDR 6. TPM 18, 117–135. doi: 10.4473/TPM.18.2.4

Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika 37, 29–51. doi: 10.1007/BF02291411

Bock, R. D. (1997). A brief history of item response theory. Educ. Meas. Issues Pract. 16, 21–33. doi: 10.1111/j.1745-3992.1997.tb00605.x

Bogaert, A. F. (1993). Personality, delinquency, and sexuality: data from three Canadian samples. Pers. Indiv. Differ. 15, 353–356.

Bortolotti, S. L. V., Tezza, R., de Andrade, D. F., Bornia, A. C., and de Sousa Júnior, A. F. (2013). Relevance and advantages of using the item response theory. Qual. Quant. 47, 2341–2360. doi: 10.1007/s11135-012-9684-5

Brown, T. A. (2006). Confirmatory Factor Analysis for Applied Research. New York, NY: Guilford Press.

Browne, M. W., and Cudeck, R. (1993). “Alternative ways of assessing model fit,” in Testing Structural Equation Models, eds. K. A. Bollen and J. S. Long (Newbury Park, CA: SAGE), 136–162.

Cheng, Y. Y., Wang, W. C., and Ho, Y. H. (2009). Multidimensional Rasch analysis of a psychological test with multiple subtests: a statistical solution for the bandwidth–fidelity dilemma. Educ. Psychol. Meas. 69, 369–388. doi: 10.1177/0013164408323241

Chico, E., and Ferrando, P. J. (1995). A psychometric evaluation of the revised P scale in delinquent and non-delinquent Spanish samples. Pers. Indiv. Differ. 18, 331–337.

Chico, E., Tous, J. M., Lorenzo-Seva, U., and Vigil-Colet, A. (2003). Spanish adaptation of Dickman's impulsivity inventory: its relationship to Eysenck's personality questionnaire. Pers. Indiv. Differ. 35, 1883–1892. doi: 10.1016/S0191-8869(03)00037-0

Chiorri, C., Marsh, H. W., Ubbiali, A., and Donati, D. (2016). Testing the factor structure and measurement invariance across gender of the Big Five inventory through exploratory structural equation modeling. J. Pers. Assess. 98, 88–99. doi: 10.1080/00223891.2015.1035381

Claridge, G. S. (1981). “Psychoticism,” in Dimensions of Personality. Papers in Honour of H.J. Eysenck, eds. H. J. Eysenck, and R. Lynn (Oxford, UK: Pergamon Press), 79–110.

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd Ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Cole, D. A., Ciesla, J. A., and Steiger, J. H. (2007). The insidious effects of failing to include design-driven correlated residuals in latent-variable covariance structure analysis. Psychol. Methods 12, 381–398. doi: 10.1037/1082-989X.12.4.381

Colledani, D. (2018). Psychometric properties and gender invariance for the Dickman Impulsivity Inventory. Test. Psychometr. Methodol. Appl. Psychol. 25, 49–61. doi: 10.4473/TPM25.1.3

Colledani, D., Robusto, E., and Anselmi, P. (2018). Development of a new abbreviated form of the Junior Eysenck Personality Questionnaire-Revised. Pers. Indiv. Differ. 120, 159–165. doi: 10.1016/j.paid.2017.08.037

Conrad, P. J., Petersen, J. B., and Pihl, R. O. (1997). Disinhibited personality and sensitivity to alcohol reinforcement: independent correlates of drinking behavior in sons of alcoholics. Alcohol. Clin. Exp. Res. 21, 1320–1332.

Corulla, W. J. (1990). A revised version of the psychoticism scale for children. Pers. Indiv. Differ. 11, 65–76. doi: 10.1016/0191-8869(90)90169-R

Da Dalt, L., Anselmi, P., Bressan, S., Carraro, S., Baraldi, E., Robusto, E., et al. (2013). A short questionnaire to assess pediatric resident's competencies: the validation process. Ital. J. Pediatr. 39:41. doi: 10.1186/1824-7288-39-41

Da Dalt, L., Anselmi, P., Furlan, S., Carraro, S., Baraldi, E., Robusto, E., et al. (2015). Validating a set of tools designed to assess the perceived quality of training of pediatric residency programs. Ital. J. Pediatr. 41:2. doi: 10.1186/s13052-014-0106-2

Dazzi, C. (2011). The Eysenck personality questionnaire–Revised (EPQ-R): A confirmation of the factorial structure in the Italian context. Pers. Indiv. Differ. 50, 790–794. doi: 10.1016/j.paid.2010.12.032

Dazzi, C., Pedrabissi, L., and Santinello, M. (2004). Adattamento Italiano delle Scale di Personalità Eysenck per Adulti [Italian Adaptation of the Eysenck Personality Scales for Adults]. Firenze, IT: Organizzazioni Speciali.

del Barrio, V., Moreno-Rosset, C., López-Martínez, R., and Olmedo, M. (1997). Anxiety, depression and personality structure. Pers. Indiv. Differ. 23, 327–335.

Denney, D. R., and Frisch, M. B. (1981). The role of neuroticism in relation to life stress and illness. J. Psychosom. Res. 25, 303–307.

Draycott, S. G., and Kline, P. (1995). The big three or the big five—The EPQ-R vs the NEOPI: a research note, replication and elaboration. Pers. Indiv. Differ. 18, 801–804. doi: 10.1016/0191-8869(95)00010-4

Escorial, S., and Navas, M. J. (2007). Analysis of the gender variable in the Eysenck Personality Questionnaire–revised scales using differential item functioning techniques. Educ. Psychol. Meas. 67, 990–1001. doi: 10.1177/0013164406299108

Eysenck, H. J. (1991). Neuroticism, anxiety, and depression. Psychol. Inq. 2, 75–76.

Eysenck, H. J., and Eysenck, S. B. G. (1964). Manual of the Eysenck Personality Inventory. London, UK: University of London Press.

Eysenck, H. J., and Eysenck, S. B. G. (1975). Manual of the Eysenck Personality Questionnaire (Adult and Junior). London, UK: Hodder and Stoughton.

Eysenck, H. J., and Eysenck, S. B. G. (1991). Manual of the Eysenck Personality Scale (Adults). London, UK: Hodder and Stoughton.

Eysenck, S., and Barrett, P. (2013). Re-introduction to cross-cultural studies of the EPQ. Pers. Indiv. Differ. 54, 485–489. doi: 10.1016/j.paid.2012.09.022

Eysenck, S. B., Eysenck, H. J., and Barrett, P. (1985). A revised version of the psychoticism scale. Pers. Indiv. Differ. 6, 21–29. doi: 10.1016/0191-8869(85)90026-1

Eysenck, S. B. G., and Eysenck, H. J. (1963). On the dual nature of extraversion. Brit J. Clin. Psychol. 2, 46–55.

Ferrando, P. J. (2001). The measurement of neuroticism using MMQ, MPI, EPI and EPQ items: a psychometric analysis based on item response theory. Pers. Indiv. Differ. 30, 641–656. doi: 10.1016/S0191-8869(00)00062-3

Ferrando, P. J., and Anguiano-Carrasco, C. (2009). The interpretation of the EPQ Lie scale scores under honest and faking instructions: a multiple-group IRT-based analysis. Pers. Indiv. Differ. 46, 552–556. doi: 10.1016/j.paid.2008.12.013

Ferrando, P. J., and Chico, E. (2001). Detecting dissimulation in personality test scores: a comparison between person-fit indices and detection scales. Educ. Psychol. Meas. 61, 997–1012. doi: 10.1177/00131640121971617

Finch, H. (2010). Item parameter estimation for the MIRT model: Bias and precision of confirmatory factor analysis–based models. Appl. Psych. Meas. 34, 10–26. doi: 10.1177/0146621609336112

Flora, D. B., and Curran, P. J. (2004). An empirical evaluation of alternative methods of estimation for confirmatory factor analysis with ordinal data. Psychol. Methods 9, 466–491. doi: 10.1037/1082-989X.9.4.466

Forrest, S., Lewis, C. A., and Shevlin, M. (2000). Examining the factor structure and differential functioning of the Eysenck Personality Questionnaire Revised-abbreviated. Pers. Indiv. Differ. 29, 579–588. doi: 10.1016/S0191-8869(99)00220-2

Francis, L. J. (1996). The relationship between Eysenck's personality factors and attitude towards substance use among 13–15-year-olds. Pers. Indiv. Differ. 21, 633–640. doi: 10.1016/0191-8869(96)00125-0

Francis, L. J., Brown, L. B., and Philipchalk, R. (1992). The development of an abbreviated form of the revised Eysenck Personality Questionnaire (EPQR-A): its use among students in England, Canada, the USA and Australia. Pers. Indiv. Differ. 13, 443–449. doi: 10.1016/0191-8869(92)90073-X

Francis, L. J., Lewis, C. A., and Ziebertz, H. G. (2006). The short-form revised Eysenck personality questionnaire (EPQR-S): a German edition. Soc. Behav. Personal. 34, 197–204. doi: 10.2224/sbp.2006.34.2.197

Francis, L. J., and Pearson, P. R. (1988). Religiosity and the short-scale EPQ-R indices of E, N and L, compared with the JEPI, JEPQ and EPQ. Pers. Indiv. Differ. 9, 653–657. doi: 10.1016/0191-8869(88)90162-6

Gale, C. R., Booth, T., Mõttus, R., Kuh, D., and Deary, I. J. (2013). Neuroticism and Extraversion in youth predict mental wellbeing and life satisfaction 40 years later. J. Res. Pers. 47, 687–697. doi: 10.1016/j.jrp.2013.06.005

Gillings, V., and Joseph, S. (1996). Religiosity and social desirability: Impression management and self-deceptive positivity. Pers. Indiv. Differ. 21, 1047–1050.

Gómez-Benito, J., Hidalgo, M. D., and Padilla, J. L. (2009). Efficacy of effect size measures in logistic regression: an application for detecting DIF. Methodology 5, 18–25. doi: 10.1027/1614-2241.5.1.18

Grau, E., and Ortet, G. (1999). Personality traits and alcohol consumption in a sample of non-alcoholic women. Pers. Indiv. Differ. 27, 1057–1066.

Gudgeon, E. T., Connor, J. P., Young, R. M., and Saunders, J. B. (2005). The relationship between personality and drinking restraint in an alcohol dependent sample. Pers. Indiv. Differ. 39, 885–893. doi: 10.1016/j.paid.2005.03.009

Haberman, S. J., von Davier, M., and Lee, Y.-H. (2008). Comparison of Multidimensional Item Response Models: Multivariate Normal Ability Distributions Versus Multivariate Polytomous Ability Distributions (RR-08-45). Educational Testing Service.

Heaven, P. C., Ciarrochi, J., Leeson, P., and Barkus, E. (2013). Agreeableness, conscientiousness, and psychoticism: distinctive influences of three personality dimensions in adolescence. Brit. J. Psychol. 104, 481–494. doi: 10.1111/bjop.12002

Heaven, P. C., Newbury, K., and Wilson, V. (2004). The Eysenck psychoticism dimension and delinquent behaviours among non-criminals: changes across the lifespan? Pers. Indiv. Differ. 36, 1817–1825. doi: 10.1016/j.paid.2003.07.003

Hosokawa, T., and Ohyama, M. (1993). Reliability and validity of a Japanese version of the short-form Eysenck Personality Questionnaire—Revised. Psychol. Rep. 72, 823–832.

Howarth, E. (1986). What does Eysenck's psychoticism scale really measure? Brit. J. Psychol. 77, 223–227.

Hoyle, R. H., Fejfar, M. C., and Miller, J. D. (2000). Personality and sexual risk taking: a quantitative review. J. Pers. 68, 1203–1231. doi: 10.1111/1467-6494.00132

Hu, L. T., and Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct. Equ. Modeling 6, 1–55.

Huang, L., Zhou, D., Yao, Y., and Lan, Y. (2015). Relationship of personality with job burnout and psychological stress risk in clinicians. Chin. J. Industr. Hyg. Occup. Dis. 33, 84–87.

Jodoin, M. G., and Gierl, M. J. (2001). Evaluating type I error and power rates using an effect size measure with logistic regression procedure for DIF detection. Appl. Meas. Educ. 14, 329–349. doi: 10.1207/S15324818AME1404_2

John, O. P., Naumann, L. P., and Soto, C. J. (2008). “Paradigm shift to the integrative Big Five trait taxonomy: History, measurement, and conceptual issues,” in Handbook of Personality: Theory and Research, eds. O.P. John, R.W. Robins, and L.A. Pervin (New York, NY: Guilford), 114–158.

Judge, T. A., Bono, J. E., and Locke, E. A. (2000). Personality and job satisfaction: the mediating role of job characteristics. J. Appl. Psychol. 85, 237–249. doi: 10.1037/0021-9010.85.2.237

Jylhä, P., and Isometsä, E. (2006). The relationship of neuroticism and extraversion to symptoms of anxiety and depression in the general population. Depress. Anxiety 23, 281–289. doi: 10.1002/da.20167

Karanci, A. N., Dirik, G., and Yorulmaz, O. (2006). [Reliability and validity studies of Turkish translation of Eysenck personality questionnaire revised-Abbreviated]. Turk. J. Psychiatry 18, 254–261.

Katz, Y. J., and Francis, L. J. (2000). Hebrew revised Eysenck Personality Questionnaire: Short form (EPQR-S) and abbreviated form (EPQR-A). Soc. Behav. Personal. 28, 555–560. doi: 10.2224/sbp.2000.28.6.555

Kroenke, K., Spitzer, R. L., and Williams, J. B. (2001). The PHQ-9: validity of a brief depression severity measure. J. Gen. Intern. Med. 16, 606–613. doi: 10.1046/j.1525-1497.2001.016009606.x

Laidra, K., Pullmann, H., and Allik, J. (2007). Personality and intelligence as predictors of academic achievement: a cross-sectional study from elementary to secondary school. Pers. Indiv. Differ. 42, 441–451. doi: 10.1016/j.paid.2006.08.001

Lajunen, T., and Scherler, H. R. (1999). Is the EPQ Lie Scale bidimensional? Validation study of the structure of the EPQ Lie Scale among Finnish and Turkish university students. Pers. Indiv. Differ. 26, 657–664.

Levy, P. (1967). The correction for spurious correlation in the evaluation of short-form tests. J. Clin. Psychol. 23, 84–86. doi: 10.1002/1097-4679(196701)23:1<84::AID-JCLP2270230123>3.0.CO;2-2

Lodhi, P. H., and Thakur, S. (1993). Personality of drug addicts: Eysenckian analysis. Pers. Indiv. Differ. 15, 121–128.

Lu, L. (1995). The relationship between subjective well-being and psychosocial variables in Taiwan. J. Soc. Psychol. 135, 351–357.

Lynn, R., and Martin, T. (1997). Gender differences in extraversion, neuroticism, and psychoticism in 37 nations. J. Soc. Psychol. 137, 369–373.

Magis, D., Beland, S., and Raiche, G. (2016). Package ‘difR’. R package version 4.7.

Maij-de Meij, A. M., Kelderman, H., and Flier, V. D. H. (2008). Fitting a mixture item response theory model to personality questionnaire data: characterizing latent classes and investigating possibilities for improving prediction. Appl. Psych. Meas. 32, 611–631. doi: 10.1177/0146621607312613

Maltby, J., and Talley, M. (1998). The psychometric properties of an abbreviated form of the revised Junior Eysenck Personality Questionnaire (JEPQR-A) among 12–15-year old US young persons. Pers. Indiv. Differ. 24, 891–893. doi: 10.1016/S0191-8869(97)00234-1

Marsh, H. W., and Hau, K. T. (2007). Applications of latent-variable models in educational psychology: the need for methodological-substantive synergies. Contemp. Educ. Psychol. 32, 151–170. doi: 10.1016/j.cedpsych.2006.10.008

Marsh, H. W., Hau, K. T., and Wen, Z. (2004). In search of golden rules: comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler's (1999) findings. Struct. Equ. Modeling 11, 320–341. doi: 10.1207/s15328007sem1103_2

Marsh, H. W., Lüdtke, O., Muthén, B., Asparouhov, T., Morin, A. J., Trautwein, U., et al. (2010). A new look at the big five factor structure through exploratory structural equation modeling. Psychol. Assess. 22, 471–491. doi: 10.1037/a0019227

Marsh, H. W., Morin, A. J., Parker, P. D., and Kaur, G. (2014). Exploratory structural equation modeling: an integration of the best features of exploratory and confirmatory factor analysis. Annu. Rev. Clin. Psycho. 10, 85–110. doi: 10.1146/annurev-clinpsy-032813-153700

Marsh, H. W., Muthén, B., Asparouhov, T., Lüdtke, O., Robitzsch, A., Morin, A. J., et al. (2009). Exploratory structural equation modeling, integrating CFA and EFA: application to students' evaluations of university teaching. Struct. Equ. Modeling 16, 439–476. doi: 10.1080/10705510903008220

Marsh, H. W., Nagengast, B., Morin, A. J., Parada, R. H., Craven, R. G., and Hamilton, L. R. (2011). Construct validity of the multidimensional structure of bullying and victimization: an application of exploratory structural equation modeling. J. Educ. Psychol. 103, 701–732. doi: 10.1037/a0024122

McCrae, R. R., and Costa, P. T. (1985). Comparison of EPI and psychoticism scales with measures of the five-factor model of personality. Pers. Indiv. Differ. 6, 587–597. doi: 10.1016/0191-8869(85)90008-X

Mor, N. (2010). “Eysenck personality questionnaire,” in The Corsini Encyclopedia of Psychology, eds. I. B. Weiner and W. E. Craighead (Hoboken, NJ: John Wiley & Sons, Inc.), 1–2.

Muthén, L. K., and Muthén, B. O. (2012). Mplus Version 7 User's Guide. Los Angeles, CA: Muthén & Muthén.

Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika 78, 691–692. doi: 10.1093/biomet/78.3.691

Nyborg, H. E. (1997). The Scientific Study of Human Nature: tribute to Hans J. Eysenck at Eighty. New York, NY: Elsevier.

Paulhus, D. L. (1991). “Measurement and control of response bias,” in Measures of Personality and Social Psychological Attitudes, Vol. 1, eds. J. P. Robinson, P. R. Shaver, and L. S. Wrightsman (San Diego, CA: Academic Press), 17–59.

Pedrabissi, L., and Santinello, M. (1989). Inventario per L'ansia di ≪Stato≫ e di ≪Tratto≫: Nuova Versione Italiana Dello Stai Forma Y: Manuale. Firenze, IT: Organizzazioni Speciali.

Petrillo, J., Cano, S. J., McLeod, L. D., and Coon, C. D. (2015). Using classical test theory, item response theory, and Rasch measurement theory to evaluate patient-reported outcome measures: a comparison of worked examples. Value Health 18, 25–34. doi: 10.1016/j.jval.2014.10.005

Picconi, L., Jackson, C. J., Balsamo, M., Tommasi, M., and Saggino, A. (2018). Factor structure and measurement invariance across groups of the Italian Eysenck Personality Questionnaire-Short form (EPP-S). Pers. Indiv. Differ. 123, 76–80. doi: 10.1016/j.paid.2017.11.013

Qian, M., Wu, G., Zhu, R., and Zhang, S. (2000). Development of the revised Eysenck personality questionnaire short scale for Chinese (EPQ-RSC). Acta Psychol. Sinica 32, 317–323.

R Core Team (2016). A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available online at: http://www.R-project.org/

Reckase, M. D. (2009). Multidimensional Item Response Theory: Statistics for Social and Behavioral Sciences. New York, NY: Springer. doi: 10.1007/978-0-387-89976-3

Reise, S. P. (1990). A comparison of item-and person-fit methods of assessing model-data fit in IRT. Appl. Psych. Meas. 14, 127–137. doi: 10.1177/014662169001400202

Reise, S. P., and Henson, J. M. (2000). Computerization and adaptive administration of the NEO PI-R. Assessment 7, 347–364. doi: 10.1177/107319110000700404

Rizopoulos, D. (2012). Package ‘ltm’. R package version 1.1–0.

Rocklin, T., and Revelle, W. (1981). The measurement of extroversion: a comparison of the Eysenck Personality Inventory and the Eysenck Personality Questionnaire. Brit. J. Soc. Psychol. 20, 279–284.

Roger, D., and Morris, J. (1991). The internal structure of the EPQ scales. Pers. Indiv. Differ. 12, 759–764.

Saggino, A. (2000). The big three or the big five? A replication study. Pers. Indiv. Differ. 28, 879–886. doi: 10.1016/S0191-8869(99)00146-4

Saklofske, D. H., Kelly, I. W., and Janzen, B. L. (1995). Neuroticism, depression, and depression proneness. Pers. Indiv. Differ. 18, 27–31.

Sanavio, E., Bertolotti, G., Bettinardi, O., Michielin, P., Vidotto, G., and Zotti, A. M. (2013). The cognitive behavioral assessment (CBA) project: presentation and proposal for international collaboration. Psychol. Community Health 2, 362–380. doi: 10.5964/pch.v2i3.61

Scholte, R. H., and De Bruyn, E. E. (2001). The Revised Junior Eysenck Personality Questionnaire (JEPQ-R): Dutch replications of the full-length, short, and abbreviated forms. Pers. Indiv. Differ. 31, 615–625. doi: 10.1016/S0191-8869(00)00166-5

Scholte, R. H., and De Bruyn, E. E. (2004). Comparison of the Giant Three and the Big Five in early adolescents. Pers. Indiv. Differ. 36, 1353–1371. doi: 10.1016/S0191-8869(03)00234-4

Smillie, L. D., Bhairo, Y., Gray, J., Gunasinghe, C., Elkin, A., McGuffin, P., et al. (2009). Personality and the bipolar spectrum: normative and classification data for the Eysenck Personality Questionnaire–Revised. Compr. Psychiat. 50, 48–53. doi: 10.1016/j.comppsych.2008.05.010

Sotgiu, I., Anselmi, P., and Meneghini, A. M. (2018). Investigating the psychometric properties of the Questionnaire for Eudaimonic Well-Being: A Rasch analysis. Test. Psychometr. Methodol. Appl. Psychol. (in press).

Spence, R., Owens, M., and Goodyer, I. (2012). Item response theory and validity of the NEO-FFI in adolescents. Pers. Indiv. Differ. 53, 801–807. doi: 10.1016/j.paid.2012.06.002

Spielberger, C. D., Gorsuch, R. L., Lushene, R., Vagg, P. R., and Jacobs, G. A. (1983). Manual for the State-Trait Anxiety Inventory STAI. Palo Alto, CA: Consulting Psychologists Press.

Spitzer, R. L., Kroenke, K., and Williams, J. B. (1999). Validation and utility of a self-report version of PRIME-MD: the PHQ primary care study. JAMA 282, 1737–1744.

Thissen, D., and Steinberg, L. (2009). “Item response theory,” in The Sage Handbook of Quantitative Methods in Psychology, eds. R. Millsap and A. Maydeu-Olivares (London: Sage), 148–177.

Tiwari, T., Singh, A. L., and Singh, I. L. (2009). The short-form revised Eysenck personality questionnaire: a Hindi edition (EPQRS-H). Ind. Psychiatry J. 18, 27–31. doi: 10.4103/0972-6748.57854

Ubbiali, A., Chiorri, C., and Hampton, P. (2013). Italian Big Five Inventory. Psychometric properties of the Italian adaptation of the Big Five Inventory (BFI). Appl. Psychol. Bull. 266, 37–48.

Vidotto, G., Cioffi, R., Saggino, A., and Wilson, G. (2008). The Italian version of the Junior Eysenck Personality Questionnaire: a confirmatory factor analysis. Psychol. Rep. 103, 715–726. doi: 10.2466/pr0.103.3.715-726

Wood, J., and Newton, A. K. (2003). The role of personality and blame attribution in prisoners' experiences of anger. Pers. Indiv. Differ. 34, 1453–1465. doi: 10.1016/S0191-8869(02)00127-7

Wright, B. D., and Linacre, J. M. (1994). Reasonable mean-square fit values. Rasch Meas. Trans. 8:370.

Wright, B. D., and Masters, G. N. (1982). Rating Scale Analysis: Rasch Measurement. Chicago: Mesa Press.

Yu, C. Y. (2002). Evaluating Cutoff Criteria of Model Fit Indices for Latent Variable Models with Binary and Continuous Outcomes (Doctoral dissertation, University of California Los Angeles). Available online at: http://www.researchgate.net/profile/Brian_Goldman3/publication/222578792_A_Multicomponent_Conceptualization_of_Authenticity_Theory_and_Research/links/00b49520552576cb3b000000.pdf

Zanon, C., Hutz, C. S., Yoo, H., and Hambleton, R. K. (2016). An application of item response theory to psychological test development. Psicologia: Reflexão e Crítica 29:18. doi: 10.1186/s41155-016-0040-x

Keywords: short Eysenck personality questionnaire-revised, item response theory, 2PL, ESEM, DIF

Citation: Colledani D, Anselmi P and Robusto E (2018) Using Item Response Theory for the Development of a New Short Form of the Eysenck Personality Questionnaire-Revised. Front. Psychol. 9:1834. doi: 10.3389/fpsyg.2018.01834

Received: 15 March 2018; Accepted: 07 September 2018; Published: 02 October 2018.

Edited by:

Marco Innamorati, Università Europea di Roma, Italy

Reviewed by:

Davide Marengo, Università degli Studi di Torino, Italy
Pietro Cipresso, Istituto Auxologico Italiano (IRCCS), Italy

Copyright © 2018 Colledani, Anselmi and Robusto. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Daiana Colledani, daianacolledani@gmail.com