Short Forms of Wechsler Scales Assessing the Intellectually Gifted Children Using Simulation Data
- CRP-CPO EA 7273, Université de Picardie Jules Verne, Amiens, France
Intellectual giftedness is usually defined in terms of a very high Intellectual Quotient (IQ). Intellectual capacity is assessed with a standardized test such as the Wechsler Intelligence Scale for Children (WISC). However, the identification of intellectually gifted children (IGC) often remains time-consuming. A short-form WISC can be used as a screening instrument. Practitioners and researchers in this field can then make a more in-depth evaluation of the IGC's cognitive and socioemotional characteristics if needed. The aim of our study is thus to determine the best short tests, in terms of their psychometric qualities, for the identification of IGC. The current study comprises three analysis steps. Firstly, we created nine IQ short forms (IQSF) with 2 subtests, and nine IQSF with 4 subtests, from the WISC-IV (Wechsler, 2005). Secondly, we estimated psychometric parameters (i.e., reliability and validity) from empirical and simulated WISC-IV datasets. The psychometric qualities of each IQSF estimated from the simulated data are very close to those derived from the empirical data. We thus selected the three best IQSF based on the psychometric parameters estimated from the simulated datasets. For each selected short form of the WISC-IV, we estimated the screening quality in our sample of IGC. Thirdly, we created IQSF with 2 and 4 subtests from the WISC-V (Wechsler, 2016) with a simulated dataset. We then highlighted the three best short forms of the WISC-V based on the estimated psychometric parameters. The results are interpreted in terms of validity, reliability and screening quality of IGC. In spite of the important changes in the WISC-V, our findings show that the 2-subtest form, Similarities + Matrix Reasoning, and the 4-subtest form, Similarities + Vocabulary + Matrix Reasoning + Block Design, are the most efficient for identifying IGC with the two most recent versions of the Wechsler scales.
Finally, we discuss the advantages and drawbacks of a brief assessment of intellectual aptitudes for the identification of the IGC.
In the sphere of education, the rapid and reliable evaluation of a child's global intellectual capacity is important for an efficient identification of intellectually gifted children (IGC). Indeed, such an evaluation helps in proposing specific educational programs (e.g., accelerated or enrichment programs). In this context, a short evaluation can initially be used to assess a child's intellectual giftedness. It can then serve to determine whether a more in-depth evaluation of the child's cognitive and socioemotional characteristics is needed.
The importance and usefulness of cognitive ability assessments for the identification of IGC have long been recognized by the scientific community as a means of facilitating the integration of IGC in specialized school programs (Simpson et al., 2002; Pierson et al., 2012). Furthermore, special education such as enrichment education has been shown to have a positive impact on their cognitive skills (Shi et al., 2013). Consequently, an early identification of intellectual giftedness appears to be a predictor not only of their psychological well-being (Neihart, 1999; Litster and Roberts, 2011; Kroesbergen et al., 2016), but also of their professional success as adults (Rinn and Bishop, 2015).
Intellectual giftedness is usually defined in terms of a high Intellectual Quotient (IQ) resulting from a standardized and validated intelligence test (Winner, 2000; see Lovett and Lewandowski, 2006). The Wechsler Intelligence Scale for Children (WISC-IV; Wechsler, 2005; WISC-V; Wechsler, 2016) is one of the most common tools used by both researchers and clinicians for the evaluation of intellectual capacity (Evers et al., 2012). It is currently used to identify intellectually gifted children in school (McClain and Pfeiffer, 2012). It allows a person's global intellectual potential to be estimated on the basis of a Full Scale Intellectual Quotient (FSIQ). The most commonly considered threshold value of the FSIQ is a score greater than or equal to 130, i.e., 2 standard deviations above the average (Carman, 2013). However, this cutoff is an ideal target value. Very high cutoffs also require a very high level of reliability in the measurement, in order to ensure satisfactory identification performance. Indeed, the measurement error has an influence on the identification of IGC (McIntosh et al., 2005). It perturbs the accuracy of decisions related to their identification (McBee et al., 2013). According to McIntosh et al. (2005), a threshold FSIQ value of 125, i.e., the 95th percentile, appears to be a reasonable choice for the identification of IGC. In the current study, we retained this cutoff of 125 in Wechsler's scale to identify IGC.
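As a quick check of the two cutoffs discussed above, the following sketch converts an FSIQ threshold into a z-score and a percentile. It assumes the usual IQ metric (mean 100, SD 15) and a normal distribution of scores; the function name is illustrative only.

```python
from statistics import NormalDist

# Assumption: IQ scores are normally distributed with mean 100 and SD 15.
IQ = NormalDist(mu=100, sigma=15)


def percentile_of(cutoff: float) -> float:
    """Return the percentage of the reference population scoring below `cutoff`."""
    return 100 * IQ.cdf(cutoff)


for cutoff in (125, 130):
    z = (cutoff - IQ.mean) / IQ.stdev
    print(f"cutoff={cutoff}  z={z:.2f}  percentile={percentile_of(cutoff):.1f}")
```

Under these assumptions, a cutoff of 130 sits exactly 2 SD above the mean, and a cutoff of 125 falls at roughly the 95th percentile, which matches the values quoted in the text.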
The recent versions of the Wechsler scales were explicitly developed on the basis of the Cattell-Horn-Carroll (CHC) theoretical model (Keith et al., 2006; Lecerf et al., 2010a; Golay et al., 2013; Weiss et al., 2013). There is a consensus in the literature with regard to the number of broad cognitive abilities evaluated by the WISC-IV (Reverte et al., 2015) or WISC-V (Reynolds and Keith, 2017). The subtests are posited to estimate these broad cognitive abilities: Similarities, Vocabulary, and Comprehension are considered to evaluate the Comprehension-Knowledge (Gc) ability; Picture Concepts, Figure Weights (WISC-V only) and Matrix Reasoning are considered to estimate the Fluid reasoning (Gf) ability; Block Design and Visual Puzzles (WISC-V only) are posited to evaluate mainly the Visual processing (Gv) ability; Digit Span, Picture Span (WISC-V only) and Letter-Number Sequencing are considered to evaluate the Short-term memory (Gsm) ability; and Coding and Symbol Search are considered to evaluate the Processing speed (Gs) ability.
In practice, IGC perform better on Gc, Gf and Gv than on Gsm and Gs (Volker et al., 2006; Rowe et al., 2010). It thus appears that IGC perform more efficiently in terms of high-level abilities than in terms of low-level abilities. The consequence of this is marked scatter among the Wechsler scale scores of IGC. According to Flanagan and Kaufman (2009), an important and uncommon variability can affect the interpretation of the FSIQ. Indeed, the aim of the FSIQ is to summarize all cognitive aptitudes assessed by the scale. The FSIQ score is designed to estimate Spearman's (1904) g factor. If the subtest scores are too heterogeneous, the common variance between subtests decreases and the mean value of the performance does not provide a satisfactory representation of the overall cognitive aptitude. This abnormal variability between scores in the Wechsler scale could have an impact on the estimation of a child's overall intellectual functioning (Lecerf et al., 2015). To mitigate this problem, Weiss et al. (2008) proposed a new indicator: the General Ability Index (GAI). This is estimated from the subtests used mainly to evaluate the Gc, Gf, and Gv abilities. These are the most highly g-saturated core subtests (Keith et al., 2006; Lecerf et al., 2010b). The GAI would thus provide a better estimation of overall cognitive functioning when the low g-loaded cognitive aptitudes are lower than the high g-loaded cognitive aptitudes (Watkins et al., 2002; Sattler and Ryan, 2009; Lecerf et al., 2016). Considering the discrepancies in performance between the low- and high-level cognitive abilities in IGC, the use of the GAI appears to be more judicious than that of the FSIQ in the context of the identification of IGC (Newman et al., 2008). The GAI also has a satisfactory reliability with regard to both short- and long-term stability (Kieng et al., 2013; Watkins and Smith, 2013).
The GAI is often used to include gifted children in special education such as enrichment or accelerated programs (Saklofske et al., 2005; Pierson et al., 2012). It is often considered to be an abbreviated form of the Wechsler scale for the identification of IGC. However, the short form of a test should reduce the examination time by at least 50% (Levy, 1968), whereas the GAI reduces the administration time by only approximately 23% (Ryan et al., 2007). As a consequence, the identification of IGC often remains time-consuming, even with the GAI.
In order to reduce the examination time of a cognitive ability test, the most commonly used solution described in the literature is to reduce the number of subtests retained to estimate the global cognitive functioning (Silverstein, 1990). This time should not be gained at the cost of predictive accuracy (Doppelt, 1956). The use of short forms comprising more than 4 subtests only weakly increases the reliability of the measurement with respect to the cost in terms of examination time (Karnes and Brown, 1981).
The usefulness of brief intellectual assessment is controversial, because it precludes an analysis of the cognitive profile, which can be essential in the context of learning disabilities (Fiorello et al., 2001; Hale et al., 2007). Nevertheless, a brief intellectual assessment may be useful to estimate the intellectual potential. Indeed, several short forms of the WISC-IV have been validated in various languages (e.g., Crawford et al., 2010, in English; Dasi et al., 2014, in Spanish). These short forms have been used to obtain a rapid and reliable evaluation of intellectual ability in children with an intellectual disability (McKenzie et al., 2013; Murray et al., 2016), children with high-functioning autism spectrum disorder (Thomeer et al., 2012), children affected by epilepsy (Hrabok et al., 2012) or children affected by traumatic brain injury (Donders et al., 2013). Recent studies have used shortened versions of the Wechsler scale, with 2 or 4 subtests, for the detection of IGC (Shaw et al., 2006; Alloway and Elsworth, 2012; van Viersen et al., 2014, 2015). Most of these short forms have been constructed on the basis of subtests evaluating high-level cognitive abilities such as verbal comprehension and perceptual reasoning. Although there are short forms of the previous versions of the Wechsler Intelligence Scale for Children (WISC-R and WISC-III) as well as other sets of cognitive ability evaluations for the identification of IGC (Killan and Hughes, 1978; Dirks et al., 1980; Karnes and Brown, 1981; Ortiz and Gonzalez, 1989; Mark et al., 1998; for a review Simpson et al., 2002; Reiter, 2004; Pierson et al., 2012), to our knowledge, no short form of the WISC-IV or WISC-V has been tested with respect to its psychometric qualities in this atypically developing population.
With respect to the use of shortened tests of cognitive ability as a decisional aid for the identification of IGC, Prewett (1995) suggests that shortened tests should generate scores that are comparable with those obtained with a battery of global assessment tests. It is thus necessary to compare the mean value of the short scale with that of the full scale. If the discrepancy between the IQ short form (IQSF) and the FSIQ is within 2 standard errors of measurement (SEM), the IQSF is considered to be stable (Kieng et al., 2013; Meyers et al., 2013).
In the literature, some authors use 2 or 4 subtests to estimate an IQSF score from the Wechsler scale in order to identify IGC (Shaw et al., 2006; Alloway and Elsworth, 2012; van Viersen et al., 2014, 2015). To our knowledge, the psychometric qualities of these short forms have never been tested for the identification of IGC. In the present study, we construct nine 2-subtest IQSF and nine 4-subtest IQSF from all possible combinations of subtests in the Verbal Comprehension Index (VCI) and the Perceptual Reasoning Index (PRI). We compare their reliability, validity, and screening qualities with each other. We excluded the subtests involving working memory, because these subtests are also known to be affected negatively by specific learning disabilities (Maehler and Schuchardt, 2009; Cornoldi et al., 2014; Toffalini et al., 2017a). While IGC are known to show high performance in working memory tasks (Calero et al., 2007; Hoard et al., 2008; Ruthsatz and Urbach, 2012), these subtests can underestimate the overall cognitive performance in the context of a specific learning disability. All short forms based on subtests from the VCI and PRI make it possible to obtain an estimation of the FSIQ and GAI scores in less than 30 min. The aim of our study is thus to determine the best short scales, in terms of their psychometric qualities, for the identification of IGC.
The data were collected from the WISC-IV protocols of 117 French IGC (mean age: 10.39, SD: 1.03, 74% boys) and 52 French intellectually typical children (mean age: 10.57, SD: 2.66, 79% boys). Clinical psychologists provided us with fully anonymized data, in accordance with the French code of ethics. To preserve the anonymity of the children, we chose not to contact the parents and their children. The groups did not significantly differ in terms of age, t(57.902) = −0.485, p = 0.629, d = −0.109, or gender, χ² = 0.550, p = 0.458. The children were contacted through teachers, school psychologists or licensed psychologists. In order to allow for measurement errors, the inclusion criterion for the intellectually gifted group was set to an FSIQ or GAI score greater than or equal to 125, in accordance with current recommendations (McIntosh et al., 2005; Assouline et al., 2010; Brasseur and Grégoire, 2010).
All of the statistical analyses were made using version 3.4.2 of the R software (R Core Team, 2017). We estimated the intraclass correlation between each IQSF and the full scale using the irr library (Gamer et al., 2012) and the hierarchical omega (ωH) of each IQSF using the MBESS library (Kelley, 2018). We also computed the 95% confidence intervals for the correlation comparisons (Zou, 2007) using the cocor library (Diedenhofen and Musch, 2015).
The data relevant to the preparation of short forms was extracted from fully administered WISC-IV protocols. The FSIQ scores were computed from 10 core subtests from the WISC-IV. The GAI scores were obtained from the VCI and PRI (Lecerf et al., 2010a).
Our statistical analyses comprise three steps. Firstly, the scores of the different IQSF were computed using the linear equation method (Tellegen and Briggs, 1967). They were derived from the sum of the standard scores of the VCI subtests (Vocabulary [Vo], Similarities [Si] and Comprehension [Co]) and the PRI subtests (Block Design [Bd], Picture Concepts [Pc], and Matrix Reasoning [Mr]). Secondly, we simulated 1,000,000 observations using the Cholesky decomposition of the WISC-IV correlation matrix (Table 4.1; Wechsler, 2005) with the mvtnorm library (Genz et al., 2018) (see Giofrè et al., 2017 for a similar approach). We compared the psychometric parameters (i.e., reliability and validity) estimated from the empirical and simulated datasets. Then, we selected the three best IQSF based on the psychometric parameters estimated from the simulated dataset. For each selected WISC-IV short form, we computed receiver operating characteristic (ROC) indicators, namely the sensitivity, the specificity, the false positive rate (FPR), the false negative rate (FNR), and the Area Under the Curve (AUC), from our empirical sample using the pROC library (Robin et al., 2011). The AUC is used as a general measure of the performance of the IGC classification (Fawcett, 2006). Thirdly, we generated another simulated dataset from the WISC-V correlation matrix (Table 4.1; Wechsler, 2016). From this simulated dataset, we created the different short forms of the WISC-V. We then highlighted the three best short forms of the WISC-V based on the psychometric parameters estimated from the simulated dataset.
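The simulation step can be sketched as follows. This is a minimal, standard-library Python illustration of drawing correlated subtest scores through a Cholesky factor, not the R script used in the study (which relies on mvtnorm). The 3 × 3 matrix `EXAMPLE_CORR` contains made-up correlations rather than the values of Table 4.1 in the manuals, and all function names are hypothetical.

```python
import math
import random

# Hypothetical inter-subtest correlations (NOT the values from the WISC manuals).
EXAMPLE_CORR = [
    [1.00, 0.70, 0.50],
    [0.70, 1.00, 0.45],
    [0.50, 0.45, 1.00],
]


def cholesky(matrix):
    """Return the lower-triangular factor L such that L times its transpose equals `matrix`."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def simulate_scaled_scores(corr, n_obs, seed=1, mean=10.0, sd=3.0):
    """Draw `n_obs` rows of correlated scores on the scaled-score metric (mean 10, SD 3)."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    size = len(corr)
    rows = []
    for _ in range(n_obs):
        z = [rng.gauss(0.0, 1.0) for _ in range(size)]
        # Multiplying independent standard normals by L induces the target correlations.
        x = [sum(lower[i][j] * z[j] for j in range(i + 1)) for i in range(size)]
        rows.append([mean + sd * value for value in x])
    return rows


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


sample = simulate_scaled_scores(EXAMPLE_CORR, n_obs=20_000)
first, second = [row[0] for row in sample], [row[1] for row in sample]
print(round(pearson(first, second), 2))  # should land close to the 0.70 target
```

The sketch leaves the simulated scores continuous; rounding or truncating them to the 1 to 19 scaled-score range would be an additional design decision that the text does not specify.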
The data and R script used for the simulations and the statistical analyses are available on the Open Science Framework (OSF) at http://osf.io/dax8p.
Nine IQSF derived from 2-subtest forms, and 9 IQSF derived from 4-subtest forms, were computed from 3 PRI and 3 VCI subtests. For each IQSF, the threshold for the identification of intellectual giftedness was a score greater than or equal to 125, i.e., the 95th percentile.
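The construction of the forms can be illustrated with a short sketch. It assumes that each 2-subtest form pairs one VCI subtest with one PRI subtest and that each 4-subtest form takes two subtests from each index, which is the reading that yields nine forms of each size. The score conversion follows the deviation quotient of Tellegen and Briggs (1967) as it is commonly stated (scaled scores with mean 10 and SD 3, composite with mean 100 and SD 15); the function name and the example correlation are hypothetical.

```python
import math
from itertools import combinations, product

VCI = ("Si", "Vo", "Co")  # Similarities, Vocabulary, Comprehension
PRI = ("Bd", "Pc", "Mr")  # Block Design, Picture Concepts, Matrix Reasoning

# One subtest per index: 3 x 3 = 9 forms.
two_subtest_forms = [(v, p) for v, p in product(VCI, PRI)]
# Two subtests per index: C(3,2) x C(3,2) = 9 forms.
four_subtest_forms = [
    v + p for v, p in product(combinations(VCI, 2), combinations(PRI, 2))
]


def deviation_quotient(scaled_scores, intercorrelations):
    """Convert a sum of scaled scores to the IQ metric (Tellegen and Briggs, 1967).

    `intercorrelations` holds one correlation per distinct pair of subtests,
    i.e., n(n-1)/2 values for n subtests.
    """
    n = len(scaled_scores)
    sum_r = sum(intercorrelations)
    sd_composite = 3.0 * math.sqrt(n + 2.0 * sum_r)  # SD of the sum of scaled scores
    return 100.0 + 15.0 * (sum(scaled_scores) - 10.0 * n) / sd_composite


print(len(two_subtest_forms), len(four_subtest_forms))
# Illustrative call with a made-up correlation of 0.50 between two subtests.
print(round(deviation_quotient([16, 15], [0.50]), 1))
```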
It is essential to evaluate reliability and validity when selecting the best short form (Cyr and Brooker, 1984). For each form that we prepared, reliability and validity indices were thus computed.
Index of Reliability
The reliability of each form was determined by a composite reliability coefficient (rcc) according to Equation (1) from Tellegen and Briggs (1967), based on a table of internal consistency and inter-correlations of the applied subtests derived from the WISC-IV manual (Table 4.1 and Table 5.1 from Wechsler, 2005) and the WISC-V manual (Table 4.1 and Table 5.1 from Wechsler, 2016). This index allows the standard error of measurement of each short form to be determined. The composite reliability coefficient is frequently used to estimate the reliability of an abridged scale (Ryan and Ward, 1999; Girard et al., 2010, 2015; Donders et al., 2013; Denney et al., 2015).
where rjj is the reliability coefficient of the jth subtest in IQSF, n is the number of subtests in IQSF, rjk is the correlation coefficient between the jth subtest and the kth subtest used for IQSF.
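For reference, the composite reliability formula of Tellegen and Briggs (1967) is usually written as follows with the symbols defined above. This is the standard textbook form, given here as an aid to the reader; the second sum runs over each distinct pair of subtests once.

```latex
r_{cc} = \frac{\sum_{j=1}^{n} r_{jj} + 2 \sum_{j<k} r_{jk}}{n + 2 \sum_{j<k} r_{jk}}
```

For example, two subtests with reliabilities 0.80 and 0.90 and an inter-correlation of 0.50 (illustrative values) give rcc = (1.70 + 1.00) / (2 + 1.00) = 0.90.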
In contrast to the coefficient alpha, the coefficient omega (ω) takes unequal factor loadings into account (Watkins, 2017). In particular, the hierarchical omega (ωH) also has the advantage of being unaffected by the factor-analytic model that is fitted (Kelley and Pornprasertmanit, 2016). ωH can therefore be a better index of the reliability of a composite score from the Wechsler scale than the alpha coefficient (Gignac and Watkins, 2013). This index estimates the proportion of variance that is attributable to the general factor. A high value of ωH indicates that a general factor explains a large part of the variation in the composite score. A ωH coefficient is considered to be reliable if it exceeds a minimum of 0.50, but a ωH greater than or equal to 0.75 is considered better (Reise et al., 2013).
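As an aid to interpretation, the hierarchical omega is commonly expressed as the ratio below. The notation is an assumption introduced here and does not appear in the original text: λ_jg denotes the loading of subtest j on the general factor, and σ²_X the total variance of the unit-weighted composite of the n subtests.

```latex
\omega_H = \frac{\left( \sum_{j=1}^{n} \lambda_{jg} \right)^{2}}{\sigma_X^{2}}
```

Read this way, ωH = 0.75 means that three quarters of the variance of the composite score can be attributed to the general factor.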
Intraclass Correlation Coefficient
In addition, we used the Intraclass Correlation Coefficient (ICC; model A.1 in McGraw and Wong, 1996) to examine the reliability of both the IQSF and the full-scale scores.
Index of Validity
Three validity indices were estimated: the convergent validity determined by the corrected correlation between the short form and the full form of the scale (r′), the degree of discrepancy computed as the difference between the IQSF score and the full form score (FSIQ or GAI), and the accuracy of the estimation of the full form scores (Cacc).
The convergent validity of each form was determined by computing its correlation (Pearson's r) with the FSIQ and GAI scores. These correlations were then corrected (r′; Equation 2) for the redundancy of the error variance, using the modified version (Girard and Christensen, 2008) of the Levy formula (Levy, 1967). This correction is needed because the forms were prepared from 2 or 4 subtests taken from the fully administered scale, so the correlation between the short forms and the full forms (FSIQ or GAI) is artificially increased by the measurement error that they share:
where r′ is the corrected correlation coefficient; rsf is the uncorrected correlation coefficient between the score of the short form (IQSF) and the full scale (FSIQ or GAI), rcc is the composite reliability coefficient of the IQSF, rjk is the correlation coefficient between the jth subtest and the kth subtest used for the IQSF, rlm is the correlation coefficient between subtest l and subtest m used for the FSIQ or the GAI, p is the number of subtests used for the IQSF, and SDSF is the standard deviation of the IQSF.
Degree of discrepancy
Paired Student t-tests were computed in order to determine whether the average of the IQSF scores was significantly different from the average of the FSIQ or GAI scores. This index allows us to determine whether the measurement provided by the short form is significantly different from that of the full scale. The effect size (Cohen's d for correlated samples; Lakens, 2013) was estimated in order to determine the magnitude of the difference between the mean values.
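The discrepancy analysis can be sketched as follows, assuming that the effect size is the repeated-measures d (d_rm) described by Lakens (2013), i.e., the standardized mean difference multiplied by the square root of 2(1 − r). The function name and the example scores are hypothetical, and the p-value is omitted because the Python standard library has no t distribution.

```python
import math


def paired_comparison(short_scores, full_scores):
    """Return the paired t statistic, its degrees of freedom, r, and Cohen's d_rm."""
    n = len(short_scores)
    diffs = [s - f for s, f in zip(short_scores, full_scores)]
    mean_diff = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    t_value = mean_diff / (sd_diff / math.sqrt(n))

    # Pearson correlation between the two sets of scores.
    mean_s = sum(short_scores) / n
    mean_f = sum(full_scores) / n
    cov = sum((s - mean_s) * (f - mean_f) for s, f in zip(short_scores, full_scores))
    var_s = sum((s - mean_s) ** 2 for s in short_scores)
    var_f = sum((f - mean_f) ** 2 for f in full_scores)
    r = cov / math.sqrt(var_s * var_f)

    # d_rm: difference standardized by the SD of the differences,
    # then rescaled by sqrt(2 * (1 - r)) to correct for the correlation.
    d_rm = (mean_diff / sd_diff) * math.sqrt(2.0 * (1.0 - r))
    return {"t": t_value, "df": n - 1, "r": r, "d_rm": d_rm}


# Made-up scores for five children, for illustration only.
result = paired_comparison([128, 131, 126, 135, 140], [127, 133, 125, 136, 138])
print({key: round(value, 3) for key, value in result.items()})
```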
An indicator of the accuracy (Cacc) of the estimation of the FSIQ or GAI from the IQSF score was prepared, in order to identify the percentage of individuals in our sample having an IQSF score greater than or equal to the threshold value of 125. The indicator Cacc also takes into account the measurement stability between the IQSF score and the FSIQ and GAI scores. Usually, two scores are considered stable when their difference lies in the range between −2 and +2 standard errors of measurement (Meyers et al., 2013). Cacc is a coefficient lying in the range between 0 and 1. It can be interpreted as the accuracy of IGC identification. The closer Cacc is to 1, the more accurate the IQSF score-based identification.
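One plausible reading of Cacc is sketched below: the proportion of children whose IQSF score reaches the cutoff and also lies within 2 SEM of the full-scale score. The exact definition is not spelled out in the text, so this is an assumption, as is the use of the short form's SEM computed as 15 × sqrt(1 − rcc). Function names and example values are hypothetical.

```python
import math


def composite_reliability(reliabilities, intercorrelations):
    """Composite reliability (Tellegen and Briggs, 1967).

    `intercorrelations` holds one correlation per distinct pair of subtests.
    """
    n = len(reliabilities)
    sum_r = sum(intercorrelations)
    return (sum(reliabilities) + 2.0 * sum_r) / (n + 2.0 * sum_r)


def standard_error_of_measurement(r_cc, sd=15.0):
    """SEM on the IQ metric, assuming SD = 15."""
    return sd * math.sqrt(1.0 - r_cc)


def c_acc(short_scores, full_scores, r_cc, cutoff=125.0):
    """Share of cases with IQSF >= cutoff AND |IQSF - full score| <= 2 SEM (assumed definition)."""
    band = 2.0 * standard_error_of_measurement(r_cc)
    hits = sum(
        1
        for short, full in zip(short_scores, full_scores)
        if short >= cutoff and abs(short - full) <= band
    )
    return hits / len(short_scores)


# Illustrative values only: two subtests with reliabilities 0.80 and 0.90, r = 0.50.
r_cc = composite_reliability([0.80, 0.90], [0.50])
print(round(r_cc, 2), round(standard_error_of_measurement(r_cc), 2))
print(c_acc([130, 124, 140, 126], [128, 127, 129, 127], r_cc=0.91))
```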
In order to simplify the interpretation of multiple psychometric reliability and validity scores, we constructed a composite indicator (Rc) based on the approach proposed by Cyr and Brooker (1984), and adapted by Girard and Christensen (2008). Rc corresponds to the unweighted average of the following reliability and validity indicators: the composite reliability coefficient (rcc), the hierarchical omega (ωH), the intraclass correlation coefficient (ICC), the corrected correlation (r′) between the IQSF score and the FSIQ and GAI scores, and the identification accuracy (Cacc). We thus reproduced and adapted the psychometric agreement indicator of Girard and Christensen (2008), with our forms and the full form of the Wechsler scale. This makes it possible to interpret the strength of agreement, which ranges from 0 (absence) to 1 (perfect), between the IQSF score and the FSIQ and GAI scores (Girard et al., 2015).
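The composite indicator amounts to a simple mean, as the following sketch shows. It averages the indicators enumerated above for one form and one criterion (FSIQ or GAI) and then ranks the forms; the form names and indicator values are made up for illustration and are not results from the study.

```python
from statistics import fmean


def agreement_indicator(r_cc, omega_h, icc, r_corrected, c_acc):
    """Unweighted mean of the reliability and validity indicators (Rc)."""
    indicators = (r_cc, omega_h, icc, r_corrected, c_acc)
    if not all(0.0 <= value <= 1.0 for value in indicators):
        raise ValueError("each indicator is expected to lie between 0 and 1")
    return fmean(indicators)


def rank_forms(indicators_by_form):
    """Return (form, Rc) pairs sorted by decreasing Rc."""
    scores = {
        name: agreement_indicator(*values)
        for name, values in indicators_by_form.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical indicator values (rcc, omega_H, ICC, r', Cacc) for two forms.
example = {
    "FormA": (0.90, 0.70, 0.80, 0.80, 0.80),
    "FormB": (0.85, 0.60, 0.75, 0.70, 0.60),
}
for name, rc in rank_forms(example):
    print(name, round(rc, 3))
```

An unweighted mean treats every indicator as equally important; a weighted variant would be a different design choice that the text does not describe.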
Description of the Sample
All of the participants obtained a GAI score equal to, or greater than 125. The descriptive data from our sample, as well as the mean and standard deviation of the FSIQ, the GAI, and the subtests from the WISC-IV, are shown in Table 1.
The deviations and equations for each IQSF are presented in Table 2. All of the scores from each 2-subtest and 4-subtest IQSF have skewness and kurtosis coefficients lying between −1 and 1. All of the 2- and 4-subtest forms have an average value greater than 125 in intellectually gifted children and an average value of 107 in typical children. The raw data are available in the Supplementary Material (Data Sheet 1).
Short Form WISC-IV Estimation With Simulated Dataset
We compared the reliability and validity indicators obtained from real and simulated WISC-IV data. The composite indicators Rc calculated with our sample and with the simulated dataset are highly correlated [r(36) = 0.988]. In addition, the individual indicators of reliability and validity estimated with our sample differ little from those estimated with the simulated dataset (Table 3). This near-perfect duplication of the psychometric quality indicators shows that their estimation with simulated data is very close to that derived from real data. We thus selected the short forms from the simulated dataset.
Table 3. Comparison of the principal indices of reliability and validity estimated with our sample and the simulated dataset.
The 2-subtest and 4-subtest forms are ranked by decreasing value of the composite score Rc, computed from the reliability indices and each validity index (see Table 4). All of the IQSF scores produced by the 2 subtests are significantly correlated with the FSIQ, r′(1e6) ∈ [0.680; 0.786]; p < 0.01. All of the IQSF scores computed from 4 subtests are also correlated with the FSIQ, r′(1e6) ∈ [0.823; 0.851]; p < 0.01. However, the forms with 4 subtests are significantly more strongly correlated than the forms with 2 subtests with the FSIQ (2 subtests: r(9e6) = 0.791; 4 subtests: r(9e6) = 0.885; 95% CI of the difference [−0.094, −0.094]) and with the GAI (2 subtests: r(9e6) = 0.856; 4 subtests: r(9e6) = 0.957; 95% CI of the difference [−0.102, −0.102]).
The short form with the Similarities + Matrix Reasoning (SiMr) subtests has the highest agreement score of all the 2-subtest forms, Rc = 0.807. It is the form most strongly correlated with the FSIQ, r′(1e6) = 0.786; p < 0.01, and the GAI score, r′(1e6) = 0.833; p < 0.01. In 80% of cases, the IQSF score for the SiMr form correctly identifies the IGC in our sample, and their IQSF scores lie within 2 standard errors of measurement of the FSIQ. In 91% of cases, the IQSF score for the SiMr form correctly identifies the IGC in our sample, and their IQSF scores lie within 2 standard errors of measurement of the GAI.
Among the 4-subtest forms, the Similarities + Vocabulary + Matrix Reasoning + Block Design [SiVoMrBd] form has the highest agreement score, Rc = 0.885. It is strongly correlated with the FSIQ, r′(1e6) = 0.851; p < 0.01, and the GAI score, r′(1e6) = 0.892; p < 0.01. In 86% of cases, the IQSF score for the SiVoMrBd form correctly identifies the IGC in our sample, and their IQSF scores lie within 2 standard errors of measurement of the FSIQ. In 99% of cases, the IQSF score for the SiVoMrBd form correctly identifies the IGC in our sample, and their IQSF scores lie within 2 standard errors of measurement of the GAI.
Identification Efficiency of Short-Form WISC-IV From Empirical Data
After selecting the three best short forms with 2 and 4 subtests from the simulated dataset, we computed the AUC, sensitivity and specificity of these IQSF from the empirical dataset. In Table 5, the short forms with 2 subtests and 4 subtests are ranked in descending order of the AUC, which indicates the predictive performance in identifying the IGC. All of the short-form scales have high AUCs, suggesting a highly predictive performance with both the 2- and 4-subtest models. The difference among all models is low (ΔAUCs < 0.05).
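The screening indicators can be illustrated with a small standard-library sketch; the study itself used the pROC library in R. Here the AUC is computed through its rank interpretation (the probability that a randomly chosen gifted child scores higher than a randomly chosen typical child, ties counting one half), and the rates are taken at the cutoff of 125. Function names and the example scores are hypothetical.

```python
def screening_rates(scores, is_gifted, cutoff=125.0):
    """Sensitivity, specificity, FPR and FNR of the rule `score >= cutoff`."""
    tp = sum(1 for s, g in zip(scores, is_gifted) if g and s >= cutoff)
    fn = sum(1 for s, g in zip(scores, is_gifted) if g and s < cutoff)
    fp = sum(1 for s, g in zip(scores, is_gifted) if not g and s >= cutoff)
    tn = sum(1 for s, g in zip(scores, is_gifted) if not g and s < cutoff)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }


def auc(scores, is_gifted):
    """Area under the ROC curve via pairwise comparisons (Mann-Whitney form)."""
    positives = [s for s, g in zip(scores, is_gifted) if g]
    negatives = [s for s, g in zip(scores, is_gifted) if not g]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up IQSF scores for three gifted and three typical children.
iqsf = [131, 127, 123, 118, 126, 105]
gifted = [True, True, True, False, False, False]
print(screening_rates(iqsf, gifted))
print(round(auc(iqsf, gifted), 3))
```

The pairwise AUC is quadratic in the sample size, which is acceptable for a sample of 169 children; a rank-based implementation would be preferable for much larger datasets.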
Among the forms with 2 subtests, the form with the best performance is still Similarities + Matrix Reasoning [SiMr]. It allows more than 74% of the IGC in our sample to be correctly identified. With the SiMr form, the probability of a typical child being incorrectly identified as gifted is around 1.1%. It differs significantly only from the GAI, t(168) = −3.903, p < 0.01, drm = −0.156, but it has the smallest effect size among the three best 2-subtest short forms.
Among the forms comprising 4 subtests, the IQSF with the best performance consisted of the Similarities + Vocabulary + Picture Concepts + Block Design [SiVoPcBd] subtests. It correctly identifies more than 96% of the IGC in our sample. No typical children were identified as being gifted, but 4.3% of the IGC in our sample were not correctly identified. It differs significantly only from the FSIQ, t(168) = −3.053, p < 0.05, drm = −0.109.
Short-Form WISC-V Estimation From Simulated Dataset
The 2-subtest and 4-subtest forms are ranked by decreasing value of the composite score Rc (see Table 6). All of the IQSF scores produced by the 2 subtests are significantly correlated with the FSIQ, r′(1e6) ∈ [0.741; 0.833]; p < 0.01. All of the IQSF scores computed from 4 subtests are also correlated with the FSIQ, r′(1e6) ∈ [0.921; 0.940]; p < 0.01.
As with the WISC-IV, the short form with the Similarities + Matrix Reasoning [SiMr] subtests has the highest agreement score of all the 2-subtest forms, Rc = 0.849. It is strongly correlated with the FSIQ, r′(1e6) = 0.833; p < 0.01, and the GAI score, r′(1e6) = 0.855; p < 0.01. In 87% of cases, the IQSF score for the SiMr form correctly identifies the IGC in our sample, and their IQSF scores lie within 2 standard errors of measurement of the FSIQ. In 92% of cases, the IQSF score for the SiMr form correctly identifies the IGC in our sample, and their IQSF scores lie within 2 standard errors of measurement of the GAI.
As with the WISC-IV, the Similarities + Vocabulary + Matrix Reasoning + Block Design [SiVoMrBd] form has a high agreement score, Rc = 0.923. It is strongly correlated with the FSIQ, r′(1e6) = 0.898; p < 0.01, and the GAI score, r′(1e6) = 0.917; p < 0.01. In 95% of cases, the IQSF score for the SiVoMrBd form correctly identifies the IGC in our sample, and their IQSF scores lie within 2 standard errors of measurement of the FSIQ. In 99.8% of cases, the IQSF score for the SiVoMrBd form correctly identifies the IGC in our sample, and their IQSF scores lie within 2 standard errors of measurement of the GAI.
Our aim was to make a short form of the recent versions of Wechsler scales, allowing an intellectual capacity assessment in less than 30 min. In the literature, short forms with 2 or 4 subtests are often used to estimate intellectual capacity. We thus tested all possible short form combinations, comprising either 2 or 4 subtests used for the evaluation of high-level cognitive abilities.
The results of the present study indicate that the reliability and validity indicators estimated with simulated data are very close to those estimated with real data. Our findings also show that the short forms of the WISC-IV can perform well in identifying children with intellectual giftedness. The 4-subtest forms appear to produce better psychometric results than the 2-subtest forms. In addition, the 4-subtest short forms appear to provide a good compromise between test duration and psychometric qualities, which are more satisfactory than those obtained with the 2-subtest forms. The ωH coefficient also showed that the general factor explained more variance in the 4-subtest than in the 2-subtest forms. The 4-subtest forms thus seem to be a satisfactory trade-off between an accurate estimation of overall cognitive aptitude and administration time (Gignac, 2015).
In spite of a number of changes between the two versions, our results show that the 4-subtest form Similarities + Vocabulary + Matrix Reasoning + Block Design [SiVoMrBd], in the WISC-IV and WISC-V, is globally efficient in the identification of IGC. It appears to evaluate the three most discriminating cognitive abilities in the identification of IGC, i.e., Comprehension-Knowledge, Fluid reasoning, and Visual processing (Volker et al., 2006). Moreover, the composite score using the Similarities and Matrix Reasoning (SiMr) subtests, in the two recent Wechsler scales, appears to provide one of the best 2-subtest forms for the identification of IGC. These two types of short form thus appear to provide acceptable means of identification, for the selection of candidates for complementary evaluations (Prewett, 1995).
In the context of learning disabilities, an abbreviated intellectual assessment using the Similarities, Vocabulary, Matrix Reasoning, and Block Design subtests can allow clinical psychologists to obtain a less biased estimation of overall cognitive aptitude. Indeed, these subtests seem to be less affected by specific learning disabilities (e.g., Toffalini et al., 2017a,b). The time gained allows other cognitive assessments to be added, such as complex span tasks to assess working memory capacity (e.g., Gonthier et al., 2017).
The Gc, Gf, and Gv abilities evaluated by the subtests of the GAI score appear to provide the best discrimination in the identification of IGC. This is based on the idea that these cognitive abilities appear to be the characteristics of high intellectual potential (Margulies and Floyd, 2009). Thus, psychologists should choose measurements that are well adapted to characteristics that are related to intellectual giftedness (Pierson et al., 2012).
Our results reveal the importance of relying on a theoretical model of cognitive ability, such as the CHC model, when the aim is to identify IGC. In addition, this theoretical model can be very helpful for the schooling of children in general (Aubry and Bourdin, 2016) and IGC in particular (Warne, 2016).
Any decision in this respect should not be made solely on the basis of the composite score from the short form. It is also important to recognize the reality of measurement errors, which can prevent the correct identification of IGC (Pierson et al., 2012). It is thus recommended to implement a multidimensional evaluation of IGC's characteristics (McClain and Pfeiffer, 2012).
We have shown that some short forms have satisfactory psychometric qualities. However, they have to be accompanied by other assessments, such as a teacher rating scale (e.g., the Gifted Rating Scales–School Form, GRS; Pfeiffer and Jarosewich, 2003), in order to improve the quality of IGC identification (McBee et al., 2013, 2016). Short measurements appear to be reasonable tools for the prediction of global scores of complete batteries of tests, for the identification of children with intellectual giftedness (Newton et al., 2008).
Limitations of Our Study
The psychometric estimates derived from our simulated data are very close to those derived from our empirical data. However, the correspondence is not perfect, so the results for the WISC-V short forms, which rest on simulated data, may differ somewhat from those that empirical data would give. In terms of future perspectives, it would be interesting to implement a detailed analysis of the specificity and sensitivity of each WISC-V short form identified as being reliable and valid for the identification of IGC.
Among the limitations of our study, the relatively small number of typical children in our sample means that our results must be considered with caution. We tried to estimate the AUC, sensitivity, and specificity of all the short forms.
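For readers less familiar with these screening indices, the sketch below shows how sensitivity, specificity, and the AUC can be computed by hand. It is not the pROC implementation cited in the references. The scores and the cut-off of 130 are invented for illustration, and the function names are hypothetical.

```python
def sensitivity_specificity(gifted_scores, typical_scores, cutoff):
    """Share of gifted children flagged and share of typical children not flagged."""
    true_pos = sum(s >= cutoff for s in gifted_scores)
    true_neg = sum(s < cutoff for s in typical_scores)
    return true_pos / len(gifted_scores), true_neg / len(typical_scores)

def auc(gifted_scores, typical_scores):
    """AUC as the probability that a gifted child outscores a typical child (ties count 0.5)."""
    wins = 0.0
    for g in gifted_scores:
        for t in typical_scores:
            if g > t:
                wins += 1.0
            elif g == t:
                wins += 0.5
    return wins / (len(gifted_scores) * len(typical_scores))

# Toy short-form scores, invented for illustration (not data from the study).
gifted = [135, 128, 141, 131]
typical = [112, 126, 131, 105]

sens, spec = sensitivity_specificity(gifted, typical, cutoff=130)
area = auc(gifted, typical)
print(sens, spec, area)
```

With so few typical children, each misclassified case shifts specificity by a large step, which illustrates why estimates based on a small comparison group call for caution.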
The identification of IGC based on standardized tests such as the Wechsler scales is often time-consuming. This drawback can prevent other evaluations from being made, which could be essential for the education of these children with specific needs. The development of a short form of the recent versions of the Wechsler scales is thus useful for the fast and efficient identification of IGC. This short form would then make it possible to determine whether a more in-depth evaluation of various cognitive and socioemotional characteristics is needed.
Our study evaluated nine 2-subtest forms and nine 4-subtest forms, based on the linear method of Tellegen and Briggs (1967). In order to evaluate these different short forms, we used simulated datasets to compute several psychometric indicators, which were grouped together with an agreement indicator for the FSIQ and GAI scores. We also computed the AUC, sensitivity, and specificity on the basis of the 117 IGC and 52 typical children in our sample.
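As an illustration of the linear method, the sketch below applies the formulas usually attributed to Tellegen and Briggs (1967). The sum of n scaled scores (mean 10, SD 3) is converted to a deviation quotient using the composite standard deviation S_c = 3 × sqrt(n + 2Σr_jk), and the composite reliability is (Σr_jj + 2Σr_jk) / (n + 2Σr_jk). The scaled scores, intercorrelation, and subtest reliabilities are hypothetical values chosen for the example, not figures from the WISC manuals or from our data.

```python
import math

def composite_sd(n_subtests, pairwise_correlations, subtest_sd=3.0):
    """SD of the sum of n scaled scores: S_c = 3 * sqrt(n + 2 * sum of r_jk)."""
    return subtest_sd * math.sqrt(n_subtests + 2.0 * sum(pairwise_correlations))

def deviation_quotient(scaled_scores, pairwise_correlations):
    """Short-form IQ on the usual metric (mean 100, SD 15)."""
    n = len(scaled_scores)
    s_c = composite_sd(n, pairwise_correlations)
    return 100.0 + 15.0 * (sum(scaled_scores) - 10.0 * n) / s_c

def composite_reliability(subtest_reliabilities, pairwise_correlations):
    """Reliability of the unweighted composite of the subtests."""
    n = len(subtest_reliabilities)
    twice_r = 2.0 * sum(pairwise_correlations)
    return (sum(subtest_reliabilities) + twice_r) / (n + twice_r)

# Hypothetical 2-subtest example; every number below is an assumption.
scores = [15, 16]             # scaled scores on the two subtests
correlations = [0.50]         # one pairwise correlation for two subtests
reliabilities = [0.80, 0.86]  # assumed subtest reliabilities

iq_short_form = deviation_quotient(scores, correlations)
r_short_form = composite_reliability(reliabilities, correlations)
print(round(iq_short_form, 1), round(r_short_form, 3))
```

For a 4-subtest form, `correlations` would hold the six pairwise correlations among the four subtests.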
Our results show that the 4-subtest short form of the WISC-IV and WISC-V, Similarities + Vocabulary + Matrix Reasoning + Block Design [SiVoMrBd], appears to be one of the most reliable forms for the identification of IGC. Among the 2-subtest forms of the WISC-IV and WISC-V, the Similarities + Matrix Reasoning [SiMr] version appears to ensure an optimal compromise between reliability and accuracy for the estimation of FSIQ and GAI scores. In the case of our sample, this outcome led us to consider the usefulness of relying on a theory of cognitive aptitudes, such as the CHC model, in order to determine the specific cognitive characteristics of IGC. We are of the opinion that the elaboration of a short form should take these specific cognitive characteristics into account, in order to obtain sufficiently accurate identification of IGC.
The participants' data were retrieved from clinical psychologists as part of their day-to-day clinical practice, and a further examination was run with the aim of developing a short form of the Wechsler scale.
AA and BB designed the study. AA performed data collection and analyzed the data. AA and BB wrote the manuscript.
AA's PhD is part of the HPISCOL project; both are supported by the Hauts-de-France Regional Council and the FEDER (European Regional Development Fund).
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
First and foremost, we extend our thanks to the children and families who participated in our study. The authors also extend their warm thanks to Claire Touchet, Corentin Gonthier, Émilie Lacot, Geoffrey Blondelle, Yannick Gounden, and Marc Chatterji for their advice and their proofreading of this paper. The authors also thank the Académie de Versailles, Florence Pâris, Philippe Coche, Isabelle Sage, and Éric Turon-Lagot for their help. The authors also want to thank the two reviewers for their suggestions to improve this article. This research was supported by the European Regional Development Fund (ERDF) and the Regional Council of Hauts-de-France for the HPISCOL project: Enfants et adolescents à haut potentiel: Identification et scolarisation.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.00830/full#supplementary-material
Data Sheet 1. Raw data.
Assouline, S. G., Foley-Nicpon, M., and Whiteman, C. (2010). Cognitive and psychosocial characteristics of gifted students with written language disability. Gifted Child Q. 54, 102–115. doi: 10.1177/0016986209355974
Aubry, A., and Bourdin, B. (2016). Les Tests BV9 et B53 Peuvent-ils Prédire la Réussite Scolaire? [Can the B53 and BV9 Tests Predict the Academic Achievement?]. L'orientation Scolaire et Professionnelle 45(3).
Brasseur, S., and Grégoire, J. (2010). L'intelligence émotionnelle – trait chez les adolescents à haut potentiel : spécificités et liens avec la réussite scolaire et les compétences sociales [Trait emotional intelligence in adolescents with high potential: specificities and links with academic achievement and social competences]. Enfance 2010, 59–76. doi: 10.4074/S0013754510001060
Calero, M. D., García-Martín, M. B., Jiménez, M. I., Kazén, M., and Araque, A. (2007). Self-regulation advantage for high-IQ children: Findings from a research study. Learn. Individ. Differ. 17, 328–343. doi: 10.1016/j.lindif.2007.03.012
Cornoldi, C., Giofrè, D., Orsini, A., and Pezzuti, L. (2014). Differences in the intellectual profile of children with intellectual vs. learning disability. Res. Dev. Disabil. 35, 2224–2230. doi: 10.1016/j.ridd.2014.05.013
Crawford, J. R., Anderson, V., Rankin, P. M., and MacDonald, J. (2010). An index-based short form of the WISC-IV with accompanying analysis of the reliability and abnormality of differences. Br. J. Clin. Psychol. 49, 235–258. doi: 10.1348/014466509X455470
Dasi, C., Soler, M. J., Bellver, V., and Ruiz, J. C. (2014). Short form of Spanish version of the WISC-IV for intelligence assessment in elementary school children. Psychol. Rep. 115, 784–793. doi: 10.2466/03.PR0.115c32z7
Donders, J., Elzinga, B., Kuipers, D., Helder, E., and Crawford, J. R. (2013). Development of an eight-subtest short form of the WISC-IV and evaluation of its clinical utility in children with traumatic brain injury. Child Neuropsychol. 19, 662–670. doi: 10.1080/09297049.2012.723681
Fiorello, C. A., Hale, J. B., McGrath, M., Ryan, K., and Quinn, S. (2001). IQ interpretation for children with flat and variable test profiles. Learn. Individ. Differ. 13, 115–125. doi: 10.1016/S1041-6080(02)00075-4
Gamer, M., Lemon, J., and Singh, I. F. P. (2012). irr: Various Coefficients of Interrater Reliability and Agreement (Version 0.84) [Computer Software]. Available online at: https://cran.r-project.org/web/packages/irr/
Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., et al. (2018). Multivariate Normal and t Distributions. Available online at: http://mvtnorm.r-forge.r-project.org
Giofrè, D., Toffalini, E., Altoè, G., and Cornoldi, C. (2017). Intelligence measures as diagnostic tools for children with specific learning disabilities. Intelligence 61, 140–145. doi: 10.1016/j.intell.2017.01.014
Girard, T. A., Axelrod, B. N., Patel, R., and Crawford, J. R. (2015). Wechsler adult intelligence scale-IV dyads for estimating global intelligence. Assessment 22, 441–448. doi: 10.1177/1073191114551551
Girard, T. A., and Christensen, B. K. (2008). Clarifying problems and offering solutions for correlated error when assessing the validity of selected-subtest short forms. Psychol. Assess. 20, 76–80. doi: 10.1037/1040-3590.20.1.76
Golay, P., Reverte, I., Rossier, J., Favez, N., and Lecerf, T. (2013). Further insights on the French WISC–IV factor structure through Bayesian structural equation modeling. Psychol. Assess. 25, 496–508. doi: 10.1037/a0030676
Gonthier, C., Aubry, A., and Bourdin, B. (2017). Measuring working memory capacity in children using adaptive tasks: example validation of an adaptive complex span. Behav. Res. Methods 27, 1–12. doi: 10.3758/s13428-017-0916-4
Hale, J. B., Fiorello, C. A., Kavanagh, J. A., Holdnack, J. A., and Aloe, A. M. (2007). Is the demise of IQ interpretation justified? A response to special issue authors. Appl. Neuropsychol. 14, 37–51. doi: 10.1080/09084280701280445
Hrabok, M., Brooks, B. L., Fay-McClymont, T. B., and Sherman, E. M. S. (2012). Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV) short-form validity: a comparison study in pediatric epilepsy. Child Neuropsychol. 20, 49–59. doi: 10.1080/09297049.2012.741225
Keith, T. Z., Fine, J. G., Taub, G. E., Reynolds, M. R., and Kranzler, J. H. (2006). Higher order, multisample, confirmatory factor analysis of the Wechsler Intelligence Scale for Children - Fourth Edition: what does it measure? School Psychol. Rev. 35, 108–127. Available online at: http://eric.ed.gov/?id=EJ788234
Kelley, K. (2018). MBESS (Version 4.4.3) [Computer Software]. Available online at: https://cran.r-project.org/package=MBESS
Kelley, K., and Pornprasertmanit, S. (2016). Confidence intervals for population reliability coefficients: evaluation of methods, recommendations, and software for composite measures. Psychol. Methods 21, 69–92. doi: 10.1037/a0040086
Kieng, S., Rossier, J., Favez, N., and Lecerf, T. (2013). Étude exploratoire de la stabilité à long terme des indices standard du WISC-IV [Exploratory study about long-term stability of French WISC-IV index scores]. Pratiques Psychologiques 19, 163–178. doi: 10.1016/j.prps.2013.07.003
Killan, J. B., and Hughes, L. C. (1978). A comparison of short forms of the intelligence scale for children - revised in the screening of gifted referrals. Gifted Child Q. 22, 111–115. doi: 10.1177/001698627802200123
Kroesbergen, E. H., van Hooijdonk, M., van Viersen, S., Middel-Lalleman, M. M. N., and Reijnders, J. J. W. (2016). The psychological well-being of early identified gifted children. Gifted Child Q. 60, 16–30. doi: 10.1177/0016986215609113
Lecerf, T., Bovet-Boone, F., Peiffer, E., Kieng, S., and Geistlich, S. (2016). WISC-IV GAI and CPI profiles in healthy children and children with learning disabilities. Rev. Eur. Psychol. Appl. 66, 101–107. doi: 10.1016/j.erap.2016.04.001
Lecerf, T., Kieng, S., and Geistlich, S. (2015). Cohésion–non-cohésion des scores composites : valeurs seuils et interprétabilité. L'exemple du WISC-IV [Cohesive vs. non cohesive composite scores: Cut-off values and interpretability. The example of the WISC-IV]. Pratiques Psychologiques 21, 155–171. doi: 10.1016/j.prps.2015.02.001
Lecerf, T., Reverte, I., Coleaux, L., Favez, N., and Rossier, J. (2010a). Indice d'Aptitude Général pour le WISC-IV: Normes francophones [General ability index for the WISC-IV: French norms]. Pratiques Psychologiques 16, 109–121. doi: 10.1016/j.prps.2009.04.001
Lecerf, T., Rossier, J., Favez, N., Reverte, I., and Coleaux, L. (2010b). The Four- vs. Alternative Six-Factor Structure of the French WISC-IV. Swiss J. Psychol. 69, 221–232. doi: 10.1024/1421-0185/a000026
Litster, K., and Roberts, J. (2011). The self-concepts and perceived competencies of gifted and non-gifted students: a meta-analysis. J. Res. Special Educ. Needs 11, 130–140. doi: 10.1111/j.1471-3802.2010.01166.x
Maehler, C., and Schuchardt, K. (2009). Working memory functioning in children with learning disabilities: does intelligence make a difference? J. Intellect. Disabil. Res. 53, 3–10. doi: 10.1111/j.1365-2788.2008.01105.x
Margulies, A. S., and Floyd, R. G. (2009). A Preliminary Examination of the CHC Cognitive Ability Profiles of Children with High IQ and High Academic Achievement Enrolled in Services for Intellectual Giftedness. Nashville, TN: Woodcock Munoz Foundation Press.
McBee, M. T., Peters, S. J., and Miller, E. M. (2016). The impact of the nomination stage on gifted program identification: a comprehensive psychometric analysis. Gifted Child Q. 60, 258–278. doi: 10.1177/0016986216656256
McBee, M. T., Peters, S. J., and Waterman, C. (2013). Combining scores in multiple-criteria assessment systems: the impact of combination rule. Gifted Child Q. 58, 69–89. doi: 10.1177/0016986213513794
McClain, M.-C., and Pfeiffer, S. (2012). Identification of gifted students in the United States today: a look at state definitions, policies, and practices. J. Appl. School Psychol. 28, 59–88. doi: 10.1080/15377903.2012.643757
McIntosh, D. E., Dixon, F. A., and Pierson, É. E. (2005). “Use of intelligence tests in the identification of giftedness,” in Contemporary Intellectual Assessment, eds P. L. Harrison and D. P. Flanagan, 2nd Edn (New York, NY: Guilford Press), 504–520.
McKenzie, K., Murray, A. L., Murray, K. R., and Murray, G. C. (2013). Assessing the accuracy of the WISC-IV seven-subtest short form and the child and adolescent intellectual disability screening questionnaire in identifying intellectual disability in children. Child Neuropsychol. 20, 372–377. doi: 10.1080/09297049.2013.799642
Meyers, J. E., Zellinger, M. M., Kockler, T., Wagner, M., and Miller, R. M. (2013). A validated seven-subtest short form for the WAIS-IV. Appl. Neuropsychol Adult 20, 249–256. doi: 10.1080/09084282.2012.710180
Murray, A. L., McKenzie, K., and Murray, G. C. (2016). An evaluation of the performance of the WISC-IV eight-subtest short form with children who may have an intellectual disability. J. Intellect. Dev. Disabil. 41, 50–53. doi: 10.3109/13668250.2015.1084611
Newman, T. M., Sparrow, S. S., and Pfeiffer, S. I. (2008). “The use of the WISC-IV in assessment and intervention planning for children who are gifted,” in WISC-IV. Clinical Assessment and Intervention, eds A. Prifitera, D. H. Saklofske and L. G. Weiss (San Diego, CA: Academic Press), 217–242.
Newton, J. H., McIntosh, D. E., Dixon, F., Williams, T., and Youman, E. (2008). Assessing giftedness in children: comparing the accuracy of three shortened measures of Intelligence to the Stanford–Binet Intelligence Scales, Fifth Edition. Psychol. Sch. 45, 523–536. doi: 10.1002/pits.20321
Pierson, É. E., Kilmer, L. M., Rothlisberg, B. A., and McIntosh, D. E. (2012). Use of brief intelligence tests in the identification of giftedness. J. Psychoeduc. Assess. 30, 10–24. doi: 10.1177/0734282911428193
Prewett, P. N. (1995). A comparison of two screening tests (the Matrix Analogies Test—Short Form and the Kaufman Brief Intelligence Test) with the WISC-III. Psychol. Assess. 7, 69–72. doi: 10.1037/1040-3590.7.1.69
R Core Team (2017). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. Available online at: https://www.R-project.org/
Reise, S. P., Bonifay, W. E., and Haviland, M. G. (2013). Scoring and modeling psychological measures in the presence of multidimensionality. J. Pers. Assess. 95, 129–140. doi: 10.1080/00223891.2012.725437
Reverte, I., Golay, P., Favez, N., Rossier, J., and Lecerf, T. (2015). Testing for multigroup invariance of the WISC-IV structure across France and Switzerland: Standard and CHC models. Learn. Individ. Differ. 40, 127–133. doi: 10.1016/j.lindif.2015.03.015
Reynolds, M. R., and Keith, T. Z. (2017). Multi-group and hierarchical confirmatory factor analysis of the Wechsler Intelligence Scale for Children—Fifth Edition: What does it measure? Intelligence 62, 31–47. doi: 10.1016/j.intell.2017.02.005
Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J.-C., et al. (2011). pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12:77. doi: 10.1186/1471-2105-12-77
Rowe, E. W., Kingsley, J. M., and Thompson, D. F. (2010). Predictive ability of the General Ability Index (GAI) versus the Full Scale IQ among gifted referrals. School Psychol. Q. 25, 119–128. doi: 10.1037/a0020148
Ruthsatz, J., and Urbach, J. B. (2012). Child prodigy: a novel cognitive profile places elevated general intelligence, exceptional working memory and attention to detail at the root of prodigiousness. Intelligence 40, 419–426. doi: 10.1016/j.intell.2012.06.002
Ryan, J. J., Glass, L. A., and Brown, C. N. (2007). Administration time estimates for Wechsler Intelligence Scale for Children-IV subtests, composites, and short forms. J. Clin. Psychol. 63, 309–318. doi: 10.1002/jclp.20343
Ryan, J. J., and Ward, L. C. (1999). Validity, reliability, and standard errors of measurement for two seven-subtest short forms of the Wechsler Adult Intelligence Scale—III. Psychol. Assess. 11, 207–211. doi: 10.1037/1040-3590.11.2.207
Saklofske, D. H., Weiss, L. G., Raiford, S. E., and Prifitera, A. (2005). “Advanced interpretive issues with the WISC-IV Full-Scale IQ and General Ability Index Scores,” in WISC-IV Advanced Clinical Interpretation, eds L. G. Weiss, D. H. Saklofske, A. Prifitera, and J. Holdnack (San Diego: Elsevier Academic Press), 99–138.
Shaw, P., Greenstein, D., Lerch, J., Clasen, L., Lenroot, R., Gogtay, N., et al. (2006). Intellectual ability and cortical development in children and adolescents. Nature 440, 676–679. doi: 10.1038/nature04513
Shi, J., Tao, T., Chen, W., Cheng, L., Wang, L., and Zhang, X. (2013). Sustained attention in intellectually gifted children assessed using a continuous performance test. PLoS ONE 8:e0057417. doi: 10.1371/journal.pone.0057417
Thomeer, M. L., Lopata, C., Volker, M. A., Toomey, J. A., Lee, G. K., Smerbeck, A. M., et al. (2012). Randomized clinical trial replication of a psychosocial treatment for children with high-functioning autism spectrum disorders. Psychol. Sch. 49, 942–954. doi: 10.1002/pits.21647
Toffalini, E., Giofrè, D., and Cornoldi, C. (2017a). Strengths and weaknesses in the intellectual profile of different subtypes of specific learning disorder: a study on 1,049 diagnosed children. Clin. Psychol. Sci. 5, 402–409. doi: 10.1177/2167702616672038
Toffalini, E., Pezzuti, L., and Cornoldi, C. (2017b). Einstein and dyslexia: Is giftedness more frequent in children with a specific learning disorder than in typically developing children? Intelligence 62, 175–179. doi: 10.1016/j.intell.2017.04.006
van Viersen, S., de Bree, E. H., Kroesbergen, E. H., Slot, E. M., and de Jong, P. F. (2015). Risk and protective factors in gifted children with dyslexia. Ann. Dyslexia 65, 178–198. doi: 10.1007/s11881-015-0106-y
Warne, R. T. (2016). Five reasons to put the g back into giftedness: an argument for applying the Cattell-Horn-Carroll theory of intelligence to gifted education research and practice. Gifted Child Q. 60, 3–15. doi: 10.1177/0016986215605360
Watkins, M. W., Greenawalt, C. G., and Marcell, C. M. (2002). Factor Structure of the Wechsler Intelligence Scale for Children-Third Edition among Gifted Students. Educ. Psychol. Meas. 62, 164–172. doi: 10.1177/0013164402062001011
Wechsler, D. (2005). Manuel de l'Echelle d'Intelligence de Wechsler pour Enfants – 4e édition [Manual for the Wechsler Intelligence Scale for Children – fourth edition]. Paris: Editions du Centre de Psychologie Appliquée.
Wechsler, D. (2016). Manuel de l'Echelle d'Intelligence de Wechsler pour Enfants 5e édition [Manual for the Wechsler Intelligence Scale for Children – fifth edition]. Paris: Editions du Centre de Psychologie Appliquée.
Weiss, L. G., Beal, A. L., Saklofske, D. H., Alloway, T. P., and Prifitera, A. (2008). “Interpretation and intervention with WISC-IV in the clinical assessment context,” in WISC-IV. Clinical Assessment and Intervention, eds A. Prifitera, D. H. Saklofske, and L. G. Weiss (San Diego: Elsevier Inc), 3–66.
Weiss, L. G., Keith, T. Z., Zhu, J., and Chen, H. (2013). WISC-IV and Clinical validation of the four- and five-factor interpretative approaches. J. Psychoeduc. Assess. 31, 114–131. doi: 10.1177/0734282913478032
Keywords: intelligence, brief assessment, gifted children, short-form, screening tools
Citation: Aubry A and Bourdin B (2018) Short Forms of Wechsler Scales Assessing the Intellectually Gifted Children Using Simulation Data. Front. Psychol. 9:830. doi: 10.3389/fpsyg.2018.00830
Received: 15 February 2018; Accepted: 08 May 2018;
Published: 28 May 2018.
Edited by: Ilaria Grazzani, Università degli Studi di Milano Bicocca, Italy
Reviewed by: Carmen Belacchi, University of Urbino, Italy
David Giofrè, Liverpool John Moores University, United Kingdom
Copyright © 2018 Aubry and Bourdin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Béatrice Bourdin, firstname.lastname@example.org