Short Forms of Wechsler Scales Assessing the Intellectually Gifted Children Using Simulation Data

Aubry, Alexandre; Bourdin, Béatrice

doi:10.3389/fpsyg.2018.00830

METHODS article

Front. Psychol., 28 May 2018

Sec. Human Developmental Psychology

Volume 9 - 2018 | https://doi.org/10.3389/fpsyg.2018.00830

Alexandre Aubry

Béatrice Bourdin^*

CRP-CPO EA 7273, Université de Picardie Jules Verne, Amiens, France

Intellectual giftedness is usually defined in terms of having a very high Intellectual Quotient (IQ). Intellectual capacity is assessed by a standardized test such as the Wechsler Intelligence Scale for Children (WISC). However, the identification of intellectually gifted children (IGC) often remains time-consuming. A short-form WISC can be used as a screening instrument. Practitioners and researchers in this field can then make a more in-depth evaluation of the IGC's cognitive and socioemotional characteristics if needed. The aim of our study is thus to determine the best short tests, in terms of their psychometric qualities, for the identification of IGC. The current study comprises three analysis steps. Firstly, we created nine IQ short forms (IQ_SF) with 2 subtests, and nine IQ_SF with 4 subtests, from the WISC-IV (Wechsler, 2005). Secondly, we estimated psychometric parameters (i.e., reliability and validity) from empirical and simulated WISC-IV datasets. The psychometric qualities of each IQ_SF estimated from the simulated data are very close to those derived from the empirical data. We thus selected the three best IQ_SF based on the psychometric parameters estimated from the simulated dataset. For each selected short form of the WISC-IV, we estimated the screening quality in our sample of IGC. Thirdly, we created IQ_SF with 2 and 4 subtests from the WISC-V (Wechsler, 2016) with a simulated dataset. We then highlighted the three best short forms of the WISC-V based on the estimated psychometric parameters. The results are interpreted in terms of validity, reliability and screening quality for IGC. In spite of the important changes in the WISC-V, our findings show that the 2-subtest form, Similarities + Matrix Reasoning, and the 4-subtest form, Similarities + Vocabulary + Matrix Reasoning + Block Design, are the most efficient for identifying IGC with the two most recent versions of the Wechsler scales. 
Finally, we discuss the advantages and drawbacks of a brief assessment of intellectual aptitudes for the identification of the IGC.

Introduction

In the sphere of education, the rapid and reliable evaluation of a child's global intellectual capacity is important for an efficient identification of intellectually gifted children (IGC). Indeed, such an evaluation contributes to proposing specific educational programs (e.g., accelerated or enrichment programs). In this context, a short evaluation can initially be used to assess a child's intellectual giftedness. It can then serve to determine whether a more in-depth evaluation of the child's cognitive and socioemotional characteristics is needed.

The importance and usefulness of cognitive ability assessments for the identification of IGC have long been recognized by the scientific community as a means of facilitating the integration of IGC in specialized school programs (Simpson et al., 2002; Pierson et al., 2012). Furthermore, special education such as enrichment education has proved to impact positively on their cognitive skills (Shi et al., 2013). Consequently, an early identification of intellectual giftedness appears to be a predictor of not only their psychological well-being (Neihart, 1999; Litster and Roberts, 2011; Kroesbergen et al., 2016), but also their professional success as adults (Rinn and Bishop, 2015).

Intellectual giftedness is usually defined in terms of a high Intellectual Quotient (IQ) resulting from a standardized and validated intelligence test (Winner, 2000; see Lovett and Lewandowski, 2006). The Wechsler Intelligence Scale for Children (WISC-IV; Wechsler, 2005; WISC-V; Wechsler, 2016) is one of the most common tools used by both researchers and clinicians for the evaluation of intellectual capacity (Evers et al., 2012). It is currently used to identify intellectually gifted children in school (McClain and Pfeiffer, 2012). It allows a person's global intellectual potential to be estimated on the basis of a Full Scale Intellectual Quotient (FSIQ). The most commonly considered threshold value of the FSIQ is a score greater than or equal to 130, i.e., 2 standard deviations above the average (Carman, 2013). However, this cutoff is an ideal target value. Very high cutoffs also require a very high level of reliability in the measurement, in order to ensure satisfactory identification performance. Indeed, the measurement error has an influence on the identification of IGC (McIntosh et al., 2005). It perturbs the accuracy of decisions related to their identification (McBee et al., 2013). According to McIntosh et al. (2005), a threshold FSIQ value of 125, i.e., the 95th percentile, appears to be a reasonable choice for the identification of IGC. In the current study, we retained this cutoff of 125 in Wechsler's scale to identify IGC.

The recent versions of the Wechsler scales are explicitly developed on the basis of the Cattell-Horn-Carroll (CHC) theoretical model (Keith et al., 2006; Lecerf et al., 2010a; Golay et al., 2013; Weiss et al., 2013). There is a consensus in the literature with regard to the number of global cognitive abilities evaluated by the WISC-IV (Reverte et al., 2015) or WISC-V (Reynolds and Keith, 2017). Subtests are posited to estimate global cognitive aptitudes: Similarities, Vocabulary, and Comprehension are considered to evaluate the Comprehension-Knowledge (Gc) ability; Picture Concepts, Figure Weights (WISC-V only) and Matrix Reasoning are considered to estimate the Fluid reasoning (Gf) ability; Block Design and Visual Puzzles (WISC-V only) are posited to evaluate mainly the Visual processing (Gv) ability; Digit Span, Picture Span (WISC-V only) and Letter-Number Sequencing are considered to evaluate the Short-term memory (Gsm) ability; and Coding and Symbol Search target the Processing speed (Gs) ability.

In practice, IGC have higher performance for Gc, Gf and Gv than for Gsm and Gs (Volker et al., 2006; Rowe et al., 2010). It thus appears that IGC perform more efficiently in terms of high-level abilities than in terms of low-level abilities. The consequence is a strong scatter among the Wechsler scale scores of IGC. According to Flanagan and Kaufman (2009), an important and uncommon variability can affect the interpretation of the FSIQ. Indeed, the aim of the FSIQ is to summarize all cognitive aptitudes assessed by the scale. The FSIQ score is designed to estimate Spearman's (1904) g factor. If the tests are too heterogeneous, the common variance between tests decreases and the mean value of the performance does not provide a satisfactory representation of the overall cognitive aptitude. This abnormal variability between scores in the Wechsler scale could have an impact on the estimation of a child's overall intellectual functioning (Lecerf et al., 2015). To mitigate this problem, Weiss et al. (2008) proposed a new indicator: the General Ability Index (GAI). This is estimated from subtests used mainly to evaluate Gc, Gf, and Gv abilities. These are the most highly g-saturated core subtests (Keith et al., 2006; Lecerf et al., 2010b). The GAI would thus provide a better estimation of overall cognitive functioning when low g-loaded cognitive aptitudes are lower than the high g-loaded cognitive aptitudes (Watkins et al., 2002; Sattler and Ryan, 2009; Lecerf et al., 2016). Considering the discrepancies in performance for the low- and high-level cognitive abilities in IGC, the use of the GAI appears to be more judicious than the FSIQ in the context of the identification of IGC (Newman et al., 2008). The GAI also has a satisfactory reliability with regard to both short- and long-term stability (Kieng et al., 2013; Watkins and Smith, 2013). 
It is often used to include gifted children in special education such as enrichment or accelerated programs (Saklofske et al., 2005; Pierson et al., 2012). The GAI is often considered as an abbreviated form of the Wechsler scale for the identification of IGC. However, the short form of a test should reduce the examination time by at least 50% (Levy, 1968). The GAI reduces the administration time by only approximately 23% (Ryan et al., 2007), which falls short of this criterion. As a consequence, the identification of IGC often remains time-consuming, even with the GAI.

In order to reduce the examination time of a cognitive ability test, the most commonly used solution described in the literature is to reduce the number of subtests retained to estimate the global cognitive functioning (Silverstein, 1990). This time should not be gained at the cost of predictive accuracy (Doppelt, 1956). The use of short forms comprising more than 4 subtests only weakly increases the reliability of the measurement with respect to the cost in terms of examination time (Karnes and Brown, 1981).

The usefulness of brief intellectual assessment is controversial, because it prevents analysis of the cognitive profile, which can be essential in the context of learning disabilities (Fiorello et al., 2001; Hale et al., 2007). Nevertheless, a brief intellectual assessment may be useful to estimate intellectual potential. Indeed, several short forms of the WISC-IV have been validated in various languages (e.g., Crawford et al., 2010, in English; Dasi et al., 2014, in Spanish). These short forms have been used to obtain a rapid and reliable evaluation of intellectual ability in children with an intellectual disability (McKenzie et al., 2013; Murray et al., 2016), children with high-functioning autism spectrum disorder (Thomeer et al., 2012), children affected by epilepsy (Hrabok et al., 2012) or children affected by traumatic brain injury (Donders et al., 2013). Recent studies have used shortened versions of the Wechsler scale, with 2 or 4 subtests, for the detection of IGC (Shaw et al., 2006; Alloway and Elsworth, 2012; van Viersen et al., 2014, 2015). Most of these short forms have been constructed on the basis of subtests evaluating high-level cognitive abilities such as verbal comprehension and perceptual reasoning. Although there are short forms of the previous versions of the Wechsler Intelligence Scale for Children (WISC-R and WISC-III) as well as other sets of cognitive ability evaluations for the identification of IGC (Killan and Hughes, 1978; Dirks et al., 1980; Karnes and Brown, 1981; Ortiz and Gonzalez, 1989; Mark et al., 1998; for a review Simpson et al., 2002; Reiter, 2004; Pierson et al., 2012), to our knowledge, no short form of the WISC-IV or WISC-V has been tested with respect to its psychometric qualities in this atypically developing population. 
With respect to the use of shortened tests of cognitive ability as a decisional aid for the identification of IGC, Prewett (1995) suggests that shortened tests should generate scores that are comparable with those obtained with a battery of global assessment tests. It is thus necessary to compare the mean values of the short scale with that of the full scale. If a discrepancy between the IQ short form (IQ_SF) and the FSIQ is within 2 standard errors of measurement (SEM), the IQ_SF is considered as stable (Kieng et al., 2013; Meyers et al., 2013).

In the literature, some authors use 2 or 4 subtests to estimate an IQ_SF score from the Wechsler scale in order to identify IGC (Shaw et al., 2006; Alloway and Elsworth, 2012; van Viersen et al., 2014, 2015). To our knowledge, the psychometric qualities of these short forms have never been tested for the identification of IGC. In the present study, we constructed IQ_SF (nine with 2 subtests and nine with 4 subtests) from the possible combinations of 2 or 4 subtests in the Verbal Comprehension Index (VCI) and the Perceptual Reasoning Index (PRI). We compare their reliability, validity, and screening qualities. We excluded the subtests involving working memory, because these subtests are known to be negatively affected by specific learning disabilities (Maehler and Schuchardt, 2009; Cornoldi et al., 2014; Toffalini et al., 2017a). While IGC are known to show high performance in working memory tasks (Calero et al., 2007; Hoard et al., 2008; Ruthsatz and Urbach, 2012), these subtests can lead to an underestimation of overall cognitive performance in the context of a specific learning disability. All short forms based on subtests from the VCI and PRI make it possible to obtain an estimation of the FSIQ and GAI scores in less than 30 min. The aim of our study is thus to determine the best short scales, in terms of their psychometric qualities, for the identification of IGC.

Methods

Participants

The data were collected from WISC-IV protocols completed by 117 French IGC (mean age: 10.39, SD: 1.03, 74% boys) and 52 French intellectually typical children (mean age: 10.57, SD: 2.66, 79% boys). Clinical psychologists provided us with fully anonymized data, in compliance with the French code of ethics. To preserve the anonymity of the children, we chose not to contact the parents and their children. The groups did not significantly differ in terms of age, t_(57.902) = −0.485, p = 0.629, d = −0.109, or gender, χ²_(1) = 0.550, p = 0.458. The children were contacted through teachers, school psychologists or licensed psychologists. In order to allow for measurement errors, the inclusion criterion for the intellectually gifted group was set to an FSIQ or GAI score greater than or equal to 125, in accordance with current recommendations (McIntosh et al., 2005; Assouline et al., 2010; Brasseur and Grégoire, 2010).

Procedure

All of the statistical analyses were performed using version 3.4.2 of the R software (R Core Team, 2017). We estimated the intraclass correlation between each IQ_SF and the full scale using the irr library (Gamer et al., 2012) and the hierarchical omega (ω_H) of each IQ_SF using the MBESS library (Kelley, 2018). We also computed the 95% confidence intervals for the correlation comparisons (Zou, 2007) using the cocor library (Diedenhofen and Musch, 2015).

The data relevant to the preparation of short forms was extracted from fully administered WISC-IV protocols. The FSIQ scores were computed from 10 core subtests from the WISC-IV. The GAI scores were obtained from the VCI and PRI (Lecerf et al., 2010a).

The current study comprises three steps of statistical analysis. Firstly, the scores for the different IQ_SF were computed using the linear equation method (Tellegen and Briggs, 1967). They were derived from the sum of the standard scores of the VCI subtests (Vocabulary [Vo], Similarities [Si] and Comprehension [Co]) and the PRI subtests (Block Design [Bd], Picture Concepts [Pc], and Matrix Reasoning [Mr]). Secondly, we simulated 1,000,000 observations using the Cholesky decomposition of the WISC-IV correlation matrix (Table 4.1; Wechsler, 2005) with the mvtnorm library (Genz et al., 2018) (see Giofrè et al., 2017 for a similar approach). We compared the psychometric parameters (i.e., reliability and validity) estimated from the empirical and simulated datasets. Then, we selected the three best IQ_SF based on the psychometric parameters estimated from the simulated dataset. For each selected WISC-IV short form, we computed receiver operating characteristic (ROC) indicators, namely the sensitivity, the specificity, the false positive rate (FPR), the false negative rate (FNR), and the Area Under the Curve (AUC), from our empirical sample using the pROC library (Robin et al., 2011). The AUC is used as a general measure of the performance of the IGC classification (Fawcett, 2006). Thirdly, we generated another simulated dataset from the correlation matrix of the WISC-V (Table 4.1; Wechsler, 2016). From this simulated dataset, we created the different short forms of the WISC-V. We then highlighted the three best short forms of the WISC-V based on the psychometric parameters estimated from the simulated dataset.
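The simulation step can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not the R script used in the study. The correlation matrix `ILLUSTRATIVE_CORR` contains hypothetical values for three subtests, not those of the WISC-IV manual, and the function names are ours. The sketch draws independent standard normal values, multiplies them by the lower-triangular Cholesky factor of the correlation matrix, and rescales the result to the metric of scaled scores (mean 10, SD 3), without rounding to integers.

```python
import math
import random


def cholesky(matrix):
    """Return the lower-triangular factor L such that L * L^T equals matrix.

    The matrix must be symmetric and positive definite.
    """
    size = len(matrix)
    lower = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def simulate_scaled_scores(corr, n_obs, seed=0, mean=10.0, sd=3.0):
    """Simulate n_obs rows of correlated, normally distributed subtest scores."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    n_vars = len(corr)
    rows = []
    for _ in range(n_obs):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
        # Mixing independent draws with L induces the target correlations.
        mixed = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n_vars)]
        rows.append([mean + sd * value for value in mixed])
    return rows


def sample_correlation(x, y):
    """Pearson correlation between two equally long lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical inter-correlations between three subtests (illustration only).
ILLUSTRATIVE_CORR = [
    [1.0, 0.6, 0.4],
    [0.6, 1.0, 0.5],
    [0.4, 0.5, 1.0],
]

simulated = simulate_scaled_scores(ILLUSTRATIVE_CORR, n_obs=20_000, seed=42)
```

With a large number of simulated observations, the sample correlations between the simulated columns approach the values of the input matrix, which is what allows psychometric indices to be estimated from a published correlation table alone.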

The data and R script used for the simulations and the statistical analyses are available on the Open Science Framework (OSF) at http://osf.io/dax8p.

Statistical Analyses

Nine IQ_SF derived from 2-subtest forms, and nine IQ_SF derived from 4-subtest forms, were computed from 3 PRI and 3 VCI subtests. For each IQ_SF, the threshold for the identification of intellectual giftedness was a score of 125 or higher, i.e., the 95th percentile.
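As an illustration, the following Python sketch enumerates the short forms and converts a sum of scaled scores into a deviation quotient. It assumes that each form takes the same number of subtests from the VCI and from the PRI, which is the reading that yields nine 2-subtest and nine 4-subtest forms. The conversion follows the linear equation of Tellegen and Briggs (1967) for scaled scores with mean 10 and SD 3. The function names and the example values are hypothetical.

```python
import math
from itertools import combinations, product

VCI = ["Si", "Vo", "Co"]  # Similarities, Vocabulary, Comprehension
PRI = ["Bd", "Pc", "Mr"]  # Block Design, Picture Concepts, Matrix Reasoning


def balanced_forms(per_index):
    """List the forms taking `per_index` subtests from each index (assumption)."""
    vci_sets = combinations(VCI, per_index)
    pri_sets = combinations(PRI, per_index)
    return [v + p for v, p in product(vci_sets, pri_sets)]


def deviation_quotient(scaled_scores, sum_intercorrelations):
    """Rescale a sum of scaled scores to an IQ metric (mean 100, SD 15).

    sum_intercorrelations is the sum of the correlations over all distinct
    pairs of subtests included in the short form.
    """
    n = len(scaled_scores)
    composite_sd = 3.0 * math.sqrt(n + 2.0 * sum_intercorrelations)
    return 100.0 + 15.0 * (sum(scaled_scores) - 10.0 * n) / composite_sd


two_subtest_forms = balanced_forms(1)   # e.g. ("Si", "Mr")
four_subtest_forms = balanced_forms(2)  # e.g. ("Si", "Vo", "Bd", "Mr")
```

For example, with a hypothetical inter-correlation of 0.5 between two subtests, scaled scores of 16 and 15 give a short-form IQ of about 131.8, which is above the threshold of 125.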

It is essential to evaluate reliability and validity when selecting the best short form (Cyr and Brooker, 1984). For each form that we prepared, reliability and validity indices were thus computed.

Index of Reliability

Alpha

The reliability of each form was determined by a composite reliability coefficient (r_cc) according to Equation (1) from Tellegen and Briggs (1967), based on a table of internal consistency and inter-correlations of the applied subtests derived from the WISC-IV manual (Table 4.1 and Table 5.1 from Wechsler, 2005) and the WISC-V manual (Table 4.1 and Table 5.1 from Wechsler, 2016). This index allows the standard error of measurement of each short form to be determined. The composite reliability coefficient is frequently used to estimate the reliability of an abridged scale (Ryan and Ward, 1999; Girard et al., 2010, 2015; Donders et al., 2013; Denney et al., 2015).

$$ r_{cc} = \frac{\sum r_{jj} + 2 \times \sum r_{jk}}{n + 2 \times \sum r_{jk}} \tag{1} $$

where r_jj is the reliability coefficient of the jth subtest in IQ_SF, n is the number of subtests in IQ_SF, r_jk is the correlation coefficient between the jth subtest and the kth subtest used for IQ_SF.
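A short Python sketch of Equation (1) is given below for illustration. The reliabilities and the inter-correlation used in the example are hypothetical, not values from the Wechsler manuals. The second function uses the classical relation SEM = SD × √(1 − r) to obtain the standard error of measurement on the IQ metric (SD = 15).

```python
import math


def composite_reliability(subtest_reliabilities, pairwise_correlations):
    """Composite reliability r_cc of a short form (Equation 1).

    subtest_reliabilities: one r_jj per subtest in the short form.
    pairwise_correlations: one r_jk per distinct pair of subtests.
    """
    n = len(subtest_reliabilities)
    twice_sum_r = 2.0 * sum(pairwise_correlations)
    return (sum(subtest_reliabilities) + twice_sum_r) / (n + twice_sum_r)


def standard_error_of_measurement(reliability, sd=15.0):
    """Standard error of measurement for a score with the given SD."""
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical 2-subtest form: reliabilities 0.85 and 0.88, correlation 0.50.
example_r_cc = composite_reliability([0.85, 0.88], [0.50])
example_sem = standard_error_of_measurement(example_r_cc)
```

In this hypothetical example, r_cc is 0.91 and the standard error of measurement is 4.5 IQ points, so a band of ±2 SEM spans 9 points on either side of the score.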

Hierarchical omega

In contrast to the alpha reliability coefficient, the omega coefficient (ω) takes unequal factor loadings into account (Watkins, 2017). In particular, the hierarchical omega (ω_H) also has the advantage of being unaffected by the fit of the factor analysis model (Kelley and Pornprasertmanit, 2016). Thus, ω_H can be a better index of the reliability of a composite score from the Wechsler scale than the alpha coefficient (Gignac and Watkins, 2013). This index estimates the proportion of variance in the composite score that is attributable to the general factor. A high value of ω_H indicates that the general factor explains a large part of the variance in the composite score. An ω_H coefficient is considered reliable if it exceeds a minimum of 0.50, but values greater than or equal to 0.75 are preferable (Reise et al., 2013).
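The definition of ω_H can be made concrete with a simplified Python sketch. It is not the procedure implemented in the MBESS library, which estimates the factor model from data. Here the loadings on the general factor are assumed to be already known, and ω_H is computed as the squared sum of these loadings divided by the variance of the unit-weighted composite, i.e., the sum of all entries of the subtest correlation matrix. All numerical values are hypothetical.

```python
def omega_hierarchical(general_loadings, correlation_matrix):
    """Share of composite variance attributable to the general factor.

    general_loadings: standardized loading of each subtest on the general factor.
    correlation_matrix: correlations between the same subtests, in the same order.
    """
    general_variance = sum(general_loadings) ** 2
    total_variance = sum(sum(row) for row in correlation_matrix)
    return general_variance / total_variance


# Hypothetical 4-subtest form (order: Si, Vo, Mr, Bd).
EXAMPLE_LOADINGS = [0.7, 0.7, 0.6, 0.6]
EXAMPLE_CORR = [
    [1.00, 0.70, 0.45, 0.42],
    [0.70, 1.00, 0.42, 0.43],
    [0.45, 0.42, 1.00, 0.55],
    [0.42, 0.43, 0.55, 1.00],
]

example_omega_h = omega_hierarchical(EXAMPLE_LOADINGS, EXAMPLE_CORR)
```

With these hypothetical values, ω_H is about 0.68, which would exceed the minimum of 0.50 but not reach 0.75.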

Intraclass correlation coefficient

In addition, we used the Intraclass Correlation Coefficient (ICC; model A.1 in McGraw and Wong, 1996) to examine the reliability of both the IQ_SF and the full-scale scores.
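For illustration, the ICC(A,1) of McGraw and Wong (1996), i.e., the two-way, absolute-agreement, single-measure coefficient, can be sketched in Python from the mean squares of a two-way layout in which each child is a row and each score (short form, full scale) is a column. This is a minimal sketch, not the implementation of the irr library, and the example scores are hypothetical.

```python
def icc_a1(ratings):
    """ICC(A,1): two-way model, absolute agreement, single measure.

    ratings: list of rows, one per child, each holding k scores
    (for instance [IQ_SF, FSIQ]).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + (k / n) * (ms_cols - ms_error)
    return (ms_rows - ms_error) / denominator


# Hypothetical pairs: the second score is always 5 points above the first.
shifted_pairs = [[120, 125], [125, 130], [130, 135], [135, 140], [140, 145]]
example_icc = icc_a1(shifted_pairs)
```

Because this coefficient measures absolute agreement, a constant shift between the two scores lowers it even when the scores are perfectly correlated: the hypothetical pairs above give an ICC of about 0.83, whereas identical scores give exactly 1.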

Index of Validity

Three validity indices were estimated: the convergent validity determined by the corrected correlation between the short form and the full form of the scale (r′), the degree of discrepancy computed as the difference between the IQ_SF score and the full form score (FSIQ or GAI), and the accuracy of the estimation of the full form scores (C_acc).

Convergent validity

The convergent validity of each form was determined by computing its correlation (Pearson's r) with the FSIQ and GAI scores. These correlations were then corrected (r′; Equation 2) by taking the redundancy of the error variance into account, using the modified version (Girard and Christensen, 2008) of the Levy formula (Levy, 1967). The forms were indeed prepared from 2 or 4 subtests taken from the fully administered scale. The correlation between the short forms and the full forms (FSIQ or GAI) is therefore artificially increased by the measurement error shared between the two forms:

$$ r' = r_{sf} - (1 - r_{cc}) \times \frac{\sqrt{p + 2 \times r_{jk}}}{\sqrt{10 + 2 \times r_{lm}}} \times \frac{SD_{sf}}{15} \tag{2} $$

where r′ is the corrected correlation coefficient; r_sf is the uncorrected correlation coefficient between the score of the short form (IQ_SF) and the full scale (FSIQ or GAI); r_cc is the composite reliability coefficient of the IQ_SF; r_jk is the correlation coefficient between the jth subtest and the kth subtest used for the IQ_SF; r_lm is the correlation coefficient between subtest l and subtest m used for the FSIQ or the GAI; p is the number of subtests used for the IQ_SF; and SD_sf is the standard deviation of the IQ_SF.

Degree of discrepancy

Paired Student t-tests were computed in order to determine whether the mean of the IQ_SF scores was significantly different from the mean of the FSIQ or GAI scores. This index allows us to determine whether the measurement provided by the short form is significantly different from that of the full scale. The effect size (Cohen's d for the comparison of correlated samples; Lakens, 2013) was estimated in order to determine the magnitude of the difference between the means.
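The two quantities can be sketched in Python as follows. The paired t statistic is the mean difference divided by its standard error, and the effect size for correlated samples follows the d_rm formula given by Lakens (2013), in which the standardized mean difference is multiplied by √(2(1 − r)), r being the correlation between the paired scores. The function name and the example scores are hypothetical.

```python
import math
import statistics


def paired_t_and_d_rm(short_scores, full_scores):
    """Return (t, degrees of freedom, d_rm) for paired short-form and full-scale scores."""
    n = len(short_scores)
    diffs = [s - f for s, f in zip(short_scores, full_scores)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    t_value = mean_diff / (sd_diff / math.sqrt(n))

    # Pearson correlation between the paired scores.
    mean_s = statistics.mean(short_scores)
    mean_f = statistics.mean(full_scores)
    cov = sum((s - mean_s) * (f - mean_f) for s, f in zip(short_scores, full_scores))
    ss_s = sum((s - mean_s) ** 2 for s in short_scores)
    ss_f = sum((f - mean_f) ** 2 for f in full_scores)
    r = cov / math.sqrt(ss_s * ss_f)

    # d_rm corrects the standardized difference for the correlation between scores.
    d_rm = (mean_diff / sd_diff) * math.sqrt(2.0 * (1.0 - r))
    return t_value, n - 1, d_rm


# Hypothetical scores of five children.
example_short = [126, 130, 128, 135, 140]
example_full = [128, 131, 132, 136, 139]
example_t, example_df, example_d_rm = paired_t_and_d_rm(example_short, example_full)
```

A negative t or d_rm indicates that the short form yields lower scores on average than the full scale.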

Accuracy

An indicator of the accuracy (C_acc) of the estimation of the FSIQ or GAI from the IQ_SF score was prepared, in order to identify the percentage of individuals in our sample having an IQ_SF score greater than or equal to the threshold value of 125. The indicator C_acc also takes into account the measurement stability between the IQ_SF score and the FSIQ and GAI scores. Usually, two scores are considered stable when their difference lies in the range between −2 and +2 standard errors of measurement (Meyers et al., 2013). C_acc is a coefficient lying in the range between 0 and 1. It can be interpreted as the accuracy of IGC identification. The closer C_acc is to 1, the more accurate the IQ_SF score-based identification.
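One possible reading of this indicator is sketched below in Python. It rests on assumptions that the text does not spell out: the denominator is taken to be the children who reach the threshold on the full form, a child counts as accurately identified when the IQ_SF also reaches the threshold and differs from the full-form score by no more than 2 SEM, and the SEM is supplied by the user. The function name and the example values are hypothetical.

```python
def identification_accuracy(short_scores, full_scores, sem, cutoff=125.0):
    """Proportion of children at or above the cutoff on the full form who are
    also at or above it on the short form, within +/- 2 SEM of the full form.

    This is an assumed operationalization of C_acc, not a confirmed formula.
    """
    gifted_pairs = [(s, f) for s, f in zip(short_scores, full_scores) if f >= cutoff]
    if not gifted_pairs:
        raise ValueError("no full-form score at or above the cutoff")
    hits = sum(
        1 for s, f in gifted_pairs
        if s >= cutoff and abs(s - f) <= 2.0 * sem
    )
    return hits / len(gifted_pairs)


# Hypothetical scores of five children and a hypothetical SEM of 4 points.
example_c_acc = identification_accuracy(
    short_scores=[130, 124, 140, 110, 128],
    full_scores=[128, 127, 128, 105, 126],
    sem=4.0,
)
```

In this hypothetical example, four children reach 125 on the full form. Two of them are counted as accurately identified, one falls below 125 on the short form, and one exceeds the ±2 SEM band, so the indicator equals 0.5.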

In order to simplify the interpretation of multiple psychometric reliability and validity scores, we constructed a composite indicator (R_c) based on the approach proposed by Cyr and Brooker (1984), and adapted by Girard and Christensen (2008). R_c corresponds to the unweighted average of the computed reliability and validity indicators, i.e., the composite reliability coefficient (r_cc), the hierarchical omega (ω_H), the intraclass correlation coefficient (ICC), the corrected correlation (r′) between the IQ_SF score and the FSIQ and GAI scores, and the identification accuracy (C_acc). We thus reproduced and adapted the psychometric agreement indicator of Girard and Christensen (2008), with our forms and the full form of the Wechsler scale. This makes it possible to interpret the strength of agreement, which ranges from 0 (absence) to 1 (perfect), between the IQ_SF score and the FSIQ and GAI scores (Girard et al., 2015).

Results

Description of the Sample

All of the participants obtained a GAI score equal to or greater than 125. The descriptive data from our sample, as well as the mean and standard deviation of the FSIQ, the GAI, and the subtests from the WISC-IV, are shown in Table 1.

TABLE 1

Table 1. Descriptive statistics of the sample.

The means, standard deviations and equations for each IQ_SF are presented in Table 2. All of the scores from each 2-subtest and 4-subtest IQ_SF have skewness and kurtosis coefficients lying between −1 and 1. All of the 2- and 4-subtest forms have an average value greater than 125 in intellectually gifted children and an average value of 107 in typical children. The raw data are available in the Supplementary Material (Data Sheet 1).

TABLE 2

Table 2. Mean (SD) of WISC-IV Short-Form Scores and formulas.

Short Form WISC-IV Estimation With Simulated Dataset

We compared the reliability and validity indicators obtained from the real and simulated WISC-IV data. The composite indicators R_c calculated with our sample and with the simulated dataset are highly correlated [r₍₃₆₎ = 0.988]. In addition, there are few differences between the individual indicators of reliability and validity estimated with our sample and those estimated with the simulated dataset (Table 3). This near-perfect replication of the psychometric quality indicators shows that their estimation with simulated data is very close to that derived from real data. We thus selected the short forms on the basis of the simulated dataset.

TABLE 3

Table 3. Comparison of the principal indexes of reliability and validity estimated with our sample and with the simulated dataset.

The 2-subtest and 4-subtest forms are ranked by decreasing value of the composite score R_c, determined using the reliability indices and each validity index (see Table 4). All of the IQ_SF scores produced by the 2 subtests are significantly correlated with the FSIQ, r′_(1e6) ∈ [0.680; 0.786]; p < 0.01. All of the IQ_SF scores computed from 4 subtests are also correlated with the FSIQ, r′_(1e6) ∈ [0.823; 0.851]; p < 0.01. However, the forms with 4 subtests are significantly more strongly correlated than the forms with 2 subtests with both the FSIQ (2 subtests: r_(9e6) = 0.791; 4 subtests: r_(9e6) = 0.885; 95% CI of the difference [−0.094, −0.094]) and the GAI (2 subtests: r_(9e6) = 0.856; 4 subtests: r_(9e6) = 0.957; 95% CI of the difference [−0.102, −0.102]).

TABLE 4

Table 4. Reliability and Validity of WISC-IV Short Forms from simulated data.

The short form with the Similarities + Matrix Reasoning (SiMr) subtests has the highest agreement score of all 2 subtest forms, R_c = 0.807. It is most strongly correlated with the FSIQ, r′_(1e6) = 0.786; p < 0.01, and GAI score, r′_(1e6) = 0.833; p < 0.01. In 80% of cases, the IQ_SF score for the SiMr form correctly identifies the IGC in our sample, and their IQ_SF lie within 2 standard errors of measurement of the FSIQ. In 91% of cases, the IQ_SF score for the SiMr form correctly identifies the IGC in our sample, and their IQ_SF lie within 2 standard errors of measurement of the GAI.

Among the 4-subtest forms, the Similarities + Vocabulary + Matrix Reasoning + Block Design [SiVoMrBd] form has the highest agreement score, R_c = 0.885. It is strongly correlated with the FSIQ, r′_(1e6) = 0.851; p < 0.01, and GAI score, r′_(1e6) = 0.892; p < 0.01. In 86% of cases, the IQ_SF score for the SiVoMrBd form correctly identifies the IGC in our sample, and their IQ_SF lie within 2 standard errors of measurement of the FSIQ. In 99% of cases, the IQ_SF score for the SiVoMrBd form correctly identifies the IGC in our sample, and their IQ_SF lie within 2 standard errors of measurement of the GAI.

Identification Efficiency of Short-Form WISC-IV From Empirical Data

After selecting the three best short forms with 2 and 4 subtests from the simulated dataset, we computed the AUC, sensitivity and specificity of these IQ_SF from the empirical dataset. In Table 5, the short forms with 2 subtests and 4 subtests are ranked in descending order of AUC, which indicates the predictive performance in identifying IGC. All short forms have high AUCs, suggesting a highly predictive performance for both the 2- and 4-subtest models. The differences among the models are small (ΔAUCs < 0.05).
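The screening indicators can be illustrated with a minimal Python sketch. It does not reproduce the pROC library. The sensitivity, specificity, FPR and FNR are computed at a fixed cutoff of 125, and the AUC is computed through its standard interpretation as the probability that a randomly chosen gifted child has a higher short-form score than a randomly chosen typical child, ties counting for one half. The function names and the example scores are hypothetical.

```python
def classification_rates(short_scores, is_gifted, cutoff=125.0):
    """Sensitivity, specificity, FPR and FNR of the rule IQ_SF >= cutoff."""
    tp = fp = tn = fn = 0
    for score, gifted in zip(short_scores, is_gifted):
        flagged = score >= cutoff
        if gifted and flagged:
            tp += 1
        elif gifted:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }


def area_under_curve(short_scores, is_gifted):
    """AUC as the proportion of (gifted, typical) pairs ranked correctly."""
    gifted = [s for s, g in zip(short_scores, is_gifted) if g]
    typical = [s for s, g in zip(short_scores, is_gifted) if not g]
    wins = sum(
        1.0 if g > t else 0.5 if g == t else 0.0
        for g in gifted
        for t in typical
    )
    return wins / (len(gifted) * len(typical))


# Hypothetical short-form scores: three gifted and three typical children.
example_scores = [130, 127, 124, 120, 126, 110]
example_labels = [True, True, True, False, False, False]
example_rates = classification_rates(example_scores, example_labels)
example_auc = area_under_curve(example_scores, example_labels)
```

In this hypothetical example, two of the three gifted children are flagged (sensitivity 2/3), one typical child is flagged (FPR 1/3), and eight of the nine gifted-typical pairs are ranked correctly (AUC ≈ 0.89). The AUC does not depend on the cutoff, which is why it serves as a general measure of classification performance.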

TABLE 5

Table 5. Performance analysis of the three best 2- and 4-subtest WISC-IV short forms.

Among the forms with 2 subtests, the form with the best performance is still Similarities + Matrix Reasoning [SiMr]. It allows more than 74% of the IGC in our sample to be correctly identified. With the SiMr form, the probability of a typical child being incorrectly identified as gifted is around 1.1%. It differs significantly only from the GAI, t₍₁₆₈₎ = −3.903, p < 0.01, d_rm = −0.156, and this is the smallest effect size among the three best 2-subtest short forms.

Among the forms comprising 4 subtests, the IQ_SF with the best performance consists of the Similarities + Vocabulary + Picture Concepts + Block Design [SiVoPcBd] subtests. It correctly identifies more than 96% of IGC in our sample. No typical children were identified as being gifted, but 4.3% of the IGC in our sample were not correctly identified. It differs significantly only from the FSIQ, t₍₁₆₈₎ = −3.053, p < 0.05, d_rm = −0.109.

Short-Form WISC-V Estimation From Simulated Dataset

The 2-subtest and 4-subtest forms are ranked by decreasing value of the composite score R_c (see Table 6). All of the IQ_SF scores produced by the 2 subtests are significantly correlated with the FSIQ, r′_(1e6) ∈ [0.741; 0.833]; p < 0.01. All of the IQ_SF scores computed from 4 subtests are also correlated with the FSIQ, r′_(1e6) ∈ [0.921; 0.940]; p < 0.01.

TABLE 6

Table 6. Reliability and Validity of WISC-V Short Forms from simulated data.

As with the WISC-IV, the short form with the Similarities + Matrix Reasoning [SiMr] subtests has the highest agreement score of all 2-subtest forms, R_c = 0.849. It is strongly correlated with the FSIQ, r′_(1e6) = 0.833; p < 0.01, and GAI score, r′_(1e6) = 0.855; p < 0.01. In 87% of cases, the IQ_SF score for the SiMr form correctly identifies the IGC in our sample, and their IQ_SF lie within 2 standard errors of measurement of the FSIQ. In 92% of cases, the IQ_SF score for the SiMr form correctly identifies the IGC in our sample, and their IQ_SF lie within 2 standard errors of measurement of the GAI.

As with the WISC-IV, the Similarities + Vocabulary + Matrix Reasoning + Block Design [SiVoMrBd] form has a high agreement score, R_c = 0.923. It is strongly correlated with the FSIQ, r′_(1e6) = 0.898; p < 0.01, and GAI score, r′_(1e6) = 0.917; p < 0.01. In 95% of cases, the IQ_SF score for the SiVoMrBd form correctly identifies the IGC in our sample, and their IQ_SF lie within 2 standard errors of measurement of the FSIQ. In 99.8% of cases, the IQ_SF score for the SiVoMrBd form correctly identifies the IGC in our sample, and their IQ_SF lie within 2 standard errors of measurement of the GAI.

Discussion

Our aim was to make a short form of the recent versions of Wechsler scales, allowing an intellectual capacity assessment in less than 30 min. In the literature, short forms with 2 or 4 subtests are often used to estimate intellectual capacity. We thus tested all possible short form combinations, comprising either 2 or 4 subtests used for the evaluation of high-level cognitive abilities.

The results of the present study indicate that the reliability and validity indicators estimated with simulated data are very close to those estimated with real data. Our findings also show that the short forms of the WISC-IV can perform well in identifying intellectually gifted children. The 4-subtest forms appear to produce better psychometric results than the 2-subtest forms. In addition, the 4-subtest short forms appear to provide a good compromise between test duration and psychometric qualities, which are more satisfactory than those obtained with 2-subtest forms. The ω_H coefficient also showed that the general factor explained more variance in the 4-subtest than in the 2-subtest forms. The 4-subtest forms seem to be a satisfactory trade-off between an accurate estimation of overall cognitive aptitude and administration time (Gignac, 2015).

In spite of a number of changes, our results show that the 4-subtest form Similarities + Vocabulary + Matrix Reasoning + Block Design [SiVoMrBd], in the WISC-IV and WISC-V, is globally efficient in the identification of IGC. It appears to evaluate the three most discriminating cognitive abilities in the identification of IGC, i.e., Comprehension-Knowledge, Fluid reasoning, and Visual processing (Volker et al., 2006). Moreover, the composite score using the Similarities and Matrix Reasoning (SiMr) subtests, in the two recent Wechsler scales, appears to provide one of the best 2-subtest forms for the identification of IGC. These two types of short form thus appear to provide acceptable means of identification, for the selection of candidates for complementary evaluations (Prewett, 1995).

In the context of learning disabilities, an abbreviated intellectual assessment using the Similarities, Vocabulary, Matrix Reasoning, and Block Design subtests can give clinical psychologists a less biased estimate of overall cognitive aptitude. Indeed, these subtests seem to be less affected by specific learning disabilities (e.g., Toffalini et al., 2017a,b). The time saved makes it possible to add other cognitive assessments, such as complex span tasks to assess working memory capacity (e.g., Gonthier et al., 2017).

The Gc, Gf, and Gv abilities evaluated by the subtests of the GAI score appear to provide the best discrimination in the identification of IGC. This is based on the idea that these cognitive abilities appear to be the characteristics of high intellectual potential (Margulies and Floyd, 2009). Thus, psychologists should choose measurements that are well adapted to characteristics that are related to intellectual giftedness (Pierson et al., 2012).

Our results reveal the importance of relying on a theoretical model of cognitive ability, such as the CHC model, when identifying IGC. In addition, this theoretical model can be very helpful for the schooling of children in general (Aubry and Bourdin, 2016) and IGC in particular (Warne, 2016).

Any decision in this respect should not be made solely on the basis of the composite score from the short form. It is also important to recognize the reality of measurement errors, which can prevent the correct identification of IGC (Pierson et al., 2012). It is thus recommended to implement a multidimensional evaluation of IGC's characteristics (McClain and Pfeiffer, 2012).

We have shown that some short forms have satisfactory psychometric qualities. However, they have to be accompanied by other assessments, for example a teacher rating scale such as the Gifted Rating Scales–School Form (GRS; Pfeiffer and Jarosewich, 2003), in order to improve the quality of IGC identification (McBee et al., 2013, 2016). Short measurements appear to be reasonable tools for predicting the global scores of complete test batteries in the identification of children with intellectual giftedness (Newton et al., 2008).

Limitations of Our Study

Our simulated data are very close to our empirical data. However, the correspondence is not perfect, so the results for the WISC-V short forms may differ somewhat from those that would be obtained with empirical data. In terms of future perspectives, it would be interesting to implement a detailed analysis of the specificity and sensitivity of each WISC-V short form identified as being reliable and valid for the identification of IGC.
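As a rough illustration of how simulated subtest scores of this kind can be generated, the sketch below draws pairs of correlated scaled scores (mean 10, SD 3) from a bivariate normal model. The correlation of 0.50, the function names, and the normal model itself are assumptions made for illustration only; they are not the values or the exact procedure used in the study.

```python
import math
import random


def simulate_scaled_pairs(n, r, seed=0):
    """Draw n pairs of scaled scores (mean 10, SD 3) with population correlation r."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Mix two independent standard normals so that corr(z1, zb) equals r
        zb = r * z1 + math.sqrt(1.0 - r * r) * z2
        pairs.append((10.0 + 3.0 * z1, 10.0 + 3.0 * zb))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# 0.50 is a hypothetical inter-subtest correlation, not a WISC value
simulated = simulate_scaled_pairs(20000, r=0.50)
print(round(pearson(simulated), 2))
```

With a large simulated sample, the sample correlation stays close to the target value, but it never matches it exactly, which is the kind of gap between simulated and empirical estimates discussed above.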

Among the limitations of our study, the relatively small number of typical children in our sample means that our results must be considered with caution. We tried to estimate the AUC, sensitivity and specificity of all the short forms.
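For readers less familiar with these screening indices, the sketch below shows one common way of computing sensitivity, specificity, and AUC from short-form scores. The scores, the cutoff of 130, and the function names are hypothetical and chosen only for illustration; they are not data or settings from the study.

```python
def sensitivity_specificity(scores_gifted, scores_typical, cutoff):
    """Classify a child as gifted when score >= cutoff; return (sensitivity, specificity)."""
    true_pos = sum(s >= cutoff for s in scores_gifted)
    true_neg = sum(s < cutoff for s in scores_typical)
    return true_pos / len(scores_gifted), true_neg / len(scores_typical)


def auc(scores_gifted, scores_typical):
    """AUC as the probability that a gifted child outscores a typical child (ties count 0.5)."""
    wins = 0.0
    for g in scores_gifted:
        for t in scores_typical:
            if g > t:
                wins += 1.0
            elif g == t:
                wins += 0.5
    return wins / (len(scores_gifted) * len(scores_typical))


# Hypothetical short-form IQs, not data from the study
gifted = [135, 128, 142, 131]
typical = [110, 131, 98, 120]

sens, spec = sensitivity_specificity(gifted, typical, cutoff=130)  # assumed cutoff
area = auc(gifted, typical)
print(sens, spec, area)
```

With only a few typical children, each misclassified case shifts specificity by a large step, which is why estimates from small groups call for caution.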

Conclusion

The identification of IGC based on the use of standardized tests such as the Wechsler scales is often time-consuming. This drawback can prevent other evaluations from being made, which could be essential for the education of these children with specific needs. The development of a short form of the recent versions of the Wechsler scales is thus useful for the fast and efficient identification of IGC. This short form would then allow the need for a more in-depth evaluation of various cognitive and socioemotional characteristics to be determined.

Our study evaluated nine 2-subtest forms and nine 4-subtest forms, based on the linear method of Tellegen and Briggs (1967). In order to evaluate these different short forms, we computed several psychometric indicators with simulated datasets; these were reorganized into groups with an agreement indicator for FSIQ and GAI scores. We also computed the AUC, sensitivity and specificity, on the basis of the 117 IGC and 52 typical children in our sample.
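The linear method referred to above converts a sum of subtest scaled scores into a deviation quotient (mean 100, SD 15). The following is a minimal sketch of that conversion, assuming scaled scores with mean 10 and SD 3, together with the standard formula for the reliability of an equally weighted composite. The function names and all numerical inputs are hypothetical and are not taken from the WISC manuals or from the study.

```python
import math


def short_form_iq(scaled_scores, pair_correlations):
    """Deviation quotient for a short form.

    scaled_scores: one scaled score (mean 10, SD 3) per subtest.
    pair_correlations: correlation for each distinct pair of subtests.
    """
    n = len(scaled_scores)
    # SD of a sum of n scores with SD 3: 3 * sqrt(n + 2 * sum of pairwise r)
    composite_sd = 3.0 * math.sqrt(n + 2.0 * sum(pair_correlations))
    return 100.0 + 15.0 * (sum(scaled_scores) - 10.0 * n) / composite_sd


def composite_reliability(reliabilities, pair_correlations):
    """Reliability of an equally weighted composite of standardized subtests."""
    shared = 2.0 * sum(pair_correlations)
    return (sum(reliabilities) + shared) / (len(reliabilities) + shared)


# Hypothetical 2-subtest example: scaled scores of 15 and 15, r = 0.50
iq = short_form_iq([15, 15], [0.50])
# Hypothetical subtest reliabilities of 0.80 each
rel = composite_reliability([0.80, 0.80], [0.50])
print(round(iq, 1), round(rel, 2))
```

The same functions apply to a 4-subtest form by passing four scaled scores and the six pairwise correlations.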

Our results show that the 4-subtest short form of the WISC-IV and WISC-V, Similarities + Vocabulary + Matrix Reasoning + Block Design [SiVoMrBd], appears to be one of the most reliable forms for the identification of IGC. Among the 2-subtest forms of the WISC-IV and WISC-V, the Similarities + Matrix Reasoning [SiMr] version appears to ensure an optimal compromise between reliability and accuracy for the estimation of FSIQ and GAI scores. In the case of our sample, this outcome led us to consider the usefulness of relying on a theory of cognitive aptitudes, such as that of the CHC model, in order to determine the specific cognitive characteristics of IGC. We are of the opinion that the elaboration of a short form should take these specific cognitive characteristics into account, in order to obtain sufficiently accurate identification of IGC.

Ethics Statement

The participants' data were retrieved from clinical psychologists as part of their day-to-day clinical practice, and a further examination was run with the aim of developing a short form of the Wechsler scales.

Author Contributions

AA and BB designed the study. AA performed data collection and analyzed the data. AA and BB wrote the manuscript.

Funding

AA's PhD is part of the HPISCOL project; both are supported by the Hauts-de-France Regional Council and the FEDER (European Regional Development Fund).

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

First and foremost, we extend our thanks to the children and families who participated in our study. The authors also extend their warm thanks to Claire Touchet, Corentin Gonthier, Émilie Lacot, Geoffrey Blondelle, Yannick Gounden, and Marc Chatterji for their advice and their proofreading of this paper. The authors also thank the Académie de Versailles, Florence Pâris, Philippe Coche, Isabelle Sage, and Éric Turon-Lagot for their help. The authors also want to thank the two reviewers for their suggestions to improve this article. This research was supported by the European Regional Development Fund (ERDF) and the Regional Council of Hauts-de-France for the HPISCOL project: Enfants et adolescents à haut potentiel: Identification et scolarisation.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyg.2018.00830/full#supplementary-material

Data Sheet 1. Raw data.

References

Alloway, T. P., and Elsworth, M. (2012). An investigation of cognitive skills and behavior in high ability students. Learn. Individ. Differ. 22, 891–895. doi: 10.1016/j.lindif.2012.02.001