Comparison of Measures of Ability in Adolescents with Intellectual Disability.

Finding the most appropriate intelligence test for adolescents with Intellectual Disability (ID) is challenging given their limited language, attention, perceptual, and motor skills and ability to stay on task. The study compared performance of 23 adolescents with ID on the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV), one of the most widely used intelligence tests, and three non-verbal IQ tests: the Raven's Colored Progressive Matrices (RCPM), the Test of Non-verbal Intelligence-Fourth Edition (TONI-4) and the Wechsler Non-verbal Scale of Ability (WNV). Results showed that the WISC-IV Full Scale IQ raw and scaled scores were highly correlated with total scores from the three non-verbal tests, although the correlations were higher for raw scores, suggesting that raw scores may lead to a better understanding of within-group differences and of what individuals with ID can do at the time of assessment. All participants attempted more questions on the non-verbal tests than the verbal. A preliminary analysis showed that adolescents with ID without Autism Spectrum Disorder (ASD) (n = 15) achieved higher scores overall than those presenting with ID+ASD (n = 8). Our findings support the view that short non-verbal tests are likely to give an IQ result similar to that obtained from the WISC-IV. In terms of the time to administer and the stress for participants, they are more appropriate for assessing adolescents with ID.


INTRODUCTION
Assessments of intellectual ability occur at all stages of life for all levels of ability and society as the basis of clinical and education decisions, and for research purposes (Davis et al., 2000; Bradley-Johnson, 2001; Bittles et al., 2002; Heffer et al., 2009; Larsen et al., 2011; Baron and Leonberger, 2012). Such assessments are critical for identification and diagnosis of individuals with neurodevelopmental disorders, especially those with Intellectual Disability (ID).
ID is defined by the Diagnostic and Statistical Manual of Mental Disorders: DSM-5 (American Psychiatric Association, 2013) as an IQ of 70 or below, lower than expected adaptive functioning, and presence from early childhood. The previous DSM classification (DSM-IV) (American Psychiatric Association, 1994) classified ID in degrees of severity based on adaptive functioning and IQ: Mild (50-55 to approximately 70; ∼85% of the ID population), Moderate (35-40 to 50-55; ∼10% of ID), Severe (20-25 to 35-40; 3-4% of ID), and Profound (below 20 or 25; 1-2% of ID). Approximately 50% of individuals diagnosed with ID also have co-morbid Autism Spectrum Disorder (ASD) (Matson et al., 1996; Matson and Shoemaker, 2009; Wilkins and Matson, 2009), and Baio (2012) reported that an estimated 31% of children with ASD have co-occurring ID, with an additional 23% having IQs in the borderline range (71-84). This is consistent with earlier observations by Vig and Jedrysek (1999), who reported that the more severe the individual's ID, the greater the likelihood of ASD symptoms. It has also been shown that impaired verbal and non-verbal communication is more common in individuals with ID co-morbid with ASD (ID+ASD) than in those with ASD alone (Deb and Prasad, 1994).
Assessment scores for individuals with ID play a significant role in the allocation of clinical and educational services and in monitoring the efficacy of training programs. However, intelligence testing in individuals with ID has significant limitations. Particular tests have emerged as the gold standard (e.g., the Wechsler Scales), often with insufficient up-to-date consideration or justification of the appropriateness of individual items and the scoring scheme for this population (Wechsler, 1949, 1974, 1997, 2003). Traditionally, measures of progress of cognitive development in ID have been made in terms of language-based items by comparing ID to Typically Developing (TD) groups of the same chronological and mental ages (Carroll, 1997). However, TD individuals have vastly better verbal skills, raising the question of which testing criteria are optimal and most sensitive.
Classical assessment tools rarely provide sensitive measurement for low functioning individuals. For example, most intelligence tests do not measure IQ below 40 (e.g., the Wechsler Scales, Wechsler, 1949, 1974, 1997, 2003; the Kaufman Assessment Battery for Children, Roid and Barram, 2004; and the Stanford Binet Scales, Thorndike et al., 1986), and so their use in measuring intellectual functioning below the mild ID range is limited. In addition, the normative samples rarely include an adequate number of participants with ID needed to provide sensitive measurement in the very low ability range, although such participants have been recruited for separate validation studies (Wechsler, 1949, 1974, 1997, 2003). Thus, floor effects have been discussed, particularly for different versions of the Wechsler Scales, e.g., by Whitaker and Wood (2008), who recommended extrapolating the relationship between subtest raw scores and scaled scores below a scaled score of one.
Other researchers have addressed the issue by using restandardized raw scores based on their sample-specific statistics, but this confounds comparison of results across study samples. Hessl et al. (2009), Sansone et al. (2014), Benson et al. (2015) and Orsini et al. (2015) have suggested new scoring approaches to use for ID populations, e.g., Z-scores or the deviation Z-scores from raw scores for the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV). However, the WISC-IV scaled score continues to be used to assess the cognitive abilities of different age groups and abilities, e.g., ranging from a 6-year-old child with moderate ID to a 16-year-old with intellectual giftedness or a TD individual, to plan special school programs (Sattler, 2008). Sattler (2008) has also suggested that individuals who are extremely high functioning are not adequately assessed on the WISC-IV and nor are those who are extremely low functioning. Individuals with ID usually have poor verbal ability, although usually more than expected from measures of their non-linguistic cognitive ability (Rondal, 2001; van der Schuit et al., 2011). Thus, scores from the verbal component may not be a valid or reliable measure of cognitive ability, and reliance on them may lead to a misunderstanding of cognitive potential (Courchesne et al., 2015).
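The general idea behind a Z-score derived from raw scores can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure published by Hessl et al. (2009) or the other authors cited above: the function name is ours, and the normative mean and standard deviation are hypothetical placeholders, not values from the WISC-IV manual.

```python
def raw_to_z(raw_score, norm_mean, norm_sd):
    """Express a raw score as its distance from a normative mean, in SD units."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw_score - norm_mean) / norm_sd


# Hypothetical norms for one subtest in one age band (illustrative values only).
HYPOTHETICAL_NORM_MEAN = 40.0
HYPOTHETICAL_NORM_SD = 8.0

# Two low raw scores that a floored scaled score might not distinguish.
z_low = raw_to_z(4, HYPOTHETICAL_NORM_MEAN, HYPOTHETICAL_NORM_SD)
z_high = raw_to_z(10, HYPOTHETICAL_NORM_MEAN, HYPOTHETICAL_NORM_SD)
print(z_low, z_high)
```

Unlike a scaled score with a fixed minimum, the Z-score keeps decreasing as the raw score decreases, so differences among very low scorers remain visible.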
Because of the limitations associated with using the WISC-IV, alternative assessments using non-verbal measures have emerged as useful. Such non-verbal measures include the Raven's Colored Progressive Matrices (RCPM) (Raven et al., 1995), the Test of Non-verbal Intelligence-Fourth Edition (TONI-4) (Brown et al., 2010), the Comprehensive Test of Non-verbal Intelligence-Second Edition (CTONI-2) (Hammill et al., 2009), and the Wechsler Non-verbal Scale of Ability (WNV) (Wechsler and Naglieri, 2006). The RCPM is well-known and has been much used cross-culturally (McCallum, 2003) since its appearance in 1938 (Raven, 1938). As with the RCPM, a low degree of cultural loading and linguistic demand is usually associated with the TONI-4 and the WNV (McCallum, 2013). In the current study we examined these three non-verbal tests. They were chosen because they rely on visual problem solving, require less verbal expression in administration, and need no verbal expression for responses.
This study firstly aimed to compare the raw scores (controlling for age) and scaled scores of an adolescent group with ID on the WISC-IV. Secondly, we compared both scoring systems of the WISC-IV with the raw scores obtained on the three non-verbal tests as a means of establishing whether the non-verbal tests are adequate measures of cognitive ability in individuals with ID. That is, we investigated: i. whether the WISC-IV scaled scores of our ID sample were similar to those of the ID sample reported in the manual; ii. whether the RCPM, TONI-4, and WNV raw scores correlated significantly with the WISC-IV raw scores; and iii. whether the WISC-IV raw scores or scaled scores are more appropriate for measuring cognitive understanding in individuals with ID.
In addition, since it is well accepted that many adolescents with ID also show symptoms of ASD, we did a preliminary comparison of the raw scores on each test of those participants with ID and no diagnosis of ASD and those with co-morbid ID and ASD.

Participants
Participants were 23 adolescents (15 male and 8 female) with ID who attended a Specialist School in Melbourne, Australia. All had been clinically diagnosed prior to starting school with ID, and often a more specific neurodevelopmental disorder, by a trained clinician using medical records and the DSM-IV criteria. All had IQ scores below 70 at school enrollment. The mean age was 14 years and 1 month (range from 11 years and 3 months to 17 years 0 months; SD = 1.86) at the start of the study. Of the entire ID sample of 23, the largest subgroup comprised 8 participants (34.8%) who had previously been assessed by a panel of three clinical experts, as required by the State Government of Victoria, as meeting the criteria for a clinical diagnosis of ASD. This group is referred to as the ID+ASD group. The other 15 had a primary diagnosis of ID, and included 3 with Attention Deficit Hyperactivity Disorder (13.0%), 1 with Down Syndrome (4.3%), 1 with Williams Syndrome (4.3%), 5 with Idiopathic ID (21.7%) and 5 with other medical diagnoses leading to ID (21.7%). One participant completed only 10 of the 15 subtests of the WISC-IV but completed all three non-verbal tests and was included in the analyses where appropriate. All parents/guardians provided informed consent prior to their child's participation. All individuals were screened for normal hearing and vision. Ethics approval was obtained from La Trobe University Human Ethics Committee and the State Department of Education. Permission to conduct testing in schools was obtained from the school principal.

Materials
The Wechsler Intelligence Scale for Children Fourth Edition (WISC-IV) (Wechsler, 2003)
The WISC-IV provides a Full-Scale IQ (FSIQ) to represent a child's overall cognitive ability (Flanagan and Kaufman, 2004). The WISC-IV can be used to assess individuals aged 6 years 0 months to 16 years 11 months. It has 10 core and 5 supplemental subtests. The subtests can be clustered into composite quotients for four indices: (a) the Verbal Comprehension Index (VCI), which is based on the Similarities, Vocabulary, and Comprehension subtests; (b) the Perceptual Reasoning Index (PRI), which is based on the Block Design, Matrix Reasoning, and Picture Concepts subtests; (c) the Working Memory Index (WMI), which is based on the Digit Span and Letter-Number Sequencing subtests; and (d) the Processing Speed Index (PSI), which is based on the Coding and Symbol Search subtests. The FSIQ is computed from all 10 core subtests. Internal consistency reliability coefficients range from 0.96 to 0.97 for the Full Scale IQ. Criterion validity coefficients with other intelligence tests range from 0.85 to 0.88 (Wechsler, 2003; Sattler, 2008).
See Table 1 for the WISC-IV subtests, index structure and task demands.
Raven's Colored Progressive Matrices (RCPM) (Raven et al., 1995)
The RCPM is commonly used to obtain a non-verbal reasoning component and was designed for individuals aged 5 through 11 years, elderly persons, and mentally and physically impaired persons. The RCPM comprises 36 multiple-choice items of abstract reasoning divided into three subsets of 12 items. Each item consists of a different colored matrix pattern with a "missing" piece. Six possible pieces are displayed as alternatives to best complete the pattern. The puzzle version (Bello et al., 2008) was used in the current study as it has been shown to reliably measure non-verbal mentation of individuals equal to the standard book form in children with typical development (Bello et al., 2008), and to increase response rate of children with ID. The puzzle version of the RCPM uses identical matrices to the book form, but the six alternative answers are attached by Velcro, requiring the participant to actually replace and reattach the chosen/preferred answer. Only raw scores could be used for the RCPM because the manual does not report standard scores. Split-half reliabilities range from 0.65 to 0.94. Concurrent validity coefficients between Raven's Progressive Matrices and other intelligence tests are in the 0.50-0.80 range (Raven et al., 1995).
The Test of Non-verbal Intelligence, 4th Edition (TONI-4) (Banks and Franzen, 2010)
The TONI-4 is designed to assess problem-solving ability and abstract reasoning abilities of individuals aged 6-89 years. This test provides language-free measures of cognitive ability, and does not require reading or writing skill. Instructions are given via pantomime; participants respond by pointing, nodding or blinking. The TONI-4 Form A has five training items and 45 abstract/figural problem-solving items. Items are in a multiple-choice format, with either five or six response alternatives. Based on the TONI-3, it can differentiate individuals with ID from those without (Sattler, 2008). Internal consistency reliability scores are satisfactory, ranging from 0.94 to 0.97. Correlation coefficients with other non-verbal intelligence tests range from 0.73 to 0.79 (Brown et al., 2010).
Table 1. Task demands of the WISC-IV indices.
Verbal Comprehension Index: Ability to listen to a question, draw upon learned information from both formal and informal education, reason through an answer, and express thoughts.
Perceptual Reasoning Index: Ability to examine a problem, draw upon visual-motor and visual-spatial skills, organize thoughts, create solutions, and then test them.
Working Memory Index: Ability to memorize new information, hold it in short-term memory, concentrate, and manipulate that information to produce some result or reasoning process.
Processing Speed Index: Ability to focus attention and quickly scan, discriminate between and sequentially order visual information.
The Wechsler Non-verbal Scale of Ability (WNV) (Wechsler and Naglieri, 2006)
The WNV is an individually administered non-verbal test of intelligence designed for ages 4 through 21 years. The purpose of the WNV is to expand the clinical utility of the Wechsler scales to individuals with language constraints. The average internal consistency reliability for the Full Scale IQ on the four-subtest version is 0.91. Correlations between the Full Scale IQ using the four-subtest version and other tests of intelligence range between 0.71 and 0.82 (Wechsler and Naglieri, 2006).

Procedure
The participants were tested individually on school grounds during school hours. All tests were conducted using standardized instructions, scoring and interpretation. The results for each participant were considered a valid assessment, not impeded by behavioral or emotional factors. The RCPM was usually administered in the initial session, the TONI-4 in the second, the WNV in the third and the WISC-IV in the last. The RCPM, TONI-4, and WNV were each completed in one session (approximately 10-45 min), while the WISC-IV was completed in 2-4 sessions, with a range of 10-30 min per testing session, depending on the child's concentration. All standard test scores were obtained using the norms from the relevant test manuals.
Although six participants were chronologically over 16 years 11 months, their mental age was under 9 years. Thus, we used standard score equivalents of ceiling chronological age provided in the WISC-IV manual. All 6 subtests from the WNV were included, as were the 10 core subtests from the WISC-IV plus the 5 supplemental subtests.

RESULTS
Both raw scores and scaled scores from each test were first checked for normality and the results were acceptable. Our first analysis investigated whether the WISC-IV scaled scores of our sample (n = 23) were similar to those of the ID norming sample, aged 6-16 years, reported in the manual. Descriptive statistics (mean and standard deviation) for the WISC-IV scaled scores and FSIQ in our sample and the ID sample in the WISC-IV manual are reported in Table 2. In general, mean scores on all subtests, composites and FSIQ on the WISC-IV were slightly lower for our sample. However, there were no significant differences between our sample and the norming sample on any of the WISC-IV scaled score results. The subtests for which the two groups had the lowest and highest scores were the same: Arithmetic was the lowest and Cancellation the highest. Descriptive statistics (mean and standard deviation) for the WISC-IV raw scores and scaled scores in our sample are reported in Table 3. There was a floor effect in all subtests of the WISC-IV. For example, Block Design raw scores ranging from 0 to 10 were all converted to a scaled score of 1 (n = 9), Vocabulary raw scores ranging from 0 to 9 were all converted to a scaled score of 1 (n = 6), and Arithmetic raw scores ranging from 1 to 9 were all converted to a scaled score of 1 (n = 11). Of the sample, only 7 participants did not have at least one scaled score of 1. Picture Concepts, Letter-Number Sequencing and Arithmetic showed the highest number of floor effects (n = 11). Letter-Number Sequencing and Arithmetic are two of the three subtests of the Working Memory Index. Pearson's correlation was used to compare the WISC-IV raw scores (co-varied for age) and scaled scores from the 10 core and 5 supplemental subtests. Associations among the subtests, composite and FSIQ of the WISC-IV are shown in Table 4.
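One simple way to quantify a floor effect of this kind is to count, per subtest, how many participants received the minimum scaled score of 1. The sketch below is illustrative only: the function name is hypothetical and the scaled scores are made up, not taken from the study data.

```python
def floor_rate(scaled_scores, floor=1):
    """Return (count, proportion) of scaled scores at or below the floor value."""
    if not scaled_scores:
        raise ValueError("scaled_scores must not be empty")
    at_floor = sum(1 for score in scaled_scores if score <= floor)
    return at_floor, at_floor / len(scaled_scores)


# Made-up scaled scores for one subtest (illustrative only, not study data).
example_scaled = [1, 1, 3, 1, 2, 4, 1, 5, 1, 2]
count, proportion = floor_rate(example_scaled)
print(count, proportion)
```

A high proportion at the floor indicates that the scaled score is compressing a range of different raw scores into a single value, which reduces variability and weakens correlations computed on scaled scores.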
Both the raw scores controlling for age and the scaled scores of each subtest of the WISC-IV were highly correlated with the FSIQ, although inter-correlations among these subtests varied. Using raw scores, Block Design correlated significantly with 5 of the 15 subtests, and using scaled scores with 4 of the 15 subtests. In the majority of cases the raw scores controlling for age showed higher correlations among the different tasks than the scaled scores. For example, for Block Design the correlation with FSIQ was higher when using raw scores controlling for age (r = 0.657, p = 0.001) than when using scaled scores (r = 0.588, p = 0.003). Block Design significantly correlated with the Perceptual Reasoning Index (raw scores controlling for age: r = 0.947, p < 0.001; scaled scores: r = 0.801, p < 0.001) and with Matrix Reasoning (raw scores controlling for age: r = 0.814, p < 0.001; scaled scores: r = 0.784, p < 0.001). For some subtests that showed significant correlations when using raw scores there were no significant correlations when scaled scores were used, and vice versa. Our third investigation was to determine whether the RCPM, TONI-4, and WNV raw scores (controlling for age) correlated significantly with the WISC-IV raw scores. Descriptive statistics (mean and standard deviation) for the raw scores and scaled scores on the three non-verbal tests are reported in Table 5. Pearson correlation was again used to assess associations among the subtests and composites of the WISC-IV and the three non-verbal tests. Table 6 shows the results.
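A correlation between two sets of raw scores "controlling for age" is commonly computed as a first-order partial correlation. The sketch below shows that standard formula in plain Python. It is an assumption that this matches the analysis actually run for the study, since the software and exact procedure are not stated; the function names and the example values are ours and are not study data, and no p-value is computed.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def partial_r(x, y, covariate):
    """Correlation between x and y after removing the linear effect of covariate."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, covariate)
    r_yz = pearson_r(y, covariate)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up values (illustrative only): age in years and two raw scores.
age = [11.5, 12.0, 13.2, 14.1, 15.0, 16.3]
raw_a = [6, 9, 8, 14, 12, 17]
raw_b = [10, 12, 15, 18, 17, 22]
print(round(partial_r(raw_a, raw_b, age), 3))
```

Because raw scores tend to rise with age, the partial correlation separates the association between two measures from the part that both simply share with age.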
Significant correlations were found between the FSIQ of the WISC-IV and all the non-verbal tests, including each subtest of the WNV as well as non-verbal subtests of the WISC-IV. There were also significant correlations between the non-verbal tests and the verbal subtests of the WISC-IV: Similarities, Vocabulary, Comprehension, and Information. Raw scores controlling for age for the RCPM were significantly correlated with 14 out of the 15 subtests of the WISC-IV; Letter-Number Sequencing was the only subtest that did not show a significant correlation. Raw scores for the TONI-4 significantly correlated with all subtests of the Perceptual Reasoning Index and the Processing Speed Index. The TONI-4 scores also showed significant correlations with Similarities and Information of the Verbal Comprehension Index, but there were no significant correlations with any of the subtests of the Working Memory Index. Raw scores for each subtest of the WNV significantly correlated with all subtests of the Perceptual Reasoning Index and Processing Speed Index. All subtests of the WNV except Matrices significantly correlated with the Verbal Comprehension Index and Working Memory Index. Only the Information subtest of the Verbal Comprehension Index of the WISC-IV showed significant correlations with all non-verbal tests, including all subtests in the WNV.
As reported in the introduction, ID is often co-morbid with ASD and certainly there were many and varied clinical diagnoses for the adolescents with ID in our sample. The largest subgroup (co-morbid ID+ASD) included only 8 individuals and so we compared the pattern of performance of those adolescents with that of the participants without ASD (n = 15) using non-parametric statistics. Means and standard deviations for the raw scores and the scaled scores for the WISC-IV, RCPM, TONI-4, and WNV for the two groups are reported in Table 7. On the WISC-IV, Matrix Reasoning had the highest mean scaled scores for the ID+ASD group, while Cancellation was the highest for the non-ASD group. As shown in Table 7, scores on Arithmetic were the lowest for both groups. The Similarities subtest was also low for the ID+ASD group. The ID+ASD scores were slightly higher than those of the non-ASD group on some of the non-verbal tasks (Block Design, Coding, Matrix Reasoning, and Picture Completion).
Based on Mann-Whitney U comparisons, there was no significant difference between the ID non-ASD and ID+ASD groups for the RCPM, the TONI-4 and the WNV. However, there were significant group differences between the ID non-ASD and ID+ASD groups for some of the subtest scores on the WISC-IV: Similarities (p = 0.015), Digit Span (U = 23, p = 0.034), Vocabulary (U = 11.5, p = 0.002), Letter-Number Sequencing (U = 26.5, p = 0.030), Comprehension (U = 27, p = 0.033), Information (U = 25.5, p = 0.025) and Word Reasoning (U = 24, p = 0.020). Furthermore, there were significant group differences between the ID non-ASD and ID+ASD groups for the Verbal Comprehension Index (U = 18.5, p = 0.007) and Working Memory Index (U = 26, p = 0.028) composite scores of the WISC-IV.
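For readers unfamiliar with the statistic, the Mann-Whitney U value for two independent groups can be computed from pooled ranks as sketched below. This is a minimal textbook-style illustration in plain Python, not the software used in the study; the function names and the example scores are hypothetical, and the p-value is not computed. Tied scores receive the average of their ranks, which is why U can take half-integer values such as those reported above.

```python
def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both groups must be non-empty")
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Made-up raw scores for two small groups (illustrative only, not study data).
group_without_asd = [12, 15, 9, 20, 14]
group_with_asd = [7, 9, 11, 6]
print(mann_whitney_u(group_without_asd, group_with_asd))
```

Because the test uses ranks rather than the scores themselves, it makes no assumption of normality, which suits subgroups as small as those compared here.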
Pearson correlation was used to assess associations among the raw scores of the subtests, the composite and the total raw scores of the WISC-IV and three non-verbal tests for the ID non-ASD group (see Table 8) and the ID+ASD group (see Table 9).
For the ID non-ASD group, all three non-verbal tests significantly correlated with the WISC-IV FSIQ. The RCPM was significantly correlated with all subtests of the Verbal Comprehension Index of the WISC-IV: Similarities, Vocabulary, Information and Word Reasoning, while the TONI-4 was significantly correlated with only one subtest, Information. For the ID+ASD group, none of the Verbal Comprehension Index subtests of the WISC-IV were significantly correlated with any of the non-verbal tests.

DISCUSSION
The purpose of this study was to examine performance of an adolescent group with ID on the WISC-IV and on three non-verbal tests: the RCPM, the TONI-4 and the WNV. The most important result in this study is that both the raw scores (controlling for age) and the scaled scores showed highly significant correlations across the different measures (the WISC-IV, RCPM, TONI-4, and WNV), in particular between the WISC-IV FSIQ and the three non-verbal tests. Not surprisingly, performance measured using the scaled scores indicated that all individuals with ID performed well below average for their chronological age on all four tests. Our results also showed no statistically significant differences between our sample and the ID norming sample on the scaled scores of each subtest, composite scores and FSIQ. Interestingly, however, for our sample both WISC-IV scaled scores and raw scores controlling for age were highly correlated with FSIQ scores. The correlations were higher for the raw scores, suggesting that the raw scores of the WISC-IV are optimal for measuring reasoning ability in individuals with ID, because the scaled scores reflect floor effects.
Our finding of such high significant correlations across the different measures of IQ (WISC-IV, RCPM, TONI-4, and WNV), especially between the WISC-IV FSIQ and the three non-verbal tests, is very important both operationally, in terms of establishing better assessment procedures for individuals with ID, and theoretically, in terms of how generalized the concept of cognitive functioning is. Given that there are many conceptual and procedural differences between the four tests in terms of design, structure, language requirements both during administration and in responses, and time needed for completion, it is of importance that we found comparable Full Scale IQ results on all four tests. The adolescents with ID performed particularly poorly on tasks requiring verbal reasoning but performed relatively better on non-verbal tasks, suggesting, firstly, that one of the short non-verbal tests would be a more appropriate tool to measure potential cognitive performance in individuals with ID given their well-documented short attention span (Chakrabarti and Banerjee, 2013). Secondly, our results suggest that, at least for our sample, cognitive performance is limited by verbal skills. Additionally, the maximum non-verbal mental age an individual with ID (IQ less than 70 on the WISC-IV) is likely to attain eventually in each subtest is a raw score representative of approximately 8 chronological years, or lower, of typical development (Wechsler, 2003).
There are pragmatic advantages in using a non-verbal IQ test. The WISC-IV usually requires 60-90 min (according to the WISC-IV manual) or much longer for intellectually disabled individuals (Wechsler, 2003). Our ID group completed the WISC-IV within 2 to 4 sessions. By contrast, each of the three non-verbal tests only took one session, with the RCPM taking approximately 13 min, the TONI-4 15 min and the WNV 45 min. Thus, we would argue that the language requirements and time to administer the WISC-IV make the non-verbal tests, especially the RCPM or TONI-4, better options than the WISC-IV or the WNV. The inability of many of the participants to perform some subtests of the WISC-IV, including Vocabulary, and hence the associated floor effects, also argue against the use of the WISC-IV and scaled scores, as suggested by Whitaker (2010) and Sansone et al. (2014). Intelligence tests with large floor effects typically have reduced range and variability, and create positively skewed data. In the current study, significant correlations were found between all three non-verbal tests and the WISC-IV. The highest correlations were with the Perceptual Reasoning Index (Block Design, Picture Concepts, Matrix Reasoning, and Picture Completion) and the Processing Speed Index (Coding, Symbol Search, and Cancellation). These findings are consistent with the WISC-IV manual (Wechsler, 2003), which states that within-group comparisons for children with ID are expected to reveal slightly higher scores on the Processing Speed Index than on the Verbal Comprehension Index and Perceptual Reasoning Index.
Non-parametric analysis of scores on the subtests of the WISC-IV of the ID non-ASD group and those co-morbid for ID+ASD demonstrated that, although the verbal ability of all participants was poor, it was lower for those with ID+ASD. Adolescents with ID without co-morbid ASD were able to attempt some verbal subtests, e.g., Vocabulary, whereas all of the adolescents with ID+ASD were either limited to repeating the question or making no response. Thus, our findings with this small group support previous reports suggesting that the main impairment in individuals with both ID and ID+ASD is in the verbal domain (Vig and Jedrysek, 1999; Brereton et al., 2006; Wilkins and Matson, 2009; van der Schuit et al., 2011). Furthermore, the Matrix Reasoning subtest of the WISC-IV resembles some aspects of the Raven's Progressive Matrices (Wechsler, 2003) and so it was expected that our study would find that the highest subtest score for children with ID+ASD would be on the Matrix Reasoning subtest. A number of previous studies have noted that the Matrix Reasoning subtest of the WISC-IV is indicative of relative cognitive strength for individuals with ASD (Mayes and Calhoun, 2006, 2008). Other studies (Dawson et al., 2007) have suggested that while the WISC-III and the Raven's Standard Progressive Matrices (RSPM) provided similar estimates for children with TD, children with ASD perform significantly better on the RSPM. This finding is consistent with Barbeau et al. (2013) and Nader et al. (2014), who suggested that the RSPM non-verbal test may be a better choice than the Wechsler scales to assess children with ASD. In comparing the RSPM and WISC-IV scores, Nader et al. (2014) showed that the WISC-IV FSIQ underestimated the IQ of individuals with ASD in that they achieved a significantly higher mental age on the RSPM.

CONCLUSION
The results of this study have important implications for the assessment of individuals with ID with and without ASD, emphasizing the need for theoretical consideration of what IQ tests really measure. Our results indicate that the three short non-verbal tests (RCPM, TONI-4, WNV) compared here provide an adequate estimate of IQ. The value in using the lengthy, language-based WISC-IV is that it can provide separate verbal and non-verbal IQ scores if required, but a main disadvantage is the time taken to administer. We also found that the use of raw scores controlling for age provided a more appropriate measure of WISC-IV performance in our limited ID sample than did scaled scores. Furthermore, the use of raw scores is more likely to lead to a better understanding of within-group differences and of what individuals with ID are capable of at the time of assessment. Use of raw scores will facilitate further investigation of the types of errors individuals commonly make, and assessment of the problem-solving strategies they habitually use. Such information will also be useful for planning individual programs aimed at accommodating the variability found in a group of individuals with ID in one class, and hopefully better encourage each individual to achieve their potential.

AUTHOR CONTRIBUTIONS
CM: Collected the data, conducted the analysis and interpretation of the data, and prepared the early drafts of the paper. SC: Supervised development of the research, data interpretation and manuscript preparation. EB: Supervised development of the research, data interpretation and preparation of the manuscript. NG: Collected the data and helped in preparation of the manuscript. CP: Helped in recruitment and provided testing facilities.

FUNDING
The first author was supported by a Royal Thai Government PhD Scholarship and additional funds were provided by the School of Psychology and Public Health, La Trobe University.