
Edited by: Jesus De La Fuente, University of Almería, Spain

Reviewed by: Michele Settanni, University of Turin, Italy; Kate E. Snyder, University of Louisville, USA; Caterina Primi, University of Florence, Italy

*Correspondence: Alejandro Veas

This article was submitted to Educational Psychology, a section of the journal Frontiers in Psychology

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

There are very few studies in Spain that treat underachievement rigorously, and those that do are typically related to gifted students. The present study examined the proportion of underachieving students using the Rasch measurement model. A sample of 643 first-year high school students (

The concept of underachievement has been widely studied in the educational field over the last 50 years, showing a clear impact on higher education studies and professional careers (Conklin,

Whether or not these kinds of diversifications are included, the consequences of underachieving could include insufficient support (Ziegler et al.,

In Spain, the rate of school failure or dropout (students who leave the educational system) during the 2012–2013 school year was 23.5% (Eurostat,

The estimated percentage of underachieving students can vary depending on aspects such as the operational definition of underachievement or the socio-cultural context of the students involved. For example, Rimm (

There are hardly any studies in Spain that treat underachievement rigorously, and those that exist are usually related to gifted students. One of the most important studies was conducted in Madrid by García-Alcañiz (

With respect to the operational definition of underachievement, the discrepancy between potential ability and academic achievement is, in some cases, restricted to gifted students, as frequently happens in the United States (Reis and McCoach,

From a methodological perspective, some questions have been raised about the adequacy of different identification methods proposed in the studies. Traditionally, there have been three statistical methods: the absolute split method, the simple difference method and the regression method (Plewis,

The most recent method is based on the application of the Rasch model (Phillipson and Tse,

While many statistical models try to fit the model to the data, the opposite occurs in the Rasch measurement model. That is, the data must fit the model to be accepted (Bond and Fox,

The Infit and Outfit statistics are mean squares: each is a Pearson chi-squared statistic divided by its degrees of freedom, which yields a scale with an expected value of 1 and values that can range from 0 to infinity. Values below 1 indicate that the data fit the model more closely than expected, while values greater than 1 indicate a poorer fit to the model. An Infit value of 1.40, for example, indicates 40% more variability in the data than the model predicts, and an Outfit of 0.80 indicates 20% less variability than the model predicts.
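The two mean squares can be illustrated with a minimal Python sketch. It assumes the dichotomous Rasch model and already-estimated person and item measures; the function names `rasch_prob` and `item_fit` are illustrative and do not come from the study or from Winsteps.

```python
import numpy as np

def rasch_prob(theta, b):
    """P(success) under the dichotomous Rasch model (persons x items)."""
    theta = np.asarray(theta, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

def item_fit(x, theta, b):
    """Return (infit, outfit) mean squares for each item.

    x is a persons x items array of 0/1 responses; theta and b are
    person abilities and item difficulties in logits.
    """
    x = np.asarray(x, dtype=float)
    p = rasch_prob(theta, b)
    var = p * (1.0 - p)          # model variance of each response
    sq_resid = (x - p) ** 2      # squared raw residuals
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = (sq_resid / var).mean(axis=0)
    # Infit: information-weighted mean square.
    infit = sq_resid.sum(axis=0) / var.sum(axis=0)
    return infit, outfit
```

Under these assumptions, a returned Infit of 1.40 for an item would be read exactly as in the paragraph above: 40% more variability in the responses than the model predicts.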

Phillipson (

In Spain, the educational evaluation processes undertaken by teachers in schools are based on non-standardized written tests and the assessment of attitudinal variables (e.g., quality of participation in the proposed activities) observed in the classroom. The application of the evaluation criteria thus leads to a total grade for each of the courses in which the student is enrolled. The use of academic grades is therefore quite important, as schools continue to evaluate skills through other traditional methods and/or measurement instruments, such as written exams, oral exams, and group work, that are based on the evaluation criteria of regional regulations.

On the other hand, there are a significant number of studies on academic performance that have used the results of studies at the international level, such as the Trends in International Mathematics and Science Study (TIMSS) and especially the Program for International Student Assessment (PISA) by using standardized tests (Ruiz de Miguel,

The conceptual and methodological processes involved in comparing school grades were studied extensively in the last quarter of the twentieth century, especially in the United Kingdom (Forrest and Vickerman,

According to this postulate, the difficulty of a course will correspond to a specific level established in the latent variable. A course will be more difficult than another to the extent that a higher level of performance or ability is needed to achieve the same grade. If the latent construct is changed, this relationship may easily be the inverse (Coe,

The measurement of comparability would be based on using the grades from the courses as a measurement to validate the construct, which implies that they must provide good levels of content representativeness, good internal consistency, and appropriate levels of correlation between the variables that comprise the different courses. If, in studies on academic performance or other research topics, the mean grades of the courses are used to obtain the academic performance variable, then it is essential to use statistical tools to confirm their fit from the measurement standpoint.

As noted above, at this level of analysis we start by considering each of the courses as a test with specific items, with grades ranging from 1 to 10, which implies various degrees or categories of success. The partial credit model (Wright and Masters,
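As an illustration of how the partial credit model assigns probabilities to ordered grade categories, the following minimal Python sketch computes the category probabilities for a single item. The function name `pcm_category_probs` and the step difficulties used in the example call are hypothetical and are not taken from the study.

```python
import numpy as np

def pcm_category_probs(theta, deltas):
    """Category probabilities for one item under the partial credit model.

    theta:  person ability in logits.
    deltas: step difficulties delta_1..delta_m (hypothetical values in the
            example below); the item then has m + 1 ordered categories.
    """
    deltas = np.asarray(deltas, dtype=float)
    # Cumulative sums of (theta - delta_k); the lowest category has sum 0.
    steps = np.concatenate(([0.0], np.cumsum(theta - deltas)))
    expo = np.exp(steps - steps.max())   # subtract the max for stability
    return expo / expo.sum()

# Hypothetical item with four ordered categories (three steps).
probs = pcm_category_probs(theta=0.5, deltas=[-1.0, 0.0, 1.5])
```

With a single step difficulty the expression reduces to the dichotomous Rasch model, which is a convenient check on the implementation.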

At this point, it becomes necessary to test the extent to which students can be identified as underachieving and non-underachieving by using measures on the same metric scale. Therefore, the present study describes an estimation of the proportion of underachieving Spanish students in the first year of compulsory secondary education. The Rasch measurement method provides an estimation of the construct validity of both the intelligence test and the academic grades.

Random cluster sampling was used, with the school as the sampling unit and taking into account the geographical areas of the province of Alicante. A total of 8 schools in the province were included; 2 were private, and the rest were public. A total of 643 students in the first year of Compulsory Secondary Education (Educación Secundaria Obligatoria, E.S.O.) participated in the study. Twenty-nine students (4.31%) were excluded from the final sample because they had an insufficient command of the language, had special educational needs, or did not have parental consent. Fifty-one percent of the students were male and 49% were female, with a mean age of 12.09 years (SD = 0.47). Five hundred twenty-three participants (81.4%) were enrolled in a public school, while 120 (18.6%) were enrolled in a private school. All students in each class in each school took part in the study. Because of the racial and ethnic homogeneity of the country, the majority of children were Caucasian (98%). Childhood socioeconomic status (SES) was indexed according to parental occupation. There was a wide range of socioeconomic status, with a predominance of middle-class children. This classification was based on the income level and educational level of the families. The regional education counselors determined SES through a questionnaire that recorded the students' responses. The variables used were parents' professions, employment status and educational level, number of books at home, cultural and sporting activities, and availability of technological resources at home.

A chi-square test was used to determine whether the gender distribution of the sample (51.2% boys and 48.8% girls) differed from that of the national student population (51.3% boys and 48.7% girls); the result supported the absence of gender differences between sample and population (χ^{2} = 0.29,

In the sample, the percentage of students attending public schools (81.4%) was slightly higher, and the percentage attending private schools (18.6%) slightly lower, than in the population (76 and 24%, respectively) (χ^{2} = 4.1, ^{2} = 2.67,

In general terms, therefore, the sample studied was representative of the national population of first-year Compulsory Secondary Education students.

For the analysis of academic performance, the numerical grades from 9 mandatory courses, provided by the teaching staff at the end of the school year, were considered. The courses recorded were Spanish Language and Literature, Natural Sciences, Valencian Language, Social Sciences, Mathematics, English, Technology, Art Education, and Physical Education. Student scores showed high reliability, with a Cronbach's alpha of 0.93. Students' scholastic ability was estimated using the Battery of Differential and General Skills (Yuste et al.,

Prior to data collection, the necessary permission was requested from the educational administration and school boards of the various schools. After obtaining these permissions, the parents or legal guardians of the students had to provide the corresponding informed consent. Data collection was performed in the schools themselves during the second trimester of the school year and during normal school hours. The data were collected by collaborating researchers previously trained in the standards and guidelines for data collection.

For this study, scores from the Badyg and school grades were analyzed using Winsteps version 3.81 statistical software (Linacre,

From the maximum likelihood procedure, it is possible to obtain a value for the difficulty of a certain item that best explains the pattern of recorded performance. Similarly, one can obtain a value for the ability of each individual depending on the pattern of the indices of difficulty. This process is repeated continuously using the most recent estimates of skill and difficulty until the estimate converges.
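The alternating estimation described above can be sketched in Python. This is a minimal illustration that assumes the dichotomous Rasch model and plain Newton-Raphson updates; it is not the Winsteps implementation, and the function name `jmle_rasch` and its parameters are assumptions made for the example.

```python
import numpy as np

def jmle_rasch(x, max_iter=100, tol=1e-4):
    """Joint maximum likelihood estimates for a dichotomous Rasch model.

    x: persons x items array of 0/1 responses without all-0 or all-1 rows
       or columns (such extreme scores have no finite estimate).
    Returns (theta, b): abilities and difficulties in logits, with the
    item difficulties centred at 0.
    """
    x = np.asarray(x, dtype=float)
    n_persons, n_items = x.shape
    r = x.sum(axis=1)                      # person raw scores
    s = x.sum(axis=0)                      # item raw scores
    if np.any((r == 0) | (r == n_items)) or np.any((s == 0) | (s == n_persons)):
        raise ValueError("remove extreme persons/items before estimation")
    theta = np.log(r / (n_items - r))      # starting values from raw scores
    b = np.log((n_persons - s) / s)
    b -= b.mean()
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
        info = p * (1.0 - p)
        # Update difficulties given the current abilities (step capped at 1 logit).
        b_new = b - np.clip((s - p.sum(axis=0)) / info.sum(axis=0), -1.0, 1.0)
        b_new -= b_new.mean()
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b_new[None, :])))
        info = p * (1.0 - p)
        # Update abilities given the updated difficulties.
        theta_new = theta + np.clip((r - p.sum(axis=1)) / info.sum(axis=1), -1.0, 1.0)
        change = max(np.abs(b_new - b).max(), np.abs(theta_new - theta).max())
        theta, b = theta_new, b_new
        if change < tol:                   # estimates have stabilised
            break
    return theta, b
```

Each pass reuses the most recent estimates of ability and difficulty, and the loop stops once the largest change falls below the tolerance, mirroring the convergence criterion described in the text.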

Once fit indices from both measures have been observed, the Rasch model allows for the testing of the hypothesis that two tests measure the same underlying construct (Bond and Fox,

Taking into account that school grades do not constitute a validated test, a deeper analysis of the fit of the courses has been conducted, based on the inter-subject comparability approach (Tasmanian Qualification Authority,

Spanish Language and Literature | 643 | 0.63 | 0.63 | 0.88 |
Natural Sciences | 642 | 0.62 | 0.62 | 0.87 |
Valencian Language | 625 | 0.71 | 0.71 | 0.86 |
Social Sciences | 640 | 0.88 | 0.86 | 0.85 |
Mathematics | 641 | 0.94 | 0.93 | 0.85 |
English | 629 | 1.16 | 1.13 | 0.82 |
Technology | 640 | 1.12 | 1.11 | 0.78 |
Arts Education | 642 | 1.20 | 1.21 | 0.77 |
Physical Education | 641 | 1.53 | 1.87 | 0.64 |

Based on the qualitative scores of Spanish schools, recoding was performed using the following values: 1 for categories 1, 2, 3, and 4 (“poor”); 2 for categories 5 and 6 (“sufficient” and “good”); 3 for categories 7 and 8 (“notable”); and 4 for categories 9 and 10 (“outstanding”).
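The recoding described above can be written as a short Python function. This is a minimal sketch that assumes integer grades from 1 to 10; the function name `recode_grade` is illustrative.

```python
def recode_grade(grade):
    """Map a 1-10 school grade to the four ordered categories described above."""
    if not 1 <= grade <= 10:
        raise ValueError("grade must be between 1 and 10")
    if grade <= 4:
        return 1   # "poor"
    if grade <= 6:
        return 2   # "sufficient" and "good"
    if grade <= 8:
        return 3   # "notable"
    return 4       # "outstanding"

# Recode every possible grade from 1 to 10.
recoded = [recode_grade(g) for g in range(1, 11)]
```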

The new calibration of the courses provided a good fit for the data (Table ^{2} = 23.518;

Spanish Language and Literature | 643 | 0.75 | 0.79 |
Natural Sciences | 642 | 0.75 | 0.75 |
Valencian Language | 625 | 0.78 | 0.76 |
Social Sciences | 640 | 0.83 | 0.83 |
Mathematics | 641 | 0.94 | 0.99 |
English | 629 | 1.03 | 1.04 |
Technology | 640 | 1.13 | 1.12 |

For the analysis of unidimensionality, a principal component analysis of the residual scores was conducted (Linacre,

Although not shown, each of the Badyg blocks was analyzed separately. The item analyses for the Badyg demonstrate that all items except 1M, 11M, 7M, 2E, 13E, 29E, 2P, 8P, and 29S have an Infit Mean SQ between 0.80 and 1.20, indicating that the majority of items fit the model satisfactorily. As regards person fit, most person Infit and Outfit Mean SQ values are below 1.3. Approximately 95% of students fit the Rasch model (Bond and Fox,

Table

Mean | 0.00 | −4.44 | 0.00 | 1.05 |
SD | 1.22 | 0.79 | 0.27 | 1.33 |
Reliability of estimate | 0.99 | 0.95 | 0.99 | 0.92 |
Infit Mean SQ | | | | |
Mean | 1.01 | 1.01 | 0.98 | 0.98 |
SD | 0.13 | 0.31 | 0.28 | 0.66 |
Outfit Mean SQ | | | | |
Mean | 1.04 | 1.04 | 1.01 | 1.01 |
SD | 0.18 | 0.41 | 0.37 | 0.72 |

For the school grades, the mean (and SD) item logit is 0.00 (0.27), showing that the courses are not widely spread along the interval scale. The reliability of the estimate is very high, with a value of 0.99. The Infit and Outfit means are close to 1, which implies a good fit of the data. The mean (and SD) of the person estimates from the school grades is 1.05 (1.33); that is, students find the majority of the courses easy.

After rescaling the school grade and Badyg measures to a mean of 0 and an SD of 1, a scatterplot of the person logit school grade scores against the person logit Badyg scores was produced (Figure

The individual underachievement index, based on the significant differences between GPA and Badyg, identifies 181 underachieving students, or 28.14% of the total sample of 643 students, across the ability levels. Of the underachieving students, 29 were enrolled in private schools (16.02%), whereas 152 were enrolled in public schools (83.98%). The analysis of the differences between these percentages of underachieving students identified in public and private schools showed that these differences were statistically significant (χ^{2} = 17.13, ^{2} = 6.24,
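One way to operationalize a significant difference between the ability and achievement measures is sketched below in Python. The text does not give the formula, so this is an assumption: each student's two standardized measures are compared using the joint standard error, and students whose ability exceeds their achievement by more than a critical value of 1.96 are flagged. The names `standardize` and `flag_underachievers` and the parameter `z_crit` are illustrative.

```python
import numpy as np

def standardize(measures, se):
    """Rescale person measures to mean 0 and SD 1; rescale the SEs to match."""
    measures = np.asarray(measures, dtype=float)
    sd = measures.std()
    return (measures - measures.mean()) / sd, np.asarray(se, dtype=float) / sd

def flag_underachievers(ability, ability_se, achievement, achievement_se, z_crit=1.96):
    """Flag students whose achievement is significantly below their ability.

    All arguments are per-student arrays on a common standardized scale.
    The critical value z_crit is an assumption, not taken from the study.
    """
    ability = np.asarray(ability, dtype=float)
    achievement = np.asarray(achievement, dtype=float)
    joint_se = np.sqrt(np.asarray(ability_se, dtype=float) ** 2
                       + np.asarray(achievement_se, dtype=float) ** 2)
    z = (ability - achievement) / joint_se   # positive when ability exceeds achievement
    return z > z_crit
```

Summing the returned boolean array gives the count of flagged students, which is how a figure such as the number of underachievers reported above would be obtained under these assumptions.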

The present study describes an estimation of the proportion of underachieving Spanish students in the first year of compulsory secondary education. In light of the results, we may assert that the proportion of underachieving students found in the sample with the Rasch method is relatively high: 181 students, or 28.14% of the total sample. Moreover, important gender differences are observed between non-underachieving and underachieving students in the total sample: a higher proportion of boys than girls are identified as underachieving. These results with the Rasch method are consistent with previous results using other methods of measuring underachievement (Gibbs et al.,

This percentage is similar to those found previously in Spain. Jiménez and Álvarez's (

With respect to the high number of underachieving students, it is important to consider the contextual factors in the present study. Firstly, it seems that underachievement changes from more general to more subject-specific areas at the end of elementary school (McCall et al.,

Some points must be addressed in the present study, as they can affect the levels of detection of underachievement. First, we referred to global underachievement rather than to an underachievement index in a specific area, which implies a greater probability of obtaining a higher number of underachieving students. Second, and according to previous studies (Phillipson and Tse,

The analysis of GPA through the partial credit model confirmed the possibility of comparison, based on the construct comparability approach (Newton,

For a more objective measure of the courses, it would be advisable to reduce the number of grades used for evaluation, especially in the lowest categories. In the present study, we found that in all high schools analyzed, grades 1, 2, and 3 are assigned to a very low proportion of students in all courses. In addition, a wider range of grades leads to a more heterogeneous distribution of evaluation criteria than the standards indicate. In this regard, schools in countries such as the United Kingdom use small grade ranges (Department for Education,

In addition, some limitations may need to be addressed in the future. Firstly, cultural factors must be considered in future studies (Reis and McCoach,

Another important point is that this study focuses on students of all intelligence levels, not only on gifted students. Therefore, the heterogeneity level could be higher (Reis and McCoach,

The present study constitutes a pioneering analysis of the prevalence of underachievement in Spain. These results could be useful for educational guidance and for the instructional interventions performed by teachers, as has already been the case in other countries (McCall et al.,

AV: theoretical review of the topic; Rasch analysis of the measures; differential item functioning of each test. RG: theoretical review of the topic; review of the references. PM: theoretical review of the topic. JC: quantitative methods; analysis of the sample; reliability of the instruments.

The present work was supported by the Spanish Ministry of Economy and Competitiveness (Award number: EDU2012-32156) and the Vice Chancellor for Research of the University of Alicante (Award number: GRE11-15). The corresponding author is funded by the Spanish Ministry of Economy and Competitiveness (Reference of the grant: BES-2013-064331).

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.