Does the mean adequately represent reading performance? Evidence from a cross-linguistic study

Marinelli, Chiara V.; Horne, Joanna K.; McGeown, Sarah P.; Zoccolotti, Pierluigi; Martelli, Marialuisa

doi:10.3389/fpsyg.2014.00903

ORIGINAL RESEARCH article

Front. Psychol., 19 August 2014

Sec. Psychology of Language

Volume 5 - 2014 | https://doi.org/10.3389/fpsyg.2014.00903

This article is part of the Research Topic: The Variable Mind? How Apparently Inconsistent Effects Might Inform Model Building

Chiara V. Marinelli¹

Joanna K. Horne²

Sarah P. McGeown³

Pierluigi Zoccolotti⁴,⁵

Marialuisa Martelli¹,⁴*

¹IRCCS Fondazione Santa Lucia, Rome, Italy
²Department of Psychology, University of Hull, Hull, UK
³School of Education, Edinburgh University, Edinburgh, UK
⁴Department of Psychology, Sapienza University of Rome, Rome, Italy
⁵Institute of Cognitive Sciences and Technologies, ISTC-CNR, Rome, Italy

Reading models are largely based on the interpretation of average data from normal or impaired readers, mainly drawn from English-speaking individuals. In the present study we evaluated the possible contribution of orthographic consistency in generating individual differences in reading behavior. We compared the reading performance of young adults speaking English (one of the most irregular orthographies) and Italian (a very regular orthography). In the 1st experiment we presented 22 English and 30 Italian readers with 5-letter words using the Rapid Serial Visual Presentation (RSVP) paradigm. In a 2nd experiment, we evaluated a new group of 26 English and 32 Italian proficient readers through the RSVP procedure and lists matched in the two languages for both number of phonemes and letters. The results of the two experiments indicate that English participants read at a similar rate but with much greater individual differences than the Italian participants. In a 3rd experiment, we extended these results to a vocal reaction time (vRT) task, examining the effect of word frequency. An ex-Gaussian distribution analysis revealed differences between languages in the size of the exponential parameter (tau) and in the variance (sigma), but not the mean, of the Gaussian component. Notably, English readers were more variable for both tau and sigma than Italian readers. The pattern of performance in English individuals runs counter to models of performance in timed tasks (Faust et al., 1999; Myerson et al., 2003) which envisage a general relationship between mean performance and variability; indeed, this relationship does not hold in the case of the English participants. The present data highlight the importance of developing reading models that not only capture mean level performance, but also variability across individuals, especially in order to account for cross-linguistic differences in reading behavior.

Introduction

Reading is a complex task that involves several cognitive and sensory-motor components, from image detection to the comprehension of meaning. It takes years to master this skill and, during this progression, each of the components undergoes maturation and specific learning effects. Literate adults read with near perfect accuracy at an impressive speed, optimizing each of the processes involved and performing them in parallel. The speeding up of the function may be seen as moving from serial to parallel analysis, up to the point at which individuals learn to master orthographic decoding of a letter string at a glance (e.g., Ziegler and Goswami, 2005).

In 1992, Carver proposed a bold conjecture to account for reading rate. Carver showed that readers adjust their reading rate, speeding up if they are searching for a particular word in a text (scanning) and slowing down if they want to memorize concepts. According to Carver, readers may shift “gear” to achieve the desired goal, but they generally read in the middle (third) gear or “rauding” (i.e., reading and auding) which optimizes comprehension considering the speed limits set by the processing components. In a classic paper, Taylor (1965) surveyed the reading skills of 12,000 US students, from first grade to college, and found the average rate to be 300 words per minute (wpm), which was taken by Carver (1992) as an estimate of rauding rate.

This functional measure of reading speed incorporates several components from decoding to motor execution, and it is relatively stable across individuals. Notably, it has been shown that pronunciation time, the most time-consuming process, weighs heavily on the average speed but contributes minimally to individual differences (Martelli et al., 2014). This means that pronunciation time adds a substantial constant factor to the (much faster) compartment of decoding. Furthermore, it indicates that the maximal reading rate obtained in standard conditions (i.e., rauding) does not necessarily indicate the maximum processing rate for each of the sub-components in reading. Put in different terms, the articulatory component (as well as the eye movement scanning; see below) may pose an upper bound to the estimate of maximal reading rate that can be obtained in functional reading.

In a different line of research, focussed on assessing the perceptual limitations in reading, several authors measured reading speed by means of the Rapid Serial Visual Presentation (RSVP) paradigm. In this procedure, a sequence of words is rapidly presented in the same retinal location. The observer is required to name the words presented (typically a stream of four words per trial) without a time limit. The measure is the duration for which each word must remain on the screen to achieve a certain level of task performance (typically 80%). In this paradigm, the articulatory components do not directly affect the estimation of the reading rate, since no time limit is given to complete the response. Furthermore, unlike ordinary reading, the observer does not have to scan for the words to read by eye movements, as stimuli are all presented in the same retinal position. Thus, this procedure minimizes the role of memory, pronunciation time and eye movements, allowing a more direct examination of the decoding components in reading (see Rubin and Turano, 1992; Chung et al., 1998; Legge et al., 2001; Pelli et al., 2007). In fact, compared to other reading techniques, RSVP gives the opportunity to substantially “speed up” reading rate. For example, Potter (1984) originally showed that reading and recall are still excellent at 12 words per second (i.e., 720 wpm), which is much faster than the level of “rauding.”

In the absence of specific reading or visual deficits, and controlling the stimuli for high level cognitive factors, one may assume that decoding is similar across individuals. Indeed, most low-level visual functions, such as acuity or contrast sensitivity, are similar across subjects (Barlow, 1962; Fisher, 1975; Pelli et al., 2006; Strasburger et al., 2011), revealing that perceptual limitations are invariant across individuals and labs. However, when considering the reading speed measurements obtained with RSVP, variability across subjects and labs is, surprisingly, very large. In some cases, the advantage given by the RSVP technique in speeding up reading rate is relatively low, with reading rates around 300 wpm (Latham and Whitaker, 1996; Fine et al., 1997, 1999; Chung et al., 1998; Pelli et al., 2007) while, in other studies, reading rates exceeding 1500 wpm have been reported (Rubin and Turano, 1992; Latham and Whitaker, 1996).

Part of the large discrepancy in RSVP reading across labs may be related to low-level visual effects, such as presence/absence of masking (Felsten and Wasserman, 1980; Breitmeyer, 1984; Enns and Di Lollo, 2000) or to the number of items used in the stream. In particular, in some studies, four words are presented per trial, while in others, the number of words well exceeds the memory span (e.g., Latham and Whitaker, 1996; Chung et al., 1998; Yager et al., 1998; Fine et al., 1999; Kwon et al., 2007; Pelli and Tillman, 2007; Pelli et al., 2007; Yu et al., 2007, 2010; Lee et al., 2010; Kwon and Legge, 2012). Note that these studies are mainly concerned with factors affecting visual limitations to reading, such as font size or letter spacing, and much less with cognitive dimensions (or with absolute estimates of reading rate, which are rarely commented on). Thus, direct comparisons between the various estimates are hard to make since the stimuli are usually not designed to take into consideration linguistic variables (e.g., word frequency, orthographic complexity, orthographic neighborhood, age of acquisition, etc.) that are known to influence speed of reading (e.g., Coltheart et al., 1977; Ferrand and New, 2003).

Furthermore, there is also a surprisingly large discrepancy in reading rate across languages, such as when comparing the irregular English orthography with the consistent Italian one with similar RSVP reading tasks. The reading rate of English 5th and 7th graders with the RSVP of stimuli averages at around 500 wpm (Kwon et al., 2007), while normal 6th grade Italian readers do not exceed 120 wpm, a rate much slower than any other reported for this age level (Martelli et al., 2009). Italian dyslexics' average reading rate is as slow as 40 wpm (Martelli et al., 2009). Although suggestive, comparisons between these two languages are certainly difficult to interpret across experiments, particularly since Italian words tend to be long and morphologically complex, while English words tend to be shorter and morphologically simple.

As described above, most studies on reading focus on group data that average across participants and trials, and only recently has it been suggested that “it is possible that some of the inconsistencies in the literature may be driven by individual differences among participants” (Yap et al., 2012, p. 2). The source of this variability may concern strategic differences related to the linguistic demands (both within a language and across different languages). Following Yap et al. (2012), we conjecture that, over and above differences in average speed, variability estimates may also provide insights into the computation involved in reading. Here, we were interested in exploring such variability in relation to differences in orthographic consistency, with the ultimate goal of understanding the invariant and variable properties of reading across languages. Indeed, learning to become a proficient reader in different orthographies may pose very different requirements to the reader and the end product of these different task demands may well be expressed by different degrees of inter-individual variability.

In the present study, we address a number of questions, comparing Italian and English readers. Is there a difference in processing speed of regular and irregular orthographies, once most of the cognitive variables are taken into account? Does the general speed factor interact with the efficiency of the orthographic decoding, as reflected by the size of the lexical effects in the two languages? Indeed, Faust et al. (1999) showed that larger effects of the experimental manipulations are expected in the case of differences in overall processing time across individuals (i.e., larger effects for slower individuals). Do the individual differences across languages arise from different strategies adopted in reading? The difference engine model (DEM), proposed by Myerson et al. (2003), explains group RT differences by assuming that, in the absence of a peripheral deficit, most differences between individuals are due to the amount of cognitive processing required, and it thereby predicts the relationship between mean and SD. Is this relationship, as well as the distribution of vRTs, similar across languages in the case of reading tasks? In this study, we attempt to answer these questions through three experiments that compared reading speed (assessed with either the RSVP procedure or with vRT measurements) in a very regular (Italian) and in a very irregular (English) orthography with controlled orthographic materials.

Experiment 1: Processing Speed Differences Between English and Italian Readers

In this first preliminary experiment, we aim to explore possible differences in processing speed between Italian and English proficient readers, controlling for as many psycholinguistic variables as possible, based on the structural differences between the two languages. Previous observations report large discrepancies in RSVP reading rate across labs and languages, with English observers obtaining much higher estimates of reading rate (e.g., Rubin and Turano, 1992; Latham and Whitaker, 1996; Chung et al., 1998; Kwon et al., 2007; Martelli et al., 2009). However, due to concurrent procedural differences and uncontrolled variables, it is hard to draw a firm conclusion on these data. Here, we test a group of English young adults and a group of peer Italian readers using the RSVP paradigm to confirm the possible presence of different reading rates.

Materials and Methods

Participants

Thirty Italian (15 males and 15 females) and 22 English (11 males and 11 females) readers participated in this experiment. Participants were university students recruited from the student population of the Sapienza University of Rome in Italy and of the University of Hull in the United Kingdom. Groups were comparable for age and gender. The age of the Italian group ranged between 19 and 28 years (mean age: 22.96, SD = 2.84) with 15.81 (SD = 1.39) years of schooling; the age of the English participants ranged between 18 and 24 years (mean age: 20.86, SD = 3.77) with 14.23 (SD = 1.02) years of schooling. All participants were self-reported good readers, without a history of language, reading or spelling disorders. This study, as well as those presented in Experiments 2 and 3 (both conducted according to the principles of the Helsinki Declaration), was approved by the Ethical Committee of the Department of Psychology of Rome, and by that of the University of Hull in line with the BPS guidelines. Before taking part in the experiment, the participants were given a description of the study and consented to participate.

Stimuli, apparatus, and procedure

In both languages, words were all nouns, without morphological complexity and irregularity in grapheme-to-phoneme correspondence. Because stress assignment to Italian polysyllabic words is unpredictable by rule, no words with irregular stress were included in the Italian list (i.e., all words were stressed on the syllable before the last). No irregular stress words were used in the English list either. In both languages, archaic, obsolete, poetic and scientific forms were avoided. For the Italian readers, a list of 80 5-letter words was selected from the LEXVAR database (Barca et al., 2002) with frequency ranging from 0 to 100 (mean frequency = 25.1, SD = 24.4, Colfis database; Bertinetto et al., 2005). For the English readers, a list of 80 5-letter words was selected from the MRC Psycholinguistic Database 2.0 (Wilson, 1988): frequency ranged from 0 to 100 (mean frequency = 24.8, SD = 36.2, CELEX database; Baayen et al., 1993). Note that, to compare the frequency values of the two databases (the English database has one million occurrences, while the Italian database counts over three million occurrences), the Italian word frequency values were rescaled to occurrences per million. In Appendix A, means (and SDs) of the psycholinguistic variables are reported for the Italian and English lists. The Italian and English lists were matched for frequency, n-size, imageability and age of acquisition (all ps > 0.1). Italian and English words were comparable for bigram frequency based on values reported in the MCWord database (Medler and Binder, 2005) for English and in the LEXVAR database (Barca et al., 2002) for Italian (expressed per million occurrences). As can be seen in Appendix A, lists were not matched between languages for number of phonemes [t₍₁₅₉₎ = 7.92, p < 0.0001], which was higher for the Italian than the English list. Moreover, it was not possible to match the lists for number of syllables [t₍₁₅₉₎ = 14.41, p < 0.0001], as English and Italian differ in the number of syllables and in the complexity of the syllabic structure. The number of syllables is generally higher in Italian (the mode length in the Italian lexicon, according to De Mauro, 1999, is 4 syllables) than in English. Moreover, in English, only 5% of monosyllables are CV (De Cara and Goswami, 2002), while in Italian (as in other Romance languages) CV is the most frequent syllable type, covering 56% of syllable tokens in written corpora (for a more detailed description of Italian see Burani et al., 2014; for English, see Wyse and Goswami, 2008).
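The frequency rescaling described above amounts to expressing each raw count as occurrences per million tokens, so that corpora of different sizes share one scale. A minimal sketch follows; the function name, the raw count and the corpus size of exactly three million tokens are hypothetical values chosen for illustration (the text only states that the Italian database counts over three million occurrences).

```python
def per_million(raw_count, corpus_size):
    """Rescale a raw corpus count to occurrences per million tokens."""
    return raw_count * 1_000_000 / corpus_size

# Hypothetical example: a word seen 75 times in a corpus of 3,000,000 tokens
# (illustrative numbers, not values taken from the Colfis database).
italian_frequency = per_million(75, 3_000_000)

# A corpus of exactly one million tokens needs no rescaling.
english_frequency = per_million(25, 1_000_000)

print(italian_frequency, english_frequency)
```

After rescaling, the two hypothetical words have the same frequency per million even though their raw counts differ by a factor of three.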

Words were rendered in Courier New, a monospaced (fixed-width) font, and each letter subtended 0.4° of visual angle. Participants were seated 57 cm away from the computer screen (refresh rate = 60 Hz). A fixation point (a black square subtending 0.2° of visual angle) was presented at the center of the screen for 2000 ms. Immediately after the offset of the fixation point, words were presented using the RSVP paradigm, i.e., four words were presented sequentially, one word at a time, at the same location on the display and participants were asked to read them aloud. There was no blank frame (zero inter-stimulus interval) between words. Following Rubin and Turano (1992), no mask was presented prior to the first or after the fourth word in the stream. We measured the duration threshold for each participant by varying exposure duration in a 20-trial run using the improved QUEST staircase procedure with a threshold criterion of 80% correct responses (Watson and Pelli, 1983). The adaptive QUEST procedure increased or decreased the exposure duration of each word (starting from 500 ms) according to the participant's accuracy. Word omissions, mispronunciations and substitutions were considered to be errors. To familiarize the participants with the RSVP paradigm, 10 practice trials (40 4-letter words) were administered prior to the beginning of the experiment. As in the experimental session, the word duration in each practice trial was controlled by the adaptive procedure based on response accuracy.
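The logic of an adaptive duration staircase can be illustrated with a simplified Bayesian sketch in the spirit of QUEST. This is not the implementation used in the study: the Weibull slope, the Gaussian prior, the grid of candidate thresholds, the use of the posterior mean to place each trial, the scoring of each trial as simply correct or incorrect, and the deterministic simulated observer are all assumptions made for illustration.

```python
import math

CRITERION = 0.80  # proportion correct that defines the threshold

def p_correct(duration_ms, threshold_ms, beta=3.5):
    """Weibull-type psychometric function equal to CRITERION at threshold.
    The slope beta is an assumed value."""
    k = -math.log(1.0 - CRITERION)
    p = 1.0 - math.exp(-k * (duration_ms / threshold_ms) ** beta)
    return min(max(p, 0.01), 0.99)  # clamp so log-likelihoods stay finite

def run_staircase(observer, n_trials=20, start_ms=500.0):
    """Return the final threshold estimate (ms) and the durations tested."""
    grid = [1.0 + 0.01 * i for i in range(201)]  # log10(ms): 10 ms to 1000 ms
    prior_mean, prior_sd = math.log10(start_ms), 0.5  # assumed prior
    log_post = [-0.5 * ((g - prior_mean) / prior_sd) ** 2 for g in grid]

    def posterior_mean():
        top = max(log_post)
        weights = [math.exp(lp - top) for lp in log_post]
        return sum(g * w for g, w in zip(grid, weights)) / sum(weights)

    durations = []
    duration = start_ms  # the first trial uses the starting duration
    for _ in range(n_trials):
        durations.append(duration)
        correct = observer(duration)
        # Bayesian update: weight every candidate threshold by how well
        # it predicts the response just observed.
        for i, g in enumerate(grid):
            p = p_correct(duration, 10.0 ** g)
            log_post[i] += math.log(p if correct else 1.0 - p)
        duration = 10.0 ** posterior_mean()  # next trial at current estimate
    return duration, durations

# Hypothetical observer who reads correctly whenever words last >= 120 ms.
estimate_ms, tested = run_staircase(lambda d: d >= 120.0)
print(round(estimate_ms), [round(d) for d in tested[:5]])
```

Correct responses shift the estimate toward shorter durations and errors shift it back up, so the tested durations home in on the 80% point within the 20 trials of a run.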

Results

The reading rate (i.e., wpm) was computed as (60 / duration threshold) × 1000, with the duration threshold expressed in ms (equivalently, 60,000 / duration threshold). We used the geometric mean as the measure of central tendency of the distribution (represented using a log scale, Figure 1 lower axis) and the 95% confidence intervals (CIs) to express the variability in the distributions. ANOVA comparisons across groups were performed on log-transformed reading rates (linear scale, Figure 1 upper axis). The reading speed for the English list (geomean = 449 wpm; CI: 346–583) was not different from the reading speed for the parallel Italian list [479 wpm; CI: 433–548; F(1,50) < 1, p = 0.55]. The results were also replicated when socio-demographic variables (i.e., gender, age, and years of schooling) were added to the analysis as covariates: the main effect of the language factor was not significant [F(1,44) ≈ 1, p = 0.31]; furthermore, none of the covariates was significant.
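The conversion from duration threshold to reading rate, and the summary by geometric mean, can be sketched as follows. The thresholds are hypothetical values, not data from the study, and the interval shown is a standard t-based CI computed on log10(wpm) and back-transformed, which is assumed here rather than taken from the text.

```python
import math
import statistics

def wpm_from_threshold(duration_ms):
    """Convert a per-word duration threshold (ms) into words per minute."""
    return 60.0 / duration_ms * 1000.0

def geometric_mean_with_ci(rates_wpm, t_crit):
    """Geometric mean and CI, computed on log10(wpm) and back-transformed."""
    logs = [math.log10(r) for r in rates_wpm]
    mean_log = statistics.mean(logs)
    sem_log = statistics.stdev(logs) / math.sqrt(len(logs))
    lower = 10 ** (mean_log - t_crit * sem_log)
    upper = 10 ** (mean_log + t_crit * sem_log)
    return 10 ** mean_log, lower, upper

# Hypothetical duration thresholds (ms) for five readers.
thresholds_ms = [80.0, 100.0, 120.0, 150.0, 200.0]
rates = [wpm_from_threshold(d) for d in thresholds_ms]

# 2.776 is the two-sided 95% critical value of Student's t with 4 df.
geomean, ci_low, ci_high = geometric_mean_with_ci(rates, t_crit=2.776)
print(round(geomean), round(ci_low), round(ci_high))
```

A threshold of 120 ms corresponds to 500 wpm. Because such an interval is symmetric on the log scale, the geometric mean is the geometric midpoint of its bounds, and it never exceeds the arithmetic mean of the same rates.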

Figure 1. Individual reading rates for Italian readers (Xs) and English participants (open circles) for a letter-matched list of words. The upper scale shows the corresponding values expressed as a log (wpm). Note the greater dispersion of experimental points among English than Italian observers.

Figure 1 presents the reading rate distributions for the Italian and English readers. Inspection of the figure indicates that the English group was less homogeneous than the Italian group, with larger variability: the English group comprised both the fastest individual and individuals who were slower (by a factor of ca. two) than the slowest Italian reader. This pattern is confirmed by Levene's test for equality of variances: the variances of the Italian and English samples were significantly different (F = 4.17, p < 0.05).

As variability appears to be the key feature of the group differences between the two languages, we repeated this analysis using untransformed threshold values to check whether the difference in the variance of the two distributions could be due to the adoption of a nonlinear transformation. Mean duration thresholds were 120 ms (SD = 44) for the Italian group and 152 ms (SD = 142) for the English group. Again, the variances of the two groups were significantly different (F = 8.86, p < 0.01). Therefore, it appears that the difference in variability between the two languages is not due to the use of a nonlinear transformation of the data.
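For readers unfamiliar with Levene's test, the statistic can be sketched in a few lines. The version below centers each group on its mean; the text does not say which variant was used, so that choice, the function name and the two small samples are assumptions for illustration only.

```python
import statistics

def levene_w(*groups):
    """Levene's W: a one-way ANOVA F computed on the absolute deviations
    of each observation from its own group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations from the group mean, per group.
    z = [[abs(x - statistics.mean(g)) for x in g] for g in groups]
    z_means = [statistics.mean(zi) for zi in z]
    grand_mean = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - grand_mean) ** 2
                  for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2
                 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical samples: same size, very different spread.
homogeneous = [1.0, 2.0, 3.0, 4.0, 5.0]
dispersed = [-10.0, -5.0, 0.0, 5.0, 10.0]
print(round(levene_w(homogeneous, dispersed), 3))
```

Under the null hypothesis of equal variances, W is referred to an F distribution with (k − 1, N − k) degrees of freedom, where k is the number of groups and N the total number of observations. Two groups that differ only in location give W = 0.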

Comments

Contrary to expectations based on our preliminary observations, and the work of Paulesu et al. (2000), Italian readers as a group were neither faster nor slower than English readers once the items were made comparable for some relevant psycholinguistic variables. However, the two groups showed substantial differences in individual variability, with the Italian group considerably more homogeneous than the English one. The English group included both the fastest participant, reading over 1100 wpm, and the slowest participant, reading at ca. 90 wpm. Clearly, this phenomenon is captured by the variability in the two distributions and not by the group mean. This large variability is broadly consistent with the 5 to 1 difference across labs testing English RSVP reading (Rubin and Turano, 1992; Latham and Whitaker, 1996; Fine et al., 1997, 1999; Chung et al., 1998; Pelli et al., 2007).

If high individual variability is the norm, the mean performance of any given sample would depend upon the actual proportion of fast and slow individuals. This is particularly the case for RSVP studies which are typically concerned with perceptual parameters and use a large number of trials but a small (often very small) sample size. In these conditions, variability between samples is expected to be quite high and this may substantially contribute to the very different reading rates reported in the literature.
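The sampling argument above can be illustrated with a small simulation. It assumes that individual reading rates are log-normally distributed around a common geometric mean; the distributional form, the population values and the sample size of five readers are hypothetical choices, not estimates from the study.

```python
import math
import random
import statistics

def geomean_range(log_sd, n_readers, n_samples=2000, seed=1):
    """Central 95% range of the sample geometric mean (wpm) across
    many simulated small samples drawn from the same population."""
    rng = random.Random(seed)
    log_mean = math.log10(450.0)  # hypothetical population geometric mean
    geomeans = []
    for _ in range(n_samples):
        logs = [rng.gauss(log_mean, log_sd) for _ in range(n_readers)]
        geomeans.append(10 ** statistics.mean(logs))
    geomeans.sort()
    low = geomeans[int(0.025 * n_samples)]
    high = geomeans[int(0.975 * n_samples) - 1]
    return low, high

# Same population mean, different between-reader variability (log10 units).
homogeneous = geomean_range(log_sd=0.10, n_readers=5)
variable = geomean_range(log_sd=0.30, n_readers=5)
print([round(v) for v in homogeneous], [round(v) for v in variable])
```

With only five readers per sample, the more variable population yields sample means spread over a much wider range, even though both populations share the same true central value.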

Variability may be the diagnostic marker of the reading differences across regular and irregular orthographies. However, this first preliminary experiment had several pitfalls preventing any definite conclusion on whether the high inter-individual variability among English observers is a “real” phenomenon. Obviously, one possible source of variability would be the presence of a proportion of individuals with a reading deficit. All participants were self-reported proficient readers, but, given the absence of an independent evaluation using standardized reading measures, it is impossible to exclude such an explanation with certainty. Moreover, we did not have a measure of wpm in the case of words equated on number of phonemes (rather than letters). Based on these considerations it seemed important to confirm and extend the findings of Experiment 1 with a new group of subjects; this was carried out in Experiment 2. Additionally, it is unclear whether the difference in variability between the two groups is specifically related to the cognitive components involved in the performance with the RSVP or may be a more general phenomenon extending across reading tasks. This was the aim of Experiment 3.

Experiment 2: Functional Reading Abilities and RSVP Reading Speed

In this experiment we aimed to replicate Experiment 1 measuring RSVP reading speed in an independent sample. In order to exclude differences between samples related to more general cognitive efficiency and/or the presence of a reading deficit, standardized tests appropriate for the participants' age and language were administered to ensure that all participants were normal fluent readers. Additionally, the performance of English and Italian readers was examined both using lists of words matched for number of letters and lists of words matched for number of phonemes.