The Clock Counts – Length Effects in English Dyslexic Readers

In reading, length effects (LEs) are defined as an increment in the time taken to read as a function of word length and may indicate whether reading is proceeding in an efficient whole-word fashion or by serial letter processing. LEs are generally considered to be a pathognomonic symptom of developmental dyslexia (DD) and have predominantly been investigated in transparent orthographies, where reading impairment is characterized as slow and effortful. In the present study, a sample of 18 adult participants with DD was compared to a matched sample of typically developing readers to investigate whether the LE is a critical aspect of DD in an opaque orthography, English. We expected that the DD group would present with marked LEs, in both words and non-words, compared to typically developing readers. The presence of LEs in the DD group confirmed our prediction. These effects were particularly strong in low frequency words and in non-words, as observed in reading speed. These preliminary findings may have important theoretical implications for current understanding of DD.


INTRODUCTION
Developmental dyslexia (DD) is a specific learning disorder characterized by problems with accurate or fluent word recognition, poor letter decoding, and poor spelling abilities, that affects up to 15% of the population worldwide (American Psychiatric Association [APS], 2013). Although most of the research regarding DD has been conducted with children, reading difficulties persist throughout life (Bruck, 1985; Finucci et al., 1985; Nergård-Nilssen and Hulme, 2014; Shrewsbury, 2016; Eloranta et al., 2018).
The manifestation of DD differs across orthographies. For instance, in transparent orthographies in which the mapping between letters and sounds is more regular and predictable (e.g., Italian), the consistency of the letter-sound correspondence limits the incidence of letter decoding errors [e.g., volpe (fox) read as folpe]. The main feature of DD in transparent orthographies appears to be slow and effortful word reading, with accuracy being relatively well preserved (Job et al., 1984; Wimmer, 1993; de Jong and van der Leij, 2003). Conversely, in opaque orthographies, in which the mapping between letters and sounds is not always consistent and predictable (e.g., English), DD tends to be characterized by slow reading and a dramatic impairment in reading accuracy (Wimmer, 1993; Landerl et al., 1997; Spinelli et al., 2005). These patterns led Wimmer (1993) to propose a distinction between "speed dyslexia," affecting individuals reading transparent orthographies, and "decoding dyslexia," affecting individuals reading opaque orthographies (although see Ziegler et al., 2003 for similarities between accuracy and speed across orthographies).
Differences in the manifestation of DD in opaque and transparent orthographies might reflect variation in how reading is accomplished. Opaque orthographies encourage a whole-word reading procedure, due to orthographic irregularity (Frost et al., 1987; Marinelli et al., 2016; see also Ziegler and Goswami, 2005 for a review on differences between languages). Given the inconsistency of the mapping between letters and sounds, DD in opaque orthographies is characterized by a high incidence of errors (Wimmer, 1993). Conversely, transparent orthographies encourage a serial analysis of the word, particularly in the early stages of reading acquisition, due to the almost perfect concordance between the letters (graphemes) and the sounds (phonemes) of the words (Frost et al., 1987; Ziegler and Goswami, 2005). Given this letter-sound consistency, in transparent orthographies DD is mainly characterized by slow, although accurate, reading (Wimmer, 1993; Coltheart and Leahy, 1996; Zoccolotti et al., 1999; Ziegler and Goswami, 2005; Martens and de Jong, 2006). This pattern of difficulties seems to persist in adulthood (Martin et al., 2010; Lindgrén and Laine, 2011; Re et al., 2011; Suárez-Coalla and Cuetos, 2015; Eloranta et al., 2018).
A cross-cultural study conducted with English and Italian children to investigate reading acquisition in these orthographies showed that, even in the early stage of reading acquisition, English children were faster than Italian children, although less accurate (Marinelli et al., 2016). Interestingly, a length effect (LE) was present in younger children in both groups; however, it disappeared in older English children and persisted only in Italian children. These results suggest that children reading a transparent orthography persisted in adopting a serial strategy, whilst children reading the opaque orthography did not. This pattern is consistent with evidence from adult English readers, where exposure to words through reading acquisition decreases the likelihood that a serial, phonological decoding strategy will be employed. Given that reading impairment in transparent orthographies is captured in reading latency, the LE in DD has been more extensively evaluated in these orthographies in both adults and children (see Davies et al., 2007 for Spanish children; Richlan et al., 2010 for German adults; Suárez-Coalla and Cuetos, 2015 for Spanish adults; Zoccolotti et al., 2005 for Italian children), but scarcely investigated in English (see e.g., Ziegler et al., 2003; Kemp et al., 2009).
Length effects have been considered as a pathognomonic symptom in acquired disorders of reading such as pure alexia (Behrmann and Shallice, 1995; Behrmann et al., 1998; Montant and Behrmann, 2001; Roberts et al., 2010, 2013, 2015), a disorder caused by damage to the left fusiform gyrus in the ventral occipitotemporal cortex (Price and Devlin, 2011; Behrmann and Plaut, 2013; Roberts et al., 2013). Support for the contention that this area may also be important in DD is provided by Richlan et al. (2010). They found that adult participants with DD presented with abnormalities of the left occipitotemporal cortex. In addition, reading performance of these participants was also captured by strong LEs. It should be acknowledged, however, that this evidence is from readers of a transparent orthography (German). Whether LEs are a core deficit in adult DD participants reading an opaque orthography is yet to be determined.
One cognitive model employed to explain the LE in reading is the dual-route cascaded (DRC) model (Coltheart et al., 2001). Although the DRC model was initially implemented to explain deficits in acquired dyslexia, it also accommodates deficits in developmental reading disorders and is widely employed in research on DD (Castles and Coltheart, 1993; Coltheart and Leahy, 1996; Castles et al., 2006; Coltheart, 2015).
In this model, reading can be achieved via two routes: (i) lexically through access to stored representations in the orthographic and phonological lexicons, and (ii) sub-lexically through a phonological conversion procedure. The lexical route permits reading of familiar words in parallel whilst the sub-lexical route processes unfamiliar words and phonologically plausible non-words (e.g., plur) through a serial spelling-to-sound (grapheme-to-phoneme) mechanism. In this conceptualization, the serial processing of graphemes results in a LE, whereas for words read via the lexical route, with parallel processing of graphemes, the model predicts that a LE will not be observed. The larger the LE, the greater the reliance on the sub-lexical route (Martens and de Jong, 2006). Hence, within the DRC model, the LE might be considered to reflect an over-reliance on the sub-lexical route (Barca et al., 2006).
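As a purely illustrative sketch of how a LE can be summarized as "milliseconds per additional letter", the snippet below fits a least-squares slope of latency on word length. This is not the analysis used in this study (which relies on mixed-effects models and pairwise contrasts); the function name `length_effect_slope` and the latencies are hypothetical and are not data from this study.

```python
from statistics import mean


def length_effect_slope(lengths, rts_ms):
    """Least-squares slope of reading latency (ms) on word length (letters)."""
    if len(lengths) != len(rts_ms) or len(lengths) < 2:
        raise ValueError("need two or more paired observations")
    mean_len, mean_rt = mean(lengths), mean(rts_ms)
    sxx = sum((x - mean_len) ** 2 for x in lengths)
    if sxx == 0:
        raise ValueError("word lengths must vary")
    sxy = sum((x - mean_len) * (y - mean_rt) for x, y in zip(lengths, rts_ms))
    return sxy / sxx


# Hypothetical mean latencies (ms) for 3-, 5-, and 7-letter items.
lengths = [3, 5, 7]
serial_like = [600.0, 700.0, 800.0]    # latency grows steadily with length
parallel_like = [600.0, 602.0, 604.0]  # latency is almost flat across lengths

slope_serial = length_effect_slope(lengths, serial_like)
slope_parallel = length_effect_slope(lengths, parallel_like)
print(f"serial-like: {slope_serial:.1f} ms/letter")
print(f"parallel-like: {slope_parallel:.1f} ms/letter")
```

On a dual-route reading, a steep slope would be taken as a sign of serial, sub-lexical processing and a near-zero slope as a sign of parallel, lexical processing.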
An alternative to the DRC account of the underpinnings of reading achievement is the triangle model, which is implemented in a parallel distributed processing (PDP) connectionist network (Plaut et al., 1996). The triangle model has received substantial support in explaining various types of acquired dyslexia (Patterson and Lambon Ralph, 1999; Hoffman et al., 2015). This view differs from the DRC in that reading is underpinned by the phylogenetically more mature primary systems of vision, phonology, and semantics. Central to this approach is the proposal that the same computational elements, in various combinations, support different activities during word reading: (1) vision, which with respect to reading mediates knowledge about orthographic word form; (2) phonology, the internal representation of word sound; and (3) semantics, word meaning. Reading aloud can be accomplished directly between vision and phonology (V > P) or mediated by semantics (V > S or the interplay between S <> P). During reading acquisition, the direct pathway becomes sensitive to the relationship that exists between graphemes and phonemes and achieves efficient computations for regular words and non-words with typical grapheme-phoneme rules (e.g., pat and snat). It is less efficient for infrequent irregular words with atypical grapheme-phoneme rules (e.g., poignant) and it is these that may require additional semantic support. In the scenario of the triangle model, LEs may be the result of damage to the visual system (e.g., Roberts et al., 2013).
The present study aimed to examine whether LEs are present in DD reading of English orthography. Few studies have investigated LEs in English children with DD (for an exception see Ziegler et al., 2003) and, to the best of our knowledge, evidence of LEs in adult English speakers with DD is scarce. It is possible that, even if LEs affect the reading performance of English children with DD, by adulthood they will have acquired adequate strategies to compensate for their deficit. However, it is also possible that the LEs persist in adulthood, suggesting an over-reliance on the sub-lexical route to read, in the scenario of the DRC model, or a deficit in the visual system, in the scenario of the triangle model. To distinguish between these possibilities, we compared a group of English university students with a diagnosis of DD with a group of typically developing readers (TDR) in a word reading task. Such a population represents individuals who might have compensated for their reading difficulties in some way and who achieve well academically (Lefly and Pennington, 1991; Kemp et al., 2009; Cavalli et al., 2017). To do so they may have received extensive instructional support. Evidence of a resistant LE in this population therefore provides a more stringent test of a core deficit in reading processes. Both accuracy and reaction times (RTs) were analyzed. Following evidence of increased reliance on the sub-lexical route with decreasing word familiarity (Weekes, 1997; Balota et al., 2004), both non-word reading and the effect of word frequency were also explored.

Participants
Eighteen university students with DD (5 males; age range 19-27; M years = 21.8; SD = 2.29) participated. All had normal or corrected-to-normal vision and were in receipt of a formal diagnosis of dyslexia (supplied by a registered assessor of SpLD) as required for access arrangements and additional support in UK higher education institutions. These diagnoses follow DSM-IV recommendations (American Psychiatric Association [APS], 1994) and the guidelines adopted in public services, namely normal level of general intelligence (IQ above 85; although we did not obtain a measure of IQ as part of this study), reading performance at a clinical level, and no neurological, sensory, or educational deficit that could be the cause of their reading impairment. They were compared with a TDR group of 18 students (7 males; age range 19-28; M years = 21.8; SD = 2). The two groups did not differ in gender [χ²(1) = 0.50, p = 0.480, Cramer's V = 0.118] or age [F(1,34) = 0.02, p = 0.878, ηp² = 0.001]. The study was reviewed and approved by the Liverpool John Moores University Research Committee and by the RES Committee North West Liverpool Central (15/NW/0461). Written consent was obtained from all participants.
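The gender comparison can be checked by hand from the reported counts. The sketch below assumes that the remaining participants in each group were female (13 in the DD group, 11 in the TDR group) and that a Pearson chi-square without continuity correction was used; neither detail is stated explicitly in the text, and the helper name `chi_square_2x2` is illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p_value, cramers_v).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # Cramer's V for a 2 x 2 table, where min(rows - 1, cols - 1) = 1.
    cramers_v = math.sqrt(chi2 / n)
    return chi2, p_value, cramers_v


# Rows: DD, TDR. Columns: males, females (female counts assumed as 18 minus males).
chi2, p_value, cramers_v = chi_square_2x2(5, 13, 7, 11)
print(f"chi2(1) = {chi2:.2f}, p = {p_value:.3f}, V = {cramers_v:.3f}")
```

Under these assumptions the computation agrees with the reported χ²(1) = 0.50, p = 0.480, and Cramer's V = 0.118.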

Materials and Procedure
Single Word Reading (Roberts et al., 2010)
In this and all subsequent tasks, stimuli were presented using E-Prime 2.0 software on a PC. Participants were seated approximately 50 cm from the screen. A list of 180 words, comprising 60 words each of three, five, and seven letters, was administered. Each length set included 30 low frequency words and 30 high frequency words, matched for CELEX written word frequency across the three letter lengths (three letters: low 1.08, high 151.96, average 76.52; five letters: low 1.10, high 130.76, average 65.93; seven letters: low 1.9, high 145.19, average 73.57; for details see Roberts et al., 2010). Significant frequency effects were observed within each length and collapsed across length (ts > 6.8; ps < 0.001).
Stimuli were randomized and presented in the same order for each participant. Each word was presented after a fixation point with a duration of 500 ms and remained on screen until the participant responded. Participants were instructed to read the words aloud as fast and accurately as possible. Reading latencies were measured using the E-Prime voice key and calculated from the onset of the stimulus to the onset of the correct naming response and, therefore, encompass the time taken to identify individual letters. Reading accuracy was recorded by the experimenter using a response box. Participant responses were also recorded, allowing the accuracy of pronunciation to be agreed by two researchers. Three kinds of responses were excluded from the analyses of RTs: incorrect responses, responses below 200 ms, and those considered invalid due to technical problems (e.g., microphone errors).
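The exclusion rules for the RT analyses can be expressed as a simple trial filter. The following is a minimal sketch, assuming a hypothetical trial record with `correct` and `technical_error` flags; the field names and example trials are illustrative and do not describe the actual E-Prime output.

```python
from dataclasses import dataclass

MIN_VALID_RT_MS = 200.0  # responses below this latency were excluded


@dataclass(frozen=True)
class Trial:
    participant: str
    item: str
    rt_ms: float
    correct: bool
    technical_error: bool = False  # hypothetical flag, e.g. a microphone error


def is_valid_for_rt(trial):
    """Keep only correct, technically valid responses of at least 200 ms."""
    return (trial.correct
            and not trial.technical_error
            and trial.rt_ms >= MIN_VALID_RT_MS)


def filter_rt_trials(trials):
    """Return the trials that enter the RT analyses."""
    return [t for t in trials if is_valid_for_rt(t)]


# Hypothetical trials, one per exclusion rule plus one valid response.
trials = [
    Trial("p01", "cat", 540.0, correct=True),
    Trial("p01", "yacht", 910.0, correct=False),           # incorrect response
    Trial("p01", "dog", 150.0, correct=True),               # faster than 200 ms
    Trial("p01", "table", 620.0, correct=True, technical_error=True),
]
kept = filter_rt_trials(trials)
print([t.item for t in kept])
```

Excluded trials are dropped only from the RT analyses; whether a given trial still enters the accuracy analyses depends on the reason for exclusion.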
Single Non-word Reading (Roberts et al., 2013)
Monosyllabic non-words of three, four, five, and six letters were used (17 for each length). Non-words were pronounceable letter strings, derived by changing one letter of a standardized English word list (Weekes, 1997; Roberts et al., 2013), provided that the initial phoneme of that word remained intact. Non-words were matched for number of phonemes, summed bigram frequency, and average grapheme frequency. The procedure was identical to that described above. It is important to note that the time between the onset of the word or non-word stimulus and the onset of the correct naming response is an indicator of the LE. Of course, when subjects begin to pronounce the string, they have already decided whether reading is lexical or non-lexical.

Data Analytic Strategy
A generalized linear mixed-effects model (GLMM), a robust analysis that allows controlling for the variability of items and subjects (Baayen et al., 2002), was implemented. GLMM limits the loss of information due to the prior averaging of the by-item and by-subject analyses and has been repeatedly used in the case of RTs and errors (Paizi et al., 2013; Marinelli et al., 2016). Analyses were carried out by using R (R Core Team, 2019), with the package lme4 for fitting the models (Bates et al., 2015), and the package ggplot2 for the graphics (Wickham, 2009). The package lmerTest was used to obtain p-values and summary tables for lmer model fits on RTs (Kuznetsova et al., 2017), while a traditional model comparison was used for accuracy. Participants and items were used as independent random effects. Fixed effects varied in different analyses.
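One way to write down the model structure described here is given below. This is a sketch under stated assumptions rather than the exact specification that was fitted: it assumes random intercepts only for participants (index i) and items (index j), since the text does not say whether random slopes were included, and it assumes a logit link for the error analyses. The symbols are introduced here for illustration; the vector x_ij stands for the coded fixed effects of a given analysis and their interactions.

```latex
\begin{align}
\mathrm{RT}_{ij} &= \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i + w_j + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
w_j \sim \mathcal{N}(0, \sigma_w^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \\
\operatorname{logit}\,\Pr(\mathrm{error}_{ij} = 1) &= \gamma_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\gamma} + u_i' + w_j'.
\end{align}
```

Because u_i and w_j are estimated from trial-level data, the model uses every valid response instead of averaging over items or over participants first.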
For words, Group (DD vs. TDR), Frequency (High vs. Low), and Length (3, 5, and 7 letters) were used as fixed factors. For non-words, Group (DD vs. TDR) and Length (3, 4, 5, and 6 letters) were included as fixed factors. Analyses of the RTs were repeated using data transformed into z-scores, to control for over-additive effects (see Paizi et al., 2013 for a similar approach). It is worth noting that this transformation fixes the grand average of each participant (and therefore of each group) to zero. Therefore, in all z-score analyses the fixed effect of Group and the random effects of subject tend to be close to zero. Note that the higher the z-score, the lower the performance.
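The z-score transformation can be sketched as follows, assuming that each RT is standardized against the mean and the sample standard deviation of the same participant's valid trials (the exact reference set and SD formula are not specified in the text). The function name and the latencies are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_by_participant(trials):
    """Standardize each RT against its own participant's mean and sample SD.

    `trials` is a list of (participant, rt_ms) pairs; the result keeps the order.
    """
    by_participant = defaultdict(list)
    for participant, rt_ms in trials:
        by_participant[participant].append(rt_ms)

    stats = {}
    for participant, rts in by_participant.items():
        if len(rts) < 2:
            raise ValueError(f"{participant}: need at least two trials")
        sd = stdev(rts)
        if sd == 0:
            raise ValueError(f"{participant}: RTs do not vary")
        stats[participant] = (mean(rts), sd)

    return [(p, (rt - stats[p][0]) / stats[p][1]) for p, rt in trials]


# Hypothetical latencies (ms) for a faster and a slower reader.
trials = [
    ("fast_reader", 500.0), ("fast_reader", 600.0), ("fast_reader", 700.0),
    ("slow_reader", 900.0), ("slow_reader", 1200.0), ("slow_reader", 1500.0),
]
z_trials = zscore_by_participant(trials)
for participant, z in z_trials:
    print(participant, round(z, 2))
```

After the transformation both hypothetical readers have a mean of zero, which is why overall group differences in speed disappear from the z-score analyses while the relative pattern across conditions is retained.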

A priori Power Analysis
Given the relatively small sample size, a power analysis using G-Power (Erdfelder et al., 1996) was performed prior to data collection to determine the sufficiency of the sample, assuming a moderate effect size based on Cohen's (1988) thresholds. Considering an alpha level of 0.05 and a correlation between measurements of 0.05, a sample of 10 participants has a power of 0.80 to detect a significant interaction. Considering within-factor effects, a sample size of 8-10 is required to detect significant differences with a power of 0.80. Finally, concerning the between-factor effect, a sample of 28 is needed to have a power of 0.80 to detect significant effects. The sample size of 36, which was the sample size that we decided to obtain, has a power of 0.90 to detect a significant effect of the between-factor manipulation. The analytic approach that we decided to use (i.e., GLMM) strengthens the experimental power of the by-subject and by-item analyses and limits the loss of information due to the prior averaging of the by-item and by-subject analyses (Baayen et al., 2002; Paizi et al., 2013).

Descriptive Statistics
Means and standard deviations for both RTs and accuracy of the two groups are displayed in Table 1.

Reaction Times
Results for the GLMM on word RTs are displayed in Figure 1. The results of this word reading task demonstrate that only the DD group was affected by length and this effect was larger for longer unfamiliar words, particularly in the low frequency condition between lengths three and seven (t = −8.28, p < 0.001) and lengths five and seven (t = −7.67, p < 0.001). No LEs were present in the high frequency condition for the DD group (ps ≥ 0.908).
The TDR group did not show any LEs (ps ≥ 0.980). Post hoc analyses on the three-way interaction are presented in Table 2.

Errors
Results for the GLMM on word errors are displayed in Table 1 and Figure 3. Significant main effects were observed for Group, z = −2.73, p = 0.006, and Frequency, z = −7.22, p < 0.001. For Length, only the difference between lengths three and seven was significant, z = −2.12, p < 0.05. These results demonstrate that the DD group performed worse than the TDR group.
Additionally, both groups were more accurate in the high frequency condition as shown by the main effect of frequency. Intriguingly, the performance in both groups was very high. Only the longest words (7 letters) were read worse than the other words in the DD group.

Reaction Times
Results for the GLMM on non-word RTs are displayed in Figure 4. Significant main effects were observed for Group, F(1,34) = 12.60, p < 0.001, and Length, F(3,63) = 12.52, p < 0.001. A significant interaction was observed for Group × Length, F(3,2132) = 16.20, p < 0.001. The results of this non-word reading task demonstrate that the DD group was affected by non-word length, with significant differences between lengths three and five (t = −6.80, p < 0.001), lengths three and six (t = −7.48, p < 0.001), lengths four and five (t = −4.70, p < 0.001), and lengths four and six (t = −5.35, p < 0.001). No difference was present between lengths three and four (p = 0.413). The TDR group did not show any LEs (ps ≥ 0.962). Post hoc analyses on the interaction are presented in Table 4.

Z-Scores
Results for the GLMM on non-word z-scores are displayed in Figure 5. A significant main effect was observed for Length, F(3,63) = 6.21, p < 0.001, with no effect of Group, F(1,2160) = 1.19, p = 0.276. This latter result is not surprising since all individual performances have been centered on zero through the z-score transformation. A significant interaction was observed for Group × Length, F(3,2160) = 12.32, p < 0.001. These results confirmed those obtained with the raw data. Post hoc analyses on the interaction are presented in Table 5.

Errors
Results for the GLMM on non-word errors are displayed in Table 1. A significant main effect was observed for Group only, z = −3.03, p = 0.002, reflecting the fact that the TDR group was more accurate than the DD group.
FIGURE 3 | Error rates in the two groups in each individual condition. TDR, typically developing readers; DD, developmental dyslexics; HF, high frequency; and LF, low frequency.

DISCUSSION
The aim of this study was to investigate whether the effect of word length, usually investigated in adult DD readers of a transparent orthography, may also characterize the reading of English individuals with DD. In this study, we wanted to verify whether participants with DD showed an over-reliance on the sub-lexical route, with a consequent increase in the time needed to read words and non-words of increasing length (i.e., LE). For this reason, we compared a group of participants with DD to a group of TDRs in word and non-word reading tasks. The results of this study indicate that participants with DD did indeed present with a strong LE, compared to TDRs, in both word and non-word reading, which was particularly evident in RTs. The DD group showed a marked decrease in speed of reading as a function of the number of letters in a word. These results are similar to those observed with adult participants in transparent orthographies (Davies et al., 2007; Richlan et al., 2010; Suárez-Coalla and Cuetos, 2015) and with children reading English (Ziegler et al., 2003). A possible explanation for these results may be that participants in the DD group predominantly rely on a serial analysis of the item, remaining anchored to a sub-lexical reading strategy, which results in slower and more effortful reading. For the word reading task, intriguingly, the marked differences in the DD group were in low frequency words, particularly between length three and length seven and between length five and length seven, whereas no statistically significant differences were found between different lengths in the high frequency condition, as shown by the post hoc comparisons (see Table 2). These results may indicate that the DD group employed larger units to read familiar words, whereas they appear to switch to smaller units when reading longer unfamiliar words.
FIGURE 4 | Two-way interaction on non-words. TDR, typically developing readers; DD, developmental dyslexics; and RTs, reaction times.
The use of larger and smaller units in reading is postulated by the grain size theory (Ziegler and Goswami, 2005). The grain size hypothesis assumes that readers of inconsistent orthographies rely to a greater extent on larger units or grain sizes (e.g., syllables or even whole words), whereas readers of more consistent orthographies, such as Italian, tend to rely on smaller grain sizes (e.g., graphemes), with the reading output primarily based on grapheme-phoneme correspondence. That is, the more opaque the orthography, the larger the units employed in reading. Participants with DD were affected by the frequency of the words, with familiar words being read better than unfamiliar words at each length considered. This pattern is consistent with the employment of a lexical route by the DD group to read familiar words. These findings were confirmed by the z-score analyses and mirrored those found with adult DDs reading in a transparent orthography (see e.g., Yael et al., 2015). Aspects of the TDR group performance are also interesting to note. In contrast to earlier studies (e.g., Balota et al., 2004), we did not find any significant LE for words or non-words. Our results fit well with previous research where LE has not been found among adult English readers, except in studies which employ a large number of items and lengths (see Marinelli et al., 2016 on this point). However, the results obtained with the z-scores showed that low frequency seven letter words differed from the other lengths. This result may indicate that the TDR group struggles to read long, unfamiliar words, and hence that TDR performance might be affected by the length of the words.
FIGURE 5 | Two-way interaction on z-scores on non-words. Higher z-scores reflect lower performance. TDR, typically developing readers; and DD, developmental dyslexics.
Intriguingly, the TDR group did not show any advantage in reading high frequency words compared to low frequency words (i.e., frequency effect). We can speculate that the employment of larger units by the TDR group might determine the almost total absence of advantage in reading high frequency words compared to low frequency words. In fact, even if a difference is noticeable in terms of means in RTs between low frequency and high frequency words, such difference is not statistically significant, except in the case of the seven letter low frequency condition and only in the z-scores (see Table 3). Nevertheless, it is worth noting that this result might be due to the effects of the transformation in z-scores.
Overall the results obtained from the z-score transformation are consistent with those obtained using the RTs. However, it is worth stating that in this particular case z-score transformation might be somewhat problematic. It has been argued that to the extent that the product of intrinsic variability and processing rate differs across individuals, the z-score transformation will be differentially biased for individuals (Faust et al., 1999). In this study, we found that the variability in the TDR group was much smaller, compared to the variability in the DD group. Therefore, when the raw scores are transformed to z-scores in the TDR group, even very small differences tend to be magnified. Such an effect seems to reflect more differences in the variance than an intrinsic difference between the two groups.
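The magnification described here can be illustrated with a toy calculation. Because the z-transformation is linear within a reader, a difference between two condition means expressed in z units equals the raw difference divided by that reader's standard deviation. The sketch below uses hypothetical latencies for two readers (not data from this study) and assumes the sample SD over all of a reader's trials.

```python
from statistics import mean, stdev


def condition_difference(short_rts, long_rts):
    """Return (raw difference in ms, the same difference in within-reader z units)."""
    raw_diff = mean(long_rts) - mean(short_rts)
    sd = stdev(list(short_rts) + list(long_rts))  # SD over all of this reader's trials
    if sd == 0:
        raise ValueError("RTs do not vary")
    return raw_diff, raw_diff / sd


# Hypothetical readers with the same 20 ms raw difference between conditions.
low_variability = condition_difference([500.0, 510.0, 520.0], [520.0, 530.0, 540.0])
high_variability = condition_difference([600.0, 800.0, 1000.0], [620.0, 820.0, 1020.0])

print(f"low-variability reader:  raw = {low_variability[0]:.0f} ms, z = {low_variability[1]:.2f}")
print(f"high-variability reader: raw = {high_variability[0]:.0f} ms, z = {high_variability[1]:.2f}")
```

The same 20 ms difference is more than ten times larger in z units for the low-variability reader in this example, which is the sense in which small differences in a low-variance group can be magnified by the transformation.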
Typically developing readers seem to read familiar words by directly accessing the orthographic representation of the word (whole word recognition strategy) and unfamiliar words through the employment of large chunks such as the pattern of letters, syllables or rimes (e.g., Brown and Deavers, 1999). As previously illustrated, the inconsistency of English, in which the correspondence between letters and sounds is not always predictable, leads readers of this orthography to rely on a larger grain size to read. Indeed, the employment of smaller grain sizes by English readers is more likely to result in errors. The present results are therefore consistent with previous accounts of the use of larger units and a parallel processing mode in English readers (Ziegler and Goswami, 2005; Marinelli et al., 2016). Furthermore, the use of larger units in this group seems to help them to read even unfamiliar words quickly, showing a minimal and not statistically significant frequency effect. DD participants, instead, seem to employ smaller grain sizes to read longer and unfamiliar words, which in turn causes an increase in response latency and the LE. However, the frequency effect shown by these participants seems to highlight that they are still able to employ parallel processing of words when they are familiar.
Some useful insight can also be drawn by considering accuracy rates. Both groups were more accurate in reading high than low frequency words. This frequency effect, which the DD group also showed in RTs, confirms the availability of the lexical route in the DD group (Barca et al., 2006). Furthermore, the largest number of errors for both groups was in the low frequency set of five and seven letter lengths. This reflects the fact that in an opaque orthography, like English, long unfamiliar words might be more difficult to read than familiar words even for proficient readers, increasing the number of errors.
The non-word reading task, employed to investigate sub-lexical decoding, showed that LEs in RTs were more apparent in the DD group than in the TDR group. The marked differences in the DD group were detected between shorter non-words and longer non-words. Indeed, no significant LE was found between three letter and four letter non-words, whereas a difference was found between three letter and five letter, three letter and six letter, four letter and five letter, and four letter and six letter non-words. These results confirm that DDs can employ larger grain sizes to read even shorter non-words. However, increasing the number of letters results in smaller grain sizes being employed.
Interestingly, the TDR group did not show any LE in the non-word task, confirming that the employment of larger grain sizes is the prevailing way to read in this group, even when they encounter unfamiliar words. Indeed, the absence of a LE in the TDR group in this task is entirely consistent with the employment of larger grain sizes in typical readers of opaque orthographies compared to transparent orthographies. As for the accuracy data, the DD group made more errors than the TDR group, whose performance was also high in this task. The results obtained with the raw data were replicated with the z-scores, demonstrating that these findings are robust and might indicate that the DD group struggled with the sub-lexical decoding.
Overall, these findings suggest that the DD group presents with a large LE in both word and non-word reading, compared to TDRs, who showed very little difference between conditions in all the measures and tasks considered. Although this result seems to point to a deficit of the lexical route and an over-reliance on the sub-lexical route in DD, the frequency effect shown by DDs allows us to speculate that the lexical route is still available to this group. Furthermore, the difficulties shown by DDs in non-word reading indicate that they also struggle with sub-lexical decoding. In terms of the DRC model, it is possible that the difficulties in DD arise at an earlier stage of the model, in particular at the visual feature or at the letter unit system.
An alternative explanation of the findings comes from studies conducted with patients with pure alexia. As previously mentioned, these patients present with damage to the left fusiform gyrus in the ventral occipito-temporal cortex, an area known as the visual word form area (Dehaene and Cohen, 2011). This area seems to be involved in pre-lexical processing of visual word forms (e.g., Dehaene et al., 2005). Behaviorally, pure alexia is characterized by a slowing of letter/word processing with some participants only able to read words by identifying one letter at a time. Using sensitive non-orthographic visual tests (naming line drawings of objects, novel face matching, checkerboard and kanji character discrimination), these patients also show deficits in pattern discrimination, object naming, and face processing, and are slower as a function of the visual complexity of the stimuli (Roberts et al., 2013, 2015; Woollams et al., 2014). Future research should then investigate whether participants with DD also present with deficits in non-orthographic visual processing using the same tasks (i.e., checkerboard discrimination, novel face matching). If so, the triangle model (Patterson and Lambon Ralph, 1999; Hoffman et al., 2015) might be a more parsimonious account of these results than the DRC model and the application of the domain-general cognitive neuropsychological approach in explaining DD may prove valuable.
Establishing which model best accounts for our findings is, however, beyond the scope of this paper. Nevertheless, it would be useful for future studies to test participants with DD on the visual tasks mentioned above, work which we have already begun (Provazza et al., 2019). This would seem to be particularly relevant since patients with pure alexia present with LEs associated with other visual impairments (e.g., Roberts et al., 2013). Furthermore, similar brain abnormalities (e.g., left vOT) have been noted in DD using different methods including total brain volume, voxel- and surface-based morphometry, white matter, diffusion imaging, brain gyrification, and tissue metabolite (for review see Ramus et al., 2018). Consequently, an association seems to exist between the neural bases of dyslexia (acquired and developmental) and visual and phonological impairments. It would also be interesting to compare participants with DD reading different orthographies such as Italian and English (transparent vs. opaque; see Marinelli et al., 2016 on this point).
To summarize, our results have shown that the LE seems to characterize DD not only in transparent but also in opaque orthographies, like English. This research presents an original contribution to our understanding of DD in English speakers. In fact, in the extant literature, LEs appear to be scarcely evaluated in DD in opaque orthographies and, in particular, in adults with DD. Furthermore, this study clearly showed that participants with DD are severely impaired in RTs, whereas they performed better in terms of accuracy, although this was lower compared to that of the TDR group.
It is worth noting that this study has some limitations. For instance, participants were not matched for IQ. However, we would expect differences in IQ to be insignificant in this sample of academically able adults in higher education, and thus they would not substantially affect the conclusions drawn. IQ is a very generic and broad concept, and in fact, the use of some intelligence batteries has been recently questioned. For example, some authors (Giofrè and Cornoldi, 2015; Giofrè et al., 2019) have highlighted important biases in the use of intelligence estimates in studies of children with learning disabilities. Principally, differences in IQs might reflect artifacts of the battery in use, rather than real differences in the proposed latent variables. Notwithstanding the conclusions drawn from the present sample, we do acknowledge that in more differentiated samples the use of intelligence tests may be worthwhile (see e.g., Kemp et al., 2009; Paizi et al., 2013). A further limitation might be the sample size, which was not very large. Nevertheless, the a priori power analysis showed that a sample size of 36 participants was sufficient to obtain robust results. Moreover, the analytic approach that we employed (i.e., generalized linear mixed models) strengthened the experimental power of the by-subject and by-item analyses and limited the loss of information due to the prior averaging of the by-subject and by-item analyses (Baayen et al., 2002; Paizi et al., 2013). Despite these limitations, the results of this study provide insight into LEs in adult participants with DD reading in an opaque orthography and show that the LE is a critical feature in DD regardless of the orthography. Additionally, since LEs are observed in highly educated participants with DD, they might be an aspect to be clinically assessed in adults with DD in higher education and beyond.
Previous research indeed has shown a lack of consensus about how university students should be diagnosed, since their performance in achievement tests is often in the average range (e.g., Sparks and Lovett, 2009). These findings might prove fruitful to clinicians working with DD university students, although further research is needed to confirm the results obtained in this study.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Liverpool John Moores University Research Ethics Committee. The patients/participants provided their written informed consent to participate in this study.