Factors That Influence Improvement in Numeracy, Reading, and Comprehension in the Context of a Numeracy Intervention

In a randomized controlled trial, 104 primary school children who received an individualized numeracy intervention, Catch Up Numeracy, were compared with 100 children who received matched-time teaching and 107 who received business-as-usual teaching. They were assessed before and after the intervention on the Number Screening Test and on both the reading and comprehension components of the Salford Sentence Reading Test. Those who received the intervention improved significantly more than the controls in numeracy but not in reading or comprehension. Numeracy, reading, and comprehension scores were significantly correlated. Both reading and numeracy predicted improvement in comprehension, but only comprehension predicted improvement in reading, and neither literacy measure predicted improvement in numeracy. Children eligible for free school meals scored lower than others on all pre-tests and post-tests, but did not differ in their levels of improvement. Age negatively predicted improvement in reading and comprehension, but not numeracy. Gender affected comprehension but not reading or numeracy.


INTRODUCTION
This study investigates certain factors that influence children's levels of improvement in response to a mathematics intervention. We discuss both the general levels of response to the mathematics intervention and the question of whether the extent of progress is influenced by children's performance on measures of literacy.
Evidence shows that reading and mathematical abilities are correlated, and in particular that reading and mathematical disabilities often show comorbidity (Miles et al., 2001; Fuchs et al., 2004; Dirks et al., 2008; Rubinsten, 2008; Slot et al., 2016). Moreover, children with comorbid mathematics and reading disabilities tend to do less well on mathematical tasks than children with mathematical disabilities without reading disabilities (Jordan and Montani, 1997; Jordan and Hanich, 2000; Jordan et al., 2003). This association is far from invariable, and discrepancies between reading and arithmetic are common (Jordan et al., 2003; Landerl et al., 2009). Some studies suggest that there are common factors underlying mathematical and reading disabilities, e.g., phonological abilities (Slot et al., 2016). Other studies suggest that this may be only true of those who do have comorbid reading and mathematical difficulties. Moll et al. (2015) found that children with mathematical difficulties alone tend to have deficits in processing numerosities, while those with combined reading and mathematical difficulties tend to have deficits in phonological awareness.
It is important to understand more about the relationships between reading and arithmetic, in order to increase our understanding of both arithmetical development and reading development in their own right, and possibly of the factors that may influence the nature, treatment and outcomes of reading difficulties and arithmetical difficulties.
There are several issues that limit the conclusions that can be drawn with regard to existing studies of the influences of reading ability on the nature and outcomes of children's mathematical difficulties. One is that most studies have compared children who have mathematical difficulties with and without comorbid reading difficulties, but have not investigated the effects of continuous variations in reading ability on mathematical difficulties. Another is that neither arithmetic nor reading is a unitary ability.
Arithmetical ability is not a single entity but is made up of many components (Dowker, 2005, 2015), and different components appear to be differentially related to reading ability. It is usually found that reading difficulties are more associated with difficulties in retrieval of arithmetical facts than with other aspects of arithmetic (Miles et al., 2001; Singleton, 2006, 2009; Goebel and Snowling, 2010).
Reading also has different components: most notably decoding ability and comprehension. Most studies of the relationships between reading and mathematics have not separated the effects of decoding (usually treated as synonymous with reading) and comprehension. Those studies that have separated the two have tended to suggest that decoding is more associated with arithmetical fluency, possibly because phonological awareness contributes to both (De Smedt et al., 2010; Jordan et al., 2010), while comprehension is more associated with mathematical reasoning and word problem solving (Pimperton and Nation, 2010; Vukovic et al., 2010; Bjork and Bowyer-Crane, 2013; Bjorn et al., 2016).
Most studies of the relationships between reading and arithmetic have been cross-sectional rather than longitudinal. In particular, few have looked at the influence of either reading or arithmetic on response to intervention in the other subject. An exception is a study by Fuchs et al. (2004). They gave a 16-week mathematical problem-solving intervention to children who were assessed to be at risk of reading disability, mathematics disability, both or neither. All at-risk groups showed less improvement than the no-risk group in computation and labeling; and those at risk of both showed less improvement in conceptual underpinnings. However, mathematics-related abilities were better predictors of improvement than reading-related abilities. Thus, it seems that reading-related limitations are a negative predictor of improvement in mathematics, but not as much as mathematical limitations.
Although mathematics-related limitations have in some studies (Fuchs et al., 2004) been associated with less improvement as well as lower current performance, we predicted that initial mathematics score would be a negative predictor of improvement (i.e., that children with lower initial scores would improve more), since parallel forms of the same test were being used, and there is more room for improvement if scores are lower to start with.
The present study was carried out in the context of an evaluation of a numeracy intervention, funded by the Education Endowment Foundation. The evaluation included pre-tests and post-tests not only in numeracy but in reading (decoding) and comprehension, making it possible to investigate both the specificity of effects of the intervention on numeracy, and more generally, whether numeracy influenced performance and improvement in reading or comprehension, and vice versa. There was also some information about the children's socioeconomic status, which made it possible to investigate its effects on performance and improvement in all the domains studied.
The intervention studied was Catch Up™ Numeracy, developed by the author in collaboration with Graham Sigley and the Catch Up™ Trust (Dowker and Sigley, 2010; Holmes and Dowker, 2013; Dowker and Morris, 2014). The target pupils for this intervention are primary school pupils who have numeracy difficulties (not necessarily amounting to dyscalculia), and its key focus is assessing and targeting specific strengths and weaknesses. The intervention begins by assessing the children on 10 components of early numeracy. Each child is assessed individually by a trained teacher/teaching assistant. This assessment is used to construct a "Catch Up Numeracy" learner profile, which determines the entry level for each of the 10 Catch Up Numeracy components and the appropriate focus for numeracy teaching. Children are provided with mathematical games and activities targeted to their specific levels in specific activities.
The children receive two 15-min sessions per week for ∼30 weeks, focusing on the components with which they have difficulty.
For a detailed account of the intervention programme, see Holmes and Dowker (2013). The focus of the present study is more on the characteristics in children that may influence improvement in general, and response to intervention in particular.
The present investigation involved a randomized controlled study, which compared children who underwent the intervention with controls who received business-as-usual teaching. There was an additional control group, who received equivalent time for individualized numeracy intervention not using Catch Up. However, this part of the study proved problematic, as the randomization of the groups was within schools, and there was evidence that there was often communication between the staff involved, so that the staff supposedly administering the equivalent-time measure were often adopting Catch Up techniques from other staff (this issue is being addressed in an ongoing follow-up study). Several predictions were made.
(1) On the basis of earlier findings (Dowker and Sigley, 2010; Holmes and Dowker, 2013), it was predicted that children who underwent Catch Up Numeracy would show more improvement than controls.
(2) It was predicted that girls might perform better at reading and comprehension, given that studies often show better literacy performance by girls (e.g., OECD, 2015).
(3) No gender difference was expected for improvement in any of the domains.
(4) It was predicted that pupils eligible for free school meals would perform less well in all domains, given that most studies show a strong effect of SES on academic performance (e.g., Melhuish et al., 2008; Dickerson and Popli, 2016).
(5) It was also expected that pupils eligible for free school meals might show less improvement and, in the case of mathematics, less response to intervention, on the basis of somewhat parallel findings with regard to literacy (Torgesen et al., 1999).
(6) It was predicted that chronological age might negatively predict improvement in all domains, as any weaknesses might become harder to correct, whether by external intervention or by standard teaching, as children become older.
(7) As most studies show that academic skills correlate with one another and with IQ (e.g., see Mellanby and Theobald, 2014), it was expected that scores in reading, comprehension and numeracy would all correlate significantly with one another; and that all would correlate with an IQ measure.
(8) As regards influences on improvement, it was tentatively predicted that reading would predict levels of improvement in comprehension and vice versa, but that numeracy would not influence improvement in either.
(9) It was, however, expected that reading would influence improvement in, and possibly response to, intervention in numeracy, but that comprehension would not.
This was because the numeracy task predominantly involved computation and number understanding, and contained only a small element of word problem solving; and previous findings had suggested the former are more strongly related to decoding and the latter to comprehension (e.g., Fuchs et al., 2004).

Ethics
The NFER has a well-developed Code of Practice that contains detailed ethical protocols. These protocols govern all research undertaken by NFER and the trial lies within them. Parents gave active written consent for all eligible pupils put forward for the intervention and testing, and the Catch Up team confirmed that consent had been received before continuation of the trial.
Parental consent was obtained (see above). Interventions were carried out by teachers or teaching assistants already employed by the schools. All researchers involved in testing had undergone enhanced Criminal Records Bureau/Disclosure and Barring Services checks.

Design and Participants
The larger-scale study originally included 336 participants. All had been selected by their schools as low attainers in numeracy who might benefit from intervention. Six pupils from each of 53 primary schools were randomly assigned to one of three groups: a control group that received business-as-usual teaching, a Catch Up Numeracy intervention group that received the intervention as described above, and a "matched time" group that received two 15-min sessions a week without Catch Up Numeracy, to replicate the one-to-one nature of the intervention. One hundred and twelve pupils were assigned to each group. Because 25 children moved from their schools or were consistently absent for tests, the number of participants was reduced to 311: 104 in the Catch Up Numeracy group, 100 in the Matched Time group, and 107 in the Business as Usual group.
The 311 children included 146 boys (49 in the Catch Up group, 39 in the Matched Time group and 58 in the Business as Usual group) and 165 girls. The overall mean age of the participants was 97.51 months with a standard deviation of 14.85. The ages of the different groups are given in Table 1. An ANOVA showed no significant group difference in ages.

Tests
Before the start of intervention, the children were given the Non-Reading Intelligence Test (Young and McCarty, 2012); the Numeracy Screening Test (Gillham et al., 2012) and the New Salford Sentence Reading Test (Bookbinder et al., 2012). The latter includes tests of both Reading and Comprehension. They were given parallel forms of the same tests, ∼8 months later, after the intervention; except for the Non-Reading Intelligence Test, which was not repeated. Table 1 gives the mean starting ages of the children in the Catch Up, Matched Time, and Business as Usual groups and their initial standard scores, for all the tests. A multivariate analysis of variance was carried out with Assignment (Catch Up vs. Matched Time vs. Business as Usual) as the grouping factor, and Age, Non-Reading Intelligence Test standard score, and initial standard scores in Numeracy, Reading, and Comprehension as the dependent variables. The table gives the resulting F-values, p-values, and effect sizes (partial eta squared). The multivariate F (5, 306) = 1.34; p = 0.25; partial eta squared = 0.021.
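As a reading aid, the reported effect size can be related to the reported F-value through the standard identity: partial eta squared = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the figures quoted above. The function name is hypothetical, and this is an illustrative check rather than a description of how the original analysis was computed.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    effect = f_value * df_effect
    return effect / (effect + df_error)


# Figures quoted in the text for the pre-test comparison of the three groups:
# multivariate F (5, 306) = 1.34.
eta_sq = partial_eta_squared(1.34, 5, 306)
print(round(eta_sq, 3))  # 0.021, matching the reported value
```

An effect size of this magnitude means that group assignment accounted for about 2% of the variance not explained by other sources, which fits the non-significant result.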

RESULTS
As can be seen, there were no significant differences between the groups in age or in any of the initial test scores. Table 2 gives the post-test scores. A multivariate analysis of variance was carried out with Assignment (Catch Up vs. Matched Time vs. Business as Usual) as the grouping factor, and post-test standard scores in Numeracy, Reading, and Comprehension as the dependent variables. The table gives the resulting F-values, p-values, and effect sizes (partial eta squared). The multivariate F (3, 308) = 2.03; p = 0.11; partial eta squared = 0.019.
Again, none of the comparisons were significant.
Table 4 gives boys' and girls' pre-test standard scores, for all the tests. A multivariate analysis of variance was carried out with Gender (Boys vs. Girls) as the grouping factor, and Non-Reading Intelligence Test standard score, and pre-test standard scores in Numeracy, Reading, and Comprehension as the dependent variables. The table gives the resulting F-values, p-values, and effect sizes (partial eta squared). The multivariate F (4, 307) = 5.48; p < 0.001; partial eta squared = 0.063.

Gender Effects
As can be seen, girls scored higher in both the intelligence test and the comprehension test, but there were no significant gender differences in numeracy or in reading.
As can be seen, there were no significant gender differences in any of the post-test scores. Table 6 gives boys' and girls' standard score gains. A multivariate analysis of variance was carried out with Gender (Boys vs. Girls) as the grouping factor, and standard score gains in Numeracy, Reading, and Comprehension as the dependent variables. The table gives the resulting F-values, p-values, and effect sizes (partial eta squared). The multivariate F (4, 307) = 2.107; p < 0.099; partial eta squared = 0.099.
The only significant group difference was for Comprehension Standard Score Gain, where boys made greater gains.
Thus, children eligible for free school meals performed significantly less well on all pre-tests and post-tests than children who were not eligible for free school meals, despite the fact that all of the children were selected for their low attainment in numeracy. Socio-economic status clearly has a strong effect on primary school children's performance in literacy and numeracy. However, free school meal status had no effect on children's gains.

Effects of Free School Meal Status
Similar ANOVAs were carried out with both Assignment and Free School Meals Status as grouping factors, to investigate the possibility of interactions. No significant interactions were found for any of the dependent variables, so the results will not be reported further.

Correlations
Pearson correlation coefficients were computed between the initial standard scores in all three domains and the Non-Reading Intelligence standard score, and between these scores and chronological age in months. All correlations were significant. With 311 participants, Numeracy correlated highly with Reading (r = 0.449; p < 0.001) and Comprehension (r = 0.42; p < 0.001) as well as with Non-Reading Intelligence (r = 0.279; p < 0.001).
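To see why even the smallest of these correlations is significant with a sample of this size, a Pearson r can be converted to a t statistic with n − 2 degrees of freedom: t = r × sqrt((n − 2) / (1 − r²)). The sketch below applies this standard conversion to the three coefficients quoted above. The function name is hypothetical, and the comparison value of roughly 3.3 is the approximate two-tailed p < 0.001 threshold for large samples, stated here as a rule of thumb rather than an exact critical value.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


n = 311  # number of participants reported in the text
reported = [
    ("Reading", 0.449),
    ("Comprehension", 0.42),
    ("Non-Reading Intelligence", 0.279),
]
for label, r in reported:
    # Each value can be compared with the approximate p < 0.001 threshold (~3.3).
    print(f"Numeracy vs {label}: t({n - 2}) = {t_from_r(r, n):.2f}")
```

All three t values come out well above that threshold, so the reported significance levels are plausible given the sample size.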
Another entry level multiple regression was carried out with Comprehension Standard Score Gain as the dependent variable and Initial Comprehension Standard Score, Age, Initial Reading Standard Score, Initial Numeracy Standard Score, and Non-Reading Intelligence Standard Score as the predictors. An entry level multiple regression was carried out with Numeracy Standard Score Gain as the dependent variable and Initial Reading Standard Score, Age, Initial Comprehension Standard Score, and Initial Numeracy Standard Score as the predictors. R² = 0.196; F (5, 306) = 14.487; p < 0.001. The significant independent predictors were Initial Numeracy Standard Score, which was a strong negative predictor [β = −0.485, t (5, 306) = −7.77; p < 0.001] and Initial Comprehension Standard Score, which was a positive predictor [β = 0.301, t

DISCUSSION
Firstly, the results show that, as predicted (Prediction 1), those who underwent the intervention showed significantly more improvement in numeracy than those who did not. They showed an average of nearly 5 months greater gain in number age and over four points greater gain in standard score than those who underwent "business as usual." Analysis of ratio gains showed that children who underwent intervention also showed more than twice the level of improvement that would be expected from the passage of time alone. Thus, the results support earlier findings that the Catch Up Numeracy intervention leads to a significant improvement in mathematics performance (Dowker and Sigley, 2010; Holmes and Dowker, 2013). There was no significant effect of the numeracy intervention on improvement in reading or comprehension, indicating that the effect was specific to numeracy.
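The ratio gains referred to above are conventionally computed as the gain in test age (here, number age in months) divided by the number of months elapsed between pre-test and post-test, so that a value of 1 corresponds to the progress expected from the passage of time alone. A minimal sketch follows; the function name and the child's scores are hypothetical and are not taken from the study data.

```python
def ratio_gain(pre_age_months: float, post_age_months: float,
               months_elapsed: float) -> float:
    """Gain in test age (months) per month of elapsed time."""
    if months_elapsed <= 0:
        raise ValueError("months_elapsed must be positive")
    return (post_age_months - pre_age_months) / months_elapsed


# Hypothetical child: number age rises from 72 to 90 months
# over an 8-month interval between pre-test and post-test.
print(ratio_gain(72, 90, 8))  # 2.25
```

On this definition, a ratio gain above 2 means the child gained more than two months of number age for every month that passed.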
There was, however, no significant difference in improvement between children who underwent the Catch Up Numeracy intervention and the Matched Time intervention; though it was found that the Catch Up Numeracy intervention differed significantly from the Business as Usual intervention, while the Matched Time intervention did not. In previous studies, the Catch Up Numeracy intervention had resulted in significantly more improvement than Matched Time intervention (Dowker and Sigley, 2010;Holmes and Dowker, 2013). It is possible that the current results are due to a contamination effect, as the teaching assistants delivering the Catch Up Numeracy and Matched Time interventions were in the same schools, and interview evidence suggests that some of the teaching assistants delivering Matched Time interventions were influenced by input from those delivering Catch Up Numeracy interventions. An ongoing randomized controlled study is currently being conducted to compare Matched Time with Catch Up Numeracy.
It is notable that the children in general showed improvement in all tests between pre-test and post-test. This may be due to regression to the mean; to "Hawthorne effects" of being in schools that were part of a study programme even in the case of controls; or to increased familiarity with test expectations, even though they were given parallel forms rather than repetitions of the same test.
There were a few factors that appeared to affect initial performance, level of improvement or both. Gender had very little effect. Prediction (2) that girls would do better on reading and comprehension tests was only partially confirmed. They did do better on the comprehension pre-test, but not the post-test; and they did not differ in reading. The group was somewhat atypical, as the children had been selected for being low attainers in arithmetic; though their scores on literacy measures were much higher than those on arithmetic. Prediction (3) that gender would not influence improvement was broadly supported. Gender had virtually no influence on performance, with one exception: boys made significantly more gains in Comprehension than did girls. This seems to be due to the fact that they started at a lower point, but ended at the same point. This result is a little hard to interpret, and would need further replication to ensure that the findings are not due to chance. If replicated, it may reflect some differences between boys and girls as regards the timing of developmental changes in language comprehension.
In accordance with Prediction 4, one factor that strongly influenced performance was SES, as indicated by free school meal status. Children who were eligible for free school meals performed much less well than other children in all domains, both at pre-test and post-test. However, contrary to Prediction 5, free school meal status did not influence level of improvement in any of the domains, nor did it show any interaction with intervention group assignment with regard to improvement in numeracy. Thus, while there is a striking effect of socio-economic status on academic performance, even within a group already selected for low achievement in arithmetic, it does not appear to influence the chances of improvement, or the response to intervention.
Unexpectedly, chronological age was positively correlated with initial standard scores in all the tested domains, despite the fact that the scaling is carried out to control for age. One possible explanation may be that older children did not have to be as markedly delayed as younger children for teachers to note that they were having difficulties and recommend them for intervention. In accordance with Prediction 6, age was a negative predictor of improvement in Reading and Comprehension, even after controlling for initial scores. However, age did not predict improvement in numeracy, either overall or in any of the Assignment groups. Thus, the prediction that age might be negatively associated with improvement was supported for the literacy measures, but not for numeracy. This is not due to intervention nullifying this relationship, since age was not associated with numeracy improvement in the Business as Usual group any more than in the intervention groups. Presumably as a result of this negative association between age and improvement, the correlations between age and the literacy measures disappeared between pre-test and post-test, while the correlations between age and Numeracy persisted.
In accordance with Prediction 7, standard scores in all domains correlated with one another, with the highest correlation being between Reading and Comprehension; and IQ correlated with all the pre-test standard scores. Gains in Reading and Comprehension correlated significantly with one another, but not with gains in Numeracy. Multiple regressions showed that in all domains, initial scores were negative predictors of gains in the same domain, presumably because the lower the initial score, the more room there is for improvement.
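The negative relation between initial score and gain described above is attributed in the text to room for improvement. A related statistical mechanism, regression to the mean, can produce the same pattern whenever test scores contain measurement error, even if no child truly changes. The simulation below is a sketch of that mechanism only: the mean, standard deviations, sample size, and function names are all hypothetical and are not estimates from the study.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_gain_correlation(n_children=5000, true_sd=10.0,
                              noise_sd=5.0, seed=0):
    """Correlate pre-test score with gain when true ability never changes."""
    rng = random.Random(seed)
    pre_scores, gains = [], []
    for _ in range(n_children):
        ability = rng.gauss(85.0, true_sd)           # stable underlying ability
        pre = ability + rng.gauss(0.0, noise_sd)     # pre-test = ability + error
        post = ability + rng.gauss(0.0, noise_sd)    # post-test = ability + new error
        pre_scores.append(pre)
        gains.append(post - pre)
    return pearson_r(pre_scores, gains)


r = simulate_gain_correlation()
print(f"correlation between initial score and gain: {r:.2f}")
```

Under these assumptions the correlation is clearly negative, which is one reason analyses of gain scores usually include the initial score as a predictor, as the regressions reported here do.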
In accordance with Prediction 8, initial score in Reading predicted progress in Comprehension, and vice versa, indicating that these are indeed two closely related abilities, longitudinally as well as concurrently.
Contrary to Prediction 9, neither initial Reading score nor initial Comprehension score predicted improvement in Numeracy, whether for the Catch Up group, the Matched Time group, the Business as Usual group, or the sample as a whole. Thus, it seems that, while literacy measures do correlate with numeracy, they do not influence children's mathematical progress, or the effectiveness of intervention, and the factors that influence progress in literacy seem to be different from those that influence gains in Numeracy. Intriguingly, initial score in Numeracy predicted progress in Comprehension but not in Reading. This had not been expected, either in terms of the direction of the association, or in terms of the greater association between mathematics and comprehension than between mathematics and reading. The latter was especially unexpected, in view of the fact that the mathematics test was one of numeracy, rather than mainly involving the word problem solving and mathematical reasoning abilities, previously found to be more associated with comprehension. However, it is noteworthy that Harlaar et al. (2012) carried out a twin study involving 12-year-olds, and found higher genetic and phenotypic correlations between mathematics and reading comprehension than between mathematics and word decoding.
There are some limitations to this study that should be addressed in future studies. As mentioned above, one is the need for an equivalent time group, which avoids the problem of cross-contamination by using between-school rather than within-school randomization. Also, it would be desirable if possible to match children more precisely on their test scores at the start. Although the initial differences between groups were non-significant, the Catch Up group showed a somewhat lower initial Numeracy score than the Business as Usual group (see Table 1), seemingly resulting in the fact that although they showed significantly greater gains, they did not differ significantly in the post-test Numeracy score when not controlling for initial Numeracy score.
It would also be of considerable interest to carry out studies that include interventions in Reading and Comprehension as well as Numeracy, in order to be able to assess influences on response to intervention in these literacy measures as well as in numeracy.
Finally, it would be desirable to look at the factors influencing improvement in these domains over a wider range of ability in these domains. Would the same factors influence or fail to influence improvement in Numeracy in children who were initially performing at average and above-average levels, as in these children, who were selected for weaknesses in arithmetic? Would the finding that, for example, initial Numeracy score predicted improvement in Comprehension but not vice versa be replicated in a group who were better at Comprehension than Numeracy to start with? Would such predictive relationships differentiate between children with specific difficulties in literacy or numeracy and those who are performing poorly in all academic domains?
In any case, the results indicate that relationships between different abilities, and between these abilities and other factors such as age, are not simple or static. Future studies should focus more on how such relationships change over time, and how initial factors may predict changes over time in general, and response to interventions in particular.
There are several implications for education. One is that a structured individualized system of one-to-one teaching can lead to quite significant improvement in children with numeracy difficulties, and it does not need to be highly intensive to be effective. Another is that, at least among primary school children, such interventions can be effectively delivered at any age: age did not affect the level of improvement that children showed. Children's socio-economic status, as shown by free school meal status, also does not seem to affect response to intervention, though it does affect the overall level of performance. The results suggest that there are strong concurrent correlations between numeracy and literacy measures. However, they do not suggest a strong longitudinal relationship between numeracy and literacy. Numeracy improvement, whether within or outside the context of intervention, was not predicted by either reading or comprehension. However, it appears that numeracy, at least in a group selected as low attainers in numeracy, can to some extent predict children's progress in reading comprehension (but not decoding), at least in the short term. Since there was no such relationship in the reverse direction, it is unlikely to indicate a strong intrinsic relationship between numeracy and comprehension. It is possible, however, that numeracy is a prerequisite for, but not a consequence of, improvement in comprehension; though there appear to be no previous studies indicating such a relationship. More likely, some domain-general ability may be influencing both. Such an ability is unlikely to be general logical reasoning, as the intelligence measure used in this study did not predict improvement in comprehension, and the relationship between initial numeracy level and comprehension remained significant even after controlling for this measure.

AUTHOR CONTRIBUTIONS
The author confirms being the sole contributor of this work and approved it for publication.

FUNDING
The Education Endowment Foundation provided funding for the intervention study, on which this article is based.