Individualized Assignments, Group Work and Discussions: How They Interact With Class Size, Low Socioeconomic Status, and Second Language Learners

Varied teaching techniques are an important aspect of a successful classroom. Student and classroom factors such as ability level, lower socioeconomic status, and/or native language can interact with teaching techniques. Previous work suggests that individual teaching techniques may be more effective for some students or classroom situations than for others, but few studies have directly examined which factors relate to effective teaching techniques. This study uses data for early secondary school students in Germany from the National Education Panel Study (NEPS) to examine the effects of group work, discussions, and individualized assignments on reading and math competency change between 7th and 9th grade. Additionally, we model the interactions of class size, second language learner background, and lower socioeconomic status with these teaching techniques. We conclude that group work relates to more competency growth in math for second language learners, while classroom discussions relate to less growth for second language learners. Discussions relate to less growth in math competency in smaller classes and more growth in larger classes. Group work was also related to slower reading competency growth for children with a higher prior ability level. Findings are discussed in relation to existing theories of teaching techniques.


INTRODUCTION
Increasingly heterogeneous classrooms are common throughout developed countries. These classrooms include second language learners and learners with low socioeconomic status (SES), and both backgrounds may negatively affect the development of competencies in early secondary school (Björklund and Salvanes, 2011; Solari et al., 2014). An important job of educators is to provide instruction for a diverse group of learners. Inclusive teaching techniques such as individualized assignments, group work, and discussions are often considered effective in heterogeneous classrooms (Tomlinson and Imbeau, 2011; Tomlinson and Moon, 2013; Cohen et al., 2014; Miller et al., 2017). However, large-scale studies that assess the efficacy of these techniques for diverse learners are uncommon. Additionally, class size can affect the utilization of each of these teaching techniques (Blatchford and Russell, 2018). This paper utilizes data from a large-scale assessment to examine the roles of individualized assignments, group work, and discussions in math and language competency, and it examines the relationships between these teaching techniques, student background, and class size.
Migrant background and low SES have been repeatedly linked to worse educational outcomes in both language and math skills. The link between a lower SES and poorer education outcomes has been demonstrated in many countries around the world (White, 1982; Lekholm and Cliffordson, 2008; Currie, 2009; Björklund and Salvanes, 2011; Rambo-Hernandez and McCoach, 2014; Sirin, 2016; DeVries et al., 2018). Furthermore, recent studies have indicated that income levels and job categorization are weaker correlates of cognitive function than parental education levels (Eilertsen et al., 2016), while parental educational behavior is a better indicator (Rindermann and Baumeister, 2015). Being a non-native speaker is also a risk factor for low achievement in reading (Solari et al., 2014) and other skills (Crosnoe and Fuligni, 2012).
Effective teaching strategies can significantly reduce the negative effects of disadvantaged backgrounds (Torres, 2018), and individualized instruction is a hallmark feature of universal design for learning (Tomlinson and Imbeau, 2011; Bešić et al., 2016; Ok et al., 2016). Through such individualized instruction, children are provided specialized tasks (e.g., classwork and homework) based upon their current ability level. Similarly, group work promotes student learning and development (Tomlinson and Imbeau, 2011; Miller et al., 2017). Group work can improve the engagement of children to foster their learning (Roseth et al., 2008; Cohen et al., 2014). Furthermore, groups of heterogeneous ability levels can provide a greater boost for at-risk and lower achieving children (Marzano et al., 2003; Igel and Urquhart, 2015). Meanwhile, effective classroom discussions can also foster learning (Tomlinson and Imbeau, 2011; Jocz et al., 2014). However, not all students gain equally from discussions (e.g., Kang and Keinonen, 2018). Group work and discussions can provide learners individualized, student-centered instruction, but the outcomes of these approaches may be inconsistent across student and teacher ability levels (Kang and Keinonen, 2018).
There is mounting evidence of greater benefits of group work and individualized instruction for learners with certain backgrounds. For instance, King-Sears et al. (2015) found that implementing universal design for learning (UDL) principles in the classroom produced greater benefits at posttest for learners with disabilities. Learners with an immigrant background may be more engaged in mathematics learning in group work settings (Takeuchi et al., 2019), which may result in a greater benefit for learners with a migration background. However, in a qualitative study in Iceland, Benediktsson and Ragnarsdottir (2019) found that some immigrant children had very positive experiences and others negative experiences with group work. Relatedly, another exploratory study recently indicated that high achieving children may experience group work quite differently than other children (Cera Guy et al., 2019) and that they may not be more motivated or engaged by it. By the same token, learners with a migration background may be more worried and anxious about participating in classroom discussions, leading to a lower level of participation and fewer benefits (Maeda, 2017). On the other hand, discussions highlighting multiple methods and equal participation may be more beneficial for second language learners (Banes et al., 2018).
Teaching strategies are also affected by class size. Class size alone is generally found to have a small but significant effect on learning outcomes, with larger classes leading to slightly worse outcomes on average (Brühwiler and Blatchford, 2011; Krassel and Heinesen, 2014; Watson et al., 2016). Early evidence of this was well-summarized by Glass and Smith's (1979) meta-analysis, which indicated a continuous improvement for shrinking class sizes below around 30 students. However, in a re-examination of their data, Phelps (2011) argued for a more nuanced and careful conclusion, namely that the relationship between class size and achievement is a complex one. This is in line with Hattie's (2005) conclusion that class size reductions do not necessarily result in improved performance. Class size may affect related student variables such as a sense of belonging, cohesion, and motivation (e.g., Harfitt and Tsui, 2015), and it may affect teacher variables such as classroom management, teaching techniques, the use of group work (Blatchford and Russell, 2018), and the amount of individual attention for each student. Indeed, the use and effectiveness of teaching techniques may vary based on class size (Wright et al., 2017). Blatchford and Russell's (2018) conclusion is that class size may not have a direct effect on student learning, but instead, it affects classroom groups, management, and teaching. As a result, group work and individualized instruction may be more suited to smaller classes where classroom management is easier. By contrast, teacher techniques such as discussions may suffer less in larger classrooms.
One recent large-scale study examined effective use of teaching techniques. Hofmann and Mercer (2016) focused on group work and discussion-specific techniques in math classes, but they did not extend their results to other subjects, teaching techniques, or diverse learners. Instead, they focused on specific strategies for teachers to improve teaching techniques rather than how those techniques interact with student and classroom variables. The National Education Panel Study (NEPS; Blossfeld et al., 2011) provides a useful database to examine specific teaching techniques in relationship to learner ability level, background, and classroom situation. NEPS is a large-scale, multi-cohort study, which tracks many psychological, sociological, economic, educational, and other variables for a large, representative sample of German children, adolescents, and adults. Within such a dataset, longitudinal connections can be found between teaching techniques and a diverse set of learner risk factors.
The present study examines the role of low SES, as indicated by parental education levels, and second language learning in the development of math and reading skills in secondary students between grades seven and nine. We also examine the role of teachers' self-reported use of individualized assignments, group work, and discussions. Lastly, we examine the effects of class size and possible interactions between class size and the use of group work, discussions, and individualized assignments. Our data come from teacher questionnaires and student competency test results in NEPS (Blossfeld et al., 2011). NEPS tracks the longitudinal development of multiple cohorts drawn from representative samples across Germany. We use data from the third cohort (SC3), which started data collection in the fifth grade. NEPS's unique combination of surveys and competency tests allowed us to investigate three research questions examining the interaction of these critical factors in the development of reading and math competency.
1) How do lower parental education levels and second language learning affect the development of reading and math competency in secondary school? We expect that both of these factors will negatively impact competency gain in both reading and math in seventh to ninth grade (Solari et al., 2014; Gebhardt et al., 2015; DeVries et al., 2018).
2) How does the use of individualized assignments, group work, and discussions affect the development of reading and math competency in secondary schools? We expect that the use of these three teaching techniques will overall benefit the instruction of children (Roseth et al., 2008; Tomlinson and Imbeau, 2011; Tomlinson and Moon, 2013; Miller et al., 2017). We further expect such techniques will produce a greater benefit for second language learners and for learners whose parents lack a university degree (e.g., Banes et al., 2018; Kang and Keinonen, 2018; Takeuchi et al., 2019).
3) How does class size affect the development of reading and math competency in secondary school, and does it interact with the use of group work, classroom discussions, and individualized assignments? We expect to find that a larger classroom has a small, negative effect on competency (Glass and Smith, 1979; Phelps, 2011), and that it interacts negatively with teaching techniques suited to smaller classes (i.e., group work and individualized assignments; Harfitt and Tsui, 2015; Wright et al., 2017; Blatchford and Russell, 2018).

Participants and Sample
NEPS is a longitudinal study whose questionnaires are administered in waves roughly a year apart. In our study, we considered only students who participated in waves three through five (seventh through ninth grade, N = 5,119). In order to account for drop-outs and to obtain an appropriate sample in terms of school type, federal state, regional classification, and funding institution, we weighted these students according to the provided sampling weights, which were specific to students who participated in these three waves (Steinhauer and Zinn, 2016). However, our analysis combines responses from three sources: students, their parents, and their subject-specific teachers. Not all parents and teachers participated in the study. Self-reported, parent-reported, and school track data were available for 2,732 students. From this sample we obtained separate samples for math and German classes because the availability of teacher responses and competence data was subject specific. Complete reading competence data and German teacher questionnaire data were available for 794 students, and complete math competence data and math teacher questionnaire data were available for 1,072 students. A descriptive overview of the samples is given in Table 1, which includes the weighted means and proportions of missing data for all variables in the aforementioned samples. For all non-dichotomous variables, standard deviations are also provided. All values in Table 1 include sample weights. The initial dataset column includes the summary for all students who participated in waves 3 (7th grade) and 5 (9th grade) of the longitudinal study. The parent and school data column includes a summary for the subsample of all students whose student (first language), parent (obtained university degrees), and school track data were available. The last two columns summarize the subsample for which all data (competence, teaching techniques, and class size) of the German or the math courses are available.

Competence
We used Warm's weighted likelihood estimates (based on item response theory) for reading and math competence provided in the NEPS data set. For a full description of the test characteristics see Carstensen and Pohl (2012). Reading and math competence tests were administered at the beginning of grades seven and nine.

Student Background
Native language was determined by student questionnaire. Students whose native language was not German were considered second language learners. Parental education was determined by parent questionnaire based on Comparative Analysis of Social Mobility in Industrial Nations (CASMIN) data from NEPS. Parental education was rated as high if one or both parents had a university degree. Otherwise, it was rated as low. The education level of the partner of a single parent was treated as missing, and thus overall parental education level was considered missing if one parent indicated they lacked a university degree and the other parent's data were missing.
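The coding rule for parental education can be sketched as a small function. This is an illustration only: the function name and the True/False/None encoding are assumptions made for the example and are not NEPS variable codes.

```python
def parental_education(parent_a, parent_b):
    """Classify parental education from two degree indicators.

    Each argument is True (university degree), False (no degree),
    or None (missing). The encoding is hypothetical, chosen for this sketch.
    Returns "high", "low", or None when the level cannot be determined.
    """
    # One degree is enough for "high", even if the other parent is missing.
    if parent_a is True or parent_b is True:
        return "high"
    # No degree observed, but at least one parent is missing: undetermined.
    if parent_a is None or parent_b is None:
        return None
    # Both parents observed, neither holds a degree.
    return "low"


examples = {
    "both without degree": parental_education(False, False),
    "one with degree, one missing": parental_education(True, None),
    "one without degree, one missing": parental_education(False, None),
}
```

The third case is the one the text singles out: a single observed parent without a degree does not justify a "low" rating, because the unobserved parent might hold one.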

Teaching Techniques
A summary of the instruments used and their reliability measures can be found in Table 2. The selection of teaching techniques came from two separate teacher instruments included within the NEPS database. Group work and discussions came from an exploratory factor analysis (EFA) with varimax rotation of the nine items included in ed0004 (German classrooms) and ed00028 (math classrooms). The best fit was the 3-factor solution, which fit significantly better than the 2-factor solution, χ²(8) = 241.143, p < 0.001. A 4-factor solution was not considered due to issues with identification. Loadings above 0.3 were used to sort the items into group work, discussions, and presentations. However, only one item loaded onto presentations, so this factor was discarded. One item representing project work (item f) loaded approximately equally (0.38 and 0.36) onto the remaining two factors. This item was dropped from our analyses because both factors were of interest. We also examined the seven items included in ed0009 and ed0033 about teachers' individualized instruction in math and German classes. The same EFA procedure was followed, with a 2-factor solution fitting significantly better than the 1-factor solution, χ²(6) = 346.701, p < 0.001. A 3-factor solution was not considered due to identification problems. This resulted in a factor focusing entirely on group composition (homogeneous or heterogeneous grouping) and a factor examining individualized instruction techniques regarding assignments and demands within the classroom. We chose to focus on the second factor and did not further analyze the factor of group composition because the theoretical goals of this article did not include the issue of homogeneous or heterogeneous groups. The resulting scales are further detailed below.
Note to Table 1: M refers to mean, SD refers to standard deviation, and %na refers to the percentage of missing responses. The full dataset reflects the dataset that appears in NEPS. The final sample columns include only those cases where data were available for competences, teacher data, and parent data.
The use of group work and discussions was determined by teacher questionnaires from the 7th grade math and German teachers. We provide NEPS translations of the original German language questionnaires. The original versions may be found within the NEPS database. Questions were answered on a 6-point discretized frequency scale from "never" to "(almost) every lesson." Group work came from the average response to three items: "Work with small student groups," "Partner work," and "Students acting as tutors (peer tutoring)." The frequency of discussions came from the average response to "Discussion rounds" and "The class and I have discussions." Before averaging, the responses were converted into an approximate weekly rate based on the following conversions: never (∼0 times per week), once or twice per school year (∼0.04 times per week), every few months (∼0.1 times per week), every 2-4 weeks (∼0.33 times per week), once per week (∼1 time per week), (almost) every lesson (∼2 times per week).
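The conversion of frequency categories into weekly rates, followed by averaging, can be illustrated with a short sketch. The weekly rates are the ones listed in the text; the dictionary keys, the function name, and the handling of skipped items are assumptions made for this example and do not reflect NEPS value codes or the exact procedure used.

```python
from statistics import mean

# Approximate weekly rates as listed in the text, keyed by response category.
# The keys are illustrative labels, not NEPS value codes.
WEEKLY_RATE = {
    "never": 0.0,
    "once or twice per school year": 0.04,
    "every few months": 0.1,
    "every 2-4 weeks": 0.33,
    "once per week": 1.0,
    "(almost) every lesson": 2.0,
}


def scale_score(responses):
    """Average the weekly rates of one teacher's answered items.

    `responses` is a list of category labels; None marks a skipped item.
    Skipped items are ignored, and None is returned if nothing was answered
    (an assumption of this sketch).
    """
    rates = [WEEKLY_RATE[r] for r in responses if r is not None]
    return mean(rates) if rates else None


# Hypothetical teacher, three group work items
# (small groups, partner work, peer tutoring).
group_work = scale_score(["once per week", "every 2-4 weeks", "never"])

# Hypothetical teacher, two discussion items.
discussions = scale_score(["(almost) every lesson", "once per week"])
```

Converting before averaging matters because the categories are not evenly spaced: the step from "once per week" to "(almost) every lesson" adds one occasion per week, whereas the step from "never" to "once or twice per school year" adds only about 0.04.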
Individualized assignments were determined by a separate five-point Likert scale ("does not apply at all" to "applies completely"). This questionnaire was also collected from the 7th grade German and math teachers. The five items included were: "I demand considerably less from students who are less capable," "I give students homework ranging in complexity based on their capability," "I allow students who work faster to move on to the next assignment while I am still practicing or reviewing things with the ones that work slower," "If students have difficulties in understanding, I give them additional assignments," and "I give more capable students extra assignments that are really challenging for them." The raw average of the Likert responses was taken for individualized assignments.
In the event of team-taught classes, the average response across the responding teachers was taken for each class unit on each of the three scales. A full description of the final scales along with Cronbach's alpha can be seen in Table 2. Of note is the relatively low Cronbach's alpha for individualized instruction. With the removal of item a, this would rise to 0.68, but this was not done in order to keep the scale consistent across both classes. Similarly, a low Cronbach's alpha was present for discussions in the German class, but as this scale only included two items, this value is not greatly informative. Figures 1 and 2 display the responses of teachers to questions about their teaching methods. The given percentages are based purely on the teacher data, meaning that they do not account for the number of sampled students in each teacher's classroom or for the number of teachers who taught a specific student (in seventh grade), and that they do not incorporate the sampling weights of each student. In the figures, each row corresponds to a specific question, and there are separate columns for the two subjects. The x-axis shows each possible response category, while the y-axis shows the percentage of responses in each category (ignoring all missing data). Figures 1 and 2 include the sample of all teachers in our data (in dark gray) and the final sample used in our models (in black). Responses are roughly normal, although skewed in some cases. All response categories include some responders, except for German, where certain responses to group work or discussions were never used. Lastly, there are no strong differences in the response patterns between teachers who responded and those teachers included in our final analysis.
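For readers unfamiliar with the reliability measure, Cronbach's alpha can be computed from item variances and the variance of the summed scale. The sketch below uses the standard formula with sample variances; the response data are invented for illustration and are not NEPS data, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item, aligned by respondent.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented Likert responses from five hypothetical teachers on three items.
item_a = [1, 2, 3, 4, 5]
item_b = [2, 2, 3, 5, 4]
item_c = [1, 3, 3, 4, 5]
alpha = cronbach_alpha([item_a, item_b, item_c])
```

With only two items, as for the discussion scale, alpha is determined entirely by the correlation between the two items and their variances, which is one reason a low value on such a scale carries limited information.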

Class Size
The sizes of the German and math courses were not recorded by NEPS. We used class sizes reported by the class teacher instead, which will coincide with the size of the respective course in the vast majority of cases. If a class had multiple teachers, their responses were averaged.

Missing Data in the Sample
As seen in Table 1, the marginal distributions of all variables (i.e., means and standard deviations) stay very close to those of the initial NEPS sample. The most apparent discrepancies occur in the dataset underlying our reading model, where the proportion of parents without a university degree increases from 62 to 69 percent and mean reading competency in grade seven increases from 0.88 to 0.99. A similar shift exists in the math dataset, where mean math competence increases from 0.91 to 1.06 in seventh grade and from 1.63 to 1.81 in ninth grade, with a slightly lower standard deviation for both grades.

Analyses
In order to answer our research questions, we estimated the parameters of a linear mixed effects model for both reading and math competence. In both models, we wished to incorporate all effects of interest as well as all major contributing factors whose omission might bias estimates. Competence at grade 9 was the outcome variable. Predictors included the three teaching techniques, parental education level (low or high), second language learner status (native speaker vs. non-native speaker), class size, and the competency at grade 7. We centered class size at 26 students, which corresponds to the mean class size rounded to the nearest integer in both subsamples. The centering allows for an easier interpretation of the other effects, as these are then computed for a (typical) class of 26 students. Also included were the two-way interactions between background variables and teaching technique as well as between teaching technique and grade 7 competence. Lastly, because German schools are separated into tracks, we also included a binary predictor for attending an upper track school (Gymnasium) or any other school type. Interactions between school track and teaching technique were also included. A description of the correlations between all predictor variables can be found in Table 3. Although some correlations were relatively high (r = 0.47 for upper school track by grade 7 competence), the overall matrix indicates multicollinearity issues are unlikely. All modeled factors and their interactions are explicitly listed in Table 4.
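For orientation, the model described above can be written out as a single equation. This is a sketch: the notation is introduced here for illustration and is not taken from Table 4, and it assumes that class size is the only centered predictor and that each teaching technique interacts with grade 7 competence, school track, second language status, parental education, and class size.

```latex
\begin{aligned}
y^{(9)}_{ij} ={}& \beta_0 + \beta_1\, y^{(7)}_{ij} + \beta_2\, G_{ij}
                 + \beta_3\, L_{ij} + \beta_4\, P_{ij} + \beta_5\,(C_j - 26)
                 + \sum_{k=1}^{3} \gamma_k\, T_{kj} \\
               &+ \sum_{k=1}^{3} T_{kj}\,\bigl(\delta_{k1}\, y^{(7)}_{ij}
                 + \delta_{k2}\, G_{ij} + \delta_{k3}\, L_{ij}
                 + \delta_{k4}\, P_{ij} + \delta_{k5}\,(C_j - 26)\bigr)
                 + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
\end{aligned}
```

Here $i$ indexes students and $j$ courses, $y^{(7)}$ and $y^{(9)}$ are competence at grades 7 and 9, $G$ marks the upper track, $L$ second language learners, $P$ low parental education, $C_j$ is class size, $T_{1j}, T_{2j}, T_{3j}$ are the group work, discussion, and individualized assignment scales, and $u_j$ is the random intercept for the course. Under these assumptions the model contains $3 \times 5 = 15$ interaction terms, which agrees with the 15 degrees of freedom reported for the comparison against the model without interactions.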
In an intermediate step, we fitted two models, one for reading and one for math competence, using the same predictors but no interaction effects. Adding the interaction terms improved model fit, χ²(15) = 33.112, p < 0.01 for math and χ²(15) = 25.458, p < 0.05 for reading. Estimation was done in R (R Core Team, 2018) using the package lme4 (Bates et al., 2015). Survey weights for students participating through waves 3 (Grade 7) and 5 (Grade 9) were used to account for the sampling process used in NEPS and drop-out following the recommendations of Rohwer (2011). The R package lmerTest (Kuznetsova et al., 2017) provided p-values.
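The reported model comparisons can be checked by evaluating the upper tail of the chi-square distribution at the reported statistics. The sketch below is written in Python rather than R and uses a closed-form series for integer degrees of freedom; the function name is hypothetical, and the code only reproduces the tail probabilities, not the model fits themselves.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate, integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum x^(r-1)/(2r-1)!!
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail


# Likelihood-ratio statistics reported in the text, each with 15 df.
p_math = chi2_sf(33.112, 15)
p_reading = chi2_sf(25.458, 15)
```

Both values fall in the ranges stated in the text (below 0.01 for math, below 0.05 for reading), with the reading comparison lying fairly close to the 0.05 threshold.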

Accounting for Class Structure
In order to account for bias caused by school selection and shared nuisance effects of the classroom (or course) environment, a random intercept at the course level was incorporated into the models. It should be noted that there were 210 different classes identified in the math subsample and 172 in the reading subsample. For the majority of classes in both subsamples only 5 or fewer students were sampled (i.e., in the initial data set in Table 1). The class size mean and standard deviation are seen in Table 1.

Treatment of Missing Data
Only complete cases entered the analysis (the marginal distributions of the resulting subsamples are listed in Table 1). It should be noted that, in general, multiple imputation (MI; see Schafer and Graham, 2002, for an introduction) allows for a better treatment of missing data. However, MI relies on the missing at random (MAR) assumption, which requires knowledge of variables associated with the missingness and with the values that are missing. In our data, MAR was not justifiable because virtually all dropped cases are due to missing teacher or parent responses. In most of these cases, the respective entity did not participate in NEPS at all. This means we had no data from the particular respondent to use as a basis for imputation.

RESULTS
Table 4 summarizes both the reading and math models. In both the math and reading models, competency at grade 7 significantly predicts competency at grade 9, β = 0.56, SE = 0.10, p < 0.001 for math, and β = 0.51, SE = 0.10, p < 0.001 for reading. In both models, students attending the upper secondary track improved more than those attending other tracks, β = 0.53, SE = 0.25, p < 0.05 for math, and β = 0.76, SE = 0.36, p = 0.04 for reading. There were no significant main effects of parental education level or second language learning in either model, all ps > 0.10. Because the prior competency was incorporated into both models, this indicates that the rate of change between grades 7 and 9 was not affected by these variables.
Note to Table 4: SE stands for standard error. Second language refers to second language learners. Low Parental Edu. refers to learners whose parents both lacked a university degree. *significant at p < 0.05. **significant at p < 0.01. ***significant at p < 0.001.

Teaching Techniques
Also seen in Table 4 are the main effects of teaching techniques on 9th grade competency. In the math model, there were no significant main effects of the frequency of group work, discussion, or individualized assignments, all ps > 0.05. There were a number of significant interactions between teaching techniques and student background variables in the math model. Second language learners had higher math competency when group work was more common, β = 0.48, SE = 0.20, p = 0.02, and lower math competency when discussions were more common, β = −0.36, SE = 0.14, p = 0.01. Meanwhile, group work resulted in less of an improvement for children attending the upper school track, β = −0.41, SE = 0.16, p = 0.01. There were no other significant interactions between student background and teaching technique in the math model, all ps > 0.05.
There was only one significant interaction between teaching technique and student background variables in the reading model. Namely, children with a higher reading competency at grade 7 improved less when receiving group work, β = −0.17, SE = 0.06, p = 0.01. No other interactions between student background and teaching variables were significant in the reading model, all ps > 0.05. The significant interactions in the math model are described in Figures 3 and 4, which plot the predicted competence based on an average class size with a random intercept of zero and the mean value for either discussions or group work as appropriate (see Table 1 for the actual values). Note that since there is no meaningful way to aggregate across school track, we include splits across school track in the upper and lower plots in the figures. These figures show a crossover effect of group work and discussions, where more group work is better for non-native speakers and more discussions are less effective for non-native speakers.

Class Size
There was no significant main effect of class size in either the reading or the math model, both ps > 0.10. However, in the math model, students in larger classes receiving discussions improved slightly more, β = 0.03, SE = 0.01, p < 0.01. There were no other significant interactions in the math or reading models of class size and teaching techniques, all ps > 0.05.
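To make the two significant discussion interactions in the math model concrete, the predicted change in grade 9 math competence per additional weekly discussion can be written as a conditional slope. This is a sketch in notation introduced here: $D_j$ is the discussion scale, $L_{ij}$ marks second language learners, $C_j$ is class size, and $\hat{\gamma}_D$ stands for the discussion main effect, whose estimate appears in Table 4 and is not reproduced here. The non-significant interaction terms are collected in the remainder $r_{ij}$.

```latex
\frac{\partial\, \hat{y}^{(9)}_{ij}}{\partial D_j}
  = \hat{\gamma}_D \;-\; 0.36\, L_{ij} \;+\; 0.03\,(C_j - 26) \;+\; r_{ij}
```

For example, moving from a class of 26 to a class of 30 students raises this slope by $4 \times 0.03 = 0.12$, while being a second language learner lowers it by $0.36$. On these estimates, a class would need to be about 12 students larger ($0.36 / 0.03$) for the class size term to offset the second language term.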

DISCUSSION
This article provides evidence for a differential impact of different teaching techniques based upon classroom size and the background of the learner. We found that specific teaching techniques in 7th grade have an effect on competency in 9th grade. In particular, we found that second language learners in classes with more group work in 7th grade have a greater increase in mathematical competency after 2 years, while the opposite is true for discussions. Meanwhile, discussions were demonstrated to be slightly more effective in larger math classrooms than in smaller math classrooms. However, in the reading model, the only significant interaction was for the prior ability level with the use of group work, where children of a lower ability level who received more group work in grade 7 had a higher competence in grade 9. It is of particular interest that these interactions were found, but no main effects of the teaching technique, classroom size, or student background were found. Indeed, apart from prior competency, the only significant main effect in either the reading or math model was school track, where learners in the highest school track outgained those in other tracks.
These interactions correspond to research suggesting that learners receive differential benefits from specific teaching techniques based upon classroom situation and background. This is in line with some previous findings that group work can better facilitate classroom learning for second language learners (e.g., Benediktsson and Ragnarsdottir, 2019; Takeuchi et al., 2019); however, the finding of worse gains for second language learners from discussions contrasts with some previous work which predicted a similar boost (Banes et al., 2018). Banes et al. argued that discussions were only beneficial within a given context that encouraged equal participation. We have no data on the quality of the implementation of teaching techniques, so the use of classroom discussions in our sample may not be particularly indicative of whether each learner is encouraged to participate. Indeed, second language learners may feel barriers to participation (e.g., Maeda, 2017).
Another unexpected finding is that group work is less effective for math learners in the highest school track. A similar interaction for the reading model was found, where learners with a higher ability level did worse with more group work. This supports the findings of Cera Guy et al. (2019), who found that higher ability learners may not be more engaged by group work settings and that they produce a similar amount of effort as if they were working individually. However, this would likely result in no difference, whereas we found a negative coefficient in this case. Further, we did not investigate a possible triple interaction between school track, second language, and group work. It may be that group work is only negatively associated with later competence for native speakers in the highest school tracks.
Also noteworthy are several predicted effects that were not found. We predicted slower attainment for non-native speakers and children whose parents lacked a university degree, but this was not found. The only significant effect of a background variable can be seen in a greater rate of improvement in math and reading competency for children attending the upper secondary education track. It may be that 2 years is not a long enough period to observe a difference in rate of change, or it may simply mean that there is only a mean-level difference and no difference in rate of change. The latter interpretation is supported by the findings of Rambo-Hernandez and McCoach's (2014) growth analysis of reading competency and SES. They found mean-level differences in competency based on SES, but no strong difference in rates of growth.
We also did not detect a main effect for class size, although there was an interaction between class size and discussions in the math class. This contrasts with prior findings that smaller class sizes related to better learning (e.g., Krassel and Heinesen, 2014), although in some studies the effects of class size were very small (e.g., Watson et al., 2016) or affected by additional factors (Hattie, 2005).
Additionally, most of the predicted interactions in the reading model were not detected. The only interaction detected was the above-mentioned grade 7 competency by group work. Since some effects were descriptively of similar size to those in the corresponding math model, this suggests that we may lack power to detect such effects due to the slightly smaller sample and the more restricted range in the frequency of the teaching techniques used. More work is needed in this area. Similarly, we found none of the predicted effects for individualized instruction. This may be because the individualized instruction items reflect attitudes rather than frequency of use (as measured by the other teacher questions). More work is needed regarding the role of individualization. Please note that our math and reading models should not be compared directly using coefficients or effect sizes, for two reasons. First, the samples from the two models are not identical or independent from each other. Second, we did not test comparability of the teacher scales between both types of teachers. In order to compare the effectiveness of teaching techniques across subjects, a multivariate outcome model is recommended for future research. Further, it may be possible to accomplish this with data from NEPS, provided there is a justifiable imputation model that allows the retention of a sufficiently large and representative sample of students and teachers.
There are a number of notable strengths of this study. We predicted competency based on the instruction techniques from 2 years prior. We accounted for the effects of school track. We used a random-intercept model to account for course-specific competence levels. In addition, we used a large dataset. Nonetheless, limitations remain. We only possessed data about teaching styles at a single measurement point, and it is possible that teachers were already adapting their teaching styles based on classroom size or composition. However, Hattie (2002) argues that this is unlikely, as teachers rarely change their styles based on classroom composition. Parent and teacher non-participation was substantial and may be related to all variables considered. As a result, the removal of students with non-participating parents or teachers could have biased our results. Furthermore, panel attrition was accounted for by reweighting, which keeps the sample representative but only with respect to the factors on which the weights are based. Because we looked at long-term effects (2 years later), smaller, more immediate effects may have been missed. Longitudinal tracking studies over a shorter timeframe (a semester or a single school year) should also be done. Furthermore, we could only use teacher responses about their own teaching techniques. Other studies should examine teaching techniques using more objective measures. This is especially important because of the relatively low reliability of some of our measures of teaching techniques, which may have biased some of our measures of variance (Woodhouse et al., 1996;Dedrick et al., 2009). In this regard, we also did not examine the fit between teaching techniques and the characteristics of the teachers, which may explain a great deal of the variance in the effectiveness of these techniques.
Similarly, we did not distinguish between the effectiveness of a teaching style in a particular class setting and its effectiveness at the individual level (e.g., Enders and Tofighi, 2007). We are also limited by the items included in the NEPS database when constructing our scales. Further research is needed to verify the quality of these items and scales in other contexts. Lastly, our results need to be compared with results from other school levels, such as primary school and later secondary school. Analyses of changes in teaching styles over time may also be informative.
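
One common way to separate class-level from individual-level associations for a student-level predictor is centering within clusters, which Enders and Tofighi (2007) discuss. The sketch below is only an illustration of that idea and is not the analysis performed in this study. The function name, the field names (`class_id`, `prior_score`), and the toy values are hypothetical and do not come from NEPS.

```python
from collections import defaultdict
from statistics import mean


def center_within_cluster(rows, cluster_key, value_key):
    """Split a student-level predictor into a class mean ("between")
    and each student's deviation from that class mean ("within")."""
    # Collect the predictor values of all students in each class.
    values_by_cluster = defaultdict(list)
    for row in rows:
        values_by_cluster[row[cluster_key]].append(row[value_key])

    # One mean per class: the class-level (between) component.
    cluster_means = {c: mean(v) for c, v in values_by_cluster.items()}

    # Attach both components to a copy of each student record.
    centered_rows = []
    for row in rows:
        class_mean = cluster_means[row[cluster_key]]
        centered_rows.append(
            {**row, "between": class_mean, "within": row[value_key] - class_mean}
        )
    return centered_rows


# Hypothetical toy data: two classes with two students each.
students = [
    {"class_id": "A", "prior_score": 1.0},
    {"class_id": "A", "prior_score": 3.0},
    {"class_id": "B", "prior_score": 4.0},
    {"class_id": "B", "prior_score": 6.0},
]

centered = center_within_cluster(students, "class_id", "prior_score")
for row in centered:
    print(row["class_id"], row["between"], row["within"])
```

In a multilevel model, entering the "between" and "within" components as separate predictors would allow the class-level and individual-level associations to differ. Whether this is appropriate depends on the research question and on the level at which each predictor is measured.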

CONCLUSION
Although group work and discussions showed no main effects, they had situational effects in math and reading that depended on student background and class size and that were detectable 2 years later. Math competency in grade nine was higher for second language learners who received more group work-based instruction in grade seven. Conversely, math competency in grade nine was lower for second language learners who received more discussion-based instruction in grade seven. At the same time, discussions in larger math classes were associated with greater competency gains than discussions in smaller math classes. Furthermore, group work instruction was related to lower math competency for learners in the upper secondary school track. Finally, learners with a lower grade seven reading competency had a higher reading competency in grade nine when they received more group work-based instruction.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: http://dx.doi.org/10.5157/NEPS:SC3:8.0.1.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
JD served as primary author of the manuscript, writing most of the manuscript and proofing and adapting contributions from other authors. MG and JD developed theoretical bases for chosen analyses. Analyses were performed by CS. CS wrote many sections of the methods and results and the interpretation of the results. PD provided feedback and refinement for analyses and methodology. PD and MG provided additional manuscript supervision.

FUNDING
Funding for this research was provided by the DFG Priority Programme 1646, Education as a Lifelong Process, project number 390666287. We acknowledge financial support by Deutsche Forschungsgemeinschaft and Technische Universität Dortmund/TU Dortmund Technical University within the funding programme Open Access Publishing.