Assessing School Engagement – Adaptation and Validation of “Engagement Versus Disaffection With Learning: Teacher Report” in the Swedish Educational Context

To follow the trajectories of children’s engagement in learning, validated measures of engagement appropriate for different ages and educational contexts are needed. The purpose of this study was to adapt and validate the school engagement questionnaire (Engagement Versus Disaffection with Learning: Teacher Report, EDL) in the Swedish educational context, and to investigate if it assesses the same construct as a measure of engagement used for children of preschool age. After translating the questionnaire to Swedish, cognitive interviews were conducted with six teachers to check for interpretability and relevance of the items. For psychometric validation, teachers of 110 6 to 7-year-old children filled out EDL on two occasions two weeks apart. On the first occasion, they also filled out the Child Engagement Questionnaire, a measure of global engagement intended for children of preschool age. Dimensional structure, convergent validity, test-retest reliability, and internal consistency of EDL were investigated. Factor analysis provided support for differentiating between behavioral and emotional components of school engagement. Measures of school and preschool engagement used in this study correlated highly, which provides support for using them to study the engagement of children as they develop, and their educational contexts change. The subscales of behavioral and emotional engagement showed good test-retest reliability and internal consistency.


INTRODUCTION
School attendance and physical presence in learning situations do not guarantee that children will learn and develop their understanding and competencies. For learning to take place, children need to be engaged with educational activities (Ponitz et al., 2009). In general, engagement refers to the intensity, endurance, or amount of time children spend actively interacting with adults, peers, materials, and activities in their environment, in a way that is appropriate to their development, competency level, and environmental context (De Kruif and McWilliam, 1999; Maher Ridley et al., 2000). Children's engagement in learning mediates the relationship between the quality of the learning environment and child learning outcomes. Engagement can also be used as an indicator of the quality of children's participation in their school environment (Fredricks et al., 2004; Skinner et al., 2009a).
The school environment should provide children with plenty of opportunities to interact with their peers, teachers, educational materials, and activities. To investigate how to support school engagement, and to identify children who show low engagement and might benefit from interventions in their educational environments, a reliable and valid assessment of engagement is crucial (Virtanen, 2016). Although several measures of engagement exist in English and are used in the American educational context, so far only one has been adapted to the Swedish educational context (Almqvist, 2006), and it is intended for use with preschool-aged children [Children's Engagement Questionnaire (CEQ); McWilliam, 1991].
As children grow and develop, their experiences and expressions of engagement change. To follow children's engagement trajectories over extended time periods and across different educational contexts, the measures used must tap the same engagement construct while reflecting the natural changes in development, competencies, skills, and role expectations. Although all children, regardless of age, can show some of the basic characteristics of engagement, such as attentive or persistent behavior and feelings of interest and enjoyment, assessment of engagement should be adapted to the developmental level and the educational context the child is in (De Kruif and McWilliam, 1999; Vitiello et al., 2012; Coelho and Pinto, 2018). Studying school engagement in the Swedish educational context requires the validation of a measure of engagement appropriate for school-aged children. To study children's engagement over time, from preschool to school age, measures of engagement in school need to be comparable to the ones appropriate for preschool children.
Within the framework of the bio-ecological systems theory (Bronfenbrenner and Ceci, 1994), engagement can be understood as a proximal process taking place between the child and its immediate environment. Although related to the child's characteristics such as age (Reschly et al., 2008; Aguiar and McWilliam, 2013), disability status (McWilliam and Bailey, 1995), hyperactivity symptoms (Searle et al., 2013), and positive and negative affect (Reschly et al., 2008), engagement is not a fixed attribute of the child, but a potentially malleable state also influenced by contextual factors. Thus, it can be improved by adapting the environment to the child (Sinclair et al., 2003; Fredricks et al., 2004; Skinner et al., 2009b). If engagement is understood as happening in the interaction between the child and the environment, it is expected to fluctuate over time as the context and the child's dynamic states change (Fredricks et al., 2004). Observational measures of engagement treat engagement as a state that varies depending on the context the child is in, while global assessments treat it as a stable behavioral pattern more similar to psychological traits (Vitiello et al., 2012; Aguiar and McWilliam, 2013). Since engagement is influenced by stable traits, children can be expected to show a recognizable pattern of engagement within certain contexts. Such global engagement can be assessed by surveying the child, or by using proxy ratings by the child's parents, teachers, or other adults who spend a lot of time with the child and know how the individual child expresses engagement (De Kruif and McWilliam, 1999). The ability for introspection and meta-cognitive reflection on one's own engagement and learning increases with age, so when assessing engagement in young children, it is most common to use proxy ratings of engagement, such as teachers' and caregivers' ratings, instead of self-reports (Fredricks et al., 2004).
In an educational environment, teachers should be able to identify children who show a tendency toward low or high engagement (Skinner et al., 2009b). In general, global assessments tend to be better predictors of future developmental and educational outcomes than short observations (Frans et al., 2017; Sim et al., 2019).
School engagement is a multifaceted construct; it includes a behavioral, an emotional, and a cognitive component (Fredricks et al., 2004; Skinner et al., 2009b). Behavioral engagement is easiest to assess as it refers to observable participating behaviors such as asking questions and contributing to the class. Pupils' positive conduct, adherence, invested effort, persistence level, and lack of disruptive behaviors can be observed by their teachers (Fredricks et al., 2004; Skinner et al., 2009b). Emotional engagement refers to emotions such as boredom, interest, a general feeling of belonging, and affective reactions in the classroom, including positive and negative reactions to teachers, classmates, schoolwork, and school in general. It is related to attitudes and values, and it reflects children's motivation (Fredricks et al., 2004; Ladd and Dinella, 2009). Cognitive engagement refers to the psychological presence, dedication, and willingness to invest the effort necessary to master new knowledge and skills. It can be expressed as a preference for challenges or going beyond the task. While behavioral engagement is visible, emotional and cognitive engagement, or the lack of them, can be hidden and hard to observe (Fredricks et al., 2004). The behavioral, emotional, and cognitive aspects of engagement are expected to be associated and are often studied as one construct (Fredricks et al., 2004). Several studies indicate there is a need to differentiate between engagement components as they might have different antecedents and consequences (De Kruif and McWilliam, 1999; Fredricks et al., 2004; Ladd and Dinella, 2009; Skinner et al., 2009b). Even though behavioral engagement is a better predictor of academic outcomes, emotional engagement also seems to play an important role since it precedes behavioral engagement (Ladd and Dinella, 2009).
Skinner et al. (2009b) proposed a motivational conceptualization of school engagement, where engagement is understood as an observable manifestation of school motivation and is defined as "quality of a student's connection or involvement with the endeavor of schooling and hence with the people, activities, goals, values, and places that compose it" (p. 494). They recommended distinguishing behavioral from emotional features of engagement to allow for differentiation of children with different engagement profiles. Since engagement is not consistently expressed and pathways to disengagement could be many, they also argued for distinguishing lack of engagement from disaffection. They expected disaffection to appear when disengaged students have no choice but to continue participating in school and thus develop patterns of problematic behaviors and emotions. Their questionnaire, "Engagement Versus Disaffection with Learning: Teacher Report" (EDL), will be validated in a Swedish educational context in this study.
Although engagement is often a concept of interest in educational research, so far, to the authors' knowledge, no peer-reviewed studies that validate a measure of school-aged children's engagement in Sweden have been published. A validated measure of school engagement is necessary to do systematic research on school engagement. By investigating whether instruments designed to assess engagement of preschool-aged children and school-aged children tap the same engagement construct, this study is the first step in establishing the prerequisites for following children's engagement trajectories longitudinally. The measure of school engagement "Engagement Versus Disaffection with Learning: Teacher Report" will first be translated to Swedish, and then its relevance, ease of comprehension, factor structure, intra-rater reliability, and convergent validity with a measure of engagement intended for preschool-aged children, CEQ, will be investigated. To be able to check the convergent validity of an engagement questionnaire intended for school-aged children (EDL) by comparing it to an engagement questionnaire intended for preschool-aged children (CEQ), the validation phase of this study will be carried out with an age group of children where the use of these two questionnaires overlaps: in the preschool classes attended by 6 to 7-year-olds. Swedish preschool classes serve as a transition period from preschool education dominated by free play activities (Åström et al., 2020) to a more structured school learning environment. In the preschool class, children are introduced to school and prepared for attending it in the following academic year.

MATERIALS AND METHODS
Participants
Six teachers participated in the cognitive interview phase. Three of them were teaching 12-year-olds in grade six. The other three teachers were teaching 6 to 7-year-old children in the preschool class. One preschool class teacher was male, while the other five teachers were female.
In the main study where psychometric properties of EDL were investigated, data was collected about 110 children (57 girls and 53 boys) that attended a preschool class. Their mean age at the time of data collection was 82.38 months (SD = 4.32, range 70-90 months). They came from four different classes, and each class was in a different school. All the schools were within Jönköping municipality in Sweden, and areas with different socio-economic backgrounds were represented in the sample. The preschool classes differed in size, with 17 children in the smallest class and 42 children in the biggest class. Children's teachers filled in the questionnaires about children's engagement. All four teachers were female.

Materials
Engagement Versus Disaffection With Learning: Teacher Report
"Engagement Versus Disaffection with Learning: Teacher Report" (Skinner et al., 2009b) is a measure of school engagement from the perspective of a child's teacher. The American English version of the questionnaire consists of 25 items in four subscales, and the teacher is instructed to rate on a four-point Likert-type scale, from 1 (not at all true) to 4 (very true), how closely each item describes the pupil. The engaged behavior subscale consists of five items about effort exertion and persistence, as well as indicators of mental effort that correspond to cognitive engagement. The engaged emotions subscale includes five items that reflect enthusiasm, interest, and enjoyment. The disaffected behavior subscale includes five items that capture lack of effort and withdrawal from learning activities. The disaffected emotions subscale includes 10 items that capture emotional withdrawal, sadness, boredom, frustration, anger, and anxiety. The questionnaire items are listed in Appendix Table A1, in the English version and in their Swedish translation. The administration is expected to take 5-10 min per child.

Child Engagement Questionnaire
Child Engagement Questionnaire (McWilliam, 1991) is a measure of the global level of engagement of preschool-aged children from the perspective of an adult familiar with the child. The original instrument was developed in the United States and has 32 items. Items describe both low-complexity and high-complexity behaviors and are accompanied by clarifying examples of behaviors, which makes them easier to understand. Answers are given on a four-point rating scale: (1) not at all typical, (2) somewhat typical, (3) typical, or (4) very typical, where typical means that the child spends a lot of time in a certain activity. CEQ has four underlying factors: competence, persistence, undifferentiated behavior, and attention (De Kruif and McWilliam, 1999). CEQ is widely used in educational research in Sweden and shows good measurement properties (Almqvist, 2006; Gustafsson et al., 2016; Sjöman et al., 2016). The Swedish version of CEQ consists of 29 items since 3 of the items were judged not to be relevant in the Swedish preschool context (Sjöman, 2018). The rating scale was also previously adapted to Swedish, and it translates to (1) almost never happens, (2) sometimes happens, (3) happens quite often, or (4) happens very often. The administration was expected to take about 10 min per child.

Procedure
The study was approved by the Regional Ethical Board in Linköping (Dnr 2018/189-31).

Translation Process
The original questionnaire was in American English. Translation to Swedish was organized in accordance with the checklist for cross-cultural translation and validation proposed by Mokkink et al. (2010). Since school routines and expectations of children might differ between the American and Swedish contexts, it was expected that some adaptations of the scale besides translation would be necessary. Two researchers who were native speakers of Swedish, fluent in English, and familiar with the concept of engagement independently translated the items to Swedish. They met to discuss the differences in their translations and agreed on the final version. The Swedish translation of the questionnaire was titled Pupils' Engagement in Learning (Elevernas engagemang i lärande; EEL). The Swedish translation was then sent to two researchers who were native speakers of English, fluent in Swedish, and familiar with the concept of engagement. They independently translated the questionnaire back to English. These translations were compared and combined into the back-translated English version. This back-translation was sent to the developers of the original scale so that they could give feedback on the translation. After they approved the translation, the questionnaire was used in cognitive interviews.

Cognitive Interviews
Cognitive interviewing is a method of exploring how informants understand the items in a questionnaire. Its purpose is to investigate the cultural relevance of the translation and the ease of comprehension of the items, and to identify irrelevant items, ambiguities, and other problems that might result in misinterpretation of the items by the informant (Willis, 2005).
Six teachers were visited at their workplace by one of the researchers, and interviews took place in quiet areas of their schools. In agreement with the teachers, interviews were audio-recorded so that they could be reviewed by the researchers afterward. The researcher explained the purpose of the cognitive interviews and instructed the teacher to think aloud while answering the questionnaire, so that the researchers would know if there were any ambiguities in the items and if items were interpreted in the same way by researchers and practitioners. Teachers were advised to think of any child from their classroom and answer the questionnaire for that child. After filling in the questionnaire, they were asked what they thought of the items and the questionnaire in general. If they perceived a problem with understanding or answering a certain item, they were asked additional questions to clarify what was problematic. Interviews took 15-30 min.

Data Collection for the Psychometric Validation
Data collection took place in the spring term of 2019. Schools were selected by convenience sampling. Public and private schools in Jönköping and Gothenburg were contacted, and school principals were asked if the preschool class teachers in their school were willing to participate in the study. Teachers from four different schools that included a preschool class agreed to participate as informants. After they agreed to participate, a parental meeting was scheduled where children's parents were informed about the study and asked for passive consent. If a parental meeting was not possible to arrange, a video with information about the study was sent to the principal and teachers so that they could upload it on their school's web platform and make sure the information about the study reached the parents. Parents were informed that participation in the study was voluntary and that, if they did not want their child to participate, they should inform the teacher, in which case no data would be collected about their child. Since no parents objected to participation in the study, data were collected for all the children in the targeted classrooms.
The teachers' participation was also voluntary. They could opt to have a substitute during the time they worked on the questionnaires, which was organized and financed by Jönköping University. Teachers assessed children's engagement using EEL, the Swedish version of "Engagement Versus Disaffection with Learning: Teacher Report", on two occasions 2 weeks apart. On the first occasion, they also used CEQ (McWilliam, 1991) to assess children's engagement. At the first time point, one item score was missing for one child in the EEL questionnaire and there were no missing data in the CEQ questionnaire. At the second time point, when only the EEL questionnaire was used, EEL data were missing for one child, and one item score was missing for three other children.

Data Analysis
Several aspects of questionnaire validity were investigated. Cognitive interviews were analyzed first, and psychometric properties of the questionnaire were analyzed afterward in R (R Core Team, 2018) using the psych (Revelle, 2019a) and lavaan (Rosseel, 2019) packages.
Cognitive interviews were listened to and analyzed to see if teachers had the same understanding of items as the researchers and if they found items relevant or had problems understanding them.
After the data from the first time point were collected for 110 children, score distributions were checked for all the items to see if they differentiated well between children. Spearman's correlations between items were computed to investigate if any items should be excluded due to low correlations with the other items.
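The inter-item check described above can be illustrated with a minimal sketch. The study itself used R; the Python below is only an illustration, and the function names and ratings are hypothetical. It ranks the ratings (tied ratings share the mean of their ranks, which matters on a four-point scale) and then computes a Pearson correlation on the ranks, which is one standard way to obtain Spearman's coefficient.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical ratings of eight children on two items (1-4 scale).
item_a = [4, 3, 4, 2, 1, 3, 4, 2]
item_b = [4, 3, 3, 2, 1, 4, 4, 1]
rho = spearman(item_a, item_b)
```

An item whose coefficients with most other items stay close to zero would be a candidate for exclusion under the criterion described above.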
Both confirmatory and exploratory factor analyses for ordinal data were performed to check if data correspond to the factorial structure proposed by Skinner et al. (2009b) and to decide the appropriate number of factors. Correlation between factors was allowed in the model as it was expected that different aspects of engagement would correlate.
The omega total coefficient was calculated for each subscale to assess reliability, as suggested by McNeish (2018).
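For readers unfamiliar with the coefficient, the sketch below shows the textbook form of omega total for a single-factor model with uncorrelated errors: the squared sum of the loadings divided by that quantity plus the summed error variances. This is an assumption made for illustration; the psych package estimates omega through its own procedure, and the loadings used here are hypothetical, not values from this study.

```python
def omega_total(loadings, error_variances):
    """Omega total for a one-factor model with uncorrelated errors.

    omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)
    """
    common = sum(loadings) ** 2
    return common / (common + sum(error_variances))

# Hypothetical standardized loadings; for standardized items the error
# variance of each item is 1 - loading^2.
loadings = [0.8, 0.7, 0.6]
errors = [1 - value ** 2 for value in loadings]
omega = omega_total(loadings, errors)
```

Unlike coefficient alpha, this form does not require all items to load equally on the factor, which is the main argument for omega in McNeish (2018).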
The test-retest reliability was calculated as the correlation between the scores from the first and second data collection (Bolarinwa, 2015). The period of 2 weeks was chosen as it is short enough to assume that no developmental changes or changes in the environment took place in this time, and long enough to make it unlikely that teachers simply recalled their answers from the first time point.
Preschool class teachers assessed their pupils' engagement with both CEQ and EEL so that the convergent validity of EEL could be examined (Bolarinwa, 2015). To test the convergent validity, a Pearson correlation between the scores on the two engagement measures was calculated. If these two questionnaires measured the same construct, as was assumed, the correlation between them should be high, r ≥ 0.7 (Field, 2013; Post, 2016).

RESULTS
Analysis of the cognitive interviews did not indicate any misinterpretations or misunderstandings of items by the teachers.
Teachers who participated in cognitive interviews found the questionnaire items relevant for the children in their class, with the exception of three preschool class teachers who had problems answering whether the child comes prepared to class (item 12), and two preschool class teachers who had problems answering about the child's reaction to failure (item 5). They did not perceive these items as relevant to the preschool class, where children do not have homework and are not faced with challenging academic tasks that could lead to failure. Teachers from grade 6 had no problems answering these items. The items were kept in the questionnaire for the validation study, but the teachers' comments were taken into consideration when interpreting the data from the validation study. A teacher from grade 6 had complaints about the rating scale, which was then adapted to more concrete terms judged easier to understand. In the English version, answers were given on a 4-point Likert-type scale ranging from 1 (not at all true) to 4 (very true), while in the Swedish version the 4-point Likert-type scale was adapted so that 1 stands for "almost never happens," 2 stands for "sometimes happens," 3 stands for "happens quite often," and 4 stands for "happens very often."
Once the questionnaire data were collected, distributions of scores were checked for all the items to see if the items were good at differentiating children's patterns of engagement, and Spearman's correlations between items were checked to see if there were items that did not correlate with the rest of the questionnaire. Distributions of scores showed that scores indicating higher engagement prevailed on the majority of items. Items about coming prepared to class (item 12) and appearing happy (item 20) had extremely low variation in scores, with over 90% of scores in the category indicating the highest level of engagement, and were thus excluded from further analyses. Item 12 also had very low correlations with the rest of the items and had already been recognized as problematic in the cognitive interview phase. High correlations (>0.8) were noticed between two pairs of items: between doing more than necessary (item 4) and working harder after experiencing failure (item 5), and between appearing bored (item 16) and looking bored (item 17), but no items were excluded due to multicollinearity or redundancy.
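The distribution check that led to dropping items 12 and 20 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' procedure: it flags an item when its most frequent category holds more than 90% of the ratings, whereas the text refers specifically to the category indicating the highest engagement. The item names and ratings are hypothetical.

```python
from collections import Counter

def top_category_share(scores):
    """Share of ratings that fall in the most frequent category."""
    counts = Counter(scores)
    return max(counts.values()) / len(scores)

def flag_low_variation(items, threshold=0.90):
    """Names of items where one category holds more than `threshold` of ratings."""
    return [name for name, scores in items.items()
            if top_category_share(scores) > threshold]

# Hypothetical ratings of 20 children on the 1-4 scale.
ratings = {
    "item_a": [4] * 19 + [3],    # 95% of ratings in one category
    "item_b": [1, 2, 3, 4] * 5,  # ratings spread evenly
}
flagged = flag_low_variation(ratings)
```

An item flagged this way carries almost no information for distinguishing between children, which is the rationale given above for excluding it.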
Confirmatory factor analysis for ordinal data (WLSMV estimator) was then run on the 23 remaining items to see if the four-factor structure suggested by Skinner et al. (2009b) fit the data. The four-factor model differentiating between emotional and behavioral engagement and emotional and behavioral disaffection did not show a good fit to the data (χ² = 375.86, df = 224, p < 0.001). Although this model fit the data better than the null model (CFI = 0.99, TLI = 0.99), the absolute fit was not acceptable.
To explore the number of factors in the data, several exploratory factor analyses for ordinal data were run. There is no definite way to decide on the number of factors (Revelle, 2019b), and it was decided to follow the suggestion of extracting factors as long as they were interpretable. The two-factor solution was interpretable as behavioral and emotional engagement, while adding factors did not lead to interpretable models. The two-factor solution and the loadings of the items are shown in Table 1. Factor 1 corresponds to behavioral engagement and factor 2 corresponds to emotional engagement, with the exception of two items that belonged to the emotional engagement/disaffection subscales (items 9 and 25) and now loaded on behavioral engagement. The correlation between the two factors was r = 0.56.
To check for model fit, confirmatory factor analysis for ordinal data was run based on the factor solution suggested in the exploratory analysis. This initial model did not show a good fit to the data (χ² = 485.77, df = 208, p < 0.001), and post hoc model modifications were made based on the modification indices (MI). After allowing for covariance between error terms for seven pairs of items that loaded on the same factor and had MI > 10, the model showed a good fit to the data (χ² = 233.71, df = 201, p = 0.057) and good fit indices [CFI = 0.99, TLI = 0.99, RMSEA = 0.039 (0.000-0.059)]. After additionally over-fitting the model by allowing for covariance between error terms for nine additional pairs of items that loaded on the same factor and had MI > 5, the model fit the data (χ² = 163.17, df = 192, p = 0.935) and showed excellent fit indices [CFI = 1, TLI = 1, RMSEA = 0.00 (0.000-0.011)].
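The fit statistics above can be cross-checked with a short calculation. The sketch below is an illustration in Python, not the lavaan output: it assumes the common point-estimate formula RMSEA = sqrt(max(χ² - df, 0) / (df (N - 1))) with N = 110, and it uses the fact that each freed error covariance costs one degree of freedom. Software using a robust estimator may apply scaling corrections, so agreement with reported values is not guaranteed in general.

```python
from math import sqrt

def rmsea(chi2, df, n):
    """Point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

N_CHILDREN = 110  # sample size reported in the study

# Each freed error covariance uses up one degree of freedom.
df_initial = 208
df_after_seven = df_initial - 7       # seven pairs freed  -> 201
df_after_sixteen = df_after_seven - 9  # nine more pairs    -> 192

rmsea_modified = rmsea(233.71, df_after_seven, N_CHILDREN)
rmsea_overfitted = rmsea(163.17, df_after_sixteen, N_CHILDREN)
```

Under these assumptions the modified model gives a value that rounds to the reported 0.039, and the over-fitted model gives zero because its χ² is smaller than its degrees of freedom.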
In the original, American English version of the questionnaire, items 1-5 belonged to the behavioral engagement subscale, items 6-10 to the emotional engagement subscale, items 11-15 to the behavioral disaffection subscale, and items 16-25 to the emotional disaffection subscale (Skinner et al., 2009b).
Omega total for ordinal scales was calculated after recoding negatively oriented items to check for internal consistency of the subscales. For the behavioral engagement subscale, omega was 0.96 (0.94, 0.97), and for the emotional engagement subscale omega was 0.94 (0.93, 0.96), indicating good internal consistency of the subscales. Mean scores on the EEL scale were calculated for the second time point so that test-retest reliability could also be checked. Correlation between the total scores on EEL on the first and second time points was high (r = 0.80, p < 0.001). Test-retest reliability was also good for the emotional engagement subscale (r = 0.71, p < 0.001) and for the behavioral engagement subscale (r = 0.81, p < 0.001).
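The recoding of negatively oriented items and the computation of mean scores can be sketched as below. This is a hypothetical Python illustration: the function names are invented, and averaging over the answered items when a rating is missing is an assumption, since the text does not state how missing item scores were handled.

```python
def reverse_score(score, low=1, high=4):
    """Mirror a rating on the low..high scale (1 <-> 4, 2 <-> 3)."""
    return low + high - score

def mean_score(ratings, negative_items):
    """Mean over answered items after reversing negatively oriented ones.

    `ratings` maps item number -> rating, with None for a missing answer.
    """
    recoded = [
        reverse_score(rating) if item in negative_items else rating
        for item, rating in ratings.items()
        if rating is not None
    ]
    return sum(recoded) / len(recoded)

# Hypothetical child: items 1 and 2 are positively worded, items 11 and 16
# are disaffection items; the rating for item 16 is missing.
child = {1: 4, 2: 3, 11: 1, 16: None}
score = mean_score(child, negative_items={11, 16})
```

After recoding, a higher mean consistently indicates higher engagement, so scores from the two time points can be correlated directly.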
Finally, a high correlation (r = 0.80, p < 0.001) between the total scores on EEL and CEQ indicated good convergent validity of the questionnaire EEL.

DISCUSSION
The aim of this study was to adapt and validate a measure of school engagement, "Engagement Versus Disaffection with Learning: Teacher Report" (Skinner et al., 2009b), in the Swedish educational context. Overall, the results support a two-factor model differentiating between emotional and behavioral components of engagement, but not a four-factor model that differentiates between behavioral engagement, emotional engagement, behavioral disaffection, and emotional disaffection. Items suggested by Skinner et al. (2009b) to be on the behavioral and emotional disaffection subscales load negatively on the corresponding behavioral and emotional engagement subscales, with the exception of one item from the emotional engagement subscale and one item from the emotional disaffection subscale, which load on behavioral engagement in this analysis. Behavioral engagement and behavioral disaffection seem to be at opposite ends of the same factor, while emotional engagement and emotional disaffection seem to form the opposite ends of another factor.
The factors of behavioral and emotional engagement are moderately correlated, which was expected based on the literature stating that the different aspects of engagement are associated but should not be treated as a unidimensional construct (Fredricks et al., 2004; Ladd and Dinella, 2009). Even in a young population of learners, it is possible to somewhat differentiate between behavioral and emotional engagement in learning.
The behavioral engagement subscale includes items about investing effort, paying attention, and appearing involved and interested. The content of these items corresponds to the understanding of behavioral engagement described in the literature, but it also coincides with observable aspects of cognitive and emotional engagement (Fredricks et al., 2004). Based on the factor loadings, items that are most central to behavioral engagement refer to investing effort, and they are followed by items about being mentally present and attentive. Two items about enjoying classwork and appearing interested in it, which belonged to emotional subscales in the original questionnaire, are here included in the behavioral engagement subscale. These results indicate that the constructs of cognitive and emotional engagement, if visible in behavior, overlap with behavioral engagement, at least in a young population of children.
The emotional engagement subscale includes items about enthusiasm and interest, having fun in learning, happiness, and absence of negative emotions such as anger, anxiety, frustration, or boredom. In the literature, emotional engagement is described as affective reactions in the classroom, or as a more general feeling of belonging and interest. When looking into factor loadings, it seems that emotional engagement in EEL is primarily determined by the absence of negative emotions and appearing happy, while being enthusiastic, interested and caring for schoolwork had somewhat lower loadings. Positive and negative affect and emotional problems have been previously associated with engagement (Reschly et al., 2008), and the factor analysis results indicate a challenge with assessing a trait-like emotional engagement in learning without capturing a more general emotional affect or mental health of children. Although the items state that the context for these emotions is within the classroom environment, this does not ensure that affective reactions in the classroom are captured without including a more general tendency children might have toward positive or negative affect.
The overlaps between emotional and behavioral engagement, as well as the correlation between them, might be somewhat accentuated in a young sample of children whose emotions toward learning are more likely to be visible in behavior. In an educational context such as Swedish preschool classes, where play and creative work matter more than academic achievement, and children have great freedom in choosing the activities they want to participate in, elements of behavioral and emotional engagement might be more intertwined than in other, more structured and demanding educational contexts. It would be interesting to compare these findings with an older sample of school children. Older children in school environments characterized by higher demands and greater external rewards for achievement might show greater discrepancies between emotional and behavioral engagement. It is also plausible that older children might show more complex expressions of engagement and disaffection, and that engagement and disaffection would then appear as separate components as suggested by Skinner et al. (2009b). Older children participate in a more structured school environment where they have less freedom in choosing the activities to participate in. In such an environment, various patterns of disaffected behaviors and emotions could be more likely to emerge, and disaffection might appear as a separate construct and not just as the opposite end of engagement. In the population of 6 to 7-year-old children, this was not observed. Items from the engagement and disaffection subscales loaded on the same factors, just in opposite directions.
A high correlation between the children's scores on EDL and CEQ indicates that the questionnaires assess the same construct. This result supports the use of these two engagement questionnaires for investigating trajectories of children's engagement during preschool and school age. EDL also shows good internal consistency and good reliability over a 2-week time span. The high test-retest reliability indicates that engagement as assessed by EDL is a temporally stable disposition, or behavioral pattern, and not just a short-term state.
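As an illustration of the two reliability notions discussed above, the following minimal sketch computes Cronbach's alpha (a common index of internal consistency) and a Pearson correlation between two measurement occasions (one simple index of test-retest stability). It is not the analysis code of this study: the choice of statistics, the function names, and the toy scores are assumptions made for illustration only, and the coefficients actually used in the study may differ.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of rows (one row of item scores per child).

    Hypothetical helper; assumes at least two items and complete data.
    """
    k = len(rows[0])                       # number of items
    columns = list(zip(*rows))             # scores grouped by item
    sum_item_var = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_item_var / total_var)


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Toy data (invented, not from the study): 4 children x 3 items, two occasions.
occasion_1 = [[4, 4, 3], [3, 3, 3], [2, 1, 2], [4, 3, 4]]
occasion_2 = [[4, 3, 3], [3, 3, 2], [1, 1, 2], [4, 4, 4]]

alpha = cronbach_alpha(occasion_1)
retest_r = pearson_r([sum(r) for r in occasion_1],
                     [sum(r) for r in occasion_2])
print(f"alpha = {alpha:.2f}, test-retest r = {retest_r:.2f}")
```

With real questionnaire data, each row would hold one child's item scores on a subscale, with negatively worded items reverse-scored beforehand.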
For most items in the questionnaire, scores associated with high engagement dominated. This indicates that the questionnaire is not very sensitive to differences in engagement level among the majority of children, whose engagement is average or high, but it might identify children who show tendencies toward lower engagement. Further studies are needed to determine score thresholds so that the questionnaire can be used as a screening measure indicating which children are not thriving in an educational environment and might benefit from an intervention.
This study was carried out on a sample of 110 6 to 7-year-old children. As recognized in the cognitive interview phase, a couple of items were not relevant for this population, but they might be relevant for older children. Conclusions from this study should be limited to the population of children transitioning from the preschool to the school environment, and further validation in an older population is necessary to draw conclusions about item relevance and factor structure in the school environment.

Limitations of the Study
Assuming that teachers are motivated to notice children's engagement, they are considered appropriate informants on engagement and other education-related constructs (Skinner et al., 2009b). Still, with only teachers as informants, the important child perspective is missing in this study.
Although not specified by Skinner et al. (2009b), it would be desirable for the teacher to have known the children long enough to know their behavioral patterns and to be able to assess their level of engagement across different situations in school, but this was not controlled for in this study. Data collection took place in the spring term, and it was assumed that teachers had spent the whole school year with their pupils.
As mentioned, the results of this study should not be generalized to the whole school population due to the young age of the children in this study. The preschool class attended by 6 to 7-year-olds was judged suitable for checking the convergent validity of EDL, since both EDL and CEQ can be used in this environment, which is characterized both by structured lessons where children are prepared for school and by play activities similar to those in preschool. There is a risk that 6 to 7-year-old children in the preschool class are actually not the intended population for either of the questionnaires, as their behavior might be more mature than the behavior described in CEQ while, at the same time, the expectations teachers have of them are not as high as they would be of school-aged children.

CONCLUSION
This study provides moderate support for differentiating between behavioral and emotional school engagement. Engagement is a complex construct, and the main contribution of this study was establishing that two measures of engagement intended for children of different ages, EDL and CEQ, do capture the same construct. The findings also raise the question of whether and how the construct of engagement can be captured without including other child traits, such as positive and negative affect and emotional problems, which can be reflected in engagement but are distinct concepts. Further validation of the questionnaire on a sample of older children is needed to better examine the factor structure of the scale and to determine item relevance for school-aged children.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The study was approved by the Swedish Ethical Committee in Linköping (Dnr 2018/189-31). Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
MG, MS, HD, and LA contributed to the conception and design of the study. AR and MS collected data. AR organized the database and wrote the first draft of the manuscript. AR and HD performed the statistical analysis. MG wrote sections of the manuscript. All authors contributed to manuscript revision and read and approved the submitted version.

FUNDING
The research project was financed by the Swedish Research Council (2017-00538).