Implementing Tablets in Norwegian Primary Schools. Examining Outcome Measures in the Second Cohort

This study examines the implementation of tablets in primary schools in Norway. The outcome measures in the study are external to the intervention and consist of recorded data from national tests (National reading, arithmetic and English Tests, Classes 5, 8, and 9; National Mapping Tests for reading and arithmetic, Classes 1–3; and the 2014–2017 National Pupil Survey). The entire study (N = 15,708) relies on an explanatory, sequential mixed-methods design, and in this study we examine the quantitative effects of the implementation. The results indicate that in several school areas tablets have a rather limited effect on pupils' learning outcomes. However, there were both negative and positive effects of tablets on several of the outcome measures. It seems that tablets contribute more positively to boys' school achievements than to girls' school achievements. The effect of introducing tablets is significantly positive for boys in fifth grade in English (as in the first cohort from 2015/2016). This might also be linked to a "spill-over effect" from learning outside school: the significantly positive results for boys in fifth grade in English can be interpreted as a sign of the times, in which English language immersion in leisure time (e.g., gaming, YouTube, etc.) among boys is continuously developing. For boys, we also find positive significant effects on well-being, common rules, and assessment for learning, while for girls we find positive and significant effects on mastering, teacher support, and assessment for learning. We also find some tendencies that when the use of tablets is supported by teachers who have high digital competence, their use seems to have a small equalizing effect between the school achievements of boys and girls. However, we cannot rule out that a grade effect and informal learning may also have an impact on the results, and we therefore request that the results be read with this reservation.


INTRODUCTION
This article examines the second cohort of the trailing research in the Municipality of Baerum's Everyday Digital Schooling tablet project, in which outcome measures are examined regularly through a longitudinal research design. The first study examined the first nine months of this project (Krumsvik et al., 2018). This second study examines the next 24 months of the project period. These two studies are the first large-scale effect studies of the implementation of tablets in Norwegian primary schools in which the outcome measures are external to the intervention, as recommended by, for example, Cheung and Slavin (2013). This means that the learning outcome in this study is the combined result of the National Tests, the National Mapping Tests and the National Pupil Survey.
The aim of introducing tablets as a primary learning aid for all pupils at all stages at the pilot schools was to improve the academic and personal outcomes that pupils acquire from their schooling. Investing in tablets had two objectives: to challenge teachers to develop and change their own teaching and working practices wherever possible, and to help provide better learning for pupils. However, to avoid Cheung and Slavin's (2013) critique of educational technology studies that use measures designed by the researchers themselves, we applied external outcome measures (registry data). In this part of the trailing research, the outcome measures are external to the intervention and are recorded data from National Tests (National reading, arithmetic and English tests, Classes 5, 8, and 9; National Mapping Tests for reading and arithmetic, Classes 1-3; and the 2014-2016 National Pupil Survey). In this second cohort of the trailing research, we examine only the quantitative effects of this part of the implementation. The paper first presents a conceptual framework and the methodology of the study, followed by the results and a discussion of the study's main findings.

Literature Review
What do we know about the effect of educational technology in teaching? If we first glance at the meta-analyses concerning educational technology in teaching, we can see that Kulik and Kulik's (1991) meta-analysis found an average effect size of 0.30. Rosen and Salomon (2007) found a mean effect size of 0.46 in their meta-analysis in mathematics. However, this increased to 0.90 when constructivist learning environments were applied. Tamim et al.'s (2011) second-order meta-analysis of 25 meta-analyses, primary studies and 100,000 students found an overall mean effect size of 0.35. In their meta-analysis of the effectiveness of educational technology applications for enhancing mathematics achievement in K-12 classrooms, Cheung and Slavin (2013) find only a positive, modest effect of d = 0.15. In another meta-analysis examining how features of educational technology applications affect student reading outcomes, they also find positive, modest effects of d = 0.16 (Cheung and Slavin, 2012). Sung et al. (2016) found an overall mean effect size for learning achievement of 0.523 in their meta-analysis. A tendency across these meta-analyses seems to be that the more recent ones show higher effect sizes than the others; which variables influence this development needs to be examined in the coming years. Lai and Bower (2019) provide a good basis for such an examination in their article "How is the use of technology in education evaluated? A systematic review".
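The effect sizes quoted in this review are standardized mean differences. As a reading aid, the following minimal sketch shows how Cohen's d and its small-sample corrected variant, Hedges' g, are commonly computed for two groups. The scores and the function names (`cohens_d`, `hedges_g`) are invented for illustration and are not taken from any of the cited meta-analyses, which additionally weight and pool many such estimates.

```python
import math
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference based on the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = ((n1 - 1) * variance(treatment)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


def hedges_g(treatment, control):
    """Cohen's d multiplied by the usual small-sample correction factor."""
    df = len(treatment) + len(control) - 2
    correction = 1 - 3 / (4 * df - 1)
    return correction * cohens_d(treatment, control)


# Hypothetical test scores, invented for illustration only
tablet_scores = [52.0, 55.0, 49.0, 58.0, 51.0, 54.0]
comparison_scores = [50.0, 53.0, 47.0, 55.0, 49.0, 52.0]

d = cohens_d(tablet_scores, comparison_scores)
g = hedges_g(tablet_scores, comparison_scores)
print(f"Cohen's d = {d:.2f}, Hedges' g = {g:.2f}")
```

With small groups the correction makes g noticeably smaller than d; with large samples the two values are nearly identical.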
If we examine the abovementioned meta-analysis "The effects of integrating mobile devices with teaching and learning on students' learning performance: A meta-analysis and research synthesis" by Sung et al. (2016) more thoroughly, we find that for tablet PCs, the specific effect size is 0.615. Sung et al. (2016) also state that if we compare these effect sizes with Kulik and Kulik's (1991) and Tamim et al.'s (2011) meta-analyses of the difference between using computers and not using computers in education (effect sizes between 0.30 and 0.35), some of the reason for these improved effects might lie in the affordances specific to tablet and mobile technology. However, Sung et al. (2016) emphasize that more research is needed to examine such issues.
"Tablets for Teaching and Learning: A Systematic Review and Meta-Analysis" is one of the most extensive research studies so far about tablets in school (Tamim et al., 2015b). The researchers carried out a meta-analysis of sixty-eight studies, comprising twenty-seven quantitative studies and forty-one qualitative research studies. They found a significant average effect size for studies comparing tablet use contexts with no tablet use contexts (g+ = 0.23, k = 28). They further examined studies comparing two different uses of tablets by students, and found an average effect size (g+ = 0.68, k = 12) that significantly favored more student-centred pedagogical use of technology.
However, when it comes to more policy-driven initiatives to implement educational technology, Tamim et al. (2015a) carried out a systematic review of current government-supported tablet initiatives around the world, in order to understand more of their educational basis and underlying principles. This review concluded "that the majority of these initiatives have been driven by the tablet hype rather than by educational frameworks or research-based evidence" (p. 9). From this, and from the other abovementioned meta-analyses, there is reason to claim that access to technology is not enough: there seems to be a consensus in the research community that it has to be closely attached to well-founded pedagogy and didactics, in which the actual use of educational technology is connected to other teaching and research areas in school. Dhir and Gahwaji (2013) examined the iPad's role in school in their literature review and found several perceived benefits. In particular, tablets seemed to support interactive and collaborative learning as well as increased communication between pupils and teachers. However, Dhir and Gahwaji (2013) state that the current state of knowledge lacks thorough empirical studies and that more research is needed in order to develop a sustainable didactical and pedagogical framework for the use of tablets in schools. Nooriafshar (2012) carried out a scoping review and found that tablets can be beneficial for language learning, and Pellerin (2012) found the same tendencies. Flewitt (2012) revealed that the multimodal aspects of tablets seemed to support literacy learning in primary school. In their review, Kearney and Schuck (2012) found that personalization, collaboration and authenticity were the main benefits of tablets.
However, internationally there is still a significant gap in the current state of knowledge about one-to-one tablets in school, and even if some of the abovementioned studies reveal several benefits of tablets in school, we need more large-scale studies in this area. How is the situation in Norway?
Norway has had a high technology density both in homes and in schools during the last 10 years, and it is therefore interesting to examine how tablets affect school achievement variables. A recent doctoral thesis from Norway by Kongsgården (Kongsgården, 2019; Kongsgården and Krumsvik, 2016; Kongsgården and Krumsvik, 2019) shows that the implementation of tablets in schools is a complex process with both new educational possibilities and pitfalls. The study shows that tablets play a certain role in the learning process, especially in the achievement of learning goals and access to the Internet. However, there are clear differences in how pupils use tablets in their learning processes. In particular, there is a difference between primary and secondary school. Kongsgården (2019) also indicates that a teaching design that includes educational technology contributes to an increase in learning outcomes. There is evidence that the teacher, through his or her didactical choices and by creating a learning community focusing on assessment for learning and technology, established flexible and transparent learning processes that developed the pupils' self-regulation. The study shows that the critical success factor is the teacher and his or her ability to create a teaching plan where the use of technology is justified by didactic choices and not vice versa (Kongsgården and Krumsvik, 2016; Kongsgården and Krumsvik, 2019).
Another PhD study from Norway examines the effect of adaptive learning technologies (ALT) and the use of tablets (Moltudal et al., 2020) in grades five to seven (10-12 years of age) in mathematics. The findings of the study indicate that the use of ALT at the upper primary level contributed positively to basic pupil learning in mathematics (ES 0.39, p 0.001). However, the study also indicates an intertwined relationship between learning, motivation, and volume training, especially for pupils learning new mathematical concepts. Successful implementation also requires that teachers have expertise in classroom management. The study further shows that one of the main educational challenges lies in changing teachers' traditional practice by implementing a digital didactic method that gives the teacher a greater understanding of digital homework as a measure of, and an opportunity to better understand, where pupils are in the learning process.
On the basis of this literature review, we find that despite the existence of some international research concerning tablets (and other types of educational hardware) in schools, we have very little research knowledge about how the large-scale implementation of tablets affects pupils' learning outcomes in Norway. Our trailing research is therefore positioned toward this gap, and will provide empirical data as related to our research questions.

Theoretical Framework
One theoretical discussion concerns whether it is the educational technology (e.g., tablets) itself that affects learning, or rather the teaching method, the teacher and other factors. Such debates have been going on since the 1980s and continue in today's research communities. However, Cheung and Slavin (2013) provide a certain "middle way out" solution:
"Though it may be theoretically interesting to ask whether the impact of technology itself can be separated from the impact of particular applications, in practice, technology, content, and method are often intertwined and cannot be separated. As is the case for many educational interventions with many components, currently available technology applications can be seen as packages of diverse elements and evaluated as such. If a particular combination of hardware, software, print materials, professional development for teachers, and other elements can be reliably replicated in many classrooms, then it is worth evaluating as a potential means of enhancing student outcomes. Components of effective multi-element treatments can be varied to find out which elements contribute to effectiveness and to advance theory, but it is also of value for practice and policy to know the overall impact for students even if the theoretical mechanisms are not yet fully understood" (p. 92).
Thus, this article has no ambition to develop new theory, but applies theory as Leedy and Ormrod (2005) describe it: "A theory is an organized body of concepts and principles intended to explain a particular phenomenon". The theoretical framework for the entire study underpins the research questions (and is not an analytical framework). The theoretical framework refers to the theories of Piaget (1967) and Vygotsky (1978), where tablets are related to both knowledge construction and collaborative learning, and linked to student-centred and group-based teaching design. Educational technology (like tablets), as it appears today in Baerum schools with its distinctive feature of digital tools, relates especially to more recent socio-cultural perspectives on learning (Wertsch, 1998; Cole, 1996; Säljö, 2005, 2017; Stahl, 1993; Lave and Wenger, 1991; Wenger, 1998) as a mediating artifact. The socio-cultural perspective emphasizes that learning is constructed in interaction with other people and mediating artifacts, which is a significant focus of the basic thinking in the "Digital everyday school" school development project. James Wertsch states that such new kinds of mediation and mediating artifacts can give new possibilities and the experience of ". . .how the introduction of novel cultural tools transforms the action" (Wertsch, 1998, p. 42). The use of tablets for learning purposes also relates to Richard Mayer's (2010) Multimedia Learning Theory, in which he describes learning with technology as situations wherein technology is used for the purpose of promoting learning, and which is concerned with the human construction of knowledge as a framework for learning.
The coherence between pupils' knowledge construction and collaborative learning linked to student-centred teaching design in schools (attached to sociocultural theory), learning with technology (tablets) attached to multimedia learning theory, and teachers' pedagogical practices (in relation to digital didactic) underpins the research questions of the study, which in the second cohort are:
To be able to examine these variance research questions in the second cohort, we have chosen trailing research and mixed-method research, described below.

METHODOLOGY
The subsequent research made use of trailing research (Finne et al., 1995) and mixed-method research (Fetters et al., 2013), which involved combining different methods and data sources. To be able to answer the research questions, we have designed this study as an explanatory, sequential mixed-methods design (Fetters et al., 2013). We follow the staged approach, which means that data are reported in stages and published separately. In this article (the second cohort), we therefore report only the quantitative effect analysis, which is based on existing recorded data. The effects on the learning results are measured using the following data sources:
1. National reading, arithmetic, and English tests, classes 5, 8, and 9 from 2014 to 2017.
2. National mapping tests for reading and arithmetic, classes 1-3 from 2015 to 2017.
3. The 2014-2017 national pupil survey.
We have obtained the results of the National Tests from the Norwegian Directorate for Education and Training's school portal, and the results of the National Mapping Tests have been provided by the Municipality of Baerum. Our two endpoints in this respect are based on class levels, divided according to gender and test type. Data from the national arithmetic and English tests have been taken from 2014 to 2017, since there is no comparable data available prior to 2014. The reading test is nevertheless included in our analysis, but with the reservation that changes have been made to the scale, so that the comparison cannot be made beyond 2016. However, this should not be a problem since the comparison is only made up to 2016. As regards the Mapping Tests, two respective tests are conducted in reading and arithmetic between 2014 and 2017.
Our third and final endpoint is social enjoyment and learning environments. This has been gathered from the National Pupil Survey. The National Pupil Survey focuses on how pupils perceive their learning environment at school, how motivated they are, their social well-being at school, whether they have experienced any bullying, how they experience the teachers, and so on. The results of the National Pupil Survey have also been obtained from the Norwegian Directorate for Education and Training's school portal, based on class levels and divided by gender. Our basis includes the various indicators defined by the Directorate as being relevant for pupils' learning environments. We used data from the National Pupil Survey covering 2013 to 2017.

QUANTITATIVE RESULTS
This section presents the quantitative analyses that have been carried out and the findings that emerge from them. The effect analyses are based on the latest available registry data. Here we investigate the effect of the introduction of tablets on pupils' learning outcomes (in basic skills) and learning environments. The three effect measures analyzed are the results of the National Tests in the fifth and ninth grades, the National Mapping Tests in first to third grade, and the results from the National Pupil Survey in the seventh and 10th grades.

Effect Analyses
The purpose of the effect analyses is to investigate the effect of introducing tablets on pupils' learning outcomes and learning environment. To this end, pupils' learning outcomes and learning environment at the pilot schools are compared with those at schools where tablets have not yet been introduced for all pupils.
The impact on learning outcomes is measured using the following data sources:
1. National tests in reading, mathematics, and English in the 5th, 8th, and 9th grades
2. National mapping tests in reading and mathematics in first through third grade
The results from the National Tests are taken from «Skoleporten», the website of the Directorate of Education, and the results of the national mapping tests were provided by Baerum Municipality. Our two effect measures here are based on grade level, divided by gender and type of test. For the mapping tests, two tests are carried out in reading and mathematics, respectively.
The impact on pupils' learning environment is measured using the following data source:
1. National Pupil Survey in seventh and tenth grades
The results from the National Pupil Survey are taken from «Skoleporten», the website of the Directorate of Education, based on grade and divided by gender. Furthermore, we use the different indicators that the Directorate of Education has defined as relevant to pupils' learning environment.
All three effect measures are linked with school-level data from the "Primary School Information System" (GSI), in addition to socioeconomic indicators for the 24 primary school districts in Baerum municipality. Figure 1 illustrates the difference-in-difference approach in the study. Table 1 describes the pupils in group 1 (pilot 1) and group 2 (pilot 2) schools, as well as the pupils at other schools (non-pilot schools), where we investigate whether or not there are differences between schools that have used tablets and schools that have not.

Description of the Sample as Basis for Effect Analyses
The Contextual Use of Tablets
The tablets were used in a number of different ways, with different apps and for different purposes in the subjects. Both the pupils and the teachers therefore applied a myriad of different apps throughout the school days. Still, some of the apps and digital learning resources seemed to be more integrated in the subjects than others. One of these was the adaptive learning resource Multi Smart Øving, and from the observations we saw several classes that used "Multi Smart Øving" in Mathematics. This adaptive learning resource is attached to a paper-based, monomodal textbook in Mathematics and aims to improve schoolwork and homework quality by providing pupils with "volume training" in mathematics. The adaptive learning software makes it possible to give multimodal feedback, tasks, hints, etc. For teachers, it provides pre-organized activity data indicating the competence level of the pupils, which topics deserve more attention and which pupils need more help. Kynigos (2019) finds that Multi Smart Øving can be considered an adaptive software resource that on the one hand enhances traditional approaches to mathematics education and on the other hand is coupled with an automated, traditional, and generalized type of assessment. Egelandsdal et al. (2019) find that the main contribution of Multi Smart Øving to mathematics education is that the digital format enables pupils to solve more varied tasks than would be possible with a textbook; it also enables "volume training" in mathematics and ensures that pupils receive assignments adapted to their academic level, which the teacher can monitor. The tablets were also used in subjects other than Mathematics, with different apps and for different purposes.
Frontiers in Education | www.frontiersin.org June 2021 | Volume 6 | Article 642686
We observed several classes that used the app "Explain Everything" for various tasks, including making learning films and logging. Many worked with "Book Creator" in combination with other apps, such as storytelling with sound recordings and sound effects, English presentation with pictures and text, as a notebook or theme book with sound recordings, mind maps and information taken from the web. Many of the classes we observed had the assignment text delivered on "Showbie", and there was also information about learning objectives, homework and submission deadlines. We also observed a class that worked in interactive PDF files instead of "Book Creator" because the interactive files enable insight into the students' work process without the student having to submit their work. Other examples of apps in use during school visits are "iTunes U", "YouTube", "Creaza", "Kikora", "Pages" and "GarageBand".
FIGURE 1 | Illustration of the difference-in-difference approach. The green bubble is the estimated effect of the introduction of tablets.
Table notes: There are no significant differences between group schools and other schools. The significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels.
a The 10 group schools from group 2 were taken out of the control group when they introduced tablets in August 2016 and therefore cannot act as a control group for an after-survey in 2017.
b There is a significant difference between group 2 schools and other schools in the variable average number of students per year at a 10% significance level. There are otherwise no significant differences between group schools and other schools on the other variables.
c Source: Indicators from 2011 in nine areas in Baerum calculated by Statistics Norway. The distribution between the schools is made by the Municipality of Baerum. For some schools, a percentage distribution has been developed between several areas.

Findings
• Group 1 (pilot 1) schools do not differ significantly from other schools in Baerum (Table 1).
• In the socioeconomic parameters, there are also no statistically significant differences between group 1 schools and other schools. As described in the previous report, one should be careful when drawing conclusions based on the socioeconomic variables, as they are from 2011. At the same time, the pupil base in the surrounding area is expected to be relatively constant, as the school districts change only marginally each year. In the analysis, the indicators are used only to test the robustness of the results in comparative analyses, and not as an independent analysis.
• Group 2 (pilot 2) schools differ from other schools by having a slightly lower proportion of secondary schools, larger schools, more students per year, and fewer assistant hours per student. However, these differences are on the whole not significant.
• In the socioeconomic parameters, we see that group 2 schools are in an area with a lower proportion of children with immigrant background than the other schools (the opposite of what we see for group 1 schools). However, there are no statistically significant differences between the school groups in any of the socioeconomic parameters.
• The parameter showing the greatest variation between the three school groups is "Number of students per year". Here the other schools have the lowest average. This could potentially contribute to better student outcomes for these students. However, we have taken this into account through our difference-in-difference analytical approach (see On Method and Identification of Effect).
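The group comparisons above rest on a two-tailed independent t-test with equal variance, evaluated at the 10, 5, and 1% levels. The sketch below illustrates that kind of test under stated assumptions: the school-level values are hypothetical, the function names are ours, and the p-value is obtained by numerically integrating the Student t density with Simpson's rule so that the example needs only the standard library. It is not the software or data used in the study.

```python
import math
from statistics import mean, variance


def pooled_t_test(group_a, group_b, steps=2000):
    """Two-tailed independent-samples t-test assuming equal variances."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group_a)
                  + (n2 - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Log of the normalizing constant of Student's t density with df degrees of freedom
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    # Simpson's rule on [0, |t|] (steps must be even);
    # the two-tailed p-value is 1 minus twice that area
    upper = abs(t)
    h = upper / steps
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    p_value = min(1.0, max(0.0, 1.0 - 2.0 * area))
    return t, df, p_value


def significance_level(p_value):
    """Return the strictest of the 1, 5, and 10% levels that is met, else None."""
    for level in (0.01, 0.05, 0.10):
        if p_value < level:
            return level
    return None


# Hypothetical school-level values (e.g., students per year), invented for illustration
pilot_values = [52.0, 55.0, 49.0, 58.0, 51.0, 54.0]
other_values = [50.0, 53.0, 47.0, 55.0, 49.0, 52.0]

t_stat, dof, p = pooled_t_test(pilot_values, other_values)
level = significance_level(p)
print(f"t = {t_stat:.3f}, df = {dof}, p = {p:.3f}, significant at: {level}")
```

In this invented example the difference between the groups does not reach even the 10% level, which corresponds to the situation described above where observed differences are "on the whole not significant".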

On Method and Identification of Effect
The effect analysis is performed with a difference-in-difference approach, both in a simple analysis of averages and in a more advanced fixed-effect regression analysis. In a simple "diff-in-diff" analysis, the average difference between the five group schools and all other schools in Baerum before the introduction of tablets is compared with the corresponding difference after the introduction of tablets. Figures 1 and 2 illustrate the difference-in-difference approach.
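The simple diff-in-diff comparison of averages can be sketched in a few lines. The school means below are hypothetical (five pilot schools and four other schools, invented for illustration), and the function name `diff_in_diff` is ours; the sketch only shows the arithmetic of comparing the gap between the groups before and after the introduction of tablets.

```python
from statistics import mean


def diff_in_diff(pilot_before, pilot_after, other_before, other_after):
    """Change in the pilot-vs-other gap from the before to the after period."""
    gap_before = mean(pilot_before) - mean(other_before)
    gap_after = mean(pilot_after) - mean(other_after)
    return gap_after - gap_before


# Hypothetical average test scores per school, invented for illustration
pilot_before = [50.0, 52.0, 49.0, 51.0, 53.0]
pilot_after = [52.0, 54.0, 50.0, 53.0, 56.0]
other_before = [51.0, 50.0, 52.0, 49.0]
other_after = [52.0, 51.0, 53.0, 50.0]

effect = diff_in_diff(pilot_before, pilot_after, other_before, other_after)
print(f"Estimated diff-in-diff effect: {effect:+.2f} points")
```

A positive value means the pilot schools improved more than the other schools over the same period; in this invented example both groups improve, and only the extra improvement of the pilot schools is counted as the effect.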
Using a diff-in-diff approach in a more advanced fixed-effect regression analysis makes it possible to control for time-constant variables at the school level. This means that variables that do not change over the years, such as school size, geographical location, and organization, are controlled for. In addition, the method takes into account unobservable characteristics that are constant over the years, such as school culture, student base (assuming the student base does not change), and the like.
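How a fixed-effect specification removes time-constant school characteristics can be illustrated with a "within" transformation. The sketch below assumes a balanced panel (every school observed in every year) and, as our own modelling assumption, also removes common year effects; all school names, levels and the true effect of 2.0 are invented. It is a sketch of the general technique, not the regression specification actually estimated in the study.

```python
from collections import defaultdict
from statistics import mean


def fixed_effect_did(rows):
    """Two-way within estimator for a balanced panel.

    rows: list of (school, year, treated, score), with treated equal to
    1.0 once a pilot school has introduced tablets and 0.0 otherwise.
    """
    y_by_school, y_by_year = defaultdict(list), defaultdict(list)
    d_by_school, d_by_year = defaultdict(list), defaultdict(list)
    for school, year, treated, score in rows:
        y_by_school[school].append(score)
        y_by_year[year].append(score)
        d_by_school[school].append(treated)
        d_by_year[year].append(treated)
    grand_y = mean([row[3] for row in rows])
    grand_d = mean([row[2] for row in rows])

    numerator = denominator = 0.0
    for school, year, treated, score in rows:
        # Subtract school and year means so that anything constant within
        # a school (or common to a year) drops out of the comparison
        y_within = score - mean(y_by_school[school]) - mean(y_by_year[year]) + grand_y
        d_within = treated - mean(d_by_school[school]) - mean(d_by_year[year]) + grand_d
        numerator += d_within * y_within
        denominator += d_within ** 2
    return numerator / denominator


# Hypothetical panel: school levels differ, all schools share a yearly trend,
# and pilot schools gain 2.0 points from 2016 onward (all values invented)
school_level = {"A": 50.0, "B": 53.0, "C": 48.0, "D": 51.0}
year_shift = {2014: 0.0, 2015: 0.5, 2016: 1.0, 2017: 1.5}
pilot_schools = {"A", "B"}
true_effect = 2.0

rows = []
for school, level in school_level.items():
    for year, shift in year_shift.items():
        treated = 1.0 if school in pilot_schools and year >= 2016 else 0.0
        rows.append((school, year, treated, level + shift + true_effect * treated))

estimate = fixed_effect_did(rows)
print(f"Fixed-effect diff-in-diff estimate: {estimate:.2f}")
```

Because the invented data contain no noise, the estimate recovers the built-in effect even though the schools start from different levels, which is the sense in which time-constant school characteristics are controlled for.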

Reservations and Uncertainty in the Analysis
In diff-in-diff analyses (both the simple and the fixed-effect analysis), it is assumed that the schools would have developed equally if the pilot schools had not introduced tablets. This assumption is necessary because, in a diff-in-diff analysis, the group of schools without the intervention defines the counterfactual situation for the schools that have introduced tablets. That is, after taking into account the different starting points of the schools before the introduction of tablets, they are expected to have the same development over the years in the national tests, the national mapping tests, and the National Pupil Survey. This is a strict assumption, and it cannot be tested with the data we have available. Therefore, in the interpretation of the results, it should be noted that group 1 schools might have developed as they did even without the introduction of tablets. One way to address this strict assumption is to include variables that describe pupils' individual backgrounds. As we have not had access to such data, we have not been able to take this information into account in the analysis. In addition to the strict assumption about development, another uncertainty arises in the form of a "grade effect". By grade effect, we mean that the analysis is based on the comparison of students in a single grade, for example fifth grade, with the subsequent cohort of students in fifth grade. In other words, the same students are not followed. This implies that one cohort may consist of students who are overall stronger or weaker, which could contribute to a measured effect that stems from the cohort and not from the tablets themselves. The grade effect can be tested by following a student group over two grades (for example from first to second grade), thus evaluating whether the tablet changes the results in the same student group.
This also means that the results cannot be generalized to other schools or municipalities. Furthermore, we have an analysis of measurable effects, which means that the analysis does not capture potential effects on learning beyond the measurable indicators. All results must therefore be seen in the light of these reservations.
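The estimator and the assumption of equal development discussed in this section can be written compactly. The notation is ours and is offered only as a reading aid: P denotes the pilot (group) schools, O the other schools, \bar{Y} an average outcome, and Y^{0} the outcome that would have been observed without tablets.

```latex
\[
\hat{\delta}_{\mathrm{DiD}}
  = \left(\bar{Y}_{P,\mathrm{after}} - \bar{Y}_{O,\mathrm{after}}\right)
  - \left(\bar{Y}_{P,\mathrm{before}} - \bar{Y}_{O,\mathrm{before}}\right)
\]
\[
\mathbb{E}\left[\,Y^{0}_{\mathrm{after}} - Y^{0}_{\mathrm{before}} \mid P\,\right]
  = \mathbb{E}\left[\,Y^{0}_{\mathrm{after}} - Y^{0}_{\mathrm{before}} \mid O\,\right]
\]
```

The first line is the difference between the after and before gaps; the second states that, without tablets, both groups would on average have changed by the same amount. Since Y^{0} is never observed for pilot schools after the introduction, the second line cannot be verified directly, which is why it is treated as a reservation.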

Identification of Effects
Figures 1 and 2 show an overview of when the group schools introduced tablets. The overview also shows when the various effect measures were collected at a national level. Furthermore, the gray areas mark the years used as before and after measurements. The effect measurements from 2014 to 2015 are used as pre-measurements for group 1 and group 2 schools, respectively. However, it must be noted that the pre-measurement of the National Pupil Survey and the National Tests for Jong school and Bekkestua primary school may be influenced by the fact that these schools had already introduced tablets in autumn 2014. The national mapping tests in 2014 and 2015, however, qualify as pre-measurements for all schools, as they were collected in the spring of the same year.
Table notes: The significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels.
a Means that we can say with 99% certainty that there is a difference between the intervention group and the control group.
b Means that we can say with 95% certainty that there is a difference between the intervention group and the control group.
c Means that we can say with 90% certainty that there is a difference between the intervention group and the control group.
d Bekkestua Primary School is not included in the analysis, as at the time of measurement it did not have its own fifth grade.
Table notes: The significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels.
a Means that we can say with 99% certainty that there is a difference between the intervention group and the control group.
b Means that we can say with 95% certainty that there is a difference between the intervention group and the control group.
c Means that we can say with 90% certainty that there is a difference between the intervention group and the control group.
d National tests in reading cannot be compared beyond 2016, as changes have been made to the scale of this test. Reading is therefore not included in the 2017 sample.
e The number of schools is not equal to the number of observations. There are twice as many observations in both measurements, as both measurements extend over two years, i.e., all schools are included twice. The test in reading is, however, an exception, as only 2016 is included in the post-measurement.
The reason 2013 data are not used for the National Tests for group 1 schools is that the National Tests in 2013 are not comparable with data from 2014 and later. For the student survey, however, 2013 could be used as a pre-measurement for group 1 schools. Nevertheless, the measurements from 2014 are used so that the three analyses can be viewed together. Data from 2015, 2016, and 2017 are used as post-measurements.
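The table notes in this section report a two-tailed independent t-test with equal variance. As a minimal sketch of that kind of test (not the authors' code), the Python below computes the pooled-variance t statistic for two hypothetical groups of scores and compares it with standard two-tailed critical values for the resulting degrees of freedom. The function name, the sample values, and the choice of group sizes are assumptions made purely for illustration.

```python
import math
from statistics import mean, variance

# Standard two-tailed critical values of Student's t for 5 degrees of freedom.
CRITICAL_T_DF5 = {0.10: 2.015, 0.05: 2.571, 0.01: 4.032}

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / dof
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, dof

# Hypothetical scores, NOT data from the study.
intervention = [51.0, 53.0, 52.0]
control = [49.0, 50.0, 51.0, 50.0]

t_stat, dof = pooled_t_statistic(intervention, control)
# Two-tailed test: compare the absolute t value with each critical value.
significant_at = [level for level, critical in CRITICAL_T_DF5.items()
                  if abs(t_stat) > critical]
print(round(t_stat, 3), dof, significant_at)
```

With these made-up numbers the difference would be marked at the 10% and 5% levels but not at the 1% level. For other degrees of freedom the critical values differ and would have to be looked up or computed with a statistics library.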

Results From National Tests in Primary School
As mentioned earlier, the results of the national tests are described first. Then the results of the mapping tests are presented, and finally the results from the student survey. A brief summary of the results follows at the end.
Note. Significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. If a number is not followed by asterisks, there is no statistically significant difference. d The test in reading cannot be compared beyond 2016, and therefore 2017 is not included in the measurement.
Note. Significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. If a number is not followed by asterisks, there is no statistically significant difference. d The effect of tablets is an interaction between a dummy variable for being an intervention school and a dummy variable for the period after implementation, i.e., the effect is calculated by a difference-in-difference approach. e A big school is defined as a school with 400 students or more. f The test in reading cannot be compared beyond 2016, and therefore 2017 is not included in the measurement.
Effects for group 1 in 5th grade (analysis 1):
• Table 2 shows the average test results for national tests in reading, mathematics, and English for all children, boys, and girls. The columns "Effect (2015, 2016) diff-in-diff" and "Effect (2017) diff-in-diff" show whether group 1 schools have developed favourably compared to other schools in Baerum after the introduction of tablets. A positive figure indicates that group 1 schools have developed favourably compared to other schools. The analysis uses 2017 as the post-measurement, i.e., it is the effect for 2017 that is estimated. In addition, the results of the previous report (see Krumsvik et al., 2018) are included in the first column in order to compare short-term and longer-term effects.
• National tests in reading cannot be compared after 2016, as changes have been made to the scale of this test (see latest report). Therefore, the result for reading is omitted from the analysis, as the measurement takes place in 2017.
• In general, the impact of tablets has increased since the measurements in 2015 and 2016.
• The effect of introducing tablets is significantly positive for boys in fifth grade in English (as in 2015/2016). Furthermore, the effect is also positive and significant for all children in fifth grade in English when the effect is measured in 2017. For girls, we cannot say with statistical certainty that a change has occurred. If a change is to be found in the latter group, the results indicate that the change is likely to be positive.
• For fifth-grade boys, there was also a significant positive effect of tablets in mathematics when measured in 2015/2016. This effect is no longer significant in 2017.
• We have also conducted a similar analysis for the three levels of mastery in arithmetic, reading, and English. The analysis can be found as an attachment in section 6.3 (Table 19; contact the first author for this supplemental information). Here you can see that the proportion of students at Level 3 in English rises significantly more in pilot schools than in other schools after the introduction of tablets. This also results in a significant negative effect on Level 2 (although this reflects a positive trend), as a large proportion of Level 2 students move up to Level 3.
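The diff-in-diff columns described in this list compare the change in the pilot schools with the change in the other schools over the same period. A minimal sketch of that arithmetic follows, using hypothetical mean scores rather than values from the tables. The function name and all numbers are illustrative assumptions.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    return (treated_after - treated_before) - (control_after - control_before)

# Hypothetical mean scale scores, NOT taken from the study's tables.
effect = diff_in_diff(
    treated_before=50.0, treated_after=53.0,   # pilot schools
    control_before=50.0, control_after=51.0,   # other schools
)
print(effect)  # a positive value means the pilot schools developed more favourably
```

Here the pilot schools gain 3 points and the other schools gain 1 point, so the estimated effect is 2 points. Whether such a difference is statistically distinguishable from zero is a separate question that the significance tests address.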
Note. Significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. If a number is not followed by asterisks, there is no statistically significant difference. d The test in reading cannot be compared beyond 2016, and therefore 2017 is not included in the measurement. e The number of schools is not equal to the number of observations. The number of observations is doubled in both the pre-measurements and the post-measurements, as both measurements extend over two years, i.e., all schools are included twice. The test in reading is, however, an exception, as only 2016 is included in the post-measurement.
Note. Significance tests have been conducted with a linear regression analysis with fixed effects for school and year. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. d The effect of tablets is an interaction between a dummy variable for being an intervention school and a dummy variable for the period after implementation, i.e., the effect is calculated by a difference-in-difference approach. e A big school is defined as a school with 400 students or more.
Effects for group 2 in 5th grade (analysis 1):
• Table 3 shows the average test results for national tests in mathematics, reading, and English for all children, boys, and girls. The column on the right shows whether pilot 2 schools have developed more positively than other schools in Baerum after the introduction of tablets. A positive figure indicates that pilot 2 schools have developed more positively than the other schools. Both 2016 and 2017 are included in the post-measurement, which means that the measured effect is an average of the effects in 2016 and 2017.
• It has been taken into account that national tests in reading cannot be compared beyond 2016, so the measurement for the reading test consists only of 2016. This is also described in the note below the table.
• There are no statistically significant effects for pilot 2 schools compared to other schools, measured in terms of the national 5th-grade tests. This corresponds to the fact that we did not find any effect for pilot 1 schools at this time (i.e., after a relatively short period of time).
• We have also conducted a similar analysis for the three levels of mastery in arithmetic, reading, and English. The analysis can be found as an attachment in section 6.3 (Table 20; contact the first author for this supplemental information). Here, too, we find no significant effects of tablet usage in pilot schools.
• Table 4 shows the diff-in-diff fixed effect regression analysis in fifth grade (pilot 1 schools), where we can observe that large schools have a significant impact on results in arithmetic.
Note. Significance tests have been conducted with a linear regression analysis with fixed effects for school and year. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. d The effect of tablets is an interaction between a dummy variable for being an intervention school and a dummy variable for the period after implementation, i.e., the effect is calculated by a difference-in-difference approach. e A big school is defined as a school with 400 students or more. f The test in reading cannot be compared beyond 2016, and therefore 2017 is not included in the measurement.
Effects for group 2 in fifth grade (analysis 2):
• The fixed effect analysis in Table 5 (group 2) reinforces the results of the difference-in-difference analysis in Table 3 (group 2), where we do not find positive significant effects for all students or for either of the two gender groups. At the same time, note that the effect in reading for boys in the fifth grade is significantly positive, albeit only as a short-term effect, as the effect of introducing tablets on reading skills is measured only in 2016 (see point below). This means that for 2017 we cannot say with statistical certainty that there has been a positive change in the development of students' reading skills.
• It has been taken into account that the national test in reading cannot be compared beyond 2016, so the measurement for the reading test consists only of 2016. This is also described in the note below the table. For the other national tests (arithmetic and English), both 2016 and 2017 are included in the measurement.
TABLE 9 | Difference-in-difference analysis of share of students above critical limit, first, second, and third grades (group 1).
Columns: Before tablet (2014) | After tablet (2017) | Effect (2017)
Note. Significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. d Bekkestua Primary School is only included in the post-measurement. Therefore, there are two schools in the pre-measurement and three schools in the post-measurement. e The difference is given in percentage points. f In the first grade, six parameters are usually measured. In this analysis, we have used only "Spell words" (spelling), "Read words", and "Reading comprehension". Consequently, "writing letters", "finding sounds in words", and "joining sounds" are not included in the analysis for the first grade, although these are also part of the state survey. For the second and third grades, we have omitted "Understanding words".
• The fixed effect analysis takes into account time-constant characteristics at school level, as well as school size and number of students per year.
• In Appendix 6.3 (Table 22, available from the authors on request), the fixed effect analysis is run with a different model specification. It no longer controls for time-constant characteristics at school level; instead, indicators from Statistics Norway are included in the regression. This reduces the explanatory power (R²), and with the new model specification the effect in reading for boys in the fifth grade is no longer significant.
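The fixed effect analysis described above estimates the tablet effect as the coefficient on an interaction between an intervention-school dummy and an after-implementation dummy, with fixed effects for school and year. The sketch below illustrates that idea in a deliberately simplified form and is not the authors' model: it assumes a balanced panel (every school observed in every year) and omits the additional controls the study mentions, such as school size and number of students per year. Under those assumptions, subtracting school means and year means (and adding back the overall mean) removes both sets of fixed effects, and the coefficient is a simple ratio. All names and data are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def two_way_fe_effect(rows):
    """Estimate the tablet effect from (school, year, treated_post, score) rows.

    treated_post is the interaction dummy: 1 only for intervention schools
    in years after implementation. Assumes a balanced panel and no other
    covariates (a simplification of the analysis described in the text).
    """
    def demean(values):
        by_school, by_year = defaultdict(list), defaultdict(list)
        for (school, year, _, _), value in zip(rows, values):
            by_school[school].append(value)
            by_year[year].append(value)
        overall = mean(values)
        # Remove school and year averages, add back the overall average.
        return [value - mean(by_school[school]) - mean(by_year[year]) + overall
                for (school, year, _, _), value in zip(rows, values)]

    d = demean([float(row[2]) for row in rows])
    y = demean([float(row[3]) for row in rows])
    # Least-squares slope of the demeaned score on the demeaned dummy.
    return sum(di * yi for di, yi in zip(d, y)) / sum(di * di for di in d)

# Hypothetical panel, NOT data from the study: school A introduces tablets.
rows = [
    ("A", 2014, 0, 50.0), ("A", 2017, 1, 53.0),
    ("B", 2014, 0, 50.0), ("B", 2017, 0, 51.0),
]
effect = two_way_fe_effect(rows)
print(effect)
```

In this two-school, two-year example the estimate coincides with the simple difference-in-difference of the four means, which is what one would expect when no further covariates are included.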

Results From National Tests at Secondary School
Note. Significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. d The number of schools is not equal to the number of observations. The number of observations is doubled in both the pre-measurements and the post-measurements, as both measurements extend over two years, i.e., all schools are included twice. The test in reading is, however, an exception, as only 2016 is included in the post-measurement. e The difference is given in percentage points. f In the first grade, six parameters are usually measured. In this analysis, we have used only "Spell words" (spelling), "Read words", and "Reading comprehension". Consequently, "writing letters", "finding sounds in words", and "joining sounds" are not included in the analysis for the first grade, although these are also part of the state survey. For the second and third grades, we have omitted "Understanding words".
Note. Significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. If a number is not followed by asterisks, there is no statistically significant difference. d In the student survey, the students respond on a scale from 1 to 5. The 10 indicators are based on a number of sub-questions. The composition of the indicators is described in more detail at www.skoleporten.udir.no.
Effects for group 1 in 9th grade (analysis 1):
• Table 5 shows the average test results for national tests in arithmetic and reading for all children, boys, and girls, in ninth grade for pilot 1 schools and other schools. The column on the right shows whether pilot 1 schools have developed more favourably compared to other schools in Baerum after the introduction of tablets, as measured in 2017. A positive figure indicates that pilot 1 schools have developed more positively compared to other schools. The analysis uses 2017 as the post-measurement, that is, the effect for 2017 is considered. Furthermore, the result of the previous report is included in the "Effect (2015, 2016) diff-in-diff" column to compare the short-term effect (2015, 2016) against the longer-term effect (2017).
• The analysis has been carried out for the ninth grade, as students in the eighth grade may have attended one of the primary schools that have introduced tablets, which would create uncertainty about the results.
• National tests in reading cannot be compared beyond 2016, as changes have been made to the scale of this test (cf. last report). Therefore, the result for reading is omitted from the analysis, as the measurement takes place in 2017.
• None of the results are statistically significant, and therefore we cannot say with enough certainty that the difference is not random. This applies to the results from both 2015/2016 and 2017.
• However, we see the same trend of negative results in the slightly longer term (2017) as in the short term (2015/2016).
• In Appendix 6.4 (Table 23, available from the authors on request), a diff-in-diff analysis has been carried out in which students are followed from eighth grade without tablets to ninth grade with tablets. In this way, it is possible to account for grade effects. Here, too, we find no statistically significant differences.
Effects for group 2 in 9th grade (analysis 1):
• Table 6 shows the average test results in the national tests in arithmetic and reading for all children, boys, and girls, in ninth grade for pilot 2 schools and other schools. The columns to the right indicate whether pilot 2 schools have developed more positively compared to other schools in Baerum after the introduction of tablets. A positive figure indicates that pilot 2 schools have developed more positively than other schools. Both 2016 and 2017 are included in the post-measurement, which means that the measured effect is an average of the effects in 2016 and 2017.
• It has been taken into account that the national test in reading cannot be compared beyond 2016, so the post-measurement for the reading test consists only of 2016. This is also described in the note below the table.
• None of the results are statistically significant, and therefore we cannot say with enough certainty that the difference is not random.
• In Appendix 6.4 (Table 24, available from the authors on request), a diff-in-diff analysis has been conducted in which students are followed from eighth grade without tablets to ninth grade with tablets. In this way, it is possible to take grade effects into account. Here, too, we find no statistically significant differences.
Note. Significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. d In the student survey, the students respond on a scale from 1 to 5. The 10 indicators are based on a number of sub-questions. The composition of the indicators is described in more detail at www.skoleporten.udir.no.
Effects for group 1 in 9th grade (analysis 2):
• The fixed effect analysis in Table 7 shows the same results as the difference-in-difference analysis in Table 6. This can be seen in the variable "Effect of tablet", where the effect is not significant, which in turn means that we cannot conclude with enough certainty that there is a difference in the development of pilot schools (group 1) compared with other schools.
• However, the fixed effect analysis yields one significant result: students in schools with more than 400 students have significantly lower test results in the ninth grade than students in smaller schools.
• The analysis is performed for 2017, and reading is therefore excluded from the analysis, for the reasons given earlier.
• In Appendix 6.4 (Table 26, available from the authors on request), the fixed effect analysis is run with a different model specification. It no longer controls for time-constant characteristics at school level; instead, indicators from Statistics Norway are included in the regression. This reduces the explanatory power (R²). However, the conclusion is still that we cannot find any statistically significant effects.
Effects for group 2 in 9th grade (analysis 2):
• The fixed effect analysis in Table 8 shows the same results as the difference-in-difference analysis in Table 7. This can be read from the variable "Effect of tablet", where the effects are not significant, so we cannot conclude with great certainty that there is a difference in the development of pilot schools (group 2) compared with other schools.
• It has been taken into account that the national test in reading cannot be compared beyond 2016, so the post-measurement for the reading test consists only of 2016. This is also described in the note below the table. For the other national tests (arithmetic and English), both 2016 and 2017 are included in the measurement.
• In Appendix 6.4 (Table 27, available from the authors on request), the fixed effect analysis is run with a different model specification. It no longer controls for time-constant characteristics at school level; instead, indicators from Statistics Norway are included in the regression. This reduces the explanatory power (R²). The conclusion is, however, still that we cannot find any statistically significant effects.

Results From National Mapping Tests in First to Third Grades
In the national mapping tests, it is examined whether the students are above or below the concern threshold for the expected learning level. An increase in the proportion of students above the critical limit at pilot schools may indicate that the introduction of tablets has contributed to increased learning from the first to third grades.
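As a small illustration of the outcome measure used for the mapping tests, the sketch below computes the share of students above a critical limit before and after, and expresses the change in percentage points. The scores, the limit, and the choice to treat "above" as strictly greater than the limit are assumptions for illustration only; the actual thresholds are set by the national mapping tests.

```python
def share_above_limit(scores, critical_limit):
    """Percentage of students scoring strictly above the critical limit."""
    return 100.0 * sum(score > critical_limit for score in scores) / len(scores)

# Hypothetical test scores and limit, NOT data from the study.
limit = 15
pilot_before = [12, 18, 25, 30]
pilot_after = [16, 22, 27, 31]

# Change in the share above the limit, in percentage points.
change = share_above_limit(pilot_after, limit) - share_above_limit(pilot_before, limit)
print(change)
```

In the study's tables the corresponding change for the other schools is subtracted from this figure, which gives the diff-in-diff effect in percentage points.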
Effects for group 1 in first through third grade:
• Table 9 shows the proportion of students above the critical limit in the state assessment tests for pilot 1 schools and other schools in Baerum. For reading, we have selected the subtests spelling, reading words, and reading comprehension among several subtests. A positive value in the column "diff-in-diff" indicates a positive effect of introducing tablets. The effect is still negative in 2017, but we can no longer conclude with statistical certainty that it is different from zero. This can in itself be regarded as a positive development.
• In Appendix 6.2 (Table 16, available from the authors on request), we have completed a diff-in-diff analysis that takes possible grade effects into account. The test is designed by following the same students before and after the introduction of the tablets and then comparing them with other students. In this analysis, there are also no significant effects. The above conclusion is therefore robust with regard to the grade effect.
Note. Significance is tested by a two-tailed independent t-test with equal variance at the 10, 5, and 1% significance levels. a Significant at the 1% level. b Significant at the 5% level. c Significant at the 10% level. d In the student survey, the students respond on a scale from 1 to 5. The 10 indicators are based on a number of sub-questions. The composition of the indicators is described in more detail at www.skoleporten.udir.no.
Effects for group 2 in first through third grade:
• Table 10 shows the percentage of students above the critical limit in the state assessment tests for pilot 2 schools and other schools in Baerum. For reading, we have selected spelling, reading words, and reading comprehension among multiple subtests. A positive value in the column "diff-in-diff" indicates a positive effect of introducing tablets.
• The table shows only positive effect sizes, but only two of the results can be considered to be different from zero. There are positive effects on reading and understanding in the first grade, both of which are statistically significant.
• In Appendix 6.2 (Table 17, available from the authors on request), we have conducted a diff-in-diff analysis in which we have taken the possible grade effect into consideration, i.e., that the students in grade 1 before and after the introduction of tablets are not the same students. The test is designed by following the same students before and after the introduction of the tablets and then comparing them with other students. The analysis here shows different results from the table above, as the effect of introducing tablets is significantly negative for the tests in reading and arithmetic (from first to second grade). Therefore, the conclusion from Table 9 above is not robust with regard to the grade effect, and we are careful in drawing conclusions from the results in Table 10, except that it is currently difficult to see any significant effects here.

Results From the Student Survey at Primary School and Secondary School
Effects for group 1 in 7th grade:
• Table 11 shows the effect of introducing tablets in seventh grade for pilot 1 schools compared to the other schools in Baerum. A positive value means that pilot 1 schools have had an increase compared to the other schools.
• The table shows that there are no major differences in the student survey between pilot 1 schools and other schools. In 2015, there was a significant effect on the bullying indicator, which means that pilot 1 schools had experienced a significant increase in bullying from 2014 to 2015. The bullying indicator is still higher for pilot schools than for other schools in 2016 and 2017, but the effect is no longer significant, which means we cannot conclude that the effect of tablets on bullying is different from zero. This means that there was a negative effect of tablets in the short term, but that this effect has decreased and is no longer detectable in the longer term. In addition, we cannot rule out that the impact on bullying in 2015 was influenced by a possible grade effect and other conditions that are not related to the introduction of tablets.
• When investigating the results (based on the bullying indicator in the student survey) more closely, by looking at the development in response frequency on the question about digital bullying in the national student survey for group 1, group 2, and other schools in Baerum in 2016 and 2017, we find no significant differences in level and development between these school groups, neither combined nor by gender. This may also indicate that the bullying identified among girls in seventh grade in pilot 2 schools can depend on variables other than the use of tablets.
• In Appendix 6.5 (Tables 29 and 30, available from the authors on request), we show the results divided by gender. Here we find no significant effects for boys, but for girls there is a significant increase in the proportion of pupils who experience bullying. That was the same trend we saw for pilot schools in 2015.
• We have also performed some additional analyses on bullying in Appendix 6.5 (Tables 36, 37, and 38, available from the authors on request). These analyses support the conclusion: they also show a significant impact on the bullying indicator, and that bullying has increased after the introduction of tablets in the seventh grade for pilot 2 schools.
Effects for group 1 in 10th grade:
• Table 12 shows the effect of introducing tablets in the 10th grade for pilot 1 schools compared to the other schools in Baerum. A positive value means that pilot 1 schools have had an increase compared to the other schools.
• The table shows, as in the seventh grade (group 1), that the effects of introducing tablets on the students' well-being and learning environment are close to zero and not significant.
• In Appendix 6.6 (Tables 39 and 40, available from the authors on request), the effects are divided by gender. Here we also find small effect sizes and non-significant effects for both boys and girls. At the same time, we can mention that the effect on motivation among girls in 2016 is positive and significant, but that the effect decreases and is no longer significant in 2017.
• We have made some additional analyses on the bullying indicator in Appendix 6.6 (Tables 43-45, available from the authors on request). These analyses support the conclusion, as they do not show any significant effects on bullying.
Effects for group 2 in 10th grade:
• Table 13 shows the effect of introducing tablets in the 10th grade for pilot 2 schools compared to other schools in Baerum. A positive value means that the pilot 2 schools have had an increase compared to the other schools.
• The table shows, unlike in the seventh grade in pilot 2 schools, that the introduction of tablets has not had a negative impact on the bullying indicator for tenth-grade students. At the same time, we register positive significant effects on mastering, motivation, well-being, teacher support, and assessment for learning.
• In Appendix 6.6 (Tables 41 and 42, available from the authors on request), the effects are divided by gender. For boys, we find positive significant effects on well-being, common rules, and assessment for learning, while for girls we find positive and significant effects on mastering, teacher support, and assessment for learning.
• We have performed some additional analyses on the bullying indicator in Appendix 6.6 (Tables 46-48, available from the authors on request). These analyses support the conclusion, as they do not show any significant effects on bullying.
On the Use of Data From National Tests, the National Mapping Survey, and the National Student Survey
We repeat that it is important to note that the effect results from national tests, the national mapping survey, and the national student survey belong to different students in the pre- and post-measurements. This means that the results from the effect measurements may potentially be the result of grade effects. Analyses and further investigation of the results of national tests in the 8th and 9th grades showed that the results here were quite robust with regard to the grade effect, while the analysis of the state mapping tests showed that the results here were not robust with regard to the grade effect. Therefore, we cannot rule out that the grade effect may also have an impact on the results in the student survey in the 7th and 10th grades. We therefore request that the results be read with this reservation.

DISCUSSION
The study shows that in several school areas tablets have a rather limited effect on pupils' learning outcomes, and it is important to underline that the study does not find any direct causal relationship between implementing tablets and positive learning outcomes. However, among the significant findings in this study, we see that tablets have somewhat more positive effects among boys than among girls. The positive effect of tablets that we see among boys may be related to the use of tablets serving as a positive structuring factor for the boys' learning work. We also find support for this in the 10th grade, where boys who use tablets report having common rules for the teaching to a significantly greater extent than boys in schools that do not use tablets. Part of this can be interpreted as the tablet having a somewhat structuring function (we also find support for this in the qualitative interviews in the study). One possible explanation is that teachers, with the use of tablets, make greater use of work schedules and learning resources for school hours and make them more readily available. At the same time, the use of tablets means that pupils have most of their tools and previous learning work gathered in one place on the tablet. This means that the pupils can get started quickly, and that they experience the learning resources as more transparent and accessible. We also find support for this in the qualitative data in the study.
Furthermore, it seems that the tablet can be a motivating factor in the pupils' school life. In this regard, we see significant positive findings for increased motivation in the 10th grade generally. It seems here that the tablet helps to make boys more motivated for learning. This can also be related to the tablet's multiple digital, graphic, auditory, and visual capabilities and support features (visualization, audio, multimodal aspects, communication capabilities), which can give new opportunities for adapted education and differentiation. There are also tendencies suggesting that the tablet provides digital support that particularly benefits low-performing students, among whom boys are over-represented.
Does the tablet have an equalizing effect between the genders? And can the use of tablets in schools thus contribute to a school with less difference between girls' and boys' school performance? Today, girls generally perform better than boys, and several studies reveal that there is no "quick fix" for increasing boys' school performance, with or without educational technology. However, findings from the study suggest the possibility that boys benefit from tablets to a greater extent than girls. An interesting finding is that the effect of introducing tablets is significantly positive for boys in fifth grade in English (as in 2015/2016). Furthermore, the effect is also positive and significant for all children in fifth grade in English when the effect is measured in 2017. These findings may have a number of explanations (e.g., the gaming culture among boys, etc.), of which tablets might be only one of several factors. In general, the study shows that the large schools in particular have positive results.
Such findings can be seen in light of the latest national tests in Norway in 2019, which also show that boys perform better than girls in English in 5th grade, with a difference of 3.6 scale points (Statistisk sentralbyrå (Statistics Norway), 2020). From 2014 to 2019 there is a decrease in mastery level 1 and an increase in mastery level 3, and the boys are clearly higher than the girls in mastery level 3 in English (Utdanningsdirektoratet 2020). Internationally, Niitemaa (2020) reveals similar tendencies in Finland.
These tendencies might be linked to a "spill over-effect" from outside school learning, where the significantly positive results for boys in fifth grade in English can be interpreted as a sign of the times in which English language immersion in leisure time (e.g., gaming, youtube, etc.) among boys is continuously developing. Such informal learning can be defined as "any activity involving the pursuit of understanding, knowledge or skill which occurs outside the curricula of educational institutions" (Livingstone 1999, p. 51). Whether, and if so how, these tendencies of informal learning in leisure time can (directly or indirectly) have a "spillover" effect on school performance needs to be addressed in future research in this area. Sefton-Green (2013) and Lewin and Charania (2018) point out that building bridges between formal and informal learning arenas is something that digitalization has good potential for and that should receive stronger focus in the future.

CONCLUSION
It is still too early to say anything about the effect, as changes take time. We therefore refer to the effects we see in the pilot 1 and pilot 2 schools as "intermediate effects".
The preliminary results give reason to assume that in several school areas tablets have a rather limited effect on pupils' learning outcomes. However, the use of tablets can have some small positive effects on boys' learning. This can be linked to the fact that the tablet provides poorly performing students, among whom boys are over-represented, with a digital support that contributes to evening out the students' performance. This also presupposes an appropriate use of tablets and good teaching quality. The use of the tablet is strongly linked to pedagogical practice, which in turn is influenced by teachers' digital competence. This might also be linked to "outside school learning", where the significantly positive results for boys in fifth grade in English can be interpreted as "a sign of the times" in which English language immersion in leisure time among boys is continuously developing.
From the study, we find some tendencies that when the use of tablets is supported by teachers who have high digital competence, their use seems to have a small equalizing effect between the school achievements of boys and girls. However, we cannot rule out that a grade effect and informal learning may also have an impact on the results, and we therefore request that the results be read with this reservation.

Limitations
There are some limitations in this study. In this part of the trailing research, we have only presented the quantitative data of the study. This might be a certain limitation, since the trailing research consists of several other data sources which give a broader picture of the implementation of tablets in Baerum Municipality. We also mention several attachments with statistical data which are not included in this article because of space limits, but these can be accessed by contacting the authors.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
RK: was responsible for the research idea of the article, made substantial contributions to the design of the work and the interpretation of the data, and was the main writer of the manuscript. EB: provided ethical approval for publication of the content, made substantial contributions to the design of the work, data collection, analysis and interpretation, and wrote the main Norwegian report for the study. LØJ: made substantial contributions to the design of the work and the analysis, and drafted and provided important contributions to the final manuscript. IG: provided ethical approval for publication of the content, made substantial contributions to the design of the work, data collection, analysis and interpretation, and wrote the main Norwegian report for the study. All authors contributed to the article and approved the submitted version.