Effects of Executive Function Training on Attentional, Behavioral and Emotional Functioning and Self-Perceived Competence in Very Preterm Children: A Randomized Controlled Trial

Objective: Very preterm children have poorer attentional, behavioral and emotional functioning than term-born children. Problems in these domains have been linked to poorer executive function (EF). This study examined effects of a game-formatted, comprehensive EF training on attentional, behavioral and emotional functioning and self-perceived competence in very preterm children.

Study Design: Eighty-five children participated in a multi-center, double-blind, placebo and waitlist-controlled randomized trial. Children were recruited from neonatal follow-up units of two academic medical centers in The Netherlands. Eligible for inclusion were 8–12 year old children born very preterm (<30 weeks of gestation) and/or with extremely low birthweight (<1000 g) with parent-reported attention problems. Children were randomly assigned to one of three treatment arms: EF training, placebo training or waitlist. The EF and placebo training each involved a 6-week program of 25 sessions (30–45 min per session). Attentional functioning (Attention Network Test), behavioral and emotional functioning (parent and teacher Strengths and Difficulties Questionnaire) and self-perceived competence (Self-Perception Profile for Children) were assessed at baseline, at the end of the training program and 5 months after the training was finished. Data were analyzed with linear mixed models.

Results: Children in the EF training arm significantly improved on all training tasks over the course of the EF training program. Despite these improvements on the EF training tasks, there were no significant differences over time on any of the outcome measures between the three treatment arms, indicating that this computerized EF training program had no beneficial effects.

Conclusion: Although there were significant improvements in the EF training tasks, there was no generalization of these improvements to any of the outcome measures. Thus, our findings do not support the use of computerized EF training programs. Future research should investigate the effectiveness of more ecologically valid, real-world-like EF training programs.


INTRODUCTION
Between 0.7 and 1.4% of all live born children in Western countries are born very preterm (gestational age [GA] < 32 weeks) (Delnord et al., 2017). Long-term consequences of very preterm birth have been intensively investigated in the domains of cognitive, academic, behavioral and emotional functioning, with very preterm children showing substantial problems in all of these domains (Bhutta et al., 2002; Anderson et al., 2003; Aarnoudse-Moens et al., 2009; Mulder et al., 2009; Blencowe et al., 2013; Aylward, 2014; Ritchie et al., 2015; Allotey et al., 2017; Twilhaar et al., 2017). For example, executive functions (EF), an umbrella term for a set of higher-order cognitive functions allowing for top-down, goal-directed behavior, are adversely affected in very preterm children (Aarnoudse-Moens et al., 2009; Mulder et al., 2009; van Houdt et al., 2019). Deficits in EF have been shown to play an important underlying role in both the academic and the behavioral and emotional functioning problems that very preterm children encounter (Nadeau et al., 2001; Taylor et al., 2006; Mulder et al., 2010, 2011; de Kieviet et al., 2012; Loe et al., 2012; Aarnoudse-Moens et al., 2013; Alduncin et al., 2014). For example, EF performance has been shown to predict math performance in very preterm children at primary school (Aarnoudse-Moens et al., 2013) and working memory has been shown to account for academic attainment (Mulder et al., 2010). Furthermore, working memory has been shown to account for attention problems in very preterm children at school-age (Nadeau et al., 2001; Mulder et al., 2011; de Kieviet et al., 2012). Finally, poorer EF performance has been shown to be associated with poorer social competence in very preterm children at preschool age (Alduncin et al., 2014) and school-age (Taylor et al., 2006; Loe et al., 2012).
In the past decade, an increasing number of studies have addressed the efficacy of computerized interventions to improve EF, with Cogmed Working Memory Training (CWMT) (Klingberg et al., 2005) being the most widely studied computerized EF training program. CWMT for school-age children involves gamified verbal and visuospatial working memory training tasks presented in a space-themed interface. Children's scores are presented on the screen to challenge children to outperform their own scores, and difficulty level is automatically adjusted according to the child's performance. CWMT is played five times a week for 30-45 min per session. Studies on CWMT in children with Attention Deficit/Hyperactivity Disorder (ADHD) have shown promising results in improving working memory and also reported some promising transfer effects to untrained functions (Klingberg et al., 2005; Beck et al., 2010; Green et al., 2012; Hovik et al., 2013; Chacko et al., 2014). Compared to a wait-list control group, CWMT was reported to improve verbal and non-verbal working memory storage, visuospatial working memory, verbal working memory, parent-rated working memory and parent-rated inattention symptoms (Beck et al., 2010; Hovik et al., 2013). Furthermore, compared to a placebo control group, CWMT was reported to improve performance on trained working memory tasks and on untrained tasks assessing visuospatial working memory, verbal working memory, response inhibition and complex reasoning. In addition, beneficial effects have been reported on parent-rated inattention and hyperactivity/impulsivity symptoms and on observed behaviors during an academic task (Klingberg et al., 2005; Green et al., 2012; Chacko et al., 2014). There is also some evidence of neural changes following CWMT and associations between these neural changes and improved working memory, both in healthy children and adults (Barnes et al., 2016; Metzler-Baddeley et al., 2016) and in adolescents with ADHD (Stevens et al., 2016).
Three meta-analyses have been conducted investigating near-transfer effects of CWMT on working memory (Shipstead et al., 2012; Melby-Lervåg and Hulme, 2013; Aksayli et al., 2019). Two of these three meta-analyses concluded that there is evidence that CWMT leads to improved working memory task performance (Melby-Lervåg and Hulme, 2013; Aksayli et al., 2019), with the strength of the improvement depending on the similarity of the tasks to the training tasks (Aksayli et al., 2019). Four meta-analyses have been conducted investigating far-transfer effects of CWMT on untrained functions (Shipstead et al., 2012; Melby-Lervåg and Hulme, 2013; Spencer-Smith and Klingberg, 2016; Aksayli et al., 2019). Of these, three meta-analyses concluded that there is no evidence for improvements of untrained functions after CWMT (Shipstead et al., 2012; Melby-Lervåg and Hulme, 2013; Aksayli et al., 2019). Only one randomized controlled trial of CWMT in very preterm born children has been conducted, and it showed no improvements in academic achievement, working memory, attention, daily life EF and general cognitive ability (Anderson et al., 2018). However, CWMT is an EF training program that focuses solely on training working memory, while other core EFs such as inhibition and cognitive flexibility are also affected in children born preterm (Aarnoudse-Moens et al., 2009; Mulder et al., 2009; van Houdt et al., 2019).
Recently, a game-formatted and comprehensive EF training program entitled BrainGame Brian (BGB) was developed that aimed at training not only working memory, but also inhibition and cognitive flexibility, in children aged 8-12 years. BrainGame Brian involves a game-world in which training tasks for visuospatial working memory, response inhibition and cognitive flexibility are played to help the main character, Brian. Difficulty level is automatically adjusted according to the child's performance. The training program is played four times a week for 30-45 min per session. The BGB EF training program has been consistently shown to improve working memory in children with ADHD and Autism Spectrum Disorder (ASD) (van der Oord et al., 2014; de Vries et al., 2015; Dovis et al., 2015). However, effects on other EFs or other untrained functions were inconsistent (van der Oord et al., 2014; de Vries et al., 2015; Dovis et al., 2015). Furthermore, one small-sized non-randomized pilot study has been conducted into the feasibility of the BGB EF training program in very preterm children, which showed positive effects on visuospatial working memory task performance (Aarnoudse-Moens et al., 2018). The BGB EF training program may have beneficial effects on various areas of functioning, including attentional, behavioral and emotional functioning and self-perceived competence in very preterm born children. Deficits in EF have been shown to play a crucial role in a range of psychiatric disorders such as ADHD and ASD, and a large body of literature has indicated that executive functioning is strongly related to both behavioral and emotional functioning (Ozonoff et al., 1991; Pennington and Ozonoff, 1996; Nigg, 2000; Sergeant et al., 2002; Oosterlaan et al., 2005; Willcutt et al., 2005; Riggs et al., 2006; Carlson and Wang, 2007).
In very preterm children, deficits in EF have been shown to underlie the attentional problems these children encounter as well (Mulder et al., 2011; de Kieviet et al., 2012; Aarnoudse-Moens et al., 2013). Therefore, improving EFs with the BGB EF training program could lead to improvements in attentional, behavioral and emotional functioning as well. If the BGB EF training program leads to improvement in those domains, it may improve children's self-perceived competence as well.
Therefore, the current study aimed to investigate effects of the BGB EF training program on attentional functioning, parent and teacher rated behavioral and emotional functioning and self-perceived competence in a group of very preterm (<30 weeks of gestation) and/or extremely low birthweight (<1000 g) children with parent-rated attention problems, compared to both a placebo training and a waitlist arm. The BGB EF training program uses game elements and strong and immediate reinforcements to optimize the participants' motivational state and compliance with the training, which in turn is supposed to enhance efficacy of the training. The effects of EF training with BGB may therefore be moderated by exposure to gaming before the start of the EF training program. More specifically, children with intensive exposure to gaming may show a more blunted response to the reinforcements built into the training than children with little exposure to gaming. Therefore, exploratory analyses also examined effects of the BGB EF training program while correcting for time spent gaming outside school-hours. Also, associations between time spent gaming outside school-hours and baseline measurements were examined.

Trial Design
This was a multi-center, double-blind, placebo and waitlist-controlled randomized trial conducted in two academic hospitals in The Netherlands (Amsterdam University Medical Centers and University Medical Center Utrecht). The Medical Ethical Committee of the two participating academic hospitals approved the study protocol and the execution of the study procedures was according to the Declaration of Helsinki. The trial was registered in the Dutch Trial Registry (NTR, # NTR5365). CONSORT guidelines were followed.

Participants
The Dutch version of the Child Behavior Checklist 6-18 years (CBCL6-18) (Verhulst and Van der Ende, 2013) was sent to parents of 7-12 year old (chronological age) children born very preterm (<30 weeks of gestation) and/or with extremely low birthweight (birthweight < 1000 g) who participated in the national neonatal follow-up program after being admitted to the Neonatal Intensive Care Unit (NICU) in one of the two participating hospitals. Eligible for this study were children whose parents reported attention problems on the CBCL6-18 (T ≥ 55 on the Attention Problems scale, Hudziak et al., 2004), as soon as they reached the chronological age of at least 8 years. Exclusion criteria were an estimated IQ < 80 (in order to ensure that the child was able to understand and comply with instructions), motor problems too profound to allow use of a computer and no Dutch language use in the home situation. The inclusion process and participants' flow through the study are depicted in Figure 1. Reasons not to return the questionnaire that was used to assess whether children had parent-rated attention problems were no time or no interest. Reasons not to participate were that parents found that incorporation of the training sessions into already busy schedules was too burdensome for the child and/or family or that parents or children had no interest in participating. In short, 85 children were randomized, 29 to the EF training arm, 26 to the placebo training arm and 30 to the waitlist arm. Data of the first follow-up visit were available for 24, 20, and 29 children, respectively, and data of the second follow-up visit were available for 23, 19, and 27 children, respectively. Thus, 81% of all children completed all assessments.
Reasons for withdrawal from the study after randomization were not being able to incorporate training sessions into a busy schedule or the child not wanting to complete the training sessions (n = 9), no time or willingness to schedule the follow-up visit(s) at the appropriate time-point(s) (n = 5) or severe illness discovered (n = 2). All available data of participants (also data of participants with missing data) were incorporated in the analyses.

Randomization and Blinding
Children meeting inclusion criteria were randomly assigned to one of three treatment arms: EF training, placebo training or waitlist. Allocation to treatment arms was stratified by age (below or above 10.5 years of age) and severity of attention problems (Attention Problems T-score below or above 65), with equal proportions of children allocated to each arm within the same stratum. A random number generator was used to generate randomization lists. A researcher not otherwise involved in this study was responsible for randomization and handed the test assistant a sealed envelope with a note stating 'waitlist' or a login and password, which was opened by the child and parents after baseline assessment. To ensure blinding, parents were only informed about whether their child was randomized to either one of two training arms or the waitlist arm, and in case more children from the same family were included in the study, one of those was randomized and the other was put in the same arm. All staff was blinded to EF training or placebo training assignment, including the person involved in randomization. Test assistants that played the first training session with the child were deblinded because of differences in training tasks (see below) between EF training and placebo training and were not involved in follow-up assessments of these children. Parents, children and researchers were aware of children's allocation to the waitlist arm. Data were analyzed by a researcher blinded to treatment allocations.
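The stratified allocation described above can be illustrated with a minimal sketch. The text only states that a random number generator produced the randomization lists and that arms were balanced within strata, so the permuted-block scheme, the handling of children exactly at a cut-off, the seed, and all function names below are assumptions made for illustration. The sibling rule (siblings placed in the same arm) is not modeled.

```python
import random

ARMS = ("EF training", "placebo training", "waitlist")

def stratum_of(age_years, attention_t_score):
    """Return the stratum for one child (age band, severity band).

    Assumption: children exactly at a cut-off (10.5 years, T = 65) fall in
    the upper band; the source does not specify this.
    """
    age_band = "older" if age_years >= 10.5 else "younger"
    severity_band = "high" if attention_t_score >= 65 else "low"
    return (age_band, severity_band)

def make_randomization_list(n_blocks, rng):
    """Build one allocation list from shuffled blocks of the three arms.

    Each block contains every arm exactly once, which keeps the arms in
    equal proportions within a stratum (hypothetical block scheme).
    """
    allocations = []
    for _ in range(n_blocks):
        block = list(ARMS)
        rng.shuffle(block)
        allocations.extend(block)
    return allocations

# One list per stratum; the seed is arbitrary and only makes the sketch repeatable.
rng = random.Random(2015)
lists_by_stratum = {
    (age_band, severity_band): make_randomization_list(10, rng)
    for age_band in ("younger", "older")
    for severity_band in ("low", "high")
}
```

A child is then assigned the next unused entry of the list belonging to `stratum_of(age, t_score)`.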

BrainGame Brian Training
The BGB EF training program is a game-formatted, computerized training program that is performed by the child at home. The BGB EF training program uses game elements and strong and immediate reinforcements to optimize the participants' motivational state and compliance with the training. The game-world consists of several different villages, in each of which there are characters that face problems and need the help of the main character, Brian. During the first sessions, only one of those villages is accessible, with more villages becoming accessible during the course of the training program. To help the characters facing problems, children perform the EF training tasks with Brian. After completion of each block of training tasks, an invention made by Brian appears in the game-world that helps solve the problem of the character, thereby acting as an immediate reinforcement. These inventions remain visible in the game-world during subsequent sessions. Thus, the more sessions children have performed, the more inventions are visible in the game-world, which enhances motivation. The training consists of 25 sessions, with two blocks of three training tasks, one for each EF, administered in every session. These three training tasks remain the same throughout the 25 sessions, except for the visuospatial working memory task, which was administered in five different versions to increase working memory demands.

EF Training Arm
In the working memory task, children are asked to repeat a sequence of dots on a 4 × 4 grid. There were five versions of the working memory task, each of which was administered for five consecutive training sessions and increased in difficulty level across training sessions. In the inhibition task, children are asked to press a button in a specific time window (target), but to refrain from pressing that button when a visual stop signal is presented. In the cognitive flexibility task, children are asked to sort objects according to either their shape or their color, with the sorting rule changing every three to five trials. Difficulty level of each training task is automatically adjusted to the child's level of performance. The number of trials and therefore also the duration of all three tasks depends on the child's performance. Most children are able to finish the training tasks within 8 min per task. For the first three and last two versions of the working memory task, a total of at least 74 and 62 boxes need to be repeated correctly (with only correctly repeated boxes within correctly repeated sequences adding to this total), respectively, to end the task. For the inhibition task, the task ends after 10 blocks of five trials that were all performed correctly. For the cognitive flexibility task, the task ends after 10 blocks of three-to-five trials that were all performed correctly. Difficulty level was adjusted for each task after completion. Difficulty level for the working memory task was adjusted by increasing or decreasing the sequence length. Difficulty level for the inhibition task was adjusted by increasing or decreasing the time between start of the time window in which children needed to respond and presentation of the stop signal. Difficulty for the cognitive flexibility task was adjusted by increasing or decreasing the time children have to sort each presented target.
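The stopping rule for the working memory task can be sketched as follows. This is an illustration of the counting rule as described (only boxes within correctly repeated sequences count toward the required total), not the program's actual code; the function name and the input format are hypothetical.

```python
def sequences_until_task_ends(sequence_results, required_boxes):
    """Return how many sequences are played before the task ends.

    sequence_results: iterable of (sequence_length, repeated_correctly) pairs
    in the order played. Only boxes in fully correct sequences are counted.
    Returns None if the required total is never reached.
    """
    total_correct_boxes = 0
    for n_played, (length, repeated_correctly) in enumerate(sequence_results, start=1):
        if repeated_correctly:
            total_correct_boxes += length
        if total_correct_boxes >= required_boxes:
            return n_played
    return None

# Example with a small, made-up threshold: sequences of 4 (correct),
# 5 (incorrect) and 5 (correct) boxes give 9 counted boxes after 3 sequences.
example = sequences_until_task_ends([(4, True), (5, False), (5, True)], required_boxes=9)
```

With the thresholds reported above, `required_boxes` would be 74 for the first three versions of the task and 62 for the last two.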

Placebo Training Arm
The placebo training is identical to the EF training, except that the specific elements that actually train the EFs are removed from the training tasks. In the working memory task, children are asked to repeat sequences with a span length of two in the same order as presented. In that way, the training task only involves short-term memory and does not tax working memory. In the inhibition task, no stop-signals are presented. In the cognitive flexibility (shifting) task, no shifting trials are presented. Furthermore, difficulty level is not adjusted. Thus, children do play the training tasks, but do not train working memory, inhibition or cognitive flexibility in the placebo training arm.

Waitlist Arm
Children in the waitlist arm did not play the training program and were instructed to carry on with their usual activities during the waiting period.

Improvement During Training
To validate whether the BGB EF training program actually induced improvement on the trained tasks, we assessed improvement of training performance across all training sessions. For the inhibition and cognitive flexibility tasks, improvement was assessed by comparing the mean difficulty level of day two and day three of training (start level) with the highest achieved difficulty level (highest level). All children start at the same level at day one, but for some children this level is too easy and for some children this level is too difficult. Therefore, taking the mean difficulty level of day two and day three as start level ensures that this is the child's actual level of performance at the beginning of the training. As there were five versions of the working memory task, which were each played in five consecutive training sessions, for each version improvement was assessed by comparing mean difficulty level at day two (start level) with the highest achieved difficulty level (highest level). Again, mean difficulty level at day two was chosen as start level to ensure this was the child's actual level of performance at the start of each new version of the working memory task. Difficulty level at day two and not mean difficulty level of day two and three (as was done for the inhibition and cognitive flexibility tasks) was chosen because each version of the working memory task was only performed in five consecutive training sessions and not 25 as in the inhibition and cognitive flexibility tasks.
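As a minimal sketch of these definitions, the start level and highest level can be derived from a child's per-session difficulty levels as shown below. The function name, the input format and the example values are hypothetical.

```python
from statistics import mean

def start_and_highest_level(levels_by_session, start_sessions):
    """Return (start level, highest level) for one child on one task.

    levels_by_session: difficulty levels in session order (index 0 = day one).
    start_sessions: 1-based session numbers averaged into the start level,
    e.g. (2, 3) for inhibition and cognitive flexibility, (2,) for one
    version of the working memory task.
    """
    start_level = mean(levels_by_session[day - 1] for day in start_sessions)
    highest_level = max(levels_by_session)
    return start_level, highest_level

# Made-up inhibition levels: everyone starts at level 5 on day one,
# after which the level settles near the child's actual ability.
inhibition = start_and_highest_level([5, 3, 4, 5, 6, 7], start_sessions=(2, 3))

# Made-up levels for one working memory version (five sessions).
working_memory = start_and_highest_level([4, 4, 5, 5, 6], start_sessions=(2,))
```

Here `inhibition` is `(3.5, 7)` and `working_memory` is `(4, 6)`.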

Attentional Functioning
The Child version of the Attention Network Test (Child-ANT) (Rueda et al., 2004) was administered to assess efficiency of the three attention networks: (1) the alerting network, (2) the orienting network, and (3) the executive attention network. Each trial of the Child-ANT started with a central fixation cross. The target was one single yellow fish or a horizontally positioned line of five yellow fish, appearing above or below the fixation cross. The child was asked to respond by pressing one of two buttons on the side the central fish pointed to. Trials could be (a) congruent (central fish pointing to same direction as flanking fish), (b) incongruent (central fish pointing to opposite direction as flanking fish) or (c) neutral (only central fish, no flanking fish). Furthermore, each target was preceded by a warning cue condition that comprised one of four options: (a) no cue, (b) center cue (cue presented at the location of the fixation cross), (c) double cue (cues presented above and below the fixation cross), or (d) spatial cue (cue presented at the location of the upcoming target). Outcome measures were efficiency of the alerting, orienting and executive attention networks, calculated by (1) subtracting the median RT for the double cue condition from the median RT for the no cue condition, (2) subtracting the median RT for the spatial cue from the median RT for the central cue and (3) subtracting the median RT for the congruent trials from the median RT for the incongruent trials, respectively. For the alerting and orienting networks, higher values reflect higher network efficiency. For the executive attention network, higher values reflect lower network efficiency.
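The three subtractions above can be sketched in a few lines. The trial format, field names and example reaction times are hypothetical, and the sketch assumes that the trials passed in have already been filtered as intended (for instance to correct responses), since the text does not specify trial exclusion rules.

```python
from statistics import median

def network_scores(trials):
    """Compute alerting, orienting and executive attention scores (ms).

    trials: list of dicts with keys 'cue' ('none', 'center', 'double',
    'spatial'), 'flanker' ('congruent', 'incongruent', 'neutral') and 'rt'.
    """
    def median_rt(key, value):
        return median(t["rt"] for t in trials if t[key] == value)

    return {
        # Higher values reflect higher efficiency for alerting and orienting.
        "alerting": median_rt("cue", "none") - median_rt("cue", "double"),
        "orienting": median_rt("cue", "center") - median_rt("cue", "spatial"),
        # Higher values reflect lower efficiency for executive attention.
        "executive": median_rt("flanker", "incongruent") - median_rt("flanker", "congruent"),
    }

# Tiny made-up data set with one trial per cue condition.
example_trials = [
    {"cue": "none", "flanker": "congruent", "rt": 700},
    {"cue": "double", "flanker": "congruent", "rt": 640},
    {"cue": "center", "flanker": "incongruent", "rt": 760},
    {"cue": "spatial", "flanker": "incongruent", "rt": 720},
]
example_scores = network_scores(example_trials)
```

For this example the scores are 60 ms (alerting), 40 ms (orienting) and 70 ms (executive attention).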

Behavioral and Emotional Functioning
Behavioral and emotional functioning was measured with the Strengths and Difficulties Questionnaire (SDQ) (van Widenfelt et al., 2003), which contains five subscales: Emotional Problems, Hyperactivity, Conduct Problems, Peer Problems and Prosocial Behavior. Raw scores on these scales were used as outcome measures. Scores may range between 0 and 10, with higher scores reflecting more problems, except on the Prosocial Behavior subscale, where higher scores reflect more prosocial behavior.

Self-Perceived Competence
The Dutch translation of the Self-Perception Profile for Children (CBSK) (Veerman et al., 1997), was used to assess self-perceived competence using six scales: Scholastics, Social Acceptance, Athletics, Physical Appearance, Behavioral Conduct and Global Self-Worth. Raw scores on these scales were used as outcome measures. Scores may range between 6 and 24, with higher scores reflecting higher self-perceived competence.

Gaming at Baseline
Gaming was defined as playing games on any electronic device. At the baseline assessment, parents provided information on the number of hours per week their children spent gaming outside of school-hours.

Procedure
After written informed consent was obtained from parents and, if applicable, from children aged 12, children completed a baseline neurocognitive assessment including estimated IQ, efficiency of attention networks and self-perceived competence. Parents and teachers were asked to fill out a questionnaire on children's behavioral and emotional functioning (a full description is provided below). Assessments were part of a larger battery of measures administered to study effectiveness of the BGB EF training program. When children were randomized to either the EF training or placebo training arm, a home visit was made to install the BGB EF training program on the home computer or laptop and play the first session. When children were randomized to the waitlist arm, no home visit was made. To assess short-term and longer-term efficacy of the BGB EF training program, two follow-up visits were scheduled. The first follow-up assessment (T1) was approximately 2 weeks after the last training session (approximately 2 months after baseline assessment for children in the waitlist arm) and the second follow-up assessment (T2) was approximately 5 months after the first follow-up assessment. Gaming at baseline, demographic characteristics, medical characteristics of the neonatal period and estimated IQ were only assessed at baseline assessment. Attentional functioning, behavioral and emotional functioning and self-perceived competence were assessed at baseline assessment and both follow-up assessments.

Demographic Characteristics, Medical Characteristics of Neonatal Period and IQ
Parents provided information on demographics. Medical data from the neonatal period were obtained from medical records. To estimate IQ, a two subtest short-form (Vocabulary and Block Design) of the Dutch Wechsler Intelligence Scale for Children, Third Edition (WISC-III-NL, Sattler, 1992), was administered during the baseline assessment. Scaled scores for both the Vocabulary and Block Design subtests were computed. Subsequently the estimated full scale IQ equivalent for the sum of scaled scores of these two subtests was taken from the manual. Estimated IQ based on this short-form correlates highly with full scale IQ (r = 0.90) (Sattler, 1992).

Statistical Analyses
Sample size calculation was based on a repeated measures design with three time points. To be able to demonstrate a medium-sized intervention effect (Cohen's d = 0.5), assuming a within-subject correlation of 0.295 (taken from our BGB EF training pilot study in very preterm children; Aarnoudse-Moens et al., 2018), a power of 80% and a significance level of 0.05, 39 children in each intervention arm were needed (Twisk, 2013).
IBM SPSS Statistics version 25 was used for the statistical analyses (IBM, 2017). Outliers were winsorized at three standard deviations (SDs) (Ghosh and Vogt, 2012). For baseline assessment, first follow-up assessment and second follow-up assessment, 4.7, 11, and 10.1% of data was missing for the Child-ANT, respectively, 1.2, 1.4, and 5.8% of data was missing for the parent SDQ, respectively, 14.1, 44.7, and 52.2% of data was missing for the teacher SDQ respectively, and 3.5, 2.7, and 1.4% of data was missing for the CBSK, respectively. Missing data were not imputed.
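Winsorizing at three SDs can be sketched as below. The analyses themselves were run in SPSS; this Python version is only an illustration, and it assumes a single pass using the sample mean and sample SD, which the text does not specify. The function name and the example data are hypothetical.

```python
from statistics import mean, stdev

def winsorize_at_sd(values, n_sd=3.0):
    """Pull values beyond mean +/- n_sd sample SDs back to those limits.

    Assumption: limits come from the mean and sample SD of the
    uncorrected data, computed once.
    """
    center = mean(values)
    spread = stdev(values)
    lower, upper = center - n_sd * spread, center + n_sd * spread
    return [min(max(v, lower), upper) for v in values]

# Made-up scores: twenty values of 10 and one extreme value of 100.
raw_scores = [10] * 20 + [100]
winsorized_scores = winsorize_at_sd(raw_scores)
```

In this example only the extreme value changes; it is replaced by the upper limit (mean plus three SDs of the raw scores), while the twenty values of 10 are left as they are.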
Data were analyzed on an intention-to-treat basis. To assess whether attrition from the study was selective, children that did and did not complete all assessments were compared on all demographic and neonatal medical characteristics and all outcome measures with independent t-tests and chi-square tests. To assess whether demographic and neonatal medical baseline characteristics differed between the treatment arms, one-way analyses of variance (ANOVAs) and chi-square tests were performed. To assess whether children actually improved on the training tasks in the BGB EF training program, their start level was compared to their highest level for the inhibition task, cognitive flexibility task and each of five versions of the working memory task with paired t-tests.
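For readers unfamiliar with the paired t-test used to compare start level with highest level, the test statistic is the mean of the within-child differences divided by its standard error. The sketch below computes the statistic and its degrees of freedom only (the p-value lookup is left out); the function name and the example data are hypothetical, and the actual analyses were run in SPSS.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(start_levels, highest_levels):
    """Return (t, degrees of freedom) for paired start vs. highest levels."""
    differences = [high - start for start, high in zip(start_levels, highest_levels)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Made-up levels for four children.
t_value, df = paired_t_statistic([1, 2, 3, 4], [2, 4, 5, 8])
```

For these made-up data the differences are 1, 2, 2 and 4, giving a t-value of about 3.58 with 3 degrees of freedom.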
To assess whether there was a differential effect of treatment arm over time on attentional functioning, behavioral and emotional functioning, and self-perceived competence, linear mixed model analyses were run for all outcome measures with a random intercept to account for dependency in the data due to family bonds, and fixed factors for treatment arm, time and the interaction between treatment arm and time. To assess whether differential effects of treatment arm over time existed for younger and older children, linear mixed model analyses were performed on all outcome measures. A random intercept accounted for dependency in the data due to family bonds and the three-way interaction between treatment arm, time and age above or below 10.5 years was added as a fixed factor. To assess whether effects of BGB EF training program depend on time spent gaming before start of the training, the described linear mixed model analyses were also run with time spent gaming outside school-hours at baseline assessment added as a covariate. All available data was used in all linear mixed model analyses. In addition, we explored the association between gaming at baseline and baseline measurements of attentional, behavioral and emotional functioning and self-perceived competence, using Pearson r correlations.

Preliminary Analyses
Attrition analyses showed no differences on any of the demographic or neonatal medical characteristics nor on any of the outcome measures at baseline between children that did and did not complete all assessments (all t-values < 1.94, all χ²-values < 0.72, all p-values > 0.06). There were no significant differences on any of the baseline demographics or neonatal medical characteristics between the treatment arms, with one exception. There was a significant difference between the treatment arms for open ductus botalli that was treated with either medication or surgery [χ²(2) = 10.2, p = 0.006], with fewer children with a treated open ductus botalli in the EF training arm than in the placebo and waitlist arms. An open ductus botalli is very common in preterm neonates, with an incidence of 50% in infants born with a birthweight below 750 g and 37% in infants born with a birthweight between 750 and 1000 g (Dice and Bhatia, 2007). Treated open ductus botalli has not been found to be associated with neurodevelopmental outcomes (Chorne et al., 2007). See Table 1 for more detailed information on the demographic and neonatal characteristics of the three treatment groups at baseline. Assessments took place between October 2015 (first baseline measurement) and September 2018 (last second follow-up measurement). Mean number of weeks between baseline assessment and first follow-up assessment was 9.1 weeks (SD = 2.5) and mean number of weeks between baseline assessment and second follow-up assessment was 32.7 weeks (SD = 4.8). Mean number of months between first and second follow-up assessment was 5.5 months (SD = 0.8 months). There were no significant differences in time between baseline and first or second follow-up assessments between the three treatment arms [F(2,68) = 0.66, p = 0.52; F(2,60) = 2.0, p = 0.15, respectively].

Improvement During Training
For the inhibition training task, the cognitive flexibility training task and all five versions of the working memory task, significant improvements were found across the training sessions in the EF training arm. Performance increased significantly on all measures between children's start level and their highest achieved level, indicating that children actually improved on all training tasks over the course of the EF training program. See Table 2 for more details.

Effects of the EF Training Program on Attentional, Behavioral and Emotional Functioning and Self-Perceived Competence
There was no significant difference over time between the three treatment arms for efficiency of the orienting and executive attention networks. The difference over time between the three treatment arms for the alerting network approached significance [F(4,133) = 2.40, p = 0.053]. Post hoc mixed model analyses indicated larger improvement of alerting network efficiency in the waitlist arm than in the EF training arm between baseline and first follow-up assessment, but larger improvement in the EF training arm than in the waitlist arm between first and second follow-up assessment. There were significant main effects of time for efficiency of the executive network [F(2,139) = 9.34, p < 0.001] and the alerting network [F(2,133) = 7.51, p = 0.001], indicating that efficiency improved over time. See Table 3.
There was no significant difference over time between the three treatment arms for any of the subscales of the parent or teacher Strengths and Difficulties Questionnaire. There was a significant main effect of treatment arm for the teacher Peer Problems subscale, indicating fewer peer problems in the EF training arm than in the waitlist arm [F(2,77) = 3.65, p = 0.03]. See Table 4.
There was no significant difference over time between the three treatment arms for any of the subscales of the self-perceived competence questionnaire for children. There were significant main effects of time for self-perceived competence in Scholastics [F(2,144) = 6.04, p = 0.003] and Athletic Competence [F(2,142) = 3.42, p = 0.04], both suggesting improved self-perceived competence over time. The main effect of time for self-perceived Behavioral Conduct approached significance [F(2,145) = 2.95, p = 0.06], suggesting improved self-perceived competence over time. See Table 5.
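The main effects of time reported in this section all have 2 numerator degrees of freedom, for which the F-test p-value reduces to a closed-form expression. The sketch below recomputes those p-values from the reported F statistics and denominator degrees of freedom. It is a minimal illustration and not the analysis code of the study; the helper name `f_sf_num_df2` and the dictionary labels are assumptions introduced here.

```python
def f_sf_num_df2(f_value: float, den_df: float) -> float:
    """Upper-tail p-value for an F statistic with 2 numerator df.

    With 2 numerator degrees of freedom the F survival function reduces to
    P(F > f) = (1 + 2 * f / d2) ** (-d2 / 2), where d2 is the denominator df.
    """
    if f_value < 0 or den_df <= 0:
        raise ValueError("need f_value >= 0 and den_df > 0")
    return (1.0 + 2.0 * f_value / den_df) ** (-den_df / 2.0)


# (F statistic, denominator df) as reported for the main effects of time.
reported = {
    "executive network": (9.34, 139),
    "alerting network": (7.51, 133),
    "Scholastics": (6.04, 144),
    "Athletic Competence": (3.42, 142),
    "Behavioral Conduct": (2.95, 145),
}

for label, (f_value, den_df) in reported.items():
    p_value = f_sf_num_df2(f_value, den_df)
    print(f"{label}: F(2,{den_df}) = {f_value} -> p = {p_value:.4f}")
```

The closed form applies only when the numerator has 2 degrees of freedom, so it does not cover the interaction tests with 4 numerator degrees of freedom.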
Significant three-way interactions between treatment arm, time, and age (above or below 10.5 years) were found for the alerting and executive attention networks [F(17,121) = 1.89, p = 0.03; F(17,128) = 2.14, p = 0.009, respectively]. However, post hoc analyses did not indicate more improvement for children in the BGB EF training arm than for children in the placebo or waitlist arm, either for children above or for children below 10.5 years of age.

Effect of the EF Training Program, Corrected for Gaming
When hours spent gaming outside school hours were added to the mixed model analyses as a covariate, a significant interaction effect between treatment arm and time was found for efficiency of the alerting network [F(4,129) = 8.85, p = 0.03]. Post hoc mixed model analyses showed larger improvement of efficiency of the alerting network for the placebo training arm than for the EF training arm between baseline and first follow-up assessment. In addition, with time spent gaming in the model, a significant main effect of time was now found for the parent Emotional Symptoms scale of the SDQ [F(2,135) = 3.41, p = 0.04], suggesting fewer emotional problems over time. Furthermore, a significant main effect of time was found for self-perceived Behavioral Conduct [F(2,138) = 3.08, p = 0.049], indicating a reduction in behavioral problems over time. All other outcomes remained unchanged.

Associations Between Gaming and Baseline Attentional, Behavioral and Emotional Functioning and Self-Perceived Competence
Hours spent gaming outside school-hours was significantly and inversely related to scores on both parent and teacher rated Prosocial Behavior on the SDQ, indicating that the more hours children spent gaming outside of school-hours, the less prosocial behavior parents and teachers reported (r = −0.23, p = 0.04; r = −0.25, p = 0.04, respectively). Furthermore, hours spent gaming outside of school-hours was significantly and positively related to scores on parent rated Hyperactivity on the SDQ (r = 0.23, p = 0.04), indicating that the more hours children spent gaming outside of school-hours, the more hyperactive behavior they showed. There were no other significant associations between gaming and any of the other baseline measures.

DISCUSSION
This study examined the effects of a computerized, game-formatted EF training program (BGB EF training program) on attentional, behavioral and emotional functioning and self-perceived competence of very preterm children in a double-blind, placebo and waitlist-controlled randomized trial. We first analyzed whether or not the intervention group showed improvements on the working memory, cognitive flexibility and inhibition tasks they trained during 6 weeks. Significant training effects were indeed found. Despite this, results showed no positive effects of the BGB EF training program on any of the dependent measures.
In children with ADHD, promising effects of EF training programs on working memory were reported (Klingberg et al., 2005; Beck et al., 2010; Green et al., 2012; Hovik et al., 2013; Chacko et al., 2014; van der Oord et al., 2014; Dovis et al., 2015). However, in all of these studies, either a placebo or a waitlist-control group was included, but not both. Including a placebo condition makes it possible to disentangle specific and a-specific training effects, while including a waitlist-control group makes it possible to disentangle training effects (either specific or a-specific) from developmental effects and test-retest effects. In very preterm born adolescents aged 14-15 years, CWMT was shown to have positive effects on working memory and verbal learning (Lohaugen et al., 2011). However, that study again included only a non-intervention control group and no placebo control group, so the positive effects could reflect developmental or test-retest effects instead of effects of CWMT. Our results, without any beneficial effect of a computerized EF training program in very preterm children, are in line with the first randomized controlled trial on CWMT in very preterm children that did include a placebo control group, which reported no positive effects (Anderson et al., 2018). Literature on the effects of EF training programs is inconsistent at best and there is much debate on what effects EF training programs, including CWMT, actually have. Regarding the effects of working memory training on working memory performance, three meta-analyses have been performed, of which two conclude that EF training programs produce reliable improvements in both verbal and visuospatial working memory, with some evidence that the improvements in visuospatial working memory are maintained (Melby-Lervåg and Hulme, 2013; Aksayli et al., 2019).
However, the third presents theoretical arguments for why simple span tasks are not a good measure of working memory improvement following CWMT and concludes that some studies using complex span tasks do and some studies do not find working memory improvements following CWMT (Shipstead et al., 2012). Regarding the effects of working memory training on other, untrained functions, all three meta-analyses concluded that there was no evidence for generalization of working memory improvement to other domains (Shipstead et al., 2012; Melby-Lervåg and Hulme, 2013; Aksayli et al., 2019). Only one meta-analysis, performed by the research group involved in the development of CWMT (Spencer-Smith and Klingberg, 2016), concluded that CWMT has significant positive effects on inattention in daily life. However, comments on this study by Dovis et al. (2015a,b) have argued that these conclusions are controversial. In short, they state that: (1) there were coding errors in the initial meta-analysis, and after correction of these coding errors, effects of CWMT were no longer significant for several subgroup analyses, including for studies using an active or non-adaptive control group and for studies using a specific measure of inattention in daily life, (2) differences between CWMT and control groups were analyzed without taking into account pretest ratings of inattention, thus making it impossible to interpret which group benefits or improves most, or if there is any benefit or improvement at all and (3) with correction for publication bias, the overall effect of CWMT on inattention was no longer significant, and the reasons the authors of the meta-analysis provide for not correcting for publication bias are not supported by the literature.
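The reasoning above about placebo and waitlist arms, and about the need to take pretest scores into account, can be made concrete with a small arithmetic sketch. All numbers below are hypothetical and chosen only for illustration; they are not data from this trial, and the names `ArmMeans` and `decompose_change` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArmMeans:
    """Mean score of one treatment arm at baseline and at follow-up."""
    baseline: float
    follow_up: float

    @property
    def change(self) -> float:
        # Pre-to-post change, which takes the pretest score into account.
        return self.follow_up - self.baseline


def decompose_change(ef: ArmMeans, placebo: ArmMeans, waitlist: ArmMeans) -> dict:
    """Split the change in the EF arm into three additive parts."""
    return {
        # Change without any training: development and test-retest effects.
        "developmental_and_retest": waitlist.change,
        # Extra change from taking part in any training (a-specific effect).
        "nonspecific_training": placebo.change - waitlist.change,
        # Extra change attributable to the EF content itself (specific effect).
        "specific_training": ef.change - placebo.change,
    }


# Hypothetical arm means (higher = better); NOT taken from the study.
ef = ArmMeans(baseline=50.0, follow_up=58.0)
placebo = ArmMeans(baseline=52.0, follow_up=59.0)
waitlist = ArmMeans(baseline=49.0, follow_up=54.0)

parts = decompose_change(ef, placebo, waitlist)
post_only_difference = ef.follow_up - placebo.follow_up

print(parts)
print("post-only EF minus placebo:", post_only_difference)
```

In this hypothetical example the post-only comparison favors the placebo arm, whereas the change scores favor the EF arm, which illustrates why ignoring pretest ratings makes group differences hard to interpret. With only a waitlist arm, the specific and a-specific parts cannot be separated; with only a placebo arm, training effects cannot be separated from developmental and test-retest effects.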
The current study did not find positive effects of the BGB EF training program on attentional, behavioral and emotional functioning and self-perceived competence. Furthermore, meta-analyses have indicated no positive effects of the CWMT program for untrained functions. These results may be interpreted as game-based EF training being inadequate. However, as reported in the most recent meta-analysis on CWMT studies, this training induces moderate improvements in performance on memory tasks that are not included in the training or related to the trained tasks. This suggests that game-based EF training programs actually are able to improve working memory task performance, but that this improvement does not generalize to other functions. This could suggest that the game-based EF training programs need adjustments before they are capable of inducing generalization of the trained functions to untrained functions. It could also suggest that the associations between EF deficits and problems in attentional, behavioral and emotional functioning that are commonly found (Nadeau et al., 2001; Taylor et al., 2006; Mulder et al., 2010, 2011; de Kieviet et al., 2012; Loe et al., 2012; Aarnoudse-Moens et al., 2013; Alduncin et al., 2014) are very complex, and that improvements in EFs alone do not directly lead to improvements in attentional, behavioral and emotional functioning. Furthermore, there may be limits to the plasticity of the brain of very preterm children, which may influence the extent to which game-based EF training leads to improvements in trained and untrained functions. Last, very preterm birth does not just influence the development of the child itself, but also has an impact on family functioning and parents' functioning (Treyvaud, 2014) and subsequently parent-child interactions (Potharst et al., 2012).
In 5-year-olds, mothers of very preterm children were less supportive of their children's autonomy and interfered more often with their children's autonomy than mothers of term born children (Potharst et al., 2012). In the setting of game-based EF training, this may lead to more negative interactions with the child about planning or execution of the training sessions, which in turn could lead to children being less motivated about the training. This may have negatively affected the extent to which children profited from the training.
The current study included children with a wide age range, including both children and adolescents (ages 8 years up to and including 12 years). As adolescence is a time in which significant neural, cognitive, behavioral and emotional changes take place (Spear, 2000;Yurgelun-Todd, 2007;Casey et al., 2008), effects of the BGB EF training program may differ depending on the ages studied. However, our analyses involving three-way interactions between treatment arm, time and age (above or below 10.5 years) showed that for almost all outcome measures, there was no differential effect of treatment arm over time between children above and below 10.5 years of age. Furthermore, for the two outcome measures for which there was a significant three-way interaction, there were no indications that the BGB EF training induced more improvement in either children above or below 10.5 years of age when compared to the placebo training and waitlist arm.
The interaction effect for alerting network efficiency approached significance, and after time spent gaming before the intervention was taken into account, this interaction effect became significant. However, for both, post hoc analyses showed that these interaction effects were not indicative of larger improvements of alerting network efficiency in the EF training arm.
Significant improvements over time, regardless of treatment arm, were found for efficiency of the alerting and executive attention networks and for self-perceived competence in the domains of scholastics and athletics. After correction for time spent gaming before the intervention, there were also significant improvements over time for self-perceived behavioral conduct and parent-rated emotional symptoms. No negative changes over time were found. These improvements over time could be a sign of spontaneous recovery or regression to the mean. We also cannot exclude the explanation that this may be a Hawthorne effect, in which the effect of participating in research is reflected in a decrease in problems.
Our exploratory analyses revealed no large differences in outcomes of the analyses when these were adjusted for the time spent gaming outside school-hours. The small differences in outcomes when time spent gaming is adjusted for may suggest that prior exposure to gaming does not influence the degree to which an EF training program such as BGB may be effective.
Further analyses revealed that more time spent gaming outside school-hours at baseline assessment was associated with more parent-rated hyperactive behavior and less prosocial behavior according to both parents and teachers. Correlation obviously does not imply causation. Either way, our findings may suggest that if a computerized intervention is prescribed, it should be delivered in a healthy way, with an explanation to the child and parents that time restrictions must be observed. For example, the American Academy of Pediatrics recommends that children have 2 h or less of sedentary screen time daily and that media-free times with the family and media-free locations in homes should be designated (Council on Communications and Media, 2016).
Is there still a future for EF training programs, or should the focus shift to other promising interventions? The fact that improvements on the training tasks within the BGB EF training program took place, but no effects on the same EFs measured at follow-up assessments were found, suggests that improvement in the EFs was not just EF-specific, but also task-specific. From the skill learning field, it is known that transfer of learning from a trained task to even highly similar untrained tasks is generally the exception rather than the rule (Green and Bavelier, 2008). Training paradigms in which more general learning has been established are typically more complex and more ecologically valid, corresponding to real-life experiences (Green and Bavelier, 2008). One of the key factors in ensuring more general learning is variability in tasks and input (Green and Bavelier, 2008). In the BGB EF training program, only one EF is trained at a time and there is little correspondence to real-life experiences. For working memory, there is variability in task instructions and difficulty level, but not in the context in which the training task is performed or in what kind of working memory is trained (only visuospatial working memory, not verbal working memory). For inhibition and cognitive flexibility training, there is variability in difficulty level, but not in task instructions, context in which the training is performed or in the manner in which inhibition or cognitive flexibility is trained. Furthermore, for inhibition, only response inhibition is trained, while there are several other kinds of inhibition as well (Nigg, 2000). For CWMT, most of these arguments also apply; although several different working memory tasks are trained, there is little correspondence to real-life experiences and only one EF is trained at a time.
Before abandoning the field of EF training programs, more ecologically valid EF training programs should be investigated for their effectiveness in improving EF and for generalization of EF improvements to other functions such as attention. Focus could also shift to other promising interventions. Several activities seem to improve EFs in children in the general population, including traditional martial arts, aerobics, yoga, mindfulness, and several school curricula (Diamond, 2012). It has been suggested that especially interventions that address both EFs and children's emotional, social and character development are effective (Diamond, 2012). Furthermore, two meta-analyses have shown that acute and longitudinal physical activity has positive effects on EF, attention and academic performance in children in the general population (Verburgh et al., 2014; de Greeff et al., 2018). The interventions mentioned above have not yet been investigated in the very preterm population and thus should be the subject of further research.

Strengths and Limitations
Strengths of the current study are the incorporation of both a placebo training arm and a waitlist-control arm, the use of intention-to-treat analyses, the objective measure of attentional functioning (efficiency of attention networks), the comprehensive assessment of behavioral and emotional functioning by both parents and teachers, and the assessment of both direct and longer-term effects. A limitation is that we failed to achieve our calculated sample size; however, differences over time between groups were small and not clinically meaningful. Another limitation is the relatively high number of missing teacher SDQ questionnaires; however, as results on these measures are highly similar to results on the other outcome measures, we expect that a lower number of missing questionnaires would not have led to different results. As in other studies using questionnaires (Simons et al., 2019), the response rate on the CBCL in our study was low and possibly biased toward families of higher socio-economic status. Last, children with severe neonatal complications (IVH grade III or IV) were not excluded if they met inclusion criteria, which could have increased variability within the sample. However, sensitivity analyses including only children without severe neonatal complications were performed and results remained essentially unchanged.

CONCLUSION
A computerized, game-formatted EF training program does not improve performance measures of attention, parent- or teacher-rated behavioral and emotional functioning or self-perceived competence in very preterm children.

DATA AVAILABILITY
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
This study was approved by the Medical Ethical Committee of Amsterdam University Medical Centers and University Medical Center Utrecht. Written informed consent was obtained from the parents of all participants. Written informed consent was also obtained from participants aged >11 years.

AUTHOR CONTRIBUTIONS
CH contributed to the conceptualization, design, and methodology of the study, was responsible for the outcome assessments and data collection, carried out the data analyses and interpretation, and wrote the manuscript. CA-M contributed to the conceptualization and design of the study, funding acquisition, data analysis methodology and data interpretation, overall supervision, and reviewed and revised the manuscript. AW-L contributed to the conceptualization and design of the study, funding acquisition, data interpretation, overall supervision, and reviewed and revised the manuscript. AL and CK-E contributed to the resources (participants), supervision in one of the Medical Centers, and reviewed and revised the manuscript. AK contributed to the conceptualization, design, and methodology of the study, data interpretation, overall supervision, and reviewed and revised the manuscript. JO contributed to the conceptualization, design, and methodology of the study, funding acquisition, data analysis methodology and data interpretation, overall supervision, and reviewed and revised the manuscript.

FUNDING
Grants for this research were obtained from the Stichting Kinderpostzegels Nederland, Cornelia Stichting, Stichting Zabawas, and Stichting Elise Mathilde Fonds.