Sleep-Dependent Consolidation of Rewarded Behavior Is Diminished in Children with Attention Deficit Hyperactivity Disorder and a Comorbid Disorder of Social Behavior

Children suffering from attention-deficit hyperactivity disorder (ADHD) often also display impaired learning and memory. Previous research has documented aberrant reward processing in ADHD as well as impaired sleep-dependent consolidation of declarative memory. We investigated whether sleep also fosters the consolidation of behavior learned by probabilistic reward and whether ADHD patients with a comorbid disorder of social behavior show deficits in this memory domain, too. A group of 17 ADHD patients with comorbid disorders of social behavior aged 8–12 years and healthy controls matched for age, IQ, and handedness took part in the experiment. During the encoding task, children worked on a probabilistic learning task acquiring behavioral preferences for stimuli rewarded most often. After a 12-hr retention interval of either sleep at night or wakefulness during the day, a reversal task was presented where the contingencies were reversed. Consolidation of rewarded behavior is indicated by greater resistance to reversal learning. We found that healthy children consolidate rewarded behavior better during a night of sleep than during a day awake and that the sleep-dependent consolidation of rewarded behavior by trend correlates with non-REM sleep but not with REM sleep. In contrast, children with ADHD and comorbid disorders of social behavior do not show sleep-dependent consolidation of rewarded behavior. Moreover, their consolidation of rewarded behavior does not correlate with sleep. The results indicate that dysfunctional sleep in children suffering from ADHD and disorders of social behavior might be a crucial factor in the consolidation of behavior learned by reward.

However, a majority of children admitted for treatment in psychiatric institutions suffer from comorbid disorders, younger inpatients most often from conduct disorders (CD) and oppositional defiant disorders (ODD) (Yoshimasu et al., 2012). ADHD itself is a risk factor for school failure, and ADHD in combination with CD/ODD has an even worse prognosis (Kessler et al., 2014). Disturbed reward processing is a key neuropsychological feature in ADHD and, in addition to attention deficits, may be a crucial factor in school failure (Luman et al., 2010; Tripp and Wickens, 2012; Silvetti et al., 2013; Plichta and Scheres, 2014; Tomasi and Volkow, 2014). Moreover, impaired reward processing as indicated by aberrant prefrontal cortex activation is a predictor of the persistence of ADHD symptoms into adulthood (Wetterling et al., 2015). Here, we focus on the role of sleep-dependent consolidation of rewarded behavior in children suffering from ADHD. First, we summarize recent studies on sleep-dependent consolidation in ADHD, and then we highlight some studies on reward processing in ADHD.
It has been firmly established that sleep fosters the consolidation of declarative and procedural memory in adults (Diekelmann and Born, 2010; Rasch and Born, 2013) and declarative memory in children (Wilhelm et al., 2012). Several studies support the hypothesis that the prospect of reward can foster sleep-dependent consolidation of declarative memories in healthy adults (Tucker et al., 2011; van Dongen et al., 2012; Feld et al., 2014). Perogamvros and Schwartz (2012) suggest that reward-related memories become reactivated during REM sleep, which in turn strengthens their consolidation. Children suffering from ADHD show impairment in subjective and objective measures of sleep quality (Cortese et al., 2009), altered sleep microstructure (Ringli et al., 2013; Akinci et al., 2015), increased daytime sleepiness (Wiebe et al., 2013), and a higher rate of sleep disorders like restless legs or periodic limb movements (Kirov and Brand, 2014). The risk for behavioral sleep problems seems to be especially high in ADHD patients with comorbid internalizing or externalizing disorders (Lycett et al., 2015). Moreover, sleep disorders can cause ADHD-like symptoms (Fischman et al., 2015), and behavioral sleep interventions can help to alleviate symptoms in patients with ADHD. For example, a study by Keshavarzi and colleagues highlights the role of sleep in social behavior (Keshavarzi et al., 2014). The authors showed that in ADHD patients a sleep hygiene training, as compared to a control condition and to healthy controls, improved not only sleep, mood, and psychological functioning but also social relationships and social acceptance. In our previous studies, we found that children with ADHD show reduced sleep-associated consolidation of declarative memory (Prehn-Kristensen et al., 2011) and that the deficit is especially pronounced when emotionally relevant material is learned (Prehn-Kristensen et al., 2013).
On the other hand, increasing slow oscillations during slow-wave sleep by transcranial oscillating, direct-current stimulation can improve declarative memory consolidation in children with ADHD (Prehn-Kristensen et al., 2014).
While the focus of research on sleep-associated consolidation in children has been on declarative and procedural memory (Wilhelm et al., 2012), the research on learning and memory in ADHD has emphasized reward learning as a core deficit (Luman et al., 2010; Silvetti et al., 2013). Aberrant reward processing in ADHD is intimately linked to the dopamine hypofunction of the prefrontal cortex and striatum and can in part be normalized by dopaminergic medication like methylphenidate (MPH; Tripp and Wickens, 2012; Volkow et al., 2012; Tomasi and Volkow, 2014). Deficits include altered sensitivity to reward and/or punishment in probabilistic learning tasks (Luman et al., 2010; Maia and Frank, 2011). In patients suffering from ADHD and comorbid CD/ODD, the deficits in reward processing are more pronounced (Groen et al., 2013) or, with regard to reward learning, only prevalent in comorbid patients (Luman et al., 2015). In ADHD patients the acquisition of stimulus-response mappings by probabilistic feedback (Luman et al., 2015) and their reversal (Wetterling et al., 2015) seem to be intact, but the mechanisms of learning are abnormal (Frank et al., 2007; Wetterling et al., 2015). Moreover, ADHD patients perform better when continuous or frequent feedback is provided but worse when infrequent probabilistic feedback is provided (Luman et al., 2010). Based on the paradigms of Frank et al. (2007) and Swainson et al. (2000), we developed a probabilistic learning and reversal task appropriate for children using increasing frequencies of valid feedback. With reference to Perogamvros and Schwartz (2012), we hypothesize that sleep fixates the stimulus-response mapping previously learned by reward and punishment by reactivating dopaminergic pathways during REM sleep and thereby makes it resistant to remapping during reversal learning.
Here, we focus on severely affected ADHD patients who were treated as in-patients in our clinic. All of the patients suffered from a comorbid disorder of social behavior (CD or ODD). It has been argued that this group poses a diagnostic entity separate from pure ADHD (Banaschewski et al., 2003). These patients also pose a great challenge for the therapeutic and pedagogic team because their learning difficulties most often extend to learning from feedback. Since children with ADHD are known to show deficits in reward learning as well as alterations in sleep-associated memory consolidation, and sleep is assumed to support the consolidation of reward-associated memory, we expect that children with ADHD + CD/ODD show diminished sleep-dependent consolidation of rewarded behavior. To investigate reward learning we used a probabilistic, two-alternative forced-choice task and, after a retention interval of sleep or wakefulness, reversed the reward contingencies to measure consolidation of the previously learned contingencies. Healthy controls are expected to consolidate the previously learned behavior more strongly during sleep than during wake and hence show more resistance to relearning after sleep as compared to wake, whereas patients with ADHD + CD/ODD are expected to consolidate less, especially during sleep (hypothesis 1). Moreover, we expect a correlation between the amount of REM sleep and the amount of sleep-dependent consolidation in the healthy controls but not in the patients (hypothesis 2).

Participants
A sample of 34 children (17 ADHD + CD/ODD, 17 Controls) took part in the study. Healthy controls were recruited via advertisements in local journals. Patients were referred to our study from the Clinic of Child and Adolescent Psychiatry and Psychotherapy of the University of Kiel. All parents of the participants gave written informed consent. All children gave informed assent and they were reimbursed with a voucher for their participation. The study protocol was approved by the ethics committee of the medical faculty of the University of Kiel and followed the ethical standards of the Helsinki declaration.
According to DSM-IV-TR criteria (American Psychiatric Association [APA], 2013), all patients suffered from attention deficit hyperactivity disorder (15 patients with combined type, 314.01; two patients with predominantly inattentive type, 314.0). Furthermore, three patients suffered from comorbid CD (312.81) and 14 from oppositional defiant disorder (313.81). Two of the three patients suffering from CD also fulfilled the diagnostic criteria for an oppositional defiant disorder. One patient also suffered from enuresis (307.6). Six patients fulfilled the diagnostic criteria of a combined disorder of reading and written expression (315.0 and 315.2, ICD-10 F81.0) and five more patients had only subclinical symptoms. No further comorbidities were diagnosed. In all, 13 patients took MPH but discontinued medication 48 h (approximately 12 half-lives) prior to the experimental sessions. None of the control children suffered from any psychiatric disorder. To secure the diagnoses in the patients and to preclude any psychiatric disorders in the controls, all children and their parents were interviewed using a German translation of the Revised Schedule for Affective Disorders and Schizophrenia for School-Age Children: Present and Lifetime Version (K-SADS-PL) (Kaufman et al., 1997; Delmo et al., 2000). Furthermore, the Child Behavior Checklist (CBCL) (Achenbach, 1991) was filled out by the parents to assess any psychiatric symptoms of their children. The patients received significantly higher ratings on the internalizing problems scale (t(32) = 3.05, p = 0.005, d = 1.05; descriptive statistics are reported in Table 1) and especially on the externalizing problems scale (t(32) = 9.02, p < 0.001, d = 3.09).
Since ADHD with comorbid disorders of social behavior is more often diagnosed in boys than in girls (Nussbaum, 2012), only boys were included in the sample. The age of patients and healthy controls ranged from 8 to 12 years and did not differ between the groups (t(32) = 0.49, p = 0.629, d = 0.17; for descriptive statistics see Table 1). As assessed using the Edinburgh Handedness Inventory (Oldfield, 1971), 15 of the patients and 16 of the healthy controls were right-handed. All participants had normal or corrected-to-normal vision. Candidates were excluded from the study if they had: (1) below average intelligence with an IQ < 85, as measured by the Culture Fair Intelligence Test Revised Version (CFT-20-R) (Weiß, 2006), (2) significant memory impairment as measured by the Diagnosticum für Cerebralschädigung (DCS) with scores below the 16th percentile (Lamberti and Weidlich, 1999), (3) advanced puberty as measured by the Pubertal Development Scale (PDS; total score >7) (Watzlawik, 2009), (4) any medical condition or impairment that would interfere with the ability to participate in the study as assessed by interview, or (5) any sleep disorders. We screened for sleep disturbances using the parent-reported Children's Sleep Habits Questionnaire (CSHQ) (Owens et al., 2000; Schlarb, 2011) and the children's Sleep Self-Report questionnaire (SSR) (Schwerdtle et al., 2010). Moreover, the polysomnograms of the adaptation nights were examined for symptoms of sleep disorders by a trained sleep lab technician. No abnormal sleep patterns or sleep disorders were detected. The demographic and clinical characteristics of the remaining 34 participants are reported in Table 1.
Patients and healthy controls were comparable with respect to IQ (t(32) = -1.54, p = 0.134, d = -0.53) and pubertal state (t(32) = 1.12, p = 0.270, d = 0.39), but the patients showed lower memory performance in the DCS (t(32) = -2.13, p = 0.041, d = -0.73) and worse sleep quality reported in the SSR (t(32) = 2.47, p = 0.019, d = 0.846) and the CSHQ (t(32) = 3.21, p = 0.003, d = 1.10).

Probabilistic Learning and Reversal Task
In the so-called "pirate game" the children are asked to explore treasure islands (see Figure 1). Two equivalent versions of the pirate game with different stimulus sets were programmed in Presentation® software (Version 14.9, Neurobehavioral Systems Inc.) and the versions were approximately counterbalanced over conditions and order. In each trial, two pictures of islands are presented and the child has to decide which island to explore. Participants indicate their choice by pressing the corresponding mouse buttons. If the "correct" island is chosen, the picture of the island is replaced by a picture of a treasure, a sound of children cheering "yeah" is played, and the treasure counter is colored green and incremented by one (reward). If the "wrong" island is chosen, the island is replaced by a jolly roger, a disappointed voice uttering "ohhh" is played, and the treasure counter is colored red and decremented by one (punishment). During a block of 33 trials, the same islands are repeatedly shown on the left or right side of the monitor in pseudorandom, counterbalanced order. The participants were instructed to learn by trial and error to approach the island on which a treasure is hidden more often and to avoid the island which is inhabited by pirates more often. Probabilistic feedback was provided according to a reinforcement schedule with increasingly valid feedback: In the first third of the trials, the target island was correct with a frequency of 7/11 (≈63.6%) and wrong with a frequency of 4/11 (≈36.4%). In the second third the reward frequency was increased to 8/11 (≈72.7%) and the punishment frequency decreased to 3/11 (≈27.3%). Finally, in the last third, the reward frequency reached 9/11 (≈81.8%) and the punishment frequency 2/11 (≈18.2%).
This schedule was chosen to allow the assessment of performance differences between the groups in the beginning of each block while ensuring that both groups encode the "correct" island equally well until the end of the block (compare results section).
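The reinforcement schedule described above can be illustrated with a short Python sketch. It is not the original Presentation script: the function names, the seeding, and the assumption that valid and invalid feedback trials are shuffled randomly within each third are hypothetical, since the text only specifies the frequencies 7/11, 8/11, and 9/11.

```python
import random

def feedback_schedule(valid_per_third=(7, 8, 9), third_len=11, seed=None):
    """Return one boolean per trial; True means the target island is rewarded.

    Assumption: the order of valid/invalid trials within a third is random.
    """
    rng = random.Random(seed)
    schedule = []
    for n_valid in valid_per_third:
        third = [True] * n_valid + [False] * (third_len - n_valid)
        rng.shuffle(third)
        schedule.extend(third)
    return schedule

def former_target_outcomes(schedule):
    """Outcomes for the formerly 'correct' island once contingencies are reversed."""
    return [not rewarded for rewarded in schedule]

block = feedback_schedule(seed=1)
print(len(block), sum(block[:11]), sum(block[11:22]), sum(block[22:]))
```

Under these assumptions, a block always has 33 trials, and after reversal the formerly "correct" island is rewarded on only 4, 3, and 2 of the 11 trials in the three successive thirds.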
During the encoding session before the retention interval, the children played three blocks of 33 trials (encoding). In each block, a unique set of island pictures was used. After the retention interval containing sleep or wake, the children played three blocks with entirely new picture sets (learning) to obtain a control measure for the influence of sleep vs. wake on learning performance. The crucial reversal learning block was identical to one of the encoding blocks using the same islands and the same reinforcement frequencies (reversal). However, the reinforcement schedule was reversed: now, formerly "correct" islands were "wrong" and vice versa. The rationale behind the reversal learning block was to test whether sleep helps to consolidate the stimulus-response mapping and therefore makes it harder to reverse it during reversal learning. In other words, participants are expected to persist in preferring the formerly "correct" island as an indicator of the consolidation of rewarded behavior.
FIGURE 1 | Probabilistic learning and reversal task ("pirate game"). In each trial, two pictures of islands are presented and the participant has to decide which island to explore. If the "correct" island is chosen, the picture of the island is replaced by a picture of a treasure, a sound of children cheering "yeah" is played, and the treasure counter is colored green and incremented by one (reward). If the "wrong" island is chosen, the island is replaced by a jolly roger, a disappointed voice uttering "ohhh" is played, and the treasure counter is colored red and decremented by one (punishment). The participants were instructed to learn by trial and error to approach the island on which a treasure is hidden more often and to avoid the island which is inhabited by pirates more often. The pictures above are merely symbolic. The actual pictures were color photos sampled from the internet.

Sleep Recording
All participants spent an adaptation night and a test night in the sleep laboratory separated by at least one night for recovery from potential sleep loss during the first night. The adaptation night's purpose was to exclude severe sleep disorders and to help participants adapt to the conditions in the sleep laboratory. During both nights sleep was recorded by standard procedures using a digital electroencephalogram (EEG), electromyogram (EMG) and electrooculogram (EOG). To amplify and record the data, a SOMNOscreen PSG plus (SOMNOmedics, Randersacker, Germany) was used. The EEG was recorded at a sampling rate of 128 Hz with a band-pass filter of 0.2–35 Hz using multi-use Ag/AgCl-electrodes attached to the positions C3 and C4 according to the 10–20 system referenced to an electrode on the bridge of the nose and with a ground placed at Fpz. A diagonal EOG was recorded at a sampling rate of 128 Hz with a band-pass filter of 0.2–75 Hz using single-use Ag/AgCl-electrodes attached to the lower right and upper left canthi. A bipolar EMG was recorded at a sampling rate of 256 Hz with a band-pass filter of 0.2–128 Hz using single-use Ag/AgCl-electrodes attached to the chin. Only during the adaptation night did we additionally record an EMG from the anterior tibial muscles, a bipolar electrocardiogram, nasal air flow using a thermistor, and respiratory thorax excursions using a belt sensor. All sleep data were visually scored according to the criteria by Rechtschaffen and Kales (1968) by a certified rater unaware of the hypotheses. The following macro-sleep parameters were obtained: sleep stages 1 to 4 and REM sleep (in min), time in bed (in min), lights off, lights on, sleep-onset latency (time in min from lights off to first epoch of sleep stage 2), total sleep time (in min), sleep efficiency (ratio of total sleep time to time in bed in percent), number of awakenings, and duration of wakefulness after sleep onset (in min).
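As an illustration of how the macro-sleep parameters defined above relate to a scored hypnogram, the following Python sketch derives them from a list of stage labels. The 30-s epoch length, the label names, and the function name are assumptions made for the example; the text does not state them, and the study's scoring was done visually by a rater.

```python
def sleep_parameters(hypnogram, epoch_sec=30):
    """Derive macro-sleep parameters from stage labels scored per epoch.

    hypnogram: labels from lights off to lights on, each one of
    'W', 'S1', 'S2', 'S3', 'S4', 'REM' (hypothetical label names).
    epoch_sec: epoch length in seconds (assumed, not given in the text).
    """
    epoch_min = epoch_sec / 60
    sleep_stages = {"S1", "S2", "S3", "S4", "REM"}
    onset = hypnogram.index("S2")  # first epoch of stage 2; ValueError if none
    time_in_bed = len(hypnogram) * epoch_min
    total_sleep = sum(stage in sleep_stages for stage in hypnogram) * epoch_min
    after_onset = hypnogram[onset:]
    # An awakening is counted as a transition from any sleep stage to wake.
    awakenings = sum(
        prev in sleep_stages and cur == "W"
        for prev, cur in zip(after_onset, after_onset[1:])
    )
    return {
        "time_in_bed_min": time_in_bed,
        "total_sleep_time_min": total_sleep,
        "sleep_onset_latency_min": onset * epoch_min,
        "sleep_efficiency_pct": 100 * total_sleep / time_in_bed,
        "awakenings": awakenings,
        "wake_after_sleep_onset_min": after_onset.count("W") * epoch_min,
    }

example = ["W"] * 4 + ["S1"] * 2 + ["S2"] * 10 + ["W"] * 2 + ["REM"] * 2
print(sleep_parameters(example))
```

Sleep-onset latency here follows the definition in the text (lights off to the first stage 2 epoch); how awakenings were counted in the study is not specified, so the transition rule above is only one plausible choice.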
To control for effects of sleep on wakefulness and mood, the participants kept sleep logs, rated their wakefulness on a visual analog scale (ranging from 0 to 100) and their mood on the valence and arousal scales of the self-assessment manikin (SAM) (Bradley and Lang, 1994) in the morning and in the evening in the sleep as well as the wake condition. Furthermore, before every encoding or reversal session, the alertness test from the test battery of attentional performance for children (KiTAP) (Zimmermann et al., 2002) was administered.

Procedure
The participants took part in two diagnostic sessions and two experimental conditions, each comprising two sessions with an interposed delay of 12 h. During the first diagnostic session, the children and their parents were interviewed independently by trained psychologists, and tests and questionnaires were administered. The second diagnostic session was the adaptation night and took place at least two days before the experimental sleep condition. In the experimental wake condition the children arrived at the laboratory at 8:00 a.m., filled out the sleep log and rating scales, completed the alertness test, and played three blocks (i.e., pairs of islands) of the pirate game (encoding). The participants were instructed not to sleep during the day and to attend to their usual daily routines. Twelve hours later (8:00 p.m.) the children came back to the laboratory, filled out the log and rating scales again, performed the alertness test, and played three new blocks of the pirate game (learning) as well as one block with reversed contingencies (reversal).
In the sleep condition, the children arrived at the laboratory at 8 p.m. on the day of the experimental night and worked on the logs, scales, test and pirate game as described above. The electrodes for the PSG were attached around 9 p.m., lights off was at 9:30 p.m. and lights on at 7:00 a.m. Again, the second part of the testing took place 12 h after encoding. The sleep and the wake conditions were at least two weeks apart and the conditions and parallelized stimulus sets were approximately counterbalanced over groups.

Data Processing and Statistical Analysis
First, we describe the preprocessing of the behavioral data from the probabilistic learning and reversal task. Second, we delineate the inference statistics and how we dealt with the control variables.
In the probabilistic learning paradigm, we counted the choices of the "correct" target stimulus which was followed by reward more often. In a few trials, some participants clicked the mouse multiple times so that responses were carried over to the next trial. These responses, indicated by reaction times shorter than 100 ms in the following trial, were deleted and replaced with a random choice. The patients showed significantly more multiple button presses than healthy controls (t(16.8) = 2.19, p = 0.043, d = 0.75). However, multiple reactions were very scarce, amounting to only 0.65% ± 0.27% (mean ± SEM) in the patients and 0.06% ± 0.04% in the healthy controls. The resulting learning curves were then filtered with a two-way moving average with a window size of five to reduce random noise in the choice data (Smith, 1997). To measure encoding success, we calculated the relative frequency of correct choices during the last thirds of the three encoding blocks (see Figure 2). To obtain a measure for the consolidation of the stimulus-response mappings, we calculated the difference between the relative frequency of correct responses during the first five trials of the reversal learning block and the encoding success (reversal − encoding; see Figure 2). To control for influences of time of day on learning performance per se, we calculated the correct responses during the first five trials of the three new learning blocks after the retention interval.
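The preprocessing steps and the two behavioral measures can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the moving average is assumed to be centered with a window that shrinks at the block edges, and choices are assumed to be coded 1 (correct under the contingencies in force during that block) or 0.

```python
import random

def replace_carryover(choices, rts_ms, rng, min_rt_ms=100):
    """Replace responses with reaction times below min_rt_ms by a random choice."""
    return [rng.choice([0, 1]) if rt < min_rt_ms else c
            for c, rt in zip(choices, rts_ms)]

def moving_average(values, window=5):
    """Centered moving average; the window is truncated at the edges (assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        segment = values[max(0, i - half):i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed

def encoding_success(encoding_blocks):
    """Relative frequency of correct choices in the last third of each block."""
    last_thirds = [v for block in encoding_blocks
                   for v in block[-(len(block) // 3):]]
    return sum(last_thirds) / len(last_thirds)

def consolidation_score(reversal_block, encoding_blocks, n_first=5):
    """Reversal minus encoding; more negative values mean stronger persistence."""
    first = reversal_block[:n_first]
    return sum(first) / len(first) - encoding_success(encoding_blocks)

rng = random.Random(0)
raw = replace_carryover([1, 0, 1, 1, 1], [450, 80, 390, 510, 420], rng)
print(moving_average(raw))
```

The functions accept either raw or smoothed curves, so the filter can be applied before the measures are computed, as the order of steps in the text suggests.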
To check whether the participants learned to prefer the island associated with reward more often during encoding, we computed t-tests comparing the performance during the last thirds of the three encoding blocks against guessing frequency (0.5). To evaluate whether sleep affected the consolidation of learned behavior differently in the ADHD + CD/ODD patients compared to healthy controls, we calculated an ANOVA using SLEEP (sleep vs. wake) and LEARNING (encoding vs. reversal) as within-subject factors and GROUP (ADHD vs. control) as a between-subject factor. Significant effects were resolved using Bonferroni-adjusted post hoc contrasts. Also, the difference between the performance during encoding and reversal in the sleep condition as a measure of sleep-dependent memory retention was correlated with the amounts of REM sleep and non-REM sleep. To control for differences in learning performance, we computed an ANOVA of the control measure learning using SLEEP as within and GROUP as between factors. To test whether the manipulation had any effect on subjective wakefulness, valence, arousal or objective alertness and learning performance, we computed ANOVAs with TIME (morning vs. evening) as a within-subject and GROUP (ADHD vs. control) as a between-subject factor with post hoc contrasts. In the case of a significant main effect of GROUP or a significant interaction of TIME and GROUP, the main analysis of the data from the probabilistic learning and reversal task was repeated using the respective control variables as covariates. To compare the groups regarding sleep parameters and questionnaire data, t-tests were used. Descriptives are reported as mean ± standard error of the mean (SEM). For all t-tests, Cohen's d was computed as a measure of effect size. Pearson's correlation coefficients and partial η² from the analyses of variance were converted to d values according to Cohen (1988) and Rosenthal (1994).
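The effect-size conversions mentioned above can be made concrete with a few textbook formulas. The sketch below is an assumption about which formulas were applied, since the text only cites Cohen (1988) and Rosenthal (1994); the function names are hypothetical. These formulas do, however, reproduce several reported values, for example d ≈ 3.10 for the main effect of LEARNING (F = 76.90 with 1 and 32 degrees of freedom) and d ≈ 0.98 for r = 0.441.

```python
import math

def d_from_one_sample_t(t, n):
    """Cohen's d for a one-sample t-test with n participants."""
    return t / math.sqrt(n)

def d_from_independent_t(t, n1, n2):
    """Cohen's d for an independent-samples t-test with group sizes n1 and n2."""
    return t * math.sqrt(1 / n1 + 1 / n2)

def d_from_r(r):
    """Convert a Pearson correlation coefficient to d."""
    return 2 * r / math.sqrt(1 - r ** 2)

def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared recovered from an F value and its degrees of freedom."""
    return f * df_effect / (f * df_effect + df_error)

def d_from_partial_eta_squared(eta2):
    """Convert partial eta squared to d."""
    return 2 * math.sqrt(eta2 / (1 - eta2))

eta2 = partial_eta_squared(76.90, 1, 32)
print(round(d_from_partial_eta_squared(eta2), 2))
print(round(d_from_r(0.441), 2))
```

For an effect with one numerator degree of freedom, the last two functions combine to d = 2·sqrt(F / df_error), which is a quick way to cross-check the d values reported alongside the F statistics.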
FIGURE 2 | Sleep-dependent consolidation of rewarded behavior. The line graphics on the left side depict the relative frequencies of correct choices of the target islands. Encoding refers to the last thirds of the learning blocks prior to the retention intervals. After the retention intervals, the reinforcement schedule is reversed and the formerly "correct" islands become "wrong" and vice versa. Reversal refers to the first five trials of the reversal block after the retention interval. The bar graph on the right side depicts the difference of reversal and encoding. Larger negative values indicate a consolidation of the learned preferences, i.e., consolidation of rewarded behavior. The bars represent means ± standard error of means. ADHD, attention deficit hyperactivity disorder.

Probabilistic Learning and Reversal
All participants learned to prefer the island associated with reward more often during encoding. The relative frequencies of correct choices during the last thirds of the encoding blocks were significantly higher than the guessing frequency 0.5 in healthy controls (prior to wake: t(16) = 10.15, p < 0.001, d = 2.46; prior to sleep: t(16) = 8.59, p < 0.001, d = 2.08) as well as in patients (prior to wake: t(16) = 10.54, p < 0.001, d = 2.56; prior to sleep: t(16) = 9.56, p < 0.001, d = 2.32; also see Table 3 and Figure 2). Please note that on a descriptive level all values of individual participants were greater than the guessing frequency 0.5. This confirms that all participants encoded the rewarded behavior, namely the preference for the correct island.
The performance was significantly higher at the end of the encoding blocks as compared to the beginning of the reversal block (main effect of LEARNING, F(1,32) = 76.90, p < 0.001, d = 3.10; also see Table 3 and Figure 2). This illustrates that the previously learned preferences were retained in memory and had to be relearned during reversal. There were no main effects of GROUP (F(1,32) = 0.68, p = 0.417, d = 0.29) or SLEEP (F(1,32) = 1.61, p = 0.214, d = 0.49) on the performance as well as no interaction effects of LEARNING and SLEEP (F(1,32) = 0.48, p = 0.495, d = 0.24) or LEARNING and GROUP (F(1,32) = 1.12, p = 0.298, d = 0.37). However, there was a significant interaction effect of SLEEP and GROUP (F(1,32) = 4.39, p = 0.044, d = 0.74) which was qualified by a significant three-way interaction of SLEEP, LEARNING and GROUP (F(1,32) = 5.28, p = 0.028, d = 0.81). Comparing the performance of encoding and reversal by post hoc contrasts, we found that performance of the control participants dropped under the wake (p = 0.027, d = 0.74) as well as under the sleep condition (p < 0.001, d = 1.42). In patients performance significantly dropped in the wake (p < 0.001, d = 1.12) and, by trend, in the sleep condition (p = 0.094, d = 0.62). Again, this confirms that in general contingencies were retained in memory in both groups under both conditions. However, the drop of performance did not differ between sleep and wake in the patients (p = 0.264, d = 0.28) but was stronger during sleep than during wake in the control participants (p = 0.042, d = -0.51), and this double difference was significant (p = 0.028, d = 0.79, see Figure 2). Hence, only the control participants showed sleep-dependent consolidation of rewarded behavior (hypothesis 1). Note that learning performance per se did not differ between conditions or groups. The ANOVA of the additional control measure learning revealed no main effect of SLEEP (F(1,32) = 0.01, p = 0.930, d = 0.03), no main effect of GROUP (F(1,32) = 1.47, p = 0.234, d = 0.43), and no interaction of SLEEP and GROUP (F(1,32) = 1.41, p = 0.243, d = 0.42). This also illustrates that time of day did not influence learning performance as this would have produced a main effect of condition (SLEEP) in this control measure. Furthermore, our data preprocessing did not bias the results: The interaction effect reported above would remain significant (p = 0.026, d = 0.86) if the two outlier-patients with the highest rate of multiple reactions were excluded from the analysis. Filtering slightly improved the significance of the interaction effect (p = 0.044 to p = 0.028; d = 0.74 to d = 0.81) but did not change the results.
Table note: ADHD, attention deficit hyperactivity disorder; SEM, standard error of the mean; S1–S4 and REM sleep rated according to the criteria of Rechtschaffen and Kales (1968).
Table note: Relative frequencies of correct choices of the target islands are reported. Encoding refers to the learning blocks prior to the retention intervals. After the retention intervals, during reversal learning the reinforcement schedule is reversed and the formerly "correct" islands become "wrong" and vice versa. ADHD, attention deficit hyperactivity disorder; SEM, standard error of mean; + p = 0.055; * p < 0.05; *** p < 0.001.
It was assumed that REM sleep fosters the consolidation of rewarded behavior in healthy controls (Perogamvros and Schwartz, 2012). Therefore, REM sleep was expected to correlate with the magnitude of the drop in performance during the night, whereas non-REM sleep should not (hypothesis 2). In ADHD there were only non-significant correlations of the performance drop with REM sleep (r = 0.322, n = 17, p = 0.207, d = 0.68) and non-REM sleep (r = 0.138, n = 17, p = 0.597, d = 0.28), and the difference between these correlations was not significant either (z = 0.640, p = 0.261). In contrast, the control participants showed a significantly higher (z = -1.819, p = 0.034) correlation of performance drop with non-REM sleep (r = 0.441, n = 17, p = 0.076, d = 0.98) than with REM sleep (r = -0.233, n = 17, p = 0.368, d = -0.48). Finally, we analyzed REM latency and REM density but did not find any significant correlations with the consolidation of rewarded behavior in healthy controls (all p > 0.276, all |d| < 0.58) or in patients (all p > 0.215, all |d| < 0.61).

Control Variables
To exclude the possible interpretation that the diminished sleep-dependent consolidation of rewarded behavior is caused by overall psychopathology, we repeated the analysis reported above using LEARNING and SLEEP as within-subject factors, GROUP as a between-subject factor and the overall score of the CBCL as a covariate. The three-way interaction reported above remained significant (p = 0.015, d = 0.94). The same is true when we use the score of the nonverbal memory test DCS as a covariate (p = 0.038, d = 0.78).
Furthermore, we controlled for effects of TIME of day on alertness and subjective valence, arousal and wakefulness. The TIME of day (F(1,32) = 6.90, p = 0.013, d = 0.93) and GROUP (F(1,32) = 6.12, p = 0.019, d = 0.87) had an impact on the reaction times in the alertness test and there was a trend toward an interaction (F(1,32) = 3.39, p = 0.075, d = 0.65). Post hoc contrasts revealed that patients were slower than controls in the mornings (p = 0.046, d = -0.71) and in the evenings (p = 0.014, d = -0.90; descriptive data not shown). Furthermore, patients slowed down during the day (p = 0.003, d = -0.62) as opposed to the controls (p = 0.582, d = -0.20). Therefore, we used the performance in the alertness test in the morning and evening as covariates in an ANCOVA of the performance in the probabilistic learning and reversal task. However, the interaction effect reported above remained stable (F(1,30) = 5.87, p = 0.022, d = 0.89).
Neither TIME of day nor GROUP had any influence on the valence (all p > 0.159, all d < 0.51) or arousal ratings (all p > 0.139, all d < 0.54). However, the analysis of the subjective wakefulness revealed a significant interaction of TIME of day and GROUP (F(1,32) = 9.60, p = 0.004, d = 1.10) but no main effects (all p > 0.270, all d < 0.40). Post hoc contrasts showed that the groups' wakefulness did not differ in the mornings (p = 0.123, d = -0.54) but did differ in the evenings (p = 0.047, d = 0.71) and that the patients' wakefulness declined during the day (p = 0.005, d = -0.71) as opposed to the controls' wakefulness (p = 0.172, d = 0.35). Again, we used the subjective wakefulness in the mornings and evenings as covariates in an ANCOVA of the performance in the probabilistic learning and reversal task. The interaction effect reported above was no longer significant (F(1,30) = 2.50, p = 0.125, d = 0.58), but on a descriptive level, the adjusted means pointed in the same direction as reported above.

DISCUSSION
It is important to shed more light on the learning difficulties of children suffering from ADHD, especially those with the worst prognosis, who are affected by comorbid disorders of social behavior. In our study, we focused on the consolidation of behavior learned by reward and the role of sleep in the consolidation process. We found that typically developing control children consolidated rewarded behavior better during a night of sleep than during a day awake and that the sleep-dependent consolidation of rewarded behavior by trend correlated with non-REM sleep but not with REM sleep. In contrast, children with ADHD and comorbid disorders of social behavior did not show sleep-dependent consolidation of rewarded behavior (hypothesis 1). Moreover, their retention of rewarded behavior over sleep did not correlate with sleep, especially not with REM sleep (hypothesis 2).
Our results extend our previous studies, in which we showed that sleep promotes the consolidation of declarative memory in healthy children but not in children with ADHD (Prehn-Kristensen et al., 2011) and that this deficit is especially pronounced for emotional declarative memory (Prehn-Kristensen et al., 2013). The current study extends this conclusion to the domain of reward learning: Sleep seems to foster the consolidation of rewarded behavior in healthy children but not in children suffering from ADHD + CD/ODD (hypothesis 1). We did not find any main effects of group on learning performance or resistance to reversal learning. This parallels another study using instrumental learning with probabilistic feedback where patients with ADHD showed normal learning curves (Luman et al., 2015). Furthermore, a study using probabilistic reversal learning also did not find differences between ADHD patients and controls during reversal, but altered activation of the frontostriatal reward network in ADHD patients predicted whether the symptoms of ADHD persisted (Wetterling et al., 2015). This highlights the necessity of looking at the functionality of brain activity during the consolidation of rewarded behavior in ADHD. In our study, we made a first step by showing that sleep-dependent consolidation of rewarded behavior is diminished in ADHD.
It could be argued that patients with ADHD typically show less motivation and/or a greater tendency to perseverate than healthy controls and that this could produce slower relearning during reversal. However, several controls are implemented in the design of our study: A lack of motivation would have caused a main effect of the factor GROUP on encoding performance, but the groups did not differ in performance either during encoding or during the additional encoding control task. Moreover, the main result of our study is a three-way interaction of the between-subject factor GROUP and the within-subject factors SLEEP (wake vs. sleep) and LEARNING (encoding vs. reversal). We calculated several ANOVAs and ANCOVAs to demonstrate the robustness of this result. If any distinctive feature of the patient group, like a lack of motivation or a tendency to perseverate, had influenced the performance in our paradigm, it would have produced a main effect of GROUP or an interaction of GROUP and LEARNING but not a three-way interaction of GROUP, LEARNING, and SLEEP. Therefore we assume that dysfunctional sleep caused the three-way interaction, supporting our claim that sleep fosters the consolidation of rewarded behavior in healthy children but not in children suffering from ADHD + CD/ODD.
According to the reward activation model by Perogamvros and Schwartz (2012), the ventral tegmental area is activated predominantly during REM sleep and replays neural burst firing patterns associated with reward processing, thereby fostering memory consolidation. In essence, we would have expected a strong correlation between the amount of REM sleep and the consolidation of rewarded behavior (hypothesis 2). Contrary to expectations, the consolidation of rewarded behavior showed a stronger correlation with non-REM sleep than with REM sleep only in healthy children. Furthermore, the consolidation of rewarded behavior did not correlate with REM latency or REM density. Therefore, we cannot confirm the reward activation model. Instead, the correlation of the consolidation of rewarded behavior with non-REM sleep may be attributed to explicit aspects of the task. Patients suffering from anterograde episodic amnesia can still implicitly learn using probabilistic reward, but it has also been shown that explicit knowledge of the task structure facilitates learning (Speekenbrink et al., 2008). Furthermore, it has been firmly established that non-REM sleep, especially slow wave sleep, fosters the consolidation of declarative memory (Diekelmann and Born, 2010). Therefore, explicit aspects of the task might have been consolidated during non-REM sleep in healthy children.
Although the macroscopic sleep parameters investigated in this study did not differ between ADHD + CD/ODD patients and controls, the functional role of sleep in the consolidation of rewarded behavior seems to be impaired in the patients. This is in accord with our previous research which did not show differences in macroscopic sleep parameters between ADHD patients and controls either (Prehn-Kristensen et al., 2013). However, in a study utilizing transcranial oscillatory direct current stimulation during sleep, we were able to experimentally increase slow oscillation power during S4 in children with ADHD which was accompanied by a normalization of sleep-dependent consolidation of declarative memory (Prehn-Kristensen et al., 2014). Since in the present study we found a trend toward a correlation between sleep-dependent consolidation of rewarded behavior and non-REM sleep in healthy children, it seems likely that parameters like slow oscillations or sleep spindles might be involved. This would match a recent study, in which fast sleep spindles and delta power during non-REM sleep were shown to help in the development of procedural skills (Fogel et al., 2015). In future studies the amount of REM sleep and slow-wave sleep in children suffering from ADHD + CD/ODD should be manipulated, e.g., using the split-night paradigm (Yaroush et al., 1971), to further assess the role of REM sleep and non-REM sleep as well as their accompanying waveforms on the consolidation of rewarded behavior.
A limitation of our study results from the comorbidity. It has been argued that patients suffering from ADHD and a disorder of social behavior constitute a diagnostic entity separate from patients with ADHD but without a disorder of social behavior (Banaschewski et al., 2003). This was also reflected in the ICD-10 diagnosis hyperkinetic CD (F90.1). Therefore our results cannot be extended to all presentations of ADHD as defined by the DSM-IV-TR or the DSM-V. However, previous studies also found diminished sleep-dependent consolidation of declarative memories in ADHD patients without CD/ODD (Wilhelm et al., 2012). In the present study, we focused on severely affected ADHD patients with CD/ODD because they pose a great challenge for behavior therapists due to their difficulties in learning from feedback. Future studies should investigate whether sleep-dependent consolidation of rewarded behavior is also diminished in milder forms of pure ADHD.
Further limitations of the study could arise from circadian influences on learning performance, alertness, mood (valence, arousal), and wakefulness. However, we did not find any effects of time of day or group on learning performance or mood. As expected, ADHD + CD/ODD patients showed slower reaction times in the alertness test and slowed down even more during the day. However, the results concerning the consolidation of rewarded behavior did not change when we entered the alertness scores into an ANCOVA. Furthermore, on the descriptive level the ADHD + CD/ODD patients showed better memory retention during the day than during the night, a fact which cannot be explained by decreasing alertness during the day. ADHD + CD/ODD patients' subjective wakefulness declined more rapidly than that of the controls. The ANCOVA showed that subjective wakefulness explained some variance in the consolidation of rewarded behavior, but the direction of the effect remained constant. Here again, the decline in wakefulness during the day in ADHD + CD/ODD patients cannot explain why they showed better memory retention during the day than during the night. Therefore, it seems unlikely that our results are due to circadian effects on learning performance, alertness, mood, or wakefulness. However, in future studies, hormonal changes should be taken into account because ADHD patients display a delayed melatonin dim-light onset and a flatter slope of the cortisol profile, both of which are probably related to sleep and memory (Imeraj et al., 2012;Bijlenga et al., 2013).
A limitation of our study is the lack of significant single correlations of sleep-dependent consolidation of rewarded behavior with sleep parameters. Only in healthy children did the correlation of consolidation with non-REM sleep approach significance. Our conclusion that non-REM sleep is more important for sleep-dependent consolidation of rewarded behavior rests mainly on the significant differences in correlations. On the other hand, the lack of significant correlations of consolidation with REM sleep does not confirm the reward activation model either. In any case, experimental manipulations of sleep stages using the split-night design or selective sleep deprivation could substantially add to the picture (Wiesner et al., 2015).
To our knowledge, ours is the first study investigating the sleep-dependent consolidation of rewarded behavior in ADHD + CD/ODD. Severely affected patients pose a great challenge for therapeutic and pedagogical interventions. Behavioral therapy using reward is the main approach in these patients. Therefore the consolidation of behavior learned by reward is a topic of high clinical relevance. In summary, our results indicate that healthy children consolidate rewarded behavior better during a night of sleep than during a day awake. Furthermore, sleep-dependent consolidation of rewarded behavior in healthy children correlates by trend with non-REM sleep but not with REM sleep. In contrast, sleep-dependent consolidation of rewarded behavior is diminished in children with ADHD and does not correlate with sleep. This could help to explain why children suffering from ADHD often display impaired learning and memory and are at risk of school failure. Moreover, impaired consolidation of behavior learned by feedback might be a reason why children with ADHD do not adopt newly learned social skills in everyday life as healthy children do. Therefore, we recommend taking poor sleep quality into account when treating children with ADHD and a comorbid disorder of social behavior. As Keshavarzi et al. (2014) pointed out, sleep hygiene training might help to improve both sleep and social behavior in children with ADHD.

ETHICS STATEMENT
This study was carried out in accordance with the recommendations of the Declaration of Helsinki. All parents of the participants gave written informed consent. All children gave written informed assent. The study protocol was approved by the ethics committee of the medical faculty of the University of Kiel and followed the ethical standards of the Declaration of Helsinki.

AUTHOR CONTRIBUTIONS
CW, IM, AP-K, LB designed the study. CW programmed the software. IM and AP-K collected the data. CW, IM, AP-K, LB analyzed and interpreted the data. CW wrote the manuscript. IM, AP-K, LB revised the manuscript. CW, IM, AP-K, LB approved the manuscript. CW, IM, AP-K, LB agreed to be accountable for all aspects of the work.

FUNDING
Christian Wiesner was supported by a grant (SFB 654 "Plasticity and Sleep") from the Deutsche Forschungsgemeinschaft. However, neither the Deutsche Forschungsgemeinschaft nor any of its associates had any role in the design of the study, in the collection, analysis, or interpretation of data, in the writing of the paper, or in the decision to submit the paper for publication.