Intermittent Theta Burst Stimulation Increases Reward Responsiveness in Individuals with Higher Hedonic Capacity

Background: Repetitive transcranial magnetic stimulation over the left dorsolateral prefrontal cortex (DLPFC) has been documented to influence striatal and orbitofrontal dopaminergic activity implicated in reward processing. However, the exact neuropsychological mechanisms of how DLPFC stimulation may affect the reward system and how trait hedonic capacity may interact with the effects remain to be elucidated. Objective: In this sham-controlled study in healthy individuals, we investigated the effects of a single session of neuronavigated intermittent theta burst stimulation (iTBS) on reward responsiveness, as well as the influence of trait hedonic capacity. Methods: We used a randomized crossover single session iTBS design with an interval of 1 week. We assessed reward responsiveness using a rewarded probabilistic learning task and measured individual trait hedonic capacity (the ability to experience pleasure) with the temporal experience of pleasure scale questionnaire. Results: As expected, the participants developed a response bias toward the most rewarded stimulus (rich stimulus). Reaction time was shorter and accuracy was higher for the rich stimulus than for the less rewarded stimulus (lean stimulus). Active or sham stimulation did not seem to influence the outcome. However, when taking into account individual trait hedonic capacity, we found an early significant increase in the response bias only after active iTBS. The higher the individual's trait hedonic capacity, the more the response bias toward the rich stimulus increased after the active stimulation. Conclusion: When taking into account trait hedonic capacity, one active iTBS session over the left DLPFC improved reward responsiveness in healthy male participants with higher hedonic capacity. This suggests that individual differences in hedonic capacity may influence the effects of iTBS on the reward system.

INTRODUCTION
Repetitive transcranial magnetic stimulation (rTMS) is a relatively new therapeutic tool to treat major depressive disorder (MDD). It is most frequently applied to the left dorsolateral prefrontal cortex (DLPFC), and a series of studies have demonstrated its efficacy in the treatment of this disorder (Berlim et al., 2014; Lefaucheur et al., 2014). Studies combining behavioral and neuroimaging data have shown the modulatory effect of high frequency rTMS (HF-rTMS) on different cognitive processes (for a review see Guse et al., 2010). Although the exact working mechanisms of how HF-rTMS treatment improves mood and cognition in MDD patients remain to be elucidated, one possible pathway could be that the mechanisms of action are modulated by the reward system (Downar et al., 2014). It has already been demonstrated in healthy adults that HF-rTMS over the left DLPFC modulates dopamine release in the anterior cingulate cortex, striatum and orbitofrontal cortex and alleviates anhedonic symptoms (Strafella et al., 2001, 2003; Pogarell et al., 2007; Cho and Strafella, 2009).
These regions also play a critical role in reinforcement learning (Shohamy et al., 2008; Pizzagalli et al., 2009; Kunisato et al., 2012). Pizzagalli et al. (2008) evaluated the effect of the intake of a single dose of dopamine (D2/D3) agonist (pramipexole dihydrochloride) on reinforcement learning in healthy adults. Participants who received the drug (disrupting dopaminergic neurotransmission) exhibited impaired performance (lower response bias toward the most rewarded stimulus) when compared to placebo. In a similar protocol, Pessiglione et al. (2006) demonstrated that, compared to placebo, the intake of drugs known to enhance dopaminergic neurotransmission (L-DOPA) increased participants' response bias toward the most rewarded stimulus, indicating a sharpened responsiveness to reward. Interestingly, using a non-pharmacological approach, Ahn et al. (2013) assessed the effects of a single HF-rTMS session over the left DLPFC on reward responsiveness in 18 healthy male individuals using a probabilistic reward task (Pizzagalli et al., 2005). After active stimulation only, participants showed significantly increased response bias in the early trials, indicating that a single HF-rTMS session over the left DLPFC increased reward responsiveness toward the most rewarded stimulus. However, as participants were only assessed after the stimulation session and not before, no change from baseline could be examined, limiting the interpretation of the results.
Given the importance of examining whether the neurophysiological effects of rTMS are mediated by the reward system, in the current sham-controlled study we wanted to further verify whether, in a similar sample of young healthy male individuals, one stimulation session would affect reward responsiveness during probabilistic learning (Pizzagalli et al., 2005). However, individual differences in hedonic capacity, which is the ability to experience pleasure in response to rewarding stimuli, could affect task performance (Sherdell et al., 2012). Therefore, we also assessed the trait hedonic capacity of the participants using the temporal experience of pleasure scale (TEPS) (Gard et al., 2006). As far as we know, the role of individual hedonic capacities on the response to neurostimulation has not yet been investigated in healthy controls.
Abbreviations: rTMS, repetitive transcranial magnetic stimulation; MDD, major depressive disorder; DLPFC, dorsolateral prefrontal cortex; HF-rTMS, high frequency repetitive transcranial magnetic stimulation; TEPS, temporal experience of pleasure scale; iTBS, intermittent theta burst stimulation; TEPS ANT, temporal experience of pleasure scale anticipatory subscale score; TEPS CON, temporal experience of pleasure scale consummatory subscale score; TEPS TOT, temporal experience of pleasure scale total score; RB, response bias; RT, reaction time.
For the stimulation protocol, we used intermittent theta burst stimulation (iTBS). This kind of stimulation not only significantly reduces the length of the stimulation sessions, making it of high interest for clinical treatment paradigms (Di Lazzaro et al., 2008; Bakker et al., 2015), but is also thought to result in deeper stimulation of the brain and longer lasting stimulatory effects as compared to "classic" HF-rTMS protocols (Huang et al., 2005). We hypothesized that only active iTBS and not sham would positively modulate participants' performance during the completion of the probabilistic learning task. We also hypothesized that trait reward sensitivity would influence the participants' task performance after active iTBS.

MATERIALS AND METHODS
This study was approved by the local ethics committee of Ghent University Hospital and is in accordance with the Declaration of Helsinki (2004). This study was part of a larger project investigating the influence of iTBS on neurocognitive markers in healthy controls and depressed patients.

Participants
Twenty-two healthy male students, all right-handed and naive to TMS, volunteered to participate in this study. Their mean age was 23.2 years (SD = 3.59). They had no neurological disorders, psychiatric illness or medical history and were screened by a certified psychiatrist for medical contraindications for rTMS following rTMS safety guidelines (Rossi et al., 2009; Lefaucheur et al., 2014). Participants gave written informed consent prior to the start of the study. Participants were financially compensated for their participation (50 euros plus a maximum of 20 extra euros, depending on their performance on the probabilistic learning task).

Transcranial Magnetic Stimulation
iTBS was applied using a Magstim Rapid2 Plus1 magnetic stimulator (Magstim Company Limited, Wales, UK) connected to a 70 mm "figure eight" shaped coil. Before the first stimulation session, the individual resting motor threshold was determined using surface electromyography to measure the minimal stimulation intensity necessary to produce a motor evoked potential on the right abductor pollicis brevis muscle. In order to accurately target the stimulation site [left DLPFC, i.e., the center part of the midprefrontal gyrus (Brodmann 9/46)], the Brainsight neuronavigation system (Brainsight™, Rogue Research, Inc.) was used, guided by the participant's structural cerebral MRI. After randomization (flipping a coin), participants received one stimulation session (active/sham) using the following parameters: 1620 pulses in 54 cycles of 10 bursts of 3 pulses with a train duration of 2 s and an inter-train interval of 8 s with a power output of 110% of the resting motor threshold. For the sham condition we used a specially designed sham coil identical to the active coil, mimicking the active stimulation feeling and sound without delivering any active stimulation. In this randomized within-subject crossover design, each participant received one active and one sham stimulation session (or vice versa) with an interval of 1 week between the two sessions. For both stimulations, participants were blinded and fitted with ear plugs to limit possible perceptual differences due to the stimulation condition (active or sham). At the start of the experiment, they completed the TEPS questionnaire. Before and after each stimulation, participants were assessed with the probabilistic learning task (Pizzagalli et al., 2005). The order of stimulation was counterbalanced across participants. After exclusion of an outlier, the active-sham group consisted of 10 participants and the sham-active group of 11 participants.
On sociodemographic variables (education, marital status, lateralization), the order of stimulation groups only differed in age: active-sham (M = 20.80, SD = 1.68), sham-active (M = 25.45, SD = 3.44), t (19) = 3.86, p < 0.01. This group difference is balanced out by the crossover design in which each participant acts as his own control.
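The stimulation parameters given above can be checked with simple arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the session length assumes that no inter-train interval follows the final train, which the text does not state.

```python
# Parameters as stated in the text
cycles = 54            # number of trains (cycles)
bursts_per_cycle = 10  # bursts per train
pulses_per_burst = 3   # pulses per burst
train_s = 2            # train duration in seconds
inter_train_s = 8      # inter-train interval in seconds

# Total pulse count: should match the 1620 pulses reported
total_pulses = cycles * bursts_per_cycle * pulses_per_burst

# Burst repetition rate implied by 10 bursts in a 2 s train
burst_rate_hz = bursts_per_cycle / train_s

# Session length, ASSUMING no rest period after the last train
session_s = cycles * train_s + (cycles - 1) * inter_train_s

print(total_pulses, burst_rate_hz, session_s)  # 1620 5.0 532
```

Under that assumption a session lasts just under 9 minutes, which is consistent with the short session length mentioned as an advantage of iTBS.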

Probabilistic Learning Task (Pizzagalli et al., 2005)
The task is composed of three blocks (B1, B2, and B3) of 100 trials. Each trial starts with the presentation of a fixation cross for 500 ms followed by a mouthless cartoon face for 500 ms. A schematic mouth (a horizontal line), long (13 mm) or short (11.5 mm), is then presented on the cartoon face for 100 ms. Participants are forced to choose which stimulus was shown by pressing the corresponding key on their keyboard. The association between a key and a mouth was counterbalanced between participants and between task completions (before vs. after stimulation) to avoid lateralization bias. If not correct, a new trial starts. If correct, participants are sometimes rewarded: a feedback screen announcing that they won 5 eurocents is presented for 1750 ms before starting a new trial (Figure 1). For each block, a pseudo random sequence of 50 short and 50 long mouths is used among which 40 correct responses are programmed to be rewarded. To induce a response bias, one mouth stimulus (called "rich" stimulus) is randomly chosen before the start of the task to be three times more often rewarded when correctly recognized than the other one (called "lean" stimulus). Among the 40 rewarded trials per block, 30 were allocated to the correct recognition of the rich stimulus and 10 to the lean stimulus. The assignment of rich and lean stimuli is counterbalanced within subject across the 4 task completions (if the long mouth is designated to be the rich stimulus for the first task completion, it is automatically designated to be the lean stimulus for the second task completion to avoid repetition during a testing day). Before starting the task, participants are instructed that not all correct trials will be rewarded but they are not informed that one stimulus will be more frequently rewarded than the other one. Participants are instructed to try to win as much money as possible.
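The asymmetric reward schedule described above (50 trials per stimulus, 30 rewards scheduled for the rich and 10 for the lean stimulus in each block) can be sketched as follows. This is a minimal illustration, not the task software used in the study: the function name, the dictionary keys and the use of simple shuffling for the pseudo random sequence are all assumptions.

```python
import random

def make_block(rich="long", lean="short", n_per_stimulus=50,
               n_rich_rewards=30, n_lean_rewards=10, seed=None):
    """Build one hypothetical 100-trial block with a 3:1 reward ratio."""
    rng = random.Random(seed)
    trials = []
    for stimulus, n_rewards in ((rich, n_rich_rewards), (lean, n_lean_rewards)):
        # Flag which trials of this stimulus pay out if answered correctly
        flags = [True] * n_rewards + [False] * (n_per_stimulus - n_rewards)
        rng.shuffle(flags)
        trials.extend({"stimulus": stimulus, "reward_if_correct": f} for f in flags)
    rng.shuffle(trials)  # interleave rich and lean trials
    return trials

block = make_block(seed=1)
n_rich = sum(t["reward_if_correct"] for t in block if t["stimulus"] == "long")
n_lean = sum(t["reward_if_correct"] for t in block if t["stimulus"] == "short")
print(len(block), n_rich, n_lean)  # 100 30 10
```

Swapping the `rich` and `lean` arguments between task completions would reproduce the within-subject counterbalancing of the rich stimulus.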

Temporal Experience of Pleasure Scale
The temporal experience of pleasure scale (Gard et al., 2006) is composed of 18 self-report items and assesses individual trait dispositions in both anticipatory (TEPS ANT) and consummatory (TEPS CON) experiences of pleasure (10 items for the anticipatory pleasure scale and 8 items for the consummatory pleasure scale). The sum of the two subscales (TEPS TOT) is a measure of hedonic capacity (or anhedonia; Gard et al., 2006): the lower the score, the lower the hedonic capacity. This scale has the advantage of being applicable in both healthy controls and depressed patients (in depressed patients, low hedonic capacity is referred to as anhedonia), and it has been validated and used as such (Gard et al., 2006; Strauss et al., 2011). It has been demonstrated to have good internal consistency, test-retest reliability, and convergent and discriminant validity (Gard et al., 2006). As advised by Sherdell et al. (2012), we used this validated scale because of its specificity for hedonic capacity, instead of using separate items from a larger depression scale (e.g., the BDI), as such items do not provide clear insight into the different subcomponents of reward processing.

Data Reduction and Statistical Analyses
Three outcome variables were used to assess participants' performance on the probabilistic learning task: response bias (RB), response accuracy and reaction time (RT). The RB is the main dependent variable for this study. It measures the systematic preference of a participant toward the rich stimulus. The RB increases as the participant shows a high rate of correct identification for the rich stimulus and a low rate of correct identification for the lean stimulus.
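The RB formula itself does not appear in this section. As a hedged illustration, the sketch below assumes the signal-detection definition commonly used with this task (Pizzagalli et al., 2005), log b = 0.5 × log10[(rich correct × lean incorrect) / (rich incorrect × lean correct)]. The function name and the optional 0.5 cell correction, which avoids division by zero, are assumptions and not taken from the text.

```python
import math

def response_bias(rich_correct, rich_incorrect, lean_correct, lean_incorrect,
                  correction=0.5):
    """Log response bias toward the rich stimulus (assumed definition).

    `correction` is added to every cell so that empty cells do not
    produce a division by zero; this is an assumption, not from the text.
    """
    rc = rich_correct + correction
    ri = rich_incorrect + correction
    lc = lean_correct + correction
    li = lean_incorrect + correction
    return 0.5 * math.log10((rc * li) / (ri * lc))

# Hypothetical counts for one 100-trial block (50 rich, 50 lean trials)
unbiased = response_bias(40, 10, 40, 10)  # same accuracy for both stimuli
biased = response_bias(45, 5, 35, 15)     # more accurate on rich than lean
print(unbiased, biased > 0)  # 0.0 True
```

A value of zero indicates no preference, and the value grows as rich-stimulus accuracy rises and lean-stimulus accuracy falls, matching the verbal description above.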
Response accuracy was also analyzed:

$$\text{Response accuracy} = \frac{\text{number of hits}}{\text{number of hits} + \text{number of misses}}$$

Due to low variance in response accuracy, arcsine transformation was performed on raw accuracy data before entering statistical analyses. Regarding reaction time (RT), because the data were not normally distributed, log transformation was also performed before statistical analyses, which resulted in a normal distribution as indexed by the Shapiro-Wilk test and visual inspection of Q-Q plots.
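The accuracy measure and the two transformations can be sketched as below. The text does not say which arcsine variant or which logarithm base was used; the sketch assumes the arcsine of the square root of the proportion and the natural logarithm of RT in milliseconds, which give values on the same scale as the means reported in the Results. Function names and example numbers are hypothetical.

```python
import math

def accuracy(hits, misses):
    """Proportion of correct responses: hits / (hits + misses)."""
    return hits / (hits + misses)

def arcsine_transform(proportion):
    """Arcsine square-root transform (assumed variant), in radians."""
    return math.asin(math.sqrt(proportion))

def log_transform(rt_ms):
    """Natural log of a reaction time in milliseconds (assumed base)."""
    return math.log(rt_ms)

# Hypothetical participant: 85 hits and 15 misses, mean RT of 450 ms
print(round(arcsine_transform(accuracy(85, 15)), 2))  # 1.17
print(round(log_transform(450), 2))                   # 6.11
```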
The analyses were performed according to Pizzagalli et al. (2005). Analyses of variance (ANOVA) were performed on transformed accuracy and RT data, with Condition (rich, lean), Stimulation (active, sham), Time (pre, post stimulation) and Block (B1, B2, B3) as repeated measures. For response bias, the ANOVA included Block, Time and Stimulation only. As the aim is to investigate probabilistic learning processes, it is crucial to separate the task into different analytic parts (Blocks) and to include Block as a factor in the analysis. Per participant and for each task completion separately, trials with RTs shorter than 3 standard deviations were discarded. For all analyses, the significance level was set at α = 0.05. Cohen's d was calculated to evaluate effect sizes at the contrast level (difference between the means divided by the pooled standard deviation). Where necessary, we applied the Greenhouse-Geisser correction to adjust for violations of the sphericity assumption. All collected data were analyzed with SPSS 22 (Statistical Package for the Social Sciences; IBM SPSS Statistics for Windows, Version 22.0, IBM Corp., Armonk, NY).
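The effect size definition above (difference between the means divided by the pooled standard deviation) can be illustrated with a short sketch. It assumes the pooled standard deviation is the root mean square of the two standard deviations, which is appropriate when both means come from the same number of observations; the function name is hypothetical. The example uses the post-active B2 vs. B1 response bias values reported in the Results.

```python
import math

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with a pooled SD, assuming equal group sizes."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd

# Post-active B2 (M = 0.21, SD = 0.30) vs. B1 (M = 0.02, SD = 0.20)
d = cohens_d(0.21, 0.30, 0.02, 0.20)
print(round(d, 2))  # 0.75
```

The result is close to the d = 0.76 reported for that contrast; the small gap is plausibly due to rounding in the published means and standard deviations.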

RESULTS
One participant mostly answered using only one key, resulting in either extremely high or extremely low (negative) RB scores with no learning effect throughout the blocks. This indicates that he did not follow the task instructions. The reason for this behavior was not known, as we detected this irregularity only when checking the data later. This participant was consequently removed from the analyses.

Response Bias
In a first step, the repeated measures ANOVA with Stimulation [...] (Table 1; for an overview of the means, see Supplementary Material).
In a second step, to check the possible influence of individual differences in trait hedonic capacity, the TEPS scores were used as covariates in the analysis. TEPS TOT, TEPS ANT and TEPS CON were entered successively as covariates (ANCOVA). No significant effect emerged from the ANCOVAs using TEPS TOT or TEPS ANT. However, the ANCOVA with TEPS CON as a covariate revealed a significant interaction between Stimulation, Time, Block and TEPS CON, F (2, 19) = 3.86, p = 0.03 (for an overview of the results, see Table 2).
Following the significant omnibus interaction with TEPS CON, we ran follow-up tests with TEPS CON as a covariate and checked for potential interaction effects with the individual's hedonic capacity (moderation).
At the block level with TEPS CON as a covariate, we looked at the differences between active and sham stimulation at pre- or post-measurements. Only one difference approached significance: B3 pre-active (M = 0.23, SD = 0.27) was higher than B3 pre-sham (M = 0.12, SD = 0.26), F (1, 19) = 4.26, p = 0.053, d = 0.41. Simple effect analyses with TEPS CON as a covariate were conducted to compare response bias between blocks for each measurement: in the post-active measurement B2 (M = 0.21, SD = 0.30) was higher than B1 (M = 0.02, SD = 0.20), F (1, 19) = 12.33, p < 0.01, with Cohen's d indicating a large effect size (d = 0.76), and there was no interaction with TEPS CON, F (19) = 1.33, p = 0.26. In the post-sham measurement B3 (M = 0.26, SD = 0.32) was higher than B2 (M = 0.10, SD = 0.35), F (1, 19) = 6.00, p = 0.02, with Cohen's d indicating a medium effect size (d = 0.47), and there was no interaction with TEPS CON, F (19) = 1.29, p = 0.27 (Figure 2). To check whether the order of active vs. sham stimulation would influence the effects, order was included as a within-subject factor in a separate ANCOVA in combination with all the other factors. No main effect or crucial interaction with Order was found.
For Time and Stimulation, pre- vs. post-stimulation (active or sham), neither a significant difference nor an interaction with TEPS CON was found.
For Time and Block we looked at the difference between blocks (ΔRB) at pre- and post-measurements, and we computed the change post- minus pre-stimulation (ΔRB pre/post) of the differences between blocks for each stimulation.
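The two-step difference score described above (first the difference between blocks, then the post- minus pre-stimulation change of that difference) can be written out as a short sketch. Function names and all response bias values are hypothetical and serve only to show the order of the subtractions.

```python
def block_difference(rb, later, earlier):
    """Difference in response bias between two blocks, e.g. B2 minus B1."""
    return rb[later] - rb[earlier]

def pre_post_change(rb_pre, rb_post, later="B2", earlier="B1"):
    """Change (post minus pre stimulation) of the between-block difference."""
    return (block_difference(rb_post, later, earlier)
            - block_difference(rb_pre, later, earlier))

# Hypothetical response bias per block for one participant and one session
rb_pre = {"B1": 0.05, "B2": 0.10, "B3": 0.20}
rb_post = {"B1": 0.02, "B2": 0.21, "B3": 0.22}

change_b2b1 = pre_post_change(rb_pre, rb_post, "B2", "B1")  # early learning
change_b3b2 = pre_post_change(rb_pre, rb_post, "B3", "B2")  # late learning
print(round(change_b2b1, 2), round(change_b3b2, 2))  # 0.14 -0.09
```

A positive value means the response bias grew more between those two blocks after stimulation than before it.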

Correlations
To visualize and further explore possible influential cases related to the nearly significant interaction with TEPS CON and ΔRB B2B1 pre/post-active and the significant interaction with TEPS CON and ΔRB B3B2 pre/post-active, we ran bivariate correlation analyses on the relationship between TEPS CON and these variables.
For TEPS CON and ΔRB B2B1 pre/post-active (Figure 3A) there was a positive correlation, r (19) = 0.41, p = 0.064 (see the abovementioned interaction). After visual inspection of the plot of TEPS CON against ΔRB B2B1 pre/post-active, 2 data points appeared as influential cases. These 2 cases had the highest scores on both Cook's distance and leverage values (influence measurements) and exceeded the Cook's distance numerical cutoff (Fox, 1991). In addition, these 2 data points corresponded to the 2 lowest TEPS CON scores in our sample. After removal of these data points, the correlation between TEPS CON and ΔRB B2B1 pre/post-active increased and became highly significant, r (17) = 0.59, p < 0.01 (Figure 3B).
For TEPS CON and ΔRB B3B2 pre/post-active (Figure 3C) there was a negative correlation, r (19) = −0.54, p = 0.01 (see the abovementioned interaction). One case exceeded the numerical cutoff for Cook's distance but had a low leverage value. After removal of this case the correlation decreased but remained significant, r (18) = −0.46, p = 0.04 (Figure 3D).
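The influence diagnostics mentioned above can be illustrated for the simple case of one predictor. The sketch computes Pearson's r, leverage and Cook's distance from their textbook definitions for a simple linear regression; it is not the SPSS procedure used in the study. The data are hypothetical, and no cutoff is applied because the text does not state the numerical value taken from Fox (1991).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def influence(x, y):
    """Return (leverage, Cook's distance) lists for the regression y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    n_params = 2  # intercept and slope
    mse = sum(e * e for e in residuals) / (n - n_params)
    leverage = [1 / n + (a - mx) ** 2 / sxx for a in x]
    cooks = [(e * e / (n_params * mse)) * h / (1 - h) ** 2
             for e, h in zip(residuals, leverage)]
    return leverage, cooks

# Hypothetical scores; the last case is deliberately extreme on x
x = [1, 2, 3, 4, 5, 20]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]
leverage, cooks = influence(x, y)
most_influential = cooks.index(max(cooks))
print(most_influential)                     # 5
print(pearson_r(x[:5], y[:5]) > pearson_r(x, y))  # True
```

In this toy data set the extreme case has both the highest leverage and the highest Cook's distance, and dropping it strengthens the correlation, which mirrors the pattern described for the B2B1 analysis.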

Reaction Time
The repeated measures ANOVA with Condition (rich and lean), Stimulation (active and sham), Time (pre- and post-stimulation) [...] (Table 3; for an overview of the means, see Supplementary Material).
Frontiers in Human Neuroscience | www.frontiersin.org
To investigate the Condition × Stimulation interaction we compared the RT for the rich and lean stimuli per stimulation condition. In the active stimulation condition, the RT for the rich stimulus (M = 6.11, SD = 0.19) was faster than for the lean stimulus (M = 6.18, SD = 0. [...]
Follow-up tests to investigate the Block × Condition interaction revealed that the average RT for the rich condition decreased along blocks whereas it increased for the lean condition: for the rich condition, RT at B3 (M = 6.10, SD = 0.15) was significantly faster than at B1 (M = 6.14, SD = 0. [...]
We computed the change in RT by subtracting the pre-stimulation from the post-stimulation value (ΔRT pre/post stimulation) for each stimulation (active or sham) and stimulus (rich or lean).
We also computed for each stimulus the RT difference in the pre-stimulation condition between active and sham and in the post-stimulation condition between active and sham. For the rich stimulus the comparison of the RT difference "pre-active minus pre-sham" (M = 0.02, SD = 0.21) vs. the difference "post-active minus post-sham" (M = −0.06, SD = 0.12) tended toward significance, t (20) = 1.94, p = 0.067, d = 0.50, indicating that for the rich stimulus the RT difference between active and sham was larger after stimulation than before stimulation. When comparing the RT difference "post-active minus post-sham" of the rich vs. the lean stimulus, the RT difference was larger for the rich (M = −0.06, SD = 0.12) than for the lean stimulus (M = −0.01, SD = 0.13), t (20) = 2.82, p = 0.01, d = 0.40.
We then compared the "active minus sham; post minus pre" difference values of the rich and lean stimuli to specify how the Stimulation and Time factors had a different influence on the RT of the two stimuli. The "active minus sham; post minus pre" difference was greater for the rich stimulus (M = −0.08, SD = 0.20) than for the lean stimulus (M = −0.04, SD = 0.19), indicating that the RT for the rich stimulus was more modulated by the Stimulation and Time factors than the RT for the lean stimulus, but this did not reach significance.
For an overview of the follow-up test means, see Supplementary Material.
To check for the influence of individual differences in trait hedonic capacity, the TEPS scores were used as covariates in the analysis. TEPS TOT, TEPS ANT and TEPS CON were entered successively as covariates in the abovementioned model (ANCOVA) (see Table 4 for an overview of the results).

Response Accuracy
Follow-up tests on the Condition × Block interaction revealed that the average accuracy for the rich condition increased along blocks whereas it decreased for the lean condition. For the rich condition the accuracy at B3 (M = 1.18, SD = 0.15) was higher than at B1 (M = 1.10, SD = 0.13), t (20) = 2.91, p < 0.01, d = 0.55; for the lean condition the accuracy at B3 (M = 0.99, SD = 0.17) was lower than at B1 (M = 1.06, SD = 0.14), t (20) = 4.06, p < 0.01, d = 0.44. Overall the accuracy for the rich stimulus increased along blocks and between pre and post stimulation whereas for the lean stimulus it remained stable along blocks pre stimulation and decreased post stimulation.
To investigate the interaction between Time and Block, we compared each block pre- and post-stimulation. The accuracy [...]
We also calculated the change in accuracy between blocks post- minus pre-stimulation (Δ pre/post stimulation) for each condition (rich or lean). The change in accuracy between B2 and B1 pre/post stimulation for the rich stimulus was smaller than for the lean stimulus: change for the rich stimulus [...]
Analysis of the effect of Stimulation on Condition revealed that in the active condition the accuracy for the rich stimulus (M = 1.14, SD = 0.15) was higher than for the lean stimulus (M = 1, SD = 0.14), t (20) = 5.66, p < 0.01, d = 0.96. The same effect was found in the sham condition: accuracy for the rich stimulus (M = 1.14, SD = 0.13) was higher than for the lean stimulus (M = 1.04, SD = 0.16), t (20) = 3.48, p < 0.01, d = 0.68.
For an overview of the follow-up test means, see Supplementary Material.
To check for individual differences in trait hedonic capacity, the TEPS scores were used as covariates in the abovementioned model. TEPS TOT, TEPS ANT and TEPS CON were entered successively as covariates (ANCOVA) (see Table 6 for an overview of the results).

DISCUSSION
The aim of this study was to assess the effects of a single session of iTBS over the left DLPFC on reward responsiveness in healthy male individuals and to check for a possible influence of trait hedonic capacity on the effect of the stimulation. As expected, participants developed a response bias toward the rich stimulus along the task blocks (B1, B2, and B3), indicating that they progressively learned which stimulus was the most often rewarded (Pizzagalli et al., 2005). The RT and response accuracy analysis showed that, as expected, participants were overall quicker to react toward the rich than the lean stimulus and that the accuracy for the rich condition was higher than for the lean condition.
However, this increased reward responsiveness seems to be independent of the type of stimulation (active vs. sham). Although the interaction effects were not significant, a significant increase of the response bias was observed in both the post-active and post-sham stimulation conditions: in the post-active condition the RB increase was observed during the first blocks (B1 and B2), whereas in the post-sham condition it was found during the last blocks (B2 and B3). Ahn et al. (2013) reported similar observations: a higher RB during the early trials after HF-rTMS. Surprisingly, this increase was limited to the first block of the task and an RB decrease was observed during the second block. Also, no difference in reward learning between blocks was observed. Importantly, and in contrast to our study design, Ahn and coworkers did not perform baseline measurements before the stimulation sessions, limiting the interpretation of these results. As mentioned before, using a sham-controlled crossover design we could not replicate their findings.
However, given our assumption that individual trait reward sensitivity may influence the task performance related to the reward system, participants were assessed with the TEPS before entering the study design. Here our findings showed that only the active stimulation influenced participants' task performance, and that this influence was related to consummatory TEPS scores. Indeed, interactions with the TEPS CON were found in the active stimulation condition (B2B1 and B3B2): a positive correlation between the TEPS CON and the change in reward learning (pre/post stimulation) between the early blocks (B1 and B2) and a negative correlation between the TEPS CON and the change between the later blocks (B2 and B3) were found. Because the participants developed their RB more strongly during the early blocks, their RB development during the later blocks decreased, and this pattern was correlated with their trait hedonic capacity. The more hedonic the participants, the faster they developed their RB after the active stimulation, suggesting an increase of their sensitivity to the rewarding stimulus. This is of interest given that neurostimulation methods can be used to treat depressed patients. For instance, Downar et al. (2014) applied 20 sessions of HF-rTMS on the left DLPFC in 47 MDD patients and compared responders to nonresponders. Treatment response appeared to be strongly bimodal, showing one group with preserved consummatory hedonic function responding to HF-rTMS and another group with a lower consummatory hedonia ranking (higher consummatory anhedonia) not responding to HF-rTMS. Non-responders also displayed significantly lower connectivity within a classical reward dopaminergic network including the striatum, the caudate nucleus, the ventral tegmental area and the ventromedial prefrontal cortex (vmPC).
Within this network the ventromedial prefrontal cortex, which is known for its consistent activation during the experience of rewarding stimuli across studies (Strauss et al., 2011; Diekhof et al., 2012) and is thus associated with the consummatory process of reward, was predictive of the treatment outcome.
Interestingly, Vrieze et al. (2013a) found in a study with 79 depressed patients that reduced reward learning, as assessed by their performance on the same probabilistic learning task, decreased their odds of remission after 8 weeks of treatment. These observations strengthen the idea of a link between patients' hedonic capability and their response to HF-rTMS. Although our male participants were not clinically depressed, our findings may be indicative of how iTBS treatment may successfully improve mood in one given patient but not in another. Furthermore, in a PET study with 10 healthy volunteers, Vrieze et al. (2013b) demonstrated that dopamine release in the vmPC plays an important role in reinforcement learning.
In our case, the more hedonic the participants (for the consummatory process), the more iTBS could modulate their reward system, increasing dopamine release. Keller et al. (2013) showed in healthy participants that trait hedonia and the functional connectivity within the reward system were positively correlated. Although speculative at this point, higher trait hedonic capacity reflecting stronger functional connectivity between key components of the reward system could explain whether or not cortical stimulation would propagate and modulate deeper structures of the reward system.
In addition, Pizzagalli et al. (2009) showed in an fMRI study that unmedicated depressed patients compared to controls exhibited weaker responses to monetary gains in the left nucleus accumbens and caudate bilaterally but not during reward anticipation, indicating that in depressed patients, the consummatory phase of reward learning might be impaired whereas the anticipatory phase might be preserved. In our study as well, no influence of the TEPS anticipatory subscale, in contrast to the TEPS consummatory subscale, was observed. Our results indicate that the more hedonic for the consummatory process the participants were, the more they developed their RB during the early blocks after the active stimulation session. Because this additive effect was present only after active and not after sham stimulation, it is plausible that iTBS positively influences reward processing.
The fact that after active iTBS healthy male participants with higher hedonic capacity seem to become more sensitive to reward raises the question of whether these neurostimulation parameters might be contraindicated for patients with bipolar depression. Current rTMS treatment paradigms do not advocate the use of excitatory or high frequency rTMS paradigms in bipolar depression. Indeed, in a few cases excitatory stimulation of the left DLPFC has been reported to evoke a switch from depression into mania (Lefaucheur et al., 2014). However, only one study to date has explicitly examined the effects of HF-rTMS on the reward system, though it was not able to distinguish different clinical effects between uni- and bipolar depression (Downar et al., 2014), leaving this question still open.
Besides the relatively small sample size there are some limitations. First, the interpretations should be limited to young male participants only. Second, even though we used a placebo coil mimicking the physical sensation of the active stimulation, and even though the participants were blinded and used ear plugs during the stimulation sessions, the placebo condition still was not perfect as sound and sensation were different. However, this is a methodological issue affecting almost all sham-controlled rTMS paradigms. Finally, we did not correct for multiple comparisons. Consequently, our results should be interpreted with some caution due to the increased possibility of false positive statistical results.
In conclusion, we could not replicate the results observed by Ahn et al. (2013). However, we found a modulatory effect of trait hedonic capacity on participants' response to iTBS. The higher the hedonic score of the participants, the stronger their reward responsiveness increased after active iTBS. This indicates that individual differences in hedonic capacity may influence the effects of iTBS on the reward system. Neuroimaging studies applying probabilistic paradigms, also in MDD patients, are needed to understand the role of the reward system in the response to neurostimulation treatments.

AUTHOR CONTRIBUTIONS
All authors contributed to the conception and design, or acquisition of data, or analysis and interpretation of data and drafted the article or revised it critically for important intellectual content and gave final approval of the version to be published. Specifically, RD and CB made substantial contributions to the conception and design of the work; the acquisition, analysis, and interpretation of data for the work; drafted the work and revised it critically for important intellectual content; gave final approval of the version to be published; and agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. RDR and GW made substantial contributions to the analysis and interpretation of data for the work; drafted the work and revised it critically for important intellectual content; gave final approval of the version to be published; and agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.