ORIGINAL RESEARCH article

Front. Hum. Neurosci., 09 May 2017
Sec. Brain Imaging and Stimulation
Volume 11 - 2017 | https://doi.org/10.3389/fnhum.2017.00200

Proximity of Substantia Nigra Microstimulation to Putative GABAergic Neurons Predicts Modulation of Human Reinforcement Learning

Ashwin G. Ramayya1, Isaac Pedisich2, Deborah Levy2, Anastasia Lyalenko2, Paul Wanda2, Daniel Rizzuto2, Gordon H. Baltuch1 and Michael J. Kahana2*
  • 1Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  • 2Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA

Neuronal firing in the substantia nigra (SN) immediately following reward is thought to play a crucial role in human reinforcement learning. As in Ramayya et al. (2014a) we applied microstimulation in the SN of patients undergoing deep brain stimulation (DBS) for the treatment of Parkinson's disease as they engaged in a two-alternative reinforcement learning task. We obtained microelectrode recordings to assess the proximity of the electrode tip to putative dopaminergic and GABAergic SN neurons and applied stimulation to assess the functional importance of these neuronal populations for learning. We found that the proximity of SN microstimulation to putative GABAergic neurons predicted the degree of stimulation-related changes in learning. These results extend previous work by supporting a specific role for SN GABA firing in reinforcement learning. Stimulation near these neurons appears to dampen the reinforcing effect of rewarding stimuli.

1. Introduction

Thorndike's “Law of Effect” states that rewards strengthen associations between preceding stimuli and actions, resulting in reinforcement learning (Thorndike, 1932). Animal studies have shown that the phasic firing of substantia nigra (SN) neurons may represent a neural mechanism underlying reinforcement learning. SN dopamine (DA) neurons display phasic bursts that encode reward prediction error (RPE), a latent variable that tracks subsequent changes in associative strength (Sutton and Barto, 1990; Montague et al., 1996; Schultz et al., 1997; Bayer and Glimcher, 2005). They send prominent projections to dorsal striatal regions (Montague et al., 1996; Haber et al., 2000) that mediate action selection (Williams et al., 2006; Lau and Glimcher, 2008). Furthermore, DA release in the striatum has been shown to cause reinforcement of preceding actions and increased cortico-striatal synaptic strength (Reynolds et al., 2001).

Whereas animal studies have established a relation between SN neural firing and reinforcement learning, direct evidence from human studies is lacking. Patients undergoing deep brain stimulation (DBS) surgery for the treatment of Parkinson's Disease (PD) offer a rare opportunity to directly study the functional role of phasic SN activity during reinforcement learning (Jaggi et al., 2004; Zaghloul et al., 2009). Microstimulation, a technique that is widely used in animals to causally relate neural activity to behavior (Histed et al., 2009; Clark et al., 2011), is routinely applied as part of clinical protocol to aid in targeting of the DBS electrode. Patients are awake during this process to allow for detection of potential DBS-related adverse effects, and are able to perform cognitive tasks. In the only prior study relating microstimulation to human learning (Ramayya et al., 2014a), we showed that SN microstimulation near putative DA neurons impaired performance on a reinforcement learning task where rewards were contingent on stimuli, but unrelated to actions. Because of the experimental design used in this prior study, the observed stimulation-related decrease in performance could either signify impaired stimulus-reward learning, or a selective strengthening of action-reward associations that competed with stimulus-reward associations; the latter hypothesis was supported by further computational analyses (see also de Berker and Rutledge, 2014).

In this study, we sought to clarify the role of phasic SN neural firing in human reinforcement learning. We applied SN microstimulation in eleven patients as they performed a reinforcement learning task with consistent stimulus-response mapping. In this task, stimulus-reward and action-reward associations were always correlated, and thus there was no confound between impaired learning and a selective strengthening of action-reward associations; improved performance suggests increased learning, whereas decreased performance suggests decreased learning. Because microstimulation has been shown to enhance the activity of neurons near the electrode tip (Histed et al., 2009), and because the human SN contains both DA and GABAergic neurons that represent functionally distinct populations (Ramayya et al., 2014b), we hypothesized that SN microstimulation would alter learning in a manner that was dependent on the properties of neurons near the electrode tip. Specifically, we expected stimulation-related improvements in learning when the electrode was positioned near putative DA neurons, but stimulation-related impairments in learning when the electrode was positioned near putative GABA neurons, which have been shown to exert inhibitory control over DA neurons (Tepper et al., 1995; Lobb et al., 2011; Henny et al., 2012; Pan et al., 2013).

2. Materials and Methods

2.1. Subjects

Eleven patients undergoing deep brain stimulation (DBS) surgery for the treatment of Parkinson's Disease volunteered to take part in this study (6 male, 5 female, mean age = 63.8 years). Subjects provided their informed consent during pre-operative consultation and received no financial compensation for their participation. Per routine clinical protocol, Parkinson's medications were stopped on the night before surgery (12 h preoperatively); hence subjects engaged in the study while in an OFF state. The study was conducted in accordance with a University of Pennsylvania Institutional Review Board-approved protocol.

2.2. Intra-operative Methods

During surgery, intra-operative microelectrode recordings (obtained from a 1 μm diameter tungsten tip electrode advanced with a power-assisted microdrive) were used to identify the substantia nigra (SN) and the subthalamic nucleus (STN) as per routine clinical protocol (Jaggi et al., 2004) (Figure 1A). Electrical microstimulation is routinely applied through the microelectrode to aid in clinical mapping of SN and STN neurons, and was approved for use in this study by the University of Pennsylvania IRB. Once the microelectrode was positioned in the SN, we administered a two-alternative probability learning task through a laptop computer placed in front of the subject. Subjects viewed the computer screen through prism glasses placed over the stereotactic frame and expressed choices by pressing buttons on handheld controllers placed in each hand.

Figure 1. Reinforcement learning task. (A) Subjects performed a reinforcement learning task with consistent stimulus-response mapping. The visual stimuli presented during the choice and feedback interval are shown. Feedback was provided probabilistically in accordance with one of four reward probability regimes that were re-assigned every 20 trials. (B) Each subject's intra-operative session was divided into two stages. During stage 1 (40 trials), we obtained microelectrode recordings and the assigned reward probabilities were either 0.8:0.2 or 0.2:0.8 (red:blue), whereas during stage 2 (160 trials) we applied SN microstimulation, and the assigned reward probabilities were one of the following: 0.8:0.2, 0.7:0.3, 0.3:0.7, or 0.2:0.8. See Section 2 for additional details. (C) Subjects demonstrated a greater win-stay than expected by chance during both stage 1 and 2. (D) Subjects made the high reward probability choice with greater frequency during the last 10 trials of a reward probability regime as compared to the first 10 trials. Error bars indicate standard error of mean (s.e.m) across subjects. * indicates p < 0.001; see main text for statistics.

2.3. Reinforcement Learning Task

Subjects performed a two-alternative forced choice task with feedback. Each subject performed a single intra-operative session that consisted of two stages as described below. They also performed a pre-operative practice session that we did not include in our analyses. During each trial, subjects were presented with a pair of stimuli (red card deck and blue card deck), and asked to make a selection by pressing a button on one of two hand-held controllers (one in the left hand and one in the right hand). The red and blue card decks were presented simultaneously and arranged such that one deck was associated with a left button press, whereas the other deck was associated with a right button press. The arrangement of stimuli on the screen was randomly determined at the beginning of the experiment and remained fixed throughout.

Following each selection, subjects probabilistically received positive or negative feedback. Positive feedback was indicated by the appearance of a silver dollar accompanied by the audible ring of a cash register; negative feedback was indicated by the appearance of a copper penny accompanied by an error tone. The timing of each trial was as follows: stimulus presentation and response time (variable), feedback presentation for 2 s, and a 0–400 ms jitter between trials. Each experimental session consisted of 200 trials and was divided into two stages. During stage 1 (40 trials), we obtained microelectrode recordings from the SN, whereas during stage 2 (160 trials), we applied microstimulation following a subset of reward trials (see Section 2.4). To encourage subjects to attend to the rewards throughout the task, we employed a regime-switch model such that the reward probabilities associated with each of the card decks fluctuated throughout the experiment. In general, every 20 trials, the reward probabilities associated with the red and blue deck were assigned to one of several reward probability regimes. During stage 1, at trial 1 and trial 20, reward probabilities were assigned to one of two regimes, 0.8:0.2 or 0.2:0.8 (red:blue reward probability). During stage 2, every 20 trials, reward probabilities were assigned to one of four regimes (red:blue): 0.8:0.2, 0.7:0.3, 0.3:0.7, and 0.2:0.8. Before beginning the task, patients were shown an introductory video describing the task. Patients also participated in a pre-operative practice session of the task prior to the intra-operative session. On average, subjects had a response time of 1.80 ± 1.00 s (mean ± s.d.) per trial, and the intra-operative experiment lasted 15.57 ± 2.45 min (mean ± s.d.).
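
The regime-switch design described above can be illustrated with a minimal sketch. The function names and the choice to draw each 20-trial regime uniformly at random (with replacement) are assumptions made for illustration; the text does not specify how regimes were sampled, and this is not the task code used in the study.

```python
import random

# Reward probability regimes (red, blue) as listed in the text.
STAGE1_REGIMES = [(0.8, 0.2), (0.2, 0.8)]
STAGE2_REGIMES = [(0.8, 0.2), (0.7, 0.3), (0.3, 0.7), (0.2, 0.8)]
BLOCK_LEN = 20  # regimes are re-assigned every 20 trials


def make_schedule(rng, n_stage1=40, n_stage2=160):
    """Return one (p_red, p_blue) pair per trial for a 200-trial session.

    Assumption: each block's regime is drawn uniformly at random.
    """
    schedule = []
    for n_trials, regimes in ((n_stage1, STAGE1_REGIMES),
                              (n_stage2, STAGE2_REGIMES)):
        for _ in range(n_trials // BLOCK_LEN):
            regime = rng.choice(regimes)
            schedule.extend([regime] * BLOCK_LEN)
    return schedule


def draw_feedback(rng, schedule, trial, choice):
    """Return True (positive feedback) with the chosen deck's probability."""
    p_red, p_blue = schedule[trial]
    p_reward = p_red if choice == "red" else p_blue
    return rng.random() < p_reward
```

Passing a seeded `random.Random` instance keeps a simulated session reproducible.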

As compared to the task used in Ramayya et al. (2014a), the current task included the following changes. First, only one set of stimuli (a red and blue card deck) was presented throughout the experiment instead of multiple stimulus pairs. Second, the stimuli were presented in the same arrangement on the screen from trial to trial such that there was consistent stimulus-response mapping. In other words, for a given experimental session, the red card was always presented on the left and the blue card was always presented on the right, associated with left and right button presses, respectively. Third, because only one set of stimuli was presented throughout the experimental session, we employed a regime-switch design to encourage learning throughout the task as described above.

2.4. Stimulation Parameters

We applied microstimulation immediately following feedback on approximately half of the reward trials during stage 2 (the latter 160 trials) of each intra-operative session. Specifically, we applied stimulation following 2 of every 4 reward trials, which were pseudorandomly determined at the beginning of each experiment. Stimulation was provided through the microelectrode immediately following feedback presentation during the learning task using an FHC Pulsar 6b microstimulator with the following parameters: bi-phasic, cathode phase-lead pulses at 90 Hz, lasting 500 ms at an amplitude of 150 μA and a pulse width of 500 μs. These stimulation parameters were used in our previous SN microstimulation study (Ramayya et al., 2014a), and similar parameters have induced learning in the rodent SN (Reynolds et al., 2001) and the non-human primate VTA (Grattan et al., 2011). An LED on the front chassis of the stimulator indicated the onset of stimulation; however, this was not visible to the patient as they performed the task. There was no sound associated with stimulation. Thus, stimulation trials were not signaled to subjects in any manner. None of the subjects reported a perceptual change following the application of microstimulation.
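
As a rough illustration of the stimulation schedule, the sketch below marks 2 of every 4 consecutive reward trials for stimulation and lists pulse onsets for a 90 Hz, 500 ms train (45 pulses). The function names are hypothetical, and shuffling within consecutive groups of four is an assumed reading of "2 of every 4 reward trials"; the actual pseudorandomization procedure is not described in the text.

```python
import random


def stim_flags(n_reward_trials, rng):
    """Mark which reward trials receive stimulation.

    Assumption: within each consecutive group of 4 reward trials,
    exactly 2 are stimulated, in shuffled order.
    """
    flags = []
    while len(flags) < n_reward_trials:
        group = [True, True, False, False]
        rng.shuffle(group)
        flags.extend(group)
    return flags[:n_reward_trials]


def pulse_onsets_ms(rate_hz=90, train_ms=500):
    """Onset times (ms) of each pulse within one stimulation train."""
    n_pulses = int(rate_hz * train_ms / 1000)  # 90 Hz x 0.5 s = 45 pulses
    period_ms = 1000 / rate_hz                 # about 11.1 ms between onsets
    return [i * period_ms for i in range(n_pulses)]
```

Fixing the schedule before the session, as the text describes, means the stimulated trials do not depend on the subject's behavior during the task.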

2.5. Extracting Spiking Activity from Microelectrode Recordings

We obtained microelectrode recordings during the first 40 trials of each intra-operative session prior to applying microstimulation during the experiment. Because these recordings were of a relatively short duration (≈ 5 min), their main purpose was to aid in interpretation of the stimulation results, rather than to characterize the functional properties of human SN neuronal activity (Zaghloul et al., 2009; Ramayya et al., 2014b). To assess whether stimulation-related behavioral changes were related to the properties of neurons near the electrode tip, we extracted multi-unit activity following methods previously described (Ramayya et al., 2014a,b).

Briefly, we extracted neuronal activity from each microelectrode recording using the WaveClus software package (Quiroga et al., 2005) after band-pass filtering the signal and manually removing periods of motion artifact. We identified spike events as positive or negative deflections in the voltage trace that crossed a threshold that was manually defined for each recording (≈ 3.5 S.D.). We used both positive and negative voltage fluctuations to identify units, rather than only negative deflections as in our previous microstimulation study (Ramayya et al., 2014a), because our recent electrophysiological study demonstrated that positive voltage fluctuations also contain task-related unit activity (Ramayya et al., 2014b). Spikes were subsequently clustered into units based on the first three principal components of the waveform and noise clusters from motion artifact or power line contamination were manually invalidated. We considered positive and negative deflections in the voltage signal to be independent units, but otherwise combined spiking activity on a given channel into multi-unit activity. We identified between 1 and 2 multi-units on each recording channel, except for one subject (#8) where we could not distinguish spiking activity from noise contamination (Table 1). When 2 multi-units were recorded from a single subject, we considered baseline firing rate to be the average baseline firing rate of the two contributing units to account for the artificial elevation in firing rate that results from combining units.
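
The thresholding step can be sketched as follows. This is a simplified illustration of detecting positive and negative threshold crossings at roughly 3.5 S.D., not the WaveClus implementation; the function name is hypothetical, and the input is assumed to be an already band-pass filtered, artifact-free voltage trace.

```python
import statistics


def detect_spikes(trace, n_sd=3.5):
    """Return sample indices of positive and negative threshold crossings.

    Assumption: the threshold is n_sd times the standard deviation of the
    whole trace; in the study the threshold was set manually per recording.
    """
    mu = statistics.fmean(trace)
    thr = n_sd * statistics.pstdev(trace)
    pos, neg = [], []
    for i in range(1, len(trace)):
        prev, cur = trace[i - 1] - mu, trace[i] - mu
        if prev < thr <= cur:        # upward crossing of +threshold
            pos.append(i)
        elif prev > -thr >= cur:     # downward crossing of -threshold
            neg.append(i)
    return pos, neg
```

Keeping the two crossing types in separate lists mirrors the decision to treat positive and negative deflections as independent units before clustering.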

Table 1. Summary of participant data.

2.6. Identifying Putative Dopaminergic and GABAergic Neurons

To study the effect of microstimulation on SN DA and GABA neurons, we sought to assess the location of the microelectrode relative to each of these neural populations. Because DA and GABA neurons are locally clustered but largely interspersed in the SN (Poirier et al., 1983), putative DA and GABA neurons are typically identified based on physiological and functional properties of neurons, rather than their relative location within the SN (Fiorillo et al., 2013). Because of the limited intra-operative time, and technical challenges in simultaneously recording neural activity and applying microstimulation, we obtained neural recordings for a short duration (stage 1, ≈ 5 min) from a particular site in the SN prior to applying stimulation at that site. We sought to leverage findings from prior dedicated electrophysiology studies in animals (Ungless and Grace, 2012) and humans (Ramayya et al., 2014b) to infer the proximity of the microelectrode to these respective neuron types.

Previous studies which have combined electrophysiological recordings with pharmacological manipulations (Schultz and Romo, 1987) or histochemical techniques (Henny et al., 2012) have shown that DA neurons exhibit slow firing rates and broad waveforms, whereas GABA neurons display fast firing rates and narrow waveforms (Ungless and Grace, 2012). In a previous study, we showed that SN single-units that demonstrated firing rates slower than 15 Hz and waveform durations >0.8 ms demonstrated post-feedback responses consistent with DA neurons, whereas units that demonstrated high spike rates (>15 Hz) and narrow waveforms (<0.8 ms) demonstrated post-feedback responses consistent with GABAergic neurons (Ramayya et al., 2014b), a finding consistent with prior non-human primate studies (Matsumoto and Hikosaka, 2009).

In the current study, because of limited recording time and a limited number of subjects, we did not seek to identify distinct DA and GABA units to study as separate groups. Instead, we sought to extract physiological parameters of multi-unit activity that could serve as biomarkers of putative DA or GABA neural populations near the microelectrode. We extracted three physiological features from each unit as indicators of putative DA and GABAergic activity (Ungless and Grace, 2012; Ramayya et al., 2014a,b): mean spike rate, waveform duration (computed as peak-to-trough duration), and phasic post-reward activity (the difference between the average spike rate during 0–500 ms post-reward interval, and that during the −250–0 and 500–750 ms intervals). We sought to assess whether these physiological parameters could predict the effect of microstimulation on behavior. We used this approach in our previous microstimulation study (Ramayya et al., 2014a) to uncover a relation between the effect of stimulation and the properties of neurons recorded near the electrode tip.
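
The phasic post-reward measure defined above can be sketched as below. Times are in seconds, the function name is hypothetical, and pooling the two flanking intervals (−250–0 ms and 500–750 ms) into a single 500 ms baseline is an assumption; the text does not state whether the flanking rates were pooled or averaged separately, nor how the reported z-scores were derived from this difference.

```python
def phasic_post_reward(spike_times, reward_times):
    """Mean difference (Hz) between post-reward and flanking baseline rates.

    Post-reward window: 0 to 0.5 s after reward onset.
    Baseline windows:   -0.25 to 0 s and 0.5 to 0.75 s (pooled, 0.5 s total).
    """
    diffs = []
    for t0 in reward_times:
        n_post = sum(1 for s in spike_times if t0 <= s < t0 + 0.5)
        n_base = sum(
            1 for s in spike_times
            if (t0 - 0.25 <= s < t0) or (t0 + 0.5 <= s < t0 + 0.75)
        )
        diffs.append(n_post / 0.5 - n_base / 0.5)
    return sum(diffs) / len(diffs)
```

A positive value indicates a transient increase in firing after reward relative to the surrounding baseline, and a value near zero indicates no phasic response.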

2.7. Statistical Analyses

Unless otherwise noted, we performed across-subject analyses whereby each subject contributed one observation to each statistical test. We used Student's t-tests to compare the mean values of continuous distributions, and Pearson's correlation r when studying the linear dependence between two variables. We considered p < 0.05 to be statistically significant.

3. Results

We applied intra-operative microstimulation in the SN of eleven patients undergoing DBS for the treatment of PD as they performed a reinforcement learning task (Table 1). Subjects selected between a red and blue card deck by pressing buttons on hand-held controllers and subsequently received positive or negative feedback (Figure 1A). The reward probabilities associated with each card deck stochastically fluctuated throughout the intra-operative session to encourage learning (Figure 1B, see Section 2).

Subjects demonstrated clear evidence of learning on the task. Both during stage 1 and stage 2, subjects showed an increased probability of repeating the same action after receiving positive feedback [“win-stay,” 0.5 expected by chance; t(10) > 5.8, p's < 0.001, Figure 1C]. Subjects also showed an increased probability of making a high reward probability choice (“accuracy”) during the last 10 trials of a particular reward probability regime, as compared to the first 10 trials after a regime switch [t(10) = 4.35, p = 0.001, Figure 1D].

To assess the importance of SN neuronal activity for learning, we applied SN microstimulation following approximately half the reward trials during stage 2 of each subject's intra-operative session. To assess whether SN stimulation had an effect on learning, we compared subjects' win-stay probabilities following reward trials that were accompanied by stimulation (“stim trials”) and stage 2 reward trials during which stimulation was not applied (“control trials”). Across 11 subjects, we observed a trend toward decreased win-stay following stimulation trials compared to control trials [t(10) = 2.03, p = 0.068, Figure 2].
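
The win-stay comparison between stimulation and control trials can be sketched as follows. The function names are hypothetical, and defining the stimulation effect as stimulation minus control is an assumed sign convention, chosen so that negative values correspond to impaired learning.

```python
def win_stay(choices, rewarded, include=None):
    """Fraction of rewarded trials followed by a repeat of the same choice.

    include: optional per-trial booleans restricting which rewarded
    trials are counted (e.g., only stimulated trials).
    """
    stays = []
    for i in range(len(choices) - 1):
        if not rewarded[i]:
            continue
        if include is not None and not include[i]:
            continue
        stays.append(choices[i + 1] == choices[i])
    return sum(stays) / len(stays) if stays else float("nan")


def stim_effect(choices, rewarded, stimulated):
    """Win-stay after stimulated reward trials minus control reward trials."""
    control = [r and not s for r, s in zip(rewarded, stimulated)]
    return (win_stay(choices, rewarded, stimulated)
            - win_stay(choices, rewarded, control))
```

For example, with choices `["L", "L", "R", "R", "L"]`, rewards on trials 1, 3, and 4, and stimulation on trials 1 and 4, win-stay is 0.5 after stimulation and 1.0 after control trials, giving an effect of −0.5.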

Figure 2. Stimulation-related change in learning. (A) Each subject's probability of win-stay during stage 2 is indicated by an “x,” following control trials on the left and following stimulation trials on the right. (B) Across subjects, we observed a trend toward a stimulation-related decrease in learning (p = 0.068). Error bars indicate standard error of mean (s.e.m) across subjects; see main text for statistics.

Our main hypothesis was that stimulation-related changes in learning would vary based on the functional properties of neurons near the electrode tip. To assess whether this was the case, we extracted various physiological parameters from neural activity recorded during stage 1 of each subject's intra-operative session (see Section 2). We assessed whether there was a correlation between stimulation-related changes in learning and mean spike rate of units recorded on each channel, and observed a significant negative correlation such that the greatest impairments in learning were observed when the electrode was positioned near neurons with relatively high spike rates (r = −0.64, p = 0.045, Figure 3A). Based on the established finding that high spike rates and narrow waveforms are properties of GABAergic neurons (Ungless and Grace, 2012), we also assessed for a correlation between stimulation-related changes in learning and mean waveform duration. We observed a positive correlation between stimulation-related changes in learning and waveform duration, such that the strongest impairments occurred near neurons with narrow waveforms (r = 0.64, p = 0.044, Figure 3B). We did not observe a significant relation between stimulation-related changes in learning and phasic post-reward changes in activity (p > 0.5), and generally did not observe post-reward phasic changes in activity (z-score range: −0.1:0.36). Two example neurons are shown in Figure 3C.
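
As a consistency check on the reported correlation, a Pearson's r can be converted to a t statistic. The worked example below assumes n = 10 observations (eleven subjects minus the one subject without identifiable spiking activity); this sample size is inferred from the text rather than stated alongside the statistic.

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{-0.64\,\sqrt{8}}{\sqrt{1-0.64^{2}}}
  \approx \frac{-1.81}{0.768}
  \approx -2.36,
\qquad \mathrm{df} = n - 2 = 8
```

Under this assumption, |t| ≈ 2.36 slightly exceeds the two-tailed 5% critical value of 2.306 for 8 degrees of freedom, which agrees with the reported p = 0.045. The same magnitude applies to the waveform-duration correlation (r = 0.64).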

Figure 3. Stimulation-related changes in learning are related to recorded neural activity. (A) Stimulation-related changes in learning during stage 2 were negatively correlated with mean spike rate of units recorded on that electrode during stage 1 (Pearson's r = −0.64, p = 0.045). (B) Same as (A) but demonstrating a positive correlation between stimulation-related changes in learning and mean waveform duration (Pearson's r = 0.64, p = 0.044). Each dot represents a subject, the solid black line is the regression slope, and the dashed lines represent 95% confidence intervals. (C) Neural recordings of multi-unit activity observed from two subjects (shown in red in A,B). For each unit, we show the average waveform (top left, gray shading marks the standard deviation), the inter-spike interval (bottom left, dashed line marks 3 ms), the average post-reward firing response (top right, smoothed with a Gaussian kernel of half-width = 75 ms; gray shading indicates s.e.m), and the spike raster following reward trials. Dashed black line indicates reward onset.

4. Discussion

We applied microstimulation in the SN of patients undergoing DBS for the treatment of PD as they performed a reinforcement learning task. We found that microstimulation applied during the 500-ms post-reward interval impaired learning. These results demonstrate a causal relation between post-reward SN firing and human reinforcement learning as microstimulation is known to acutely enhance local neural firing (Histed et al., 2009). We hypothesized that the effect of SN microstimulation on learning would vary based on its relative proximity to dopaminergic (DA) neurons that guide reinforcement learning (Glimcher, 2011) or GABAergic neurons that exert inhibitory control on DA neurons (Damier et al., 1999a; Lobb et al., 2011; Ramayya et al., 2014b). As hypothesized, we observed the largest stimulation-related impairments in learning when the electrode was positioned near neurons with relatively high firing rates and narrow waveforms, properties characteristic of GABA neurons (Joshua et al., 2009; Matsumoto and Hikosaka, 2009; Ungless and Grace, 2012). Thus, our results suggest that microstimulation near GABA neurons impairs reinforcement learning.

This finding provides direct evidence relating phasic SN neural firing to human reinforcement learning. It goes beyond animal electrophysiology studies that may not generalize to human learning because they typically involve long periods of intense training. It also goes beyond prior human studies of reinforcement learning; functional neuroimaging studies cannot test a causal role for SN neural activity (Montgomery et al., 2009), and pharmacological manipulations of DA in patients with PD (Frank et al., 2004; Rutledge et al., 2009) cannot distinguish phasic neural activity from tonic changes in DA throughout the brain (Niv et al., 2007). Ramayya et al. (2014a) also showed a stimulation-related decrease in performance. However, because rewards in that study were contingent on stimuli, but independent of actions, the observed stimulation-related decrease in performance could either be attributed to an impairment of learning or a selective strengthening of action-reward associations that competed with stimulus-reward learning. Our current study overcame this limitation by using an experimental design with consistent stimulus-response mapping, such that stimulus-reward and action-reward associations were always correlated. Thus, our finding of a stimulation-related impairment in performance suggests decreased learning.

In Ramayya et al. (2014a), stimulation-related decreases in performance were correlated with an increased propensity to repeat the same action following reward, particularly when the electrode was positioned near putative DA neurons, suggesting that microstimulation near SN DA neurons enhanced action-reward learning. The current finding that stimulation near putative GABA neurons produced impairments in reinforcement suggests opposing roles of DA and GABA neurons during reinforcement learning. Specifically, if phasic bursts of SN DA neurons encode reward prediction errors that result in subsequent learning (Glimcher, 2011), and SN GABA neurons provide inhibitory inputs to local DA neurons (Tepper et al., 1995; Luscher and Ungless, 2006; Lobb et al., 2011; Henny et al., 2012; Pan et al., 2013), then one would observe enhanced learning when stimulating DA neurons (Ramayya et al., 2014a), but impaired reinforcement learning following microstimulation of SN GABA neurons. This explanation is also supported by our observation of opposing post-reward firing responses from putative DA and GABA neurons in the human (Ramayya et al., 2014b).

It is difficult to interpret whether the observed changes in reinforcement learning were related to changes in stimulus-reward and/or action-reward learning because these forms of learning were perfectly correlated in the current experimental design. That we did not observe robust stimulation-related changes in learning near putative DA sites is difficult to interpret when considering our previous finding that microstimulation near putative DA neurons enhances action-reward learning (de Berker and Rutledge, 2014; Ramayya et al., 2014a). It is possible that we did not sample from a functional population of DA neurons in this study, as suggested by the absence of phasic post-reward bursts in activity from putative DA neurons in this study, unlike our previous studies (Ramayya et al., 2014a,b). Alternatively, it is possible that stimulation near SN DA neurons has a specific effect on action-reward learning that was not evident in this study because it was masked by simultaneous stimulus-reward learning.

An alternative explanation for how microstimulation of SN GABA neurons might have resulted in impaired learning is that stimulation may have caused a behavioral change during the post-reward interval that impaired subjects' learning during those trials. Several studies have linked the firing of SN GABA neurons in the pars reticulata subregion (that contains the majority of SN GABA neurons; Nair-Roberts et al., 2008) to regulation of downstream movement and saccade-generating structures (e.g., superior colliculus; Carpenter et al., 1976; DeLong et al., 1983; Hikosaka and Wurtz, 1983). If microstimulation of SN GABA neurons suppressed orienting saccades that likely occurred in response to the presentation of salient reward stimuli (in this case, a silver dollar and the sound of cash register; Hikosaka and Wurtz, 1983), then reward stimuli presented during stimulation trials might be associated with diminished salience and result in reduced learning. However, this is unlikely to be the case because non-human primate studies have shown that SN microstimulation has a limited influence on visually-guided saccades (Mahamed et al., 2011).

We note several limitations to our study. First, we are unable to provide direct histochemical evidence that electrophysiological parameters (spike rate and waveform duration) indicate distinct neuronal populations; however, a large body of evidence from animal studies suggests that these electrophysiological criteria may be used to identify distinct midbrain neuronal populations (Ungless and Grace, 2012). Second, we did not observe stimulation-related changes in learning near putative DA neurons in this study, whereas we observed such changes in our previous microstimulation study (Ramayya et al., 2014a). This likely reflects reduced sampling of DA neurons during this experiment, which is consistent with the fact that we did not observe post-reward bursts of activity in this study (a marker of DA activity), in contrast to Ramayya et al. (2014a). Finally, the population we studied (patients undergoing DBS surgery for PD) is known to have degeneration of DA neurons in the SN. Even though this poses the challenge of interpreting findings concerning the functional role of SN neurons in patients who have degenerative disease, histological studies in PD patients (Damier et al., 1999b) and electrophysiological studies in rat models of PD (Hollerman and Grace, 1990; Zigmond et al., 1990) and humans (Zaghloul et al., 2009; Ramayya et al., 2014b) indicate that a significant population of viable neurons remains in the parkinsonian SN. Taken together with the clear evidence of learning that subjects demonstrated during the task, we suggest that the neural processes we describe reflect the subpopulation of healthy neurons that remain in the SN.

5. Conclusion

We demonstrate a specific role for SN GABAergic neural activity in human reinforcement learning. We found that the proximity of SN microstimulation to putative GABA neurons predicted impairments in learning, possibly related to local inhibition of phasic DA bursts. These results raise the possibility that SN microstimulation may allow for bi-directional control of reinforcement learning in pathological conditions (e.g., stimulation of GABA neurons to reduce learning during addiction, and stimulation of DA neurons to enhance learning during stroke recovery). To further evaluate this possibility, future studies must improve intra-operative targeting of DA and GABA neurons and clarify the mechanisms by which SN microstimulation alters learning.

Ethics Statement

This study was carried out in accordance with the recommendations of University of Pennsylvania Institutional Review Board with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the University of Pennsylvania Institutional Review Board.

Author Contributions

AR, GB, IP, and MK designed research; AR, DL, AL, PW, and GB performed research; IP and AR analyzed the data; AR and MK wrote the paper. All authors contributed to the intellectual content of the research and provided final approval on the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

This work was supported by the DARPA Restoring Active Memory (RAM) program (Cooperative Agreement N66001-14-2-4032). The views, opinions, and/or findings contained in this material are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Patient Recruitment

We thank Marie Kerr and Hanane Chaibainou for their invaluable help in coordinating patient recruitment for this study.

Dedication

We dedicate this manuscript to the memory of our dear friend and colleague, Anastasia Lyalenko, who tragically passed away on June 15, 2015, at the age of 22, from complications related to viral myocarditis.

References

Bayer, H., and Glimcher, P. (2005). Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron 47, 129–141. doi: 10.1016/j.neuron.2005.05.020

Carpenter, M., Nakano, K., and Kim, R. (1976). Nigrothalamic projections in the monkey demonstrated by autoradiographic technics. J. Comp. Neurol. 165, 401–416. doi: 10.1002/cne.901650402

Clark, K. L., Armstrong, K. M., and Moore, T. (2011). Probing neural circuitry and function with electrical microstimulation. Proc. Biol. Sci. 278, 1121–1130. doi: 10.1098/rspb.2010.2211

Damier, P., Hirsch, E., Agid, Y., and Graybiel, A. M. (1999a). The substantia nigra of the human brain I. Nigrosomes and the nigral matrix, a compartmental organization based on calbindin d28k immunohistochemistry. Brain 122, 1421–1436. doi: 10.1093/brain/122.8.1421

Damier, P., Hirsch, E., Agid, Y., and Graybiel, A. M. (1999b). The substantia nigra of the human brain II. Patterns of loss of dopamine-containing neurons in Parkinson's disease. Brain 122, 1437–1448. doi: 10.1093/brain/122.8.1437

de Berker, A., and Rutledge, R. (2014). A role for the human substantia nigra in reinforcement learning. J. Neurosci. 34, 12947–12949. doi: 10.1523/JNEUROSCI.2854-14.2014

DeLong, M., Crutcher, M., and Georgopoulos, A. P. (1983). Relations between movement and single cell discharge in the substantia nigra of the behaving monkey. J. Neurosci. 3, 1599–1606.

Fiorillo, C., Yun, S., and Song, M. (2013). Diversity and homogeneity in responses of midbrain dopamine neurons. J. Neurosci. 33, 4693–4709. doi: 10.1523/JNEUROSCI.3886-12.2013

Frank, L. M., Stanley, G., and Brown, E. (2004). Hippocampal plasticity across multiple days of exposure to novel environments. J. Neurosci. 24, 7681–7689. doi: 10.1523/JNEUROSCI.1958-04.2004

Glimcher, P. (2011). Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc. Natl. Acad. Sci. U.S.A. 108, 15647–15654. doi: 10.1073/pnas.1014269108

Grattan, L., Rutledge, R., and Glimcher, P. (2011). “Increased dopamine concentrations increase the perceived value of an action,” in Program No. 732.12. Society for Neuroscience Meeting Planner (San Diego, CA: Society for Neuroscience).

Haber, S. N., Fudge, J. L., and McFarland, N. R. (2000). Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum. J. Neurosci. 20, 2369–2382.

Henny, P., Brown, M., Northrop, A., Faunes, M., Ungless, M., Magill, P., et al. (2012). Structural correlates of heterogeneous in vivo activity of midbrain dopaminergic neurons. Nat. Neurosci. 15, 613–619. doi: 10.1038/nn.3048

Hikosaka, O., and Wurtz, R. (1983). Visual and oculomotor functions of monkey substantia nigra pars reticulata. I. Relation of visual and auditory responses to saccades. J. Neurophysiol. 49, 1230–1253.

Histed, M., Bonin, V., and Reid, C. (2009). Direct activation of sparse, distributed populations of cortical neurons by electrical microstimulation. Neuron 63, 508–522. doi: 10.1016/j.neuron.2009.07.016

Hollerman, J., and Grace, A. (1990). The effects of dopamine-depleting brain lesions on the electrophysiological activity of rat substantia nigra dopamine neurons. Brain Res. 533, 203–212. doi: 10.1016/0006-8993(90)91341-D

Jaggi, J., Umemura, A., Hurtig, H., Siderowf, A., Colcher, A., Stern, M., et al. (2004). Bilateral stimulation of the subthalamic nucleus in Parkinson's disease: surgical efficacy and prediction of outcome. Stereot. Funct. Neurosurg. 82, 104–114. doi: 10.1159/000078145

Joshua, M., Adler, A., Rosin, B., Vaadia, E., and Bergman, H. (2009). Encoding of probabilistic rewarding and aversive events by pallidal and nigral neurons. J. Neurophysiol. 101, 758–772. doi: 10.1152/jn.90764.2008

Lau, B., and Glimcher, P. (2008). Value representations in the primate striatum during matching behavior. Neuron 58, 451–463. doi: 10.1016/j.neuron.2008.02.021

Lobb, C., Wilson, C., and Paladini, C. (2011). High-frequency, short-latency disinhibition bursting of midbrain dopaminergic neurons. J. Neurophysiol. 105, 2501–2511. doi: 10.1152/jn.01076.2010

Luscher, C., and Ungless, M. (2006). The mechanistic classification of addictive drugs. PLoS Med. 3:e437. doi: 10.1371/journal.pmed.0030437

Mahamed, S., Garrison, T. J., Shires, J., and Basso, M. A. (2011). Stimulation of the substantia nigra influences the specification of memory-guided saccades. J. Neurophysiol. 111, 804–816. doi: 10.1152/jn.00002.2013

Matsumoto, M., and Hikosaka, O. (2009). Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459, 837–841. doi: 10.1038/nature08028

Montague, P. R., Dayan, P., and Sejnowski, T. J. (1996). A framework for mesencephalic dopamine systems based on predictive hebbian learning. J. Neurosci. 16, 1936–1947.

Montgomery, S., Betancur, M., and Buzsaki, G. (2009). Behavior-dependent coordination of multiple theta dipoles in the hippocampus. J. Neurosci. 29:1381. doi: 10.1523/JNEUROSCI.4339-08.2009

Nair-Roberts, R., Chatelain-Badie, S., Benson, E., White-Cooper, H., Bolam, J., and Ungless, M. (2008). Stereological estimates of dopaminergic, GABA-ergic, and glutamatergic neurons in the ventral tegmental area, substantia nigra and retrorubral field in the rat. Neuroscience 152, 1024–1031. doi: 10.1016/j.neuroscience.2008.01.046

Niv, Y., Daw, N., Joel, D., and Dayan, P. (2007). Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology 191, 507–520. doi: 10.1007/s00213-006-0502-4

Pan, W. X., Brown, J., and Dudman, J. (2013). Neural signals of extinction in the inhibitory microcircuit of the ventral midbrain. Nat. Neurosci. 16, 71–78. doi: 10.1038/nn.3283

Poirier, L., Giguére, M., and Marchand, R. (1983). Comparative morphology of the substantia nigra and ventral tegmental area in the monkey, cat and rat. Brain Res. Bull. 11, 371–397. doi: 10.1016/0361-9230(83)90173-9

Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C., and Fried, I. (2005). Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107. doi: 10.1038/nature03687

Ramayya, A. G., Misra, A., Baltuch, G. H., and Kahana, M. J. (2014a). Microstimulation of the human substantia nigra following feedback alters reinforcement learning. J. Neurosci. 34, 6887–6895. doi: 10.1523/JNEUROSCI.5445-13.2014

Ramayya, A. G., Zaghloul, K. A., Weidemann, C. T., Baltuch, G. H., and Kahana, M. J. (2014b). Electrophysiological evidence for functionally distinct neuronal populations in the human substantia nigra. Front. Hum. Neurosci. 8:655. doi: 10.3389/fnhum.2014.00655

Reynolds, J., Hyland, B., and Wickens, J. (2001). A cellular mechanism of reward-related learning. Nature 413, 67–70. doi: 10.1038/35092560

Rutledge, R., Lazzaro, S., Lau, B., Myers, C. E., Gluck, M. A., and Glimcher, P. (2009). Dopaminergic drugs modulate learning rates and perseveration in Parkinson's patients in a dynamic foraging task. J. Neurosci. 29, 15104–15114. doi: 10.1523/JNEUROSCI.3524-09.2009

Schultz, W., Dayan, P., and Montague, P. R. (1997). A neural substrate of prediction and reward. Science 275, 1593–1599. doi: 10.1126/science.275.5306.1593

Schultz, W., and Romo, R. (1987). Responses of nigrostriatal dopamine neurons to high-intensity somatosensory stimulation in the anesthetized monkey. J. Neurophysiol. 57, 201–217.

Sutton, R., and Barto, A. (1990). “Time-derivative models of Pavlovian reinforcement,” in Learning and Computational Neuroscience: Foundations of Adaptive Networks, eds M. Gabriel and J. Moore (Cambridge, MA: MIT Press), 497–537.

Tepper, J., Martin, L., and Anderson, D. (1995). GABA-A receptor-mediated inhibition of rat substantia nigra dopaminergic neurons by pars reticulata projection neurons. J. Neurosci. 15, 3092–3103.

Thorndike, E. L. (1932). The Fundamentals of Learning. New York, NY: Bureau of Publications.

Ungless, M., and Grace, A. (2012). Are you or aren't you? Challenges associated with physiologically identifying dopamine neurons. Trends Neurosci. 35, 422–430.

Williams, J., Ramaswamy, D., and Oulhaj, A. (2006). 10 Hz flicker improves recognition memory in older people. BMC Neurosci. 7:21. doi: 10.1186/1471-2202-7-21

Zaghloul, K. A., Blanco, J. A., Weidemann, C. T., McGill, K., Jaggi, J. L., Baltuch, G. H., et al. (2009). Human substantia nigra neurons encode unexpected financial rewards. Science 323, 1496–1499. doi: 10.1126/science.1167342

Zigmond, M., Abercrombie, E., Berger, T. W., Grace, A., and Stricker, E. (1990). Compensations after lesions of central dopaminergic neurons: some clinical and basic implications. Trends Neurosci. 13, 290–296. doi: 10.1016/0166-2236(90)90112-N

Keywords: substantia nigra, human, dopamine, GABA, neuron, reinforcement learning, microstimulation, Parkinson's disease

Citation: Ramayya AG, Pedisich I, Levy D, Lyalenko A, Wanda P, Rizzuto D, Baltuch GH and Kahana MJ (2017) Proximity of Substantia Nigra Microstimulation to Putative GABAergic Neurons Predicts Modulation of Human Reinforcement Learning. Front. Hum. Neurosci. 11:200. doi: 10.3389/fnhum.2017.00200

Received: 05 November 2016; Accepted: 06 April 2017;
Published: 09 May 2017.

Edited by:

Carol Seger, Colorado State University, USA

Reviewed by:

Kenneth T. Kishida, Wake Forest School of Medicine, USA
Kenji Morita, University of Tokyo, Japan

Copyright © 2017 Ramayya, Pedisich, Levy, Lyalenko, Wanda, Rizzuto, Baltuch and Kahana. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Michael J. Kahana, kahana@psych.upenn.edu
