A spiking Basal Ganglia model of synchrony, exploration and decision making

Mandali, Alekhya; Rengaswamy, Maithreye; Chakravarthy, V. Srinivasa; Moustafa, Ahmed A.

doi:10.3389/fnins.2015.00191

ORIGINAL RESEARCH article

Front. Neurosci., 27 May 2015

Sec. Decision Neuroscience

Volume 9 - 2015 | https://doi.org/10.3389/fnins.2015.00191


Alekhya Mandali ¹
Maithreye Rengaswamy ¹
V. Srinivasa Chakravarthy ¹^*
Ahmed A. Moustafa ²

1. Computational Neuroscience Lab, Department of Biotechnology, Bhupat and Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, India
2. Marcs Institute for Brain and Behaviour and School of Social Sciences and Psychology, University of Western Sydney, Sydney, NSW, Australia


Abstract

To make an optimal decision we need to weigh all the available options, compare them with the current goal, and choose the most rewarding one. Depending on the situation an optimal decision could be to either “explore” or “exploit” or “not to take any action” for which the Basal Ganglia (BG) is considered to be a key neural substrate. In an attempt to expand this classical picture of BG function, we had earlier hypothesized that the Indirect Pathway (IP) of the BG could be the subcortical substrate for exploration. In this study we build a spiking network model to relate exploration to synchrony levels in the BG (which are a neural marker for tremor in Parkinson's disease). Key BG nuclei such as the Sub Thalamic Nucleus (STN), Globus Pallidus externus (GPe) and Globus Pallidus internus (GPi) were modeled as Izhikevich spiking neurons whereas the Striatal output was modeled as Poisson spikes. The model is cast in a reinforcement learning framework with the dopamine signal representing reward prediction error. We apply the model to two decision making tasks: a binary action selection task (similar to one used by Humphries et al., 2006) and an n-armed bandit task (Bourdaud et al., 2008). The model shows that exploration levels could be controlled by STN's lateral connection strength, which also influenced the synchrony levels in the STN-GPe circuit. An increase in STN's lateral strength led to a decrease in exploration, which can be thought of as a possible explanation for the reduced exploratory levels in Parkinson's patients. Our simulations also show that on complete removal of IP, the model exhibits only Go and No-Go behaviors, thereby demonstrating the crucial role of IP in exploration. Our model provides a unified account of synchronization, action selection, and explorative behavior.

Introduction

Imagine a situation where you would like to dine out and are in search of suitable restaurants. Some restaurants you know for sure are good, and others you have no idea about. In other words you have two fundamentally different options of which one is to order your favorite dish and play it safe (i.e., “exploit”) while the other is to try something new (i.e., “explore”). Further, an unexpected weather change would force you to stay at home (i.e., a No Go decision). How does our brain make a decision in such a scenario? Depending on the situation, an optimal decision could be to either explore, exploit or to take no action (Cohen et al., 2007; Prescott et al., 2007).

A group of subcortical structures collectively called the Basal ganglia (BG) play an important role in many cognitive processes (Gurney et al., 2001a,b; Humphries and Gurney, 2002; Chakravarthy et al., 2010; Schroll et al., 2012; Yucelgen et al., 2012; Chersi et al., 2013) including decision making and action selection. The BG circuit includes the neo-striatum (caudate and putamen), Globus pallidus (externa, GPe, and interna, GPi), subthalamic nucleus (STN), and substantia nigra (pars compacta, SNc and pars reticulata, SNr). BG receive inputs from the cortex through the striatum and STN (Maurice et al., 1998; Aravamuthan et al., 2007) and project through SNr and GPi, the output nuclei of BG, via thalamus (Albin et al., 1989) to motor and executive areas of the cortex (Steiner and Tseng, 2010). Classically BG pathways are segregated into the indirect pathway (IP) constituting a part of the striatum, GPe and STN projecting to GPi (Gerfen and Surmeier, 2011) and the direct pathway (DP) constituting the projection from the striatum to GPi (Gerfen and Surmeier, 2011). The final “action selection” is based on the combined contributions of the two pathways at output nuclei (Smith et al., 1998). The effect of dopamine (DA) on BG pathways and decision making is well known (Rogers, 2010). Under low DA conditions, IP is more active than DP leading to “No-Go” behavior (Frank, 2005) whereas in high DA conditions DP is more active than IP leading to “Go” (Chevalier and Deniau, 1990). But this traditional explanation of action selection in binary terms of Go/No Go leaves “exploration” and its possible neural substrates out of the picture.

The ability to switch between explorative and exploitative behavior during decision making drew the attention of neuroscientists to study and characterize the corresponding anatomical substrates. It has been suggested that the pallidum, in its interactions with the noradrenergic system, controls the balance between exploration and exploitation (Russell et al., 1992; Aston-Jones et al., 1994; Doya, 2002). Humphries et al. (2007) argue that the brainstem, specifically the medial reticular formation (mRF), might be the substrate for action selection (Humphries et al., 2007). Schroll et al. (2012) presented a model of working memory sub-served by the cortico-basal-ganglia-thalamic loops where exploration in the model was obtained by the addition of noise to neural dynamics, but no anatomical substrate was suggested (Schroll et al., 2012). Chersi et al. (2013) simulate the role of BG and prefrontal cortex in goal-oriented learning vs. habitual learning and hypothesize that exploration emerges during the “up” state of striatal neurons (Chersi et al., 2013). Shouno et al. (2009) built a spiking network model of BG where the IP selects an action and the DP determines the timing of the selected action. Though the model was able to show exploration in terms of variability in action selection, there was no component of learning in the network (Shouno et al., 2009). Stewart et al. (2012) simulated the rat bandit task experiment using a leaky integrate and fire model of cortex and BG where the spiking activity of ventral striatum during a response was measured. Though the model showed behavioral learning, an anatomical substrate for exploration was not suggested (Stewart et al., 2012). A recent study by Humphries et al. (2012) suggests a role for tonic DA in setting the exploitation-exploration tradeoff (Humphries et al., 2012) in basal ganglia. Among computational models of BG, very few simulated the neural substrates for exploration within the BG system. The study by Archibald et al. (2013a) on PD patients indicates a decrease in exploration behavior compared to healthy controls during a visuo-spatial task (Archibald et al., 2013a,b).

Chakravarthy et al. (2010) suggested that STN-GPe loop, a coupled excitatory-inhibitory network in the IP, might be the substrate for exploration (Chakravarthy et al., 2010). It is well known that coupled excitatory-inhibitory pools of neurons can exhibit rich dynamic behavior like oscillations and chaos (Borisyuk et al., 1995; Sinha, 1999). This hypothesis has inspired models simulating various BG functions ranging from action selection in continuous spaces (Krishnan et al., 2011), reaching movements (Magdoom et al., 2011), spatial navigation (Sukumar et al., 2012), precision grip (Gupta et al., 2013), and gait (Muralidharan et al., 2013) in normal and Parkinsonian conditions. Using a network of rate-coding neurons, Kalva et al. (2012) showed that exploration emerges out of the chaotic dynamics of the STN-GPe system (Kalva et al., 2012). Most rate coded models, by design, fail to capture dynamic phenomena like synchronization found in more realistic spiking neuron models (Terman et al., 2002; Park et al., 2010, 2011). Synchronization within BG nuclei had gained attention since the discovery that STN, GPe, and GPi neurons show high levels of synchrony in Parkinsonian conditions (Bergman et al., 1994; Bevan et al., 2002; Hammond et al., 2007; Tachibana et al., 2011; Weinberger and Dostrovsky, 2011). This oscillatory activity was found to be present in two frequency bands, one around the tremor frequency [2–4 Hz] and another in [10–30 Hz] frequency (Weinberger and Dostrovsky, 2011). Park et al. (2011) report the presence of intermittent synchrony between STN neurons and its Local field potentials (LFP), recorded using multiunit activity electrodes from PD patients undergoing Deep Brain Stimulation (DBS) surgery (Park et al., 2011) which is absent in healthy controls.

One of the key objectives of the current study is to use a 2D spiking neuron model to understand and correlate STN-GPe's synchrony levels to exploration. As the second objective, we apply the above-mentioned model to the n-armed bandit problem of Daw et al. (2006) and Bourdaud et al. (2008) with the specific aim of studying the contributions of STN-GPe dynamics to exploration. The proposed model shares some aspects of classical RL-based approach to BG modeling. For example, dopamine signal is compared to reward prediction error (Schultz, 1998). Furthermore, DA is allowed to control cortico-striatal plasticity [47], modulate the gains of striatal neurons (Kliem et al., 2007; Hadipour-Niktarash et al., 2012) and influence the dynamics of STN-GPe by modulating the connections (Kreiss et al., 1997; Fan et al., 2012).

The paper is organized as follows. Section Methods: Model Details describes the model architecture and equations used in the simulations. Section Results presents the results. Implications of the modeling study are discussed in the final section.

Methods: model details

The model consists of the striatum, STN, GPe, GPi, and SNc (Figure 1). Modeling details of various BG nuclei are described below. All the simulations were coded using MATLAB v2012.

Figure 1

Striatum

Striatal neurons display irregular firing patterns during the wakeful state (Stern et al., 1998; Mahon et al., 2006), which was accounted for by modeling the striatal (both D1 and D2) output as a Poisson process. The presynaptic potentials due to this striatal output [D1R-expressing and D2R-expressing medium spiny neurons (MSNs) (Kreitzer, 2009)] were represented as 2 unconnected pools (50 × 50 each) that give rise to GABAergic current from D1 striatum to GPi (Gerfen et al., 1990; Gurney et al., 2001a; Gerfen and Surmeier, 2011) and from D2 striatum to GPe neurons (Gerfen et al., 1990), respectively.
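As an illustration of the Poisson description above, the following minimal sketch generates a binary spike train at a given mean rate. It is not the authors' MATLAB code: the function name `poisson_spike_train`, the 1 ms time step, and the per-step Bernoulli approximation of a Poisson process are all assumptions made for this example.

```python
import random


def poisson_spike_train(rate_hz, duration_ms, dt_ms=1.0, rng=None):
    """Return a list of 0/1 values, one per time step.

    Assumption: a Poisson process is approximated by an independent
    Bernoulli draw per step with probability rate * dt (valid when the
    product is small).
    """
    rng = rng or random.Random()
    p_spike = rate_hz * dt_ms / 1000.0  # spike probability per time step
    n_steps = int(duration_ms / dt_ms)
    return [1 if rng.random() < p_spike else 0 for _ in range(n_steps)]


if __name__ == "__main__":
    # Hypothetical usage: 1 Hz background versus an 8 Hz stimulus, 1 s each.
    rng = random.Random(0)
    background = poisson_spike_train(1.0, 1000.0, rng=rng)
    stimulus = poisson_spike_train(8.0, 1000.0, rng=rng)
    print(sum(background), sum(stimulus))
```

One such train could drive every neuron of a pool when fully correlated input is wanted, or a separate train could be drawn per neuron for uncorrelated background activity.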

Izhikevich neuron model

Izhikevich spiking neuron models have the advantage of being computationally inexpensive compared to biophysical models, yet they are capable of capturing various neuronal properties such as firing rate and spike pattern (Izhikevich, 2003), which are absent in rate-coded models. The key modules in the BG circuit, including GPe, STN, and GPi (Figure 1), were modeled using Izhikevich neuron models arranged in a 2D lattice (= 50 × 50) consisting of 2500 neurons each. The Izhikevich parameters (a, b, c, d) for the STN neuron were adapted from Michmizos and Nikita (2011); GPe and GPi neurons were modeled as tonically spiking Izhikevich neurons (Izhikevich, 2003). The external current (I^x) was adjusted to match the published firing frequencies of these neuronal types (Modolo et al., 2007). The values of the Izhikevich parameters are given in Table 1.

where v^x_ij = membrane potential, u^x_ij = membrane recovery variable, I^Syn_ij = total synaptic current received, I^x_ij = external current applied to neuron x at location (i, j), and v_peak = maximum voltage set for the neuron (+30 mV), with x being an STN, GPe, or GPi neuron.
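To make the variables above concrete, here is a minimal single-neuron sketch. It assumes the standard form published by Izhikevich (2003): dv/dt = 0.04v² + 5v + 140 − u + I, du/dt = a(bv − u), with the reset v ← c, u ← u + d when v reaches v_peak. The forward-Euler integration, the 0.1 ms step, and the function name `simulate_izhikevich` are choices made for this example only, and the synaptic current is omitted.

```python
def simulate_izhikevich(a, b, c, d, i_ext, t_ms=1000.0, dt=0.1, v_peak=30.0):
    """Integrate one isolated Izhikevich neuron; return spike times in ms.

    Assumption: standard Izhikevich (2003) equations, forward Euler,
    no synaptic input (I^Syn = 0).
    """
    v = c          # start at the reset potential
    u = b * v      # recovery variable started at its nullcline value
    spike_times = []
    for step in range(int(t_ms / dt)):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + i_ext
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
        if v >= v_peak:            # spike: record it and reset
            spike_times.append(step * dt)
            v = c
            u += d
    return spike_times


if __name__ == "__main__":
    # Parameter values taken from Table 1.
    stn = simulate_izhikevich(0.005, 0.265, -65.0, 1.5, i_ext=30.0)
    gpe = simulate_izhikevich(0.1, 0.2, -65.0, 2.0, i_ext=10.0)
    print(len(stn), "STN spikes/s;", len(gpe), "GPe spikes/s")
```

Sweeping `i_ext` in such a sketch is one way to check that the firing rate rises with input current before the neurons are wired into a lattice.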

Table 1

Parameter	Value	Description
a, b, c, d (STN)	a = 0.005, b = 0.265, c = −65, d = 1.5	Izhikevich parameters of STN neurons
a, b, c, d (GPe)	a = 0.1, b = 0.2, c = −65, d = 2	Izhikevich parameters of GPe neurons
a, b, c, d (GPi)	a = 0.1, b = 0.2, c = −65, d = 2	Izhikevich parameters of GPi neurons
I_STN	30	External current (I) applied to STN neurons
I_GPe	10	External current (I) applied to GPe neurons
I_GPi	10	External current (I) applied to GPi neurons
W_StrD1→GPi	0.8	Synaptic weight between D1 striatum and GPi
W_StrD2→GPe	1	Synaptic weight between D2 striatum and GPe
DA	0.1–0.9 in increments of 0.1	Dopamine level
A_D1	10	Amplitude of GABAergic current from D1 striatum to GPi neurons due to DA
A_D2	7.5	Amplitude of GABAergic current from D2 striatum to GPe neurons due to DA
λ^Str	7.5	Slope of the gain functions (cD1 and cD2)
Mg²⁺	1 nM	Concentration of magnesium ions
E_AMPA	0 mV	Synaptic potential of AMPA receptor-associated channel
E_NMDA	0 mV	Synaptic potential of NMDA receptor-associated channel
E_GABA	−60 mV	Synaptic potential of GABA receptor-associated channel
w_sg	1	Synaptic weight for excitatory STN to GPe projection
w_gs	20	Synaptic weight for inhibitory GPe to STN projection
cd2	0.1	Parameter that affects the STN→GPe (w_sg) and GPe→STN (w_gs) weights
τ_AMPA	6 ms	Time decay constant for AMPA receptor
τ_NMDA	160 ms	Time decay constant for NMDA receptor
τ_GABA	4 ms	Time decay constant for GABA receptor
τ_{NMDA_GPi}	67 ms	Time decay constant for NMDA receptor of GPi neurons
r_s	1	Radius of STN laterals Gaussian
r_g	0.5	Radius of GPe laterals Gaussian
cD21	0.1	Parameter that affects the radius of STN and GPe laterals
A_GPe	1	Synaptic strength within GPe laterals
A_STN	0.2	Synaptic strength within STN laterals
nlat_STN	5	Number of lateral connections considered in STN neurons
nlat_GPe	11	Number of lateral connections considered in GPe neurons
w_STN→GPi	1.15	Synaptic weight between STN and GPi

Gives the values and the description of the parameters used in the model and simulation.

STN, Sub Thalamic Nucleus; GPe, Globus Pallidus Externa; GPi, Globus Pallidus Interna.

Synaptic connections

The synaptic connectivity between the nuclei was considered as one to one as in Dovzhenok and Rubchinsky (2012) and was modeled as (similar to Humphries et al., 2009)

The effect of the voltage-dependent magnesium channel on the NMDA current was modeled following Jahr and Stevens (1990). In these equations, τ_Recep = decay constant for the synaptic receptor, E_Recep = receptor-associated synaptic potential (Recep = AMPA/GABA/NMDA), S^x_ij = spiking activity of neuron “x” at time “t,” h^x→y_ij = gating variable for the synaptic current from “x” to “y,” W_x→y = synaptic weight from neuron “x” to “y,” Mg²⁺ = magnesium ion concentration, and V^y_ij = membrane potential of neuron “y” at location (i, j). The time constants of GABA, AMPA, and NMDA in STN and GPe were chosen from Götz et al. (1997) and are given in Table 1. All the synaptic connections with their respective variables are described in Table 2 and the values of the parameters are given in Table 1.
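The quantities defined above can be tied together in a short sketch. It assumes a first-order gating variable driven by presynaptic spikes, τ dh/dt = −h + S(t), a conductance-like current I = W · h · (E − V), and the commonly used Jahr and Stevens (1990) magnesium factor B(V) = 1 / (1 + (Mg²⁺/3.57) · exp(−0.062 V)). These forms and all function names are assumptions for illustration; the exact equations of the model should be taken from the original article.

```python
import math


def mg_block(v_mv, mg=1.0):
    """Voltage-dependent Mg2+ factor (assumed Jahr and Stevens form)."""
    return 1.0 / (1.0 + (mg / 3.57) * math.exp(-0.062 * v_mv))


def step_gating(h, spike, tau_ms, dt_ms):
    """One Euler step of tau * dh/dt = -h + S, with S = 0 or 1."""
    return h + dt_ms * (-h + spike) / tau_ms


def synaptic_current(h, v_post_mv, weight, e_rev_mv, nmda=False, mg=1.0):
    """Current onto the postsynaptic neuron; NMDA is scaled by the Mg2+ block."""
    current = weight * h * (e_rev_mv - v_post_mv)
    return mg_block(v_post_mv, mg) * current if nmda else current


if __name__ == "__main__":
    # Hypothetical usage: one presynaptic spike, then decay of an AMPA synapse.
    h = 0.0
    for t in range(20):
        h = step_gating(h, spike=1 if t == 0 else 0, tau_ms=6.0, dt_ms=1.0)
    print(synaptic_current(h, v_post_mv=-65.0, weight=1.0, e_rev_mv=0.0))
```

With E_GABA = −60 mV the same function yields an outward (inhibitory) current whenever the postsynaptic potential is above −60 mV, which is why a single helper can serve both excitatory and inhibitory projections.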

Table 2

Variable	Description
h^StrD1x→GPi_ij	Gating variables for GPi neuron due to GABAergic projections from D1 striatum. “x” represents the input #. For example, if there are 2 inputs presented to the model, x = 1, 2
h^StrD2x→GPe_ij	Gating variable for GPe neuron due to GABAergic projections from D2 striatum
h^AMPA→GPe_ij/h^NMDA→GPe_ij	Gating variable for GPe neuron due to glutamatergic input from STN due to either NMDA or AMPA receptor
h^GABA→STN_ij	Gating variable for STN neuron due to GABAergic input from GPe neuron
h^AMPA→STN_ij/h^NMDA→STN_ij	Gating variable for STN neuron due to glutamatergic input from its collaterals due to either NMDA or AMPA receptor
h^GABA→GPe_ij	Gating variable for GPe neuron due to GABAergic input from its collaterals.
I^StrD1→GPi_ij	Inhibitory GABAergic current to GPi neuron from D1 striatum
I^StrD2→GPe_ij	Inhibitory GABAergic current to GPe neuron from D2 striatum
I^AMPA→GPe_ij/I^NMDA→GPe_ij	Excitatory glutamatergic current (AMPA/NMDA) from STN neuron to GPe neuron
I^GABA→STN_ij	Inhibitory GABAergic current from GPe neuron to STN neuron
I^AMPAlat_ij/I^NMDAlat_ij	Excitatory glutamatergic current (AMPA/NMDA) from STN neuron to STN neuron due to collateral synapses
I^GABAlat_ij	Inhibitory GABAergic current from GPe neuron to GPe neuron due to collateral synapses
S^GPe_ij	Spiking activity of GPe neuron at location (i, j) at time “t”
S^STN_ij	Spiking activity of STN neuron at location (i, j) at time “t”
S^Dxy_ij	Spiking activity of striatum at location (i, j) at time “t.” The variable “y” represents either D1 striatum (=1) or D2 striatum (=2) for an input stimulus “x”

Gives a description of all the synaptic variables of various synaptic currents modeled using Equations (4) and (8) in Section Synaptic Connections.

STN, Sub Thalamic Nucleus; GPe, Globus Pallidus Externa; GPi, Globus Pallidus Interna.

Lateral connections in STN and GPe neurons

Various anatomical studies show the presence of collaterals in STN (Kita et al., 1983) and GPe (Kita and Kita, 1994) neurons. Gillies et al. (2002) show, using a computational model, how various neural firing patterns could emerge due to collaterals in STN (Gillies and Willshaw, 1998). The lateral connections in the current network were modeled as Gaussian neighborhoods. Each neuron (in STN/GPe) receives collateral synaptic input from a fixed number of neighboring neurons located in a 2D grid of size n × n.

Effect of DA on synaptic structural plasticity of STN and GPe neurons

Behavioral learning can lead to synaptic structural changes either in dendrites or in signaling pathways (Caroni et al., 2012). Axonal and dendritic spine elongation and reduction in various areas of brain such as neo-cortex and hippocampus is well known (Richards et al., 2005; Gogolla et al., 2007; Caroni et al., 2012). Interestingly, this structural plasticity has also been observed in dorsal and ventral striatum of BG due to DA depletion (Meredith et al., 1995). Structurally, an increase in synaptic strength could be due to increase in the number of contacts or number of dendritic spines (Mckinney, 2010) which is associated with an increase in NMDA current (Tian et al., 2007). The burst firing in STN neurons observed under PD conditions is hypothesized to be due to increased NMDA currents (Zhu et al., 2005; Shen and Johnson, 2010). Also, Robertson et al. (1990) show a reduction in GABA-A receptor expression levels in GPe neurons of MPTP primates (Robertson et al., 1990), an area that receives projections from SNc (Smith and Kieval, 2000). A decrease in GABA-A levels has also been shown to be correlated to decrease in the number of dendritic spines in neurons (Pallotto and Deprez, 2014). A study by Fan et al. (2012) showed a greater proliferation of synapses from GPe to STN neuron in 6-OHDA rats compared to controls (Fan et al., 2012).

Considering the above-mentioned experimental results, one may expect dopamine-dependent plasticity in STN and GPe neurons. Experimental studies have shown that the synaptic currents from collaterals (inhibitory or excitatory) follow a Gaussian distribution (Lukasiewicz and Werblin, 1990). It has been observed during experimental recordings that low DA levels increase the synchrony levels within STN neurons (Bergman et al., 1994, 1998). Theoretically, such behavior can be observed in any excitatory neurons when their lateral connections are strengthened (Hansel et al., 1995). Moreover, GPe neurons also show synchrony (Bergman et al., 1998) at low DA conditions, and such a behavior in inhibitory neurons can be observed when their collateral strength is decreased (Wang and Rinzel, 1993). Taking these theoretical and experimental results into account, we assume the width of the collateral spread to be modulated by DA levels.

Accordingly, the width of Gaussians in STN and GPe laterals (Section Lateral connections in STN and GPe neurons) in the model was assumed to be modulated by DA and modeled as, where r_s = constant variance of STN Gaussian, r_g = constant variance of GPe Gaussian, R_s = changed variance of STN Gaussian due to the effect of dopamine, R_g = changed variance of GPe Gaussian due to the effect of dopamine, cD21 = a constant that determines the effect of DA on STN and GPe laterals, w^m→m_ij = lateral weight matrix of neuron “m” at location (i,j), d_ij,pq = distance from center neuron (p,q), R_m = R_g(or) R_s, A_m = strength of lateral synapse, m = STN or GPe neuron. All parameter values are given in Table 1.
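A Gaussian lateral neighborhood of the kind described here can be sketched as follows. The sketch assumes the weight falls off as A_m · exp(−d² / R_m²) with distance d from the center neuron, and it takes the DA-modulated width R_m as a plain argument instead of reproducing its dependence on DA and cD21. The function name and the inclusion of the center neuron itself are assumptions of this example.

```python
import math


def lateral_weights(n_lat, radius, amplitude):
    """Return an n_lat x n_lat grid of lateral weights around a center neuron.

    Assumption: w = amplitude * exp(-d^2 / radius^2), where d is the
    Euclidean lattice distance to the center; the center entry is kept.
    """
    half = n_lat // 2
    grid = []
    for p in range(-half, half + 1):
        row = []
        for q in range(-half, half + 1):
            d_squared = p * p + q * q
            row.append(amplitude * math.exp(-d_squared / radius ** 2))
        grid.append(row)
    return grid


if __name__ == "__main__":
    # Table 1 values for STN: 5 x 5 neighborhood, r_s = 1, A_STN = 0.2.
    for row in lateral_weights(5, 1.0, 0.2):
        print(" ".join(f"{w:.4f}" for w in row))
```

Widening `radius` spreads appreciable weight over more neighbors, which is the handle through which dopamine is assumed to act on the laterals in the text above.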

The inhibitory (excitatory) collateral currents developed in GPe (STN) neurons are governed by Equation (8). The effect of “B_ij” is valid only for NMDA synapses made by STN collaterals but not for GABAergic synapses.

Experimental data suggests that DA causes post-synaptic effects on glutamatergic and GABA currents in STN and GPe respectively (Smith and Kieval, 2000; Cragg et al., 2004; Fan et al., 2012). Thus, we included a factor (cd2), which regulates the effect of the DA, on synaptic currents between STN and GPe, as in Equation (9). This leads to a decrease in the regulated current with increase in DA, as observed in Kreiss et al. (1997).

where the synapses are GPe→STN and STN→GPe. A similar method for DA-dependent synaptic modulation on striatal neurons was used in Humphries et al. (2009).

Total synaptic currents received by each neuron

Total synaptic currents received by GPe neurons.

The total synaptic current received by a GPe neuron at lattice position (i, j) is the summation of the GABAergic input from the D2-expressing striatal MSNs (Gerfen et al., 1990) [Equation (5)], the glutamatergic current from STN considering both AMPA and NMDA currents (Götz et al., 1997) [Equation (5)], and the inhibitory lateral current from other GPe neurons [Equation (8)]. The influence of DA on the GABAergic current from D2 striatum to GPe neurons (Hadipour-Niktarash et al., 2012) was accounted for by the variable cD2.

where, I^GABAlat_ij = the inhibitory lateral GABAergic current from other GPe neurons, I^NMDA→GPe_ij = excitatory glutamatergic current from STN neuron due to NMDA receptor, I^AMPA→GPe_ij = excitatory glutamatergic current from STN neuron due to AMPA receptor, I^StrD2→GPe_ij = inhibitory GABAergic current from D2 striatum, cD2 = Gain parameter that affects the GABAergic D2 striatal current.

Total synaptic currents received by STN neurons.

The total synaptic current received by an STN neuron at lattice position (i, j) is the summation of the GABAergic current from GPe neurons (Fan et al., 2012) [Equation (5)] and the glutamatergic input (both AMPA and NMDA) from other STN neurons (Kita et al., 1983) [Equation (8)].

Where, I^NMDAlat_ij = excitatory glutamatergic current from collateral STN neurons due to NMDA receptor, I^AMPAlat_ij = excitatory glutamatergic current from collateral STN neurons due to AMPA receptor, I^GABA→STN_ij = inhibitory GABAergic current from GPe neuron.

Total synaptic currents received by GPi neurons.

The total synaptic current received by a GPi neuron at lattice position (i, j) is the summation of GABAergic currents from D1 striatal neurons (Gerfen et al., 1990) and glutamatergic (both AMPA and NMDA) input from STN neurons (Gerfen and Surmeier, 2011). The increase in GABAergic current from D1 striatum to GPi neurons due to DA modulation (Kliem et al., 2007) was taken into account by the variable cD1.

Where, I^NMDA→GPi_ij = excitatory glutamatergic current from STN neuron due to NMDA receptor, I^AMPA→GPi_ij = excitatory glutamatergic current from STN neuron due to AMPA receptor, I^StrD1→GPi_ij = inhibitory GABAergic current from D1 striatum, cD1 = Gain parameter that affects the GABAergic striatal current.

Synchronization

The phenomenon of neural synchrony has attracted the attention of many computational and experimental neuroscientists in recent decades (Pinsky and Rinzel, 1995; Plenz and Kitai, 1999; Hauptmann and Tass, 2007; Kumar et al., 2011; Park et al., 2011). It is believed that partial synchrony helps in the generation of various EEG rhythms such as alpha and beta (Izhikevich, 2007). Studying synchrony in neural networks has been gaining importance due to its presence in normal functioning (coordinated movement of the limbs) and in pathological states (e.g., synchronized activity of CA3 neurons in the hippocampus during an epileptic seizure) (Pinsky and Rinzel, 1995). Plenz and Kitai (1999) proposed that STN-GPe might act as a pacemaker (Plenz and Kitai, 1999), a source for generating oscillations in pathological conditions such as Parkinson's disease. Park et al. (2011) report the presence of intermittent synchrony between STN neurons and their local field potentials (LFP), recorded using multiunit activity electrodes from PD patients undergoing DBS surgery (Park et al., 2011). They also calculated the duration of synchronized and desynchronized events in neuronal activity by estimating transition rates, which were obtained with the help of first return maps plotted using the phase of neurons (Park et al., 2010, 2011). To observe how dopamine changes synchrony in STN-GPe, we calculated the phases of individual neurons as defined in Pinsky and Rinzel (1995).

The phase of the jth neuron was calculated from its spike times. Here, t_j,k and t_j,k+1 are the onset times of the kth and (k + 1)th spikes of the jth neuron, T_j,k ∈ [t_j,k, t_j,k+1], ∅_j(t) = phase of the jth neuron at time “t,” R^sync is the synchronization measure (0 ≤ R^sync ≤ 1), θ = average phase of the neurons, and N = total number of neurons in the network.
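A phase-based synchrony measure of this kind can be sketched as below. The sketch assumes that the phase grows linearly from 0 to 2π between consecutive spikes, ∅_j(t) = 2π (t − t_j,k) / (t_j,k+1 − t_j,k), and that R^sync is the magnitude of the population average of exp(i ∅_j). Neurons without a spike on both sides of t are skipped, which is a choice made for this example and not something stated in the text.

```python
import cmath
import math
from bisect import bisect_right


def phase(spike_times, t):
    """Phase in [0, 2*pi) of one neuron at time t, or None if undefined."""
    k = bisect_right(spike_times, t) - 1      # index of the last spike <= t
    if k < 0 or k + 1 >= len(spike_times):    # t not bracketed by two spikes
        return None
    t_prev, t_next = spike_times[k], spike_times[k + 1]
    return 2.0 * math.pi * (t - t_prev) / (t_next - t_prev)


def synchrony(all_spike_times, t):
    """Order parameter between 0 (desynchronized) and 1 (fully in phase)."""
    phases = [phase(s, t) for s in all_spike_times]
    phases = [p for p in phases if p is not None]
    if not phases:
        return 0.0
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


if __name__ == "__main__":
    in_phase = [[0.0, 10.0, 20.0], [0.0, 10.0, 20.0]]
    anti_phase = [[0.0, 10.0, 20.0], [5.0, 15.0, 25.0]]
    print(synchrony(in_phase, 12.0), synchrony(anti_phase, 12.0))
```

Averaging the value of `synchrony` over time would give a single number per dopamine level that can be compared across conditions.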

Action selection using the race model

Action selection is modulated by BG output nucleus GPi which projects back to the cortex via the thalamus. We have used the race model (Vickers, 1970) for the final action selection where an action is selected when temporally integrated neuronal activity of the output neurons crosses a threshold (Frank, 2006; Frank et al., 2007; Humphries et al., 2012).

The dynamics of the thalamic neurons is as follows, where, z_k (t) = integrating variable for kth stimulus, f_GPik (t) = normalized and reversed average firing frequency of GPi neurons receiving kth stimulus from striatum, f^maxGPi = highest firing rate among the GPi neurons, S^Gpik_ij = neuronal spikes of GPi neurons receiving kth stimulus, N = number of neurons in a single row/column of GPi array (=50), T = duration of simulation.

The first neuron (z_k) among k stimuli to cross the threshold (=0.15) represents the action selected. All the variables representing neuron activity are reset immediately after each action selection.
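The selection step can be illustrated with a small accumulator race. The sketch below assumes a leaky integrator, τ dz_k/dt = −z_k + f_GPik, driven by the “normalized and reversed” GPi rate, taken here as f_GPik = 1 − rate_k / f^maxGPi. The leak, the time constant, and the function name `race_model` are assumptions of this example; only the threshold of 0.15 comes from the text.

```python
def race_model(gpi_rates, threshold=0.15, dt_ms=1.0, tau_ms=10.0):
    """Return the index of the first channel to cross threshold, else None.

    gpi_rates: list over time steps; each entry lists the average GPi
    firing rate of every channel at that step.
    Assumption: leaky integration of 1 - rate / max_rate per channel.
    """
    f_max = max(max(step) for step in gpi_rates)
    if f_max <= 0.0:                 # guard against division by zero
        return None
    n_channels = len(gpi_rates[0])
    z = [0.0] * n_channels
    for step in gpi_rates:
        for k in range(n_channels):
            drive = 1.0 - step[k] / f_max   # low GPi rate gives strong drive
            z[k] += dt_ms * (-z[k] + drive) / tau_ms
        winner = max(range(n_channels), key=lambda k: z[k])
        if z[winner] >= threshold:
            return winner
    return None                      # no channel crossed: "No Go"


if __name__ == "__main__":
    # Hypothetical rates: channel 1 is strongly suppressed and wins the race.
    print(race_model([[40.0, 10.0]] * 200))
```

Because GPi inhibits the thalamus, the channel whose GPi neurons fire least accumulates fastest, and returning `None` corresponds to the case in which no action is selected.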

Binary action selection task

The first task we simulated was the simple binary action selection similar to Humphries et al. (2006), where two competing stimuli were presented to the model (Humphries et al., 2006). The input firing frequency is thought to represent “saliency,” with higher frequencies representing higher salience (Humphries et al., 2006). The response of striatal output to cortical input falls in the range of a few tens of Hz (Sharott et al., 2012). Therefore the frequencies that represent the 2 actions were assumed to be around 4 Hz (stimulus #1) and 8 Hz (stimulus #2). Spontaneous output firing rate of the striatal neurons (without input) is assumed to be around 1 Hz (Plenz and Kitai, 1998; Sharott et al., 2012). Selection of higher salient stimulus among the available choices could be considered as “exploitation” while selecting the less salient one as “exploration” (Sutton and Barto, 1998). So the action selected is defined as “Go” if stimulus #2 (more salient) is selected, “explore” if stimulus #1(less salient) is selected and “No Go” if none of them is selected.

The inputs were given spatially such that the neurons in the upper half of the lattice receive stimulus #1 and those in the lower half receive stimulus #2 (Figure 2). The striatal outputs from D1 and D2 neurons of the striatum are given as input to GPi and GPe modules respectively with the projection pattern as shown in Figure 2. Poisson spike trains corresponding to Stimulus #1 were presented as input to neurons (1–1250) and were fully correlated among themselves. Similarly, Poisson spike trains corresponding to Stimulus #2 were presented as input to neurons (1251–2500) and were fully correlated among themselves. Stimulus #1 and #2 are presented for an interval of 100 ms between 100 and 200 ms; at other times uncorrelated spike trains at 1 Hz are presented to all the striatal neurons. The values of the parameters (including synaptic weights) used to implement the binary action selection problem are given in Table 1.

Figure 2

The N-armed bandit task

We now describe the 4-arm bandit task (Daw et al., 2006; Bourdaud et al., 2008) used to study exploratory and exploitatory behavior. In this experimental task, subjects were presented with 4 arms, one of which is to be selected in every trial, for a total of 300 trials. The reward/payoff for each of these slots was obtained from a Gaussian distribution whose mean changes from trial to trial, with payoff ranging from 0 to 100. The payoff, r_i,k, associated with the ith machine at the kth trial was drawn from a Gaussian distribution of mean μ_i,k and standard deviation (SD) σ₀. The payoff was rounded to the nearest integer, in the range [0, 100]. At each trial, the mean is diffused according to a decaying Gaussian random walk. A trial was defined as “exploitatory” if the arm giving the highest reward was selected, and as “exploratory” otherwise.

The payoffs generated by the slot machines are computed as follows, where, μ_i,k is the mean of the Gaussian distribution with standard deviation(σ₀) for ith machine during kth trial. λ_m and θ_m control the random walk of mean (μ_i,k) and e ~ N(0, σ²_d) is obtained from Gaussian distribution of mean 0 and standard deviation σ_d. r_i,k and r'_i,k are the payoffs before and after rounding to nearest integer respectively. The initial value of mean payoff, μ_i,0, is set to a value of 50. All the values for the parameters λ_m, θ_m, σ_d, σ₀ were adapted from (Bourdaud et al., 2008).
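The payoff process can be sketched as follows, assuming the usual form of a decaying Gaussian random walk for this task: μ_i,k+1 = λ_m μ_i,k + (1 − λ_m) θ_m + e, with the payoff drawn as N(μ_i,k, σ₀²), rounded, and clipped to [0, 100]. The default numbers for λ_m, θ_m, σ_d, and σ₀ below are placeholder assumptions, since the text only states that the values were adapted from Bourdaud et al. (2008); the function name is hypothetical.

```python
import random


def simulate_payoffs(n_arms=4, n_trials=300, lam=0.9836, theta=50.0,
                     sigma_d=2.8, sigma_0=4.0, mu_0=50.0, seed=0):
    """Return payoffs[k][i]: integer payoff of arm i on trial k.

    Assumption: parameter defaults are illustrative placeholders.
    """
    rng = random.Random(seed)
    mu = [mu_0] * n_arms                       # initial mean payoff of 50
    payoffs = []
    for _ in range(n_trials):
        trial = []
        for i in range(n_arms):
            raw = rng.gauss(mu[i], sigma_0)            # payoff before rounding
            trial.append(min(100, max(0, round(raw))))  # round, then clip
            # decaying random walk of the mean toward theta
            mu[i] = lam * mu[i] + (1.0 - lam) * theta + rng.gauss(0.0, sigma_d)
        payoffs.append(trial)
    return payoffs


if __name__ == "__main__":
    table = simulate_payoffs()
    print(table[0], table[-1])
```

Fixing the seed makes the payoff table reproducible, so the same sequence can be presented to models run with different parameter settings.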

To make an optimal decision, the subjects need to keep track of rewards associated with each of the 4 arms. The subject's decision to either explore or exploit would depend on this internal representation, which would closely resemble the actual payoff that is being obtained. It is quite difficult to identify whether the subject made an exploratory decision or an exploitative one just by observing the EEG and selected slot data. A subject-specific model is required to classify their decisions and identify the strategy (Daw et al., 2006; Bourdaud et al., 2008). Keeping this in mind, Bourdaud et al. (2008) used a “behavioral model” that uses the soft-max principle of RL to fit the selection pattern of human subjects. The parameter “β” of the behavioral model was adjusted such that the final selection pattern matches that of individual subjects in the experiment (refer to Appendix A and Table A.1 in Supplementary Material for details). The parameter “β,” which controls the exploration level in the behavioral model, is tuned to match the % exploitation obtained for each of the 8 subjects (1 subject's data was discarded because of artifacts); 2 out of the 8 subjects had similar exploration levels. Hence, a total of 6 subjects' data was taken into account to check the performance of the proposed spiking BG model.
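The soft-max principle mentioned above can be illustrated with a short sketch in which the probability of choosing arm i is exp(β V_i) / Σ_j exp(β V_j). This is the generic soft-max rule, not a reproduction of the behavioral model in Appendix A; the function names and the example values are assumptions.

```python
import math
import random


def softmax_probs(values, beta):
    """Choice probabilities for each arm; larger beta means less exploration."""
    top = max(values)                     # subtract the max for numerical stability
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def choose_arm(values, beta, rng):
    """Sample one arm index according to the soft-max probabilities."""
    draw = rng.random()
    cumulative = 0.0
    probs = softmax_probs(values, beta)
    for index, p in enumerate(probs):
        cumulative += p
        if draw < cumulative:
            return index
    return len(probs) - 1                 # fallback for rounding at the upper edge


if __name__ == "__main__":
    estimates = [50.0, 55.0, 60.0, 45.0]  # hypothetical value estimates
    rng = random.Random(0)
    picks = [choose_arm(estimates, beta=0.2, rng=rng) for _ in range(300)]
    print(sum(1 for p in picks if p == 2) / len(picks))  # fraction of exploitation
```

The fraction printed at the end plays the role of the % exploitation that β is tuned to match for each subject.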

Strategy for slot machine selection

To simulate the experiment, we utilized the concepts of RL and combined them with the dynamics of the BG model to select an optimally rewarding slot in each trial. Experimental data show that BG receives reward-related information in the form of dopaminergic input to the striatum (Niv, 2009; Chakravarthy et al., 2010). Cortico-striatal plasticity changes due to dopamine (Reynolds and Wickens, 2002) were incorporated in the model by allowing DA signals to modulate the Hebb-like plasticity of cortico-striatal synapses (Surmeier et al., 2007).

The architecture of the proposed network model is depicted in Figure 3. The output of the striatum (both D1 and D2 parts) was divided equally into 4 quadrants, each of which receives input from the corresponding stimulus. Each stimulus is associated with 2 weights (w^D1_i,0, w^D2_i,0), initialized with an equal value of 50, which represent the cortico-striatal weights of D1 and D2 MSNs in the striatum. Each cortico-striatal weight represents the saliency (in terms of striatal spike rate) of the corresponding arm. The output spikes generated by the D1 and D2 striatum project to GPi and GPe respectively. The final selection of an arm is made as in Section Action Selection Using the Race Model. The reward r_i,k received for the selected slot was sampled from a Gaussian distribution with mean μ_i,k and SD σ₀ [Equation (19)].

Figure 3

Using the obtained reward (r_i,k), the expected value of the slots, inputs to D1 and D2 striatum are updated using the following equations:

The expected value (V_k) for kth trial is calculated as

The received payoff (Re_k) for kth trial is calculated as

The error (δ) for kth trial is defined as the difference between the received payoff Re_k and the expected value V_k, where w^D1_i,k are the cortico-striatal weights of D1 striatum for ith machine in kth trial, w^D2_i,k are the cortico-striatal weights of D2 striatum for ith machine in kth trial, r_i,k is the reward obtained for the selected ith machine in kth trial, x^inp_i,k is the binary input vector representing the 4 slot machines (e.g., if the first slot machine is selected, x^inp_i,k = [1 0 0 0]), η (=0.3) is the learning rate of D1 and D2 striatal MSNs, Re_k is the received payoff for the selected slot in kth trial, and V_k is the expected value for the selected slot in kth trial.
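The displayed equations that the preceding paragraphs refer to are absent from this copy of the text. One reconstruction that is consistent with the symbol definitions above and with the equation numbers cited in the surrounding paragraphs is sketched below. The opposite signs of the two weight updates and the summation form of V_k and Re_k are assumptions, not a verified transcription of the published equations.

```latex
\begin{align}
\Delta w^{D1}_{i,k+1} &= \eta\,\delta_k\,x^{inp}_{i,k} \tag{22}\\
\Delta w^{D2}_{i,k+1} &= -\eta\,\delta_k\,x^{inp}_{i,k} \tag{23}\\
V_k &= \sum_{i=1}^{4} w^{D1}_{i,k}\,x^{inp}_{i,k} \tag{24}\\
Re_k &= \sum_{i=1}^{4} r_{i,k}\,x^{inp}_{i,k} \tag{25}\\
\delta_k &= Re_k - V_k \tag{26}
\end{align}
```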

The cortico-striatal weights are updated [Equations (22) and (23)] using the error term “δ” [Equation (26)]. The reward-related information in the form of dopaminergic input to striatum has been correlated to the error (δ) (Niv, 2009; Chakravarthy et al., 2010). The δ calculated from Equation (26) takes both positive and negative values with no upper or lower bound, whereas the working DA range in the model was limited to small positive values (0.1–0.9). Hence, δ is mapped to DA through a sigmoid function, where DA is the dopamine signal within the range 0.1–0.9, λ (=0.2) is the slope of the sigmoid, δ_k is the error obtained for kth trial [Equation (26)], and sig() is the sigmoid function.

To verify whether a rewarding slot is selected, delta (δ_k) as described in Equation (26) was calculated which keeps track of expected and actual payoff.
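The per-trial bookkeeping described in this subsection can be illustrated with a minimal Python sketch. The learning rate (0.3), sigmoid slope (0.2) and initial weight (50) are taken from the text. Everything else is an assumption made for illustration: the function names are hypothetical, the D1 and D2 updates are given opposite signs, and the mapping onto the 0.1–0.9 DA range uses a logistic function followed by an affine rescaling, which may differ from the model's own sigmoid.

```python
import math

ETA = 0.3      # learning rate of D1/D2 striatal MSNs (from the text)
LAMBDA = 0.2   # slope of the sigmoid (from the text)
W_INIT = 50.0  # initial cortico-striatal weight (from the text)


def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def delta_to_da(delta, lam=LAMBDA, da_min=0.1, da_max=0.9):
    """Map an unbounded error onto the DA range 0.1-0.9.

    ASSUMPTION: logistic sigmoid plus affine rescaling; the exact
    mapping used in the model is not reproduced in this section.
    """
    return da_min + (da_max - da_min) * logistic(lam * delta)


def update_trial(w_d1, w_d2, selected, reward, eta=ETA):
    """Update the weights in place for one trial; return (delta, DA)."""
    x_inp = [1.0 if i == selected else 0.0 for i in range(len(w_d1))]
    expected = sum(w * x for w, x in zip(w_d1, x_inp))  # V_k
    received = reward                                   # Re_k of the chosen arm
    delta = received - expected                         # delta_k
    for i, x in enumerate(x_inp):
        w_d1[i] += eta * delta * x  # ASSUMED sign: D1 weight follows delta
        w_d2[i] -= eta * delta * x  # ASSUMED sign: D2 weight opposes delta
    return delta, delta_to_da(delta)
```

Under these assumptions, a reward of 60 on an arm whose D1 weight is still 50 gives δ = 10 and moves only that arm's weights, while a strongly negative δ pushes DA toward the lower end of the range.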

Results

We first investigated whether the chosen Izhikevich parameters for STN, GPe and GPi reproduced the biological properties of the corresponding neurons (Figure 4). The distinctive property of inhibitory rebound in STN (Hamani et al., 2004) was observed in simulation and was absent in GPe and GPi neurons. The firing rate of STN, GPe and GPi neurons increased with increasing current input (I^x_ij), as observed in experimental recordings (Magnin et al., 2000; Thibeault and Srinivasa, 2013).

Figure 4

We then present results from 3 sets of simulation studies, starting with the characterization of the dynamics of the STN-GPe network (Simulation set 1). A key idea explored in this study is that the dynamics of STN-GPe critically influence action selection by BG, particularly the component of exploration. Therefore, we characterize STN-GPe network dynamics in terms of firing frequency and the synchronization measure, R^sync. Following that, we present results from the simple binary action selection task (Simulation set 2), where the presence of 3 regimes (Go/Explore/No-Go) in action selection is demonstrated, revealing the interplay of IP and DP in action selection. Then we present results from the n-armed bandit task (Simulation set 3). By changing the IP weight, the amount of exploration in the BG model could be made comparable to that obtained from experimental data.

Simulation set 1: STN-GPe circuit dynamics and synchrony

Pathological oscillations of STN and GP have been associated with various PD symptoms (Bergman et al., 1994; Brown et al., 2001; Levy et al., 2002; Brown, 2003; Litvak et al., 2011; Park et al., 2011; Dovzhenok and Rubchinsky, 2012). Correlated neural firing patterns in STN and GPi can be seen both in experimental conditions of dopamine depletion and in Parkinsonian conditions (Bergman et al., 1994; Nini et al., 1995; Magnin et al., 2000; Brown et al., 2001). Using a conductance-based model of the STN and GPe system, Terman et al. (2002) demonstrated a variety of rhythmic behaviors by varying the connectivity patterns between STN and GPe. In the present model we assume that the connections within and between STN and GPe are dopamine-dependent (Cragg et al., 2004) and show increased synchronized behavior under conditions of reduced dopamine, resembling the dopamine-deficient conditions of Parkinson's disease. No inputs were given to the STN-GPe network; dopamine (DA) was varied as a parameter [Equations (7) and (9)] and the neural dynamics in the two systems were studied.

The firing patterns in both STN and GPe varied from synchronized to desynchronized states as the level of dopamine was increased from 0.1 (low) to 0.9 (high) (Figures 5–7). The synchronization parameter “R^sync”, as described in Section Action Selection Using the Race Model, Equation (16), is calculated within STN (R^sync_STN) neurons, within GPe (R^sync_GPe) neurons and also between STN and GPe (R^sync_STNGPe) neurons (Figure 9). For a low value of DA (0.1), we observed that both R^sync_STN (Figure 5B) and R^sync_GPe (Figure 5C) were high (=1). The value of R^sync_STNGPe (Figure 5E) oscillated between 0 and 1, indicating an alternating pattern of synchrony, which is observable in the raster plots (Figures 5A,D).
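Equation (16) is defined in an earlier section and is not repeated here. As a rough illustration of how a synchrony index of this kind behaves, the sketch below computes a Kuramoto-style order parameter from spike times. The phase definition (linear interpolation between consecutive spikes) and the function names are assumptions made for this example and may differ from the measure used in the model.

```python
import cmath
import math
from bisect import bisect_right


def spike_phase(spike_times, t):
    """Phase in [0, 2*pi) by linear interpolation between two spikes.

    spike_times must be sorted and strictly increasing. Returns None
    when t is not bracketed by two spikes of this neuron.
    """
    k = bisect_right(spike_times, t)
    if k == 0 or k == len(spike_times):
        return None
    t_prev, t_next = spike_times[k - 1], spike_times[k]
    return 2.0 * math.pi * (t - t_prev) / (t_next - t_prev)


def r_sync(spike_trains, t):
    """Magnitude of the mean phase vector across neurons at time t.

    Values near 1 indicate synchronized firing; values near 0 indicate
    phases spread around the cycle.
    """
    phases = [p for p in (spike_phase(s, t) for s in spike_trains)
              if p is not None]
    if not phases:
        return 0.0
    mean_vec = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vec)
```

With this definition, two neurons firing at identical times give a value of 1, whereas two neurons firing in strict alternation give a value close to 0, which is the kind of contrast the raster plots display.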

Figure 5

As DA value was increased to an intermediate level (0.5), a decrease in the value of R^sync_STN (Figure 6B) and R^sync_GPe (Figure 6C) was observed with time. The decrement in the amplitude of oscillatory pattern in R^sync_STNGPe (Figure 6E) indicates the presence of synchronized and desynchronized firing patterns of the neurons. This can be observed in the raster plots of STN and GPe neurons (Figures 6A,D) which show the beginning of desynchronized behavior.

Figure 6

At high DA (0.9), R^sync_STN (Figure 7B) has decreased to an average value of 0.3 and R^sync_GPe (Figure 7C) reached an average value of 0.1. The oscillatory pattern in R^sync_STNGPe (Figure 7E) is completely absent at high DA indicating a desynchronized firing pattern, which can also be seen in the raster plots of STN and GPe neurons (Figures 7A,D).

Figure 7

The average firing rate of the neurons in the network was calculated as the total number of spikes divided by the number of neurons and the simulation time, where k denotes STN/GPe, f = average firing rate of the STN/GPe network for a simulation time of 1 s, S^k_ij(t) = spiking activity of the neuron at (i,j) in the network defined by “k,” N = total number of neurons (50 × 50), and T = simulation time (1 s).
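The natural reading of these definitions is f^k = (1/(N·T)) Σ_{i,j} Σ_t S^k_ij(t), i.e., total spike count normalized by lattice size and duration. A minimal sketch of that computation follows; the nested-list data layout and the function name are assumptions for illustration.

```python
def mean_firing_rate(spikes, sim_time_s):
    """Average firing rate (Hz) of a 2-D lattice of neurons.

    spikes[i][j] is a sequence of 0/1 values, one per time step, for
    the neuron at lattice position (i, j).
    """
    n_neurons = sum(len(row) for row in spikes)
    total_spikes = sum(sum(train) for row in spikes for train in row)
    return total_spikes / (n_neurons * sim_time_s)
```

For example, a lattice in which every neuron emits 40 spikes during a 1 s simulation yields an average rate of 40 Hz regardless of lattice size.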

The firing rate of STN neurons decreased from a range of 45–50 Hz (due to bursting) at low DA (0.1) to 35–40 Hz at high DA (0.9) (Figure 8A). In contrast, the firing rate of GPe neurons increased from about 60–70 Hz at low DA (0.1) to 80–90 Hz at high DA (0.9) (Figure 8B). The simulated firing rates of STN and GPe neurons match reports from electrophysiological studies (Magnin et al., 2000; Benazzouz et al., 2002), in which an increase and a decrease in firing rate were observed for STN and GPe respectively in Parkinsonian conditions, and vice versa in normal conditions.

Figure 8

Under low DA conditions, the contribution of excitatory lateral current is higher in STN, thereby increasing the overall firing rate (Figure 8A) and synchrony (Figure 9A), as is observed in networks of excitatory neurons in general (Hansel et al., 1995); such synchrony was absent at high DA levels. GPe neurons show a synchronized firing pattern with decreased lateral synaptic weights at low DA levels (Figure 9B) (Wang and Rinzel, 1993). In contrast, a high lateral inhibition at high DA prevents neighboring neurons from firing at the same time, thereby producing a desynchronized firing pattern.

Figure 9

The effect of DA on the synchronization of STN and GPe neurons was studied by estimating the values of R^sync_STN, R^sync_GPe and R^sync_STNGPe for increasing values of DA (0.1–0.9). The 3 “R^sync” values decreased in amplitude with an increase in DA level (Figures 9A–C), and the oscillatory activity at low and high DA levels is shown in Figure 9D. Under low DA conditions, GPe activity follows STN activity (Plenz and Kital, 1999), thus forming a pacemaker kind of circuit, which could be the source of STN-GPe oscillations. One of the suspected reasons for bursting activity in STN is the decreased inhibition from GPe neurons (Plenz and Kital, 1999) at low DA levels. This feature is captured by the model since GPe firing rates are smaller for lower DA levels. The STN neurons showed oscillatory activity at a frequency of around 10 Hz at low DA, which was absent at high DA levels (Kang and Lowery, 2013).

Simulation set 2: binary action selection

The simulation was run for a period of 250 ms, out of which the input stimuli (assumed to be projected from cortex) were given during the time interval 100–200 ms. A background input of around 1 Hz was given during the rest of the simulation. The striatal network with 2500 neurons is divided into two equal sections such that the neurons in the first section (1–1250) received Stimulus #1 and the rest (1251–2500) received Stimulus #2 (Figure 2). Seeking to understand how dopamine affects action selection, we varied the dopamine level from 0.1 (low) to 0.9 (high) and observed which of the 2 inputs was selected at different dopamine levels. The action selected is classified into Go/Explore/No-Go depending on which stimulus is selected: it is “Go” if the stimulus with higher salience is selected, “Explore” if the other stimulus is selected, and “No-Go” if no input is selected.

We studied the pattern of action selection as a function of DA level. In low DA state, the activity of STN is high (Figure 10) thus increasing the activity of GPi (Figure 10); an overactive GPi inhibits the thalamus and therefore no action is selected (Figure 10). We thus have a “No-Go” case under low DA conditions.

Figure 10

At intermediate levels of dopamine (0.4–0.6), GPi neurons dynamically segregate into two pools, those corresponding to Stimulus #1 and #2 respectively. Neurons in either pool fire in strong synchrony among themselves, while the two pools fire in alternation (Figure 11). The alternation is more visible in GPe and GPi and not so much in STN. This alternation, as we will see below, introduces variability in action selection, even though there is no change in input stimulus. We interpret this variability as a form of exploration in action selection because the burst of activity for the neuron pool corresponding to one action increases the probability of its selection, while simultaneously preventing the selection of the other action. We interpret this dynamical regime corresponding to intermediate DA levels as the “explore” regime.

Figure 11

For high dopamine levels (Figure 12), the activity of D1 striatum is high and the DP dominates the IP. The stronger input (Stimulus #2) is selected always as it reaches the threshold sooner. Thus higher dopamine levels correspond to the “Go” regime.

Figure 12

Simulations were run for 100 trials and the percentage of actions selected under each regime (Go, Explore and No-Go) was calculated for dopamine levels ranging from low (0.1) to high (0.9) (Figure 13). From Figure 13, we may note that the probability of No-Go, where no action is selected, decreases with increase in dopamine; probability of Go increases with dopamine; the peak of exploration is found at intermediate levels of dopamine.
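The classification rule and the per-regime percentages can be expressed compactly. The sketch below is illustrative only: the function names are hypothetical, and the trial outcomes are assumed to be available as the index of the selected stimulus (or None when no action crossed the threshold).

```python
from collections import Counter


def classify_trial(selected, saliences):
    """Label one trial as Go, Explore or No-Go.

    selected is an index into saliences, or None if nothing was selected.
    """
    if selected is None:
        return "No-Go"
    best = max(range(len(saliences)), key=lambda i: saliences[i])
    return "Go" if selected == best else "Explore"


def regime_percentages(selections, saliences):
    """Percentage of trials falling into each regime."""
    counts = Counter(classify_trial(s, saliences) for s in selections)
    n = len(selections)
    return {label: 100.0 * counts[label] / n
            for label in ("Go", "Explore", "No-Go")}
```

Running `regime_percentages` once per dopamine level over the 100 trials would give the three curves of a decision plot such as Figure 13.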

Figure 13

To check the influence of other structures on action selection, we removed the STN projections to GPi by omitting the first two terms on the right side of Equation (13). The resulting decision plot exhibited only Go and No-Go with a completely flat Explore regime (Figure B1-b in Supplementary Material Appendix B). This result suggests that the IP is crucial for exploration. We then studied how various aspects of the STN-GPe network affect exploration. Changes in GPe lateral connections did not alter exploration levels (results not shown). We next varied the strength of STN lateral connections and found that at very high values, the system shows very little exploration (Figure B1-a in Supplementary Material Appendix B). STN lesions in 6-OHDA and MPTP animals are known to relieve the symptoms of PD and initiate motor movements (Baunez et al., 1995) but result in multiple deficits while performing attentional and choice tasks, such as an increase in reaction time and a decrease in exploration levels (Baunez et al., 1995, 2001; Baunez and Robbins, 1997). We therefore studied the effect of STN lesions on exploration and found that as the size of the lesion increased, the amount of exploration decreased. We report the result for a lesion patch of 20 × 20 neurons (Figure B1-c in Supplementary Material), where the lesion was created at the center of the STN neuron lattice by setting the spiking activity of the corresponding neurons to zero. To investigate the effect of STN laterals on exploration in more detail, we increased the strength of STN laterals from 0.05 to 0.25 in steps of 0.05 and calculated % exploration at intermediate DA levels (0.4–0.6) in the binary action selection problem. Exploration was absent at the low and high ends of this range and peaked over a range of intermediate strengths (Figure B2 in Supplementary Material Appendix B).
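The lesion procedure, silencing a square patch at the center of the lattice, can be sketched as follows. The 50 × 50 lattice and the 20 × 20 patch are taken from the text; the function names and the list-based representation of one time step of spiking activity are assumptions for illustration.

```python
def centered_lesion_mask(lattice=50, patch=20):
    """Boolean lattice; True marks neurons inside the central lesion."""
    start = (lattice - patch) // 2
    stop = start + patch
    return [[start <= i < stop and start <= j < stop
             for j in range(lattice)]
            for i in range(lattice)]


def apply_lesion(spikes, mask):
    """Copy of one time step's spike lattice with lesioned neurons set to 0."""
    return [[0 if mask[i][j] else spikes[i][j] for j in range(len(row))]
            for i, row in enumerate(spikes)]
```

Applying the mask at every time step keeps the lesioned neurons silent for the whole simulation, and varying `patch` gives lesions of different sizes.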

Simulation set 3: the N-armed bandit task

The decision making ability of the BG model was checked by comparing its performance with that of the behavioral model, which serves as a representation of the experimental data, in the n-armed bandit task (n = 4). The task was simulated for a total of 300 trials. The payoff pattern of the 4 arms over the 300 trials, calculated using Equations (19)–(21), is shown in Supplementary Material Figure A1.

Parameter “delta”

The difference between the received payoff and estimated payoff from the BG model, the error (δ^bg_k) was calculated for each trial. These results were compared with the error (δ^be_k) obtained from the behavioral model (Bourdaud et al., 2008). The performance of BG model was found to be comparable to behavioral model, which was reflected in the difference between the expected values (V) obtained from behavioral model and the BG model, defined as e^bebg_k = V^bg-V^be, where V^bg and V^be are expected values obtained from BG and behavioral model. The average and SD of the 3 errors (δ^bg_k, δ^be_k, e^bebg_k) obtained by simulating both behavioral and BG model are listed in Table 3 for all the 6 subjects.

Table 3

Subject	δ^bg	δ^be	e^bebg
1	4.68 ± 4.74	5.27 ± 4.36	4.40 ± 4.49
2	5.04 ± 4.15	5.80 ± 4.59	3.95 ± 4.02
3	5.52 ± 5.05	5.88 ± 6.10	4.27 ± 6.12
4	5.05 ± 4.15	5.18 ± 4.27	4.31 ± 4.20
5	4.89 ± 4.13	5.85 ± 5.21	4.63 ± 4.51
6	4.45 ± 3.52	6.06 ± 5.55	4.68 ± 4.55

Errors obtained from behavioral model (δ^be_k) and BG (δ^bg_k) model independently and a comparison of the errors obtained from the 2 models (e^bebg_k).

Column 2 shows error (δ^bg_k) values for each of the 6 subjects in BG model. Column 3 shows the error obtained from behavioral model (δ^be_k) of Bourdaud et al. (2008). Column 4 indicates the difference in error (e^bebg_k) which is the difference between the expected values (V^bg, V^be) calculated from the 2 models.

δ^bg, error obtained from BG model; δ^be, error obtained from behavioral model; e^bebg, difference between V^bg and V^be.

Percent exploitation

In addition to the payoff error (δ), we used another measure, “percentage exploitation,” to compare the performance of the BG model with the experimental data. It is defined as the percentage of trials (calculated over 300 trials) in which the action yielding the highest (expected) reward was selected. For example, if in a trial “k” the highest reward is obtained from slot 4 and the model also selects slot 4, then the trial counts as exploitation; otherwise it counts as exploration. We calculated the average percentage exploitation over 10 sessions, where each session consists of 300 trials.
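A minimal sketch of this measure is given below. The function names and the representation of a session (a list of chosen arms plus a per-trial list of expected payoffs for each arm) are assumptions made for illustration.

```python
import statistics


def percent_exploitation(choices, expected_payoffs):
    """Percentage of trials in which the best-paying arm was chosen.

    choices[k] is the arm chosen on trial k; expected_payoffs[k] holds
    the expected payoff of every arm on trial k.
    """
    hits = 0
    for choice, payoffs in zip(choices, expected_payoffs):
        best = max(range(len(payoffs)), key=lambda i: payoffs[i])
        hits += int(choice == best)
    return 100.0 * hits / len(choices)


def mean_percent_exploitation(sessions):
    """Average the measure over sessions of (choices, expected_payoffs)."""
    return statistics.mean(percent_exploitation(c, p) for c, p in sessions)
```

Every trial that is not counted as exploitation is, by the definition above, an exploratory trial, so % exploration is simply 100 minus this value.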

Subject-to-subject variability in exploration was accounted for by varying the “temperature” parameter β in the behavioral model [Equation (A.8), Appendix A in Supplementary Material]. The parameter “β” controls the exploit-explore balance (higher β implies greater exploitation). Since the indirect pathway (IP) dynamics drive exploration in the BG model, we expected that varying the strength of the direct pathway (DP) (decreasing w_Str→GPi) or of the indirect pathway (increasing w_STN→GPi) would give similar results in terms of decreased % exploitation levels.
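Equation (A.8) sits in the Supplementary Material and is not reproduced here. As an illustration of how a soft-max rule of this kind trades exploration against exploitation, the sketch below uses the standard form in which β multiplies the expected values, so that a higher β concentrates choices on the best arm, in line with the statement above. The function names and the exact parameterization are assumptions.

```python
import math
import random


def softmax_probs(values, beta):
    """Soft-max choice probabilities; larger beta favors the best arm."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def choose_arm(values, beta, rng):
    """Sample one arm index according to the soft-max probabilities."""
    probs = softmax_probs(values, beta)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]
```

With β = 0 all arms are equally likely (pure exploration), and as β grows the probability mass moves toward the arm with the highest expected value (exploitation).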

The performance of the BG model was compared with the behavioral model in terms of % exploitation, as shown in Figure 14. Figure 14A shows % exploitation on the y-axis against individual subjects on the x-axis; each subject corresponds to a beta (β) value in the behavioral model and a DP weight value in the BG model. Holding w_STN→GPi constant at 0.75, we varied w_Str→GPi over the range [2, 4] in steps of 0.25 to match the exploitation levels of the subjects. The relationship between the DP weights (w_Str→GPi) and beta (β) is plotted in Figure 14C. A decrease in w_Str→GPi implies reduced influence of the DP relative to the IP, resulting in greater exploration. Similarly, one can control exploration by varying the strength of the IP (w_STN→GPi): % exploitation for changing beta (β) and increasing w_STN→GPi is plotted in Figure 14B. Holding w_Str→GPi constant at 5, we varied w_STN→GPi over the range [0.25, 1.25]. The individual weight values for the corresponding betas are plotted in Figure 14D.

Figure 14

To simulate the performance of PD subjects in the above model, we clamped delta (δ) to a negative value (-20), simulating low levels of DA, and checked the performance. We observed that % exploitation decreased (to 44%) compared to the normal condition. The decrease in performance in the PD off condition might be due to decreased exploration, leading to the selection of a suboptimal choice.

In the binary action selection task (Section Simulation Set 2: Binary Action Selection), we observed that the level of exploration could be related to the synchrony levels in STN neurons. We therefore classified each of the 300 trials as either exploratory or exploitatory and then checked the corresponding synchrony levels in STN neurons. The synchrony levels for exploitatory trials were significantly lower (0.13 ± 0.12) than those for exploratory trials (0.33 ± 0.176). An independent two-sample t-test was conducted on the synchrony parameter “R^sync” for exploratory and exploitatory trials. With a P-value of 0.002, the difference between the 2 mean “R^sync” values is statistically significant. The bar plot of mean “R^sync” for explore and exploit trials is shown in Figure 15.
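For readers who want to reproduce this kind of comparison on their own trial-wise synchrony values, a minimal sketch of the two-sample t statistic is given below. It assumes the pooled-variance (Student) form; the text does not state which variant was used, and the function name is hypothetical. Converting the statistic to a P-value additionally requires the t distribution, which is left to a statistics package.

```python
import math
import statistics


def two_sample_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, na + nb - 2
```

Here `a` and `b` would hold the “R^sync” values of the exploitatory and exploratory trials respectively; a negative statistic indicates lower synchrony in the first group.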

Figure 15

Discussion

The goal of this model was to understand the role of the BG in explorative behavior as well as the occurrence of synchrony in PD conditions. Studies on exploration-exploitation tradeoff show the importance of these processes during decision making (Sutton and Barto, 1998; Cohen et al., 2007; Humphries et al., 2012; Laureiro-Martínez et al., 2013). Experimental studies conducted by the Baunez group also suggest a role for STN in exploration where they observed that STN lesions tend to alter the choice made by rats (Baunez et al., 2001). These results emphasize the importance of studying exploration and the corresponding neural substrates at the subcortical level.

Exploration in action selection is usually modeled as being driven by noise or a stochastic mechanism (Cohen et al., 2007; Moustafa and Gluck, 2011; Schroll et al., 2012). The STN-GPe loop of BG has been proposed to act as a pacemaker (Plenz and Kital, 1999) capable of producing synchronized oscillations at low DA levels (PD) (Brown et al., 2001; Bevan et al., 2002) and desynchronized spiking activity at high DA levels. In an earlier study, using a rate-coded neural network model of BG (Kalva et al., 2012), we showed that the STN-GPe system exhibits chaos and fixed point dynamics as two network parameters (w = strength of connections between STN and GPe; σ = strength of lateral connections within STN and GPe) are varied. Exploration in that model reached its peak when the STN-GPe system was located on the border between chaotic and ordered regimes, viz., the “edge of chaos.” Given that synchrony plays an important role in PD and that DA levels influence the synchrony levels in STN-GPe, we developed a spiking neuron network model of BG to study the relation between synchrony and exploration by simulating simple (binary action selection) and complex (n-armed bandit task) decision making. The model also showed oscillatory activity in STN-GPe neurons at low DA levels, which is known to correlate with PD tremor (Bevan et al., 2002).

The STN-GPe loop and exploration-exploitation dynamics

One of the aims of the present study is to show that the complex dynamics of the STN-GPe system contribute to exploration and can be correlated with the synchrony levels in the STN-GPe loop. We therefore studied the STN-GPe loop dynamics (Section Simulation Set 1: STN-GPe Circuit Dynamics and Synchrony) without input from D2 striatum for increasing levels of dopamine (Figures 5–7). As we are interested in studying the role of synchrony in exploration, we characterized the dynamics of STN-GPe in terms of synchronization and oscillatory activity (Figure 9). STN and GPe neurons show synchrony (asynchrony) at low (high) DA levels (Bergman et al., 1994, 1998), and such behavior in excitatory-inhibitory networks is observed when the excitatory lateral connection is high and the inhibitory one is low (Lukasiewicz and Werblin, 1990). Considering these observations, we assumed that DA modulates the width of the Gaussians of STN and GPe collaterals. The results tally with the general observation from electrophysiology that at higher levels of dopamine the STN-GPe system shows desynchronized activity, whereas under dopamine-deficient conditions it exhibits synchronized bursts (Bergman et al., 1994; Gillies and Willshaw, 1998; Park et al., 2011). They are also consistent with the experimental finding that dopamine deficiency results in an increase of correlations in the firing patterns of STN neurons (Brown et al., 2001; Benazzouz et al., 2002; Levy et al., 2002; Willshaw and Li, 2002; Brown, 2003; Foffani et al., 2005). Some computational modeling efforts have investigated the link between STN-GPe oscillations and PD tremor (Bevan et al., 2002), an idea that also has strong experimental support with regard to the STN-GPe circuit (Nini et al., 1995; Hurtado et al., 1999; Levy et al., 2002; Brown, 2003; Park et al., 2010).
We observed that STN activity was oscillatory at a frequency of around 10 Hz, which falls under the beta frequency range observed in an experimental PD study (Weinberger and Dostrovsky, 2011).

Role of STN-GPe in binary action selection task

We then used the same model to simulate the binary action selection task (similar to Humphries et al., 2006). Here, we presented two stimuli as inputs to the model (Figure 1). The firing rate of each stimulus represented its saliency (Humphries et al., 2006); selection of the stimulus with higher saliency was defined as “exploitation/Go,” selection of the one with lower saliency as “exploration/Explore,” and not selecting any of the inputs as “No-Go.” In the BG model of Kalva et al. (2012) some action is always chosen; thus it does not have a “No-Go” regime. The current Izhikevich BG network showed No-Go at low DA levels (0.1–0.3) and Go at high DA levels (0.7–0.9), consistent with the classical picture of BG function. Along with this, a peak in “Explore” at intermediate levels of DA (0.4–0.6) was also observed (Figure 13). At intermediate levels of DA, the neuronal pools of STN corresponding to the 2 inputs fired out of phase, which was also observed in GPe and GPi neurons. This anti-phase spiking behavior of GPi neurons became the source of randomness in deciding which stimulus would finally get selected. In other words, the neuronal pool that was ahead in the alternation crossed the threshold first and the corresponding action got selected. This exploratory behavior was controlled by the strength of laterals in STN neurons (A_STN). As the strength of lateral connections was increased, the percentage of exploratory action selection peaked and then decreased on further increment (Figure B2 in Supplementary Material Appendix B). Increased A_STN leads to high synchrony among STN neurons, which is one of the characteristic features observed in PD patients (Park et al., 2010, 2011). From the above observations we may suggest that the decrease in exploration levels in PD subjects (Archibald et al., 2013a) could be due to increased lateral strength in STN neurons. We found that at high lateral synaptic strength, the system switched only between Go and No-Go regimes (Figure B1a in Supplementary Material).
To check whether any other module in the network influences exploration in the system, we removed the STN to GPi connection (which effectively eliminated the IP). With this omission the system displayed only Go and No-Go regimes (no exploration) (Figure B1b in Supplementary Material). We also studied the effect of STN lesions on exploration and found that as the size of the lesion increased, the system's exploratory behavior decreased. This result is in agreement with the study conducted by Baunez et al. (2001), who found a decrease in the explorative behavior of rats performing a choice reaction task.

Complex decision making: N-armed bandit task

In the n-armed bandit task, the aim of the subject was to maximize the reward by selecting the best (highest reward giving) slot in each trial, and performance was measured in terms of the amount of exploitative behavior. To make sense of the experimental data, a behavioral model was used to estimate which trial was explorative or exploitative. A similar model was used by Daw et al. (2006) to analyze their fMRI data. The behavioral model uses the classical soft-max principle (Equation A.8, Appendix A in Supplementary Material) and the parameter “β” controls the level of exploration. Though the behavioral model helps in analyzing the experimental data, it does not elaborate on the underlying neural mechanism of exploration because it is abstract. Apart from helping to understand the neural mechanism of exploration, the BG model can also be used to study decision making ability in Parkinsonian conditions and predict the effect of various drugs (L-Dopa, DA agonist) and STN–DBS. We attempted to explain the decision making mechanism in terms of synchrony levels in STN-GPe neurons. When subjected to the binary action selection task, the spiking network model of BG showed exploration at intermediate levels of DA, controlled by STN-GPe synchrony levels. The amount of exploration in the model was therefore controlled by adjusting the synaptic weights in the DP (w_Str→GPi) and the IP (w_STN→GPi).

The results of the behavioral model of Bourdaud et al. (2008) were approximated in 2 ways: (1) increasing w_STN→GPi while holding w_Str→GPi constant, which increases exploration since the GPi neurons are now more strongly influenced by STN; and (2) decreasing w_Str→GPi while keeping w_STN→GPi constant, which has the same effect as increasing w_STN→GPi.

Two parameters (δ and % exploitation) were compared to check how accurately the model accounts for the experimental results. The individual performance of the behavioral and BG models was checked by calculating delta (δ), the TD error [Equation (26)], which indicates how well the model is able to track the actual reward pattern from the slots (Table 3). If the BG model is able to replicate the experimental results, the “δ” obtained from the BG model and from the behavioral model (a measure of the experimental results) should be correlated. We therefore calculated the error e^bebg between the expected values of the two models (V^bg and V^be) to check the accuracy and found the error to be low (Table 3). We also conducted a two-sample t-test on the delta values obtained from the BG and behavioral models. An “H” value of 0 for the test cases indicates that the hypothesis that both deltas come from distributions with equal means could not be rejected. In other words, the difference/error between the expected values of the 2 models is low. The second measure was percentage exploitation, i.e., the percentage of trials in which the model selects the slot with the highest expected payoff (Figure 14). The results obtained from the BG model closely match those of the behavioral model, reinforcing the theory that STN-GPe could be a source of exploration at the subcortical level. The synchrony level in STN was also found to be statistically different (P = 0.002) during exploratory vs. exploitatory trials. The “R^sync” value (Figure 15) during exploitatory trials (0.13 ± 0.12) at high DA levels showed desynchronized behavior leading to the selection of the highest reward slot. During exploratory trials, a synchrony level of 0.33 ± 0.175 was observed in STN neurons, which is similar to that observed during binary action selection at intermediate DA levels. These intermediate synchrony levels gave rise to the alternating pattern, the source of randomness in the model, leading to exploratory behavior.

From these results, we would like to emphasize that the exploratory behavior in the system can be controlled by the collateral connection strength in STN neurons, which changes the synchrony levels in the system. These results suggest that the STN-GPe system of BG might be a possible exploratory substrate at the subcortical level. Cortical structures also play a critical role in decision making (Bechara et al., 1994; Fellows and Farah, 2003; Clark et al., 2004; Ragozzino, 2007). Many research groups have been working to characterize the anatomical substrates of exploration and exploitation during decision making (Daw et al., 2006; Bourdaud et al., 2008; Pearson et al., 2009, 2010; Laureiro-Martínez et al., 2013). The ability of cortex to modulate the oscillatory activity in STN-GPe neurons through the hyper-direct pathway has also been suggested and modeled (Kang and Lowery, 2013). Most classical models of basal ganglia do not include the connection between GPe and GPi (Albin et al., 1989; Delong, 1990), but recent studies (Nambu et al., 2005) indicate the presence of this pathway. Modeling studies such as Coulthard et al. (2012) showed the role of BG in decision making without including this specific connection in their model. Though the presence of this pathway has been established anatomically, its functional significance is yet to be explored. Taking these considerations into account, we have not included the GPe-GPi connection in the model. As a part of future work we would like to integrate cortical areas and the inhibitory GPe-GPi connection with the current model and study the rich dynamics in the system.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnins.2015.00191/abstract

References

1
Albin, R. L., Young, A. B., and Penney, J. B. (1989). The functional anatomy of basal ganglia disorders. Trends Neurosci. 12, 366–375. doi: 10.1016/0166-2236(89)90074-X
2
Aravamuthan, B., Muthusamy, K., Stein, J., Aziz, T., and Johansen-Berg, H. (2007). Topography of cortical and subcortical connections of the human pedunculopontine and subthalamic nuclei. Neuroimage 37, 694–705. doi: 10.1016/j.neuroimage.2007.05.050
3
Archibald, N. K., Hutton, S. B., Clarke, M. P., Mosimann, U. P., and Burn, D. J. (2013a). Visual exploration in Parkinson's disease and Parkinson's disease dementia. Brain 136, 739–750. doi: 10.1093/brain/awt005
4
Archibald, N. K., Hutton, S. B., Clarke, M. P., Mosimann, U. P., and Burn, D. J. (2013b). Visual exploration in Parkinson's disease and Parkinson's disease dementia. Brain 136, 739–750. doi: 10.1093/brain/awt005
5
Aston-Jones, G., Rajkowski, J., Kubiak, P., and Alexinsky, T. (1994). Locus coeruleus neurons in monkey are selectively activated by attended cues in a vigilance task. J. Neurosci. 14, 4467–4480.
6
Baunez, C., Humby, T., Eagle, D. M., Ryan, L. J., Dunnett, S. B., and Robbins, T. W. (2001). Effects of STN lesions on simple vs choice reaction time tasks in the rat: preserved motor readiness, but impaired response selection. Eur. J. Neurosci. 13, 1609–1616. doi: 10.1046/j.0953-816x.2001.01521.x
7
Baunez, C., Nieoullon, A., and Amalric, M. (1995). In a rat model of parkinsonism, lesions of the subthalamic nucleus reverse increases of reaction time but induce a dramatic premature responding deficit. J. Neurosci. 15, 6531–6541.
8
Baunez, C., and Robbins, T. W. (1997). Bilateral lesions of the subthalamic nucleus induce multiple deficits in an attentional task in rats. Eur. J. Neurosci. 9, 2086–2099. doi: 10.1111/j.1460-9568.1997.tb01376.x
9
Bechara, A., Damasio, A. R., Damasio, H., and Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition 50, 7–15. doi: 10.1016/0010-0277(94)90018-3
10
Benazzouz, A., Breit, S., Koudsie, A., Pollak, P., Krack, P., and Benabid, A. L. (2002). Intraoperative microrecordings of the subthalamic nucleus in Parkinson's disease. Mov. Disord. 17, S145–S149. doi: 10.1002/mds.10156
11
Bergman, H., Feingold, A., Nini, A., Raz, A., Slovin, H., Abeles, M., et al. (1998). Physiological aspects of information processing in the basal ganglia of normal and parkinsonian primates. Trends Neurosci. 21, 32–38. doi: 10.1016/S0166-2236(97)01151-X
12
Bergman, H., Wichmann, T., Karmon, B., and Delong, M. (1994). The primate subthalamic nucleus. II. Neuronal activity in the MPTP model of parkinsonism. J. Neurophysiol. 72, 507–520.
13
Bevan, M. D., Magill, P. J., Terman, D., Bolam, J. P., and Wilson, C. J. (2002). Move to the rhythm: oscillations in the subthalamic nucleus–external globus pallidus network. Trends Neurosci. 25, 525–531. doi: 10.1016/S0166-2236(02)02235-X
14
Borisyuk, G. N., Borisyuk, R. M., Khibnik, A. I., and Roose, D. (1995). Dynamics and bifurcations of two coupled neural oscillators with different connection types. Bull. Math. Biol. 57, 809–840. doi: 10.1007/BF02458296
15
Bourdaud, N., Chavarriaga, R., Galán, F., and Del R Millan, J. (2008). Characterizing the EEG correlates of exploratory behavior. IEEE Trans. Neural Syst. Rehabil. Eng. 16, 549–556. doi: 10.1109/TNSRE.2008.926712
16
Brown, P. (2003). Oscillatory nature of human basal ganglia activity: relationship to the pathophysiology of Parkinson's disease. Mov. Disord. 18, 357–363. doi: 10.1002/mds.10358
17
Brown, P., Oliviero, A., Mazzone, P., Insola, A., Tonali, P., and Di Lazzaro, V. (2001). Dopamine dependency of oscillations between subthalamic nucleus and pallidum in Parkinson's disease. J. Neurosci. 21, 1033–1038.
18
Caroni, P., Donato, F., and Muller, D. (2012). Structural plasticity upon learning: regulation and functions. Nat. Rev. Neurosci. 13, 478–490. doi: 10.1038/nrn3258
19
Chakravarthy, V., Joseph, D., and Bapi, R. S. (2010). What do the basal ganglia do? A modeling perspective. Biol. Cybern. 103, 237–253. doi: 10.1007/s00422-010-0401-y
20
Chersi, F., Mirolli, M., Pezzulo, G., and Baldassarre, G. (2013). A spiking neuron model of the cortico-basal ganglia circuits for goal-directed and habitual action learning. Neural Netw. 41, 212–224. doi: 10.1016/j.neunet.2012.11.009
21
Chevalier, G., and Deniau, J. (1990). Disinhibition as a basic process in the expression of striatal functions. Trends Neurosci. 13, 277–280. doi: 10.1016/0166-2236(90)90109-N
22
Clark, L., Cools, R., and Robbins, T. W. (2004). The neuropsychology of ventral prefrontal cortex: decision-making and reversal learning. Brain Cogn. 55, 41–53. doi: 10.1016/S0278-2626(03)00284-7
23
Cohen, J. D., Mcclure, S. M., and Angela, J. Y. (2007). Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philos. Trans. R. Soc. B Biol. Sci. 362, 933–942. doi: 10.1098/rstb.2007.2098
24
Coulthard, E. J., Bogacz, R., Javed, S., Mooney, L. K., Murphy, G., Keeley, S., et al. (2012). Distinct roles of dopamine and subthalamic nucleus in learning and probabilistic decision making. Brain 135, 3721–3734. doi: 10.1093/brain/aws273
25
Cragg, S. J., Baufreton, J., Xue, Y., Bolam, J. P., and Bevan, M. D. (2004). Synaptic release of dopamine in the subthalamic nucleus. Eur. J. Neurosci. 20, 1788–1802. doi: 10.1111/j.1460-9568.2004.03629.x
26
Daw, N. D., O'doherty, J. P., Dayan, P., Seymour, B., and Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature 441, 876–879. doi: 10.1038/nature04766
27
Delong, M. R. (1990). Primate models of movement disorders of basal ganglia origin. Trends Neurosci. 13, 281–285. doi: 10.1016/0166-2236(90)90110-V
28
Dovzhenok, A., and Rubchinsky, L. L. (2012). On the Origin of Tremor in Parkinson's Disease. PLoS ONE 7:e41598. doi: 10.1371/journal.pone.0041598
29
Doya, K. (2002). Metalearning and neuromodulation. Neural Netw. 15, 495–506. doi: 10.1016/S0893-6080(02)00044-8
30
Fan, K. Y., Baufreton, J., Surmeier, D. J., Chan, C. S., and Bevan, M. D. (2012). Proliferation of external globus pallidus-subthalamic nucleus synapses following degeneration of midbrain dopamine neurons. J. Neurosci. 32, 13718–13728. doi: 10.1523/JNEUROSCI.5750-11.2012
31
Fellows, L. K., and Farah, M. J. (2003). Ventromedial frontal cortex mediates affective shifting in humans: evidence from a reversal learning paradigm. Brain 126, 1830–1837. doi: 10.1093/brain/awg180
32
Foffani, G., Bianchi, A., Baselli, G., and Priori, A. (2005). Movement-related frequency modulation of beta oscillatory activity in the human subthalamic nucleus. J. Physiol. 568, 699–711. doi: 10.1113/jphysiol.2005.089722
33
Frank, M. J. (2005). Dynamic dopamine modulation in the basal ganglia: a neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism. J. Cogn. Neurosci. 17, 51–72. doi: 10.1162/0898929052880093
34
Frank, M. J. (2006). Hold your horses: a dynamic computational role for the subthalamic nucleus in decision making. Neural Netw. 19, 1120–1136. doi: 10.1016/j.neunet.2006.03.006
35
Frank, M. J., Samanta, J., Moustafa, A. A., and Sherman, S. J. (2007). Hold your horses: impulsivity, deep brain stimulation, and medication in parkinsonism. Science 318, 1309–1312. doi: 10.1126/science.1146157
36
Gerfen, C. R., Engber, T. M., Mahan, L. C., Susel, Z., Chase, T. N., Monsma, F., et al. (1990). D1 and D2 dopamine receptor-regulated gene expression of striatonigral and striatopallidal neurons. Science 250, 1429–1432. doi: 10.1126/science.2147780
37
Gerfen, C. R., and Surmeier, D. J. (2011). Modulation of striatal projection systems by dopamine. Annu. Rev. Neurosci. 34, 441. doi: 10.1146/annurev-neuro-061010-113641
38
Gillies, A., and Willshaw, D. (1998). A massively connected subthalamic nucleus leads to the generation of widespread pulses. Proc. R. Soc. Lond. Ser. B Biol. Sci. 265, 2101–2109. doi: 10.1098/rspb.1998.0546
39
Gillies, A., Willshaw, D., and Li, Z. (2002). Subthalamic-pallidal interactions are critical in determining normal and abnormal functioning of the basal ganglia. Proc. Biol. Sci. 269, 545–551. doi: 10.1098/rspb.2001.1817
40
Gogolla, N., Galimberti, I., and Caroni, P. (2007). Structural plasticity of axon terminals in the adult. Curr. Opin. Neurobiol. 17, 516–524. doi: 10.1016/j.conb.2007.09.002
41
Götz, T., Kraushaar, U., Geiger, J., Lübke, J., Berger, T., and Jonas, P. (1997). Functional properties of AMPA and NMDA receptors expressed in identified types of basal ganglia neurons. J. Neurosci. 17, 204–215.
42
Gupta, A., Balasubramani, P. P., and Chakravarthy, V. S. (2013). Computational model of precision grip in Parkinson's disease: a utility based approach. Front. Comput. Neurosci. 7:172. doi: 10.3389/fncom.2013.00172
43
Gurney, K., Prescott, T. J., and Redgrave, P. (2001a). A computational model of action selection in the basal ganglia. I. A new functional anatomy. Biol. Cybern. 84, 401–410. doi: 10.1007/PL00007984
44
Gurney, K., Prescott, T. J., and Redgrave, P. (2001b). A computational model of action selection in the basal ganglia. II. Analysis and simulation of behaviour. Biol. Cybern. 84, 411–423. doi: 10.1007/PL00007985
45
Hadipour-Niktarash, A., Rommelfanger, K. S., Masilamoni, G. J., Smith, Y., and Wichmann, T. (2012). Extrastriatal D2-like receptors modulate basal ganglia pathways in normal and parkinsonian monkeys. J. Neurophysiol. 107, 1500–1512. doi: 10.1152/jn.00348.2011
46
Hamani, C., Saint-Cyr, J. A., Fraser, J., Kaplitt, M., and Lozano, A. M. (2004). The subthalamic nucleus in the context of movement disorders. Brain 127, 4–20. doi: 10.1093/brain/awh029
47
Hammond, C., Bergman, H., and Brown, P. (2007). Pathological synchronization in Parkinson's disease: networks, models and treatments. Trends Neurosci. 30, 357–364. doi: 10.1016/j.tins.2007.05.004
48
Hansel, D., Mato, G., and Meunier, C. (1995). Synchrony in excitatory neural networks. Neural Comput. 7, 307–337. doi: 10.1162/neco.1995.7.2.307
49
Hauptmann, C., and Tass, P. A. (2007). Therapeutic rewiring by means of desynchronizing brain stimulation. Biosystems 89, 173–181. doi: 10.1016/j.biosystems.2006.04.015
50
Humphries, M. D., Khamassi, M., and Gurney, K. (2012). Dopaminergic control of the exploration-exploitation trade-off via the basal ganglia. Front. Neurosci. 6:9. doi: 10.3389/fnins.2012.00009
51
Humphries, M. D., Lepora, N., Wood, R., and Gurney, K. (2009). Capturing dopaminergic modulation and bimodal membrane behaviour of striatal medium spiny neurons in accurate, reduced models. Front. Comput. Neurosci. 3:26. doi: 10.3389/neuro.10.026.2009
52
Humphries, M. D., Stewart, R. D., and Gurney, K. N. (2006). A physiologically plausible model of action selection and oscillatory activity in the basal ganglia. J. Neurosci. 26, 12921–12942. doi: 10.1523/JNEUROSCI.3486-06.2006
53
Humphries, M., and Gurney, K. (2002). The role of intra-thalamic and thalamocortical circuits in action selection. Network 13, 131–156. doi: 10.1080/net.13.1.131.156
54
Humphries, M., Gurney, K., and Prescott, T. (2007). Is there a brainstem substrate for action selection? Philos. Trans. R. Soc. B Biol. Sci. 362, 1627–1639. doi: 10.1098/rstb.2007.2057
55
Hurtado, J. M., Gray, C. M., Tamas, L. B., and Sigvardt, K. A. (1999). Dynamics of tremor-related oscillations in the human globus pallidus: a single case study. Proc. Natl. Acad. Sci. U.S.A. 96, 1674–1679. doi: 10.1073/pnas.96.4.1674
56
Izhikevich, E. M. (2003). Simple model of spiking neurons. IEEE Trans. Neural Netw. 14, 1569–1572. doi: 10.1109/TNN.2003.820440
57
Izhikevich, E. M. (2007). Dynamical Systems in Neuroscience. Cambridge, MA: The MIT Press.
58
Jahr, C. E., and Stevens, C. F. (1990). Voltage dependence of NMDA-activated macroscopic conductances predicted by single-channel kinetics. J. Neurosci. 10, 3178–3182.
59
Kalva, S. K., Rengaswamy, M., Chakravarthy, V. S., and Gupte, N. (2012). On the neural substrates for exploratory dynamics in basal ganglia: a model. Neural Netw. 32, 65–73. doi: 10.1016/j.neunet.2012.02.031
60
Kang, G., and Lowery, M. M. (2013). Interaction of oscillations, and their suppression via deep brain stimulation, in a model of the cortico-basal ganglia network. IEEE Trans. Neural Syst. Rehabil. Eng. 21, 244–253. doi: 10.1109/TNSRE.2013.2241791
61
Kita, H., Chang, H., and Kitai, S. (1983). The morphology of intracellularly labeled rat subthalamic neurons: a light microscopic analysis. J. Comp. Neurol. 215, 245–257. doi: 10.1002/cne.902150302
62
Kita, H., and Kita, S. (1994). The morphology of globus pallidus projection neurons in the rat: an intracellular staining study. Brain Res. 636, 308–319. doi: 10.1016/0006-8993(94)91030-8
63
Kliem, M. A., Maidment, N. T., Ackerson, L. C., Chen, S., Smith, Y., and Wichmann, T. (2007). Activation of nigral and pallidal dopamine D1-like receptors modulates basal ganglia outflow in monkeys. J. Neurophysiol. 98, 1489–1500. doi: 10.1152/jn.00171.2007
64
Kreiss, D. S., Mastropietro, C. W., Rawji, S. S., and Walters, J. R. (1997). The response of subthalamic nucleus neurons to dopamine receptor stimulation in a rodent model of Parkinson's disease. J. Neurosci. 17, 6807–6819.
65
Kreitzer, A. C. (2009). Physiology and pharmacology of striatal neurons. Annu. Rev. Neurosci. 32, 127–147. doi: 10.1146/annurev.neuro.051508.135422
66
Krishnan, R., Ratnadurai, S., Subramanian, D., Chakravarthy, V. S., and Rengaswamy, M. (2011). Modeling the role of basal ganglia in saccade generation: Is the indirect pathway the explorer? Neural Netw. 24, 801–813. doi: 10.1016/j.neunet.2011.06.002
67
Kumar, A., Cardanobile, S., Rotter, S., and Aertsen, A. (2011). The role of inhibition in generating and controlling Parkinson's disease oscillations in the basal ganglia. Front. Syst. Neurosci. 5:86. doi: 10.3389/fnsys.2011.00086
68
Laureiro-Martínez, D., Canessa, N., Brusoni, S., Zollo, M., Hare, T., Alemanno, F., et al. (2013). Frontopolar cortex and decision-making efficiency: comparing brain activity of experts with different professional background during an exploration-exploitation task. Front. Hum. Neurosci. 7:927. doi: 10.3389/fnhum.2013.00927
69
Levy, R., Ashby, P., Hutchison, W. D., Lang, A. E., Lozano, A. M., and Dostrovsky, J. O. (2002). Dependence of subthalamic nucleus oscillations on movement and dopamine in Parkinson's disease. Brain 125, 1196–1209. doi: 10.1093/brain/awf128
70
Litvak, V., Jha, A., Eusebio, A., Oostenveld, R., Foltynie, T., Limousin, P., et al. (2011). Resting oscillatory cortico-subthalamic connectivity in patients with Parkinson's disease. Brain 134, 359–374. doi: 10.1093/brain/awq332
71
Lukasiewicz, P. D., and Werblin, F. S. (1990). The spatial distribution of excitatory and inhibitory inputs to ganglion cell dendrites in the tiger salamander retina. J. Neurosci. 10, 210–221.
72
Magdoom, K., Subramanian, D., Chakravarthy, V. S., Ravindran, B., Amari, S.-I., and Meenakshisundaram, N. (2011). Modeling basal ganglia for understanding Parkinsonian reaching movements. Neural Comput. 23, 477–516. doi: 10.1162/NECO_a_00073
73
Magnin, M., Morel, A., and Jeanmonod, D. (2000). Single-unit analysis of the pallidum, thalamus and subthalamic nucleus in parkinsonian patients. Neuroscience 96, 549–564. doi: 10.1016/S0306-4522(99)00583-7
74
Mahon, S., Vautrelle, N., Pezard, L., Slaght, S. J., Deniau, J.-M., Chouvet, G., et al. (2006). Distinct patterns of striatal medium spiny neuron activity during the natural sleep–wake cycle. J. Neurosci. 26, 12587–12595. doi: 10.1523/JNEUROSCI.3987-06.2006
75
Maurice, N., Deniau, J.-M., Glowinski, J., and Thierry, A.-M. (1998). Relationships between the prefrontal cortex and the basal ganglia in the rat: physiology of the corticosubthalamic circuits. J. Neurosci. 18, 9539–9546.
76
Mckinney, R. A. (2010). Excitatory amino acid involvement in dendritic spine formation, maintenance and remodelling. J. Physiol. 588, 107–116. doi: 10.1113/jphysiol.2009.178905
77
Meredith, G., Ypma, P., and Zahm, D. (1995). Effects of dopamine depletion on the morphology of medium spiny neurons in the shell and core of the rat nucleus accumbens. J. Neurosci. 15, 3808–3820.
78
Michmizos, K. P., and Nikita, K. S. (2011). Addition of deep brain stimulation signal to a local field potential driven Izhikevich model masks the pathological firing pattern of an STN neuron, in Engineering in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the IEEE (Boston, MA: IEEE), 7290–7293.
79
Modolo, J., Mosekilde, E., and Beuter, A. (2007). New insights offered by a computational model of deep brain stimulation. J. Physiol. Paris 101, 56–63. doi: 10.1016/j.jphysparis.2007.10.007
80
Moustafa, A. A., and Gluck, M. A. (2011). A neurocomputational model of dopamine and prefrontal–striatal interactions during multicue category learning by Parkinson patients. J. Cogn. Neurosci. 23, 151–167. doi: 10.1162/jocn.2010.21420
81
Muralidharan, V., Balasubramani, P. P., Chakravarthy, V. S., Lewis, S. J., and Moustafa, A. A. (2013). A computational model of altered gait patterns in Parkinson's disease patients negotiating narrow doorways. Front. Comput. Neurosci. 7:190. doi: 10.3389/fncom.2013.00190
82
Nambu, A., Tachibana, Y., Kaneda, K., Tokuno, H., and Takada, M. (2005). Dynamic model of basal ganglia functions and Parkinson's disease, in The Basal Ganglia VIII, eds J. P. Bolam, C. A. Ingham, and P. J. Magill (Springer), 307–312.
83
Nini, A., Feingold, A., Slovin, H., and Bergman, H. (1995). Neurons in the globus pallidus do not show correlated activity in the normal monkey, but phase-locked oscillations appear in the MPTP model of parkinsonism. J. Neurophysiol. 74, 1800–1805.
84
Niv, Y. (2009). Reinforcement learning in the brain. J. Math. Psychol. 53, 139–154. doi: 10.1016/j.jmp.2008.12.005
85
Pallotto, M., and Deprez, F. (2014). Regulation of adult neurogenesis by GABAergic transmission: signaling beyond GABAA-receptors. Front. Cell. Neurosci. 8:166. doi: 10.3389/fncel.2014.00166
86
Park, C., Worth, R. M., and Rubchinsky, L. L. (2010). Fine temporal structure of beta oscillations synchronization in subthalamic nucleus in Parkinson's disease. J. Neurophysiol. 103, 2707–2716. doi: 10.1152/jn.00724.2009
87
Park, C., Worth, R. M., and Rubchinsky, L. L. (2011). Neural dynamics in parkinsonian brain: the boundary between synchronized and nonsynchronized dynamics. Phys. Rev. E 83:042901. doi: 10.1103/physreve.83.042901
88
Pearson, J. M., Hayden, B. Y., Raghavachari, S., and Platt, M. L. (2009). Neurons in posterior cingulate cortex signal exploratory decisions in a dynamic multioption choice task. Curr. Biol. 19, 1532–1537. doi: 10.1016/j.cub.2009.07.048
89
Pearson, J. M., Heilbronner, S. R., Barack, D. L., Hayden, B. Y., and Platt, M. L. (2010). Posterior cingulate cortex: adapting behavior to a changing world. Trends Cogn. Sci. 15, 143–151. doi: 10.1016/j.tics.2011.02.002
90
Pinsky, P. F., and Rinzel, J. (1995). Synchrony measures for biological neural networks. Biol. Cybern. 73, 129–137. doi: 10.1007/BF00204051
91
Plenz, D., and Kitai, S. T. (1998). Up and down states in striatal medium spiny neurons simultaneously recorded with spontaneous activity in fast-spiking interneurons studied in cortex–striatum–substantia nigra organotypic cultures. J. Neurosci. 18, 266–283.
92
Plenz, D., and Kitai, S. T. (1999). A basal ganglia pacemaker formed by the subthalamic nucleus and external globus pallidus. Nature 400, 677–682. doi: 10.1038/23281
93
Prescott, T. J., Bryson, J. J., and Seth, A. K. (2007). Introduction. Modelling natural action selection. Philos. Trans. R. Soc. B Biol. Sci. 362, 1521–1529. doi: 10.1098/rstb.2007.2050
94
Ragozzino, M. E. (2007). The contribution of the medial prefrontal cortex, orbitofrontal cortex, and dorsomedial striatum to behavioral flexibility. Ann. N.Y. Acad. Sci. 1121, 355–375. doi: 10.1196/annals.1401.013
95
Reynolds, J. N. J., and Wickens, J. R. (2002). Dopamine-dependent plasticity of corticostriatal synapses. Neural Netw. 15, 507–521. doi: 10.1016/S0893-6080(02)00045-X
96
Richards, D. A., Mateos, J. M., Hugel, S., De Paola, V., Caroni, P., Gähwiler, B. H., et al. (2005). Glutamate induces the rapid formation of spine head protrusions in hippocampal slice cultures. Proc. Natl. Acad. Sci. U.S.A. 102, 6166–6171. doi: 10.1073/pnas.0501881102
97
Robertson, R., Clarke, C., Boyce, S., Sambrook, M., and Crossman, A. (1990). The role of striatopallidal neurones utilizing gamma-aminobutyric acid in the pathophysiology of MPTP-induced parkinsonism in the primate: evidence from [3H] flunitrazepam autoradiography. Brain Res. 531, 95–104. doi: 10.1016/0006-8993(90)90762-Z
98
Rogers, R. D. (2010). The roles of dopamine and serotonin in decision making: evidence from pharmacological experiments in humans. Neuropsychopharmacology 36, 114–132. doi: 10.1038/npp.2010.165
99
Russell, V., Allin, R., Lamm, M., and Taljaard, J. (1992). Regional distribution of monoamines and dopamine D1- and D2-receptors in the striatum of the rat. Neurochem. Res. 17, 387–395. doi: 10.1007/BF00974582
100
Schroll, H., Vitay, J., and Hamker, F. H. (2012). Working memory and response selection: a computational account of interactions among cortico-basalganglio-thalamic loops. Neural Netw. 26, 59–74. doi: 10.1016/j.neunet.2011.10.008
101
Schultz, W. (1998). Predictive reward signal of dopamine neurons. J. Neurophysiol. 80, 1–27.
102
Sharott, A., Doig, N. M., Mallet, N., and Magill, P. J. (2012). Relationships between the firing of identified striatal interneurons and spontaneous and driven cortical activities in vivo. J. Neurosci. 32, 13221–13236. doi: 10.1523/JNEUROSCI.2440-12.2012
103
Shen, K.-Z., and Johnson, S. W. (2010). Ca2+ Influx through NMDA-Gated channels activates ATP-Sensitive K+ currents through a nitric oxide–cGMP pathway in subthalamic neurons. J. Neurosci. 30, 1882–1893. doi: 10.1523/JNEUROSCI.3200-09.2010
104
Shouno, O., Takeuchi, J., and Tsujino, H. (2009). A spiking neuron model of the basal ganglia circuitry that can generate behavioral variability, in The Basal Ganglia IX, eds H. J. Groenewegen, P. Voorn, H. W. Berendse, A. B. Mulder, and A. R. Cools (New York, NY: Springer), 191–200.
105
Sinha, S. (1999). Noise-free stochastic resonance in simple chaotic systems. Phys. A Stat. Mech. Appl. 270, 204–214. doi: 10.1016/S0378-4371(99)00136-3
106
Smith, Y., Bevan, M. D., Shink, E., and Bolam, J. P. (1998). Microcircuitry of the direct and indirect pathways of the basal ganglia. Neuroscience 86, 353–388.
107
Smith, Y., and Kieval, J. Z. (2000). Anatomy of the dopamine system in the basal ganglia. Trends Neurosci. 23, S28–S33. doi: 10.1016/S1471-1931(00)00023-9
108
Steiner, H., and Tseng, K. Y. (2010). Handbook of Basal Ganglia Structure and Function: A Decade of Progress. London: Elsevier.
109
Stern, E. A., Jaeger, D., and Wilson, C. J. (1998). Membrane potential synchrony of simultaneously recorded striatal spiny neurons in vivo. Nature 394, 475–478. doi: 10.1038/28848
110
Stewart, T. C., Bekolay, T., and Eliasmith, C. (2012). Learning to select actions with spiking neurons in the basal ganglia. Front. Neurosci. 6:2. doi: 10.3389/fnins.2012.00002
111
Sukumar, D., Rengaswamy, M., and Chakravarthy, V. S. (2012). Modeling the contributions of Basal ganglia and Hippocampus to spatial navigation using reinforcement learning. PLoS ONE 7:e47467. doi: 10.1371/journal.pone.0047467
112
Surmeier, D. J., Ding, J., Day, M., Wang, Z., and Shen, W. (2007). D1 and D2 dopamine-receptor modulation of striatal glutamatergic signaling in striatal medium spiny neurons. Trends Neurosci. 30, 228–235. doi: 10.1016/j.tins.2007.03.008
113
Sutton, R. S., and Barto, A. G. (1998). Reinforcement Learning: an Introduction. Cambridge, MA: MIT Press.
114
Tachibana, Y., Iwamuro, H., Kita, H., Takada, M., and Nambu, A. (2011). Subthalamo-pallidal interactions underlying parkinsonian neuronal oscillations in the primate basal ganglia. Eur. J. Neurosci. 34, 1470–1484. doi: 10.1111/j.1460-9568.2011.07865.x
115
Terman, D., Rubin, J., Yew, A., and Wilson, C. (2002). Activity patterns in a model for the subthalamopallidal network of the basal ganglia. J. Neurosci. 22, 2963–2976.
116
Thibeault, C. M., and Srinivasa, N. (2013). Using a hybrid neuron in physiologically inspired models of the basal ganglia. Front. Comput. Neurosci. 7:88. doi: 10.3389/fncom.2013.00088
117
Tian, L., Stefanidakis, M., Ning, L., Van Lint, P., Nyman-Huttunen, H., Libert, C., et al. (2007). Activation of NMDA receptors promotes dendritic spine development through MMP-mediated ICAM-5 cleavage. J. Cell Biol. 178, 687–700. doi: 10.1083/jcb.200612097
118
Vickers, D. (1970). Evidence for an accumulator model of psychophysical discrimination. Ergonomics 13, 37–58. doi: 10.1080/00140137008931117
119
Wang, X.-J., and Rinzel, J. (1993). Spindle rhythmicity in the reticularis thalami nucleus: synchronization among mutually inhibitory neurons. Neuroscience 53, 899–904. doi: 10.1016/0306-4522(93)90474-T
120
Weinberger, M., and Dostrovsky, J. O. (2011). A basis for the pathological oscillations in basal ganglia: the crucial role of dopamine. Neuroreport 22, 151. doi: 10.1097/WNR.0b013e328342ba50
121
Willshaw, D., and Li, Z. (2002). Subthalamic–pallidal interactions are critical in determining normal and abnormal functioning of the basal ganglia. Proc. Biol. Sci. 269, 545–551. doi: 10.1098/rspb.2001.1817
122
Yucelgen, C., Denizdurduran, B., Metin, S., Elibol, R., and Sengor, N. S. (2012). A biophysical network model displaying the role of basal ganglia pathways in action selection, in Artificial Neural Networks and Machine Learning–ICANN 2012 (Berlin; Heidelberg: Springer), 177–184.
123
Zhu, Z.-T., Munhall, A., Shen, K.-Z., and Johnson, S. W. (2005). NMDA enhances a depolarization-activated inward current in subthalamic neurons. Neuropharmacology 49, 317–327. doi: 10.1016/j.neuropharm.2005.03.018

Summary

Keywords

Basal Ganglia, Izhikevich neurons, synchronization, n-arm bandit task, exploration

Citation

Mandali A, Rengaswamy M, Chakravarthy VS and Moustafa AA (2015) A spiking Basal Ganglia model of synchrony, exploration and decision making. Front. Neurosci. 9:191. doi: 10.3389/fnins.2015.00191

Received

31 December 2014

Accepted

12 May 2015

Published

27 May 2015

Volume

9 - 2015

Edited by

Paul Schrater, University of Minnesota, USA

Reviewed by

Bruno B. Averbeck, National Institute of Mental Health, USA; Leonid L. Rubchinsky, Indiana University - Purdue University Indianapolis, USA

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: V. Srinivasa Chakravarthy, Computational Neuroscience Laboratory, Department of Biotechnology, Bhupat and Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India; schakra@iitm.ac.in

This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Neuroscience
