The Arbitration–Extension Hypothesis: A Hierarchical Interpretation of the Functional Organization of the Basal Ganglia

Based on known anatomy and physiology, we present a hypothesis in which the basal ganglia motor loop is hierarchically organized into two main subsystems: the arbitration system and the extension system. The arbitration system, composed of the subthalamic nucleus, globus pallidus, and pedunculopontine nucleus, selects one out of several candidate actions as they ascend from various brain stem motor regions and are aggregated in the centromedian thalamus, or descend from the extension system or from the cerebral cortex. This system is an action-input/action-output system whose winner-take-all mechanism finds the strongest response among several candidates to execute. This decision is communicated back to the brain stem by facilitating the desired action via cholinergic/glutamatergic projections and suppressing conflicting alternatives via GABAergic connections. The extension system, composed of the striatum and, again, the globus pallidus, can extend the repertoire of responses by learning to associate novel complex states with certain actions. This system is a state-input/action-output system whose organization enables it to encode arbitrarily complex Boolean logic rules using striatal neurons that only fire given specific constellations of inputs (Boolean AND) and pallidal neurons that are silenced by any striatal input (Boolean OR). We demonstrate the capabilities of this hierarchical system with a computational model in which a simulated generic “animal” interacts with an environment by selecting its direction of movement based on combinations of sensory stimuli, some appetitive, others aversive or neutral. While the arbitration system can autonomously handle conflicting actions proposed by brain stem motor nuclei, the extension system is required to execute learned actions not suggested by external motor centers. Because it is precise about the functional role of each component of the system, this hypothesis generates several readily testable predictions.

Iman Kamali Sarvestani 1,2 *, Mikael Lindahl 1,2, Jeanette Hellgren-Kotaleski 1,2,3 and Örjan Ekeberg 1,2

Introduction
The basal ganglia (BG) are several subcortical nuclei that are supposedly involved in vertebrate action selection (Mink, 1996), reinforcement learning (Barto, 1995), and dimensionality reduction (Bar-Gad and Bergman, 2001) both in motor and cognitive (Alexander et al., 1986) domains through their extensive interconnections and heavy reciprocal projections with the thalamus and the brain stem (Parent and Hazrati, 1995a,b). Cerebral cortex sends direct afferents to the BG but only receives BG efferents indirectly via specific and non-specific thalamic nuclei, thus forming so-called BG-thalamo-cortical loops (Alexander et al., 1986). Several pathological states such as Parkinson's and Huntington's diseases have been associated with the BG.
The traditional components of the BG are six major ganglia, namely the caudate/putamen complex aka striatum (STR), the subthalamic nucleus (STN), the external globus pallidus (GPe), the internal globus pallidus (GPi), the substantia nigra pars reticulata (SNr), and the substantia nigra pars compacta (SNc). Based on their dense connections with the traditional BG nuclei, it has recently been suggested that several other nuclei may join this club, in particular the pedunculopontine nucleus (PPN; Mena-Segovia et al., 2004) and habenula (Hikosaka et al., 2008).
Since the BG output nuclei, i.e., the GPi and the SNr, send projections both to subcortical areas responsible for posture and locomotion (McHaffie et al., 2005; Grillner, 2006; Grillner et al., 2008; Takakusaki, 2008; Redgrave et al., 2010) and to parts of the motor thalamus, which in turn project to motor cortex (Gerfen et al., 1982), the BG stand in a critical position to both control the automatic responses of subcortical motor areas and influence the volitional movements originating in the motor areas of the cortex (Takakusaki et al., 2004).
The STR is believed to be organized in three functionally distinct segments, i.e., the skeletomotor, associative, and limbic regions on one hand (Alexander et al., 1986), and two separated compartments, i.e., the striosome and the matrix on the other hand (Graybiel and Ragsdale Jr., 1978, 1979). The combination makes at least six different nuclear domains with different histological properties, input/output structures, and putative functionalities. This separation is also observed in other components of the BG, thus forming the so-called segregated loops. Hereafter, by mentioning the BG nuclei, we are referring to the skeletomotor loop. However, since associative STR and STN are reported to project prominently to the motor GPi and GPe (Joel and Weiner, 1997), here we will extend the STR and STN to include both their motor and associative areas.
We will first review the thalamic and cortical inputs to the BG. Next, we will cover some of the known connections between these ganglia and discuss the output they deliver to other regions in the CNS. We will finally review some existing functional hypotheses before proposing our novel hypothesis about the functional structure of the BG.
As a major thalamic input to the BG, the centromedian nucleus of thalamus (CM; Smith et al., 2004; Smith et al., 2009) projects both to the STR and to the STN. Therefore, defining the nature of the information this thalamic nucleus carries to the BG is essential in forming a functional hypothesis. A review of different nuclei projecting to the CM in different species (Comans and Snow, 1981; Sadikot and Rymar, 2009) reveals a well-conserved afferent structure in all vertebrates. The CM nucleus receives inputs from nuclei responsible for preliminary transformation of sensory information into motor commands. The major afferents to the CM nucleus are from: the motor cortex; neurons in the intermediate and deep layers of the superior colliculus that carry motor commands about eye, head, and trunk movements; the lateral and superior vestibular nuclei reporting postural responses; the ventral horn of the spinal cord, as the end point in transforming sensory input to motor output in spinal reflexes; the cerebellar output nuclei carrying motor commands for correction of movement; and nuclei in the reticular formation responsible for eye and head orienting commands.
The cortical input to the STN originates in the primary and supplementary motor areas (M1 and SMA) as well as the frontal eye field and supplementary frontal eye field (FEF and SFEF; Parent and Hazrati, 1995b). A sparse projection from the primary somatosensory cortex of the rat to the STN has been reported (Canteras et al., 1988) but has not been confirmed by other studies (Petras, 1967; Hartmann von Monakow et al., 1978; McBride and Larsen, 1980; Afsharpour, 1985). Therefore, the sensory input to the STN may not play a major role in its overall functionality.
Following the same typical pattern as their thalamic counterparts, the cortical afferents to the STR are not limited to motor regions but extend to sensory and associative areas (Clary and Irvine, 1986;Graziano and Gross, 1993;Parent and Hazrati, 1995a).
The thalamic and cortical information sent to the STN and the STR is distributed to other ganglia via several pathways. The STN sends its glutamatergic outputs to the GPe and the GPi and in turn receives GABAergic projections from the GPe (Shink et al., 1996;Sato et al., 2000). The STN also sends glutamatergic projections to the PPN and receives reciprocal mixed cholinergic/glutamatergic projections from the PPN (Bevan and Bolam, 1995). It is worth noting that since PPN is a heterogeneous structure with disputed anatomical boundaries and is associated with a vast spectrum of putative behaviors, we will exclusively consider cholinergic and glutamatergic neuronal populations involved in regulation of postural muscle tone and locomotion (Takakusaki et al., 2004). We will call the set of neurons associated with these two motor tasks the PPN/mesencephalic locomotor region (MLR) complex. Therefore, other neuronal groups within the PPN are left out of the hypothesis hereafter.
The GPe neurons inhibit their GPi counterparts, the SNr and other GPe neurons through GABAergic connections. Unlike the GPe, GPi does not directly project back to the STN, but there are polysynaptic pathways between these two ganglia. The GPi and SNr send inhibitory projections to several brain stem nuclei, the CM (Sadikot and Rymar, 2009), the PPN/MLR and some motor nuclei of ventral thalamus. The PPN/MLR complex serves as the excitatory complement of the GPi output (Winn, 2006) by projecting to the CM, the superior colliculus and other brain stem nuclei especially those in control of locomotion and postural reflexes (Garcia-Rill, 1991;Karachi et al., 2010).
The striatal medium spiny neurons (MSNs) project mainly to the GPe and GPi. The MSNs are traditionally classified in two categories: Those expressing substance-P while having dominant dopamine receptor type 1 (D1R) and those expressing enkephalin while having dominant dopamine receptor type 2 (D2R). Although early hypotheses postulated that the D1R dominant neurons (D1RN) project to GPi (direct pathway) and D2R dominant neurons (D2RN) project to the GPe (indirect pathway), later observations revealed that the D1RNs also send collaterals to GPe on their way to the GPi (Lévesque and Parent, 2005). Figure 1 summarizes the interconnection of BG nuclei and their position in the CNS.
The first widely accepted model of functional organization in the BG (Albin et al., 1989; Wichmann and DeLong, 1996) proposed that the direct pathway carries information about the action(s) to do [Go command(s)] and the indirect pathway contains information about the action(s) to avoid [NoGo command(s)]. A more recent model has suggested that the striatopallidal pathways may be interpreted as systems to reduce the dimensionality of cortical information (Bar-Gad and Bergman, 2001). Both models focus on the role of the striatopallidal pathways and give the STN a minor role as a part of the indirect pathway. The complexities and ambiguities arising from inclusion of the STN in the classical models of the BG stem from three major questions: first, the type of input to the STN; second, how the STN relays its message to other nuclei; and third, the functional role of the STN in overall BG networks.
Early models of the BG considered the GPe as the exclusive source of input to the STN. In other words, they tended to view the STN as a secondary relay stage carrying the GPe commands to the GPi. In this view, GPe messages were transferred to the GPi not only monosynaptically but also disynaptically via the STN. Introduction of the so-called hyperdirect pathway, which considers the STN as a major input to the BG (Nambu et al., 2002), however, revolutionized the field. Inspired by the idea that the STN receives direct input that reaches the GPi faster than the STR message, some authors included the hyperdirect pathway in their models (Gurney et al., 2001; Rubchinsky et al., 2003; Frank, 2006; Humphries et al., 2006; Leblois et al., 2006). However, such models do not differentiate between the type of input to the STR and that to the STN.
The topography of connections between the STN and other nuclei, especially the globus pallidus, is another source of ambiguity in different models of the BG. Some authors have reported exact reciprocal projection from the STN to the globus pallidus (Shink et al., 1996). Later observations, however, have shown that a single STN neuron contacts several neurons in the globus pallidus (Sato et al., 2000). Some models have interpreted the multiple targeted projection of the STN neurons on pallidal neurons as diffuse one-to-all connectivity (Mink, 1996; Gurney et al., 2001; Frank, 2006; Humphries et al., 2006; Leblois et al., 2006) while others have assumed focused but out-of-register topography for the STN-GPe projection (Rubchinsky et al., 2003).
In accordance with the ambiguities in input type and connectivity pattern, the functional role of the STN in the BG circuitry has also been interpreted differently by different authors. Some emphasize the spatial role of the assumed diffuse connections from the STN to the globus pallidus as providing an off surround whose on center comes from the inhibitory projections of the STR so that the combined effect will choose one action and suppress all others (Gurney et al., 2001; Leblois et al., 2006). Other authors (Frank, 2006) have highlighted the temporal role of the presumed diffuse projections from the STN to the GPi/SNr as putting a hold on all actions (global NoGo) until the right time for triggering one action via striatal inhibition. Having observed three temporally distinct responses in the GPi after stimulation in the cortex, some authors have extended the idea of temporal sequencing by assuming differential roles for the hyperdirect,
nuclei responsible for locomotion and muscle tone as well as the CM nucleus of thalamus that directly projects back to the BG thus forming a clear "loop."

The two systems of the basal ganglia
A major theme of the arbitration-extension hypothesis is the dissociated role of two major systems in the BG: the arbitration system and the extension system. The arbitration system, composed of the STN, PPN/MLR, GPe, and GPi, serves as a selection mechanism capable of choosing one out of several conflicting actions. The extension system, composed of the STR, GPe, and GPi, extends the repertoire of behaviors by overriding the innate choices of the arbitration system and imposing actions that have been learned for certain states. The two systems have different inputs, outputs, and putative functionality while communicating via shared components.

Input
The inputs to the arbitration system are the motor commands originating in the brain stem sensory-motor transformation nuclei aggregated in the CM nucleus of thalamus as well as the motor commands initiated by the motor areas of the cortex, i.e., the M1, the SMA, the FEF, and the SFEF. In contrast, the extension system receives sensory and associative inputs from all sensory and associative cortical areas in addition to the motor commands. In other words, the inputs to the arbitration system are the candidate actions while the inputs to the extension system represent the spontaneous state of the animal. It is worth noting that the state of the animal is defined not only by sensory and associative information but also by information about the current action being performed and probable candidates to replace the current action sequentially. Therefore, we consider the arbitration system as an action-input/action-output system but regard the extension system as a state-input/action-output system.

Function
The brain stem contains several nuclei for transforming early sensory information to preliminary motor responses. These nuclei possess pre-wired connections and serve the innate goals of an animal. The CM is here considered as an aggregation point for all such responses, i.e., the CM efferents carry an instantaneous mix of brain stem responses. A mixture of responses is rarely the best motor output for an animal. For example, averaging the motor responses when an animal faces two targets one on the left and the other on the right side leads to an erroneous decision to go between the two. The STN via its connections with the PPN/MLR and GPe selects one of the candidate actions suggested by the brain stem and cortical motor regions. Thus, the arbitration system essentially suppresses all but one action at a time.
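The two-target example above can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the angles, the strength values, and the function names `averaged_direction` and `selected_direction` are hypothetical and are not taken from the computational model described in this article.

```python
def averaged_direction(directions, strengths):
    """Strength-weighted mean of all candidate directions (degrees)."""
    total = sum(strengths)
    return sum(d * s for d, s in zip(directions, strengths)) / total


def selected_direction(directions, strengths):
    """Direction of the single strongest candidate (all others suppressed)."""
    winner = max(range(len(strengths)), key=lambda i: strengths[i])
    return directions[winner]


# Hypothetical scene: one target 40 degrees to the left, one 40 degrees
# to the right; the right-hand target is slightly more salient.
directions = [-40.0, 40.0]
strengths = [1.0, 1.2]

blend = averaged_direction(directions, strengths)   # heads between the targets
choice = selected_direction(directions, strengths)  # heads to one target
```

In this toy case the averaged command points almost straight ahead, toward neither target, whereas selection commits to the more salient one.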
The arbitration system serves the pre-wired innate goals together with the fixed policies associated with them. However, an animal has a clear evolutionary advantage if policies can be formed and modified during its lifetime via learning mechanisms. The extension system is the substrate for such plastic modifications. This system extends the repertoire of responses an animal displays when facing more complex states by learning the association between such compound stimuli and the responses. Simple stimuli are combined to form arbitrarily complex states. Therefore,
direct, and indirect pathways creating the global NoGo (early excitation), start (inhibition), and termination (late excitation) commands respectively (Nambu et al., 2000).
Some fundamental questions and several new observations remain untouched in current models of the BG. First, the nature of the input to the BG has been treated quite loosely. The models proposed so far toggle between the sensory and motor nature of the inputs to the BG. Second, those models that include STN as an input site tend to assume the same input to the STN and the STR. Third, the BG output via the STN-PPN/MLR path is ignored in BG models. The importance of PPN in the decision making process has been appreciated in some models (Mena-Segovia et al., 2004) but a more precise explanation of the role it may play seems needed. Fourth, the assumption of one-to-all connectivity in STN-GPe connections reduces the role of the STN to a modulatory nucleus setting the threshold of action selection in decisions involving high conflicts. This is inconsistent with precise topographic maps observed in the STN. Fifth, striatal and pallidal neurons are usually treated as relay neurons and their computational capacities are neglected. Sixth, the functional properties of the D1RN-GPe projections have been included in some models (Cohen and Frank, 2009) for reward generation but not for motor control. These, together with some other problems in the interpretation of BG activity (Graybiel, 2005;Nambu, 2008) have been the driving forces to develop the current system level hypothesis on the functional organization of the BG.

Hypothesis
The arbitration-extension hypothesis of the BG organization, developed in this article, is based on anatomical and electrophysiological data on the BG interconnections. In order to keep the analysis transparent, we intentionally avoid introducing any learning mechanisms and will focus on what the structural organization of the BG can perform after learning. We will postpone a brief explanation of how the efficacy of connections may be modified to the discussion section but do not count it as a part of the hypothesis itself. We will also focus on the relationship between the BG and subcortical areas for three reasons. First, decerebrate animals possess a rich repertoire of behaviors that are supposed to be controlled by the neuronal networks of the BG and brainstem (Bjursten et al., 1976; Grillner and Wallén, 1985). Second, many vertebrate species do not have a well-developed layered cerebral cortex yet have the full machinery of the BG (Stephensson-Jones et al., in preparation). Such animals do choose different behaviors by selecting the direction, velocity, and amount of movement and changing the gait. They can also adjust the muscle tone for postural control during movement. Third, the GPi-receiving thalamic neurons have their terminals in layers I and II of the motor cortex (Jinnai et al., 1987) while the cortical neurons projecting to the matrix compartment of the STR are located mainly in layer V and to a lesser extent in layers III and II (Gerfen, 1989). The cortical activity is so complex, both layer-wise and corticocortically, with so many players involved, that it cannot naïvely be regarded as simplistic relay neurons disinhibited by the GPi via thalamic relays. Including the cortex as a target of the BG adds the complications of cortical processing to the already complicated network processing in the BG.
In contrast, GPi can effectively inhibit subcortical
model, a certain neuron in the STN representing a specific action provides excitatory input to the GPe neurons representing competing action(s) (antitopographic projection), which in turn project back topographically, thus inhibiting conflicting STN neurons. The number of actions competing with a certain action may range from one to several hundred. Therefore, the antitopographic projections from the STN to the GPe may form a spectrum from focal to diffuse. The back projection from the GPe to the STN is usually more focused and topographically organized. However, if the action is a compound one, then GPe neurons may target several distinct patches of STN neurons focally. The combined effect of the STN-PPN/MLR and STN-GPe loops is to facilitate an action and suppress competing action(s) in the STN, a process that eventually leads to a single action standing alone as the winner.
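The interplay of the two primary loops can be sketched as a small rate model. This is a minimal illustration, not the simulation used in this article: the leaky rate equation, the clipping of rates to the range 0 to 1, the function name `arbitrate`, and the parameters `w_exc`, `w_inh`, `dt`, and `steps` are all assumptions chosen for clarity. The PPN/MLR is reduced to a topographic copy of STN activity, and each GPe unit is reduced to the summed activity of the competing STN units (antitopographic excitation) that it then feeds back as inhibition onto its own STN unit (topographic inhibition).

```python
def arbitrate(candidate_strengths, w_exc=0.5, w_inh=1.0, dt=0.1, steps=500):
    """Toy winner-take-all over STN units; returns the final STN rates."""
    n = len(candidate_strengths)
    stn = [0.0] * n
    for _ in range(steps):
        # Positive loop: PPN/MLR unit i mirrors STN unit i (topographic).
        ppn = list(stn)
        # Negative loop: GPe unit i is driven by all competing STN units
        # (antitopographic) and inhibits STN unit i (topographic).
        total = sum(stn)
        gpe = [total - stn[i] for i in range(n)]
        updated = []
        for i in range(n):
            drive = candidate_strengths[i] + w_exc * ppn[i] - w_inh * gpe[i]
            rate = stn[i] + dt * (-stn[i] + drive)
            updated.append(min(1.0, max(0.0, rate)))  # keep rates in [0, 1]
        stn = updated
    return stn


# Three hypothetical candidate actions; the second is the strongest.
final_rates = arbitrate([0.5, 0.6, 0.4])
winner = final_rates.index(max(final_rates))
```

With these assumed parameters, small differences between candidates are amplified by the positive loop while the negative loop silences the weaker units, so only the strongest candidate remains active.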

Secondary loops
There are three negative and two positive secondary feedback loops in the arbitration system. These polysynaptic loops provide afferent copies to nuclei of interest. Moreover, the variety of latencies offered by such secondary loops improves the robustness of decisions in different temporal scales of decision making.
We assume that the STN-GPi projections are similar to those of STN-GPe but the inhibitory feedback travels a longer path before reaching the STN. There are several long negative feedback pathways from the GPi to the STN, the shortest of which is the GPi-CM-STN pathway. This topographic pathway inhibits the CM neurons instead of directly inhibiting the STN. The overall effect is, however, essentially the same as that of GPe-STN, with a longer latency. An important consequence of the connectivity between the GPi and the CM is that the CM neurons will also carry a copy of the winner. As we will argue later, this copy is used by the extension system.
A second long inhibitory pathway from the GPi to the STN is the topographically organized GPi-PPN/MLR-STN loop. Positive feedback loops such as STN-PPN/MLR are prone to instability if the loop gain is larger than one. The GPe and the GPi, respectively inhibiting the STN and PPN/MLR neurons, prevent this loop from growing without bound. Since the PPN/MLR neurons are disynaptically connected to the spinal cord, the GPi innervation of the PPN/MLR seems crucial in avoiding facilitation of several actions.
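The stability argument can be illustrated with a two-variable linear sketch of the STN-PPN/MLR loop. Everything here is an assumption made for illustration: the leaky linear units, the function name `simulate_loop`, the coupling weight `w` used in both directions, and the lumping of GPe inhibition of the STN and GPi inhibition of the PPN/MLR into one activity-proportional gain `k`. Under these assumptions the static loop gain is (w / (1 + k))^2, so the loop diverges when w exceeds 1 + k.

```python
def simulate_loop(w, k, drive=0.1, dt=0.05, steps=2000):
    """Euler simulation of a reciprocal excitatory loop; returns STN rate.

    w: excitatory weight STN -> PPN/MLR and PPN/MLR -> STN (assumed equal)
    k: gain of the pallidal inhibition acting on each nucleus (assumed)
    """
    stn = 0.0
    ppn = 0.0
    for _ in range(steps):
        d_stn = -stn + drive + w * ppn - k * stn  # GPe-like inhibition
        d_ppn = -ppn + w * stn - k * ppn          # GPi-like inhibition
        stn += dt * d_stn
        ppn += dt * d_ppn
    return stn


runaway = simulate_loop(w=1.2, k=0.0)  # loop gain 1.44: grows without bound
bounded = simulate_loop(w=1.2, k=0.5)  # loop gain 0.64: settles to a fixed rate
```

In the bounded case the STN rate settles at drive * (1 + k) / ((1 + k)^2 - w^2), which follows from setting both derivatives to zero.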
Internal globus pallidus output reaches brain stem nuclei not only indirectly but also by monosynaptic innervations of some brain stem nuclei. This connection guarantees that brain stem decision nuclei do not generate shadow decisions in parallel to the decision made by the arbitration system.
We hypothesize that the PPN/MLR-CM connection is the positive dual of the GPi-CM connection and the PPN/MLR-brain stem connection is the positive dual of the GPi-brain stem connection. The GPi connections to the CM and brain stem suppress the representations of the losing actions in the brain stem while the PPN connections to the same targets facilitate the winner. Together, the four projections convey sufficient and necessary information to execute the proper action in the spinal cord and brain stem. The existence of an excitatory output besides the inhibitory output considered in classical models of the BG may explain the paradoxically minor influence lesioning the GPi has on motor performance (Inase et al., 1996; Desmurget and Turner, 2008).
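The complementary roles of the two outputs can be sketched in a few lines. This is a deliberately crude illustration: the additive facilitation and subtractive suppression, the function name `brainstem_output`, and the parameter values are assumptions, not part of the hypothesis. Setting `suppression` to zero mimics a GPi lesion in this toy setting.

```python
def brainstem_output(commands, winner, facilitation=0.5, suppression=1.0):
    """Combine brain stem commands with the two arbitration outputs.

    The winner receives excitation (PPN/MLR-like output); every other
    command receives inhibition (GPi-like output), floored at zero.
    """
    output = []
    for index, command in enumerate(commands):
        if index == winner:
            output.append(command + facilitation)
        else:
            output.append(max(0.0, command - suppression))
    return output


commands = [0.5, 0.6, 0.4]  # hypothetical candidate strengths
intact = brainstem_output(commands, winner=1)
lesioned = brainstem_output(commands, winner=1, suppression=0.0)
```

In the lesioned case the losing commands are no longer silenced, yet the facilitated winner still stands out, which is in line with the suggestion that the excitatory output can partly compensate for the loss of GPi output.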
The extension system can therefore be viewed as a general-purpose Boolean logic machine (crisp or fuzzy) that constructs and implements complex rules using complex states.

Hierarchical organization and output
The arbitration system is in a position to control the outcome of the brain stem decision nuclei. The extension system in turn is capable of altering the selected responses proposed by the arbitration system by introducing learned policies to the arbitration process. This hierarchical organization suggests an evolutionary process as well as an advanced system of decision making in vertebrates equipped with different levels of decision making, serving both the hard-wired evolutionary goals and learned strategies. Such a hierarchical organization requires common output structures to avoid dual decision making centers.
The arbitration system is in charge of controlling the brain stem via two pathways: one excitatory output via PPN/MLR and one inhibitory output via the GPi. The extension system has direct access only to one inhibitory output via GPi, but also has the power to modify the output of the arbitration system by influencing both the GPi and the GPe.

The arbitration system
The arbitration system is a winner-take-all network composed primarily of one positive and one negative feedback loop centered on the STN. The main feature of the negative feedback loop via the GPe is the diffuse excitatory leg from the STN to the globus pallidus and the focused inhibitory leg back to the STN. The combination of such a negative feedback loop and the self-excitatory nature of the positive feedback loop via the PPN/MLR is the essence of the winner-take-all network of the arbitration system. We postulate that each nucleus in the arbitration system contains an array of neurons each representing a certain response and that each action entering as a candidate has a strength value assigned to it. The strength, associated with the "salience" used by other authors (Humphries et al., 2006), represents the urgency of the responses generated by the brain stem and aggregated in the CM, or coming from motor cortex, or proposed by the extension system, as they enter the gateways of the arbitration system in the STN or the globus pallidus. This system facilitates the strongest response and suppresses the others.

Primary loops
There are two disynaptic primary loops in the arbitration system. The STN and the PPN are linked reciprocally by excitatory (glutamatergic and cholinergic) projections (Bevan and Bolam, 1995). Although not much is known about the precise synaptic connectivity and axonal branching of this loop, we hypothesize that the neurons representing one action in the STN and those representing the same action (topographic projection) in PPN are reciprocally connected thus enhancing the activity of the neurons representing the selected action.
The STN and GPe form a well-known connection of the BG whose activity both in physiological and pathological states has been an active area of research. We hypothesize that the STN-GPe loop functionally provides mutual inhibition between the STN neurons. The exact mechanism of this phenomenon is unclear. In one possible connectivity pattern we use in our computational
excites them. In other words, the output of any MSN depends on the conjunction of the stimuli it receives on its dendritic tree. Pallidal neurons, on the other hand, have relatively high membrane potentials and low thresholds, qualifying them as disjunction (OR) neurons. They reduce their firing rates when any of the D1RNs in their action unit fires, i.e., the output of a given pallidal neuron depends on the disjunction of the striatal inputs it receives. In this framework, the disjunction of conjunctions of state components can actively be generated within an action unit.
Disjunction of conjunctions can represent any arbitrarily complex Boolean rule when supplemented with a negation operation. We postulate that the inhibitory collaterals between the D2RNs and the D1RNs in the same action unit effectively play the role of negation (NOT) between conjunctions. In other words, the D2RNs record the conjunction of stimuli to be negated in the Boolean logic representation of a certain complex state. They enforce this "NOT" operation by sending inhibitory collaterals to the D1RNs in their action unit.

Striatopallidal projections
In accordance with classical models of the BG, we postulate that the Go action units facilitate execution of actions leading to rewards or reduced punishment while NoGo action units suppress the actions leading to punishment or reduced reward.
We hypothesize that the arbitration system can operate autonomously in the absence of the extension system using brain stem and cortical motor commands. Therefore, when the extension system learns to facilitate a certain response via D1RN-GPi projections, it must simultaneously suppress competing responses,

The extension system
During their lifetimes, animals face many complex states, i.e., states composed of several stimuli. Those animals capable of exploring higher levels of state complexity and learning appropriate rules associated with those states may exploit their surroundings more efficiently. The extension system is responsible for learning such complex states and linking them to single actions or sets of actions. We postulate that the linkage between a state and the learned response associated with it is achieved via the striatopallidal fibers running from a population of MSNs to each GPi neuron. We name a single GPi neuron, D1RNs connecting to it and the D2RNs connecting to those D1RNs via inhibitory collaterals a Go action unit (Figure 2). Each action unit keeps records of all states requiring facilitation of the same shared response. In the same way we name a single GPe neuron, D2RNs connecting to it and the D1RNs connecting to those D2RNs via inhibitory collaterals a NoGo action unit. We postulate that an action unit is the functional equivalent of a pallidal neuron and its matrisome, groups of striatal neurons receiving sensory input from the same body part representation in the sensory cortex and converging on the same pallidal neuron Graybiel, 1993, 1994;Kincaid and Wilson, 1996).

Neuronal types and functions
We hypothesize differentiated functionality for the MSNs and the pallidal neurons. Through their very low resting membrane potentials, high thresholds and extensive dendritic trees, the MSNs well qualify to play the role of conjunction (AND) neurons. They do not fire unless a complete constellation (conjunction) of stimuli

Summary
To put it in a compact form, here we summarize the hypothesis: (a) The basal ganglia are composed of two functional systems: the arbitration system and the extension system. (b) The two systems are operating hierarchically so that the arbitration system controls the brain stem and cortical motor regions and in turn is controlled by the extension system. (c) The input to the arbitration system is a set of candidate actions from different sensory-motor transformation regions in the brain stem and cerebral cortex while the input to the extension system is a set of state components that can have sensory, associative, or motor nature. (d) The arbitration system has two outputs via PPN/MLR and GPi whereas the extension system has just the GPi as an output. (e) The arbitration system operates as a winner-take-all network facilitating the candidate action with highest strength and suppressing others.

model and demonstratIons
In order to verify whether this hypothetical model can actively make decisions using both the arbitration and the extension systems while keeping the biological plausibility on a fairly high level, we designed a simulation framework. The model represents a simple simulated "animal" with a one dimensional array representing a "retina" of 128 pixels capable of observing and measuring its relative distance to aversive (designated by red color), appetitive (designated by blue color), obstacle (designated by green color), and contextual stimuli (designated by different levels of gray). The animal's motor system can steer it toward any of the 128 directions. The animal must decide about the instantaneous direction of the movement. All ganglia involved except for the STR have 128 neurons each representing steering toward one of the directions. The STR has two arrays of D1RNs and one array of especially those who would be suggested independently by the arbitration system otherwise. We hypothesize that D1RNs in a certain action unit project antitopographically to the GPe. Suppression of competing actions in the GPe lifts the inhibition off the corresponding GPi neurons via the topographic GPe-GPi projections. The combined effect of D1RN-GPi and D1RN-GPe projections is to suppress all but one response. Thus, a Go action unit must facilitate a response and at the same time suppress its competitors.
When NoGo action units suppress a response, however, they do not need to offer an alternative. The brain stem and cortical alternatives are already there, ready to be selected by the arbitration system. We hypothesize that if a NoGo action unit does not provide an alternative response, then its striatopallidal fibers project only to the GPe. On the other hand, by sending an antitopographic collateral to the GPi, a NoGo action unit may provide an alternative in addition to suppressing a response. This pattern of connectivity is in accordance with experimental data showing that all MSNs project to the GPe and that some continue on to the GPi.
When several action units are active at the same time, the arbitration system arbitrates among their responses as it does for brain stem and cortical responses. A given action can even have both Go and NoGo representations, and the probability that it will actually be selected depends on the relative difference between its Go and NoGo activation levels (Frank, 2005). Figure 2A shows a cartoon of a typical action unit and its input projections. State components originating in sensory, associative, and motor areas of the cerebral cortex and thalamus are fed to the action unit. Bigger triangles belong to reinforced state components and smaller triangles to depressed components. Each striatal neuron has a unique set of reinforced and depressed states. The D1RN on the left side of Figure 2A, for example, has learned to associate the conjunction of two components of the state, namely components a and b, with action 3. The D1RN on the right side of Figure 2A, on the other hand, has learned to link a different situation, i.e., the conjunction of state components e and f, to the same action. The D2RN in Figure 2A, however, has learned to veto action 3 under the conjunction of components c and d of the state. The D2RN can shunt the activity of the action unit because D2RNs are more excitable than D1RNs (Gertler et al., 2008). Therefore, this mechanism effectively models a complex rule: if (a and b) or (e and f) but not (c and d), then take action 3. We postulate that each matrisome can record the complex state under which the pallidal neuron it is attached to should be inhibited. Therefore, the collateral connectivity pattern within different matrisomes may vary according to the Boolean logic expression of the state they are recording (Figure 2B).
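The rule encoded by the action unit of Figure 2A can be written out as a short Boolean sketch. The function names `conjunction` and `action_unit_selects` and the dictionary-based state are illustrative assumptions; only the rule itself (striatal AND, pallidal OR, D2RN veto) comes from the text.

```python
def conjunction(state, required):
    """A striatal neuron fires only if all of its reinforced components are present (AND)."""
    return all(state.get(component, False) for component in required)


def action_unit_selects(state, go_terms, nogo_terms):
    """Return True if the action unit selects its action for the given state.

    go_terms / nogo_terms: one set of state components per D1RN / D2RN.
    Any active D1RN is enough to silence the pallidal neuron (OR);
    any active D2RN vetoes the action (NOT).
    """
    go = any(conjunction(state, term) for term in go_terms)
    nogo = any(conjunction(state, term) for term in nogo_terms)
    return go and not nogo


# Action 3 in Figure 2A: (a and b) or (e and f), but not (c and d).
GO_TERMS = [{"a", "b"}, {"e", "f"}]
NOGO_TERMS = [{"c", "d"}]

print(action_unit_selects({"a": True, "b": True}, GO_TERMS, NOGO_TERMS))
print(action_unit_selects({"a": True, "b": True, "c": True, "d": True}, GO_TERMS, NOGO_TERMS))
```

The first call corresponds to a rewarded context and the second to the same context with the vetoing components c and d added.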
Figure 2C abstracts all D1RN projections to a given GPi neuron into a single axon to show that the D1RNs in an action unit not only project to the GPi neuron of the same action unit but also send collaterals to GPe neurons representing competing action(s), thus suppressing the alternative action that would otherwise be selected independently by the arbitration system. The projections of all D2RNs in the same action unit are likewise abstracted into one axon in Figure 2C to show their connection to the GPe.

Frontiers in Systems Neuroscience
www.frontiersin.org
D2RNs each containing 128 neurons of an action unit connected similarly to the action unit shown in Figure 2A. Three pre-wired brain stem responses are included in the simulation for escape, approach, and avoidance behaviors in response to aversive, appetitive, and obstacle stimuli, respectively.
We use conductance-based two-compartment integrate-and-fire neurons simulated in the Neural Simulation Tool NEST (Gewaltig and Diesmann, 2007). The cell properties and synaptic weights are tuned to match those reported in electrophysiological studies (Nakanishi et al., 1987; Gertler et al., 2008; Heida et al., 2008). Here we use some simulations to demonstrate how the proposed hypothesis can work.
In the first demonstration, the brain stem nuclei are activated as the animal observes different types of stimuli around it, but their activation is not transmitted to either of the BG systems. Figure 3A shows the neuronal activity in the brain stem nuclei (aggregated in one subplot) and the CM alongside the resulting trajectory of the animal in the field. The animal without BG averages the responses
the conjunction of this response and the constellation of stimuli forming a certain context is rewarded, then dopamine flow will reinforce synaptic inputs from currently active stimuli and depress others. This process will eventually lead to differential synaptic efficacies on the MSN similar to those shown in Figure 2A. Since the D1RNs in an action unit are mutually inhibitory, only one of them will have the opportunity to record a certain rewarding situation (such as a AND b), hence keeping the recording capacity of the rest for other rewarding situations (such as e AND f). If the animal fails to receive an expected reward, the D2RNs record the inhibiting context by potentiating the present state elements and depressing others.
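The learning sketch above (active inputs reinforced, the others depressed) can be illustrated with a simple weight update. The additive rule, the learning rate, the weight bounds, and the name `dopamine_update` are all assumptions made for illustration; the text does not specify a plasticity rule.

```python
def dopamine_update(weights, active, rate=0.1, w_min=0.0, w_max=1.0):
    """Return new synaptic weights after one rewarded trial.

    Inputs listed in `active` are potentiated and all others are depressed.
    The additive step and the clipping bounds are assumed, not taken from the model.
    """
    updated = {}
    for name, weight in weights.items():
        delta = rate if name in active else -rate
        updated[name] = min(w_max, max(w_min, weight + delta))
    return updated


# Before learning: weak, undifferentiated synapses on one D1RN.
weights = {component: 0.2 for component in "abcdef"}

# Repeated reward while components a and b are present.
for _ in range(5):
    weights = dopamine_update(weights, active={"a", "b"})

print(weights)
```

After a few rewarded trials the synapses from a and b dominate, which is the differential pattern of efficacies drawn as bigger and smaller triangles in Figure 2A.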
The cortical input to the STR comes from two major subtypes of corticostriatal neurons with different innervation patterns within the STR (Cowan and Wilson, 1994). The pyramidal tract (PT) neurons, which convey motor commands from the motor cortex to the brain stem and spinal cord, send collaterals to the STR (Donoghue and Kitai, 1981). These neurons focally target a few neurons in the STR. The intratelencephalic tract (IT) neurons, on the other hand, which may carry sensory and associative information via their dense corticocortical arborization, synapse on many striatal neurons. This matches our hypothesis very well. A certain PT neuron may innervate the D1RNs in a single action unit. The sensory and associative information, however, is not specific to a single action unit, and many action units may use different parts of the same sensory or associative data.
The immense learning capacity of the STR has biased the classical models of the BG functional organization toward an STR-centered view. The STN has traditionally been interpreted as a supplementary structure assisting the STR in delivering its message to the GPi. Our hypothesis, however, suggests a relatively independent role for the STN as the central nucleus of the arbitration system. We suggest that the classical models of the BG are actually models of the extension system. The traditional Go-NoGo model, recently supported experimentally (Bateup et al., 2010; Kravitz et al., 2010), for example, is embedded (with some refinement, as discussed below) in our hypothesis, but we interpret it exclusively as a model of the extension system. Furthermore, the classical models of the BG (here assumed to be models of the extension system) tend to view the neurons in the BG nuclei as relay neurons. We try to refine this view by assigning different functional roles to STR and GPi neurons, each being capable of serving a certain Boolean logical operation. Thus, the current hypothesis can be seen both as refining the models of the extension system to account for the formation of Boolean-logic-based states and as broadening the BG models beyond the functional organization of the extension system by introducing the independently functional arbitration system.
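Another feature of the striatopallidal structure that is ignored in canonical models of the BG is the D1RN-GPe (and D2RN-GPi) connections. Because our hypothesis divides the BG into two functional subsystems, these connections acquire a functional role: they allow an action unit to suppress competing responses, or to offer an alternative one, in addition to acting on its own response.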
One clear prediction following from the independence of the arbitration system from the extension system is that striatal lesioning will eliminate only the learned responses and will not interfere with the selection of the strongest brain stem or cortical response in the arbitration system. Therefore, according to our hypothesis, an animal with a striatal lesion must still be capable of executing many fundamental actions. Although some authors (Denny-Brown,
it receives from different preliminary decision centers and is therefore deprived of effective escapes and precise targeting behavior; it is often trapped in conflicting situations with multiple stimuli, such as those shown in the figure.
In the second demonstration, we connect the pre-wired brain stem responses to the arbitration system via the CM. Exploiting the winner-take-all property of the arbitration system, the animal successfully suppresses all but one of the responses at a time, resulting in an effective escape followed by precise targeting (Figure 3B). The activities in all nuclei of the arbitration system show domination of a single action at any given time and a proper soft switch between actions when the relative strength of the second response takes over (Figure 3B).
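The contrast between the first two demonstrations can be sketched as averaging versus winner-take-all selection over the 128 steering directions. The strength-weighted vector average used here for the animal without BG, the function names, and the example strengths are assumptions; the text states only that the responses are averaged.

```python
import math

N_DIRECTIONS = 128  # the model's 128 steering directions


def averaged_direction(responses):
    """Strength-weighted vector average of the proposed directions (no BG).

    `responses` is a list of (direction_index, strength) pairs.
    """
    x = sum(s * math.cos(2 * math.pi * d / N_DIRECTIONS) for d, s in responses)
    y = sum(s * math.sin(2 * math.pi * d / N_DIRECTIONS) for d, s in responses)
    angle = math.atan2(y, x) % (2 * math.pi)
    return round(angle / (2 * math.pi) * N_DIRECTIONS) % N_DIRECTIONS


def arbitrated_direction(responses):
    """Winner-take-all: follow only the strongest proposed direction."""
    return max(responses, key=lambda response: response[1])[0]


# Assumed example: escape proposes direction 0, approach proposes direction 32.
responses = [(0, 1.0), (32, 0.9)]
print(averaged_direction(responses), arbitrated_direction(responses))
```

With these assumed values the averaged heading falls between the two proposals, so it neither escapes effectively nor targets precisely, whereas the arbitrated heading commits to the stronger response.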
In the third demonstration, the capability of storing disjunction-of-conjunctions patterns in an action unit is shown. The animal is assumed to have learned that either the combination of landmarks a and b or the combination of landmarks e and f will transform the red stimulus (originally aversive) into an appetitive one. Without a proper combination of landmark stimuli (a and b together, or e and f together), the membrane potential of the MSNs is not pushed to the vicinity of threshold. When a proper combination of landmarks is available, however, it activates one of the D1RNs in the action unit (matrisome) responsible for approach behavior. Activation of the D1RNs inhibits the GPe neurons representing the escape response, hence suppressing the innate tendency of the animal to escape from the aversive stimulus. The same striatal neurons also inhibit the GPi neurons representing the approach response, thus lifting inhibition from the corresponding PPN/MLR neurons. The PPN/MLR neurons fire by virtue of their intrinsic spontaneous activity, enforcing the learned approach response (Figure 3C). The same GPi neurons disinhibit the CM neurons, which in turn activate the corresponding STN neurons, thus facilitating a new arbitrated winner.
In the fourth demonstration, the capability of D2RN-D1RN inhibitory collaterals to negate a certain situation (Boolean NOT) is shown. Landmarks c and d, whose combination is assumed to restore the original nature of the red stimulus (as an aversive stimulus), accompany landmarks a and b in this demonstration. Since the learned action induced by stimulation of the D1RN in the previous demonstration is to be suppressed here, the D2RN stimulated by components c and d of the state sends its axon collaterals not only to the GPe but also to the D1RN somata, hence inhibiting them and removing the influence of the extension system on the GPi (Figure 3D). It is worth noting that although there is enough contextual support to activate both MSNs, the D2RN is more excitable than the D1RNs, so it fires more easily and wins the mutual competition.
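The threshold behavior in the third and fourth demonstrations can be caricatured with two threshold units. This is a rate-free sketch, not the conductance-based neuron model used in the simulations; the weights, the thresholds, and the rule that the unit with the larger margin above threshold wins are assumptions chosen only to reflect the statement that the D2RN is more excitable.

```python
def drive(weights, present):
    """Summed synaptic drive from the state components that are present."""
    return sum(w for component, w in weights.items() if component in present)


def compete(present, d1_weights, d2_weights, d1_threshold=1.5, d2_threshold=1.2):
    """Return which MSN dominates: "D1RN", "D2RN", or "silent".

    The lower D2RN threshold is an assumed stand-in for its higher excitability.
    """
    d1_margin = drive(d1_weights, present) - d1_threshold
    d2_margin = drive(d2_weights, present) - d2_threshold
    if d1_margin < 0 and d2_margin < 0:
        return "silent"
    return "D2RN" if d2_margin >= d1_margin else "D1RN"


D1_WEIGHTS = {"a": 1.0, "b": 1.0}  # learned conjunction a AND b
D2_WEIGHTS = {"c": 1.0, "d": 1.0}  # learned veto c AND d

for landmarks in [{"a"}, {"a", "b"}, {"a", "b", "c", "d"}]:
    print(sorted(landmarks), compete(landmarks, D1_WEIGHTS, D2_WEIGHTS))
```

A single landmark leaves both units below threshold, the pair a and b recruits the D1RN, and adding c and d lets the more excitable D2RN win despite equal contextual support.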

Discussion
Our hypothesis intentionally avoids addressing some important features of BG anatomy and physiology, most notably the microcircuitry, neuromodulation, and learning, as a compromise in favor of a comprehensive description of the input-output structure and functional connectivity of the system. However, a brief sketch of how learning protocols could shape the extension system networks is given here: before learning, all MSNs receive random sensory, associative, and motor inputs on weak synapses that are not strong enough to depolarize the cells to fire. The threshold is reached when the CM neuron representing the action just taken (designated as r in Figure 2A) depolarizes the MSN further via its dendritic synapses. If
etc. The early development of the STN in comparison to the STR, however, may offer a clue about the evolutionary age of the two systems.
Although our hypothesis initially excludes the cortex as a target for the BG, in order to avoid the extra complications arising from cortical activity and to analyze BG activity in isolation, the final hypothesis can be used as a generic hypothesis of BG functionality with subcortical or cortical output.
Our hypothesis suggests that the topographic and antitopographic projections are hard-wired in the structure of the arbitration system, thus creating synergies between actions originating in different sensory-motor mechanisms. Hard-wired synergies are also observed in the spinal cord and the brain stem. The major difference between spinal cord and brain stem synergies and those in the arbitration system is possibly that the lower-level synergies are often constructed within a single mechanism, while the arbitration system wirings serve multi-system synergies. Moreover, the extension system has the capacity to create synergistic connection patterns. Here, in the absence of more precise anatomical data, the organization of projections in both the arbitration and extension systems has necessarily been constructed on the basis of speculation so as to generate the desired outcome. For example, antitopographic projections from D1RNs to the GPe followed by topographic projections from the GPe to the GPi are functionally equivalent to topographic projections from D1RNs to the GPe followed by antitopographic projections from the GPe to the GPi, and the true connectivity pattern cannot be revealed by computational or hypothetical tools. Besides functional-level tools, precise anatomical experiments are required to reveal the true topology and develop the hypothesis into a comprehensive theory.
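The equivalence of the two wiring orders can be checked in a toy setting. Treating each projection as a 0/1 connectivity matrix (topographic as the identity, antitopographic as all-but-self) and composing them by matrix product is a simplifying assumption; the helper names are hypothetical.

```python
def topographic(n):
    """Each source neuron projects only to the target with the same index."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def antitopographic(n):
    """Each source neuron projects to every target except the same index."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]


def compose(first, second):
    """Net source-to-target connectivity of `first` followed by `second`."""
    n = len(first)
    return [
        [sum(first[i][k] * second[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


n = 4  # a small number of action channels, chosen for readability
anti_then_topo = compose(antitopographic(n), topographic(n))
topo_then_anti = compose(topographic(n), antitopographic(n))
print(anti_then_topo == topo_then_anti)
```

Both orders give the same net D1RN-to-GPi pattern in this abstraction, which is why functional simulations alone cannot tell the two anatomical arrangements apart.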

Acknowledgments
The authors wish to thank Professor Sten Grillner for his valuable comments on the manuscript. This work was supported by grants from the EU LAMPETRA project, FP7 ICT-2007.8.3., Swedish Research Council, Parkinson's Foundations, and EU SELECT AND ACT project, Health 2007/2.2. 1-2.
1962) have reported that decorticated and de-striated animals are actually successful in following moving objects, other authors (White, 2009) suggest a spectrum of deficits in decision making, including turning behavior and recalcitrance, after striatal lesioning. Whole striatal lesioning definitely destroys the fine balance of neuromodulators in the BG, and such an imbalance can have many unpredictable consequences. We predict that more focused lesioning of the STR, restricted to a set of competing action units, may leave the arbitration system functioning properly.
Although the independence of the arbitration system in choosing the strongest response it receives is achieved via the antitopographic projection from the STN to the GPe and the topographic projections from the GPe to the STN in the model demonstrated here, other connectivity patterns may generate the same functional effect. For example, a one-to-all projection from the STN to the GPe, in parallel with a focused reciprocal excitatory STN-PPN connection, will also make a single winning action stand out. In fact, a computational study (Terman et al., 2002) has shown that random connectivity between the STN and GPe will lead to just a few STN and GPe neurons being connected reciprocally.
The central role of the STN in our model can give novel, testable insights into the mechanisms of deep brain stimulation in the treatment of Parkinsonian patients. Electrical stimulation of the STN can activate the excitatory pathway of the BG, composed of the CM, the STN, and the PPN, and ultimately the brain stem nuclei and the spinal cord. In fact, the components of the BG excitatory pathway, i.e., the CM, the STN, and the PPN, are all targets of deep brain stimulation.
The hierarchical organization of our hypothesis suggests that the arbitration system serves more fundamental functions and is probably an evolutionarily older structure. However, recent studies show that all components of the BG are present even in the most primitive vertebrates (Stephensson-Jones et al., in preparation). In fact, no living animal has been reported to possess the BG without the STR. Both systems have actually evolved in size, neuronal types, number of neurons, neurotransmitter diversity