Chemosensory Learning in the Cortex

Taste is a primary reinforcer. Olfactory–taste and visual–taste association learning takes place in the primate, including human, orbitofrontal cortex to build representations of flavor. Rapid reversal of this learning can occur using a rule-based learning system that can be reset when an expected taste or flavor reward is not obtained, that is, by negative reward prediction error, to which a population of neurons in the orbitofrontal cortex responds. The representation in the orbitofrontal cortex, but not in the primary taste or olfactory cortex, is of the reward value of the visual/olfactory/taste input, as shown by devaluation experiments in which food is fed to satiety, and by correlations of the activations with subjective pleasantness ratings in humans. Sensory-specific satiety for taste, olfactory, visual, and oral somatosensory inputs, produced by feeding a particular food to satiety, is proposed to be implemented by medium-term synaptic adaptation in the orbitofrontal cortex. Cognitive factors, including word-level descriptions, modulate the representation of the reward value of food in the orbitofrontal cortex, and this effect is proposed to be learned by associative modification of top-down synapses onto neurons activated by bottom-up taste and olfactory inputs when both are active in the orbitofrontal cortex. A similar associative synaptic learning process is proposed to be part of the mechanism for the top-down attentional control to the reward value vs. the sensory properties such as intensity of taste and olfactory inputs in the orbitofrontal cortex, as part of a biased activation theory of selective attention.


INTRODUCTION
The aim of this paper is to describe some of the principles of chemosensory learning in the cerebral cortex. The focus is on the mechanisms that are present in primates including humans. One of the reasons for this focus is that the taste and related pathways in non-human primates are similar to those in humans (Norgren, 1984; Rolls and Scott, 2003; Rolls, 2005; Small and Scott, 2009), and thus evidence from these sources is particularly relevant to understanding taste and olfactory processing in humans. For example, in primates the taste pathways project from the nucleus of the solitary tract directly to the taste thalamus (Beckstead et al., 1980) and thus to the primary taste cortex in the anterior insula (Pritchard et al., 1986). There is no known pontine taste area in primates (Norgren, 1984; Rolls and Scott, 2003; Rolls, 2005; Small and Scott, 2009), whereas in rodents there is a pontine taste area that then sends onward connections to a number of subcortical areas including the hypothalamus and amygdala (Rolls and Scott, 2003). In contrast, in primates the taste processing is directed straight to the primary taste cortex (from the nucleus of the solitary tract via the thalamus), which then has onward connections to the cortical taste hierarchy of the orbitofrontal cortex, which contains the secondary taste cortex (defined by its direct anatomical projections from the primary taste cortex; Baylis et al., 1995), which in turn projects to the anterior cingulate cortex, which is thus a tertiary taste cortical area (Rolls, 2008a) (Figure 1). The primary taste cortex in primates is the source of connections to subcortical structures such as the amygdala.
It has been suggested that this cortically dominated taste connectivity in primates including humans is related to the great development of cortical processing in primates including humans, so that the unifying design is to bring all sensory modalities to the cortex for processing, and then, after one or several mainly unimodal cortical areas for computations, to bring the different sensory pathways together, with one key convergence area being the orbitofrontal cortex, as shown in Figure 1 (Rolls, 2005, 2008b). Another key reason for focusing on taste and related processing in primates including humans is that the principles of operation with respect to taste reward, olfactory reward, and the control of appetite appear to be rather different from those in rodents. For example, in macaques there is no reduction of the neuronal responses to taste stimuli in the primary taste cortex in the anterior insula and adjoining frontal opercular cortex as hunger is reduced to zero by feeding to normal, physiological, self-determined satiety. (The same holds for the nucleus of the solitary tract; Yaxley et al., 1985.) Thus taste reward (whether one works to obtain a taste, i.e., has an appetite for a taste) is not represented in the primary taste cortex, or at any earlier stage of taste processing, including the taste receptors. Instead, neuronal activity in the macaque primary taste cortex reflects the concentration of a tastant, and what the taste is (sweet, salt, bitter, sour, umami), as shown by information theoretic and related analyses of the neuronal activity (Baylis and Rolls, 1991; Rolls et al., 1996a, 2010a; Kadohisa et al., 2005; Rolls and Treves, 2011).
The same is the case in humans, in that functional magnetic resonance neuroimaging (fMRI) investigations show that the subjective correlate of activations in the primary taste cortex is the intensity of the taste, not its pleasantness (Grabenhorst et al., 2008a) [which is the subjective correlate of reward value (Rolls, 2005; Grabenhorst and Rolls, 2011)]. In contrast, in rodents there is evidence that satiety stimuli such as food in the gut can decrease neuronal responses to taste stimuli even in the nucleus of the solitary tract (Rolls and Scott, 2003). [It is worth noting that these studies in rodents often do not use self-determined, that is physiological, levels of satiety, but instead use set quantities of satiety stimuli (and the studies may also be performed under anesthesia). In those cases effects may be being investigated that are outside the physiological range. In addition, it is found that the pleasantness of food reliably goes to zero when humans eat to self-determined satiety (Rolls et al., 1981), and, correspondingly, in macaques, neurons that respond to food reward simply stop responding to the food when self-determined satiety is reached (Burton et al., 1976; Rolls et al., 1986, 1989; Critchley and Rolls, 1996a).] For these reasons, investigations of the neurophysiology of chemosensory processing in macaques may be particularly relevant to studying the fundamental principles of the neural processing, including learning, in the chemosensory system that occur in humans. These studies are complemented in the following by fMRI studies in humans, which however cannot reveal the details of the neural mechanisms, which can only be understood at the neuronal level (Rolls, 2008b; Rolls and Treves, 2011). I highlight key points about this chemosensory processing and learning in each of the following sections.

TASTE IS A PRIMARY REINFORCER, AND MOST OLFACTORY STIMULI ARE NOT
A primary reinforcing stimulus is a stimulus that is rewarding or punishing without learning. Taste is a primary reinforcer, in that for example the first time that a sweet taste is encountered it will be accepted, and the first time that a bitter taste is encountered it will be rejected (Rolls, 2005). The mechanism is that genes specify taste receptors, and these must be connected by labeled lines to parts of the brain where they are then represented in terms of their reward value, which reflects the gene-specified taste receptors from which they receive inputs (Rolls, 2005). The first stage in the primate taste system at which this occurs is in the secondary taste cortex in the orbitofrontal cortex (see above and Rolls, 2005). This probably applies to all five tastes: sweet, salt, bitter, sour, and umami. Most olfactory stimuli are not primary reinforcers. Their reward or punishment value is learned by association with a primary reinforcer such as a taste, by mechanisms that will be described below. Exceptions to the general principle are for example pheromones that may attract other individuals (including the odors involved in major histocompatibility gene effects), probably some odors that promote disgust produced for example by rotting food, possibly some odors associated with food such as maltol, and some odors that may signal danger such as burning-related odors, though here the effects may be at least in part trigeminal (unpleasant somatosensory sensation) or learned by association with trigeminal stimuli (Rolls, 2005).
This summary (with evidence provided in the literature, e.g., Rolls, 2005, 2012) provides a background for some of the principles described in the next few sections.

TASTE VALUE CAN BE ALTERED BY ASSOCIATIVE LEARNING
Although taste is a primary, gene-specified, reinforcer, its value can be relearned by association with a strong primary reinforcer, such as energy intake in the processes known as conditioned appetite and conditioned satiety (Booth, 1985), and such as sickness (nausea). The classic example is taste aversion learning, in which for example a salty taste of lithium chloride is avoided after it has been ingested and sickness has followed. Most of this research, described elsewhere (Scott, 2011), has been performed in rodents, and appears to involve changes to neural encoding as early as the nucleus of the solitary tract which however depends on mechanisms in the gustatory cortex for the learning. This is an interesting and unusual example of associative learning in that there can be a long delay of up to several hours between the taste (the conditioned stimulus) and the sickness (the unconditioned stimulus). This is possible in the taste system, where foods are eaten at periods often separated by long intervals, so that there is no confusion about which taste it was that caused the sickness. This is not possible with for example visual-to-sickness learning, for there is usually a continuing succession of visual stimuli before the sickness occurs, and there is no easy way to relate the particular visual stimulus that caused the sickness with the sickness. Indeed, rodents show neophobia (fear of new foods), and implement a strategy of selecting one of a set of new foods to eat, so that sickness, if it follows, can be associated with that particular food. If all the new foods were eaten early on, there would be no way to determine which one caused the sickness. This learning mechanism depends on the amygdala in rats (Rolls and Rolls, 1973).

OLFACTORY-TO-TASTE ASSOCIATION LEARNING
This is an example of stimulus-reinforcer association learning. In macaques, neurons in the primary taste cortex in the anterior insula are not activated by olfactory stimuli. The primary taste cortex is not therefore the site of olfactory-to-taste association learning. (We do not typically find activations in the human primary taste cortex in the anterior, taste, insula by odors. However, if some activations are reported in some studies, they may reflect the effects of cortico-cortical back projections from multimodal areas such as the orbitofrontal cortex that are being used for memory recall (Rolls, 2008b), for example of a taste associated with an odor. Such memory recall and related top-down attentional effects must be relatively weak so as not to dominate bottom-up sensory processing, as analyzed quantitatively elsewhere; Renart et al., 1999b; Deco and Rolls, 2005a,b; Rolls, 2008b.) Taste and olfactory pathways first come together anatomically in the primate brain in the orbitofrontal cortex (see Figure 1; Carmichael et al., 1994; Price, 2006), where bimodal neurons are found that respond to both odor and taste stimuli (Rolls and Baylis, 1994; Critchley and Rolls, 1996b). These bimodal neurons reflect olfactory-to-taste association learning (olfactory discrimination learning) in which one odor is paired with one taste (e.g., glucose), and a second odor with a different taste (e.g., salt, which is mildly aversive). This is shown to be a learned effect by the fact that when the olfactory-to-taste pairing is reversed, these neurons reverse the olfactory stimuli to which they respond (see Figure 2; Rolls et al., 1996b). This type of associative learning is how flavors are formed, where flavors are defined by olfactory-taste combinations.
In the case of umami, such olfactory-to-taste association learning appears to be key to the pleasantness of umami (Rolls, 2009). Monosodium glutamate as a taste is not very pleasant, but when combined with a savory pleasant odor (such as vegetable), can become very pleasant (McCabe and Rolls, 2007). (The odor must be consonant: in these experiments the effect of combining rum odor with monosodium glutamate was to produce a flavor that was quite unpleasant.) The combination of monosodium glutamate and vegetable odor produced supralinear activations (greater than the sum of those produced by the taste and odor separately) in the part of the brain that represents the pleasantness of odors and taste, the orbitofrontal cortex (McCabe and Rolls, 2007). That is the explanation of how umami can make a food pleasant: by a combination of monosodium glutamate and a consonant odor. That will have been learned in a lifetime of experience of eating foods rich in glutamate and/or inosine monophosphate such as tomatoes, mushrooms, meat, and human mother's milk (Rolls, 2009).
In humans, olfactory-taste convergence occurs in the orbitofrontal cortex and in the region that is intermediate between it and the primary taste and olfactory cortices, the agranular insula, at the far anterior end of the insula in what is topologically related to the orbitofrontal cortex (De Araujo et al., 2003).
The reversal of olfactory-to-taste association learning is a relatively slow process, which often takes 40-60 trials to occur. This is consistent with the utility of maintaining neurons that represent particular flavors because of previously learned combinations of odorants and tastants.

Frontiers in Systems Neuroscience www.frontiersin.org

FIGURE 2 | Orbitofrontal cortex: olfactory-to-taste association reversal. (A) The activity of a single orbitofrontal cortex olfactory neuron during the performance of a two-odor olfactory discrimination task and its reversal is shown. Each point represents the mean poststimulus activity of the neuron in a 500-ms period on approximately 10 trials of the different odorants. The SE of these responses are shown. The odorants were amyl acetate (closed circle; initially S−, punished with a taste of salt) and cineole (open circle; initially S+, rewarded with fruit juice flavor). After 80 trials of the task the reward associations of the stimuli were reversed. This neuron reversed its responses to the odorants following the task reversal. (B) The behavioral responses of the monkey during the performance of the olfactory discrimination task. The number of lick responses to each odorant is plotted as a percentage of the number of trials to that odorant in a block of 20 trials of the task. (After Rolls et al., 1996b).

VISUAL-TO-TASTE ASSOCIATION LEARNING IN THE ORBITOFRONTAL CORTEX
Neurons with visual responses to the sight of food are found in the lateral hypothalamus. These neurons probably receive their inputs from neurons in the orbitofrontal cortex, where we also discovered neurons that respond to the sight of food (Thorpe et al., 1983), and to taste (Thorpe et al., 1983; Rolls et al., 1989, 1990; Critchley and Rolls, 1996b). The orbitofrontal cortex neurons that respond to the sight of food do so by visual-to-taste association learning, as shown by the fact that they reverse their responses when the visual-to-taste contingency is reversed in a visual discrimination task (Thorpe et al., 1983; Rolls et al., 1996b). The mechanism probably involves in part pattern association learning, and its decrement by synaptic long-term depression when the contingency is reversed (Rolls, 2005, 2008b). But association learning is not all that there is to the learning, for the reversal can take place in one trial (see Figure 3). In particular, in the Go-NoGo visual discrimination task on a trial on which the reward contingencies are reversed, the following occurs. When one stimulus is shown which indicates that taste reward (glucose or fruit juice) should be obtained but instead saline is delivered, on the very next trial the monkey licks to the other stimulus, which has been recently associated with saline, and obtains reward (Thorpe et al., 1983; see Figure 3). This is termed serial reversal learning set, and can occur after repeated experience with reversal has been obtained. The effect cannot therefore involve visual-taste association learning, but in this case involves the switch of a rule (about which of the two visual stimuli is currently associated with reward).
This type of reversal trial produces remarkable activity in a population of orbitofrontal cortex neurons that respond when the expected reward is not obtained (Thorpe et al., 1983; Figure 3). They thus respond to an expectation-outcome mismatch that is negative. We thus term them error neurons (Thorpe et al., 1983), or negative reward prediction error neurons (Rolls, 2008b, 2011b; Grabenhorst and Rolls, 2011). Consistent effects are found in humans.
The rapid reversal requires a rule which indicates which of the visual stimuli is currently associated with reward. We hypothesize that the negative reward prediction error neurons, which maintain their firing for 8-10 s after the non-reward event (Thorpe et al., 1983; see Figure 3) in what is likely to be an attractor state (Rolls, 2008b), are important in the reversal. We believe that they reset, by inhibition through inhibitory interneurons, short-term memory rule-encoding attractor networks in the same brain region. After the inhibition, the attractor that emerges from the noisy (Poisson) firing of the neurons is the attractor for the opposite rule, because it is showing less synaptic or neuronal adaptation than the neurons in the network that represent the recently active rule (Deco and Rolls, 2005c).
An integrate-and-fire computational model which illustrates how the rapid reversal learning could be implemented is shown in Figure 4 (Deco and Rolls, 2005c). In the lower module, stimuli are mapped from sensory neurons (level 1, at the bottom), through an intermediate layer of conditional object-reward combination neurons with rule-dependent activity, to layer 3 which contains reward/punishment neurons. The mapping through the intermediate layer can be biased by the rule module inputs to perform a direct or reversed mapping. The activity in the rule module can be reversed by the error signal which occurs when an expected reward is not obtained. The reversal occurs because the attractor state in the rule module is shut down by inhibition arising from the effects of the error signal, and restarts in the opposite attractor state because of partial synaptic or neuronal adaptation of the previously active rule neurons.
The operation of this system is facilitated by the conditional reward neurons, which respond to a reward stimulus only when one rule applies. These neurons for example respond to a green stimulus when it is associated with taste reward, but not to a blue stimulus when it is associated with taste reward (Thorpe et al., 1983; Rolls, 2008b; Figure 5). The importance of these conditional reward neurons is that they can be biased on (or off) by the rule neurons. For example, if a green stimulus is seen, and the "green is reward" rule attractor is firing and biasing the "conditional green is reward" neurons, then the "conditional green is reward" neurons will win the competition and be activated, and in turn activate the "go" or "reward" neurons at the output stage (Figure 4). A fuller description is provided elsewhere (Deco and Rolls, 2005c; Rolls, 2008b).
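The rule-reset idea described above can be illustrated with a deliberately simplified sketch. This is not the integrate-and-fire model of Deco and Rolls (2005c): it is a toy, deterministic abstraction in which each rule population carries a single adaptation variable, and all names and parameter values (`RuleModule`, `adapt_gain`, `adapt_decay`, the stimulus labels) are assumptions made for illustration only.

```python
class RuleModule:
    """Toy stand-in for the rule attractor network (not a spiking model)."""

    def __init__(self, rules=("green_is_reward", "blue_is_reward"),
                 adapt_gain=0.2, adapt_decay=0.5):
        self.rules = list(rules)
        self.active = self.rules[0]          # rule attractor currently firing
        # One adaptation level per rule population (hypothetical parameters).
        self.adaptation = {r: 0.0 for r in self.rules}
        self.adapt_gain = adapt_gain
        self.adapt_decay = adapt_decay

    def step(self):
        """One trial: adaptation decays everywhere and builds up in the active rule."""
        for r in self.rules:
            self.adaptation[r] *= self.adapt_decay
        self.adaptation[self.active] += self.adapt_gain

    def error_reset(self):
        """Error signal quenches the attractor; the least-adapted rule restarts."""
        self.active = min(self.rules, key=lambda r: self.adaptation[r])


def choose(rule, stimulus):
    """Conditional reward neurons: 'go' only if the active rule matches the stimulus."""
    return "go" if rule == f"{stimulus}_is_reward" else "nogo"


def run(trials):
    """trials: list of (stimulus_shown, currently_rewarded_stimulus) pairs."""
    module = RuleModule()
    actions = []
    for stimulus, rewarded in trials:
        action = choose(module.active, stimulus)
        actions.append(action)
        module.step()
        # Negative reward prediction error: licked, but reward was not delivered.
        if action == "go" and stimulus != rewarded:
            module.error_reset()
    return actions
```

For example, calling `run` on a sequence in which green is rewarded for three trials and the contingency is then reversed produces a single error trial followed by an immediate switch to the blue stimulus, which is the one-trial reversal pattern described in the text. The switch depends only on which rule population is less adapted, not on relearning any stimulus-reward association.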
It is significant in terms of brain design that in the orbitofrontal cortex where these multimodal olfactory-to-taste and visual-to-taste convergences and learning occur, it is the reward value of the olfactory/visual/taste combination that is represented, as shown by experiments in which the neuronal response to the particular food eaten decreases to zero during feeding to satiety (Critchley and Rolls, 1996a). This orbitofrontal cortex association learning system is very important in behavior, for damage to it in macaques (Butter, 1969; Iversen and Mishkin, 1970) and humans (Hornak et al., 2004) impairs reversal learning and may be very important in the behavioral changes that follow damage to the human orbitofrontal cortex (Rolls, 2005, 2008b). The responses of amygdala neurons are much less specifically tuned to respond to the sight of particular foods, and reversal of the responses of amygdala neurons is much more difficult to obtain, and is much slower than the one-trial reversal found in the orbitofrontal cortex (Sanghera et al., 1979; Rolls, 2005; Wilson and Rolls, 2005). The fact that if primate amygdala neurons reverse they do so slowly was confirmed in a trace conditioning procedure [in which there is a delay between the end of the conditioned stimulus (a visual image) and the unconditioned stimulus (an air-puff to the eye, or a liquid)] in which if neurons reversed it took 30-60 trials (Paton et al., 2006). The evidence thus indicates that primate amygdala neurons do not alter their activity as flexibly and rapidly in visual-reinforcer reversal learning as do orbitofrontal cortex neurons (Rolls, 2008b). The rodent amygdala is involved in the neophobia to new foods, which gradually becomes replaced by investigation and acceptance over time (Rolls and Rolls, 1973).

FIGURE 3 | Visual discrimination reversal for sweet taste reward vs. the aversive taste of salt (NaCl). Negative reward prediction error neuron: responses of an orbitofrontal cortex neuron that responded only when the monkey licked to a visual stimulus during reversal, expecting to obtain fruit juice reward, but actually obtained the taste of aversive saline because it was the first trial of reversal. Each single dot represents an action potential; each vertically arranged double dot represents a lick response. The visual stimulus was shown at time 0 for 1 s (labeled "shutter open"). The neuron did not respond on most reward (R) or saline (S) trials, but did respond on the trials marked x, which were the first trials after a reversal of the visual discrimination on which the monkey licked to obtain reward, but actually obtained saline because the task had been reversed. It is notable that after an expected reward was not obtained due to a reversal contingency being applied, on the very next trial the macaque selected the previously non-rewarded stimulus. This shows that rapid reversal can be performed by a non-associative process, and must be rule-based. (After Thorpe et al., 1983).

LEARNING OF NEW OLFACTORY-TASTE AND ORAL TEXTURE-TASTE REPRESENTATIONS BY COMPETITIVE LEARNING IN THE ORBITOFRONTAL CORTEX
Each orbitofrontal cortex neuron responds to a different combination of taste and oral texture stimuli. The taste stimuli that may be combined in this way include sweet, salt, bitter, sour, and umami; and the oral somatosensory stimuli include viscosity, fat texture, gritty texture, capsaicin, fatty acids such as linoleic and lauric acid, and oral temperature (Verhagen et al., 2003; Kadohisa et al., 2004, 2005). This encoding of information by different neurons is to some extent independent, which enables the total information to increase approximately linearly with the number of neurons involved in the population, a very powerful neural code (Rolls, 2008b; Rolls et al., 2010a; Rolls and Treves, 2011). Part of the basis for this representation may be the random sampling by each neuron of the different inputs being received in a cortical area (Rolls, 2008b). That process is likely to be facilitated by competitive learning, which, because of the inhibition implemented by the cortical inhibitory interneurons, helps the neurons to learn to respond to different combinations of their inputs (Rolls, 2008b).
The same two processes may contribute to the non-linear separation of the olfactory and taste inputs to neurons in the orbitofrontal cortex. Evidence for such non-linear processing is that after feeding to satiety with fruit juice, a neuron may no longer respond to fruit juice, but does still respond to one of the components, sweet taste (see Figure 6, which illustrates that the responses can sometimes become a little larger to other stimuli after one food has been fed to satiety). Indeed, the fact that neurons can respond in this specific way to combinations of their inputs, so that a neuron may respond optimally to a particular flavor, is an important part of the mechanism of sensory-specific satiety (Rolls, 2005, 2008b).
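As a rough illustration of how competitive learning could sharpen neurons onto different input combinations, the following is a minimal winner-take-all sketch. It is a textbook-style competitive network, not a model taken from the cited work; the feature ordering, the fixed initial weights (standing in for each neuron's random sampling of inputs), the learning rate, and the function names are all assumptions.

```python
import math


def normalize(w):
    """Scale a weight vector to unit length (keeps competition fair)."""
    norm = math.sqrt(sum(x * x for x in w))
    return [x / norm for x in w]


def winner(weights, x):
    """Index of the neuron with the largest dot-product activation.

    Hard winner-take-all stands in for inhibition via interneurons.
    """
    acts = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
    return max(range(len(acts)), key=lambda i: acts[i])


def train(weights, patterns, epochs=20, lr=0.3):
    """Hebbian update for the winning neuron only, followed by renormalization."""
    weights = [normalize(w) for w in weights]
    for _ in range(epochs):
        for x in patterns:
            k = winner(weights, x)
            updated = [wi + lr * xi for wi, xi in zip(weights[k], x)]
            weights[k] = normalize(updated)
    return weights


# Hypothetical input features: [sweet, salt, viscous, gritty]
sweet_viscous = [1.0, 0.0, 1.0, 0.0]
salt_gritty = [0.0, 1.0, 0.0, 1.0]
# Slightly different starting weights stand in for random input sampling.
initial = [[0.6, 0.4, 0.6, 0.4], [0.4, 0.6, 0.4, 0.6]]
trained = train(initial, [sweet_viscous, salt_gritty])
```

After training in this sketch, each neuron's weight vector has moved toward one input combination and away from the other, so the two neurons come to respond to different taste-texture combinations even though both started out broadly tuned.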

LEARNING AS A MECHANISM FOR SENSORY-SPECIFIC SATIETY
Sensory-specific satiety, discovered during lateral hypothalamic neuronal recordings (Rolls, 1981; Rolls et al., 1986), is the process by which the reward value, and its correspondent, human subjective pleasantness, of the flavor of a particular food decreases to zero after the food has been eaten to satiety, but remains relatively high for other foods not eaten in the meal (Rolls et al., 1982, 1983a,b, 1984; Rolls, 2005). Sensory-specific satiety is reflected in the responses of orbitofrontal cortex neurons that respond to the taste, odor, sight, and/or oral texture of foods (Critchley and Rolls, 1996a; see example in Figure 6), and is also reflected in activations in the human orbitofrontal cortex with fMRI neuroimaging. The taste neurons in this population are found throughout a wide medial as well as lateral extent of the orbitofrontal cortex (Rolls et al., 1990; Rolls and Baylis, 1994; Critchley and Rolls, 1996c; Verhagen et al., 2003; Kadohisa et al., 2004, 2005; Rolls, 2008a), as has been confirmed (Pritchard et al., 2007, 2008; Rolls, 2008a). The orbitofrontal cortex projects to the lateral hypothalamus, and provides a route for hypothalamic neurons to also show sensory-specific satiety effects (Rolls, 1981; Rolls et al., 1986). Sensory-specific satiety effects are not found in the macaque primary taste cortex (Yaxley et al., 1988) or inferior temporal visual cortex (Rolls et al., 1977), and the mechanism for sensory-specific satiety is thus implemented in the orbitofrontal cortex, which receives direct inputs from both these structures (Rolls, 2005, 2008b).

FIGURE 5 | A conditional reward neuron recorded in the orbitofrontal cortex which responded only to the Green stimulus when it was associated with reward (G+), and not to the Blue stimulus when it was associated with reward (B+), or to either stimulus when they were associated with a punisher, the taste of salt (G− and B−). The mean firing rate ± SEM is shown. (After Thorpe et al., 1983).
The mechanism of sensory-specific satiety that is proposed is a simple type of learning, in which the neurons in the orbitofrontal cortex that respond to relatively specific foods gradually show habituation of their responses over a time period of approximately 10-15 min of stimulation by the food in the mouth, while it is being eaten. The mechanism may involve synaptic adaptation of the afferent inputs to the neuron that are activated by a particular food, for the neuron can still respond after satiety to other foods that have not been eaten in a meal (see example in Figure 6). Sensory-specific satiety generalizes a little to similar foods, but not to dissimilar foods, reflecting the somewhat distributed encoding used by the neurons, which allows the similarity of stimuli to be reflected in neuronal responses that utilize dot-product decoding (Rolls, 2008b; Rolls and Treves, 2011). In the case of sensory-specific satiety, the generalization to other foods thus reflects the similarity (dot-product or correlation) between the firing rate vectors that activate the synaptic weight vector on a neuron (Rolls, 2008b).
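The proposed mechanism can be sketched in a few lines, under strong simplifying assumptions: a single neuron with a dot-product response, one adaptation variable per afferent synapse that grows only while that afferent is active, and food represented as a small firing-rate vector. The adaptation rule, the rate constant, the number of steps, and the example food vectors are hypothetical and chosen only to make the qualitative point.

```python
def response(weights, adaptation, food):
    """Dot-product response with each synapse scaled by (1 - adaptation)."""
    return sum(w * (1.0 - a) * x for w, a, x in zip(weights, adaptation, food))


def eat_to_satiety(adaptation, food, steps=15, rate=0.3):
    """Adapt only the synapses whose afferents are driven by the eaten food.

    Each step is a hypothetical unit of time in the mouth; adaptation
    approaches 1 for active inputs and is unchanged for silent ones.
    """
    adapted = list(adaptation)
    for _ in range(steps):
        adapted = [a + rate * x * (1.0 - a) for a, x in zip(adapted, food)]
    return adapted


# Hypothetical afferent firing-rate vectors for three foods.
eaten_food = [1.0, 1.0, 0.0, 0.0]
similar_food = [1.0, 0.0, 1.0, 0.0]      # shares one afferent with the eaten food
dissimilar_food = [0.0, 0.0, 1.0, 1.0]   # shares none

weights = [1.0, 1.0, 1.0, 1.0]
after = eat_to_satiety([0.0] * 4, eaten_food)

for name, food in [("eaten", eaten_food), ("similar", similar_food),
                   ("dissimilar", dissimilar_food)]:
    print(name, round(response(weights, after, food), 3))
```

In this sketch the response to the eaten food falls close to zero, the response to the similar food is reduced in proportion to its overlap (dot product) with the eaten food, and the response to the dissimilar food is unchanged, which mirrors the partial generalization described above.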
Sensory-specific satiety can occur in part if the food is not swallowed, but only chewed or even only smelled for 10-15 min. The mechanism thus does not rely on food entering the stomach or intestines, though full satiety only occurs if that is the case, showing that gastro-intestinal feedback is necessary for full satiety (Rolls, 2005).
Although the proposed mechanism thus involves synaptic adaptation, the process is not at all the same as sensory adaptation, in that there is no effect of satiety on neuronal responses at stages before the orbitofrontal cortex (Yaxley et al., 1988), and in that subjective ratings of the intensity of food hardly change after feeding to satiety, whereas the subjective pleasantness decreases to zero (Rolls et al., 1983b).

FLAVOR-PLACE LEARNING IN THE HIPPOCAMPUS
The primate anterior hippocampus (which corresponds to the rodent ventral hippocampus) receives inputs from brain regions involved in flavor reward processing such as the amygdala and orbitofrontal cortex (Suzuki and Amaral, 1994; Carmichael and Price, 1995a,b; Stefanacci et al., 1996; Pitkanen et al., 2002; Price, 2006). The primate hippocampus contains spatial view neurons, which respond to spatial locations "out there" being viewed (Robertson et al., 1998; Georges-François et al., 1999; Rolls, 1999; Rolls and Xiang, 2006). To investigate how this affective input may be incorporated into primate hippocampal function, we recorded neuronal activity while macaques performed a flavor reward-to-place association task in which each spatial scene shown on a video monitor had one location which if touched yielded a preferred fruit juice reward, and a second location which yielded a less preferred juice reward. Each scene had different locations for the different rewards. Of 312 hippocampal neurons analyzed, 18% responded more to the location of the preferred reward in different scenes, and 5% to the location of the less preferred reward. When the locations of the preferred rewards in the scenes were reversed, 60% of 44 neurons tested reversed the location to which they responded, showing that the reward-place associations could be altered by new learning in a few trials. The majority (82%) of these 44 hippocampal reward-place neurons tested did not respond to object-reward associations in a visual discrimination object-reward association task, showing that the hippocampal representation is specialized for flavor-place rather than object-flavor representations. Thus the primate hippocampus contains a representation of the reward associations of places "out there" being viewed, and this is a way in which reward information can be stored as part of an episodic memory (Rolls, 2008b, 2010b).
There is consistent evidence that rewards available in a spatial environment can influence the responsiveness of rodent place neurons (Hölscher et al., 2003; Tabuchi et al., 2003).

TOP-DOWN COGNITIVE MODULATION OF TASTE, OLFACTORY, AND FLAVOR REPRESENTATIONS INVOLVES LEARNING
If a cognitive, high level, indeed verbal, label is used to describe an odor, the odor can be rated as more subjectively pleasant than when the label indicates that it is unpleasant (De Araujo et al., 2005). In a study of the underlying neural mechanisms with fMRI, we showed that when an olfactory stimulus, isovaleric acid (with a smell somewhat like brie), was delivered with a visual word label indicating that it was cheese, the activations in the orbitofrontal cortex were greater to the odor than when the label was body odor (De Araujo et al., 2005). We showed that this was an interaction between the top-down cognitive label and the bottom-up olfactory input, for the difference of the activations was much greater with the label and the odor present than with the labels alone (De Araujo et al., 2005). We have shown similar cognitive modulation of the pleasantness of taste (umami, monosodium glutamate) and flavor (umami, monosodium glutamate plus vegetable odor) in the orbitofrontal cortex (Grabenhorst et al., 2008a; Figure 7).
These findings are of great interest, for they show that high-level cognitive influences descend into the first part of the human taste, olfactory, and flavor brain systems in which the reward value is made explicit in the representation. The cognition appears to actually modulate the neural representation that is related to subjective pleasantness.
The question arises of how the top-down (cognitive) signal connects to the correct neurons in the orbitofrontal cortex so that when the verbal indication is of good value, the reward representation is enhanced, and when the verbal indication is of poor value, the reward effects produced by the bottom-up input are not enhanced. This requires a matching between the top-down and the bottom-up signals. How could this be achieved? I propose that the mechanism is analogous to that which we have described in relation to the recall of memories from the hippocampus to the neocortex in our theory of hippocampal function (Rolls, 1989, 2008b, 2010b), and for which we have a quantitative analysis (Treves and Rolls, 1994;Rolls, 1995). The hypothesis is as follows, and is described with the help of Figure 8, which illustrates a related mechanism, namely the top-down biasing of activity in affective vs. sensory systems in the brain for taste, flavor, olfactory, and other representations. When there is a rewarding taste present as a bottom-up input that is causing orbitofrontal cortex neurons to fire, and simultaneously there is a cognitive top-down set of afferents (originating in language or related cortical areas) to the orbitofrontal cortex, some of which are active reflecting cognitive processing that a good taste is present, then the active synaptic afferents labeled s1 in Figure 8 show synaptic modification by associative, Hebb-like, long-term potentiation onto the active neurons reflecting the good bottom-up input. This associative synaptic modification is what sets up the correct relation between the cognitive top-down input and the bottom-up input.
Other neurons, which might be activated by bottom-up bad tastes, odors, or flavors, would similarly become associated by synaptic modification of other synapses (for example s2, s3, or s4 in Figure 8) with the corresponding top-down cognitive input to the orbitofrontal cortex representing the unpleasant or aversive nature of the bottom-up taste, etc., stimulus. Then later, after the learning, the top-down cognitive inputs that enhance reward value would enhance the activity of just those neurons that represented a good taste, etc. If the top-down reward value input was not present, there would be less activation produced by the bottom-up input, in the same way that we have analyzed for attention (Deco and Rolls, 2005b).
This mechanism is analogous to the memory recall mechanism, in that the top-down signal (in that case from the hippocampus) activates the correct neurons back in the neocortex, because of prior associative synaptic modification when both the bottom-up and top-down inputs were present (Rolls, 1989, 1995, 2008b, 2010b;Treves and Rolls, 1994).
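The associative matching proposed above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the quantitative analysis cited above: the four neurons, two word-label afferents, learning rate, threshold-linear activation, and all numerical values are hypothetical and chosen only for illustration.

```python
# Toy illustration only: all names, sizes, and constants are assumptions.

GOOD_LABEL, BAD_LABEL = 0, 1          # indices of two top-down afferents
GOOD_TASTE = [1.0, 1.0, 0.0, 0.0]     # neurons 0-1 driven by a pleasant taste
BAD_TASTE = [0.0, 0.0, 1.0, 1.0]      # neurons 2-3 driven by an unpleasant taste


def hebbian_pairing(weights, label, bottom_up, rate=0.25):
    """Strengthen synapses from the active top-down afferent onto the
    neurons that the bottom-up input is currently driving (pre * post)."""
    pre = 1.0  # the afferent carrying the current label is active
    for j, post in enumerate(bottom_up):
        weights[label][j] += rate * pre * post


def responses(weights, label, bottom_up, threshold=0.3):
    """Threshold-linear firing: bottom-up drive plus learned top-down bias.
    Pass label=None to present the stimulus without any word label."""
    out = []
    for j, drive in enumerate(bottom_up):
        bias = weights[label][j] if label is not None else 0.0
        out.append(max(0.0, drive + bias - threshold))
    return out


# Learning phase: each label is experienced together with its taste.
weights = [[0.0] * 4 for _ in range(2)]
hebbian_pairing(weights, GOOD_LABEL, GOOD_TASTE)
hebbian_pairing(weights, BAD_LABEL, BAD_TASTE)

# Test phase: a weak, ambiguous stimulus drives all four neurons equally.
ambiguous = [0.4, 0.4, 0.4, 0.4]
no_label = responses(weights, None, ambiguous)
with_good = responses(weights, GOOD_LABEL, ambiguous)
label_only = responses(weights, GOOD_LABEL, [0.0] * 4)

print("stimulus, no label:  ", no_label)
print("stimulus, good label:", with_good)
print("good label alone:    ", label_only)
```

In this sketch the learned top-down bias is below the firing threshold, so the label alone produces no response, whereas the same label enhances the response to the stimulus in just those neurons that were active when the label and the pleasant taste were paired. This is one simple way of producing the kind of label-by-stimulus interaction described above.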
Studies of the neuronal mechanisms of attention show that the top-down input cannot be very strong, or else it dominates the bottom-up perception, which must not be disconnected from the world (Renart et al., 1999a,b;Deco and Rolls, 2005b). Given that fact, the modulatory effects of these top-down signals are most evident when the bottom-up input is weak or ambiguous (as in the case of the isovaleric acid "brie-like" odor; De Araujo et al., 2005), for otherwise the bottom-up input then dominates the system and there is little or no attentional or cognitive modulation that can be observed (Deco and Rolls, 2005b).
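The dependence of top-down modulation on the strength of the bottom-up input can be illustrated with a single saturating unit. This is a sketch under stated assumptions rather than the integrate-and-fire models cited above: the sigmoid activation function and every parameter value are hypothetical.

```python
import math


def rate(bottom_up, top_down, threshold=1.0, slope=0.2):
    """Saturating (sigmoid) firing rate in the range 0..1 (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(bottom_up + top_down - threshold) / slope))


def modulation(bottom_up, top_down):
    """Change in firing produced by adding the top-down bias."""
    return rate(bottom_up, top_down) - rate(bottom_up, 0.0)


WEAK_BIAS = 0.2     # hypothetical weak top-down input
STRONG_BIAS = 2.0   # hypothetical over-strong top-down input

effect_on_weak_stimulus = modulation(0.9, WEAK_BIAS)    # ambiguous input
effect_on_strong_stimulus = modulation(2.0, WEAK_BIAS)  # clear, strong input
weak_bias_without_stimulus = rate(0.0, WEAK_BIAS)
strong_bias_without_stimulus = rate(0.0, STRONG_BIAS)

print("modulation of weak stimulus:  ", round(effect_on_weak_stimulus, 3))
print("modulation of strong stimulus:", round(effect_on_strong_stimulus, 3))
print("weak bias, no stimulus:       ", round(weak_bias_without_stimulus, 3))
print("strong bias, no stimulus:     ", round(strong_bias_without_stimulus, 3))
```

With these assumed values the weak bias changes the response to a near-threshold stimulus far more than the response to a strong stimulus that already saturates the unit, and only the over-strong bias drives firing in the absence of any stimulus.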

TOP-DOWN ATTENTIONAL MODULATION OF TASTE, OLFACTORY, AND FLAVOR REPRESENTATIONS INVOLVES LEARNING
If humans are asked to pay attention to pleasantness so that they can later rate the pleasantness of an odor, then activations related to pleasantness are enhanced in the orbitofrontal (secondary olfactory) cortex. Selective attention to intensity enhances representations in other cortical areas.
If humans are asked to pay attention to pleasantness so that they can later rate the pleasantness of a taste (umami), then activations related to pleasantness are enhanced in the orbitofrontal (secondary taste) cortex (Figure 9).

The BOLD signal in the medial orbitofrontal cortex was correlated with the subjective pleasantness ratings of taste and flavor, as shown by the SPM analysis, and as illustrated (mean across subjects ± SEM, r = 0.86, p < 0.001). (After Grabenhorst et al., 2008a).
Selective attention to intensity enhances representations in the primary taste cortex in the anterior insula. There is the same problem as for cognitive modulation of affective representations. How is a top-down signal originating from the level of language made to correspond with the correct bottom-up signals? The mechanism that I propose for attention is analogous to that which I proposed for cognitive modulation, that the top-down signal that is appropriate becomes associated by associative synaptic modification with the bottom-up signals when both are present. The circuitry for this is schematized in Figure 8, which shows the model we have proposed to accommodate these findings, the top-down biased activation model of selective attention (Grabenhorst and Rolls, 2010). The crucial synaptic modification for the correct correspondence to be set up is that between the top-down connections, and the neurons that receive the bottom-up input, labeled s in Figure 8.

BEYOND REWARD VALUE TO DECISION-MAKING
Representations of the reward value of food, and their subjective correlate the pleasantness of food, are influenced by associative learning, and by top-down cognitive and attentional control, as described above. But after the reward evaluation, a decision has to be made about whether to seek for and consume the taste, olfactory, flavor, oral texture, or other type of reward. We are now starting to understand how the brain takes decisions as described in The Noisy Brain (Rolls and Deco, 2010), and this has implications for whether a reward of a particular value will be selected (Rolls, 2008b, 2011a;Rolls and Deco, 2010;Grabenhorst and Rolls, 2011). A tier of processing beyond the orbitofrontal cortex, in the medial prefrontal cortex area 10, becomes engaged when choices are made between odor stimuli based on their pleasantness (see Figure 1; Grabenhorst et al., 2008b;Rolls et al., 2010b,c,d). The choices are made by a local attractor network in which the winning attractor state represents the decision, with each possible attractor state representing a different choice, and the neurons in each of the possible attractors receiving inputs that reflect the evidence for that choice. (The attractor network is formed in a part of the cerebral cortex by strengthening of the recurrent collateral excitatory synapses between nearby pyramidal cells using associative synaptic modification. One group of neurons with strengthened synapses between its members can form a stable attractor with high firing rates, which competes through inhibitory interneurons with other possible attractor states formed by other groups of excitatory neurons; Rolls, 2008b, 2010a. The word attractor refers to the fact that inexact including incomplete inputs are attracted to one of the states of high firing that are specified by the synaptic connections between the different groups of neurons. The result in this non-linear system is that one attractor wins, and this implements a mechanism for decision-making with one winner; Wang, 2002, 2008;Rolls, 2008b;Rolls and Deco, 2010). The decisions are probabilistic as they reflect the noise in the competitive non-linear decision-making process that is introduced by the random spiking times of neurons for a given mean rate that reflect a Poisson process (Rolls and Deco, 2010;Rolls et al., 2010c).

FIGURE 8 | Biased activation theory of top-down selective attention. The short-term memory systems that provide the source of the top-down activations may be separate (as shown), or could be a single network with different attractor states for the different selective attention conditions. The top-down short-term memory systems hold what is being paid attention to active by continuing firing in an attractor state, and bias separately either cortical processing system 1, or cortical processing system 2 via synapses labeled s. This weak top-down bias interacts with the bottom-up input to the cortical stream and produces an increase of activity that can be supralinear (Deco and Rolls, 2005b). Thus the selective activation of separate cortical processing streams can occur. In the example, stream 1 might process the affective value of a stimulus, and stream 2 might process the intensity and physical properties of the stimulus. The outputs of these separate processing streams then must enter a competition system, which could be for example a cortical attractor decision-making network that makes choices between the two streams, with the choice biased by the activations in the separate streams. (After Grabenhorst and Rolls, 2010).

FIGURE 9 | Effect of paying attention to the pleasantness vs. the intensity of a taste stimulus. Top: a significant difference related to the taste period was found in the medial orbitofrontal cortex at [−6 14 −20] z = 3.81 p < 0.003 (toward the back of the area of activation shown) and in the pregenual cingulate cortex at [−4 46 −8] z = 2.90 p < 0.04 (at the cursor). Middle: medial orbitofrontal cortex. Right: the parameter estimates (mean ± SEM across subjects) for the activation at the specified coordinate for the conditions of paying attention to pleasantness or to intensity. The parameter estimates were significantly different for the orbitofrontal cortex t = 7.27, df = 11, p < 10⁻⁴. Left: the correlation between the pleasantness ratings and the activation (% BOLD change) at the specified coordinate (r = 0.94, df = 8, p < 0.001). Bottom: pregenual cingulate cortex. Conventions as above. Right: the parameter estimates were significantly different for the pregenual cingulate cortex t = 8.70, df = 11, p < 10⁻⁵. Left: the correlation between the pleasantness ratings and the activation (% BOLD change) at the specified coordinate (r = 0.89, df = 8, p = 0.001). The taste stimulus, 0.1 M monosodium glutamate, was identical on all trials. (After Grabenhorst and Rolls, 2008).
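The attractor decision-making mechanism described above, with one winner and a probabilistic outcome, can be sketched in a heavily simplified rate model. The sketch makes several assumptions that are not in the cited models: two firing-rate pools stand in for populations of spiking neurons, direct mutual inhibition stands in for the inhibitory interneurons, Gaussian input noise stands in for the noise arising from Poisson-like spiking, and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(x, threshold=0.8, slope=0.1):
    """Assumed saturating activation function for each pool."""
    return 1.0 / (1.0 + math.exp(-(x - threshold) / slope))


def decide(input_1, input_2, rng, noise_sd=0.1, steps=600,
           w_self=1.6, w_inh=1.2, tau=20.0):
    """Simulate one trial; return 1 or 2 for the pool with the higher rate.

    Each pool excites itself (recurrent collaterals) and inhibits the
    other pool; the noise is redrawn independently on every time step.
    """
    r1 = r2 = 0.0
    for _ in range(steps):
        drive_1 = w_self * r1 - w_inh * r2 + input_1 + rng.gauss(0.0, noise_sd)
        drive_2 = w_self * r2 - w_inh * r1 + input_2 + rng.gauss(0.0, noise_sd)
        # Simultaneous Euler update of both rates (time step of 1 unit).
        r1, r2 = (r1 + (sigmoid(drive_1) - r1) / tau,
                  r2 + (sigmoid(drive_2) - r2) / tau)
    return 1 if r1 > r2 else 2


def fraction_choosing_1(input_1, input_2, trials=200, seed=0):
    """Proportion of trials on which pool 1 wins the competition."""
    rng = random.Random(seed)
    wins = sum(decide(input_1, input_2, rng) == 1 for _ in range(trials))
    return wins / trials


equal_evidence = fraction_choosing_1(0.5, 0.5)
clear_evidence = fraction_choosing_1(0.6, 0.5)
print("P(choose 1), equal evidence:", equal_evidence)
print("P(choose 1), more evidence for 1:", clear_evidence)
```

In this sketch the noise alone determines the winner when the evidence for the two choices is equal, so that the same inputs lead to different decisions on different trials, and increasing the evidence for one choice shifts the proportion of trials on which its attractor wins.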

The costs of each reward need to be subtracted from the value of each reward to produce a net reward value for each available reward before the decision is taken (Rolls, 2008b;Grabenhorst and Rolls, 2011). The reasoning or rational system with its long-term goals (introducing evidence such as "scientific studies have shown that fish oils rich in omega 3 may reduce the probability of Alzheimer's disease") then competes with the rewards such as the pleasant flavor of food (which are gene-specified, Rolls, 2005, though subject to conditioned effects, Booth, 1985;Rolls, 2005) in a further decision process which may itself be subject to noise (Rolls, 2005, 2008b;Rolls and Deco, 2010). This can be described as a choice between the selfish individual or "phene" (standing for phenotype) and the selfish gene (Rolls, 2011a, 2012).
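The subtraction of costs from reward values before a noisy choice can be written out as a short worked example. The option names and numbers below are hypothetical and are not data from any study, and the softmax rule is used here only as a convenient stand-in for a noisy decision process; it is not the attractor network itself.

```python
import math

# Hypothetical values on an arbitrary common scale (not data from any study).
options = {
    "juice_a": {"reward": 0.9, "cost": 0.5},
    "juice_b": {"reward": 0.6, "cost": 0.1},
}


def net_values(opts):
    """Net value of each option = reward value minus its cost."""
    return {name: o["reward"] - o["cost"] for name, o in opts.items()}


def choice_probabilities(net, noise=0.2):
    """Softmax over net values; a larger `noise` makes choices more random."""
    weights = {name: math.exp(value / noise) for name, value in net.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


net = net_values(options)
probs = choice_probabilities(net)
for name in options:
    print(name, "net value:", round(net[name], 2),
          "P(choice):", round(probs[name], 2))
```

In this example the option with the higher gross reward value has the lower net value once its cost is subtracted, so it is chosen less often, although the noise means that it is still chosen on some trials.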
In this context, the findings described in this paper about chemosensory learning and top-down cognitive and attentional effects on the taste, olfactory, and more generally reward systems in the brain are important advances in our understanding of how reward value is represented in the brain and is influenced by learning, and how decisions between those reward values are reached in attractor networks that themselves involve associative learning to set up the correct attractor states.