Blocking NMDA-Receptors in the Pigeon’s Medial Striatum Impairs Extinction Acquisition and Induces a Motoric Disinhibition in an Appetitive Classical Conditioning Paradigm

The medial striatum of birds resembles the mammalian dorsal striatum, which plays a key role in the extinction of learned behavior. To uncover the variant and invariant neural properties of extinction learning across species, we use pigeons as an animal model in an appetitive extinction paradigm. Here, we targeted a medial sub-region of the pigeon’s striatum that receives executive, visual and motor pallial projections. By locally antagonizing the N-methyl-D-aspartate (NMDA) receptors with 2-amino-5-phosphonovaleric acid (APV) during extinction, we observed an unspecific disinhibition effect, namely an increase in conditioned pecking to a rewarded control stimulus. In addition, blocking the NMDA receptors substantially impaired extinction acquisition: the pigeons still responded vigorously to the CS- even without food reward during extinction. After correcting for the unspecific effect of APV, the impairment of extinction acquisition remained significant, which suggests that the delayed extinction is possibly caused by deficits in updating the value coding of altered reward contingencies. The APV-induced disinhibition, in turn, seems to result from local hyperactivity that primarily drives actions towards cues of high appetitive value. The overall correspondence of our results with those from mammals suggests common neural substrates of extinction and highlights the shared functionality of the avian and mammalian dorsal striatum despite 300 million years of independent evolution.


INTRODUCTION
Classically, the basal ganglia have been associated with motor function. It is now increasingly understood that this system of substructures constitutes the core for a variety of learning, memory, and action selection processes. Since they receive cortical and subcortical projections carrying executive, limbic, sensory, and motivational information, these nuclei are well positioned to foster behavioral strategies that guide motor output in order to achieve favorable outcomes. The striatum, a key component of the basal ganglia, serves as the main entry gate to the basal ganglia connectivity loop. Numerous studies have indicated that striatal activation coincides with several neuropsychiatric disorders such as post-traumatic stress disorder or substance abuse (Graybiel and Rauch, 2000; Goh and Peterson, 2012; Goodman et al., 2012, 2014; Everitt and Robbins, 2013; Gillan and Robbins, 2014). A thorough understanding of the memory functions of the dorsal striatum may help to unravel the neural mechanisms underlying these human psychopathologies and to advance their treatment. Therefore, it is of paramount importance to examine not only how the dorsal striatum modulates memory formation, but also how it might be involved in the extinction of learned behavior.
Extinction of learned responses is as important for adaptive behavior as initial acquisition. During extinction, a conditioned stimulus (CS) appears repeatedly without the unconditioned stimulus, or the reinforcer. This reduces the previously learned conditioned response. Evidence suggests that extinction involves partial erasure of the original learning (Rescorla, 2004), as well as the formation of a new memory trace (Bouton et al., 2006). The existence of this new memory trace can be demonstrated by two phenomena: spontaneous recovery (the recovery of the extinguished response caused by the passage of time) and renewal (the recovery of the extinguished response induced by changing the context from the extinction phase to the test phase). To date, the importance of the mammalian dorsal striatum in the extinction of learned habit behavior has been repeatedly demonstrated by lesion experiments with monkeys (e.g., Butters and Rosvold, 1968) and rats (e.g., Thullier et al., 1996; Goodman et al., 2016). These studies found that it is mostly the dorsolateral (DLS) (e.g., Goodman et al., 2017) and not the dorsomedial part of the striatum (DMS) (Dunnett and Iversen, 1981) that modulates habit memory extinction. Furthermore, post-extinction DLS inactivation impairs memory consolidation after extinction training (Campus et al., 2015). More specifically, it has been shown that NMDA receptors of the DLS are key components in extinguishing habit responses during extinction (Ghasemzadeh et al., 2009; Goodman et al., 2017). Apart from its role in habit memory, the dorsal striatum in mammals also participates in Pavlovian fear conditioning (Ferreira et al., 2003, 2008). However, findings on its involvement in fear extinction are contradictory. A metabolic mapping study showed elevated glucose consumption in the dorsal striatum during fear extinction in rats (Barrett et al., 2003), whereas a lesion study revealed no effect on Pavlovian fear extinction after excitotoxic lesions in both DLS and DMS (Wendler et al., 2014).
In order to uncover invariant properties of extinction learning in evolutionarily distantly related animals, we use pigeons as an animal model in an appetitive Pavlovian conditioning task.
Pigeons are an excellent model system for learning and memory. Behaviorally, they can work with large numbers of visual stimuli while keeping track of individual reward contingencies and adapting their responses accordingly. Although the forebrains of birds and mammals differ in many respects, both vertebrate classes have homologous structures like the striatum and the hippocampus, as well as non-homologous but functionally equivalent structures like the nidopallium caudolaterale (NCL), which is comparable to the mammalian prefrontal cortex (PFC) (Güntürkün, 2012; Güntürkün and Bugnyar, 2016). Several studies have started to uncover the neural basis of extinction learning in birds. They indicate that the prefrontal-like NCL, the hippocampus and the (pre)motor arcopallium are crucial for the consolidation of extinction memory in birds (Lengersdorf et al., 2014; Gao et al., 2018). Also, acquisition of extinction memory engages the NCL and the amygdala via NMDA receptors in these regions (Lissek and Güntürkün, 2005; Lengersdorf et al., 2015; Gao et al., 2018). In addition, transiently inactivating the nidopallium frontolaterale (NFL), one of the pigeons' associative visual areas, impairs extinction acquisition and perturbs context processing during extinction (Gao et al., 2019). Altogether, the avian neural substrates of extinction learning exhibit characteristics comparable to those in mammals, although their last common ancestor lived ca. 300 million years ago.
In the present study, we attempt to answer the question whether the avian striatum participates in extinguishing learned behavior, since the mammalian striatum plays a crucial role in extinction learning (for a review see Goodman and Packard, 2018). Avian and mammalian striata are considered to be homologous to each other. Both are enriched in dopamine receptors (Sun and Reiner, 2000) and are innervated by dopamine fibers from the substantia nigra pars compacta and the ventral tegmental area (Bottjer, 1993;Casto and Ball, 1994;Wynne and Güntürkün, 1995). Both have neuropils that are rich in acetylcholine and cholinesterase (Reiner et al., 1994), and both have abundant GABAergic medium-sized neurons with spiny dendrites, which contain either substance P or enkephalin (Grisham and Arnold, 1994;Medina and Reiner, 1995;Reiner et al., 1998). In addition, the developing embryonic dorsal striatum in both birds and mammals expresses the same Dlx1/2 genes (Rubenstein et al., 1994;Puelles et al., 2000), which indicates the same developmental neuroepithelial origin.
Anatomical studies indicate that the avian dorsal striatum is composed of a medial (StM) and a lateral striatum (StL) (Veenman et al., 1995; Reiner et al., 1998, 2004). Despite the many similarities between the avian and mammalian striatum, their respective striatal subregions are possibly not homologous (Reiner, 2002). Although the striatum as a whole projects to both substantia nigra and pallidum in birds and mammals, neurons projecting to the substantia nigra and to the pallidum are intermingled throughout the DMS and DLS in mammals (Reiner and Anderson, 1990; Reiner et al., 1998, 2004). By contrast, the StM of birds contains mainly striatonigral projecting neurons, whereas the StL projects primarily to the pallidum (Bottjer, 1993; Reiner et al., 1998, 2004). Therefore, StM and StL together form the avian dorsal striatum, but do not appear to be one-to-one homologous to the DMS and DLS, respectively. In addition, the corticostriatal projections in mammals are topographically organized (Alexander et al., 1986; Alexander and Crutcher, 1990). For example, the DLS is primarily innervated by sensorimotor and motor cortices, while the DMS receives input from visual and auditory areas as well as associative and prefrontal regions (McGeorge and Faull, 1989). In birds, evidence has shown that pallial input to the avian striatum is as extensive as that in mammals and arises from all major parts of the pallium (Veenman et al., 1995; Reiner, 2002). However, the pallial-striatal projections are intermingled in StM and StL and show no topographical organization (Veenman et al., 1995; Reiner et al., 1998; Reiner, 2002). For example, the StM receives inputs from sensory, motor, and associative regions (Veenman et al., 1995; Medina and Reiner, 1997; Kröner and Güntürkün, 1999), possibly showing a mixture of DMS and DLS characteristics.
In the present study, we targeted a subregion within the avian StM that receives pallial projections from the "prefrontal-like" NCL, the visual associative NFL, and the (pre)motor arcopallium (Veenman et al., 1995; Kröner and Güntürkün, 1999; Letzner et al., 2016). We aim to investigate the role of the StM in the course of Pavlovian extinction, since the sources of its palliostriatal projections are all significantly involved in extinction (Lengersdorf et al., 2014, 2015; Gao et al., 2018). Considering the importance of the rodent dorsal striatal NMDA receptors in extinction learning (Goodman et al., 2017), and the presence of NMDA receptors in the avian striatum (Wada et al., 2004; Herold et al., 2014), we bilaterally injected the NMDA receptor antagonist APV into the target region before extinction. We adopted a well-established behavioral paradigm for pigeons (Rescorla, 2008; Lengersdorf et al., 2014, 2015; Gao et al., 2018) to test the hypothesis that the blockade of NMDA receptors in the StM impairs extinction learning.

Animals
In total, 22 adult homing pigeons (Columba livia) from local breeders participated in the study. The animals were individually housed in separate wire-mesh home cages (40 × 40 × 45 cm) in a colony room, where the temperature, humidity and the 12-h light-dark cycle were strictly controlled (lights on at 8 am). Since we adopted a Pavlovian conditioning procedure with food reward, all the animals were food deprived prior to training and maintained at 80-90% of their free-feeding body weight. Water was provided ad libitum in their home cages, with additional free food on weekends. Subjects were treated in accordance with the German guidelines for the care and use of animals in science, and all experimental procedures were approved by a national ethics committee of the State of North Rhine-Westphalia, Germany, and were in agreement with the European Communities Council Directive 86/609/EEC concerning the care and use of animals for experimental purposes.

Surgery
Prior to behavioral training, the animals were chronically implanted with one 26-gauge (10 mm) stainless steel guide cannula (Plastics One Inc., Roanoke, United States) in each brain hemisphere. For anesthesia, a 7:3 mixture of ketamine (100 mg/ml; Pfizer GmbH, Berlin, Germany) and xylazine (20 mg/ml; Rompun, Bayer Vital GmbH, Leverkusen, Germany) was injected i.m. at a dosage of 0.075 ml per 100 g body weight. Additional isoflurane (Forane 100%, Abbott GmbH & Co. KG, Wiesbaden, Germany) was applied through a breathing mask to maintain a stable anesthetized state. Body temperature was maintained using a heat pad during surgery.
Once pain reflexes were absent, the animals were fixed in a stereotaxic device. With one incision in the skin, the skull was exposed. Craniotomies were performed on both sides with the following coordinates based on the pigeon brain atlas (Karten and Hodos, 1967): AP +10.5 mm, ML ±2 mm. Under visual inspection, one cannula was inserted vertically into the medial striatum (StM) in each hemisphere (DV +5.3 mm). These coordinates were chosen based on previous tracing studies (Veenman et al., 1995; Medina and Reiner, 1997; Kröner and Güntürkün, 1999). We calculated the coordinates with the aim of targeting a subregion within the StM that receives efferents from the visual associative NFL, the (pre)motor arcopallium and the NCL, since all three regions are involved in extinction learning according to our previous studies (Lengersdorf et al., 2015; Gao et al., 2018, 2019). Three to five stainless steel micro-screws (Small Parts, Logansport, United States) were drilled into the skull around each cannula as anchors. Finally, dental cement was applied to the cannulas and the screws to secure the cannulas in their implanted positions. After surgery, all animals received analgesic injections of 0.5 ml Carprofen (Rimadyl, 10 mg/ml; Pfizer GmbH, Münster, Germany) twice daily on three consecutive days. The recovery period was 7 days in total, during which the animals received free food and water. Two days before restarting the behavioral training, they were food deprived again and maintained at 80-90% of their free-feeding body weight.

Behavioral Apparatus
The same behavioral apparatus was used as in the previous studies (Lengersdorf et al., 2014, 2015; Gao et al., 2018). Briefly, the experimental chambers were four Skinner boxes of similar shape (36 × 34 × 36 cm), which were housed in sound-attenuating cubicles (80 × 80 × 80 cm). Opposite to the opening of the Skinner boxes, animals could see the stimulus presented on the monitor screen (either Belinea Model No.: 10 15 36 or Philips Model: Brilliance17S1/00) through the transparent pecking key (5 × 5 cm; 12 cm above the floor). Every effective key peck produced a feedback sound. Food was provided by a food hopper, which was positioned at the bottom center directly underneath the pecking key.
The Skinner boxes were grouped into two distinct contexts by using different wallpapers on the rear and side walls (either 2.5 cm wide vertical brown stripes spaced 5 cm apart on a red background or a marbling pattern on a turquoise background). In addition, a different noise, either white or brown noise (60 dB SPL), was played in each context during training to make the two contexts easier to distinguish. Well distinguishable visual stimuli were used in the study (see section "Behavioral Procedure"). The hardware was controlled by custom-written MATLAB code (The MathWorks, Natick, MA, United States) using the Biopsychology toolbox (Rose et al., 2008).

Behavioral Procedure
The training procedure was identical to that used in Lengersdorf et al. (2014, 2015) and Gao et al. (2018). Briefly, training was composed of five separate phases: pretraining I, pretraining II, conditioning, extinction and test (Table 1 and Figures 1, 2). Except for the extinction training (described later), animals were trained with two sessions per training day, one session in each context (Figure 2).

Pretraining I and II
In pretraining I, there were 48 trials in each session. In each trial, a stimulus ("target") was presented for 5 s and followed by 3 s of food reward with grain provided by the food hopper. This "target" stimulus was always rewarded, regardless of whether the pigeons responded. The inter-trial interval (ITI) was fixed at 48 s. As soon as the animals achieved the learning criterion of consistent pecking responses in 80% of the trials in both contexts on three consecutive days, they entered the second pretraining phase. In pretraining II, in addition to the "target" stimulus, a control stimulus ("non-target") was introduced. The "non-target" was never rewarded regardless of the response of the animals. Each session consisted of 24 trials of "target" and 12 trials of "non-target" presentations (5 s) in both contexts. The ITI was reduced to 35 s. A minimum of 80% correct responses (pecking response to the "target" and no response to the "non-target") in both contexts on three consecutive training days was required to enter the conditioning phase.

Conditioning
During the subsequent conditioning phase, animals were trained with an additional CS in each context. They received training for CS A in context A and CS B in context B (Figure 1A). Each of the three stimuli ("target," "non-target" and the corresponding CS) was presented for 5 s in 12 trials, giving 36 trials in total per session. The CS and "target" presentations were rewarded by 3 s of food access via the food hopper, while the "non-target" was not rewarded. The duration of the conditioning phase depended on how long the pigeons needed to achieve the learning criterion of 80% correct responses for all stimuli across three consecutive days.
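The learning criterion used across the training phases (at least 80% correct responses on three consecutive days) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code: the function name `day_criterion_met`, the dictionary layout, and the example scores are hypothetical and serve only to make the rule concrete.

```python
from typing import Mapping, Optional, Sequence


def day_criterion_met(
    daily_scores: Sequence[Mapping[str, float]],
    threshold: float = 0.8,
    run_length: int = 3,
) -> Optional[int]:
    """Return the 1-based day on which the criterion is first met, or None.

    daily_scores[i] maps a label (for example a stimulus in one context)
    to the proportion of correct responses on day i + 1. A day counts
    only if every entry reaches the threshold.
    """
    streak = 0
    for day, scores in enumerate(daily_scores, start=1):
        if all(p >= threshold for p in scores.values()):
            streak += 1
        else:
            streak = 0  # one failed day resets the run of consecutive days
        if streak == run_length:
            return day
    return None


# Hypothetical scores for one pigeon (not data from the study).
example = [
    {"target": 0.92, "non-target": 0.75, "CS": 0.83},  # non-target below 0.8
    {"target": 0.92, "non-target": 0.83, "CS": 0.83},
    {"target": 1.00, "non-target": 0.92, "CS": 0.92},
    {"target": 0.92, "non-target": 0.83, "CS": 1.00},
]
print(day_criterion_met(example))
```

In this sketch a single day below threshold resets the count, which is one reasonable reading of "three consecutive days".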

Extinction
The extinction phase consisted of 4 days in total. The two extinction sessions were scheduled 48 h apart from each other (Figures 1A, 2). The pigeons received one extinction session in each context, in which the corresponding CS was no longer paired with food reward. Fifteen minutes before extinction training, the pigeons were micro-infused bilaterally with either 1 µl of 5 µg/µl APV (Lissek and Güntürkün, 2003; Lengersdorf et al., 2015; Gao et al., 2018) dissolved in 0.9% saline (Tocris Cookson Ltd., Bristol, United Kingdom) or 1 µl of 0.9% saline (B. Braun Melsungen AG., Germany). The order of injections (APV or saline) was randomized across subjects and contexts. There was one day without experimental intervention between the extinction sessions to ensure a complete washout of the injected substances from the body. During extinction, each CS was presented without reward in the context in which it had not been presented during conditioning (Figure 1A). That is, CS A, which had previously been presented and rewarded in context A, was now presented without reward in context B. Similarly, CS B, which had been used in context B during conditioning, was now presented without reward in context A. In each extinction session, animals received the corresponding non-rewarded CS- (24 trials), the rewarded "target" (24 trials) and the non-rewarded "non-target" (12 trials). Again, only the target stimulus was rewarded for 3 s. The order of contexts was randomized across subjects.

Test
In the final testing phase, 48 h after the second extinction session, responses to all four stimuli were tested under drug-free conditions (Figure 2 and Table 1). Each stimulus was presented for 5 s, 12 times in each context, with 2 h between the two testing sessions. One session contained 48 trials in total. In the testing phase, only the target stimulus was rewarded.
Overall, our within-subject design (Figure 1) allows each pigeon to be compared with itself across two conditions under different pharmacological manipulations. For example, CS A was acquired in context A, extinguished in context B, and tested in both A and B. Thus, we had two conditions, ABA and ABB. Renewal can be observed in the ABA condition, while spontaneous recovery is visible in ABB. For CS B, BAB was equivalent to ABA, and BAA to ABB. During extinction, the two CSs were processed under different pharmacological conditions. Therefore, CS-APV_EXT refers to the CS under the effect of the drug, in this case CS A. For simplification, CS APV_EXT is also used to refer to the CS responses in the conditioning and testing phases (CS+APV_EXT in conditioning and CS-APV_EXT in testing), although conditioning and testing sessions were conducted drug-free. The same applies to CS saline_EXT. Again, the subscripts "APV_EXT" and "saline_EXT" indicate that the injection was given prior to extinction training. To assess the effect of APV on spontaneous recovery, we compared CS-APV_EXT with CS-saline_EXT in condition ABB/BAA within each pigeon. In addition, comparing CS-APV_EXT with CS-saline_EXT in ABA/BAB revealed how the drug affected renewal. As described above, apart from the two CSs that underwent extinction, we trained the pigeons with two additional control stimuli, the "target" and the "non-target." The "target" stimulus remained rewarded throughout the whole experiment, whereas the "non-target" was never rewarded (Table 1). The purpose of including the control stimuli was to identify possible non-specific effects induced by APV infusions.
FIGURE 1 | The "+" indicates that the CS was rewarded, and the "-" indicates that the CS was not rewarded. Not shown are the "target" (rewarded) and the "non-target" (not rewarded). (B) The training histories of the two CSs are illustrated according to (A), with (+) indicating food reward and (-) no food reward. The two CSs were processed under different pharmacological conditions during extinction. For simplification, CS APV_EXT and CS saline_EXT are used to refer to CS A and CS B, respectively. The subscripts "APV_EXT" and "saline_EXT" indicate that the injection was given prior to extinction training. In the experiment, contexts, stimuli, and injection sequences were balanced across subjects. The figure shows only one possible example of (1) the sequence in which animals were exposed to contexts A and B, and (2) the sequence in which they received saline and APV infusions, all of which were counterbalanced across animals.
FIGURE 2 | Schematic representation of the experimental procedure. This depiction shows only one possible example, and pretraining I and II are not included. Squares indicate a single training session in one corresponding context (depicted in dark blue or red). The black vertical bars separate consecutive workdays from each other. In the conditioning phase, the two sessions on each workday were separated by 2 h. The conditioning phase lasted at least 6 days. The specific duration (n) depended on how long the pigeons needed to achieve the learning criterion. During the extinction phase, on days n + 1 and n + 3, the animals were trained with one extinction session per day. The black arrows on days n + 1 and n + 3 indicate the injections of either drug or saline 15 min before extinction training. There was no training on the day after each injection to ensure the complete washout of the injected agents from the body. The subjects were tested in each context on day n + 5.

Histology
After the animals had completed all behavioral experiments, histology was conducted to verify whether the cannulas were positioned in the StM. To prevent blood clots, animals were injected i.m. with 0.1 ml heparin (Rotexmedica GmbH, Trittau, Germany) dissolved in 0.1 ml of 0.9% NaCl before the perfusion procedure. Fifteen minutes later, anesthesia was induced by i.m. injection of equithesin (0.55 ml/100 g body weight). Once the animal no longer responded to pain stimulation, its circulatory system was transcardially flushed with ca. 500 ml of 0.9% saline (40 °C). Subsequently, animals were perfused with 1 L of 4% paraformaldehyde (VWR Prolabo Chemicals, Leuven, Belgium). After dissection, the brains were post-fixed for at least 2 h in paraformaldehyde and 30% sucrose at 4 °C. Afterwards, they were transferred to 30% sucrose diluted in 0.12 M PBS for 24 h for cryoprotection. Finally, the brains were embedded in 15% gelatin (Merck KGaA, Darmstadt, Germany) dissolved in 30% sucrose, fixed for 12 h in 4% paraformaldehyde, and then preserved in a solution of 30% sucrose and 0.12 M PBS. For the last steps of histology, the brains were cut frontally into 40 µm slices on a microtome (Leica Microsystems GmbH, Nussloch, Germany) and then stained with cresyl violet to reveal the brain structures. The atlas of the pigeon brain by Karten and Hodos (1967) was used to identify the positions of the cannulas.

Data Analysis
The pecking responses to the "target," the "non-target," CS APV_EXT, and CS saline_EXT were registered and stored using a custom-built interface and custom-written MATLAB code. The number of registered responses on the pecking key during a 5 s stimulus presentation was the main dependent variable. IBM SPSS Statistics (Version 21, IBM Corp., Armonk, NY, United States) and MATLAB were used for statistical analysis. The data from the last three training sessions of the conditioning phase were included in the statistical analysis. For the extinction sessions, pecking responses to the target (24 trials) and the CS (24 trials) were grouped into six blocks of four consecutive trials each. For the non-target (12 trials), two consecutive trials constituted one block, which likewise yields six blocks. In the test phase, we pooled the data from the ABA and BAB conditions and refer to them as ABA for simplification. Likewise, ABB refers to the pooled ABB and BAA conditions. Normal distribution was evaluated with the Kolmogorov-Smirnov test. The data sets were then analyzed with repeated-measures ANOVA (RMANOVA). Mauchly's test was conducted to test for sphericity. Where sphericity was violated, the Greenhouse-Geisser or Huynh-Feldt correction was applied. Post hoc tests were conducted in case of significant factor effects.
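The blocking of extinction trials described above can be sketched in a few lines of Python. This is an illustration only: the text does not state whether responses within a block were summed or averaged, so averaging is an assumption here, and the function name `block_means` and the peck counts are hypothetical.

```python
from typing import List, Sequence


def block_means(pecks_per_trial: Sequence[float], n_blocks: int = 6) -> List[float]:
    """Group consecutive trials into n_blocks equal blocks and average each.

    With 24 trials this gives blocks of four trials (target, CS);
    with 12 trials it gives blocks of two trials (non-target).
    Averaging within a block is an assumption of this sketch.
    """
    n_trials = len(pecks_per_trial)
    if n_blocks <= 0 or n_trials == 0 or n_trials % n_blocks != 0:
        raise ValueError("trial count must be a positive multiple of n_blocks")
    size = n_trials // n_blocks
    return [
        sum(pecks_per_trial[start:start + size]) / size
        for start in range(0, n_trials, size)
    ]


# Hypothetical peck counts per 5 s presentation (not data from the study).
cs_trials = [9, 8, 9, 7, 6, 7, 5, 6, 4, 5, 3, 4,
             3, 2, 3, 2, 1, 2, 1, 1, 0, 1, 0, 0]        # 24 CS trials
non_target_trials = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]  # 12 non-target trials

print(block_means(cs_trials))
print(block_means(non_target_trials))
```

Both calls return six values, so target, CS and non-target can enter the same block factor despite their different trial counts.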

Histology
In total, 22 pigeons participated in the study. Five pigeons were excluded from further analysis since three animals failed to learn the task after surgery and in two pigeons the cannulas were incorrectly placed. Thus, data from the remaining 17 pigeons were analyzed. In all 17 pigeons cannula implantations were successfully positioned in the StM of both hemispheres, as revealed by histological analysis (Figure 3).

Learning Speed of the Animals
On average, the animals needed 17.29 (±1.44) days of training to achieve the learning criterion of 80% of correct responding for target, non-target and the corresponding CS of each context. This duration includes training phases of pretraining I and II as well as conditioning (Figure 4).

Conditioning
In the last three sessions of conditioning, the mean response to the target (9.1 ± 0.8; mean ± SEM), the CS+ APV_EXT (10.1 ± 1.0) and the CS+ saline_EXT (9.0 ± 1.1; Figure 5A) did not differ from each other (paired sample t-test: target vs.

Extinction
Two-way RMANOVA for both target and CS responding was conducted with two factors, the block and the injection (APV or saline).
FIGURE 4 | Training history of individual pigeons in pretraining I and II and the conditioning phase. Each bar represents one individual pigeon. The y-axis shows the number of training days required to reach the 80% training criterion in all phases. Blue, orange, and yellow represent pretraining I, pretraining II, and the conditioning phase, respectively. The animals needed a minimum of 12 days and a maximum of 32 days to learn the task.
ηp² = 0.32) and interaction (Greenhouse-Geisser correction: F(4.5, 39.8) = 6.27, p < 0.001, ηp² = 0.28; Figure 5D). As just described for the target stimulus, APV seems to induce a disinhibition of pecking to reward-associated cues. To account for this effect and to disambiguate it from further effects on learning, we normalized the CS response rates in the APV condition by multiplying them by the index $Tar_{Sal}/Tar_{APV}$ (see Eq. 1), which represents the ratio of the target response rate under saline to that under APV.
$$\text{normalized } CS_{APV} = \frac{Tar_{Sal}}{Tar_{APV}} \times CS_{APV} \tag{1}$$
This parameter corrects the CS pecking performance and indicates how the pecking response should manifest without the unspecific response effect to appetitive cues induced by APV. This enables us to detect the effect of APV on extinction learning dynamics. The RMANOVA indicated a strong effect of injection (F(1, 16) = 7.2, p = 0.017, ηp² = 0.31) and of block (Greenhouse-Geisser correction: F(2.7, 42.5) = 12.3, p < 0.001, ηp² = 0.44; Figure 5E). For the block effect, we now observe that the normalized pecking response to the CS dropped significantly under both conditions (CS-saline_EXT: F(5, 80) = 11.9, p < 0.001, ηp² = 0.43; normalized CS-APV_EXT: F(2.7, 43.6) = 3.6, p = 0.024,
FIGURE 5 | (F) Mean response rates (±SEM) for the stimuli in the test. Gray and red indicate saline and APV, respectively. Data from the ABA and BAB conditions were pooled and labeled ABA for simplification; likewise, data from ABB and BAA were pooled and labeled ABB.
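The normalization in Eq. 1 can be made concrete with a short Python sketch. The text does not specify whether the index was computed per block, per session, or per animal, so the per-block helper below is an assumption; the function names and the numbers are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def normalize_cs_apv(cs_apv: float, tar_sal: float, tar_apv: float) -> float:
    """Apply Eq. 1: scale the CS response under APV by Tar_Sal / Tar_APV."""
    if tar_apv <= 0:
        raise ValueError("target response rate under APV must be positive")
    return (tar_sal / tar_apv) * cs_apv


def normalize_blocks(
    cs_apv_blocks: Sequence[float],
    tar_sal_blocks: Sequence[float],
    tar_apv_blocks: Sequence[float],
) -> List[float]:
    """Apply Eq. 1 block by block (the block-wise index is an assumption)."""
    if not (len(cs_apv_blocks) == len(tar_sal_blocks) == len(tar_apv_blocks)):
        raise ValueError("all inputs must have the same number of blocks")
    return [
        normalize_cs_apv(cs, sal, apv)
        for cs, sal, apv in zip(cs_apv_blocks, tar_sal_blocks, tar_apv_blocks)
    ]


# Hypothetical mean pecks per presentation: the target is pecked more under
# APV (10) than under saline (8), so CS responding under APV is scaled down.
print(normalize_cs_apv(cs_apv=5.0, tar_sal=8.0, tar_apv=10.0))
```

When target responding is higher under APV than under saline, the index is below 1 and the normalized CS response is lower than the raw one, which is the intended correction for the unspecific disinhibition.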
Because of the training histories in different contexts, the animals should show stronger responses in the conditioning context (ABA) than in the extinction context (ABB). This is the hallmark of renewal. Accordingly, we observed significant renewal under both conditions (paired sample t-test, CS-saline_EXT in ABB vs. ABA: t(16) = −4.5, p < 0.001, Cohen's d = 1.18; CS-APV_EXT in ABB vs. ABA: t(16) = −4.4, p < 0.001, Cohen's d = 1.18; Figure 5F). In the extinction context (ABB), the animals responded equally to the CS-APV_EXT and the CS-saline_EXT (t(16) = 1.5, p = 0.163, Cohen's d = 0.39; Figure 5F). Similarly, in the ABA condition, pecking to the two CSs did not differ (t(16) = 1.2, p = 0.236, Cohen's d = 0.33; Figure 5F).
Our results indicate that APV injections in the StM prior to the extinction training did not interfere with memory retrieval during testing in the extinction context (ABB) as well as in the conditioning context (ABA).
To examine whether the consolidation of extinction memory was affected by the injection of APV prior to extinction, we compared the CS responses in the last block of extinction with those in the first four trials of the retrieval test in the extinction context (ABB). The results indicated no significant change in the pecking response to CS-APV_EXT (paired sample t-test between the normalized CS-APV_EXT in the last block of extinction and the CS-APV_EXT at the beginning of the test: t(16) = −0.4, p = 0.682, Cohen's d = 0.09) or to CS-saline_EXT (t(16) = 2.0, p = 0.059, Cohen's d = 0.53), implying that the pigeons' responding did not change from the end of extinction to the beginning of testing in the extinction context. Therefore, the consolidation of the extinction memory was not affected by the APV injection.

DISCUSSION
The aim of the study was to examine the role of NMDA receptors for extinction learning in the medial striatum of pigeons. For this purpose, local NMDA receptors were blocked with APV during extinction. Consequently, we observed severe deficits in the acquisition of extinction memory, as well as a disinhibition of conditioned responding towards a rewarded control stimulus. These effects will now be discussed.
The target region of the present study receives massive input from the prefrontal-like NCL as well as smaller projections from the visual associative NFL and the (pre)motor arcopallium (Veenman et al., 1995; Kröner and Güntürkün, 1999; Shanahan et al., 2013). The results of the present study closely resemble those obtained after APV injections into the NCL using the identical experimental procedure (Lengersdorf et al., 2015). This implies that the avian pallial (NCL) → striatal (StM) pathway plays a role similar to that of the corticostriatal system in mammals. This may explain why both studies show a delayed extinction acquisition and a disinhibition of conditioned pecking to a rewarded stimulus after blocking NMDA receptors.

Blocking StM NMDA Receptors Impaired Extinction Acquisition
The observation of a delayed extinction in the present work is in good accordance with studies in other bird species. Excitotoxic lesions in the chick StM delayed extinction of operant pecking without affecting either the learned inhibition of pecking at the nonrewarded stimulus or the operant pecking at a rewarded control stimulus (Ichikawa et al., 2004). Yanagihara et al. (2001) demonstrated that a population of StM neurons codes the chick's evaluation of instantaneous rewards. Consequently, Ichikawa et al. (2004) proposed that delayed extinction after StM lesions might be caused by a disturbed update of value coding of altered reward contingencies during extinction. This might also be one of the key explanations for the impairment in extinction acquisition observed in our study. To date, several studies have revealed a central role of the NCL in encoding reward amount and subjective reward value (Kalenscher et al., 2005;Koenen et al., 2013;Dykes et al., 2018). It is therefore possible that NCL efferents have a strong impact on reward value coding of StM neurons (Yanagihara et al., 2001), which would also explain the similarity of the results from our study to those obtained from the NCL.
Since the StM receives inputs from the visual, (pre)motor, and prefrontal regions (Veenman et al., 1995;Medina and Reiner, 1997;Kröner and Güntürkün, 1999), it possibly has a mixture of DMS and DLS characteristics. Accordingly, our findings are comparable to the extinction data obtained from mammals (Butters and Rosvold, 1968;Baranov, 1977;Denisova, 1981). In particular, the linkage of the DLS to the extinction of habit learning (Dunnett and Iversen, 1981;Goodman et al., 2017) via local NMDA receptors (Goodman et al., 2017) is well documented. Similarly, there is evidence that neurons in the monkey DMS display stimulus-related activity, which is believed to code for the anticipation of sensory cues that signal reward (Kimura et al., 1993). This finding accords well with the electrophysiological findings in chick StM neurons (Yanagihara et al., 2001). Furthermore, the present study indicates a specific role of NMDA-modulated synaptic plasticity of the avian striatum in extinction. There is evidence that activity-dependent long-term synaptic plasticity in the zebra finch striatum is involved in song learning and requires the activation of NMDA receptors (Ding and Perkel, 2004). Similarly, mammalian NMDA receptors in the dorsal striatum are needed for long-term potentiation and synaptic depotentiation (Calabresi et al., 1992;Li et al., 2009). Moreover, the enhancement or impairment of habit memory extinction can be achieved through modulation of NMDA receptors in the DLS (Goodman et al., 2017). Given the highly conserved structure of the amniote basal ganglia across species (for a review see Medina and Reiner, 1995), it is likely that neural mechanisms in the dorsal striatum operate in a similar way in birds and mammals. Accordingly, our study provides the first evidence that NMDA-dependent synaptic changes in the avian dorsal striatum are involved in Pavlovian extinction learning.
In addition, studies with rodents usually use extended training to ensure habit learning during task acquisition (e.g., Packard and McGaugh, 1996), and our pigeons likewise received more than a thousand trials in total. It is therefore reasonable to believe that the extensive training in our experiment may have induced habit learning as well. Since the StM in pigeons has anatomical connections with somatic and (pre)motor areas (Medina and Reiner, 1995;Reiner et al., 1998), antagonizing the NMDA receptors during extinction may have impaired the extinction of the habitual component of responding in our task. We therefore speculate that the impairment of extinction acquisition observed in the present study may be explained by (1) a disturbed update in value coding propagated via the prefrontal-like NCL, and (2) an impaired extinction of habitual responding due to the connections of the StM with the (pre)motor arcopallium.

Blocking StM NMDA Receptors Induced a Disinhibition of the Conditioned Responding to a Rewarded Control Stimulus
In the present experiment on the StM and the previous one investigating the role of the NCL (Lengersdorf et al., 2015), a disinhibition of conditioned responding to a rewarded control stimulus was found under the effect of local APV injections during extinction. It has been repeatedly shown that increased responding can be induced by pharmacological administration of NMDA receptor antagonists (Carlsson and Carlsson, 1989;Adams and Moghaddam, 1998;Moghaddam and Adams, 1998;Gainetdinov et al., 2001). At the neurophysiological level, the systemic injection of the non-competitive NMDA receptor antagonist MK801 produces an increased number of disorganized spike bouts in PFC neurons of freely moving rats (Jackson et al., 2004). These disorganized single spikes co-occurred with elevated locomotion and behavioral stereotypy (Jackson et al., 2004). In addition, MK801 injections preferentially reduced the activity of prefrontal GABAergic interneurons, thereby disinhibiting pyramidal neurons and resulting in an elevated excitation of cortical outputs (Homayoun and Moghaddam, 2007). At the synaptic level, administration of various kinds of NMDA receptor antagonists increases the extracellular levels of glutamate (Moghaddam et al., 1997;Adams and Moghaddam, 1998;Moghaddam and Adams, 1998;Lorrain et al., 2003) and dopamine (Carlsson and Carlsson, 1989;Carrozza et al., 1992;Keefe et al., 1992;Martínez-Fong et al., 1992;Lorrain et al., 2003) in the PFC and striatum. Importantly, evidence from pigeons also suggests that systemically elevated dopamine receptor activity increases the pecking responses to a stimulus that is predictive of food delivery (Anselme et al., 2018). Jointly, the above-mentioned evidence makes it likely that the increased pecking to the rewarded "target" observed in the current study results from APV-induced hyperactivity. However, this effect was confined to the rewarded "target" and did not extend to the "non-target."
One possible reason may lie in the stronger appetitive value of the "target" in comparison to the "non-target." Upon non-rewarded stimulus presentations, the activity of the "hyperactive" striatal neurons induced by local APV injection might still have remained below threshold and therefore did not induce any changes in behavioral output. That said, it is presently unclear whether the hyperactivity results from alterations of motivational or purely motoric processes (Jackson et al., 2004;Gökhan et al., 2017).

CONCLUSION
Our results support the hypothesis that NMDA receptors in the avian StM are involved in distinct aspects of extinction learning. When APV was locally injected prior to extinction, the response decrement during extinction acquisition was severely impaired. In addition, we observed disinhibited responding to the rewarded control stimulus but not to the non-rewarded one. The resemblance of our data to those obtained from the NCL with the same behavioral paradigm hints at the presence of a functional pallial-striatal pathway in the avian brain similar to the prefrontal-striatal network in mammals. Furthermore, the comparative approach underscores the shared functionality of the avian and mammalian dorsal striata in extinction learning and suggests some invariant properties in evolutionarily distant vertebrates that derive from the last common ancestor 300 million years ago.

DATA AVAILABILITY
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
This study was carried out in accordance with the German guidelines for the care and use of animals in science and all experimental procedures were approved by the National Ethics Committee of the State of North Rhine-Westphalia, Germany and were in agreement with the European Communities Council Directive 86/609/EEC concerning the care and use of animals for experimental purposes.

AUTHOR CONTRIBUTIONS
MG performed the experiments and data analysis. OG conceived the original idea. OG and RP supervised the study. MG wrote the manuscript with the support from OG and RP.

FUNDING
This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project number 316803389, SFB 1280 (Project A01), and FOR 1581. The funding agency had no role in study design, collection, analysis, or interpretation of the data, in writing the manuscript, or in the decision to submit the manuscript for publication.