Edited by: Mauricio R. Delgado, Rutgers, The State University of New Jersey, USA
Reviewed by: Andrew Delamater, City University of New York, USA; Candace Marie Raio, New York University, USA
*Correspondence: Alex Pine, Department of Neurobiology, Weizmann Institute of Science, Herzl Street, Rehovot 76100, Israel e-mail:
This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Psychology.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Preferences profoundly influence decision-making and are often acquired through experience, yet it is unclear what role conscious awareness plays in the formation and persistence of long-term preferences and to what extent they can be altered by new experiences. We paired visually masked cues with monetary gains or losses during a decision-making task. Despite being unaware of the cues, subjects were influenced by their predictive values over successive trials of the task, and also revealed a strong preference for the appetitive over the aversive cues in supraliminal choices made days after learning. Moreover, the preferences were resistant to an intervening procedure designed to abolish them by a change in reinforcement contingencies, revealing a surprising resilience once formed. Despite their power, however, the preferences were abolished when this procedure took place shortly after reactivating the memories, indicating that the underlying affective associations undergo reconsolidation. These findings highlight the importance of initial experiences in the formation of long-lasting preferences even in the absence of consciousness, while suggesting a way to overcome them in spite of their resiliency.
Humans and animals can learn to predict future reinforcement and make appropriate responses based upon knowledge of its contingency with environmental cues and actions. Experimental analysis has demonstrated that in associative learning paradigms contingent CS-US (conditioned stimulus—unconditioned stimulus) pairings (observational or via instrumental responses) have the potential to create multiple associative representations in the brain (Mackintosh,
A common manifestation of this phenomenon is that conditioning can engender a change in the hedonic evaluation of stimuli, leading to the formation of preferences (likes and dislikes), which profoundly guide behavior and choice (Rozin et al.,
A unique feature of preferences is that they remain relatively stable over one's lifetime. This resilience has also been observed experimentally, where supraliminally acquired preferences appear to be resistant to extinction training protocols (Baeyens et al.,
To address these questions, we examined subliminal instrumental learning using appetitive and aversive secondary reinforcement in humans. Our first aim was to determine if instrumental behaviors and preferences to discriminatory stimuli can be acquired without conscious awareness, and if so, whether they can influence long-term decision-making. We next assessed whether the associations learned in our task could be altered by an additional phase of subliminal learning where the reward/punishment contingencies were altered, such that the stimuli were no longer discriminatory. Finally, we probed the question of whether they undergo reconsolidation, by examining if application of this contingency shift during the hypothetical reconsolidation window following memory reactivation is more efficacious in altering instrumental task responses and preferences than the manipulation with no reactivation.
Forty-four participants (28 female, 16 male; mean age 25.1 ± 3.4 years) were recruited from the Weizmann Institute of Science and the Faculty of Agriculture of the Hebrew University, Rehovot. Four participants were excluded from the analyses: two because they performed significantly above chance in the perceptual discrimination and/or recognition tasks, and another two for constantly making a “Go,” “No-Go,” or fixed alternating response in all trials of one of the testing sessions (see below). There remained 19 participants in the reconsolidation group and 21 in the control group. The experimental protocol was approved by the Institutional Review Board of the Sourasky Medical Center, Tel-Aviv, and written informed consent was obtained from all participants.
The experiment took place over three consecutive days. On day 1 we employed a subliminal instrumental conditioning procedure which utilized discriminative stimuli (SDs) for reward and punishment (day 1; Figure
On day 1, participants were randomly assigned to one of two conditions—control or reconsolidation. Six stimuli taken from a set of Japanese characters (matched for size and complexity) were then randomly ordered to form three pairs of stimuli that were assigned to the subject for all 3 days: S + app, S + av (1st pair); S + app, S + av (2nd pair); S−1, S−2. The same six characters (randomized) were used for all subjects.
Discriminated instrumental conditioning (phase 1 learning) was implemented subliminally using a technique similar to Pessiglione et al. (
Test trials were identical to learning trials except that no feedback was provided following responses—the subsequent trial began immediately following the 2 s response period. Subjects could still win or lose money during these trials and were aware of this.
In each round there were 80 trials of learning comprising 40 randomized presentations of each S+. Immediately following learning there were an additional 40 test trials (20 randomized presentations of each S+). There were two rounds of learning and testing, corresponding to the two pairs of S + s assigned to each subject. Learning of the second pair followed testing of the first and subjects were alerted between transitions from learning to testing and between rounds (Figure
In addition, subjects were given perceptual discrimination tasks, prior to and following the conditioning procedure (Figure
Each of the discrimination tasks comprised 60 trials. In the pre-conditioning task, the stimulus duration was set at 50 ms (i.e., between the masks). Following these 60 trials a binomial test was automatically performed to assess whether accuracy for the subject was significantly above chance—if so, another 60 discrimination trials were performed where the stimulus duration was set to 33 ms. The purpose here was to set the stimulus duration for all subsequent procedures throughout the experiment, for each subject. In practice, no subject was able to discriminate above chance with a 50 ms stimulus duration, in the pre-conditioning test.
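The duration-setting rule described above can be sketched as follows. This is a minimal illustration rather than the authors' Matlab code: the function names, the one-sided form of the test and the 0.05 threshold are assumptions made here for the example (the article states only that a binomial test against chance was run after 60 trials).

```python
from math import comb


def binomial_p_above_chance(n_correct: int, n_trials: int, p_chance: float = 0.5) -> float:
    """Exact one-sided binomial p-value, P(X >= n_correct), under chance responding."""
    return sum(
        comb(n_trials, k) * p_chance ** k * (1 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


def choose_stimulus_duration_ms(n_correct_at_50ms: int, n_trials: int = 60,
                                alpha: float = 0.05) -> int:
    """Return 33 if accuracy at 50 ms is above chance, otherwise keep 50.

    alpha = 0.05 and the one-sided test are assumptions of this sketch.
    """
    p_value = binomial_p_above_chance(n_correct_at_50ms, n_trials)
    return 33 if p_value < alpha else 50
```

For example, a subject scoring 30 of 60 correct would stay at 50 ms under this rule, whereas a subject scoring 45 of 60 would be retested at 33 ms.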
A short practice session of discrimination, learning and test trials was provided before the first discrimination task (utilizing additional characters that did not appear in any subsequent tasks). Subjects were debriefed at the end of testing regarding how well they thought they had done, if they thought they had learned anything and what they could describe about the stimuli.
On day 2, subjects returned to the testing room for the phase 2 learning procedure. These trials were identical to the learning trials on day 1, except that a “Go” response now led to a 50/50 win/loss outcome irrespective of the S+; that is, the S + s were rendered non-discriminatory with respect to their reward and punishment contingencies. A round of phase 2 learning comprised 60 trials (30 randomized presentations of each S+) and each subject underwent two rounds. The S + app and S + av pairs used in each round were the same S+ pairs that were used for that subject in acquisition rounds 1 and 2 on day 1. The reminder session (for the reconsolidation group) comprised 20 test trials (i.e., no feedback, but playing for money) lasting less than 2 min. Each S+ from day 1 was tested five times successively.
The same instructions used for the learning task on day 1 were provided for the phase 2 trials. It was not specified if the stimuli were the same or different to those used in the prior day's trials. Subjects assigned to the reconsolidation condition (the reconsolidation group) underwent a reminder session when they entered the room, waited 10 min and then proceeded to the phase 2 trials. The purpose of these reminders was to reactivate the learned associations from day 1 and hence open the hypothetical reconsolidation window. Reconsolidation subjects were told that there was a short test of what they had learned on the previous day and that they could still win/lose on those test trials. Control group subjects did not undergo reactivation and started the phase 2 trials following a 10 min waiting period upon entering the testing room (Figure
On day 3 subjects returned to the testing room where they were given three separate tasks. The first task comprised test trials that were identical to those on day 1 (as well as to the reminders on day 2 for the reconsolidation subjects). As on day 1, no feedback was provided but the subjects could still win or lose money with “Go” responses. There were two rounds (testing each subject-specific pair of S + s conditioned on day 1 and rendered non-discriminatory on day 2), each comprising 80 trials (40 randomized presentations of each S+).
The remaining tasks on day 3 were supraliminal in nature. The second task was a recognition test where each of the six stimuli—S + app, S + av and S− (two of each)—was presented individually for 3.5 s. The S − s were neutral (novel) stimuli determined randomly for each subject on day 1 from the initial set of six characters but had not been presented in any of the preceding tasks. In order to present an equal number of “seen” and “unseen” stimuli we also included two additional neutral stimuli at this stage. Subjects were instructed to press the space bar if they thought they had seen the symbol in any of the sessions on days 1–3. The order of presentation was randomized for each subject. As with the perceptual discrimination task, the recognition test served to control for the formation of conscious S+ representations during subliminal sessions.
The final task was a supraliminal preference task. Here, subjects were given the instruction “choose the symbol you prefer” and subsequently made 15 binary choices. These choices were all possible combinations, randomized, of each of the 6 stimuli: 4 × S + app vs. S + av; 4 × S + app vs. S−; 4 × S + av vs. S−; 1 × S + app vs. S + app; 1 × S + av vs. S + av; 1 × S− vs. S−. Both options were simultaneously presented on the screen, separated by a perpendicular line and remained until the choice was made. Subjects chose between the stimuli on the left or the right of the screen using the left and right shift keys and the choices were self-paced.
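The count of 15 choices follows from taking every unordered pair of the six stimuli. The sketch below, with hypothetical stimulus labels and a fixed seed chosen only for illustration, enumerates the pairs and tallies them by choice type.

```python
from collections import Counter
from itertools import combinations
import random

# Hypothetical labels: two appetitive S+, two aversive S+, two neutral S-.
STIMULI = [("app", 1), ("app", 2), ("av", 1), ("av", 2), ("neutral", 1), ("neutral", 2)]


def build_preference_trials(seed: int = 0):
    """Return all unordered stimulus pairs in a shuffled presentation order."""
    pairs = list(combinations(STIMULI, 2))
    random.Random(seed).shuffle(pairs)  # the seed is an assumption, for reproducibility
    return pairs


def count_choice_types(pairs):
    """Tally pairs by the (sorted) stimulus types they contain."""
    return Counter(tuple(sorted((a[0], b[0]))) for a, b in pairs)


trials = build_preference_trials()
type_counts = count_choice_types(trials)
```

Here `type_counts` holds four pairs for each between-type comparison and one pair for each within-type comparison, matching the breakdown given in the text.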
At the conclusion of testing subjects were told how much money they had won or lost over the 3 days. This amount was added to the 100-shekel payment for taking part and awarded to the subject.
All tasks were performed on a PC using the cogent toolbox for Matlab (
All statistical tests were two-tailed and performed using Matlab and the Statistics toolbox. Chi-square tests were performed by hand. All means are reported ± s.e.m. in the Results Section.
The percentage of correct instrumental responses was calculated by summing the number of “Go” responses following appetitive conditioned stimuli and the number of “No-Go” responses following aversive conditioned stimuli, and dividing by the total number of trials, for each subject individually. This was performed separately for learning and test trials on day 1, phase 2 learning trials on day 2 and test trials on day 3. To test whether performance differed from chance, these values were compared to a value equal to 50% of the number of trials, by means of one sample
Trial-by-trial responses were analyzed by summing the number of “Go” responses for each individual trial over all subjects and dividing by the number of subjects; that is, by computing the proportion of “Go” responses made by the group as a whole on each trial. Each S+ type (S + app and S + av) was analyzed separately in this manner for each of the instrumental tasks on days 1–3. This measure was calculated for each of the two rounds and then averaged. Additional analyses were performed for rounds 1 and 2 separately (day 1 tasks), and for each group separately. To assess the relationship between percentage “Go” responses and trial number, linear regressions were performed (with trial modeled as the predictor variable and percentage “Go” responses as the dependent variable) for each S + type, to obtain a measure of the slope (β) and the significance of the regression. Analyses of covariance (ANCOVAs) were performed to test for significant differences in the slopes and intercepts of the two regression lines (S + app vs. S + av), i.e., for main effects of trial and S+, as well as trial × S+ interactions. Since this method does not take into account inter-subject variability and trades this off for inter-trial variability, we performed an additional ANCOVA, this time entering each subject individually into the analysis but binning their responses into eight 5-trial blocks (for each S+ individually) and determining a % Go response for each block. This method overcomes the problem of calculating trial-by-trial percentages for binary data by sacrificing a little trial-by-trial variance. To test for group differences in trial-by-trial performance we performed ANCOVAs on linear regression lines modeling the differential percentage “Go” response to the S + app vs. S + av over trials (i.e., S + app minus S + av), for each group. In a stricter analysis we also compared each S+ specific regression line across the groups using ANCOVAs, i.e., the S + app vs. S + app regression lines and S + av vs. S + av regression lines in control and reconsolidation groups, overall, as well as for each round separately.
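The first steps of this analysis can be sketched in a few lines. The code below is an illustrative sketch, not the authors' Matlab implementation: the function names and the 0/1 coding of responses are assumptions, and it covers only the group proportion of “Go” responses per trial, the least-squares slope over trials, and the 5-trial binning. The significance tests and the ANCOVAs are not reproduced.

```python
def go_proportion_per_trial(responses):
    """responses[s][t] is 1 for "Go" and 0 for "No-Go" (assumed coding).

    Returns, for each trial, the proportion of subjects who responded "Go".
    """
    n_subjects = len(responses)
    n_trials = len(responses[0])
    return [sum(subject[t] for subject in responses) / n_subjects
            for t in range(n_trials)]


def ols_slope_intercept(y):
    """Least-squares fit of y on trial number 1..n; returns (slope, intercept)."""
    n = len(y)
    x = list(range(1, n + 1))
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def block_percent_go(subject_responses, block_size=5):
    """Percentage of "Go" responses in consecutive blocks for one subject and one S+."""
    return [100.0 * sum(subject_responses[i:i + block_size]) / block_size
            for i in range(0, len(subject_responses), block_size)]
```

With 40 presentations of each S+ per round, `block_percent_go` yields the eight 5-trial blocks described above; a positive slope for S + app together with a negative slope for S + av would correspond to the expected learning pattern.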
Responses in the discrimination task were classed as correct same, correct different, incorrect same and incorrect different. The number of correct “same” and correct “different” responses was summed for each subject. A binomial test was performed on this score to assess performance relative to chance. We analyzed the post-testing round in a similar manner. Any subject whose performance differed from chance was removed from the analyses. We also tested group performance for each group by comparing the subjects' scores with chance using a one sample
For each subject the number of recognition responses to the S + s and the novel stimuli were summed separately (each giving rise to a number between 0 and 4). These scores were compared using a Wilcoxon matched-pairs signed-rank test, for each group separately. We compared the difference scores of recognized S + s minus recognized neutral stimuli across groups using the Wilcoxon rank-sum test.
Choices from the preference task were grouped into three categories for each group: S + app vs. S−, S + av vs. S−, and S + app vs. S + av (four of each). Three choices were discarded from analyses (S + app vs. S + app, S + av vs. S + av, and S− vs. S−) since they provided no information on preferences between stimulus types. In an initial analysis we summed the number of choices of S + app, S + av and S− for each subject, over all the choices, to obtain an overall measure of preference. A repeated measures ANOVA was used to test for any differences in overall preferences to the stimuli. This was performed for each group separately, followed by
Analysis of the percentage correct instrumental responses (“Go” responses to S + app, “No-Go” responses to S + av) vs. chance over all subjects, trials and rounds, revealed a significant effect of acquisition on day 1 (mean = 51.40 ± 0.51%,
Note that this measure of performance combined learning from the S + app and S + av, essentially measuring the differential “Go” and “No-Go” responses to the stimuli. Analysis of percentage correct responses to each S+
In a more sensitive analysis of the acquisition blocks we looked at the trial-by-trial change in the percentage of subjects' “Go” responses to the S + app and S + av individually, modeling these with linear regressions (Figure
Successful instrumental conditioning was also evident from subsequent test sessions where subjects were still playing for money but no feedback was provided, both from the percentage correct responses vs. chance analysis (mean = 52.81 ± 1.07%,
Importantly, conditioning took place without explicit awareness. This was evident from introspective reports during debriefs on day 1. Most subjects were not certain/did not believe there were any differences in the stimuli they saw and could not accurately describe what they looked like—in keeping with previous accounts of this masking technique (Marcel,
Furthermore, in the supraliminal recognition task on day 3, comparison of correctly recognized S + s with incorrectly recognized (i.e., recognized novel) stimuli (S − s) revealed no significant difference in the control group (mean S+ = 57.14 ± 6.49%; mean novel = 55.95 ± 5.95%) and in the reconsolidation group (mean S+ = 57.89 ± 6.36%; mean novel = 48.68 ± 5.57%). There was no significant difference in the group recognized S+ minus recognized novel scores. Thus, these three evaluations suggest it is unlikely that subjects formed conscious representations of cue-outcome associations. Subject debriefs indicated that performance improved in the second round because subjects learned to better rely on their “gut feeling” or intuition, and realized that other strategies (for example focusing intently on one point of the screen, or trying to infer a (nonexistent) pattern of reinforcement) did not help. The ability to forego the tendency to try and explicitly unveil the stimuli and their associations with the outcomes—and instead make what seems like arbitrary button presses—was initially unnatural for participants. Had the subjects habituated to the masking and become aware of the stimuli, performance would have been dramatically greater than chance.
Analysis of instrumental performance during phase 2 learning (day 2) showed that conditioned instrumental responses were abolished, both in terms of percentage correct responses (in relation to the original contingencies; mean = 48.08 ± 1.51%) and trial-by-trial changes in percentage “Go” responses for each S+ (Figure
Having established acquisition of instrumental conditioning on day 1, we next examined the persistence of the (affective) stimulus-outcome associations that would have presumably been formed during acquisition (Rescorla and Solomon,
In order to examine in more detail what was driving the preferences we also analyzed each choice type individually, first focusing on S + app vs. S + av decisions (Figure
Finally, to determine whether the preferences in the control group were driven by an attraction to the S + app or an aversion to the S + av (or both) we focused on each of the S + vs. S− decisions (Figure
No evidence of conditioned instrumental responses was apparent in either group on day 3 test trials preceding the preference task. When scored according to the contingencies on day 1, the percentage correct responses in test trials on day 3 did not differ significantly from chance in either group (mean control = 51.28 ± 1.59%; mean reconsolidation = 49.54 ± 0.83%), nor did they differ between groups. In a trial-by-trial analysis of these trials, ANCOVAs revealed no significant difference in “Go” responses to the S + s in either group, nor were there any between-group differences. Lack of persistence (or recovery) of instrumental responses indicates that the conditioned preferences revealed in the preference task were driven by affective properties of the S + s (i.e., stimulus-outcome associations) and did not result from any instrumentally conditioned responding. In addition, the nature of responses in the two tasks (“Go/No-Go” vs. Left/Right) differed, thereby precluding this interpretation.
Recent advances in the understanding of emotionally driven learning have focused on the physiological or neurological responses evoked by conditioned stimuli. Conversely, the characterization of the higher order affective and motivational properties acquired by stimuli during conditioning has received less attention. Yet determining the characteristics of such associations is paramount, because they allow environmental stimuli to profoundly influence volitional behavior and decision-making by initiating desires and aversions, guiding action selection and controlling behavioral vigor (Rozin et al.,
The necessity of conscious awareness in human classical conditioning has been strongly debated (Lovibond and Shanks,
We caution that there is substantial debate relating to the complexity of subliminal processing of information, and the methods used to achieve it (Maxwell and Davidson,
The duration of subliminal instrumental (stimulus-action-outcome) learning has not previously been addressed—to our knowledge—but appears to fall somewhere in the middle, persisting beyond acquisition (apparent in test trials) but not to the same extent as the (stimulus-outcome based) preferences. Thus, even at the start of the non-differential conditioning trials on Day 2, behavior did not appear to be under the control of the instrumental associations formed on Day 1. This was also the case with non-consciously acquired skin conductance responses in Raio et al. (
Our results indicate that despite the strength of appetitive and aversive affective associations, they can also undergo reconsolidation. Existing human reconsolidation studies focus on classical fear conditioning, employing primary reinforcement (e.g., Schiller and Phelps,
These findings show that reconsolidation is a wider phenomenon than previously described, common to a number of forms of associative learning as well as learning driven by secondary reinforcement such as money, and can occur without awareness. Our results seem to counter a recent theory that new learning (or the generation of a prediction error) is required during reactivation in order to trigger reconsolidation (Pedreira et al.,
Although higher order incentive learning aids in the procurement of rewards and avoidance of punishment, it can sometimes go awry—aberrant affective salience of environmental cues is critical in the maintenance of disorders such as addiction, post-traumatic stress disorder and phobias. For instance, in addiction, cues and contexts associated with drug taking are attractive and can induce immense cravings, hijacking behavior to seek drugs and leading to relapse (Berridge,
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We thank Ben Seymour, Daniela Schiller, Rachel Ludmer and Aya Ben-Yakov for their valuable comments. This work was supported by the WIS-UK Joint Research Program (to Alex Pine and Yadin Dudai) and by the Centre of Research Excellence in the Cognitive Sciences (I-CORE) of the Planning and Grants Committee and Israeli Science Foundation (Grant 51/11) and the EP7 Human Brain Project (to Yadin Dudai).
The Supplementary Material for this article can be found online at: