Edited by: Ronald Weisman, Queen's University, Canada
Reviewed by: Rick Braaten, Colgate University, USA; Micheal Dent, University at Buffalo, The State University of New York, USA; Toshiya Matsushima, Hokkaido University, Japan
*Correspondence: Yoshimasa Seki, ERATO Okanoya Emotional Information Project, Japan Science and Technology Agency, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan e-mail:
This article was submitted to Frontiers in Comparative Psychology, a specialty of Frontiers in Psychology.
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.
The abilities of animals and humans to extract rules from sound sequences have previously been compared using observation of spontaneous responses and conditioning techniques. However, the results have been interpreted inconsistently across studies, possibly because of methodological and/or species differences. Therefore, we examined the strategies for discrimination of sound sequences in Bengalese finches and humans using the same protocol. Birds were trained on a GO/NOGO task to discriminate between two categories of sound stimuli generated based on an “AAB” or “ABB” rule. The sound elements used were taken from a variety of male (M) and female (F) calls, such that the sequences could be represented as MMF and MFF. In test sessions, FFM and FMM sequences, which were never presented in the training sessions but conformed to the rule, were presented as probe stimuli. The results suggested that two discriminative strategies were being applied: (1) memorizing sound patterns of either GO or NOGO stimuli and generating the appropriate responses for only those sounds; and (2) using the repeated element as a cue. There was no evidence that the birds successfully extracted the abstract rule (i.e., AAB and ABB); MMF-GO subjects did not produce a GO response for FFM and vice versa. Next, we examined whether those strategies were also applicable to human participants on the same task. The results and questionnaires revealed that participants extracted the abstract rule, and most of them employed it to discriminate the sequences. This strategy was never observed in bird subjects, although some participants used strategies similar to those of the birds when responding to the probe stimuli. Our results showed that the human participants applied the abstract rule in the task even without instruction but Bengalese finches did not, thereby reconfirming that humans have an ability to extract abstract rules from sound sequences that is distinct from that of non-human animals.
Abstract rule learning from sound sequences is thought to be an essential factor in human language acquisition. Therefore, comparing the ability of humans and non-human animals to extract abstract rules from stimulus sequences is an interesting topic from the viewpoint of human language evolution. To date, several researchers have reported such ability in animals and humans. Seven-month-old human infants can detect differences between sound sequences created from “AAB” (i.e., the first X is followed by the same X and then by a different Y) and “ABB” type rules, indicating that they can extract “algebra-like rules” (Marcus et al.,
In studies of small animals such as songbirds and rats, it is difficult to detect the visual attention of the subjects. Therefore, the discrimination tasks described above are appropriate methods for those studies. Meanwhile, training human infants on discrimination tasks with an operant method using reinforcement might not be appropriate; thus, the experimental methods used in previous infant studies are adequate for that purpose. However, two questions arise when those studies are used to compare the ability of humans and animals to extract rules from sound sequences: (1) how do the methods affect the results (i.e., spontaneous responses in a passive situation vs. conditioned behavior in active discrimination tasks)? (2) how do species differences affect the results (between primates and other species, or between zebra finches and other songbird species)? To address some aspects of those questions, we compared the discrimination strategies of Bengalese finches and humans presented with AAB and ABB sequences during an operant task. Hereafter, we use the term “rule-conforming” instead of “grammatical” and the term “non-rule-conforming” instead of “agrammatical” or “ungrammatical.”
Nine adult male birds (2–3 years old) were used; all birds were bred and maintained in our laboratory. Daily training and test sessions were conducted during the daytime (from forenoon to early afternoon). Food access of the birds was limited for 1–2 h around 4–6 pm but vitamin-enriched water and shell grit were available
A test cage (W15.5 × D30.3 × H22.0 cm) was placed in a sound attenuation chamber (W89 × D70 × H74 cm, Music Cabin, Japan) that was illuminated by LEDs. The front panel of the cage had two response keys consisting of acrylic panels that could be accessed through 10 mm diameter holes; the left one served as the observation key, and the right one served as the report key. The keys were illuminated with a red and a green color when activated for pecking responses. A feeder was placed on the cage and delivered grains into a dish located 5 cm below the response keys. A small light illuminated the dish for 2 s when food was delivered. A loudspeaker was placed above the cage to deliver sound stimuli. A personal computer controlled the execution of the experiment.
Distance calls of 15 adult male (M1–M15) and 15 adult female (F1–F15) Bengalese finches recorded in the sound attenuated box using Sound Analysis Pro (Tchernichovski et al.,
First, birds were trained to peck the observation key using an auto-shaping method. Then, a GO/NOGO task was introduced that required discrimination between AAB and ABB sequences. The observation key was activated to signal the beginning of each trial. Following a key peck, a GO or NOGO stimulus was presented in semi-random order. The report key was activated after presentation of the sound stimulus. Pecking the report key within 2 s after a GO stimulus resulted in food delivery (Hit). Otherwise, the bird did not obtain a reward for the trial (Miss). Meanwhile, pecking the report key within 2 s after a NOGO stimulus resulted in a 16-s blackout as punishment (False Alarm; FA). Otherwise, the bird proceeded to the next trial without consequence (Correct Rejection; CR). Correction trials were applied after unsuccessful trials (FA or Miss) until the bird gave a correct response. The inter-trial interval was 4 s. Each training session concluded when 60 trials had been completed (30 each for GO and NOGO trials) or 40 min after the session started.
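The four trial outcomes described above can be summarized in a short sketch. This is an illustration only, not the software used in the study: the names `Outcome`, `classify_trial`, and `needs_correction_trial` are hypothetical, and treating a peck at exactly 2 s as inside the response window is an assumption.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    HIT = "Hit"                              # GO stimulus, report key pecked in time
    MISS = "Miss"                            # GO stimulus, no peck in time
    FALSE_ALARM = "False Alarm"              # NOGO stimulus, report key pecked in time
    CORRECT_REJECTION = "Correct Rejection"  # NOGO stimulus, no peck in time


RESPONSE_WINDOW_S = 2.0  # response window stated in the text
BLACKOUT_S = 16.0        # blackout duration after a False Alarm, stated in the text


def classify_trial(is_go_stimulus: bool, peck_latency_s: Optional[float]) -> Outcome:
    """Classify one trial; peck_latency_s is None when the bird did not peck.

    Assumption: a peck at exactly RESPONSE_WINDOW_S still counts as a response.
    """
    responded = peck_latency_s is not None and peck_latency_s <= RESPONSE_WINDOW_S
    if is_go_stimulus:
        return Outcome.HIT if responded else Outcome.MISS
    return Outcome.FALSE_ALARM if responded else Outcome.CORRECT_REJECTION


def needs_correction_trial(outcome: Outcome) -> bool:
    """Correction trials follow unsuccessful trials (False Alarm or Miss)."""
    return outcome in (Outcome.MISS, Outcome.FALSE_ALARM)
```

For example, `classify_trial(False, 1.2)` yields a False Alarm, which in the procedure above would be followed by the 16-s blackout and a correction trial.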
The stimulus combination was counterbalanced across the birds (GO: MMF,
Every bird had two types of test session: (1) a transfer test and (2) a rule-generalization test. For the transfer test, novel sounds were selected from males and females that were not used for the training stimuli for each subject. Then five sound combinations were created in the same manner as the training stimuli and used as probe stimuli. For the rule-generalization test, five novel sounds from male and female calls were selected as for the transfer test; however, the positions of the M and F sounds in the sequence were switched (i.e., if a bird was trained with GO-MMF and NOGO-MFF, then FFM (rule-conforming) and FMM (non-rule-conforming) were used as the probe stimuli). During a probe session, 12 probe trials were randomly interspersed among 48 normal training trials. Birds completed five test sessions, during which a unique probe stimulus was presented for both the first and the second tests. Thus, there were 60 responses for probe trials from each bird on each test. The responses for all probe trials were neither reinforced nor punished. During the task, if a subject had not responded to the observation key for 300 s, a grain was delivered automatically to stimulate the subject.
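The composition of a probe session, and the resulting total of 60 probe responses per test, can be sketched as follows. The function names are hypothetical, and a uniform shuffle is only an assumption about how the probes were "randomly interspersed"; the text does not state whether additional constraints (e.g., no consecutive probes) were applied.

```python
import random

N_TRAINING_TRIALS = 48  # normal training trials per probe session (from the text)
N_PROBE_TRIALS = 12     # probe trials per probe session (from the text)
N_TEST_SESSIONS = 5     # test sessions per test (from the text)


def build_probe_session(seed=None):
    """Return a trial list with probes placed at random positions.

    Assumption: an unconstrained uniform shuffle of all 60 trials.
    """
    rng = random.Random(seed)
    trials = ["training"] * N_TRAINING_TRIALS + ["probe"] * N_PROBE_TRIALS
    rng.shuffle(trials)
    return trials


def total_probe_trials(sessions):
    """Count probe trials across a list of sessions."""
    return sum(session.count("probe") for session in sessions)


sessions = [build_probe_session(seed=i) for i in range(N_TEST_SESSIONS)]
```

With these constants, each session contains 60 trials, and five sessions contain 5 × 12 = 60 probe trials, matching the count given in the text.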
Eleven males and 5 females (18–29 years old) participated in this experiment. Five males and 3 females were assigned to the human voice (HV) experiment, and the remaining 6 males and 2 females were assigned to the bird vocalization (BV) experiment. Experimental procedures were approved by the ethical committee for experimental research involving human subjects at the Graduate School of Arts and Sciences of The University of Tokyo.
The monitor and speakers of a laptop computer were used for stimulus presentation. The response key was either the space key or the enter key of the laptop. The sound level was adjusted to a comfortable listening level for each participant.
The stimuli of HV were vocal sounds of “
The basic procedure was almost the same as in Experiment 1, although all data collection was done in a single session. The flow of the experiment was as follows: a green square appeared on the monitor at the beginning of each trial. The square disappeared following a key press, and a sound stimulus was played back immediately. Then, a red square appeared after delivery of the sound stimulus. While the red square was presented, the participants were given 800 ms to decide whether or not to press the key. Then, a feedback sign (either a circle, which signifies “correct” in Japan, or a cross, which signifies “incorrect”) was displayed on the monitor. A key press before presentation of the red square triggered a warning sign, the response was treated as null and void, and the trial was repeated. Correction trials were introduced in the same way as in Experiment 1. The inter-trial interval (ITI) was 400 ms.
When the correct response rate reached 80% in the last 50 trials, the test period began immediately without notice, and probe trials were randomly interspersed among the normal training trials with a probability of 10%. In this experiment, the “transfer test” was omitted. The total number of probe trials was 60 (30 probe ABB and 30 probe AAB), the same as in the birds' experiment. No feedback was given for probe trials.
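The training criterion for the human participants amounts to a sliding-window check, sketched below. The class name `CriterionTracker` is hypothetical, and two points are assumptions rather than statements from the text: that "reached 80%" means at least 80%, and that the window must contain a full 50 trials before the criterion can be met. How correction trials were counted is not specified and is left out of the sketch.

```python
from collections import deque


class CriterionTracker:
    """Track correctness over the most recent trials (sliding window)."""

    def __init__(self, window=50, threshold_pct=80):
        self.window = window
        self.threshold_pct = threshold_pct
        self.recent = deque(maxlen=window)  # oldest trial drops out automatically

    def record(self, correct):
        """Store one trial result and report whether the criterion is now met."""
        self.recent.append(bool(correct))
        return self.met()

    def met(self):
        # Assumption: the window must be full, and "reached 80%" means >= 80%.
        if len(self.recent) < self.window:
            return False
        # Integer arithmetic avoids floating-point comparison issues.
        return sum(self.recent) * 100 >= self.threshold_pct * self.window
```

Under these assumptions, a participant who answers every trial correctly would meet the criterion on trial 50 at the earliest.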
The participants were instructed that they needed to achieve as high a percentage of correct responses as possible to finish the experiment quickly. Because they were notified that the reward was fixed (1000 JPY) and depended on neither the duration nor the number of trials, we can assume that the participants tried to finish the task as soon as possible, so that a positive sign (i.e., a circle) should have served as a reward for them. The experimenter never told them anything related to rules or sequences of the sound stimuli. Following the experiment, the participants answered a short questionnaire, which asked them (1) whether they noticed any rules for the sound sequences, and (2) how they discriminated the stimuli.
All birds met the criterion in 33–227 sessions (103.9 ± 23.2, mean ± SE). No significant effect of sound sequence combination on the number of sessions was found between the GO-AAB/NOGO-ABB and GO-ABB/NOGO-AAB groups (98.4 ± 35.6 and 110.8 ± 32.9 sessions, respectively;
In the transfer test, the number of GO responses by all subjects was greater for the rule-conforming probe stimuli than the non-rule-conforming probe stimuli. Analysis of the pooled data revealed a significant difference in the rate of GO response between those probe types (
The response patterns of four of the seven birds (Type-1 and Type-2) on the “rule-generalization test” clearly indicated that these birds did not generalize the abstract rule regarding sequence structure (i.e., from FFM (MMF) and FMM (MFF) to AAB and ABB) in their responses to probe stimuli. Instead, three of them (birds #1, #4, and #8) memorized a particular acoustic pattern or global sound structure (i.e., a combination of three call sounds, such as MMF or MFF) as the cue for a GO response. Consequently, those birds produced NOGO responses to any other stimulus, including the probes. Conversely, bird #12 memorized the NOGO patterns and produced GO responses to any other stimulus. In contrast, the three birds that exhibited a Type-3 pattern seemed to use a “rule” in their responses to probe stimuli. A possible interpretation of these results is that the birds used repeated sounds (i.e., MM or FF) as a discriminative cue, independent of their position in the sequence (i.e., in the first half or the latter half). This interpretation seems consistent with the findings of van Heijningen et al. (
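To make the contrast between these strategies concrete, the following sketch lists the probe responses each strategy would predict for a bird trained with GO-MMF and NOGO-MFF. It is an idealization under stated assumptions: sequences are reduced to sex-category strings (the actual probes used novel calls), responses are deterministic, and the strategy names (`memorize_go`, `memorize_nogo`, `repetition`, `abstract_rule`) are hypothetical labels rather than terms from the study.

```python
def structure(seq):
    """Return the abstract structure of a three-element sequence."""
    a, b, c = seq
    if a == b and b != c:
        return "AAB"
    if a != b and b == c:
        return "ABB"
    return "other"


def repeated_element(seq):
    """Return the element that occurs twice in a row, or None."""
    a, b, c = seq
    if a == b:
        return a
    if b == c:
        return b
    return None


def predict(strategy, seq, go_train="MMF", nogo_train="MFF"):
    """Predicted response ("GO" or "NOGO") to seq under an idealized strategy."""
    if strategy == "memorize_go":      # GO only to the memorized GO pattern
        return "GO" if seq == go_train else "NOGO"
    if strategy == "memorize_nogo":    # NOGO only to the memorized NOGO pattern
        return "NOGO" if seq == nogo_train else "GO"
    if strategy == "repetition":       # repeated element as cue, position ignored
        same = repeated_element(seq) == repeated_element(go_train)
        return "GO" if same else "NOGO"
    if strategy == "abstract_rule":    # AAB vs. ABB structure
        return "GO" if structure(seq) == structure(go_train) else "NOGO"
    raise ValueError(f"unknown strategy: {strategy}")


STRATEGIES = ("memorize_go", "memorize_nogo", "repetition", "abstract_rule")
PROBES = ("FFM", "FMM")  # rule-conforming and non-rule-conforming probes
predictions = {s: {p: predict(s, p) for p in PROBES} for s in STRATEGIES}
```

All four strategies give correct responses to the training stimuli, so they can only be told apart by the probes: only the abstract rule predicts GO to FFM and NOGO to FMM, whereas the repetition cue predicts the reverse, and the two memorization strategies predict the same response to both probes.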
Interestingly, the birds that adopted the simple rule (Type-3 birds) required more sessions (#2, 227 sessions; #5, 203; #13, 113) to learn the task than the birds that memorized a particular sound pattern (Type-1 and 2 birds, 33–123 sessions;
The next question is whether the birds' strategies are also applicable to humans in the same operant task. The answer to this question could indicate more strongly whether the birds really lacked the ability to discriminate the abstract rules or merely did not do so in the task.
All participants quickly reached the discriminative criterion (HV: 52.1 ± 0.78 trials; BV: 63.3 ± 2.57 trials; mean ± S.E.), as expected. Although the number of trials needed to meet the criterion was greater in the BV condition than in the HV condition, the difference was only marginally significant (
One participant (#12, HV: MMF-GO) had a greater GO response rate for the rule-conforming than the non-rule-conforming probe stimuli (
Interestingly, the response patterns of two participants were similar to the birds' responses. The response pattern of participant #22, (HV: FMM-GO, Figure
The remaining participants, #10 (HV: MFF-GO) and #17 (BV: FMM-GO), responded unsystematically to the probe stimuli (no bias toward either GO or NOGO; #10 rule-conforming
As expected, the results for most participants revealed that they used an abstract rule (i.e., AAB and ABB) that was acquired during the training trials for responding to probe stimuli. Although the reports of participants #9 and #13 did not indicate they explicitly found the algebraic rule, the probe tests showed they implicitly learned the rule. This finding supports previous results that humans innately extract rules from sound sequences in passive situations (Saffran et al.,
The results clearly showed that the response pattern to the probe stimuli generally differed between Bengalese finches and humans. Although all patterns exhibited by the birds were also demonstrated by several human participants, in response to probe stimuli the birds never used the “global sequence rule” that was used by most human participants (Table
Birds | | – | 3/7 | 3/7 | 1/7 | – |
Humans | (HV) | 6/8 | 1/8 | – | – | 1/8 |
| (BV) | 6/8 | – | – | 1/8 | 1/8 |
Why did birds fail to use the rule? The most likely interpretation is that the birds could not extract the sequence rule, which seems to agree with van Heijningen et al. (
Another interpretation of why the birds did not use the abstract rule is that, for the birds, using the rule might simply be more difficult than using the other strategies, such as Type-1, Type-2, and Type-3 in Experiment 1 (note that even the simple rule shown by the Type-3 birds required more sessions to learn than the Type-1 and Type-2 strategies). It might be easier to memorize the acoustic pattern of the sequences than to extract rules from the sequences, especially because we used conspecific calls as the stimuli. Thus, the acoustic patterns might have been easy for the birds to memorize, which may have reinforced such a trend. In other words, the birds may merely not have applied the rule in this task but could do so when required (e.g., in a more difficult task using human vocalizations, although such training could be more difficult for birds). Interestingly, human participant #19 did not apply the rule when responding to the probe stimuli, although he reported finding the abstract rule on the questionnaire. These findings suggest that a bird-like response does not always mean that the subject could not learn the algebraic rule from the sound sequences, although they do not allow us to conclude that the same thing happened in the finches.
In the present experiment, we presented the bird subjects with neither reinforcement nor punishment for responses to the probe stimuli. A training session with a high percentage (>85%) of correct trials was inserted between each pair of probe sessions to prevent the birds from learning the consequences of responding to the probe stimuli. However, it is possible that the birds nevertheless learned them during the generalization test. If the birds had learned the outcome of responses to the probe stimuli, they would have shown a particular trend in GO responses to the probe stimuli as the sessions progressed. However, the patterns of GO responses to the probe stimuli did not show a systematic change across sessions (Figure
There is no doubt that the conditioning technique is one of the strongest tools for exploring sound sequence processing in animals, especially birds, because it is difficult to track birds' eye movements for monitoring their attention to loudspeakers as done in primate studies. For example, one study recently suggested the sound sequence generator (i.e., the song nervous system) might be related to learning and production of ruled sequences using an operant method (Yamazaki et al.,
Another approach to this issue might be electrophysiological recordings of neural activity as has been examined in humans (e.g., Sun et al.,
In conclusion, our findings suggest that the cognitive process of rule extraction from sound sequences differs between songbirds and humans, although testing with the present operant method did not provide conclusive evidence that the birds lack the ability demonstrated by the humans. Although songbirds are unique as experimental and operational animal models of the evolution of language, we believe the data show that humans have a stronger tendency to extract sequential rules; thus, this skill should be easier for the cognitive system of humans to perform than for that of songbirds. This disposition is one of the peculiarities that enable humans to manipulate language.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We thank Mr. Masayuki Inada for supporting the bird experiment. This work was supported by JSPS KAKENHI to Kazuo Okanoya (Grant Number 23240033).