Overestimation of the second time interval replaces time-shrinking when the difference between two adjacent time intervals increases

Nakajima, Yoshitaka; Hasuo, Emi; Yamashita, Miki; Haraguchi, Yuki

doi:10.3389/fnhum.2014.00281

ORIGINAL RESEARCH article

Front. Hum. Neurosci., 14 May 2014

Sec. Cognitive Neuroscience

Volume 8 - 2014 | https://doi.org/10.3389/fnhum.2014.00281

This article is part of the Research Topic: Advances in Modern Mental Chronometry

Yoshitaka Nakajima¹*

Emi Hasuo²

Miki Yamashita³

Yuki Haraguchi⁴

¹Department of Human Science, Research Center for Applied Perceptual Science, Kyushu University, Fukuoka, Japan
²Japan Society for the Promotion of Science/Neurological Institute, Kyushu University, Fukuoka, Japan
³Kyushu Institute of Design, Fukuoka, Japan
⁴Department of Acoustic Design, Kyushu University, Fukuoka, Japan

When the onsets of three successive sound bursts mark two adjacent time intervals, the second time interval can be underestimated when it is physically longer than the first time interval by up to 100 ms. This illusion, time-shrinking, is very stable when the first time interval is 200 ms or shorter (Nakajima et al., 2004, Perception, 33). Time-shrinking had been considered a kind of perceptual assimilation to make the first and the second time interval more similar to each other. Here we investigated whether the underestimation of the second time interval was replaced by an overestimation if the physical difference between the neighboring time intervals was too large for the assimilation to take place; this was a typical situation in which a perceptual contrast could be expected. Three experiments to measure the overestimation/underestimation of the second time interval by the method of adjustment were conducted. The first time interval was varied from 40 to 280 ms, and such overestimations indeed took place when the first time interval was 80–280 ms. The overestimations were robust when the second time interval was longer than the first time interval by 240 ms or more, and the magnitude of the overestimation was larger than 100 ms in some conditions. Thus, a perceptual contrast to replace time-shrinking was established. An additional experiment indicated that this contrast did not affect the perception of the first time interval substantially: The contrast in the present conditions seemed unilateral.

Introduction

When the onsets of three successive sound bursts mark two neighboring time intervals, the second time interval can be underestimated when it is longer than the first time interval by up to 100 ms. This underestimation, i.e., time-shrinking, is very stable when the first time interval is 200 ms or shorter (Nakajima et al., 1991, 2004), and has been considered a kind of perceptual assimilation. Assimilation and contrast in perceptual paradigms often replace each other when the relationship and configuration of stimuli are changed systematically (e.g., Helson, 1963; Morinaga and Noguchi, 1966).

Assimilation and contrast may not necessarily be governed by a single perceptual mechanism, but they are likely to work under one perceptual principle for humans and animals to process information from the environment efficiently and quickly. For example, a figure in which luminance is sufficiently higher than in the background can be distinguished clearly from the background in the visual modality. This process is enhanced by contrast, which enlarges the perceptual difference in terms of lightness or color between the figure and the background, as well as by assimilation, which homogenizes the lightness or color within the figure and within the background (Koffka, 1935; Shapley and Reid, 1985). It is also argued that, when two potential objects are separated enough spatially from each other (but within a distance to keep a mutual interaction), they are likely to be organized as two separate wholes which are then contrasted (King, 1988). It is widely observed that perceptual assimilation between objects gives way to contrast when the difference between these objects is increased, and that assimilation can be blocked if the area or the group to be assimilated is broken by a boundary (or boundaries; e.g., Koffka, 1935; Hamburger, 2005), or by a temporal distance (Ikeda and Obonai, 1955). In Ikeda and Obonai’s (1955) experiment, concentric circles with different diameters I and T were presented simultaneously for 500 ms using a tachistoscope. The diameter of T, whose size was to be judged, was fixed at 30 mm. When the physical size of I was similar to that of T, assimilation took place, but contrast took over when the physical size difference was larger (Table 1). The fact that assimilation and contrast can both take place in the same experimental context is described systematically by Helson (1964). One should note that temporal configurations of stimuli can also lead to an assimilation or contrast of the stimuli (Shigeno, 1991; see also McKenna, 1984). In our study, assimilation and contrast were manipulated through modifying the temporal configuration of the sound bursts.

TABLE 1. Underestimation and overestimation of the size of a circle, T = 30 mm, caused by another concentric circle, I, as observed by Ikeda and Obonai (1955).

When the difference between close but distinguishable objects or events is small, the objects will be seen as part of a homogeneous group. If the difference cannot be neglected, the objects or events will instead be perceived in different categories. This is the case particularly for the human auditory modality, which is responsible for quick and complicated communication sometimes in noisy environments without favorable acoustics.

Linguistic communication depends on the human capacity to process strings of categorized elements in time. This requires that any pair of sounds or sound patterns should be clearly either the same or different (de Saussure, 1966); assimilation and contrast must work for the listener to decode speech signals properly (e.g., Shigeno, 1991). Temporal aspects of auditory perception are also very likely to work in the same manner. Relative lengths of syllables are categorized in many languages; it is often important for the listener to judge, without hesitation, whether one of two neighboring syllables is longer or shorter than the other. When time intervals are presented in concatenation, listeners often simplify the patterns by reducing small differences and exaggerating larger differences (e.g., Fraisse, 1978, 1982; Povel, 1981). A ratio of 1:2 or 2:1 seems perceptually stable, which means that the second time interval is likely to be overestimated when the neighboring time intervals would otherwise be perceived as being in a ratio of 1:1.7 or 1:1.8. We were interested in whether the extremely stable illusion of time-shrinking, a unilateral assimilation of a time interval to a preceding time interval or preceding time intervals, could be grasped in relation to such opposite perceptual processes. We thus examined whether a time interval was contrasted, instead of assimilated, to a preceding time interval at a certain point when the difference between these adjacent time intervals was increased step by step. When two adjacent empty time intervals t_P and t_S were presented in this order in our previous research, the same t_P may have caused both underestimation and overestimation of t_S depending on the physical difference between t_P and t_S. Nakajima et al.’s (2004) experiments suggested that this possibility is systematic. Table 2 indicates the cases in which both underestimation and overestimation reached 20 ms for a fixed t_P value.

TABLE 2. Temporal patterns in which time-shrinking was replaced by overestimation in Nakajima et al. (2004).

The present paradigm thus became clear. Time-shrinking typically takes place when two time intervals, t_P and t_S in this order, marked by the onsets of three successive sound bursts meet the following conditions: 0 < t_S - t_P ≤ 100 ms, and t_P ≤ 200 ms. It had been indicated already that overestimation of t_S to exaggerate the difference between t_P and t_S could take place when the physical difference between the neighboring time intervals, t_S - t_P, exceeded the above range (Nakajima et al., 2004). This problem had never been taken up systematically. In order to reveal the mechanism of rhythmic organization, however, it seemed of crucial importance to examine whether a systematic overestimation of t_S would replace the underestimation, which we call time-shrinking, if we increased the difference t_S - t_P.
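The stimulus range in which time-shrinking is typically expected can be written as a simple predicate. The following is a minimal sketch, not part of the original study: the function name is hypothetical, the upper bound on t_S - t_P is passed in explicitly (the usage lines take the 100 ms figure stated in the abstract), and the 200 ms limit on t_P is the one given in the text.

```python
def is_typical_time_shrinking(t_p_ms, t_s_ms, max_diff_ms, max_t_p_ms=200.0):
    """Return True when (t_P, t_S) lies in the range where time-shrinking
    is typically reported: 0 < t_S - t_P <= max_diff_ms and t_P <= max_t_p_ms.

    Hypothetical helper for illustration only; it describes the stimulus
    range, not the size of the perceptual effect.
    """
    diff_ms = t_s_ms - t_p_ms
    return 0 < diff_ms <= max_diff_ms and t_p_ms <= max_t_p_ms


# Example patterns (t_P, t_S) in ms; the bound of 100 ms follows the abstract.
for t_p, t_s in [(160, 200), (160, 240), (160, 480), (280, 320)]:
    print(t_p, t_s, is_typical_time_shrinking(t_p, t_s, max_diff_ms=100))
```

Patterns that fall outside this range, such as t_P = 160 ms with t_S = 480 ms, are the ones for which the present study asks whether overestimation replaces underestimation.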

General Methods

The general framework common to the present experiments is described in Figure 1. In the first three experiments, we basically followed the paradigm employed in previous studies on time-shrinking (e.g., Nakajima et al., 2004), except that we increased the range of the standard duration to be judged. In the control condition, a time interval, t_S, marked by the onsets of two successive tone bursts was the standard to be judged. An additional tone burst preceded t_S in the experimental condition; the effect of the preceding time interval, t_P, marked by the onsets of this additional tone burst and the first marker of t_S was studied. The difference in subjective duration of t_S between the control and the experimental condition was measured.

FIGURE 1. Time charts of stimulus patterns. The rectangles represent sounds. In the experiments, participants adjusted t_C to make its subjective duration equal to that of t_S. In the experimental conditions of Experiments 1–3, t_P was added before t_S. In the experimental condition of Experiment 4, t_SUC was added after t_S. Note that all time intervals (t_S, t_P, t_SUC, and t_C) refer to the duration between the onsets of successive sounds.

In the last experiment, Experiment 4, a tone burst did not precede but succeeded t_S, and the effect of the succeeding time interval, t_SUC, marked by the onsets of the second marker of t_S and this additional tone burst was examined in order to interpret the results of the first three experiments. This was the experimental condition, and no control condition was employed because the data of the control condition in Experiment 3 could be reused.

The method of adjustment was employed. The participant initiated each presentation by clicking a pane on the computer screen. A few seconds after the click (the interval was chosen randomly within a range), the first tone burst of the standard pattern, t_S, t_P|t_S, or t_S|t_SUC, was presented. After a further period of a few seconds (the interval was again chosen randomly), another time interval, the comparison, t_C, was presented, marked by the onsets of two successive tone bursts. The task of the participant was to adjust t_C to make it equal to t_S in subjective duration. The participant could change t_C by operating a screen interface, designed in a way not to give a visual hint about the present duration, and the minimum step of the adjustment was 1 ms. The participant was allowed to listen to the whole sequence as many times as he/she needed until t_S and t_C were perceived as equal, and finished the trial when satisfied. The last t_C value was recorded as the point of subjective equality, PSE.

Experiment 1

This experiment was conducted in 1996. Because we did not have an institutional ethical committee for psychological experiments at that time, an internal ethical review was impossible, but the experiment was a part of a research project reviewed by a governmental committee to select projects to be funded (as in the acknowledgments). This experiment is included in the present report because this was the first case in which the perceptual phenomenon we are going to describe appeared systematically. Our original purpose had been to determine the stimulus conditions to investigate the effect of sound marker duration on the occurrence of time-shrinking (underestimation), for there was a possibility that the amount of time-shrinking may be reduced, or the time condition for maximum time-shrinking could be shifted, by lengthening the markers (see Hasuo et al., 2011). From the present viewpoint, however, the experimental data gave us insight into the possibility of systematic overestimation of the second of two adjacent time intervals. The same t_S values were employed with a t_P in the experimental condition and in isolation in the control condition. The PSEs in these conditions were compared to see the amount of perceptual overestimation or underestimation of t_S caused by t_P.

Methods

Participants

The participants were five students, i.e., three males and two females, of the Kyushu Institute of Design (the predecessor of the Faculty of Design, Kyushu University). They had received education for acoustic design, including basic training in music performance. They were 20–24 years old, and had normal hearing.

Materials

Duration markers were pure tone bursts of 1000 Hz and 12, 63, or 123 ms with a rise and a fall time of ~2 ms each. These values were inexact due to our use of an analog filter to shape the waveform; the inexactness was sufficiently small relative to the effect we were measuring. The tone bursts of different durations were approximately equal in loudness when presented separately. This was realized by conducting preliminary measurements in which the participant could listen to any of the three sounds by clicking corresponding buttons on the computer screen. The stimulus sound was always presented 200 ms after the button was clicked. The level of the 12-ms burst, which was very short, was fixed at 97 dBA, defined as the level of a continuous tone of the same amplitude measured with an artificial ear (Brüel and Kjær 4153), a microphone (Brüel and Kjær 4134), and a sound level meter (Brüel and Kjær 2209). The levels of the other sounds were adjustable, and the participant was instructed to equalize the three sounds in terms of loudness. In each trial, the adjusted levels of the 63-ms and 123-ms bursts were recorded. The participant performed eight trials, and the median value for each sound was employed as the presentation level in the main part of the experiment. The presentation levels were 87–94 dBA for the 63-ms burst, and 85–93 dBA for the 123-ms burst.

The pure tones were first generated as rectangular pulse series before being band-pass filtered between 850 and 1250 Hz (NF DV-6BW). This resulted in tone bursts with rise and fall times of ~2 ms. The tone bursts were presented to the left ear of the participant through an amplifier (JVC AX-Z511) and headphones (AKG K141) in a soundproof room. The experimental procedure including stimulus generation was controlled by a quiet computer without a hard disk drive or a fan (Commodore Amiga 500).

In the main part of the experiment, the marker duration was fixed in each standard pattern, which was marked by two or three successive tone bursts, and the comparison time interval was always marked by two 12-ms tone bursts. In the standard patterns of the experimental condition, t_P|t_S, the preceding time interval, t_P, was fixed at 160 ms. Both in the control and in the experimental condition, the standard time interval, t_S, was varied from 120 to 480 ms in steps of 40 ms. The t_S duration of 120 ms was not possible when the marker duration was longer, i.e., 123 ms; this condition was omitted. Thus, there were 58 stimulus patterns: 2 (control/experimental) × [2 (marker durations ≤ 63 ms) × 10 (t_S durations) + 1 (marker duration = 123 ms) × 9 (t_S durations)]. The standard pattern was presented 2300–2500 ms after the participant clicked a button on the screen. There was a silence of 2700–3300 ms between the offset of the last sound marker of t_S and the onset of the first sound marker of t_C.
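The count of 58 stimulus patterns can be checked by enumerating the design. This is a minimal sketch under stated assumptions: the constant and function names are hypothetical, and the ten t_S values are generated as 40-ms steps starting at 120 ms, as the stated count of ten t_S durations implies.

```python
from itertools import product

T_P_MS = 160                                      # fixed preceding interval
T_S_VALUES_MS = [120 + 40 * i for i in range(10)]  # ten values in 40-ms steps
MARKER_DURATIONS_MS = [12, 63, 123]
CONDITIONS = ("control", "experimental")


def stimulus_patterns():
    """Enumerate (condition, marker duration, t_P, t_S) tuples for Experiment 1."""
    patterns = []
    for condition, marker_ms, t_s_ms in product(
        CONDITIONS, MARKER_DURATIONS_MS, T_S_VALUES_MS
    ):
        # A 123-ms marker does not fit into a 120-ms inter-onset interval,
        # so this combination was omitted from the design.
        if marker_ms == 123 and t_s_ms == 120:
            continue
        # t_P exists only in the experimental condition.
        t_p_ms = T_P_MS if condition == "experimental" else None
        patterns.append((condition, marker_ms, t_p_ms, t_s_ms))
    return patterns


print(len(stimulus_patterns()))  # 58
```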

Procedure

The participant performed four adjustment trials for each stimulus pattern: two in ascending series and two in descending series, i.e., two replications of each series. One replication comprised the first half, and the other the second half, of the whole measurement. Each replication (= half) consisted of 116 trials, 58 (stimulus patterns) × 2 (series) in random order, and was divided into 9 blocks of 12 or 13 measurement trials, each preceded by two warm-up trials. Preceding the measurement, the participant performed 58 training trials, divided into four blocks; each stimulus pattern appeared once. Thus, the whole experiment consisted of 22 blocks: 4 (training blocks) + 2 (replications) × 9 (measurement blocks). Each block took around 15–20 min, and the whole experiment was carried out over a period of 8 days for each participant.
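The trial and block counts in this procedure follow from simple arithmetic, sketched below. The variable names are hypothetical, and the order in which the 13-trial and 12-trial blocks occurred is not stated in the text, so the ordering in `block_sizes` is an assumption.

```python
N_PATTERNS = 58
N_SERIES = 2                 # ascending and descending
N_REPLICATIONS = 2
BLOCKS_PER_REPLICATION = 9
TRAINING_BLOCKS = 4

# Trials in one replication (= one half of the measurement).
trials_per_replication = N_PATTERNS * N_SERIES

# Spread the trials over 9 blocks as evenly as possible.
base, extra = divmod(trials_per_replication, BLOCKS_PER_REPLICATION)
block_sizes = [base + 1] * extra + [base] * (BLOCKS_PER_REPLICATION - extra)

total_blocks = TRAINING_BLOCKS + N_REPLICATIONS * BLOCKS_PER_REPLICATION
trials_per_pattern = N_SERIES * N_REPLICATIONS

print(trials_per_replication)  # 116
print(block_sizes)             # eight blocks of 13 trials and one of 12
print(total_blocks)            # 22
print(trials_per_pattern)      # 4
```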

Results and Discussion

We performed a three-way [marker duration × condition (experimental/control) × t_S duration] ANOVA utilizing the PSEs for t_S = 160–480 ms. Since it is commonplace that PSEs change as t_S changes, we will not detail the main effect of this factor either here or in the following experiments; its main effect was always significant (p < 0.001). The main effect of marker duration was significant, F(2,8) = 21.902, p < 0.01, $\eta_{p}^{2}$ = 0.846. Ryan’s post hoc test showed that the differences between all pairs of marker durations, i.e., 12 and 123 ms, 63 and 123 ms, and 12 and 63 ms, were significant (p < 0.05). The interaction between condition (experimental/control) and t_S duration was also significant, F(8,32) = 4.614, p < 0.01, $\eta_{p}^{2}$ = 0.536. This interaction should be related to the assimilation and contrast of t_S to t_P. The main effect of condition (experimental/control) and the other interactions were not significant (p > 0.05).
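The reported effect sizes can be cross-checked against the F values, because partial eta squared is recoverable from F and its degrees of freedom as F·df_effect / (F·df_effect + df_error). The sketch below only verifies this arithmetic; the function name is hypothetical and no raw data are involved.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of marker duration: F(2,8) = 21.902
print(round(partial_eta_squared(21.902, 2, 8), 3))   # 0.846

# Condition x t_S duration interaction: F(8,32) = 4.614
print(round(partial_eta_squared(4.614, 8, 32), 3))   # 0.536
```

Both values agree with those reported above. The degrees of freedom are also consistent with the design: three marker durations give df = 2, nine t_S levels (160–480 ms) give df = 8, and five participants give error terms of 2 × 4 = 8 and 8 × 4 = 32.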

The PSEs in the control condition were very close to the physical values of t_S (Figure 2). Slight deviations appeared systematically, however: PSEs of shorter duration tended to be longer than the physical values of t_S. This kind of time error sometimes appears in the literature on time perception (Woodrow, 1951; Eisler et al., 2008). The PSEs tended to be slightly longer when the marker duration was longer, but the present data do not offer much information on this issue. This issue should be investigated intensively in the future in order to understand rhythm perception in speech or music. Hasuo et al. (2011, 2012) reported that inter-onset time intervals up to 360 ms tended to be perceived as longer when the duration of the sound markers terminating the time intervals was longer. This was the case whether the time interval to be judged was isolated or neighboring another time interval. The duration of the sound markers initiating the time intervals showed similar effects, but in a more unstable manner.

FIGURE 2. Mean PSEs obtained from five participants in Experiment 1. PSE corresponds to the duration of t_C that was perceived to be equal to the duration of t_S. The results for marker durations 63 and 123 ms were raised by 300 and 600 ms, respectively, in this graph for clarity. The physical values of t_S (the points of objective equality) are indicated by dotted lines. Error bars represent standard deviations between participants.

The PSEs in the control and in the experimental condition differed systematically. The experimental PSEs were smaller than the corresponding control PSEs when t_S = 200 or 240 ms, i.e., when t_S - t_P = 40 or 80 ms: t_S was underestimated showing time-shrinking in a typical manner. However, the difference between the control and the experimental condition was reversed when t_S was longer: the experimental PSEs were systematically greater than the control PSEs when t_S ≥ 320 ms. Thus, time-shrinking as assimilation of t_S to t_P appeared when the difference between these neighboring time intervals was small, and gave way to contrast of t_S to t_P when the difference was large.

The above tendency appeared in similar ways in all the marker conditions between the control and the experimental PSEs despite the fact that the control PSEs increased slightly, but clearly, if the sound marker duration was increased. The contrast appeared as overestimation of t_S in the experimental condition against the control condition. The PSEs were already lengthened in the control condition if the sound markers were longer, and they became even longer – were overestimated further – in the experimental condition. Furthermore, the amount of overestimation was larger when the duration markers were longer. This is in contrast with the fact that the magnitude of time-shrinking – underestimation – is often smaller when longer markers are used (Yamashita and Nakajima, 1999; Hasuo et al., 2011), as was the case also in the present experiment.

The overestimation, as represented by the difference in the PSEs between the control and the experimental condition, seemed to have a local peak when t_S = 320 ms for all the marker durations. This tendency was peculiar and robust, but we leave this issue for future research.

To test whether the common tendency in the overestimation pattern (i.e., the difference between the control and the experimental PSEs over the t_S duration range) across different marker durations was statistically significant, we conducted a Friedman test (e.g., Siegel and Castellan, 1988) utilizing the mean overestimation values for each marker duration. There was a statistically significant tendency in overestimation, χ²(8) = 23.644, p = 0.003. To examine whether the overestimation patterns had a common tendency even when the influence of time-shrinking (the negative overestimation at t_S - t_P = 40 or 80 ms) was cancelled, we also performed the same Friedman test without the conditions in which t_S - t_P = 40 or 80 ms. The tendency in the overestimation pattern was significant again, χ²(6) = 17.714, p = 0.007. The statistical significance in this additional Friedman test confirmed that the overestimation patterns had a common tendency even without the influence of time-shrinking.
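The structure of this Friedman test can be sketched as follows. This is an illustration under assumptions, not the authors' analysis code: the three marker durations are taken as blocks and the t_S levels as treatments (an inference from the reported degrees of freedom), the function names are hypothetical, ties receive average ranks without a tie correction, and the input matrix is a placeholder rather than the study's data.

```python
def average_ranks(values):
    """Rank values in ascending order (1..k); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(blocks):
    """Friedman chi-square (no tie correction) and its degrees of freedom.

    blocks: n rows (here, marker durations), each with k values (t_S levels).
    """
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return chi2, k - 1


# Placeholder input (NOT the study's data): three blocks with identical
# rank orders over nine t_S levels, i.e., perfect agreement between blocks.
placeholder = [[float(level) for level in range(9)] for _ in range(3)]
chi2, df = friedman_statistic(placeholder)
print(round(chi2, 3), df)
```

With perfect agreement between blocks the statistic reaches its upper bound n(k - 1), which is 24 for three blocks and nine levels and 18 for three blocks and seven levels. The reported values of 23.644 and 17.714 lie close to these bounds, which is consistent with the overestimation pattern being ordered almost identically across the three marker durations.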

Experiment 2

Experiments 2–4 were part of a research project approved by the research ethics committee of the Faculty of Design, Kyushu University, in 2010. Experiment 1 and our previous data on time-shrinking (e.g., Nakajima et al., 2004) revealed that the underestimation of a time interval that appeared as assimilation of t_S to t_P often gave way to contrast when t_S - t_P > 120 ms. Because we did not have systematic data indicating this effect except in Experiment 1, we decided to conduct an experiment in which t_S was varied in a larger range (up to 640 ms). For t_P, we chose three values: 80, 120, and 160 ms. Time-shrinking appears most stably in this range of t_P (Nakajima et al., 2004; Miyauchi and Nakajima, 2005), and we first needed experimental data under such conditions. One of the things we were interested in was whether any overestimation would appear for t_P = 120 ms; there had been occasional cases in previous data in which t_S had been overestimated for t_P = 80 or 160 ms, but no such cases ever for t_P = 120 ms. Most importantly, we wanted to see whether the typical time-shrinking, which was expected reliably if t_S - t_P = 40 or 80 ms, would give way to contrast, i.e., overestimation of t_S.