

Front. Aging Neurosci., 11 April 2017
Sec. Neurocognitive Aging and Behavior

The Treatment Based on Temporal Information Processing Reduces Speech Comprehension Deficits in Aphasic Subjects

  • ¹Laboratory of Neuropsychology, Department of Neurophysiology, Nencki Institute of Experimental Biology of Polish Academy of Sciences, Warsaw, Poland
  • ²Department of Psychology, SWPS University of Social Sciences and Humanities, Warsaw, Poland
  • ³Bioimaging Research Center, World Hearing Center of Institute of Physiology and Pathology of Hearing, Kajetany, Poland

Experimental studies have reported a close association between temporal information processing (TIP) and language comprehension. Brain-injured subjects with aphasia show disturbed TIP which was evidenced in elevated temporal order threshold (TOT) as compared to control subjects. The present study is aimed at improving auditory speech comprehension in aphasic subjects using a specific temporal treatment. Fourteen patients having deficits in both speech comprehension and TIP were tested. The Token Test, phoneme discrimination tests (PDT) and Voice-Onset-Time (VOT) Test were employed to assess speech comprehension. The TOT was measured using two 10 ms tones (400 Hz, 3000 Hz) presented binaurally. The patients participated in eight 45-min sessions of either the specific temporal treatment (n = 7) aimed at improving the perception of sequencing abilities, or in a non-temporal control treatment (n = 7) on volume discrimination. The temporal treatment yielded an improvement in TIP. Moreover, a transfer of improvement from the time domain to the language domain was observed. The control treatment did not improve either TIP or speech comprehension in any of the applied tests.


Temporal Dynamics in Language Processing

The relationship between temporal information processing (TIP) and language has been discussed for many years. Both experimental data and everyday observations have emphasized the temporal dynamics of human speech. Numerous publications indicate that verbal communication is rooted in TIP (Pöppel, 1997). In the temporal structure of the speech signal, two major levels may be distinguished: the millisecond level, reflected in temporal segmentation into short time intervals, and the multisecond level, reflected in segmentation into longer intervals. These two temporal levels are controlled by a high-frequency and a low-frequency processing system, respectively (Szelag et al., 2004, 2011).

High-frequency processing within the millisecond time range may be crucial to phonemic hearing, defined as the basic ability to analyze and synthesize speech sounds. Auditory comprehension relies heavily on this ability. For example, information about the place and manner of articulation in stop consonants is contained within a time range of ca. 20–40 ms. Spectrographic analyses in many languages clearly indicate that rapid formant transitions in stop consonants (such as “p”, “b”, “t”, “d”) are limited to time ranges of up to ca. 40 ms. The structure of the human articulatory apparatus prevents prolonged articulation of these consonants in fluent speech because of the immediate switch to the vowel sounds that follow. These frames of temporal processing are of paramount importance to our study. The low-frequency processing system, in contrast, is concerned with lexical selection and with sentence production or reception.

The relationship between TIP in the millisecond range and language reception may be supported by neuroanatomical data. Some authors postulated an overlapping of brain structures involved in both TIP and language which are located mainly in the left temporal lobe comprising gyrus temporalis superior, gyrus temporalis medius and surrounding white matter (von Steinbüchel et al., 1999; Wittmann et al., 2004; Woo et al., 2009; Lewandowska et al., 2010; Oron et al., 2016).

TIP Deficits in Language Disordered Population

Several studies have shown that patients with left-hemispheric brain lesions and aphasia, children with language learning impairment, and children or adults with dyslexia display deficient TIP, specifically a disordered ability of sequential processing and temporal ordering. These deficits were reflected in elevated thresholds for the identification of events presented in rapid succession: compared with healthy controls, these groups needed a significantly longer inter-stimulus interval (ISI) between two sounds to report their temporal order correctly (e.g., Swisher and Hirsh, 1972; Tallal and Piercy, 1973; Farmer and Klein, 1995; Tallal et al., 1998; von Steinbüchel et al., 1999; Wittmann et al., 2001, 2004; Fink et al., 2006; Szelag et al., 2014, 2015). In this study, we concentrated on such temporal deficits in post-stroke aphasic patients, assessed with the temporal order threshold (TOT), i.e., the shortest ISI at which subjects were able to report the order of two sounds correctly.

In these patients, TIP has been widely investigated in the previous studies. A brief review of reports in existing literature shows that aphasic patients displayed deficient TIP on different time levels (millisecond/multisecond), depending on aphasic symptoms they present. In patients with Wernicke’s aphasia disturbed time perception on millisecond level was reflected in declined temporal resolution in millisecond time range (von Steinbüchel et al., 1999; Wittmann et al., 2004; Sidiropoulos et al., 2010). Such ability is crucial for auditory comprehension. Temporal resolution in the millisecond time range in aphasic patients can be tested using the monaural stimulus presentation mode, where one click is presented to the left or right ear, followed by another click to the other ear with different ISIs separating these clicks. The other possibility is the binaural stimulus presentation mode where two tones of different frequencies are presented to both ears with various ISI in between. The former mode was more extensively explored in existing studies on aphasic population than the binaural one.

For example, Wittmann et al. (2004), using the monaural mode, confirmed temporal ordering deficits in patients with damage to Wernicke’s area, who required a significantly longer ISI between two successively presented clicks to report their order correctly than healthy subjects did. In line with this evidence, our previous studies confirmed disturbed temporal order perception in Wernicke’s patients, measured with the same monaural presentation mode (Szelag et al., 2014). It is worth mentioning that, historically, Efron (1963) as well as Swisher and Hirsh (1972) emphasized that deficits in TIP in aphasia were not selectively related to auditory processing but also concerned other modalities. Furthermore, deficits in the multisecond time range were also evidenced in Broca’s aphasic patients, who displayed relatively preserved auditory comprehension but nonfluent speech output and agrammatism. For this pattern of disfluency, multisecond timing seems to be the crucial factor (Szelag and Pöppel, 2000; Wittmann et al., 2001; Kagerer et al., 2002).

Moreover, our previous study showed that, in aphasic subjects, individual differences in the severity of phoneme-identification and phoneme-discrimination impairments were correlated with elevated TOTs (Oron et al., 2015). In that report, we showed in 30 aphasic patients that elevated TOTs for monaurally presented clicks were associated with poorer performance in standard language tests involving: (1) higher linguistic functions (measured by the Token Test); (2) phoneme discrimination (tested by the phoneme discrimination test (PDT)); and (3) voicing contrast detection (measured by the Voice-Onset-Time (VOT) Test). The existence of such correlations indicated that poorer sequencing abilities were accompanied by poorer language performance.

The Application of TIP in Neurorehabilitation

Based on the coexistence of TIP and language deficits, the Fast ForWord (FFW) computer training program was developed to improve both sequencing abilities and phoneme discrimination, resulting in ameliorated auditory comprehension. This program was first applied in children with language-learning impairment (Merzenich et al., 1996; Tallal et al., 1996). FFW was also applied in dyslexic children and adults to improve their language competency (Temple et al., 2003; Strehlow et al., 2006; Gaab et al., 2007; Lajiness-O’Neill et al., 2007). Nevertheless, other authors did not confirm positive effects of FFW on some aspects of language competency, i.e., writing and reading (Agnew et al., 2004).

A pilot treatment based on the relationship between TIP and language was used for the first time in our laboratory in aphasic patients (Szelag et al., 2014). In this study, the patients were taught to recognize the order of paired clicks presented monaurally in rapid succession with decreasing ISI within consecutive pairs. The training combined eight 45-min long sessions (temporal training). In parallel, the control group was trained using the volume discrimination procedure without any temporal aspects. We reported that only temporal training yielded improved TIP. Thus, after such training patients recognized the order of clicks within a pair at significantly shorter ISIs than before the training. Moreover, a transfer of improvement was observed from the trained time domain to the language domain which remained untrained during the intervention period. Importantly, following such therapy patients committed significantly fewer errors in Token Test, PDT and VOT Test. In contrast, the non-temporal control training did not improve either the temporal order perception or auditory comprehension in any of the applied language skills.

Given the limited evidence in the literature on the amelioration of auditory comprehension following temporal training in aphasic patients (Szelag et al., 2014, 2015), as well as the importance of this issue for modern neurorehabilitation, we extended our earlier pilot therapy procedure and developed a complementary training protocol using a modified stimulus presentation mode. The important difference between our pilot procedure (Szelag et al., 2014) and the procedure presented here concerns the stimulus presentation mode during the intervention: monaurally presented clicks were used in the former, whereas binaurally presented tones were used in the latter. According to our previous reports, performance in sequencing tasks differs fundamentally depending on whether monaural clicks or binaural tones are presented (Szymaszek et al., 2009; Szelag et al., 2011). Whereas the monaural task (click presentation) reflects rather pure temporal processing, the binaural task (tone presentation) involves, besides such processing, frequency discrimination, labeling of the perceived pitch in inner speech, and abstract thinking. The binaural task therefore seems to be more sensitive to subjects’ age than the monaural one: in the monaural mode, considerable deterioration was observed beyond 60 years of age, whereas in the binaural mode performance started to decline much earlier, at approximately 40 years of age (Szymaszek et al., 2009). These frequency discrimination problems did not reflect difficulties in peripheral hearing, as normal hearing was confirmed by the screening audiometry applied during participant recruitment. In aphasic patients performing the binaural task, temporal problems related to brain lesions overlapped with difficulties related to the specificity of the task. In consequence, only patients who were relatively well-functioning cognitively (with more subtle post-stroke deficits) were able to perform the binaural task.

It seems that different neuronal processes are involved in the binaural procedure than in the monaural one. Considering the interpretation provided by some authors (Brechmann and Scheich, 2005; Rahne et al., 2008; Deike et al., 2010), we are of the opinion that two tones presented in rapid sequence could be integrated into one percept at short ISIs. Their temporal order can therefore be reported on the basis of one frequency-modulated pattern (rising/falling) rather than on the identification and sequencing of separate stimuli.

Some authors emphasized that binaural presentation involves the integration of information from both hemispheres, as the left posterior part of the superior temporal gyrus is probably specialized for sequential tasks. Nevertheless, the participation of a right-hemispheric global strategy for categorizing pitch direction (global pattern recognition) was also confirmed (Johnsrude et al., 2000; Zatorre et al., 2002; Brechmann and Scheich, 2005). In our post-stroke population, only left-hemispheric areas were damaged, which may consequently be the reason for such TIP deficits.

Study Aims

Considering the above evidence, in the present study we investigated whether the deficits in temporal sequencing abilities in aphasic patients can be reduced by specific temporal training using binaural tone presentation. Furthermore, we tested whether such training could benefit both the TIP and language domains, as previously reported for training using monaural presentation (Szelag et al., 2014).

Materials and Methods


Fourteen patients (nine males and five females) suffering from post-stroke, moderate to mild fluent aphasia after hemorrhage or infarction (lesion age: mean ± SD = 18 ± 13 weeks) participated in the study. They were aged from 40 to 77 years (mean ± SD = 57.3 ± 11 years), were right-handed (Oldfield, 1971) native speakers of Polish, and had normal hearing levels (ANSI, 2004), verified by screening audiometry (audiometer AS 208) at frequencies of 250, 500, 750, 1000, 1500, 2000 and 3000 Hz, which covered the frequency spectrum of the stimuli presented in the study. Apart from stroke, they had neither neurological nor psychiatric disorders and reported no history of head injury. The other exclusion criteria were: multiple strokes, concomitant systemic diseases, poor general health, participation in another rehabilitation program during our study, and older age, to minimize age-related cognitive deficits that might have negatively affected the training results.

All recruited patients displayed auditory comprehension deficits. Three standard language tests were administered to assess comprehension abilities (see below). The language deficits were accompanied by disordered TIP, evidenced in auditory perception of temporal order.

The patients were randomly assigned to two groups: an experimental group (EXP) and a control group (CON). These groups were matched for age, gender, type of stroke, lesion age, as well as for the baseline measures tested in this study, i.e., the level of auditory comprehension and attentional resources (see below for a description of particular procedures). All between-group differences for these variables were nonsignificant (Mann-Whitney U test). This was a randomized, single-blind study; patients were not informed to which group they were assigned.

The EXP group (n = 7) was assigned to the experimental treatment program and the CON group (n = 7)¹ to the control treatment program. The baseline characteristics of the study population are summarized in Tables 1 and 2.


Table 1. Baseline characteristics of the study population.


Table 2. Detailed description of the patient sample assigned to group EXP and CON.

The location of the lesion was verified by CT or MRI. Neuroanatomical analyses are summarized in Figure 1, which shows the overlap of lesioned areas in both cortical and subcortical regions in the EXP and CON groups. The lesioned areas in the EXP group comprised mainly: superior temporal gyrus, medial orbito-frontal cortex, basal ganglia: insula, putamen, whereas in the CON group they comprised: superior and middle temporal gyrus, Heschl’s gyrus, Rolandic operculum, medial orbito-frontal cortex and basal ganglia: insula, putamen.


Figure 1. The summarized lesioned areas in patients included into EXP group (A) and CON group (B).

As shown in Figure 1, the damaged structures in both groups comprised mainly temporal area of the left hemisphere which covered the classical regions involved in both receptive language function (auditory comprehension) and temporal processing (von Steinbüchel et al., 1999; Wittmann et al., 2004; Szelag et al., 2009). Considering the description of the patient sample presented above, the EXP and CON groups were fully matched.

The research was approved by the Bioethics Commission at the Institute of Psychiatry and Neurology in Warsaw (permission no. 5/2005 from February 2nd, 2005) as well as by the Bioethics Commission at the Warsaw Medical University (permission no. 5/2010 from January 26th, 2010). The study was conducted according to the principles expressed in the Helsinki Declaration; the written informed consent from each participant was obtained prior to testing.

Experimental Procedures

The study combined treatment programs and cognitive evaluation procedures. The treatment programs comprised either the experimental treatment (EXP) or the control treatment (CON). The cognitive evaluation procedures covered language, attention and TIP.

Treatment Programs

Experimental treatment (EXP) program

The EXP program focused on TIP. It used paired tones of different frequencies (low tone: 400 Hz; high tone: 3000 Hz), presented binaurally via headphones with different ISIs. After each paired-tone presentation, the patient reported the order of the tones by pointing to a response card corresponding to the high-low or low-high tone order. The starting ISI was the mean TOT value obtained in the two sessions of the baseline evaluation (see below for a detailed description), increased by 20 ms to ensure a comfortable treatment situation. An adaptive algorithm adjusted the level of task difficulty for each patient individually. The EXP program consisted of 10-trial blocks. Within each block the paired tones were presented with the same ISI, whereas between blocks the ISI varied according to the adaptive algorithm, which was based on the correctness of previous responses. Blocks were either passed or missed. A “passed block” was one in which at least 8 out of 10 trials were answered correctly. In that case, the ISI in the next block was shortened to make the task more demanding for the patient. The size of the decrease depended on the patient’s current TOT: for a TOT longer than 100 ms the step was 5 ms, for a TOT from 50 ms to 100 ms it was 2 ms, and for a TOT below 50 ms it was 1 ms. In a “missed block”, by contrast, seven or fewer correct responses were given. In that case, the ISI in the next block was increased by a constant 1 ms step.
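The block-wise rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the ISI of the current block is taken as the running estimate of the patient's TOT, and a 1 ms floor is assumed because no lower bound is given for the training.

```python
def starting_isi(baseline_tot_1_ms: float, baseline_tot_2_ms: float) -> float:
    """Mean TOT of the two baseline sessions, increased by 20 ms."""
    return (baseline_tot_1_ms + baseline_tot_2_ms) / 2 + 20


def next_isi(current_isi_ms: float, n_correct: int, block_size: int = 10) -> float:
    """Return the ISI (ms) for the next block, given the result of this block."""
    if not 0 <= n_correct <= block_size:
        raise ValueError("n_correct must lie between 0 and block_size")
    if n_correct >= 8:  # "passed block": at least 8 of 10 trials correct
        # Assumption: the current ISI stands in for the patient's current TOT.
        if current_isi_ms > 100:
            step = 5
        elif current_isi_ms >= 50:
            step = 2
        else:
            step = 1
        # Assumption: the ISI never drops below 1 ms.
        return max(1, current_isi_ms - step)
    return current_isi_ms + 1  # "missed block": constant 1 ms increase
```

For instance, a patient starting at 120 ms who passes a block would move to 115 ms, while a missed block at 60 ms would be followed by a block at 61 ms. The asymmetry between the downward and upward steps means that, above 50 ms, difficulty rises faster after success than it falls after failure.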

Control treatment (CON) program

In the CON program, two 1000 ms sounds separated by a constant, relatively long ISI of 3000 ms were presented (Szelag et al., 2014). The subject assessed the loudness of the sounds and reported which one was louder (the first or the second) by pointing to the corresponding response card. Sounds of different frequencies (400, 600, 800 or 1000 Hz) were presented in consecutive blocks; however, the frequency of the sounds within a block (10 consecutive trials) was always the same. Adaptivity in the control training was based on the loudness difference between the paired tones. The loudness of the tones within a pair could differ by 0.025–0.00025 of the amplitude range, with a constant step of 0.00025.

The EXP and CON treatment programs were matched as closely as possible in task difficulty and mental effort. The only difference was that CON lacked the millisecond TIP component, which constituted the basic component of the EXP program.

The treatment protocol in each program consisted of eight training sessions performed three times a week. Each session lasted 45 min and comprised 10-trial blocks. The same motivation system was used in the EXP and CON treatments. Within a block, each trial was preceded by a visual warning signal. After each response, the subject received feedback on its correctness. Each correct response was rewarded with one point. The points were then summed and displayed to the patient at the end of each block. For each “passed block” the patient collected puzzles, which helped to keep motivation high. The patients were allowed to take short breaks within a session; however, the duration of the pauses was excluded from the session time.

Cognitive Evaluation Procedures


Auditory comprehension was assessed by the three following tests: (1) Token Test (Huber et al., 1983) to evaluate speech comprehension on a sentence level; (2) PDT (Nowak-Czerwińska, 1994); and (3) VOT Test (Szelag and Szymaszek, 2014) to assess phonemic hearing and comprehension deficits on single word level.

Token Test is a part of the Aachener Aphasie Battery and involves semantic, syntactic and/or post-interpretative processes. The test contains 20 plastic tokens varying in shape (squares and circles), size (small and large) and color (red, yellow, green, blue and white). The test consists of 50 commands classified into five sections of increasing difficulty. Examples of particular commands in consecutive sections are as follows: Section 1: Touch the red square; Section 2: Touch the big red circle; Section 3: Touch the green square and the red circle; Section 4: Touch the little red circle and the big yellow square; Section 5: Put the green square next to the red circle.

The outcome measure was the percent of errors committed in the entire test.

The percent of errors committed in the Token Test was used both as patients’ inclusion criterion and for evaluation (both for the baseline and the post-treatment evaluation).

The PDT contains 64 word pairs, of which 75% consist of different words and 25% of identical words. The words within each different pair differed in consonants contrasted for place of articulation, plosive, fricative, voicing or nasality.

The patient was asked to report whether words within the presented pair were the same or different. The responses were given by pointing to corresponding response cards. Subjects performed four series of eight paired words in each series. Alternative versions of the test were applied in the baseline and post-treatment evaluation.

The outcome measure was the percent of errors in the entire test.

The VOT Test assesses the ability to differentiate between voiced and unvoiced initial stop consonants in two Polish words, /TOMEK/ and /DOMEK/ (in English: Tom/house), which differ in the initial stop consonant (unvoiced T or voiced D). The stimuli were created by manipulating the duration of the interval between the burst and the onset of laryngeal pulsing in the initial consonant of the naturally uttered unvoiced word /TOMEK/. In total there were 21 pseudowords (synthesized using Adobe Audition software), presented randomly in six series. The VOT of the initial stop consonant varied from −100 ms to 90 ms in 10 ms steps, with an additional value of 5 ms, i.e., −100, −90, −80, −70, −60, −50, −40, −30, −20, −10, 0, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 ms². The patient categorized each presented word as either /TOMEK/ or /DOMEK/ by pointing to a response card displaying a picture of either a boy or a house. Each subject performed 126 trials, presented in 6 series of 21 stimuli with various VOT values.
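As a quick consistency check on the trial structure, the continuum and the trial list can be generated as follows. This is a hypothetical sketch: it assumes that each of the 21 distinct VOT values occurs exactly once per series and that the order is shuffled independently within each series, neither of which is specified in detail in the text.

```python
import random

# 10-ms steps from -100 to 90 ms, plus the single extra value of 5 ms
VOT_VALUES_MS = sorted(set(range(-100, 100, 10)) | {5})


def build_trials(n_series=6, seed=None):
    """Return a flat list of VOT values: n_series shuffled passes over the continuum."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_series):
        series = VOT_VALUES_MS[:]
        rng.shuffle(series)  # assumption: random order within each series
        trials.extend(series)
    return trials
```

With the default of six series this gives 6 × 21 = 126 trials, in agreement with the total stated above.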

As confirmed by our previous study on 67 healthy Polish volunteers (Szelag and Szymaszek, 2014), pseudowords with a VOT from −100 ms to −70 ms are typically classified as the voiced word /DOMEK/ and those with a VOT from 10 ms to 90 ms as the unvoiced word /TOMEK/. In between, i.e., for VOT values from −60 ms to 5 ms, there is a transition zone in which the probability of identifying T or D is at chance level.

The outcome measure was the percent of reported word /DOMEK/ for VOT of −100 ms to −70 ms and /TOMEK/ for 10–90 ms.
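One way to compute this outcome measure from raw responses is sketched below. The data layout (a list of `(vot_ms, answer)` pairs) and the function name are assumptions made for illustration; responses in the transition zone are simply ignored.

```python
def vot_endpoint_scores(responses):
    """Percent of /DOMEK/ answers for VOT -100..-70 ms and /TOMEK/ answers for 10..90 ms.

    responses: iterable of (vot_ms, answer), where answer is "DOMEK" or "TOMEK".
    """
    voiced = [ans for vot, ans in responses if -100 <= vot <= -70]
    unvoiced = [ans for vot, ans in responses if 10 <= vot <= 90]

    def percent(answers, target):
        if not answers:
            return None  # no trials in this zone
        return 100.0 * sum(a == target for a in answers) / len(answers)

    return {"DOMEK": percent(voiced, "DOMEK"), "TOMEK": percent(unvoiced, "TOMEK")}
```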


The evaluation procedure comprised two aspects of attention: vigilance, defined as the ability to sustain or maintain attention over a longer period of time, and alertness, defined as the ability to raise and maintain a high level of attention in anticipation of a test stimulus (Test for Attentional Performance, Zimmermann and Fimm, 2007).

Vigilance: a low (440 Hz) and a high (1000 Hz) tone were presented sequentially in random order. The subject pressed a response pad when two identical tones (low or high) occurred in a row. The test lasted 10 min.

The outcome measure was the mean reaction time (RT) achieved in all test trials.

Alertness: simple RTs were measured in response to a visual stimulus (a white cross displayed in the screen center) which was preceded by an auditory warning signal. The subject pressed a response pad after each presentation of the cross. The test comprised 80 trials which were performed in four blocks: two blocks without any warning signal (A) and two blocks with an auditory warning signal (B). The blocks were administered according to the ABBA protocol.

The outcome measure was the mean RT achieved in all test trials.


The stimuli were pairs of 10-ms sinusoidal tones of 400 Hz and 3000 Hz (1-ms rise and fall time of each tone) presented in rapid succession with varied ISIs. Within each sequence the tones were adjusted to equal loudness on the basis of isophones³. The paired tones were presented binaurally, i.e., each pair was presented to both ears. The stimuli were generated by a 16-bit Sound Blaster Extigy Card and delivered at a comfortable listening level through Sony MDR-CD 480 headphones. The subject reported the temporal order of the tones presented within each pair by pointing to one of two response cards corresponding to the high-low or low-high tone order. A 400-ms warning signal preceded each stimulus pair and was delivered 1500 ms before the first tone in each pair.
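A tone pair of this kind can be synthesized roughly as shown below. The sketch relies on several assumptions that are not given in the text: a 44.1 kHz sampling rate, linear rise and fall ramps, and an ISI measured from the offset of the first tone to the onset of the second. The equal-loudness adjustment is left out.

```python
import math

SAMPLE_RATE_HZ = 44_100  # assumption: the sampling rate is not stated in the text


def tone(freq_hz, duration_ms=10.0, ramp_ms=1.0, rate=SAMPLE_RATE_HZ):
    """Sinusoidal tone with linear rise and fall ramps (ramp shape is an assumption)."""
    n = round(rate * duration_ms / 1000)
    n_ramp = round(rate * ramp_ms / 1000)
    samples = []
    for i in range(n):
        env = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp) if n_ramp else 1.0
        samples.append(env * math.sin(2 * math.pi * freq_hz * i / rate))
    return samples


def tone_pair(first_hz, second_hz, isi_ms, rate=SAMPLE_RATE_HZ):
    """Two tones separated by a silent gap of isi_ms (offset-to-onset, an assumption)."""
    gap = [0.0] * round(rate * isi_ms / 1000)
    return tone(first_hz, rate=rate) + gap + tone(second_hz, rate=rate)
```

Calling `tone_pair(400, 3000, isi_ms=100)` yields a low-high pair; swapping the first two arguments yields the high-low pair. For binaural presentation the same sample list would be sent to both channels.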

In each trial, ISIs were adjusted according to an adaptive maximum-likelihood-based algorithm and the ISIs varied from 1 ms to 600 ms. This procedure estimated individual TOTs as the minimum ISI between two tones at which their order was reported by a subject at 75% correctness (Levitt, 1971). TOT values were adjusted using “Yet Another Adaptive Procedure” (Mates et al., 2001) on the basis of the maximum likelihood parameter estimation. The evaluation procedure was continued until the individual TOT value was located with a probability of 95% inside a ±5-ms interval around the currently estimated threshold (Treutwein, 1995).
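To illustrate the idea of maximum-likelihood threshold estimation, the following sketch fits a single threshold parameter to a list of trials. It is not the YAAP procedure of Mates et al. (2001): the logistic shape of the psychometric function, the fixed slope of 20 ms, and the brute-force search over candidate thresholds are all simplifying assumptions, and the stopping rule based on the 95% confidence interval is omitted.

```python
import math


def p_correct(isi_ms, threshold_ms, slope_ms=20.0):
    """Two-alternative psychometric function: 50% at very short ISIs,
    approaching 100% at long ISIs, and exactly 75% at the threshold."""
    return 0.5 + 0.5 / (1.0 + math.exp(-(isi_ms - threshold_ms) / slope_ms))


def ml_threshold(trials, candidates=range(1, 601), slope_ms=20.0):
    """Return the candidate threshold (ms) that maximizes the log-likelihood.

    trials: list of (isi_ms, correct) pairs, with correct a bool.
    """
    best, best_ll = None, -math.inf
    for t in candidates:
        ll = 0.0
        for isi, correct in trials:
            p = min(max(p_correct(isi, t, slope_ms), 1e-9), 1 - 1e-9)
            ll += math.log(p if correct else 1.0 - p)
        if ll > best_ll:
            best, best_ll = t, ll
    return best
```

In an adaptive run, the estimate would be recomputed after every response and used to choose the next ISI; here only the estimation step is shown.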

The proper experiment was preceded by a practice session in which the subject was presented with a high and low tone separately to familiarize with sounds. Next, the patient was presented with paired tones at relatively long constant ISI of 600 ms and asked to report their order. Such a practice session was completed when the pre-defined criterion of 11 correct responses in last 12 presentations was reached. Then the proper experiment started.
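The practice criterion amounts to a sliding window over the most recent responses. A minimal sketch, with a hypothetical function name, could look like this:

```python
from collections import deque


def practice_passed_at(responses, window=12, required=11):
    """Return the 1-based trial number at which at least `required` of the
    last `window` responses were correct, or None if this never happens."""
    recent = deque(maxlen=window)  # keeps only the last `window` responses
    for n, correct in enumerate(responses, start=1):
        recent.append(bool(correct))
        if len(recent) == window and sum(recent) >= required:
            return n
    return None
```

Note that under this reading the earliest possible completion is after 12 presentations; whether the criterion could be met earlier (for example after 11 consecutive correct responses) is not stated in the text.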

The outcome measure was the mean TOT from two sessions.

The TOT measurement was applied both in the evaluation phase and in subject recruitment, as one of the inclusion/exclusion criteria. The subject performed the task twice, in two sessions separated by a break of a few days. For recruitment, the mean value of the two sessions was compared with that of age-matched healthy volunteers. The cut-off for inclusion in the training was a TOT value 1.5 SD above the mean value found in healthy volunteers (Szymaszek et al., 2009; Szelag et al., 2011).
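Expressed as code, the inclusion rule is a simple comparison against a normative cut-off. The function names are hypothetical, the normative mean and SD must come from the age-matched reference data (no values are given here), and treating a TOT strictly above the cut-off as meeting the criterion is an assumption.

```python
def tot_cutoff(control_mean_ms, control_sd_ms, k=1.5):
    """Cut-off TOT: k standard deviations above the healthy-control mean."""
    return control_mean_ms + k * control_sd_ms


def meets_tot_criterion(session1_ms, session2_ms, control_mean_ms, control_sd_ms):
    """True if the patient's mean TOT over two sessions exceeds the cut-off."""
    patient_mean = (session1_ms + session2_ms) / 2
    return patient_mean > tot_cutoff(control_mean_ms, control_sd_ms)
```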

The cognitive assessment procedures were performed at the baseline and for the post-treatment evaluation (see Figure 2 below).


Figure 2. The scheme of the experimental protocol.

Statistical Analyses

Using Wilcoxon Signed-Rank Test for two dependent samples, we compared the post-treatment vs. baseline performance in particular patients in two treatment groups. We tested the effect of EXP and CON treatment on language and attentional resources, as well as on TIP in sequencing abilities measured by the TOT.
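For readers unfamiliar with the test, the Z statistic of the Wilcoxon signed-rank test can be computed as sketched below. The sketch uses the plain normal approximation without continuity or tie-variance correction; which variant the statistical software in the study applied is not stated, so this is an assumption, and the example data in the usage note are invented.

```python
import math


def wilcoxon_z(baseline, post):
    """Z statistic of the Wilcoxon signed-rank test (normal approximation)."""
    diffs = [b - p for b, p in zip(baseline, post) if b != p]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd
```

With seven hypothetical patients who all improve, e.g. `wilcoxon_z([60, 55, 48, 52, 40, 45, 50], [40, 41, 30, 45, 31, 33, 39])`, the statistic is about 2.37, which is the largest magnitude this approximation can yield for seven pairs.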

Results

Token Test

After EXP the difference in percentage of errors post-treatment (x¯ = 33) vs. a baseline (x¯ = 50) was significant (n = 7, Z = 2.37, p < 0.017). After CON the difference in percentage of errors post-treatment (x¯ = 45) vs. a baseline (x¯ = 50) was nonsignificant (n = 7, Z = 1.48, p < 0.14).

Phoneme Discrimination Test (PDT)

After EXP and CON the difference in percentage of errors post-treatment (x¯ = 3 and x¯ = 11, for EXP and CON, respectively) vs. a baseline (x¯ = 7 and x¯ = 14, for EXP and CON, respectively) was nonsignificant (n = 7, Z = 1.28, p < 0.21 and n = 7, z = 1.10, p < 0.28, for EXP and CON, respectively). Nevertheless, after EXP the percentage of errors post-treatment was lower than that at a baseline.

Voice-Onset-Time Test

After EXP in the typical unvoiced zone (VOT values of 5, 10 and 70 ms) the mean percent of correctly reported unvoiced pseudowords /TOMEK/ post-treatment was higher than at a baseline. It corresponded to the improved performance (VOT5 ms = 91 vs. 81, Z = 1.83, p < 0.068; VOT10 ms = 95 vs. 74, Z = 2.02, p < 0.044; VOT70 ms = 96 vs. 86, Z = 1.83, p < 0.068). After CON the post-treatment vs. baseline difference for all VOT values was nonsignificant.


For both treatment programs there was no significant difference between post-treatment and baseline performance for either measured aspect of attention, i.e., alertness and vigilance.

For alertness the mean RT post-treatment vs. a baseline was x¯ = 287 vs. x¯ = 287 for EXP (Z = 0.11; p < 0.92) and x¯ = 372 vs. x¯ = 397 for CON (Z = 0.94; p < 0.35). For vigilance the mean RT post-treatment vs. a baseline was x¯ = 613 vs. x¯ = 645 for EXP (Z = 0.11, p < 0.92) and x¯ = 584 vs. x¯ = 536 after CON (Z = 0.54, p < 0.60).


After EXP the difference in TOT post-treatment (x¯ = 120 ms) vs. baseline (x¯ = 257 ms) was significant (n = 7, Z = 2.20, p < 0.028). After CON the difference in TOT post-treatment (x¯ = 151 ms) vs. baseline (x¯ = 219 ms) was nonsignificant (n = 4, Z = 1.46, p < 0.15).

The results of EXP and CON treatment for all tests described above are presented in Figure 3.


Figure 3. Percentage difference between baseline and post-treatment performance ((baseline − post-treatment)/baseline × 100%) for particular tasks in EXP (red bar) and CON (blue bar). The 0 point reflects stable performance (no difference between baseline and post-treatment performance). Positive values (to the right of the 0 point) correspond to improved performance. No worsened performance (to the left of the 0 point) was observed. Significant differences (p < 0.05) are indicated by asterisks.
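The percentage measure plotted in Figure 3 can be reproduced with a one-line function. The example applies it to the group means reported above for the EXP group; whether the figure was built from group means or from per-patient values is not stated, so the resulting numbers are only indicative.

```python
def percent_improvement(baseline, post):
    """(baseline - post-treatment) / baseline * 100; positive means fewer errors or a lower TOT."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - post) / baseline * 100


token_test_exp = round(percent_improvement(50, 33), 1)  # errors: 50% -> 33%
tot_exp = round(percent_improvement(257, 120), 1)       # TOT: 257 ms -> 120 ms
print(token_test_exp, tot_exp)  # 34.0 53.3
```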

Discussion
Summary of Results

The application of EXP training in aphasic patients significantly ameliorated TIP, which was reflected in lower TOT values (better performance), as well as language comprehension, evidenced by significantly fewer errors in the Token Test and significantly better identification in the VOT Test (at a VOT of 10 ms); the reduction of errors in the PDT did not reach significance. In contrast, the CON training caused no significant improvement in any of the tests used. Moreover, attentional functions (alertness and vigilance) remained stable following both the EXP and CON programs. Thus, the language benefits obtained resulted from the improvement of linguistic processes rather than from changes in attentional resources after the treatment.

Influence of EXP and CON Treatment Programs on TIP and Language Abilities

The study reported the effects of two different treatment programs (EXP and CON) on TIP and language competency in post-stroke patients suffering from aphasia following damage to the posterior parts of the left hemisphere. Although CON treatment did not cause any significant improvement, EXP treatment provided significant benefits for both language and TIP skills.

The application of EXP treatment resulted in significantly lower thresholds for order detection in the post-training vs. baseline evaluation, corresponding to improved sequencing ability. In contrast, the post-training vs. baseline difference was nonsignificant following CON treatment. Despite the similar mental load of the two treatment types, a clear advantage of the EXP program was evident. This advantage was rooted in improved rapid sequencing, which has proved to be crucial for cognition (Szelag et al., 2015).

Our previous preliminary study (Szelag et al., 2014) also revealed that temporal training was beneficial for aphasic subjects, although it used a different mode of stimulus presentation from that applied in the present study. Both studies (Szelag et al., 2014 and the present one) indicate that the beneficial effects seem to be related to temporal aspects of auditory processing, independently of the stimulus types used to improve TIP, provided that each of them involves rapid auditory processing stimulation. Nevertheless, it is worth mentioning that, in the case of a treatment procedure based on tone differentiation, severely impaired patients were not able to differentiate between high and low tones and, as a consequence, could not perform the training.

With respect to receptive language functions, the EXP program seems to be highly beneficial compared to the CON treatment. There is strong support in the literature for a close relationship between millisecond timing and language (Oron et al., 2015, see also the “Introduction” Section). Disturbed sequencing abilities in the millisecond time range may be associated with problems in those aspects of speech perception that are temporally segmented in this range, i.e., phoneme discrimination. The coexistence of timing deficits and speech comprehension problems has been demonstrated previously in different language pathologies, e.g., dyslexia, specific language-learning impairments and aphasia. On the basis of this evidence, some authors postulated the existence of a central mechanism which controls TIP in the range of some tens of milliseconds (responsible for the correct identification of the temporal order of two events) as well as phoneme discrimination. Phoneme discrimination uses spectral cues which comprise rapid formant transitions in a millisecond time window, similar to that critical for TIP in the millisecond range. As already shown (Szelag et al., 2014), an intervention based on temporal order detection in the millisecond time domain lowered patients’ TOT, resulting in improved speech abilities.

In the present study, the temporal treatment program improved different language functions, i.e., global speech comprehension (Token Test) as well as the differentiation of the voiced/unvoiced contrast (VOT test). A marked reduction in errors was also observed in PDT; however, this improvement did not reach statistical significance. Whereas the VOT test focuses on phoneme discrimination, an important result of the present study is that temporal treatment also lowered the number of errors in the Token Test, which measures the comprehension of spoken commands of increasing length and complexity. The functions measured in the Token Test require not only well-preserved linguistic processing (phonemic, semantic and syntactic) but also efficient verbal working memory, which was not assessed in the present study. Nevertheless, published data emphasize that working memory may be improved after temporal treatments (Szelag et al., 2015). This seems important because post-stroke cognitive disabilities may not be related selectively to language disturbances but also to other cognitive functions, e.g., working memory and executive functions.

The follow-up diagnosis performed 6 months after the treatment completion indicated relative stability of the observed improvements.

Temporal Trainings as Promising Perspective for Neurorehabilitation

Our important finding is the indication that temporal treatment is beneficial for the amelioration of receptive language functions. In modern neurorehabilitation, this is a promising method for supporting classic speech therapy. Neuropsychologists have emphasized that one of the advantages of temporal treatment is that, during the therapy, aphasic patients do not directly face their language problems (the treatment uses nonverbal stimulation), which may motivate them to work and simultaneously foster improvement of untrained language competency.

Numerous studies indicate that TIP constitutes a basic process which is at the root of our mental activity in its broad aspect and covers not only language, but also other cognitive functions. von Steinbüchel and Pöppel (1993) introduced a modern functional taxonomy which distinguishes content-related functions (“what”) from logistic functions (“how”). Content-related functions represent the mental content of human subjective experience, including but not limited to perception, memory and language, while logistic functions form the basis of mental activity, thus being superior to content-related functions. These scholars postulate that TIP might be a logistic function which determines other functions of human cognitive processing. This theory was further developed in our previous work (Szelag et al., 2015).

Some authors have proposed an internal timing mechanism controlling the perception of temporal order in the millisecond time range. This mechanism may be related to neural gamma band oscillations with a frequency of ca. 40 Hz, which means that one oscillation period lasts ca. 25 ms (Oron et al., 2015). Pöppel (1997) asserted that the correct perception of the order of two sounds is possible when they occur in at least two successive oscillatory periods. We can assume that this hypothetical mechanism might be disturbed in brain-lesioned patients, e.g., after a stroke, resulting in a longer periodicity of gamma oscillations and less efficient TIP.
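
The arithmetic behind this account can be sketched with a toy model. It assumes, purely for illustration, that oscillation periods are aligned to time zero and that two sounds can be ordered only when their onsets fall in different periods. The function names and the slowed 30 Hz rhythm are hypothetical and are not values from the study.

```python
import math

def period_ms(freq_hz):
    """Duration of one oscillation period in milliseconds."""
    return 1000.0 / freq_hz

def period_index(onset_ms, freq_hz):
    """Index of the period containing an onset (periods start at t = 0)."""
    return math.floor(onset_ms / period_ms(freq_hz))

def falls_in_later_period(first_ms, second_ms, freq_hz):
    """True when the second onset lands in a later period than the first."""
    return period_index(second_ms, freq_hz) > period_index(first_ms, freq_hz)

print(period_ms(40))                      # one period at 40 Hz
print(falls_in_later_period(10, 20, 40))  # both onsets inside one 25 ms period
print(falls_in_later_period(10, 30, 40))  # onsets in successive periods
print(falls_in_later_period(10, 30, 30))  # hypothetical slower 30 Hz rhythm
```

In this toy model, a slower rhythm lengthens each period, so the same pair of onsets that is separable at 40 Hz falls within a single period at 30 Hz. This mirrors the suggestion that longer gamma periodicity would raise the temporal order threshold.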

Final Remarks

To sum up, our results allow some generalizations to be drawn. Only the temporal treatment improved both TIP and receptive language functions, as evidenced by the transfer of improvement from the trained time domain to the untrained language domain. Such transfer was not observed after the control treatment. These results point to a possible role for TIP treatment in reducing language deficits in the aphasic population.

Author Contributions

AS: subject recruitment, acquisition, analysis and interpretation of data, manuscript writing. TW: analysis and graphical presentation of neuroanatomical data. ES: acquisition, analysis and interpretation of data, manuscript writing, responsibility for the final version of manuscript.


Funding

The study was supported by Ministry of Science and Higher Education grant No: PBZ-MIN/001/PO5/06, realized in cooperation with researchers from the Human Science Centre, Ludwig-Maximilian University, Bad Tölz (Germany).

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer TSG declared a shared affiliation, though no other collaboration, with several of the authors, AS and ES, to the handling Editor, who ensured that the process nevertheless met the standards of a fair and objective review.


Acknowledgments

The authors would like to thank the clinicians who recruited aphasic patients for the present study.


Footnotes

  1. A part of the results in the CON group reported here was used as control data in our previously published study (Szelag et al., 2014).
  2. Negative VOT values reflect a situation in which a burst is preceded by laryngeal pulsing (in Slavonic languages usually perceived as a voiced contrast), whereas positive VOT values reflect the opposite situation, i.e., pulsing is preceded by a burst (unvoiced contrast).


References

Agnew, J. A., Dorn, C., and Eden, G. F. (2004). Effect of intensive training on auditory processing and reading skills. Brain Lang. 88, 21–25. doi: 10.1016/s0093-934x(03)00157-3

ANSI. (2004). ANSI S3.6–2004. American National Standard Specification for Audiometers. New York, NY: American National Standards Institute.

Brechmann, A., and Scheich, H. (2005). Hemispheric shifts of sound representation in auditory cortex with conceptual listening. Cereb. Cortex 15, 578–587. doi: 10.1093/cercor/bhh159

Deike, S., Scheich, H., and Brechmann, A. (2010). Active stream segregation specifically involves the left human auditory cortex. Hear Res. 265, 30–37. doi: 10.1016/j.heares.2010.03.005

Efron, R. (1963). Temporal perception, aphasia and déjà vu. Brain 86, 403–424. doi: 10.1093/brain/86.3.403

Farmer, M. E., and Klein, R. M. (1995). The evidence for a temporal processing deficit linked to dyslexia: a review. Psychon. Bull. Rev. 2, 460–493. doi: 10.3758/BF03210983

Fink, M., Churan, J., and Wittmann, M. (2006). Temporal processing and context dependency of phoneme discrimination in patients with aphasia. Brain Lang. 98, 1–11. doi: 10.1016/j.bandl.2005.12.005

Gaab, N., Gabrieli, J. D. E., Deutsch, G. K., Tallal, P., and Temple, E. (2007). Neural correlates of rapid auditory processing are disrupted in children with developmental dyslexia and ameliorated with training: an fMRI study. Restor. Neurol. Neurosci. 25, 295–310.

Huber, W., Poeck, K., Weniger, D., and Willmes, K. (1983). Aachener Aphasie Test. Toronto, Zürich: Verlag für Psychologie Dr. C.J. Hogrefe Göttingen.

Johnsrude, I. S., Penhune, V. B., and Zatorre, R. J. (2000). Functional specificity in the right human auditory cortex for perceiving pitch direction. Brain 123, 155–163. doi: 10.1093/brain/123.1.155

Kagerer, F. A., Wittmann, M., Szelag, E., and von Steinbüchel, N. (2002). Cortical involvement in temporal reproduction: evidence for differential roles of the hemispheres. Neuropsychologia 40, 357–366. doi: 10.1016/s0028-3932(01)00111-7

Lajiness-O’Neill, R., Akamine, Y., and Bowyer, S. M. (2007). Treatment effect of Fast ForWord® demonstrated by magnetoencephalography (MEG) in a child with developmental dyslexia. Neurocase 13, 390–401. doi: 10.1080/13554790701851544

Levitt, H. (1971). Transformed up-down methods in psychoacoustics. J. Acoust. Soc. Am. 49, 467–477. doi: 10.1121/1.1912375

Lewandowska, M., Piatkowska-Janko, E., Bogorodzki, P., Wolak, T., and Szelag, E. (2010). Changes in fMRI BOLD response to increasing and decreasing task difficulty during auditory perception of temporal order. Neurobiol. Learn. Mem. 94, 382–391. doi: 10.1016/j.nlm.2010.08.005

Mates, J., von Steinbüchel, N., Wittmann, M., and Treutwein, B. (2001). A system for the assessment and training of temporal-order discrimination. Comput. Methods Programs Biomed. 64, 125–131. doi: 10.1016/s0169-2607(00)00096-1

Merzenich, M. M., Jenkins, W. M., Johnston, P., Schreiner, C., Miller, S. L., and Tallal, P. (1996). Temporal processing deficits of language-learning impaired children ameliorated by training. Science 271, 77–81. doi: 10.1126/science.271.5245.77

Nowak-Czerwińska, M. (1994). “Test sluchowego różnicowania glosek (TSRG),” in Mowa glośna i pismo. Materialy z Konferencji Naukowej Sekcji Logopedycznej Towarzystwa Kultury Jezyka (Warsaw, Poland: Towarzystwo Kultury Jezyka), 10–13.

Oldfield, R. C. (1971). The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia 9, 97–113. doi: 10.1016/0028-3932(71)90067-4

Oron, A., Szymaszek, A., and Szelag, E. (2015). Temporal information processing as a basis for auditory comprehension: clinical evidence from aphasic patients. Int. J. Lang. Commun. Disord. 50, 604–615. doi: 10.1111/1460-6984.12160

Oron, A., Wolak, T., Zeffiro, T., and Szelag, E. (2016). Cross-modal comparisons of stimulus specificity and commonality in phonological processing. Brain Lang. 155, 12–23. doi: 10.1016/j.bandl.2016.02.001

Pöppel, E. (1997). A hierarchical model of temporal perception. Trends Cogn. Sci. 1, 56–61. doi: 10.1016/S1364-6613(97)01008-5

Rahne, T., Deike, S., Selezneva, E., Brosch, M., König, R., Scheich, H., et al. (2008). A multilevel and cross-modal approach towards neuronal mechanisms of auditory streaming. Brain Res. 1220, 118–131. doi: 10.1016/j.brainres.2007.08.011

Sidiropoulos, K., Ackermann, H., Wannke, M., and Hertrich, I. (2010). Temporal processing capabilities in repetition conduction aphasia. Brain Cogn. 73, 194–202. doi: 10.1016/j.bandc.2010.05.003

Strehlow, U., Haffner, J., Bischof, J., Gratzka, V., Parzer, P., and Resch, F. (2006). Does successful training of temporal processing of sound and phoneme stimuli improve reading and spelling? Eur. Child Adolesc. Psychiatry 15, 19–29. doi: 10.1007/s00787-006-0500-4

Swisher, L., and Hirsh, I. J. (1972). Brain damage and the ordering of two temporally successive stimuli. Neuropsychologia 10, 137–152. doi: 10.1016/0028-3932(72)90053-x

Szelag, E., Dacewicz, A., Szymaszek, A., Wolak, T., Senderski, A., Domitrz, I., et al. (2015). The application of timing in therapy of children and adults with language disorders. Front. Psychol. 6:1714. doi: 10.3389/fpsyg.2015.01714

Szelag, E., Dreszer, J., Lewandowska, M., and Szymaszek, A. (2009). “Cortical representation of time and timing processes,” in Neuronal Correlates of Thinking, eds E. Kraft, B. Gulyás and E. Pöppel (Berlin: Springer-Verlag), 185–199.

Szelag, E., Kanabus, M., Kolodziejczyk, I., Kowalska, J., and Szuchnik, J. (2004). Individual differences in temporal information processing in humans. Acta Neurobiol. Exp. (Wars) 64, 349–366.

Szelag, E., Lewandowska, M., Wolak, T., Seniow, J., Poniatowska, R., Pöppel, E., et al. (2014). Training in rapid auditory processing ameliorates auditory comprehension in aphasic patients: a randomized controlled pilot study. J. Neurol. Sci. 338, 77–86. doi: 10.1016/j.jns.2013.12.020

Szelag, E., and Pöppel, E. (2000). Temporal perception: a key to understanding language. Behav. Brain Sci. 23, 52. doi: 10.1017/S0140525X0055239X

Szelag, E., and Szymaszek, A. (2014). Test do Badania Rozumienia Mowy u Dzieci i Dorosłych: Nowe Spojrzenie na Zegar Mózgowy. Sopot: GWP.

Szelag, E., Szymaszek, A., Aksamit-Ramotowska, A., Fink, M., Ulbrich, P., Wittmann, M., et al. (2011). Temporal processing as a base for language universals: cross-linguistic comparisons on sequencing abilities with some implications for language therapy. Restor. Neurol. Neurosci. 29, 35–45. doi: 10.3233/RNN-2011-0574

Szymaszek, A., Sereda, M., Pöppel, E., and Szelag, E. (2009). Individual differences in the perception of temporal order: the effect of age and cognition. Cogn. Neuropsychol. 26, 135–147. doi: 10.1080/02643290802504742

Tallal, P., Merzenich, M., Miller, S., and Jenkins, W. (1998). Language learning impairments: integrating basic science, technology and remediation. Exp. Brain Res. 123, 210–219. doi: 10.1007/s002210050563

Tallal, P., Miller, S. L., Bedi, G., Byma, G., Wang, X., Nagarajan, S. S., et al. (1996). Language comprehension in language-learning impaired children improved with acoustically modified speech. Science 271, 81–84. doi: 10.1126/science.271.5245.81

Tallal, P., and Piercy, M. (1973). Deficits of non-verbal auditory perception in children with developmental aphasia. Nature 241, 468–469. doi: 10.1038/241468a0

Temple, E., Deutsch, G. K., Poldrack, R. A., Miller, S. L., Tallal, P., Merzenich, M. M., et al. (2003). Neural deficits in children with dyslexia ameliorated by behavioural remediation: evidence from functional MRI. Proc. Natl. Acad. Sci. U S A 100, 2860–2865. doi: 10.1073/pnas.0030098100

Treutwein, B. (1995). Adaptive psychophysical procedures. Vision Res. 35, 2503–2522. doi: 10.1016/0042-6989(95)00016-x

von Steinbüchel, N., and Pöppel, E. (1993). Domains of rehabilitation: a theoretical perspective. Behav. Brain Res. 56, 1–10. doi: 10.1016/0166-4328(93)90017-k

von Steinbüchel, N., Wittmann, M., and Szelag, E. (1999). Temporal constraints of perceiving, generating and integrating information: clinical indications. Restor. Neurol. Neurosci. 14, 167–182.

Wittmann, M., Burtscher, A., Fries, W., and von Steinbüchel, N. (2004). Effects of brain-lesion size and location on temporal-order judgment. Neuroreport 15, 2401–2405. doi: 10.1097/00001756-200410250-00020

Wittmann, M., von Steinbüchel, N., and Szelag, E. (2001). Hemispheric specialisation for self-paced motor sequences. Cog. Brain Res. 10, 341–344. doi: 10.1016/s0926-6410(00)00052-5

Woo, S. H., Kim, K. H., and Lee, K. M. (2009). The role of the right posterior parietal cortex in temporal order judgment. Brain Cogn. 69, 337–343. doi: 10.1016/j.bandc.2008.08.006

Zatorre, R. J., Belin, P., and Penhune, V. B. (2002). Structure and function of auditory cortex: music and speech. Trends Cogn. Sci. 6, 37–46. doi: 10.1016/s1364-6613(00)01816-7

Zimmermann, P., and Fimm, B. (2007). Test for Attentional Performance. Herzogenrath: Psytest.

Keywords: temporal information processing, aphasia, stroke, phoneme discrimination, speech comprehension, temporal treatment

Citation: Szymaszek A, Wolak T and Szelag E (2017) The Treatment Based on Temporal Information Processing Reduces Speech Comprehension Deficits in Aphasic Subjects. Front. Aging Neurosci. 9:98. doi: 10.3389/fnagi.2017.00098

Received: 27 September 2016; Accepted: 28 March 2017;
Published: 11 April 2017.

Edited by:

P. Hemachandra Reddy, Texas Tech University Health Sciences Center, USA

Reviewed by:

Tadeusz Stanislaw Galkowski, SWPS University of Social Sciences and Humanities, Poland
Justyna Maria Skolimowska, The Maria Grzegorzewska University, Poland

Copyright © 2017 Szymaszek, Wolak and Szelag. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Elzbieta Szelag,
