The segment-to-frame association in word reading: early effects of the interaction between segmental and suprasegmental information

Sulpizio, Simone; Job, Remo

doi:10.3389/fpsyg.2015.01612

ORIGINAL RESEARCH article

Front. Psychol., 20 October 2015

Sec. Psychology of Language

Volume 6 - 2015 | https://doi.org/10.3389/fpsyg.2015.01612

This article is part of the Research Topic: Bridging Reading Aloud and Speech Production

Simone Sulpizio¹,²*

Remo Job¹

¹Department of Psychology and Cognitive Science, University of Trento, Trento, Italy
²Fondazione Marica De Vincenzi ONLUS, Trento, Italy

In four reading aloud experiments we investigated the operations occurring at the level of the phonological buffer by manipulating stress and phoneme information. In all experiments we adopted a masked priming paradigm with three-syllable Italian word targets. Experiments 1 and 2 tested the effect of pure segmental (e.g., fe%%%% – FEcola) and pure suprasegmental (CInema – FEcola) overlap, respectively. Experiments 3 and 4 tested the joint manipulation of segmental and suprasegmental information, by using prime-target pairs that shared the first syllable and did or did not share their stress pattern (e.g., FEgato – FEcola vs. feNIce – FEcola). The results showed that both segmental and suprasegmental primes affect reading at an abstract phonological level. Moreover, the joint manipulation of stress and phonemes showed an asymmetric pattern for different stress patterns, suggesting that the phonemic and the stress systems address the articulation planning through a process that starts as soon as the relevant information about the to-be-planned unit is active.

Introduction

Reading aloud involves computing the sound of a word from its visually presented form. Carrying out this process requires multiple operations, e.g., perceiving the written stimulus, computing the phonological code, and converting it into a speech signal. Given its specific nature, reading aloud thus has similarities and differences with both (silent) reading and speech production, the former being about getting from print to meaning and the latter being about getting from concepts to sounds. Since reading aloud may be construed as a print-to-sound mapping process, a key issue is understanding how a phonological code is translated into the sequence of articulatory gestures that correspond to the word’s sounds. Despite their importance, the operations involved in the planning and execution of articulation in reading aloud have not been investigated with the same fervor as word recognition or lexical access. As a consequence, little empirical evidence is available on how readers perform the two steps assumed to follow lexical retrieval and/or orthography-to-phonology mapping: phonological encoding, that is, the building of a sequence of well-formed phonological syllables, and phonetic encoding, that is, the computation of the phonetic-articulatory gestures of the to-be-uttered stimulus (Levelt et al., 1999). In most computational models of reading aloud, phonological and phonetic encoding are implemented as an oversimplified set of operations (see, e.g., Rastle and Coltheart, 2000; Coltheart et al., 2001; Arciuli et al., 2010; Perry et al., 2010).

Recent empirical work has provided evidence for a double process at the level of phonological encoding in reading. As in word production, reading polysyllabic words implies retrieving both segmental information (i.e., word sounds) and suprasegmental information (i.e., stress), and these two types of information may be computed separately (Colombo and Zevin, 2009; Sulpizio et al., 2012a,b; Sulpizio and Job, 2013). Ascribing the computation of stress and the computation of phonemes to two separate mechanisms has important consequences for the structure of phonological and phonetic encoding, since assembling the phonological unit will require the reader to carry out at least three operations: (a) activating the word’s segments, (b) activating the stress pattern, and (c) assembling segmental and suprasegmental information. Data on (c) are lacking, but some evidence is available for both (a) and (b).

Insight into phonological encoding in reading has been provided by the masked onset priming effect (henceforth MOPE; Forster and Davis, 1991; see Grainger and Ferrand, 1996): target words (e.g., sink) are named faster when preceded by a masked prime with the same initial phoneme (e.g., save) than by a prime with a different initial phoneme (e.g., ball). The main account of the MOPE – the speech planning account (Kinoshita, 2000) – assumes that the effect has a serial nature and affects the segment-to-frame association (Kinoshita, 2000; Kinoshita and Woollams, 2002; Malouf and Kinoshita, 2007; Dimitropoulou et al., 2010; but see Mousikou et al., 2010). Such a process allows the active phonological segments to be assigned to an abstract frame – i.e., the word’s metrical frame – specifying the number of syllables and the stress pattern of the word (e.g., for the word FEcola ‘starch,’ the metrical frame is ‘σ σ σ). The MOPE was also found by Schiller (2004) with a slightly different masked priming paradigm, in which participants had to read aloud Dutch words (e.g., banaan, ‘banana’) under two conditions: when preceded by a prime consisting of an onset-related word embedded in a sequence of symbols (e.g., %%balans%%, ‘balance’) and when preceded by an onset-related sequence prime that consisted of one or two letters embedded in a sequence of symbols (e.g., %%ba%%%%%%). Responses to targets were faster in both onset-related conditions than in the all-symbols control condition (%%%%%%%%), and Schiller suggested that the pre-activation of congruent phonological segments by the prime facilitates the phonological encoding of the target (see, e.g., Schiller and Kinoshita, 2007).

Taken together, these findings offer support for a stage of phonological encoding in the reading system; during this stage, after having retrieved/computed the word’s phonemes and stress, the reader assembles the phonological word through a rightward serial process that associates the phonological segments to a metrical frame. The resulting unit is then used to address the articulatory system (see Levelt et al., 1999 for a detailed description of phonological encoding in speech production).
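The segment-to-frame association described above can be pictured with a small sketch. It is illustrative only: the `MetricalFrame` class, the `associate` function, and the left-to-right loop are assumptions made for exposition, not an implementation of any of the models cited here. Syllables are supplied by hand, and the stressed syllable is rendered in capitals, following the notation used in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricalFrame:
    """Abstract frame: number of syllables plus the stressed position (0-based)."""
    n_syllables: int
    stressed: int


def associate(syllables, frame):
    """Fill the frame slot by slot, from left to right (hypothetical sketch)."""
    if len(syllables) != frame.n_syllables:
        raise ValueError("segments do not fit the metrical frame")
    slots = []
    for position, syllable in enumerate(syllables):
        # The stressed slot is written in capitals, as in FEcola or feNIce.
        if position == frame.stressed:
            slots.append(syllable.upper())
        else:
            slots.append(syllable.lower())
    return "".join(slots)


antepenultimate = MetricalFrame(n_syllables=3, stressed=0)
penultimate = MetricalFrame(n_syllables=3, stressed=1)

print(associate(["fe", "co", "la"], antepenultimate))  # FEcola
print(associate(["fe", "ni", "ce"], penultimate))      # feNIce
```

The same segments can be paired with either frame, which is the sense in which segmental and metrical information are treated as separable in this sketch.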

With regard to stress, some studies have investigated stress assignment to polysyllabic words, asking whether the computation of stress may be independent of the computation of segmental information. The results have been mixed. In a series of implicit form-priming experiments, in which participants first learned pairs of words (e.g., meer-water ‘lake-water’) and then had to produce the second word of the pair (e.g., water) in response to the presentation of the first (e.g., meer), Roelofs and Meyer (1998) manipulated the stress pattern of the to-be-produced words (all having either the same or different stress) and did not find any stress priming effect. However, adopting different priming methodologies (all involving visible primes), some reading aloud studies have shown that the metrical structure of a word may be primed independently of its segmental content, and that this is possible both when stress is assigned to pseudowords and when it is lexically retrieved (Colombo and Zevin, 2009; Sulpizio et al., 2012a,b). Possible explanations for the divergent results are offered in the General Discussion, but for the time being we assume that the computation of stress and that of segmental information are to some extent independent. To illustrate this issue, we may refer to Sulpizio et al.’s (2012b) study: readers were presented with prime-target word pairs that did or did not share the stress pattern (e.g., TESsera – BUfala, ‘card’ – ‘hoax’ vs. cuGIno – BUfala¹, ‘cousin’ – ‘hoax’) and were faster in reading the targets when preceded by a congruent-stress prime than when preceded by an incongruent-stress prime. The finding invites the conclusion that readers have an abstract representation of stress, quite independent of the segmental material, and that this representation is involved in the segment-to-frame association and in the articulatory planning of the stimulus, thus affecting target processing.

While phonemic computation and stress assignment are to some extent handled by autonomous systems, they need to interact during processing. Specifically, articulation requires a segment-to-frame association, in which the system associates the computed phonological segments to a metrical frame, and such a well-formed phonological unit will allow articulation (Dell, 1986, 1988; Levelt et al., 1999).

The speech production literature may help to shed light on the functioning of the segment-to-frame association in word reading. Since both reading aloud and speech production require the construction of a phonological unit and its conversion into articulatory programs, they share (at least in part) the processing stages aimed at encoding the phonological word and at using it to produce the phonetic realization of the stimulus (Roelofs, 2004).

To investigate the segment-to-frame association and the phonological-to-phonetic mapping in word reading, we ran four experiments in Italian, capitalizing on the fact that in this language stress is neither graphically marked nor solely determined by orthographic structure² and that, therefore, any particular word’s stress pattern can only be reliably established through lexically stored information. Our results should then be generalizable to other languages with polysyllabic words and a similar stress system, such as English.

Although distributional cues allow Italian readers to assign stress to pseudowords to some extent (Colombo and Zevin, 2009; Sulpizio et al., 2013), such cues play no role in word reading (Paizi et al., 2011; Sulpizio and Colombo, 2013). The fact that in Italian word stress is lexically based may be helpful for investigating phonological encoding: since there is no algorithmic procedure to assign stress to a stimulus, the metrical structure has to be lexically retrieved and then combined with the segmental material to shape the phonological word, which will then be used by the system to address articulatory programs.

Experiments 1 and 2 investigated the MOPE and the stress priming effect by means of a masked priming paradigm, with a set of tightly controlled stimuli, in order to establish whether the two effects are facilitatory or inhibitory. Moreover, with regard to the MOPE, the use of a pure segmental prime (e.g., fe%%%% – FEcola, ‘starch’) allowed us to test whether the activation of the first phonological segments of the word automatically activates suprasegmental information, since the masked segment (e.g., <fe>) might activate either a syllabic unit – which may be phonetically specified for stress (i.e., as stressed or unstressed) – or only its segmental constituents (i.e., /f/ and /e/).

We also adopted the masked priming paradigm in Experiments 3 and 4, but the aim here was to test the effect of the joint manipulation of segmental and suprasegmental information. Thus, for each prime-target pair, the prime either shared both the initial phonemes and the stress pattern with the target (e.g., FEgato – FEcola, ‘liver’ – ‘starch’), or shared the initial phonemes with the target but had a different stress pattern (e.g., feNIce – FEcola, ‘phoenix’ – ‘starch’). In the control condition, the prime-target pair shared neither segmental nor suprasegmental information, the prime being composed of a string of symbols (%%%%%%). The manipulation is particularly interesting because Italian three-syllable words have two main stress patterns (Thornton et al., 1997): antepenultimate stress (i.e., the first syllable bears stress, e.g., TAvolo ‘table’), and penultimate stress (i.e., the second syllable bears stress, e.g., coLOre ‘color’). Although their distribution differs – 80% of three-syllable words bear penultimate stress and 18% bear antepenultimate stress³ – words bearing the dominant penultimate stress pattern are not read faster, and the two patterns are assumed to be stored in the phonological lexicon (Burani and Arduino, 2004; Paizi et al., 2011). Thus, a further question we may ask is whether the prime-target manipulation affects penultimate- and antepenultimate-stress targets similarly. For the manipulation we proposed – prime-target pairs sharing both initial phonemes and stress vs. prime-target pairs sharing initial phonemes but not stress – we may sketch the following predictions: congruent primes should facilitate, and incongruent primes should inhibit, target articulation. The facilitation would be brought about by the prime pre-activating segments and/or stress (cf. Roelofs and Meyer, 1998) congruent with the target, while in the incongruent condition the stress mismatch would be enough to delay articulation.

In fact, if we assume, in line with current computational models of polysyllabic word reading (Perry et al., 2010), that readers do not start articulation until stress has been fully activated (since only determining which syllable is stressed guarantees correct performance), we may expect that incongruency at the suprasegmental level is sufficient to delay articulation, irrespective of any overlap at the segmental level. Moreover, since previous stress priming studies have shown that stress priming effects do not seem to be modulated by the position of word stress (Sulpizio et al., 2012b), no difference is expected between penultimate- and antepenultimate-stress targets.

Experiment 1

In Experiment 1 we tested the MOPE in a reading aloud experiment with Italian penultimate- and antepenultimate-stress words as targets. We adopted the paradigm proposed by Schiller (2004; see also Schiller and Kinoshita, 2007), in which the target word (e.g., FEcola, ‘starch’) is preceded by an onset-related or onset-unrelated sequence (e.g., fe%%%%; mi%%%%). In this way, we are able to exclude any effect of suprasegmental material that, in the case of a whole-word prime (e.g., FEgato ‘liver’), might be elicited by the activation of stress information. In addition, in order to establish the direction of the effect, we also included a control condition that did not involve orthographic information.

The aim of Experiment 1, however, was not only to replicate previous studies showing that onset-related primes facilitate the computation of target phonology during reading aloud, but also to test whether a pure segmental prime may also activate suprasegmental information. In the onset-related condition, the prime and the first syllable of the target were segmentally identical; however, the target syllable was either stressed (e.g., FEcola ‘starch’) or unstressed (e.g., feNIce ‘phoenix’), and thus the prime syllable could or could not be congruent in stress with the first syllable of the target. This allows us to propose two alternative predictions. First, if the prime affects an abstract phonological level of computation, such as the segment-to-frame association, then readers should be faster in reading a target word in the onset-related condition than in either the onset-unrelated condition or the control condition (Schiller, 2004), and this should be true for both antepenultimate- and penultimate-stress words. Alternatively, if the prime affects the phonetic level of target computation, by activating a phonetic syllabic unit that also contains information about stress, then we should expect different results for penultimate- and antepenultimate-stress targets. The reason for this is that penultimate-stress targets start with an unstressed syllable whereas antepenultimate-stress targets start with a stressed syllable. Thus, if the prime activates a stressed syllable, it might facilitate antepenultimate-, but not penultimate-stress targets; conversely, if the prime activates an unstressed syllable, it might facilitate penultimate-, but not antepenultimate-stress targets.
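The two alternative predictions can be laid out as a small lookup. This is only a restatement of the reasoning above in code form; the function name and the three account labels are invented for this sketch and do not come from the cited literature.

```python
def related_prime_facilitates(account, target_stress):
    """Return True if the account predicts a benefit from the onset-related prime.

    account: "phonological", "phonetic_stressed", or "phonetic_unstressed"
             (labels invented for this sketch).
    target_stress: "antepenultimate" (stressed first syllable) or
                   "penultimate" (unstressed first syllable).
    """
    if target_stress not in ("antepenultimate", "penultimate"):
        raise ValueError(f"unknown stress pattern: {target_stress}")
    first_syllable_stressed = target_stress == "antepenultimate"
    if account == "phonological":
        # Abstract segments carry no stress, so both target types benefit.
        return True
    if account == "phonetic_stressed":
        # The prime activates a stressed syllabic unit.
        return first_syllable_stressed
    if account == "phonetic_unstressed":
        # The prime activates an unstressed syllabic unit.
        return not first_syllable_stressed
    raise ValueError(f"unknown account: {account}")


for account in ("phonological", "phonetic_stressed", "phonetic_unstressed"):
    row = {stress: related_prime_facilitates(account, stress)
           for stress in ("antepenultimate", "penultimate")}
    print(account, row)
```

Only the phonological account predicts the same outcome for both stress patterns, so an interaction between prime type and target stress would be the signature of a phonetic locus.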

Method

Participants

Twenty-four students (six males, mean age: 23.33; SD: 4.73) from the University of Trento took part in the experiment. They received course credit for their participation. All participants were Italian native speakers with normal or corrected-to-normal vision. This and all the following experiments were carried out in accordance with the recommendations of the University of Trento ethics committee.

Materials

Targets were two sets of 24 three-syllable words each. One set comprised penultimate-stress words and the other antepenultimate-stress words. Words were selected from the CoLFIS database (Bertinetto et al., 2005) and were matched on frequency, orthographic neighborhood size, orthographic neighbors’ summed frequency, and bigram frequency (Table 1). Words in the two sets were also matched on their first syllable, i.e., for each word in one set there was a word in the other set starting with the same syllable (e.g., FEcola ‘starch’ and feRIta ‘wound’). All words were six letters long and had the same CVCVCV syllabic structure. All stimuli are listed in the Appendix.

TABLE 1. Summary statistics: mean (and standard deviation) for target words used in Experiments 1–3.

Each target (e.g., FEcola ‘starch’) was paired with three different primes: (i) a control condition, in which the prime consisted of a string of symbols (%%%%%%); (ii) an onset-related condition, in which prime and target shared the first syllable (e.g., fe%%%%); (iii) an onset-unrelated condition, in which prime and target differed in the first syllable (e.g., mi%%%%). Three different lists were created, and each target appeared once in each list, in a different prime condition. Within each list the three prime conditions appeared the same number of times.
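The list construction just described amounts to a Latin-square rotation of prime conditions over targets. The sketch below is one way to obtain it and is not the authors' script: the function names, the rotation rule, the fixed `mi` unrelated onset, and the placeholder target strings are all assumptions made for illustration (the real items are listed in the Appendix).

```python
from collections import Counter

CONDITIONS = ("control", "related", "unrelated")


def make_prime(target, condition, unrelated_onset="mi"):
    """Build a masked prime of the same length as a six-letter CVCVCV target.

    The unrelated onset is a placeholder; the syllables actually used
    in the experiment are not reproduced here.
    """
    if condition == "control":
        return "%" * len(target)
    onset = target[:2].lower() if condition == "related" else unrelated_onset
    return onset + "%" * (len(target) - 2)


def build_lists(targets):
    """Rotate conditions so each list contains every target exactly once."""
    lists = []
    for list_index in range(len(CONDITIONS)):
        trials = []
        for item_index, target in enumerate(targets):
            condition = CONDITIONS[(item_index + list_index) % len(CONDITIONS)]
            # Primes are lower case and targets upper case, as in the Procedure.
            trials.append((make_prime(target, condition), target.upper(), condition))
        lists.append(trials)
    return lists


# Hypothetical stand-ins for the 48 six-letter targets.
targets = [f"ta{i:04d}" for i in range(48)]
lists = build_lists(targets)
for trials in lists:
    print(Counter(condition for _, _, condition in trials))
```

With 48 targets and three conditions, the rotation gives 16 trials per condition in each list, and every target meets each condition once across the three lists.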

Procedure

Participants were tested individually. They were instructed to read the targets aloud as quickly and accurately as possible. No information was given about the presence of the primes, which was revealed only after the experiment.

The experiment was run using E-Prime software (Psychology Software Tools, Pittsburgh, PA, USA). Each trial started with a fixation cross, displayed in the center of the screen for 400 ms. The fixation cross was followed by a forward mask of hash marks (#), which was displayed for 500 ms in the center of the screen. The prime was then presented for 50 ms in lower-case letters in the same location, followed by the target word, displayed in upper-case letters in the same position as the prime. The target remained on the screen until the participant began to read or for a maximum of 1,500 ms. A voice key connected to the computer measured reaction times (RTs) in ms from the onset of pronunciation.

The inter-stimulus interval was 1,500 ms. A short practice session preceded the experiment.
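The trial sequence can be summarized as a simple timeline. The durations below are those given in the Procedure; the event names and the `event_onsets` helper are assumptions of this sketch, which does not reproduce the E-Prime script. The target duration is the 1,500 ms maximum, since the display ended earlier whenever the participant began to read.

```python
# Event durations in ms, taken from the Procedure; the target value is a maximum.
TRIAL_EVENTS = [
    ("fixation", 400),
    ("forward_mask", 500),
    ("prime", 50),
    ("target", 1500),
]


def event_onsets(events):
    """Return (name, onset_ms, offset_ms) for each event, in presentation order."""
    timeline = []
    clock = 0
    for name, duration in events:
        timeline.append((name, clock, clock + duration))
        clock += duration
    return timeline


for name, onset, offset in event_onsets(TRIAL_EVENTS):
    print(f"{name:<13}{onset:>5} -> {offset:>5} ms")
```

On this timeline the prime occupies the 50 ms window immediately before the target, with no backward mask other than the target itself.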

Each participant received all three lists, each in a separate block, with a short interval between blocks. Each block contained only one token of each target and an equal number of the three types of prime-target pairs; the order of blocks was counterbalanced across participants and the order of prime-target pairs was randomized within each block. The experimenter noted naming errors and apparatus failures on the fly.

Results

Responses shorter than 200 ms and invalid trials due to technical failures accounted for 1.3% of all data points and were discarded from the analyses; outliers (0.9% of all data points) were identified and removed following Van Selst and Jolicoeur’s (1994) procedure. Three items (PAtina ‘patina,’ coLEra ‘cholera,’ MItilo ‘mussel’), each with an error rate above 30%, were also excluded from the analyses. Naming errors were few (2.4% of all data points) and were not analyzed. Naming times were analyzed using mixed-effects models (Baayen et al., 2008), fitted with the lmer function in R. The models included prime type (related, unrelated, and control) and stress of the target (penultimate and antepenultimate) as fixed factors⁴. For the random factors, a maximal random structure approach was used (by-participant and by-item random intercepts and slopes; see Barr et al., 2013). The analysis started with a full factorial model including the main effects and the two-way interaction. The model was then progressively simplified: variables were evaluated one by one with likelihood ratio tests, and those whose exclusion did not significantly decrease the model’s goodness of fit were removed. Statistics of the best model are reported. Statistical significance of the fixed parameters was evaluated using the MCMC procedure, sampling 10,000 times (Baayen et al., 2008). Results are reported in Figure 1.

FIGURE 1. Mean reading times for correct responses by condition in Experiment 1.

The full factorial model revealed that the prime type by stress of the target interaction was not significant, and it was dropped from the analysis as it did not significantly increase the model goodness of fit (χ² = 2.47, p > 0.2). The reduced model revealed that prime type significantly affected reading of target words, with slower reading times for targets preceded either by unrelated primes (β = 14.46, SE = 3.37, t = 3.61, p < 0.001) or by control primes (β = 10.01, SE = 3.61, t = 2.76, p = 0.005) than for targets preceded by related primes. The unrelated and the control conditions did not differ (t = 1.22, p > 0.2). The effect of target stress was not significant (t = -1.25, p > 0.2).
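As a worked check on the likelihood ratio test for the interaction, the reported χ² can be converted to a p-value. The sketch assumes 2 degrees of freedom, which is what a three-level by two-level interaction contributes under standard contrast coding; the degrees of freedom are not stated in the text, so this is an assumption, and the helper names are ours. For 2 degrees of freedom the upper tail of the chi-square distribution has the closed form exp(-x/2).

```python
import math


def likelihood_ratio_statistic(loglik_full, loglik_reduced):
    """Twice the log-likelihood gain of the full model over the reduced one."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_sf_df2(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    Valid for df = 2 only, where the survival function is exp(-x / 2).
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.exp(-statistic / 2.0)


# Reported chi-square for dropping the prime type by stress interaction;
# df = 2 is assumed, not taken from the text.
p_interaction = chi2_sf_df2(2.47)
print(round(p_interaction, 3))  # 0.291
```

Under that assumption the value is consistent with the reported p > 0.2.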

Discussion

The results of Experiment 1 show a clear effect of the segmental overlap on reading times: readers were facilitated in reading a target word in the onset-related condition in comparison to both the onset-unrelated and the control condition. The pattern goes in the same direction for penultimate- and antepenultimate-stress targets, suggesting similar processing in the computation of segmental information for both types of words.

The pattern we obtained is entirely compatible with Schiller’s (2004; Schiller and Kinoshita, 2007) explanation: in the onset-related condition the prime pre-activates the initial phonological segments of the target at the level of phonological encoding. According to such a view, the active units are phonological segments and not phonetically specified syllabic units.

The analogous pattern obtained for penultimate- and antepenultimate-stress targets supports this claim. In our experiment, the congruent prime always coincided with the first syllable of the target and, thus, the prime might have activated a syllabic unit rather than two phonological segments. However, in Italian a syllabic unit is realized in one of two different phonetic versions, i.e., as stressed or unstressed. Thus, the prime could have affected the target at a phonetic level, by activating a phonetically specified syllabic unit, which would also activate information about stress. Were this the case, a different pattern for penultimate- and antepenultimate-stress targets would be expected, since pre-activation of stressed syllables would facilitate reading antepenultimate-stress targets (which start with a stressed syllable) but not penultimate-stress targets (which start with an unstressed syllable), and pre-activation of unstressed syllables would lead to the opposite pattern. The parallel pattern we found for penultimate- and antepenultimate-stress words suggests that the prime exerts its effect at an abstract phonological level, with a benefit for onset-overlapping targets during the phonological encoding of the word (Schiller, 2004)⁵.

In Experiment 2 we investigated the effect of suprasegmental priming on the phonological encoding of the word, using the same set of target words as in Experiment 1.

Experiment 2

The aim of the present experiment was to establish whether masked stress priming is effective in generating a stress priming effect, and whether such an effect is facilitatory or inhibitory in nature. The stress priming effect reported by previous studies has never been tested against a control condition (Colombo and Zevin, 2009; Sulpizio et al., 2012a,b), with the consequence that it is still unclear whether priming the metrical structure of a word facilitates or inhibits reading it aloud. Moreover, since all the aforementioned studies adopted a visible priming technique, in which readers explicitly processed the prime, it cannot be excluded that the stress priming effect they reported has a strategic component. To rule out this possibility, we used the masked priming paradigm with prime-target pairs that differed at the segmental level but did or did not share the metrical structure. In this way, we would be able to assess whether primes sharing or not sharing stress with the targets (i.e., the congruent vs. incongruent condition) affect target reading, with respect to a non-linguistic control condition, over and above any effect due to the prime-target mismatch at the segmental level.