Edited by: Adrian Garcia-Sierra, University of Connecticut, United States
Reviewed by: Naomi Yamaguchi, Université de la Sorbonne Nouvelle Paris III, France; Carl Dunst, Orelena Hawks Puckett Institute, United States; Kyle Danielson, University of Toronto, Canada
This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
When speaking to infants, parents typically use infant-directed speech, a speech register that in several aspects differs from that directed to adults. Vowel hyperarticulation, that is, extreme articulation of vowels, is one characteristic sometimes found in infant-directed speech, and it has been suggested that there exists a relationship between how much vowel hyperarticulation parents use when speaking to their infant and infant language development. In this study, the relationship between parent vowel hyperarticulation and phonetic complexity of infant vocalizations is investigated. Previous research has shown that on the level of subject means, a positive correlational relationship exists. However, the previous findings do not provide information about the directionality of that relationship. In this study the relationship is investigated on a conversational turn level, which makes it possible to draw conclusions on whether the behavior of the infant is impacting the parent, the behavior of the parent is impacting the infant, or both. Parent vowel hyperarticulation was quantified using the vhh-index, a measure that allows vowel hyperarticulation to be estimated for individual vowel tokens. Phonetic complexity of infant vocalizations was calculated using the Word Complexity Measure for Swedish. Findings were unexpected in that a negative relationship was found between parent vowel hyperarticulation and phonetic complexity of the immediately following infant vocalization. Directionality was suggested by the fact that no such relationship was found between infant phonetic complexity and vowel hyperarticulation of the immediately following parent utterance. A potential explanation for these results is that high degrees of vowel hyperarticulation either provide, or co-occur with, large amounts of phonetic and/or linguistic information, which may occupy processing resources to an extent that affects production of the next vocalization.
This study investigates the relationship between parents’ infant-directed speech (IDS) and the developing speech production of the infant. In terms of IDS, the focus lies on the specific characteristic of vowel hyperarticulation (VH) often—but not always—found in IDS (e.g.,
It has been demonstrated that at least some aspects of IDS are part of a feedback loop between parent and infant, in which parents respond to infants’ in-the-moment reactions to their speech by amplifying or attenuating certain IDS characteristics. The pitch of mothers’ IDS to their four-month-old infants can be manipulated by interrupting this feedback loop (
It has also been reported that mothers respond differently to different types of infant vocalizations; for example, more mature infant vocalizations elicit a vocal response from the mother more frequently than less mature infant vocalizations (
When it comes to VH, mothers’ articulation of vowels was impacted when the feedback loop was interrupted as they interacted with their six- to seven-month-old infants (
To summarize, infant behavior—including vocalizations—influences the specific realization of parent IDS in the moment. This has been shown for a number of IDS characteristics, including VH. VH is a result of spontaneous communicative adaptation to the perceptual and linguistic demands of the interlocutor (
The linguistic, prosodic, and articulatory modifications that parents use when speaking IDS to their infants are thought to impact both infant language development in the long term and infant language production and perception in the short term. For example, overall amount of IDS in everyday speech input at seven to eleven months is positively correlated with language outcomes at five years of age (
Parent social and vocal behavior has also been shown to influence infant vocal behavior. For example, the amount of IDS in a one-on-one setting correlates with the amount of infant speech output (
When it comes to the specific characteristic of VH in parents’ IDS, it has been shown to predict later vocabulary size (
To summarize, parent behavior, both in terms of IDS realization and temporally contingent social feedback, influences infant language in the long term and/or in the moment. When it comes to VH in parent IDS and the phonetic complexity (PC) of infant vocalizations, a positive correlational relationship between them has been shown (Marklund et al., accepted), but any potential momentary impact is yet to be established.
This study focuses on the relationship between parent VH and PC of infant vocalizations. A positive relationship between the two has previously been established on a subject level (Marklund et al., accepted), which leaves the question of directionality unanswered. Does the phonetic maturity of infant vocalizations influence the articulatory behavior of the parent, and/or does the clarity of parents’ articulation influence the vocal behavior of the infant? Based on the previous findings reviewed above, both explanations are plausible. To shed light on this issue, the present study focuses on the relationship between parent VH and infant PC on a turn level. The VH of parent utterances immediately preceding and following infant vocalizations is calculated and related to the PC of the vocalization.
This study uses vhh-index, a measure of VH that normalizes across vowel type and speaker, and thus makes it possible to estimate and compare VH of individual vowel tokens. This measure has been used in a previous study on VH in Swedish IDS to 12-month-olds, where it was motivated from phonetic theory and compared to traditional measures of VH for validation purposes (
The measure of infant vocalization maturity used in the present study is the Word Complexity Measure for Swedish (WCM-SE;
Nineteen infants and their parents participated in this study (9 girls, 10 boys; 12 mothers, 7 fathers). At the time of recording the material, the infants were approximately 12 months old (mean = 12.0, range = 11.5–12.3, SD = 0.2). All infants were born full term (within three weeks of due date) and monolingual (defined as both parents speaking only Swedish with the infant). The majority of the parents (
Participants were selected for inclusion in the present study if (a) there was a recording from the 12-month visit, (b) the infant was monolingual, and (c) there was sufficient ADS material (recorded at the 27-month visit, from the same parent as in the 12-month visit) to include in the VH analysis. The study has been approved by the Regional Ethics Review Board (2015/63-31). For the original longitudinal study, recruitment was conducted
Audio and video recordings of parent–infant interaction were made at Stockholm Babylab, the Phonetics Laboratory, Stockholm University. Parent–infant dyads (one parent and the infant) were recorded in a comfortable carpeted studio equipped with age-appropriate furniture and toys. Video and audio recordings were made with three cameras (Canon XA10) mounted on the walls of the studio to capture all angles of the parent interacting with the infant. A fourth camera (GoPro Hero3), attached to the parent’s chest, captured video of the infant facing the parent. To capture high-quality audio, three additional microphones were used: omnidirectional wireless lavalier microphones (Sennheiser EW 100 G2) were mounted on parent and infant, and one room microphone (AKG SE 300 B) was mounted on a high shelf. In the present study, audio from the two lavalier microphones was used, since this enables high-quality close-up recordings of the parent’s speech and the infant’s vocalizations with minimal interference from the other speaker.
Each infant was recorded together with the parent for approximately 10 min, providing the infant vocalizations and the parent IDS material for the current study. The experimenter instructed parents to interact, play, and talk with their infant as they typically would at home. After instructions and equipment arrangements, the experimenters left the studio, closed the door, and monitored the session from the adjacent control room.
Estimation of VH in parent’s IDS was performed as a part of a previous study (
VH was quantified using a novel measure, the vhh-index, which entails speaker and vowel normalization, so that VH can be estimated for each individual vowel token (
Infant vocalizations were transcribed in ELAN 5.8-5.9 (
The protocol entailed transcribing all sounds present in the Swedish phoneme inventory as described in
The Swedish consonants used in the transcription. Consonants not recognizable as any of those phonemes were marked as “C.” Adapted from IPA Chart from International Phonetic Association.
Places of articulation covered: bilabial, labiodental, dental, retroflex, alveolar, palatal, velar, uvular, glottal.
Manner | Consonants |
---|---|
Plosive | p, b, t, d, ʈ, ɖ, k, ɡ |
Nasal | m, n, ɳ, ŋ |
Trill | r, ʀ |
Tap/flap | ɾ |
Fricative | f, v, s, ʂ, ʐ, ʝ, ʁ, h |
Approximant | ɹ |
Lat. approximant | l, ɭ |
The Swedish vowels used in the transcription. Vowels not recognizable as any of those phonemes were marked as “V.” Adapted from IPA Chart from International Phonetic Association.
All infant vocalizations were transcribed. They could consist of words, syllables, babbling, or isolated speech sounds. Laughter, crying, fussing, coughing, effort sounds, and vegetative sounds such as breathing, sneezes, and hiccups were not transcribed. Overlapping speech and distorted sounds were excluded. Boundaries between vocalizations were based on silence (pause or breath) and thus not dependent on interpretation of lexical content or on other linguistic information such as intonation.
Two recordings were annotated by both annotators independently to check inter-transcriber agreement. Percentage of matching characters for each transcribed vocalization was compared. Characters were IPA consonants, IPA vowels (treated as a single category in the inter-rater comparison, unless they were long, front, and rounded, i.e., relevant to the WCM-SE measure, in which case their vowel quality was taken into account), syllable markers, stress markers, and vocalization boundary markers. Inter-transcriber agreement of which vocalizations were transcribed was 70%, and out of those the average transcription inter-transcriber agreement was 78%.
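The character-matching comparison described above can be illustrated with a minimal sketch. The source does not say how the two transcriptions were aligned, so the use of Python's `difflib.SequenceMatcher` and the normalization by the longer transcription are assumptions made here for illustration; the function name `matching_percentage` is hypothetical, and the collapsing of vowels into a single category is left out for brevity.

```python
from difflib import SequenceMatcher


def matching_percentage(transcription_a: str, transcription_b: str) -> float:
    """Percentage of characters that match between two transcriptions.

    Assumption: characters are aligned with difflib's longest-matching-block
    heuristic, and the count is normalized by the longer transcription.
    """
    if not transcription_a and not transcription_b:
        return 100.0  # two empty transcriptions agree trivially
    matcher = SequenceMatcher(None, transcription_a, transcription_b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / max(len(transcription_a), len(transcription_b))


# Two hypothetical transcriptions of the same vocalization that differ
# in a single consonant ([k] versus [ɡ]).
print(matching_percentage(".ˈtɛ.kɛ.tæ", ".ˈtɛ.ɡɛ.tæ"))
```

Averaging this value over all vocalizations that both annotators transcribed would give an overall agreement figure comparable in kind to the one reported above.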
To operationalize complexity in infant vocalizations, the WCM-SE was used (
The WCM-SE measure as implemented in the present study, based on
Domains | Complexity parameter | N points |
---|---|---|
Word patterns | >2 syllables | 1 per vocalization |
Word patterns | Non-initial stress | 1 per vocalization |
Syllable structures | Word-final consonant | 1 per vocalization |
Syllable structures | Consonant cluster | 1 per occurrence |
Sound classes | Velar consonant [k], [ɡ], [ŋ], [ɧ] | 1 per occurrence |
Sound classes | Liquid [l], [ɭ], [ɹ] | 1 per occurrence |
Sound classes | Fricative | 1 per occurrence |
Sound classes | Voiced fricative [v], [ʐ], [ʁ], [ʝ] | 1 per occurrence |
Sound classes | Trill [r], [ʀ] | 3 per occurrence |
Sound classes | Long, front, rounded vowel [y], [ø], [ʉ̟] | 1 per occurrence |
Examples of transcriptions of infant vocalizations, and WCM-SE calculations for the vocalizations. Syllable onsets are denoted by “.” and stress by “ˈ”.
Example transcription | Complexity parameters | Points |
---|---|---|
.ə.ˈbm | Non-initial stress, word-final consonant, consonant cluster | 3 |
.ˈC.C | Word-final consonant | 1 |
.ˈtɛ.kɛ.tæ | >2 syllables, velar consonant [k] | 2 |
.ˈhɪ.Cɪ.V | >2 syllables, fricative [h] | 2 |
.hɔŋ.ɡɛ.ˈjɛ | >2 syllables, non-initial stress, velar consonants [ŋ] and [ɡ], fricative [h] | 5 |
ˈV.V | – | 0 |
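The scoring in the examples above can be reproduced with a short sketch. It is not the authors' implementation, and several details are assumptions: the fricative set is taken from the fricative row of the consonant chart, voiced fricatives receive their point in addition to the fricative point, [j] and the placeholder "C" count as consonants without sound-class points, clusters are counted within syllables only, and vowel length is assumed to be marked with "ː". Point values are copied from the parameter table, and the function name `wcm_se_score` is hypothetical.

```python
import re

# Sound classes; the fricative set is an assumption based on the consonant chart.
VELARS = set("kɡgŋɧ")
LIQUIDS = set("lɭɹ")
FRICATIVES = set("fvsʂʐʝʁh")
VOICED_FRICATIVES = set("vʐʁʝ")
TRILLS = set("rʀ")
# Consonants without sound-class points, including the placeholder "C".
OTHER_CONSONANTS = set("pbtdʈɖmnɳɾjC")
CONSONANTS = VELARS | LIQUIDS | FRICATIVES | TRILLS | OTHER_CONSONANTS
# Long, front, rounded vowels; length marking with "ː" is an assumption.
LONG_FRONT_ROUNDED = re.compile("[yøʉ]\u031f?ː")
# Points per occurrence, copied from the parameter table.
POINTS = {"velar": 1, "liquid": 1, "fricative": 1,
          "voiced_fricative": 1, "trill": 3, "long_front_rounded": 1}


def wcm_se_score(transcription: str) -> int:
    """Score one vocalization; "." marks syllable onsets and "ˈ" marks stress."""
    syllables = [s for s in transcription.split(".") if s]
    segments = "".join(syllables).replace("ˈ", "")
    score = 0

    # Word patterns (scored once per vocalization).
    if len(syllables) > 2:
        score += 1
    stressed = [i for i, s in enumerate(syllables) if "ˈ" in s]
    if stressed and stressed[0] > 0:
        score += 1

    # Syllable structures.
    if segments and segments[-1] in CONSONANTS:
        score += 1  # word-final consonant
    for syllable in syllables:
        run = 0
        for ch in syllable.replace("ˈ", ""):
            run = run + 1 if ch in CONSONANTS else 0
            if run == 2:
                score += 1  # one point per cluster, however long it is

    # Sound classes (scored per occurrence).
    for ch in segments:
        if ch in VELARS:
            score += POINTS["velar"]
        if ch in LIQUIDS:
            score += POINTS["liquid"]
        if ch in FRICATIVES:
            score += POINTS["fricative"]
        if ch in VOICED_FRICATIVES:
            score += POINTS["voiced_fricative"]
        if ch in TRILLS:
            score += POINTS["trill"]
    score += POINTS["long_front_rounded"] * len(LONG_FRONT_ROUNDED.findall(segments))
    return score


for example in [".ə.ˈbm", ".ˈC.C", ".ˈtɛ.kɛ.tæ", ".ˈhɪ.Cɪ.V", ".hɔŋ.ɡɛ.ˈjɛ", "ˈV.V"]:
    print(example, wcm_se_score(example))
```

Under these assumptions the sketch returns the same scores as the six example rows (3, 1, 2, 2, 5, and 0).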
Data consist of infant vocalization WCM-SE score and vhh-index measures of the preceding and following parent utterances. Cases where an infant vocalization was preceded or followed by another infant vocalization were excluded. Since the vhh-index is novel, and token-based measures of VH have not been used previously, a number of vhh-index measures were included for exploratory purposes. All VH measures were calculated on the level of utterances, that is, they are based on all vowels for which vhh-index could be calculated within a single parent utterance. The measures were mean vhh-index, max vhh-index, vhh-index range, hyperarticulation ratio (number of vowels with vhh-index > 50 over total number of vowels), weighted mean vhh-index, and weighted max vhh-index. The weighted mean and max vhh-index entail multiplying the vhh-index of each vowel by its duration, to give more weight to longer vowels and less weight to shorter vowels. The purpose of weighting the vowel tokens like this is to reflect their relative salience in the speech signal; a vowel with long duration entails longer exposure to its particular spectral properties than a vowel with shorter duration.
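As an illustration of the six utterance-level measures, the following sketch computes them from a list of vowel tokens. It rests on assumptions not stated in the text: each token is represented as a (vhh-index, duration in seconds) pair, and the weighted measures are the mean and maximum of the per-token products rather than a duration-normalized average. The function name `utterance_vh_measures` and the example values are hypothetical.

```python
from statistics import mean


def utterance_vh_measures(tokens, threshold=50.0):
    """Compute utterance-level VH measures from vowel tokens.

    tokens: list of (vhh_index, duration_in_seconds) pairs, one per vowel
    for which a vhh-index could be calculated (assumed representation).
    threshold: vhh-index value above which a vowel counts as hyperarticulated.
    """
    if not tokens:
        raise ValueError("an utterance needs at least one vowel token")
    vhh = [value for value, _ in tokens]
    # Assumption: weighting means multiplying each vhh-index by its duration.
    weighted = [value * duration for value, duration in tokens]
    return {
        "mean": mean(vhh),
        "max": max(vhh),
        "range": max(vhh) - min(vhh),
        "ratio": sum(value > threshold for value in vhh) / len(vhh),
        "weighted_mean": mean(weighted),
        "weighted_max": max(weighted),
    }


# Hypothetical utterance with three vowel tokens.
measures = utterance_vh_measures([(40.0, 0.10), (80.0, 0.20), (120.0, 0.05)])
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

In this made-up utterance the token with the highest vhh-index is also the shortest, so the weighted maximum comes from a different vowel than the unweighted maximum. This is the kind of difference the duration weighting is meant to capture.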
The analyses were performed using linear mixed models. Linear mixed models are conceptually similar to regular linear regression models, except that they also account for within-subject variation, essentially allowing the model to disregard between-subject variation in favor of variation related to the independent variable.
Two linear mixed effects regressions were calculated for each of the measures of vhh-index on utterance level (mean, max, range, ratio, weighted mean, and weighted max), one on data points in which the parent utterance preceded the infant vocalization (parent–infant turns), and one on data points in which the parent utterance followed the infant vocalization (infant–parent turns). In the case of parent–infant turns, the predicted variable was infant vocalization WCM-SE score, and the fixed effects variable was the parent utterance vhh-index measure. In the case of infant–parent turns, the predicted variable was the parent utterance vhh-index measure, and the fixed effects variable was infant vocalization WCM-SE score. In both cases, the random variable was participant, that is, parent–infant dyad (intercept only).
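Written out, the two random-intercept models could take the following form. This is a sketch in conventional notation: the symbols and the normality assumptions are standard for linear mixed models and are not taken from the original analysis. Here $i$ indexes turns, $j$ indexes parent–infant dyads, and $\mathrm{VH}$ stands for whichever utterance-level vhh-index measure is being analyzed.

```latex
\begin{align}
\text{Parent--infant turns:}\quad
  \mathrm{WCM}_{ij} &= \beta_0 + \beta_1\,\mathrm{VH}_{ij} + u_j + \varepsilon_{ij},
  & u_j &\sim \mathcal{N}(0, \sigma_u^2),\;
    \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2) \\
\text{Infant--parent turns:}\quad
  \mathrm{VH}_{ij} &= \gamma_0 + \gamma_1\,\mathrm{WCM}_{ij} + v_j + \eta_{ij},
  & v_j &\sim \mathcal{N}(0, \sigma_v^2),\;
    \eta_{ij} \sim \mathcal{N}(0, \sigma_\eta^2)
\end{align}
```

The fixed slopes $\beta_1$ and $\gamma_1$ carry the turn-level relationship in each direction, while the dyad-specific intercepts $u_j$ and $v_j$ absorb differences between dyads in their overall level.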
Data points with infant vocalizations that were outliers (thresholds: Q ± 3*IQR) in terms of WCM-SE score were removed (
For mean vhh-index, max vhh-index, vhh-index range, and vhh-index ratio, no significant results were found (
Summary of the fixed effects of the analysis of the measure mean vhh-index in parent–infant turns and infant–parent turns
Turn type | Fixed effect | Est. | SE | t |
---|---|---|---|---|
Parent–infant | Intercept | 1.36 | 0.25 | 5.55 |
Parent–infant | Parent utterance mean vhh-index | <−0.01 | <0.01 | −0.27 |
Infant–parent | Intercept | 72.8 | 3.37 | 21.6 |
Infant–parent | Infant vocalization WCM-SE score | 0.41 | 1.18 | 0.34 |
Summary of the fixed effects of the analysis of the measure max vhh-index in parent–infant turns and infant–parent turns
Turn type | Fixed effect | Est. | SE | t |
---|---|---|---|---|
Parent–infant | Intercept | 1.29 | 0.24 | 5.47 |
Parent–infant | Parent utterance max vhh-index | <0.01 | <0.01 | 0.38 |
Infant–parent | Intercept | 106.8 | 6.75 | 15.8 |
Infant–parent | Infant vocalization WCM-SE score | 1.66 | 2.07 | 0.80 |
Summary of the fixed effects of the analysis of the measure vhh-index range in parent–infant turns and infant–parent turns
Turn type | Fixed effect | Est. | SE | t |
---|---|---|---|---|
Parent–infant | Intercept | 1.28 | 0.22 | 5.82 |
Parent–infant | Parent utterance vhh-index range | <0.01 | <0.01 | 0.93 |
Infant–parent | Intercept | 61.4 | 6.15 | 9.98 |
Infant–parent | Infant vocalization WCM-SE score | 1.35 | 2.13 | 0.64 |
Summary of the fixed effects of the analysis of the measure vhh-index ratio in parent–infant turns and infant–parent turns
Turn type | Fixed effect | Est. | SE | t |
---|---|---|---|---|
Parent–infant | Intercept | 1.35 | 0.26 | 5.28 |
Parent–infant | Parent utterance vhh-index ratio | −0.03 | 0.21 | −0.16 |
Infant–parent | Intercept | 0.68 | 0.02 | 31.0 |
Infant–parent | Infant vocalization WCM-SE score | <−0.01 | 0.01 | −0.86 |
Summary of the fixed effects of the analysis of the measure weighted mean vhh-index in parent–infant turns and infant–parent turns
Turn type | Fixed effect | Est. | SE | t |
---|---|---|---|---|
Parent–infant | Intercept | 1.49 | 0.22 | 6.64 |
Parent–infant | Parent utterance weighted mean vhh-index | −0.02 | 0.01 | −2.24* |
Infant–parent | Intercept | 9.13 | 0.65 | 14.0 |
Infant–parent | Infant vocalization WCM-SE score | −0.31 | 0.30 | −1.04 |
Summary of the fixed effects of the analysis of the measure weighted max vhh-index in parent–infant turns and infant–parent turns
Turn type | Fixed effect | Est. | SE | t |
---|---|---|---|---|
Parent–infant | Intercept | 1.49 | 0.23 | 6.49 |
Parent–infant | Parent utterance weighted max vhh-index | −0.01 | 0.01 | −2.06* |
Infant–parent | Intercept | 15.3 | 1.22 | 12.6 |
Infant–parent | Infant vocalization WCM-SE score | −0.13 | 0.52 | −0.26 |
The results show a negative relationship between parent VH in IDS to their 12-month-old infants and the PC of infant vocalizations on a turn level; specifically, the more hyperarticulated a parent utterance is, in terms of mean and max vhh-index weighted for vowel duration, the less phonetically complex the following infant vocalization is, in terms of WCM-SE score.
This is a somewhat surprising finding since previous findings on the same data show a positive relationship between parent VH and PC of infant vocalizations on the level of individual dyads (Marklund et al., accepted). Based on previous findings, it was expected that if any relationship was found, it would be a positive one, that is, a high degree of VH in the parent utterance would be followed by high PC in the infant vocalization, or high PC in the infant vocalization would be followed by a high degree of VH in the parent utterance.
In the previous study, the positive correlation between infants’ WCM-SE scores and parents’ VH (measured in vowel space area) could indicate that parents’ articulation impacts infants’ production and/or that infants’ production impacts parents’ articulation, or that a third, underlying variable mediates the relationship. For example, it is possible that articulatory adaptiveness is a specific realization of a general communicative adaptiveness, and that other components of this general adaptiveness may be the driving factors for any potential benefit for language development, rather than VH in itself.
In the present study, both the direct impact of VH on a turn level and the directionality of any potential effect were investigated. The negative relationship that was found between infant WCM-SE score and parent vhh-index suggests that there is a direct, in-the-moment causality between the two, and directionality of the effect was indicated by the fact that the effect was only significant in parent–infant turns.
Had the effect been significant in both directions, one potential interpretation could have been that parents are responsive and use a high degree of VH to support the linguistic needs of infants with less mature vocalizations overall. However, previous studies have shown that parent VH is typically attenuated rather than increased in interaction with atypically developing infants or infants at risk for developmental delays (
There is no reason to believe that an infant would try less hard in their production as a direct response to high degrees of VH in the preceding parent utterance. However, high levels of hyperarticulation in the input might mean more or novel phonetic information to process for the infant. This could potentially leave less energy or focus for the infant in regard to the next task, that is, production of the next vocalization. This is in line with the
There are limitations to this study that should be acknowledged. The study has a relatively small sample size, although in line with previous similar studies (e.g.,
There are a few things to take into consideration with regard to the complexity measure of the infant vocalizations, the WCM-SE score (
In addition, the study uses a method to quantify VH in the parent speech, the vhh-index, which has only recently been developed (
Furthermore, high
There is also the possibility that the fact that recordings were made in a laboratory impacted the way that parents and infants interacted. However, previous research has shown both that young children speak similarly in different contexts such as laboratory setting or at home (
The unexpected findings are difficult to interpret and explain in the light of existing knowledge, and one reason is that this study is the first of its kind. Given this, as well as the limitations listed above, it is premature to talk about new insights into the relationship between VH in parent IDS and infant speech production based on the findings of this study. They have, however, contributed new thoughts about how perceptual processing demands potentially impact infant production, which need to be addressed in future studies, together with further evaluation of the VH and PC measures used in this study.
In conclusion, the present study reports a negative relationship between VH in parent utterances and PC in immediately following infant vocalizations. No relationship was found between PC in infant vocalization and VH of the immediately following parent utterance. That is, a negative relationship between parent VH and infant PC was found on the level of conversational turns, and the directionality suggested was that parent utterances influence infant vocalizations rather than the opposite.
Tabular data generated for this study are available at the Open Science Framework at:
The studies involving human participants were reviewed and approved by The Regional Ethics Committee in Stockholm, Sweden (2015/63-31). Written informed consent to participate in this study was provided by the adult participants and the infant participants’ legal guardian/next of kin.
All authors contributed to the article and approved the submitted version. UM, EM, and LG: study design, drafting the manuscript, and critical revisions of the manuscript. UM and LG: data collection (part) and transcriptions. EM: data processing and analyses.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
The authors would like to thank all families participating in the MINT project as well as Tove Gerholm for permission to use the data and also thank Tove Gerholm and David Pagmar for data collection and Freya Eriksson, Alice Gustavsson, Mika Matthis, Linnea Rask, Johanna Schelhaas, and Sofia Tahbaz for transcriptions of parent speech used for VH estimation.
1. The longitudinal study was part of the MINT project (MAW 2011.0070, PI Gerholm).