Statistical Learning of Two Artificial Languages Presented Successively: How Conscious?

Franco, Ana; Cleeremans, Axel; Destrebecqz, Arnaud

doi:10.3389/fpsyg.2011.00229

ORIGINAL RESEARCH article

Front. Psychol., 21 September 2011

Sec. Psychology of Language

volume 2 - 2011 | https://doi.org/10.3389/fpsyg.2011.00229

Ana Franco*

Axel Cleeremans and Arnaud Destrebecqz

Consciousness, Cognition, and Computation Group, Université Libre de Bruxelles, Bruxelles, Belgium

Statistical learning is assumed to occur automatically and implicitly, but little is known about the extent to which the representations acquired over training are available to conscious awareness. In this study, we focus on whether the knowledge acquired in a statistical learning situation is available to conscious control. Participants were first exposed to an artificial language presented auditorily. Immediately thereafter, they were exposed to a second artificial language. Both languages were composed of the same corpus of syllables and differed only in the transitional probabilities. We first determined that both languages were equally learnable (Experiment 1) and that participants could learn the two languages and differentiate between them (Experiment 2). Then, in Experiment 3, we used an adaptation of the Process-Dissociation Procedure (Jacoby, 1991) to explore whether participants could consciously manipulate the acquired knowledge. Results suggest that statistical information can be used to parse and differentiate between two different artificial languages, and that the resulting representations are available to conscious control.

Introduction

Statistical learning broadly refers to people’s ability to become sensitive to the regularities that occur in their environment by means of associative learning mechanisms. Such sensitivity often extends to the temporal domain, inasmuch as temporal structure is a central feature of many skills, ranging from language processing to action planning. The first studies dedicated to statistical learning per se essentially focused on language acquisition, and in particular on speech segmentation. Thus, Saffran et al. (1996b) investigated whether distributional cues could be used to identify words in continuous speech. In their study, participants were exposed to a stream of continuous speech composed of six trisyllabic words that were repeated in seemingly random order. The continuous speech was produced by a speech synthesizer and contained no segmentation cues other than statistical information, that is, the transitional probabilities between the syllables. These probabilities were higher for within-word syllable transitions than for between-word transitions. After a brief exposure phase, participants had to discriminate between words of the artificial language and non-words. Results indicated that they were able to do so, thus showing that transitional probabilities convey sufficient information to effectively parse continuous speech into units.

Subsequent studies have extended this seminal finding in many different directions, documenting the central role that statistical learning mechanisms play in different aspects of language acquisition, such as speech segmentation (Saffran et al., 1996b; Jusczyk, 1999; Johnson and Jusczyk, 2001), lexicon development (Yu and Ballard, 2007), or learning about the orthographic regularities that characterize written words (Pacton et al., 2001). Further, the importance of statistical learning has also been explored in other domains, such as non-linguistic auditory processing (Saffran et al., 1999), visual processing (Fiser and Aslin, 2002; Kim et al., 2009), human action processing (Baldwin et al., 2008) or visuomotor learning (Cleeremans, 1993). While most of these studies involved adult participants, many have also demonstrated that children (Saffran et al., 1997) and infants (Aslin et al., 1998; Saffran et al., 2001) are capable of statistical learning. Taken together, these studies suggest that such learning can occur without awareness (Saffran et al., 1997), automatically (Saffran et al., 1996a; Fiser and Aslin, 2001, 2002; Turk-Browne et al., 2005) and through simple observation (Fiser and Aslin, 2005).

Though statistical learning can be viewed as a form of implicit learning (Reber, 1967), this is not necessarily the case, and the relevant literatures have so far remained rather disconnected from each other. According to some authors, statistical and implicit learning represent different ways of characterizing essentially the same phenomenon (Conway and Christiansen, 2005; Perruchet and Pacton, 2006). Indeed, just like statistical learning, implicit learning is assumed to occur without awareness (Cleeremans et al., 1998) and automatically (Jimenez and Mendez, 1999; Shanks and Johnstone, 1999), or at least, incidentally. Although statistical learning research has been essentially dedicated to exploring language acquisition – with particular emphasis on development – most implicit learning studies have instead focused on adult performance, with particular emphasis on understanding the role of awareness in learning and the nature of the acquired knowledge. Recently, however, the two fields have begun to converge as it has become increasingly recognized that the processes involved in artificial grammar or sequence learning are similar in nature to those involved in statistical learning studies (e.g., Cleeremans et al., 1998; Hunt and Aslin, 2001; Saffran and Wilson, 2003; Perruchet and Pacton, 2006).

Despite this emerging convergence, statistical learning studies have seldom addressed what has long been the central focus of implicit learning research, namely, the extent to which the representations acquired by participants over training or exposure are available to conscious awareness. As discussed above, most statistical learning studies claim that such learning occurs without conscious awareness. However, most of the relevant studies have consisted of an incidental exposure phase followed by a two-alternative forced-choice test (2AFC; Saffran et al., 2001; Saffran, 2002; Perruchet and Desaulty, 2008) in which participants are instructed to choose the stimuli that feel most “familiar.” Familiarity, however, can involve either implicit or explicit knowledge: One can judge whether an item has been seen before based on intuition or on recollection (for a review, see Richardson-Klavehn et al., 1996). The assumption that knowledge is implicit because people learn incidentally and perform well on a familiarity task is therefore unwarranted. In this respect, the implicit learning literature suggests that considerable care should be taken when drawing conclusions about the extent to which acquired knowledge is available to conscious awareness. This literature has also suggested that the type of measure used to assess awareness is instrumental to our conclusions about whether learning was truly implicit. A distinction is generally made between two types of measures that can be used to assess awareness: Objective and subjective measures.

Objective measures are quantitative (e.g., accuracy, reaction times) and typically require participants to perform a discrimination task, such as deciding whether a stimulus was present or not (identification) or deciding whether a stimulus has been seen before or not (recognition). Subjective measures, by contrast, require participants to report on their mental states and typically take the form of free verbal reports or confidence ratings (Dienes and Berry, 1997). Both types of measures have been criticized and have generated substantial debate. Subjective measures, for instance, can be questioned based on the fact that they are biased and depend on the manner in which participants interpret the instructions (Eriksen, 1960; Reingold and Merikle, 1993; Dulany, 1997). Thus, conservative participants may claim to be guessing while actually knowing more about the stimulus than they report. As a consequence, subjective measures may overestimate unconscious knowledge. On the other hand, objective measures may be contaminated by unconscious influences. Thus, a participant who correctly recognizes a stimulus as “old” may do so not on the basis of conscious recollection, but rather on the basis of a feeling of familiarity. Objective measures may thus underestimate the influence of unconscious knowledge.

Another issue comes from the fact that tasks in general involve both conscious and unconscious knowledge. In this context, Jacoby (1991) proposed his process-dissociation procedure (PDP) as a way of overcoming the limitations of both objective and subjective measures. The method rests on the assumption that conscious knowledge is amenable to voluntary control whereas information held without awareness is not. The PDP involves contrasting performance in two versions of the same task. As an illustration, imagine an experiment in which participants are exposed to two different lists of words that they have to remember. Their memory of the words is tested through two tasks: an inclusion task and an exclusion task. In Inclusion, participants are asked to perform a simple old/new recognition judgment. Here, it is assumed that both familiarity and recollection contribute to task performance as participants may correctly classify a training item either based on a feeling of familiarity or conscious recollection. In Exclusion, by contrast, participants are instructed to recognize only the words from the first (or second) list. Under such instructions, successful performance has to be based on conscious recollection, as a mere feeling of familiarity may impair participants’ ability to effectively differentiate between the test items belonging either to the first or the second list. Familiarity may influence participants so that they incorrectly recognize words of the second (or first) list. Familiarity and recollection thus act in opposition during exclusion. By comparing performance in the two tasks, it is therefore possible to estimate the extent to which processing is conscious or not.
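The logic of comparing inclusion and exclusion can be made concrete with the estimation equations usually associated with Jacoby's (1991) procedure. The sketch below is illustrative only: it assumes that recollection (R) and familiarity (F) contribute independently, the function name `pdp_estimates` is hypothetical, and the adaptation of the PDP used in the present study does not necessarily compute these estimates.

```python
def pdp_estimates(p_inclusion, p_exclusion):
    """Estimate recollection (R) and familiarity (F) under the independence assumption.

    Inclusion: P("old")                       = R + F * (1 - R)
    Exclusion: P("old" | to-be-excluded item) = F * (1 - R)
    Subtracting the second equation from the first gives R; F then follows.
    """
    recollection = p_inclusion - p_exclusion
    if recollection >= 1.0:
        # F is undefined when every response is driven by recollection.
        familiarity = float("nan")
    else:
        familiarity = p_exclusion / (1.0 - recollection)
    return recollection, familiarity


# Hypothetical proportions, not data from this study.
r, f = pdp_estimates(p_inclusion=0.70, p_exclusion=0.30)
print(f"R = {r:.2f}, F = {f:.2f}")
```

With these made-up proportions, R is about 0.40 and F about 0.50. When exclusion errors are as frequent as inclusion hits, R falls to zero, which corresponds to the case in which performance rests on familiarity alone.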

The PDP has been widely used to explore awareness in situations that involve tracking probabilities, such as in sequence learning (Buchner et al., 1997, 1998; Destrebecqz and Cleeremans, 2001, 2003; Fu et al., 2008) or in artificial grammar learning (Dienes et al., 1995). However, to the best of our knowledge, the PDP has never been used in a paradigm more typical of statistical learning situations, such as speech segmentation. In fact, little is known about participants’ ability to consciously access the knowledge they acquire during typical statistical learning situations. Only one recent study (Kim et al., 2009) has directly attempted to address this issue. Using a rapid serial visual presentation (RSVP) paradigm and a matching task (i.e., an 11-alternative forced-choice task), the authors assessed sensitivity to the regularities contained in the stream by means of reaction times to certain predictable or unpredictable events and through correct responses on the matching task. Participants exhibited faster RTs in response to predictable events while remaining unable to perform above chance on the matching task. Based on these results, Kim et al. concluded that visual statistical learning results in representations that remain implicit. However, we believe that an 11-alternative forced-choice task may fail to be sufficiently sensitive to all the relevant conscious knowledge acquired by participants. Thus, this task may fail to fulfill the sensitivity criterion (Shanks and St John, 1994). Therefore, participants’ failure could be due to task difficulty rather than to the absence of explicit knowledge of the statistical regularities.

Here, we specifically focus on using a more sensitive test of explicit knowledge so as to assess whether or not statistical learning results in conscious representations. More specifically, we assess the extent to which learners are aware of the relevant contingencies present in two artificial languages and how conscious they are of the representations of the languages acquired in a statistical learning situation.

To do so, we conducted three experiments. In Experiment 1, we first explored whether learners can correctly find the word boundaries in each of two artificial speech streams. In Experiment 2, participants were successively exposed to the two speech streams and immediately presented with two tasks: a 2AFC task and a language decision task, in which they were asked to differentiate words from the two artificial languages. If participants successfully performed the two tasks, this would suggest that they can process two different sets of statistical information and that they formed two separate sets of representations, one for each language. Finally, Experiment 3 aimed at assessing the relative contributions of implicit and explicit memory by using the PDP. The exposure phase was identical to Experiment 2 and was immediately followed by an inclusion and an exclusion task. In Inclusion, participants were asked to perform a simple old/new recognition judgment. In Exclusion, participants were instructed to recognize not only the test items but also the context (e.g., the language each item came from) in which they had been presented. By comparing performance on the two tasks, it is possible to estimate the extent to which processing is conscious or not. Thus in Experiment 3, if participants learned the two artificial languages consciously and independently (i.e., they considered the material to consist of two distinct languages and were able to differentiate them), we expected high performance in both the inclusion and the exclusion tasks. Participants should be able to differentiate items belonging to the two languages from novel items. They should also be able to differentiate items from the first and the second language. However, if learning was implicit or if participants formed a single lexicon gathering words from both languages, inclusion performance should be above chance (as it may be based on familiarity) but exclusion should be at chance, as participants should not be able to explicitly differentiate test items belonging to the first or the second language.

Experiment 1

Method

The goal of Experiment 1 was to establish that both languages can be learned when presented in isolation.

Participants

Twenty monolingual French-speaking undergraduate psychology students (mean age: 20.8) were included in this study and received class credits for participation. None of the participants had previous experience with the artificial languages presented in this experiment. All reported no hearing problems.

Stimuli

We tested two different speech streams created from the same sound inventory (the same syllable set), with a similar underlying statistical structure. For clarity of discussion, we label each stream as an “artificial language”: L1 and L2. Each language consisted of 4 artificial trisyllabic (CV.CV.CV) words composed of the same 12 syllables (see Figure 1). The two artificial languages were generated using the MBROLA speech synthesizer (Dutoit et al., 1996) using two different male voices (fr1 and fr4).

FIGURE 1

Figure 1. Design of languages L1 and L2.

We chose 12 French syllables that were considered easily distinguishable. We assembled the syllables to obtain a set of trisyllabic words. Four were assigned to L1 (bulago, kimolu, liteva, and muviko) and four were assigned to L2 (govimu, luteki, vamoli, and kolabu). The others were used as non-words and were never presented during the exposure phase. All the words were pretested in order to ensure that none of them sounded similar to a French word. Finally, using Matlab, we created a script in order to generate the artificial speech streams. There were two conditions, which differed in the duration of exposure to the artificial languages: 10 or 20 min. Each word was presented 300 times (in the 10-min condition) or 600 times (in the 20-min condition) in pseudo-random order: the same word could not occur twice in succession. There were no pauses between syllables and no other acoustic cues to signal word boundaries. As the succession of words was pseudo-randomized, the transitional probabilities between syllables were 100% within words and 33% between words in both artificial languages. Non-words had syllable transitional probabilities of 0%. L1 and L2 words did not share any syllable transition between languages or with the non-words; that is, all the transitional probabilities of 100 or 33% in one language were null in the other or in the non-words.
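The transitional-probability structure described above can be checked with a short simulation. The following Python sketch is not the Matlab script used in the study, whose details are not reported. It assumes a simplified randomization in which each word is drawn uniformly from the three words other than the preceding one, so word counts are only approximately equal. The function names are hypothetical.

```python
import random
from collections import Counter

# Word lists as given in the Stimuli section.
L1_WORDS = ["bulago", "kimolu", "liteva", "muviko"]
L2_WORDS = ["govimu", "luteki", "vamoli", "kolabu"]


def syllabify(word):
    """Split a CV.CV.CV word into its three two-letter syllables."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def make_stream(words, n_tokens, rng):
    """Build a syllable stream of n_tokens words with no immediate word repetition.

    Assumption: uniform sampling among the non-repeated words; the original
    script may have balanced word frequencies exactly.
    """
    stream, previous = [], None
    for _ in range(n_tokens):
        word = rng.choice([w for w in words if w != previous])
        stream.extend(syllabify(word))
        previous = word
    return stream


def transitional_probabilities(stream):
    """Estimate P(next | current) as count(current, next) / count(current)."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


rng = random.Random(0)
stream = make_stream(L1_WORDS, 1200, rng)  # 4 words x 300 presentations
tps = transitional_probabilities(stream)
print(tps[("bu", "la")])                    # within-word transition
print(round(tps.get(("go", "ki"), 0.0), 2))  # between-word transition
```

Because each syllable occurs in only one position of one word, within-word transitions come out at exactly 1.0, while transitions across word boundaries hover around one third, as each word can be followed by any of the three others.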

Procedure

Participants were instructed to watch a series of cartoons during which they would hear continuous speech spoken in an unknown language. They were not informed about the length or structure of the words or about how many words each language contained. They were randomly assigned to the short or the long exposure condition. Stimuli were presented for either 10 or 20 min. After the exposure, participants performed a recognition task. In each trial, a trisyllabic string (spoken by a female voice, V3) was presented and participants had to decide whether or not the string was a word from the language they had heard. The material consisted of the four words of the language and randomly chosen non-words, which were trisyllabic strings composed of the same syllables as the language but with null transitional probabilities between syllables (see Figure 1).

Results and Discussion

The results are presented in Figure 2. As is typical in statistical learning studies (Saffran et al., 1996a; Gebhart et al., 2009), we analyzed recognition data by assessing mean percentages of correct responses. Performance exceeded chance for both languages. L1 participants averaged 68.93% correct responses, t(10) = 4.464, p < 0.005, Cohen’s d = 1.470. L2 participants averaged 69.38% correct responses, t(9) = 4.378, p < 0.005, Cohen’s d = 1.850. An independent-sample t-test revealed no difference between L1 and L2, t(19) = −0.503, p > 0.5. We thus pooled L1 and L2 participants together for further analysis. We then asked whether exposure duration had an effect on performance. Another independent-sample t-test revealed no difference between the two exposure durations, t(19) = 0.882, p > 0.5. Participants exposed to the artificial language for 10 min averaged 68.76% correct responses, whereas those who had been exposed for 20 min averaged 69.63% correct responses. These results demonstrate that the two artificial languages were learnable when presented in isolation. Moreover, learning did not differ significantly between the two languages, and 10 min appears to be sufficient to learn a language. The next experiment investigates whether the two languages remain learnable when they are presented successively for 10 min each.
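For readers who wish to see how a comparison against chance of this kind is computed, the sketch below implements a one-sample t statistic and a one-sample Cohen's d with the Python standard library. It rests on several assumptions not stated in the text: chance is taken to be 50% (a yes/no recognition judgment), Cohen's d is defined as the mean difference from chance divided by the sample standard deviation, and the scores are made up for illustration. The function name is hypothetical, and no p-value is computed because the standard library has no t distribution.

```python
import math
import statistics


def one_sample_t(scores, chance=50.0):
    """Return (t, degrees of freedom, Cohen's d) for scores tested against chance."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD, n - 1 in the denominator
    t = (mean - chance) / (sd / math.sqrt(n))
    d = (mean - chance) / sd       # one common one-sample definition of d
    return t, n - 1, d


# Made-up percentages of correct responses, NOT the data from Experiment 1.
scores = [62.5, 75.0, 68.75, 56.25, 81.25, 62.5, 75.0, 68.75, 62.5, 81.25]
t, df, d = one_sample_t(scores)
print(f"t({df}) = {t:.3f}, d = {d:.3f}")
```

Under this definition, t equals d multiplied by the square root of the sample size, so the two statistics carry the same information about the distance from chance and differ only in scaling.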

FIGURE 2

Figure 2. Endorsement rates for words and for non-words in the recognition task.