Functional Load and the Teaching-Learning Relationship in L2 Pronunciation

Though frequent recourse has been made to the functional load (or FL) principle in establishing priorities for L2 pronunciation teaching, it remains an under-theorized and relatively under-utilized concept. This is despite the existence of empirical evidence pointing to correlations between the FL ranking of phonemic contrasts and a) the effect that the absence of particular contrasts has on the comprehensibility of speech, and b) their occurrence at different levels of proficiency. Previous studies have found that errors involving high FL sound contrasts are linked with reduced comprehensibility, and have also found that high FL errors are less common in learners at higher proficiency levels. Taken together, these findings suggest that language learners tend to pay more attention to high FL contrasts and incorporate them into their repertoires more readily than low FL contrasts, possibly because the high FL contrasts are more salient in terms of contrastive potential and frequency of occurrence. The concept of FL therefore appears to be relevant in considering the relative ease (or difficulty) of learning and teaching particular features, and in understanding the relationship between learning and teaching. Frequent calls have been made for FL considerations to inform the setting of priorities in L2 pronunciation teaching, for example. In this mini-review I will explore and re-evaluate the concept of FL in terms of both theoretical formulation and empirical application, aiming to identify both its contributions and its limitations.


INTRODUCTION
The concept of functional load (hereafter, FL) is approaching its centenary. From its first mention in the discussions of the Prague School linguists (e.g., Jakobson, 1931), a line of influence can be traced through postwar structural linguistics (e.g., Martinet, 1952; Hockett, 1967) to the application of FL to language teaching (e.g., Catford, 1987; Brown, 1991). In recent years there has been a resurgence of interest in FL in the field of L2 pronunciation (e.g., Munro and Derwing, 2006; Suzukida and Saito, 2019) and assessment (e.g., Kang and Moran, 2014).
How can the enduring appeal of FL be explained? It appears to hold out the promise of a parsimonious explanation for various linguistic phenomena, ranging from diachronic sound change to the effects of different sound contrasts on the perceived comprehensibility of spoken language. But there is no agreed-upon definition of FL, and studies in the field of L2 pronunciation still rely on lists of minimal pairs drawn up in the pre-computer age. In this mini-review article I have two main objectives: firstly to critically examine the concept of FL itself, and secondly to review its deployment in research, aiming to identify both its contributions and its limitations. In doing so I will address the central concern of this special issue, namely the relationship between ease of acquisition and ease of teaching in L2 speech. I will argue that FL can inform our understanding of this relationship and help to answer the question of why it is that certain phenomena are more difficult to learn and teach. However, I also argue against the mechanistic application of FL and call instead for an increased awareness of what lies behind the concept and its measurements. Suitably reconceptualized, the FL concept can maintain its usefulness in an era of international communication and dynamic language practices.

THE ORIGINS AND NATURE OF FL
The durability of the FL concept reflects an enduring interest in the relationship between the structure of linguistic systems and the functional roles of their components. The appeal of FL as a way of measuring the functional or informational value of these components was, and still is, an "intuitively attractive idea" (Wedel et al., 2013a: 397). The first applications of FL to language teaching appeared in the work of Catford (1987) and Brown (1991), both of whom were concerned with identifying priorities for L2 pronunciation teaching. The ranked lists of English phonemic contrasts prepared by these scholars are still used in present-day studies (e.g., Munro and Derwing, 2006, which looked at the relationship between FL and the perceived accentedness and comprehensibility of L2 English speech).
The simplest definition of FL, and the one with which most linguists are familiar, is that of the amount of "work" performed by the phonemic contrasts found in a given language. The simplest measurement of FL, which Martinet (1955: 54) called the most "naïve" measurement, is that of the number of minimal pairs a particular contrast serves to differentiate. This was the basis of the lists prepared by Catford (1987). The lists of Brown (1991) adopt a more sophisticated approach, taking account of factors such as the relative frequency with which the constituents of minimal pairs occur and the number of pairs that have the same part of speech (such as live/leave). Nevertheless, despite the slightly different approach to measurement there is broad agreement between the two lists. Consonantal contrasts such as /l, r/ and vowel contrasts such as /ɔː, əʊ/ have a high FL in both. Indeed, in comparing the effects of using the Catford and Brown lists for their study, Munro and Derwing (2006) observed that there were no conflicts between them.
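The "naïve" minimal pair count described above can be illustrated with a minimal sketch. It assumes a hypothetical toy lexicon with simplified transcriptions (the names TOY_LEXICON and minimal_pairs are introduced here for illustration only); it is not the procedure used by Catford (1987) or Brown (1991), whose assumptions were not fully documented.

```python
from itertools import combinations

# Hypothetical toy lexicon (an assumption for illustration): each word is
# mapped to a simplified phonemic transcription, one symbol per segment.
TOY_LEXICON = {
    "light": ("l", "aɪ", "t"),
    "right": ("r", "aɪ", "t"),
    "lead": ("l", "iː", "d"),
    "read": ("r", "iː", "d"),
    "live": ("l", "ɪ", "v"),
    "leave": ("l", "iː", "v"),
    "night": ("n", "aɪ", "t"),
}


def minimal_pairs(lexicon, a, b):
    """Return the word pairs that differ in exactly one segment,
    where that segment is `a` in one word and `b` in the other."""
    pairs = []
    # Sorting makes the order of the returned pairs deterministic.
    for (w1, p1), (w2, p2) in combinations(sorted(lexicon.items()), 2):
        if len(p1) != len(p2):
            continue  # words of different length cannot form a minimal pair here
        diffs = [(x, y) for x, y in zip(p1, p2) if x != y]
        if len(diffs) == 1 and set(diffs[0]) == {a, b}:
            pairs.append((w1, w2))
    return pairs


# The naive FL of a contrast is simply the number of pairs it distinguishes.
lr_count = len(minimal_pairs(TOY_LEXICON, "l", "r"))
vowel_count = len(minimal_pairs(TOY_LEXICON, "ɪ", "iː"))
```

In this toy lexicon the /l, r/ contrast distinguishes more pairs than the /ɪ, iː/ contrast, but the result depends entirely on which words are included. A refinement along the lines attributed to Brown (1991) would additionally weight each pair by word frequency and by shared part of speech.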

THE APPLICATION OF FL IN L2 PRONUNCIATION RESEARCH
Despite its intuitive appeal, the complexity of FL turns out to be daunting. Beyond the core principle of "amount of contrastive work" and its measurement by minimal pair counts, there is no agreed-upon way to define or measure FL. This probably explains why the lists of Catford (1987) and Brown (1991) still serve as the go-to resource for researchers (e.g., Munro and Derwing, 2006; Kang and Moran, 2014; Suzukida and Saito, 2019). In this section I will briefly review these studies to illustrate the application of FL and begin to identify its contributions to L2 pronunciation research and its overall significance.
In their exploratory investigation, Munro and Derwing (2006) found preliminary confirmation of the "functional load hypothesis", namely that high FL errors (such as substituting /n/ for /l/) have a greater impact on listeners' perceptions of the accentedness and comprehensibility of L2 speech than low FL errors (p. 529). The study of Suzukida and Saito (2019) compared the effects of vowel and consonant substitutions with different FL values. It lent further support to the FL hypothesis by discovering that consonant substitutions were more detrimental to comprehensibility, and concluded that it was "only high FL consonant substitutions (e.g., mispronunciation of /l/ as /r/ or /v/ as /b/) that negatively impacted on native listeners' comprehensibility judgments" (p. 1). Taking a slightly different approach, Kang and Moran (2014) studied the patterning of high and low FL errors within speech samples taken from different levels of the Cambridge ESOL exam suite. The study found that the percentage of high FL errors was inversely related to proficiency level (p. 185), providing indirect evidence of a developmental trend in which learners progressively learn to avoid, or correct, high FL errors.
The studies of Munro and Derwing (2006) and Suzukida and Saito (2019) were both concerned with the dimensions of comprehensibility (i.e., perceived ease of understanding) and accentedness. I have elsewhere proposed that FL can also help to explain the findings of certain studies focusing on intelligibility (i.e., actual understanding in terms of word recognition; see Sewell, 2010; Sewell, 2017). The studies of Jenkins (2000) and Deterding (2013) were both concerned with international intelligibility among non-native speakers of English, and involved collecting corpora of misunderstandings. FL considerations can largely explain the hierarchies of error significance found in these studies, even though the researchers did not make explicit reference to FL. For example, Jenkins's proposed Lingua Franca Core (LFC) of intelligibility-preserving features is far more prescriptive for consonants than it is for vowels, echoing the findings of Suzukida and Saito (2019). The only vowel contrast given priority treatment in Jenkins's LFC is /ɪ, i/, which is a high FL contrast. Consonantal substitutions were also found to be more problematic in Deterding's study, and within this category the most problematic were the substitution of /n/ with /l/ and of /l/ with /r/; again, these are high FL contrasts in the Catford and Brown rankings.
FL therefore appears to be a promising explanatory factor in studies of the three dimensions of accentedness (Munro and Derwing, 2006), comprehensibility (Munro and Derwing, 2006; Suzukida and Saito, 2019) and intelligibility (Jenkins, 2000; Deterding, 2013). It may also be relevant in explaining the order in which sound contrasts are typically learned (Kang and Moran, 2014). However, both the concept and its application need to be placed on a firmer footing. The continuing use of the Catford and Brown lists is both reassuring and troubling. It is reassuring because they are mostly consistent with each other and are able to generate fairly consistent results when used in statistical analyses. It is troubling because the lists are over 30 years old, and because neither made their assumptions and procedures particularly clear (see Levis and Cortes, 2008: 200). The underlying theoretical basis of FL is therefore also unsatisfactory. Although minimal pair counts and rankings are necessary for statistical analysis, we must also be concerned with what lies behind the concept: what do the statistics and rankings represent, and what do we mean by "communicative value"?

FL 2.0: A BROADER VIEW
Putting FL on a firmer theoretical footing should help to connect the various areas of interest in this special issue and address the central question of why it is that certain phenomena are more difficult to learn and teach. In an earlier work on the subject I suggested that a useful distinction could be made between narrow and broad senses of FL (Sewell, 2017). This was intended to widen the scope of FL beyond minimal pair counts. The broad sense of FL (also called the information-theoretic, entropy or global approach; Oh et al., 2015) allows for FL to be measured and compared not only at the level of phoneme contrasts, but also with regard to individual phonemes and higher-level categories. Applying such an approach, Oh et al. (2015) demonstrated that consonants had a higher FL than vowels, not only in English but in all nine languages they studied. Taking a broader view not only extends the scope of FL beyond minimal pairs, but also increases its theoretical coherence by providing a psycholinguistic perspective and clarifying what it is that FL is actually measuring. For the purposes of this special issue I will illustrate this by explaining the relevance of FL across three interrelated temporal frames, those of interaction, learning and language change. Within and across these frames, FL measurements retain their essential character of indicating how, and where, the scope for variation in language use is constrained by the need to retain information value.
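To make the entropy-based idea concrete, the following is a minimal sketch under stated assumptions. It uses one commonly encountered formulation, in which the FL of a contrast is the relative loss of word-level entropy when the two phonemes are merged; this formulation, the toy frequencies and the function names are assumptions introduced here, and the sketch should not be read as the exact procedure of Oh et al. (2015).

```python
import math
from collections import Counter

# Hypothetical toy data (an assumption for illustration): word forms as
# tuples of phoneme symbols, mapped to invented token frequencies.
TOY_FREQS = {
    ("l", "aɪ", "t"): 40,
    ("r", "aɪ", "t"): 40,
    ("n", "aɪ", "t"): 20,
}


def entropy(counts):
    """Shannon entropy (in bits) of the distribution given by raw counts."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


def merge(freqs, a, b):
    """Simulate loss of the a/b contrast: rewrite every b as a and pool
    the frequencies of word forms that become identical."""
    merged = Counter()
    for form, count in freqs.items():
        merged[tuple(a if seg == b else seg for seg in form)] += count
    return merged


def entropy_fl(freqs, a, b):
    """Relative entropy loss caused by merging a and b.
    Assumes the lexicon has at least two distinct forms (non-zero entropy)."""
    h = entropy(freqs)
    return (h - entropy(merge(freqs, a, b))) / h


fl_lr = entropy_fl(TOY_FREQS, "l", "r")
fl_ln = entropy_fl(TOY_FREQS, "l", "n")
```

With these invented frequencies, merging /l/ and /r/ removes more information than merging /l/ and /n/, because the word forms that collapse together are more frequent. Unlike a raw minimal pair count, a measure of this kind is sensitive to frequency, and the same merge-and-compare logic could in principle be applied to whole classes of sounds (such as all consonants or all vowels) rather than to a single contrast.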
The temporal frame of interaction represents the here-and-now of communicative activity, and L2 pronunciation research approaches it via the concepts of intelligibility and comprehensibility, among others. In researching this temporal frame, FL provides an indication of which features are relied upon by participants to secure mutual understanding. Taking a longer-term view, the second temporal frame is that of learning. Participants can be seen as bringing their habitual ways of speaking and their implicit or explicit knowledge of language (e.g., phoneme and word frequencies, collocational patterns, and sound/spelling correspondences) to the frame of interaction. Of course, their habits and knowledge are the cumulative result of interaction, and interaction provides further opportunities for learning; the two frames are in fact interdependent.
Learning is partly open-ended, but is also shaped by the demands of interaction. In the context of L1 learning, Bybee (2001) characterizes the process of phonological acquisition as one of increasing fluency and automation, but one which is "constrained by the need to retain information value" (p. 15). Also writing from a functionalist perspective, Croft (2000) notes that communities of language users deal with the "problem" of communication by converging on "a regular solution to a recurring co-ordination problem" (p. 97). The co-ordination problem of how to distinguish between similar words is solved by relying on phonemic contrasts, and FL can be seen as indicating the relative usefulness of particular contrasts in maintaining information value. The finding of Kang and Moran (2014), namely that high FL errors are less common at higher proficiency levels, provides indirect evidence of an L2 learning process that is also shaped by emerging knowledge of information value, among other factors.
It is important to consider how this L2 learning process may take place, with reference to FL. It appears that high FL errors are gradually eliminated, but that low FL errors often remain, perhaps as more or less permanent accent features. The gradual elimination of high FL errors may be a result of actual communication breakdown or other kinds of negative feedback. A related possibility is that high FL errors tend to be more salient, because they occur more frequently or because they involve contrasts that are relatively easy for users to distinguish, and are thus more likely to be the target of monitoring by self or others. Similarly, the persistence of low FL errors, noted by Kang and Moran (2014) even in high-proficiency samples, may be due to their relative insignificance in terms of triggering communication breakdown. If the absence of an L2 contrast does not lead to problems in the temporal frame of interaction, the contrast is less likely to be incorporated into learners' repertoires.
To a certain extent, then, FL lends support to naturalistic methods of language learning: if learners are exposed to sufficient input and have opportunities for meaningful interaction, they will automatically learn which features are most important (Krashen, 1981). However, and to move back into the temporal frame of interaction, an awareness of FL may help instructors to provide high-quality feedback in the form of awareness-raising activities. With same-L1 classes I have found it useful to play recordings of local speakers as a dictation exercise. The words that are difficult (or impossible) to transcribe are often found to contain high FL errors, which can then be brought to the learners' attention. Taking a broader view of FL, such intelligibility problems are often associated with phonological contexts (such as word-initial position) that enhance the information value of contrasts. This was visible in the intelligibility study of Deterding (2013), which required listeners from different L1 backgrounds to transcribe extracts of L2 English conversations. Substitutions in word-initial position were a prevalent cause of intelligibility problems; the substitution of [l] for /n/ (e.g., "noisy" pronounced as "loisy") was particularly problematic, as would be expected for a high FL contrast.
The "noisy/loisy" example raises the question of the relevance of minimal pairs in FL, and also starts to illuminate the psycholinguistic basis of the concept. It would not appear on traditional minimal pair lists, as "loisy" is not a currently-existing word. However, the issue at stake is not merely the confusability of minimal pairs: the activation of non-words such as "loisy" can also be distracting for the L2 listener, who will often be unsure as to whether it is a real word or not. Weber and Cutler (2004) contend that much of the difficulty of listening to and understanding L2 speech arises from the activation of "spurious competitor words," which can take the form of L2 minimal pair members (when "still" is heard for "steel"), near-matches (as when "belly" is heard for "balance") or non-words. The significance of the /r, l/ contrast for Japanese learners, for example, resides not only in the potential confusability of pairs such as "belly" and "berry" but also in the unwanted activation of words like "barrow" and "barren" when the target word is "balance" (Weber and Cutler, 2004: 3). The measurement, and perhaps more importantly the theorization, of FL needs to take account of this. It may be that measurements based on minimal pair counts are no more than indirect reflections of overall "information value."
In addition to interaction and learning, the third temporal frame relevant to FL is that of language change. Indeed, the original meaning of the "functional load hypothesis" was the question of whether "sound change is biased toward selective maintenance of those phonemes that contribute more to distinguishing existing lexical items in usage" (Wedel et al., 2013a: 396). Recent corpus-based studies have lent support to this version of the FL hypothesis (Wedel et al., 2013a; Wedel et al., 2013b).
While it is possible to speculate on the likely future direction of phoneme merger (or equally, contrast preservation) based on FL considerations, the focus in this special issue is on the temporal frames of interaction and learning. The reason for considering all three frames is that this gives the concept of FL greater theoretical coherence, at least as long as the assumptions of generative phonology are passed over in favor of models that have usage, rather than abstract systems, as their guiding principle. FL originated as a functionalist concept, and it continues to influence what Wedel et al. (2013a: 396) categorize as "VUE" (variationist, usage-based, evolutionary) models of language. According to Wedel et al., a central feature of such models is "the assumption of a causal chain linking properties of individual usage events to long-term change in the abstract, linguistic category system of a speech community" (2013a: 410; see also Bybee, 2001; Bybee and Hopper, 2001).
The key characteristic of FL across all three frames is that it attempts to measure the prevailing information value of linguistic features. Language is inherently variable, and change is always in progress, but the need to retain information value provides a centripetal brake on these centrifugal forces. High FL, in other words, indicates that a feature or contrast is heavily relied upon to make distinctions of meaning. These features are, by and large, automatically prioritized by language learners and language users; from a longer-term perspective their information value is a consequence of usage, in turn influenced by technological affordances (such as writing) and societal trends (such as literacy).
By reconceptualizing FL as "FL 2.0" I am not arguing for the abandonment of minimal pair counts. It may turn out that these offer a shorthand approach to the complexities of FL. However, regardless of how FL is measured, there needs to be a greater awareness of what the concept and the counts represent. As Suzukida and Saito (2019) point out, there are more users of English as an L2 than there are native speakers. This introduces greater scope for variation and for alternative solutions to the co-ordination problems of communication. If FL and its measurement continue to be based on the narrow notion of minimal pairs (the lists of which may also be outdated), and if its indications are treated as language universals, its ability to inform research and language education will be compromised. Although research studies and teaching guides prefer to operate with ranked lists of features, the lists represent the effects of complex forces operating across different timescales.

CONCLUSION
What, then, does an expanded view of FL have to offer with regard to the central question of this issue, namely the relationship between ease of acquisition and ease of teaching in an L2? One aspect of the relationship between the two is illuminated by Levis's observation that "certain features are not acquirable in the long run, no matter what we do to teach effectively or no matter how much effort learners put into learning" (2018: 213). Many of the accent features that are retained by advanced learners have a low FL (Kang and Moran, 2014) and may not need to be prioritized in either teaching or testing. An FL perspective also suggests, by implication, that the process of acquisition involves mastering features and contrasts whose FL is high, or to put it another way, learning to avoid high FL errors. The traditional contribution of FL has been to help predict what these features might be and to indicate targets for teaching and assessment, even though many of these features are likely to be difficult to acquire (e.g., /i/ and /ɪ/ for Cantonese speakers).
FL has several limitations, however. Paradoxically, the more the theoretical foundations of FL are buttressed by taking it out of the narrow realm of minimal pairs, the more it becomes apparent that it is one of the many factors that shape the contours of interaction, learning and language change. Behind the statistics, the underlying principle of FL is simply that information value plays an important role in determining the nature and scope of variation in human language, both at relatively shorter timescales (such as those of interaction and learning) and across longer timescales of language change. That it does not have a determining role is shown by the many phenomena that appear to run counter to the FL principle. For example, it has been observed that the absence of contrast between /i/ and /ɪ/ is a feature of many L2 English accents around the world (see, e.g., Lim, 2004, on Singapore English; Deterding et al., 2008, and Hung, 2000, on Hong Kong English). It is noticeable in the speech of advanced learners (i.e., it is associated with the temporal frames of interaction and learning), and may represent language change in progress, at a local level. Yet this contrast is given a high FL ranking in both the Catford and Brown lists. Unless this is due to the relatively "narrow" measures of FL represented by these lists, we must consider the possible reasons for this and other exceptions before making pedagogical recommendations.
The learners' L1 is a major reason for these exceptions. For example, the /i/ and /ɪ/ contrast is allophonic in Cantonese, and many other well-known and recurring "errors" (such as Japanese learners' difficulties with the /l, r/ contrast) are related to the L1, illustrating the psycholinguistic principle that "the hardest second-language contrasts to learn are those which are ignored in the native language because each of the contrasting sounds is a permissible token of a single native category" (Best, 1995, in Weber and Cutler, 2004: 2). It may simply be, therefore, that the FL-derived "information value" of these contrasts is outweighed by the difficulty of acquiring them. The local adaptation of abstract language "systems" is precisely what VUE models would suggest. To the extent that language involves a "system," it is not one so delicate as to be functionally compromised by the loss of certain contrasts. Rather, the system is adaptive and resilient, not 'rigid, homogeneous, self-contained or finely "balanced"' (Croft, 2000: 231). For both of these reasons (the particularities of learners' L1s and the resilience of language systems) we may therefore be mistaken if we automatically prioritize high FL "errors" in teaching and assessment. As English is decolonized and appropriated by different cultures, the frequency with which words occur and the ways they are realized in phonological terms will show local patterns of variation. Although the FL principle would predict substantial continuity, assuming that the centripetal forces of written language and literacy remain in place, it is important to avoid treating FL measures as language universals. The FL-informed approach taken by some pronunciation teaching handbooks (e.g., Celce-Murcia et al., 2010) needs to be tempered by an awareness of local patterns of variation, and of the limitations of existing measures of FL.
It may even be that the proper application of FL lies more in post hoc explanation than in the prediction of difficulties or the formulation of teaching priorities.
There is an urgent need to conduct studies of FL in different contexts around the world. These should take advantage of corpus data and statistical modeling, to avoid the continuing reliance on lists drawn up over three decades ago. Such studies also need to grapple with the theoretical complexities of FL, and decide how to model such factors as frequency, as Brown (1991) attempted to do. Suitably reconceptualized and remodeled, the century-old concept of FL can continue to serve as a useful heuristic in assessing questions of language acquisition and language teaching, and relating them in turn to language usage and language change.

AUTHOR CONTRIBUTIONS
The author confirms being the sole contributor of this work and has approved it for publication.