The Phonological Mapping (Mismatch) Negativity: History, Inconsistency, and Future Direction


INTRODUCTION
The last three decades have seen a considerable growth in the use of event-related potential (ERP) methods in language research. As our appreciation of the cognitive mechanisms underlying language processing increases, so too does our understanding of its electrophysiological correlates. The phonological mapping negativity (PMN) is an ERP component long established to index pre-lexical phonological processing (Connolly and Phillips, 1994; Connolly et al., 2001; Desroches et al., 2009), involving the mapping of speech signals onto phonological representations (Newman and Connolly, 2009). Generally maximal at around 300 ms post stimulus onset (PSO), the PMN is classically elicited in paradigms in which phonological expectancies are generated using words (Newman and Connolly, 2009), sentences with high cloze probability (Connolly and Phillips, 1994), and pictures (Desroches et al., 2009). For example, asking participants to delete the initial sound from a word (e.g., snap without the /s/) will generate expectation of "nap." When this expectation of particular phonological input is violated (e.g., presentation of "tap" in place of "nap"), the component increases in amplitude.
ERP components are typically identified through a combination of their scalp distribution, timing, polarity, and sensitivity to experimental manipulations (Woodman, 2010). Marked overlaps in timing and sensitivity with another component, the N400, initially prompted research intended to distinguish the PMN as a discrete response (Connolly et al., 1992; Connolly and Phillips, 1994). Over the last 30 years, however, such research has unveiled a host of inconsistencies in the reported topography, timing, and sensitivity of the response. Despite its established place in the core repertoire of ERP components as a reliable, well-defined effect, persistent ambiguity in the characterization of the response necessitates reconsideration of the reliability and authenticity of the PMN.

DISTINGUISHING THE PMN & N400
The N400 is the most commonly studied electrophysiological response in the context of language processing, sensitive to the manipulation of semantic, phonological, and associative relationships (Kutas and Federmeier, 2011). Notably, the N400 can be modulated by phonological factors such as word-initial and word-final overlap (Bölte and Coenen, 2002; O'Rourke and Holcomb, 2002).
The component's breadth of sensitivity means that it is generally characterized as a function of its timing, behavior, and morphology, with the term N400 used as a heuristic label for stimulus-related brain activity that shows a characteristic pattern of sensitivity to manipulated variables 200-600 ms post-stimulus onset (Kutas and Federmeier, 2011).
The degree to which the PMN and N400 overlap, or coexist as functionally distinct components, has been the subject of substantial debate. Early PMN literature focused on the differentiation of these components by highlighting the sensitivity of the PMN to phonological violations, and that of the N400 to semantic violations. In one of the first studies to report a dissociation between the N400 and PMN, Connolly and Phillips (1994) sought to distinguish the effects of phonological and semantic violations by manipulating both factors in high cloze-probability sentences. Target words were fully congruent (e.g., The piano was out of tune); semantically congruent but phonologically unexpected (e.g., The pig wallowed in the pen [mud]); semantically unexpected but phonologically congruent (e.g., The gambler had a streak of bad luggage [luck]); or fully incongruent (e.g., Joan fed her baby some warm nose [food]). Whilst a combined PMN-N400 response was elicited in the fully incongruent condition, pure phonological violations elicited a PMN response, and pure semantic violations elicited a slightly later N400 modulation. The authors interpreted the results as indicating that the two components were functionally distinct, with the PMN sensitive to early lexical processing, and the N400 representing later semantic integration processes. Although often cited as a defining study on the PMN, further consideration suggests that such an interpretation may be oversimplified. First, the largest PMN modulation was found in response to fully incongruent targets, as opposed to phonological mismatch alone. Second, when only semantic expectation was violated, N400 peak latency was delayed by over 50 ms relative to the fully incongruent condition. Both the heightened PMN response to fully incongruent stimuli, which suggests sensitivity to semantic as well as phonological manipulation, and the increased N400 latency for purely semantic violations suggest that the two components may not be as functionally independent as proposed.
Another significant study in the characterization of the PMN was Connolly et al. (2001), which investigated the sensitivity of the component to semantic manipulations. Here, the PMN was unaffected by lexicality, a result that was interpreted to suggest that, unlike the N400, the PMN does not reflect semantic processing. Despite the prevailing influence of the study, conflicting MEG evidence (Kujala et al., 2004) subsequently led to the assertion that no definitive conclusions about the PMN could be drawn from the paradigm used (Newman and Connolly, 2009). In a later investigation into whether the PMN reflects pre-lexical or lexical processing, Newman et al. (2003) presented participants with auditory prime words, and asked them to mentally delete the initial consonant (e.g., clap, /k/). Subsequently, participants heard either a correct target (e.g., lap) or one that violated expectation in one of three ways: (i) wrong consonant deletion (WC, e.g., cap); (ii) consonant cluster deletion (CC, e.g., ap); or (iii) an irrelevant word (IW, e.g., nose). The authors predicted gradations in response by phonological violation (largest in the IW condition, followed by WC and CC), and an indistinguishable response to word and non-word targets, demonstrating the PMN's insensitivity to lexicality. Results showed that PMN mean amplitude was significantly reduced in the correct condition, whilst it was significantly greater and indistinguishable across all incorrect conditions. These findings were interpreted as evidence for the PMN's phonological sensitivity and functional distinction from the N400, based on its comparable amplitude for both word and non-word targets. Despite their interpretation, the authors acknowledged the likelihood that P300 contamination uniquely reduced PMN amplitude in the correct condition. Such contamination would likely confound any comparison between correct and incorrect conditions. This leaves only the three incorrect conditions (IW, WC, and CC) as reliable, and across these PMN amplitude failed to differ significantly, contrary to the predicted gradations in response to phonological violation. Furthermore, the absence of N400 reporting weakens the interpretation significantly, as it is not possible to determine whether the study's task demands would have elicited N400 lexicality distinctions. Without evidence of a lexicality-dependent response from a component known to be sensitive to semantics, the absence of a PMN lexicality response cannot be deemed meaningful. A similar later study, wherein participants were asked to delete the initial consonant from four-syllable words, reported a PMN response to correct/incorrect targets and a lexicality effect on an N400-like response (Newman and Connolly, 2009). Although the two components were concluded to be functionally dissociated, visual inspection revealed a larger PMN response to words compared to non-words over centroparietal sites, which the authors attributed to the onset of the N400.
Despite these methodological issues, the current literature predominantly defines the PMN on the basis of the aforementioned studies. Furthermore, Table 1 shows wider research often cited when defining the component, noting further methodological issues including low sample size, known to affect the signal-to-noise ratio of ERP data (Boudewyn et al., 2018). Whilst literature supporting the specific PMN component as reported by Connolly and Phillips (1994) appears conflicting, there remains consistent evidence that ERP activity between 200 and 400 ms PSO reflects the activation and processing of phonological information (Hagoort and Brown, 2000; van den Brink et al., 2001; Desroches et al., 2009). Alternative explanations for phonological sensitivity within this timeframe range from early N400 accounts (Praamstra et al., 1994; Van Petten et al., 1999; Dumay et al., 2001), through separate components including the P250 and P325 (Hagoort and Brown, 2000; van den Brink et al., 2001; Holcomb and Grainger, 2006; Grainger and Holcomb, 2009), to variations upon a visual PMN component (Desroches et al., 2009; Sučević et al., 2015).

A QUESTION OF TOPOGRAPHY?
Methodological discrepancies in the aforementioned research leave open the possibility that central/parietal PMN responses may not be as distinct from the N400 as generally assumed. This does, however, prompt questions as to the origin of frontal PMN responses. Despite considerable variability in reported PMN topography (see Table 1), a number of studies reporting frontocentral PMN topography have influenced the notion that the component is classically anterior (Connolly et al., 1990; D'Arcy et al., 2004). The frontocentral topography reported by these studies is similar to that of another ERP response, the auditory mismatch negativity (MMN). Originally reported by Näätänen et al. (1978), the MMN is a frontocentral component that peaks ∼150-250 ms PSO (Kujala et al., 2007), and is typically elicited in oddball paradigms where it responds to infrequent deviant sounds presented amongst frequent "standard" sounds. Research suggests, however, that the MMN's sensitivity extends beyond pure auditory mismatch. For example, Shtyrov et al. (2003) presented inattentive participants with short phrases that differed in a single phoneme, rendering them either grammatical or ungrammatical. The authors found that infrequent ungrammatical phrases elicited an MMN amplitude increase. Further research suggesting that the MMN may be sensitive to semantic (Pulvermüller et al., 2001), acoustic, and phonological properties of speech (Aaltonen et al., 1997; Weber et al., 2004) implies that the component may be more diverse in sensitivity than originally thought.
Despite considerable overlap in both timing and topography, in addition to recent reports of a potential phonological mismatch negativity response (Proverbio et al., 2018), no study has thus far compared the PMN source loci with those of the MMN (Näätänen et al., 2007). Whilst the PMN requires active participant attention, the MMN is pre-attentive. It has been assumed thus far that the necessity of attentiveness means the two components are generated by distinct cortical mechanisms (Näätänen et al., 2007). Perhaps, however, the mere generation of phonological violation requires attention to linguistic content in order to establish expectation, whilst simple auditory mismatches, such as those typical in MMN paradigms, do not. If so, the MMN and frontal PMN could instead represent two ends of a spectrum of electrophysiological responses to auditory mismatch, the former a rudimentary reaction to auditory mismatch, and the latter a more sophisticated response to violation of complex expectations. Whilst establishing the extent of the distinction between the two components is beyond the scope of this article, it is crucial that future work ascertain their discreteness empirically rather than by assumption (Kujala et al., 2007).

CONCLUDING REMARKS AND FUTURE PERSPECTIVES
The PMN has consistently been reported as a distinct electrophysiological index of phonological processing. In the present article, we highlight the necessity to reconsider the clear distinctions drawn by prior research between the PMN, N400, and MMN. We propose that the frontal PMN response cannot be differentiated from the MMN based on attentional processing requirements alone. Furthermore, despite the continued prominence of the research discussed, we suggest that the PMN response cannot be sufficiently differentiated from the N400 on the basis of current evidence. Instead, it remains possible that the centroparietal PMN may represent earlier onset of congruity effects in response to phonemic mismatch with an expected word (Van Petten et al., 1999).
Future research in this field must focus on elucidating inconsistencies in the topography and sensitivity of the PMN in order to gain a thorough understanding of this component. This may be achieved through several means: (1) utilizing cluster-based permutation approaches as a data-driven way to choose time windows and electrodes for analysis, which could clarify the results of experimental paradigms where both a PMN and an N400 appear to be present; (2) conducting a meta-analysis of the PMN literature to gain clarity regarding current evidence on the PMN; and (3) conducting further research utilizing paradigms intended to dissociate the PMN from the N400 and MMN. Until further evidence becomes available, it is clear that a full appreciation of the insight ERPs can offer us into phonological processing remains beyond our reach.
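To make the first of these suggestions concrete, the following is a minimal sketch of a cluster-based permutation test along the time dimension only, for a paired (within-participant) comparison at a single electrode. It is an illustration under stated assumptions rather than a recommended pipeline: the function names, the cluster-forming threshold of |t| > 2, the sign-flip permutation scheme, and the simulated data (20 participants, 150 time samples, an artificial negativity added to samples 60-89 of the mismatch condition) are all hypothetical choices made for this example and are not taken from any study discussed here.

```python
import numpy as np

def find_clusters(mask):
    """Return (start, stop) pairs (stop exclusive) for each run of True in a 1-D mask."""
    padded = np.concatenate(([False], mask, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(changes[::2], changes[1::2]))

def paired_t(diff):
    """One-sample t statistic of the difference waves at every time point."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))

def max_cluster_mass(t_vals, threshold):
    """Largest summed |t| over contiguous supra-threshold points of the same sign."""
    best = 0.0
    for sign in (1, -1):
        for start, stop in find_clusters(sign * t_vals > threshold):
            best = max(best, np.abs(t_vals[start:stop]).sum())
    return best

def cluster_permutation_test(cond_a, cond_b, threshold=2.0, n_perm=1000, seed=0):
    """cond_a, cond_b: arrays of shape (n_participants, n_times).

    Returns a list of (start, stop, cluster_mass, p_value) for observed clusters.
    """
    rng = np.random.default_rng(seed)
    diff = cond_a - cond_b
    t_obs = paired_t(diff)

    # Null distribution: randomly flip the sign of each participant's
    # difference wave and record the largest cluster mass each time.
    null = np.empty(n_perm)
    for i in range(n_perm):
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        null[i] = max_cluster_mass(paired_t(diff * flips), threshold)

    results = []
    for sign in (1, -1):
        for start, stop in find_clusters(sign * t_obs > threshold):
            mass = np.abs(t_obs[start:stop]).sum()
            p = (np.sum(null >= mass) + 1) / (n_perm + 1)
            results.append((int(start), int(stop), float(mass), float(p)))
    return results

# Simulated example (hypothetical data, arbitrary units).
rng = np.random.default_rng(1)
n_sub, n_times = 20, 150
cond_match = rng.normal(0.0, 1.0, (n_sub, n_times))
cond_mismatch = rng.normal(0.0, 1.0, (n_sub, n_times))
cond_mismatch[:, 60:90] -= 2.0  # artificial negativity in the mismatch condition

clusters = cluster_permutation_test(cond_mismatch, cond_match, n_perm=500)
for start, stop, mass, p in clusters:
    print(f"samples {start}-{stop - 1}: cluster mass = {mass:.1f}, p = {p:.3f}")
```

Because the time window is derived from the data rather than fixed in advance, an approach of this kind avoids presupposing where a PMN ends and an N400 begins. A real analysis would also cluster across electrodes and would normally rely on an established, tested implementation rather than a hand-written one.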