OPINION article
Front. Psychiatry
Sec. Autism
Volume 16 - 2025 | doi: 10.3389/fpsyt.2025.1665247
This article is part of the Research Topic: Research by Autistic Researchers: An “Insider’s View” into Autism
How do we reconcile the seemingly contradictory theories of Gestalt language processing and weak central coherence?
Provisionally accepted
- 1 Northeastern University, Boston, United States
- 2 Boston University, Boston, United States
Introduction: Local Processing Bias
Focusing on the trees and not the forest. Working on the details of a problem but ignoring the big picture. Over the last decades, these observations of autistic cognitive processing have been experimentally verified using diverse tasks. In the embedded figure task (Frith, 1989), a line drawing of a pie shape is difficult to find when it is formed by the spokes on the wheel of a baby carriage. Autistics were faster and more accurate than a neurotypical comparison group at identifying hidden shapes, presumably because of greater bottom-up processing, meaning focusing on details of the image, without conflicting top-down signals from the context surrounding the shape.
Another informative task was the local-global task, where observers viewed a large letter (such as G) composed of small images of a different letter (such as E). Observers were asked to respond as quickly as possible regarding which letter was being displayed. Autistics were more likely than neurotypical observers to respond with the small (or "local") letter (Happé & Frith, 2006). The frequency of local vs. global choices can then be used to quantify the relative predominance of local vs. global processing bias.
1.1 Weak central coherence
Happé and Frith (2006) explained detail-oriented cognition using the phrase "weak central coherence," where central coherence refers to integrating information coming from diverse aspects of a domain. For example, in vision, integrating the distinct shapes of leaves, branches, and a trunk helps identify the overall object, a tree; integrating a scene with trees, hills and a sky would identify this as a landscape. Neuroanatomical evidence has also supported the idea that autistic brains have more connections in local areas, and fewer long-range connections (Belmonte et al., 2004; Chung & Son, 2020; Pellicano & Burr, 2012; Ursino et al., 2022; Wickelgren, 2005).
Happé and Frith (2006) argued that difficulty integrating information is a general characteristic of autistic functioning. Examples include difficulty integrating information when listening to sentence narratives (Jarrold & Russell, 1997; Happé, 1996; Jolliffe & Baron-Cohen, 2001), and difficulties making generalizations about categories, especially new categories (Gastgeb & Strauss, 2012; Van Overwalle et al., 2025). Autistic individuals are less helped by context when taking memory tests and solving problems (Skoyles, 2011). Happé and Frith (2006) argued that difficulty with integrating information could also explain two key aspects of autism, restricted interests and social impairments. Specific, idiosyncratic interests are those that don't depend on flexibly interrelating diverse types of conceptual information. Social skills deficits may emerge because the social domain requires a dynamic blend of facial emotion processing, gestures, and knowledge of social norms (see also Herringshaw et al., 2018).
1.2 Enhanced perceptual functioning (EPF)
Mottron et al. (2006) proposed that autistics have heightened sensitivity to sensory stimuli, particularly in the visual and auditory domains. This sensitivity leads to a detail-oriented cognitive style, where fine-grained information is prioritized for processing, at the expense of attention to global information (see review in Robertson & Baron-Cohen, 2017). As part of reframing detail-orientation as a strength, Mottron et al. (2006) showed that autistics outperformed neurotypicals on diverse tasks, including block design, pattern recognition, and visual search. Children with ASD exhibited faster reaction times and no disruption in search efficiency on dynamic visual processing tasks (Joseph et al., 2009). Those authors argued that the autistic advantage derives from enhanced perceptual ability to discriminate between targets and distractors, although the cause of this was not specified.
EPF reframed local processing bias as a strength, which also helped explain savants (Mottron & Belleville, 1993). EPF could be the mechanism by which some autistic individuals develop extreme talents such as calendrical calculation (Wallace et al., 2009). Mottron et al. (2006) proposed that EPF is broadly about sensory processing, and not just visual processing. They reviewed evidence that autistics have superior pitch discrimination, heightened ability to distinguish between different musical pitches, and enhanced pure-tone detection. Relative to neurotypicals (NTs), autistics have improved auditory pattern recognition. They excel at identifying and remembering complex auditory patterns and sequences, and have superior auditory stream segregation, meaning better ability to separate and process multiple simultaneous auditory streams. Many individuals with ASD report heightened sensitivity to certain sounds, which may be related to enhanced auditory perception. Autistic individuals reported during interviews being drawn to details in a visual scene and having enhanced perceptual processing, such as navigating through an environment with a detailed map-like mental representation (Rattray et al., 2025).
1.3 Evolving Views of the Causes of Enhanced Perceptual Functioning
The older literature from the 1980s, 1990s, and early 2000s used the terms detail-oriented, local processing bias, and weak central coherence. Another reframing came from researchers who argued that enhanced sensory functioning can be attributed, at least in part, to enhanced sensory capacity. Remington et al. (2009) designed tasks to stress visual processing capacity. Autistic adults and children had intact performance on a central attention task, but compared to an NT group, autistic adults had increased ability to process irrelevant peripheral information, even under high levels of perceptual load.
Those authors concluded that autistics had superior visual processing ability compared to NTs (Remington et al., 2009, 2012). A proposed mechanism is increased auditory capacity, paralleling the increased visual attention advocated by Remington et al. (2009, 2012). To compare perceptual capacity in autistic and NT groups, Remington and Fairnie (2017) developed auditory detection and identification tasks designed to tax processing capacity. Tasks were constructed to highlight both the benefits and disadvantages of increased capacity. Autistics were better able to notice and respond to unexpected sounds, meaning sounds not central to the current task or focus. The authors concluded that autistics were able to process a greater amount of auditory information at any given time compared to neurotypical individuals.
The Paradox of Detail-Focused and Gestalt-Focused Processing
We just reviewed the long-standing understanding of autistic cognition as detail-focused. This appears to contradict a theory that is increasingly popular among speech-language pathologists (SLPs) called Gestalt language processing (GLP). GLP proposes that some individuals, including many autistic individuals, learn language by memorizing and using larger "chunks" of language (like phrases or sentences) before they fully understand the individual words within those chunks (Peters, 1977; Prizant, 1982). Because our goal is to chart out future research that could reconcile these two broad perspectives, we first review the historical background and current arguments for GLP, just as we did regarding evidence about autistics having weak central coherence.
2.1 The insight igniting Gestalt language fervor: Echolalic utterances are communicative
Echolalia was the term used to describe the repetition of phrases in a meaningless way, such as repeating an advertising jingle. The innovation at the heart of GLP was that echolalic utterances have communicative intent.
Prizant (1982; Prizant & Duchan, 1981) observed that the echolalic utterances produced by his autistic clients could be understood as meaningful for caregivers who understood the purpose and context of use, even if these utterances were idiosyncratic and not directly interpretable. As an example, one autistic youth repeated "Are you a good witch or a bad witch?" as a stand-in for greeting a new person. Prizant (1982) proposed that the children were using the echoes to communicate, albeit idiosyncratically. During the same era, language acquisition theorists established that using unanalyzed whole chunks to communicate is a strategy many children employ during early language development (Nelson, 1973, 1981; Peters, 1977). As part of extensive research on typical children's language acquisition, Nelson (1973, 1981) discussed holistic phrases such as Iwandat, which would be used during social interaction; she referred to this as an expressive style of language learning. Counterposed to this was the referential style, meaning using isolated words as labels to refer to objects. Thirty years later, another SLP, Marge Blanc, took up Prizant's insight that repeated phrases have communicative intent. Blanc (2012) worked to develop and popularize the idea of Gestalt processing, describing it in a manner that SLPs could use to instruct parents to help autistic children communicate. This was done in a neuroaffirming way. Blanc did not try to "normalize" autistic children's language acquisition, but to advise caregivers on treating Gestalt language as communicative, so that language could be a useful tool rather than an obstacle. Blanc (2012) argued the autistics' path to language learning is rooted in a "a form of thinking that includes a whole experience or situation—and suggest(s) that, for many GLPs, Gestalt language processing is an intrinsic part of the emotional experience in which the language was first used" (p. 1282, Blanc et al., 2023). 
GLP proposes that autistic people latch onto a "whole experience or situation" because it is difficult for them to parse the individual components (in this case, words). As part of evaluating this claim, we note it stands in opposition to the long-standing claim, reviewed earlier, that autistics struggle to see the coherent whole because they are overly focused on the details.
Resolving the Paradox
There are a number of possible reconciliations. We first note that the paradox is less severe than one might assume, because how autism is characterized has evolved since the decades when weak central coherence was proposed by Happé (1996, 1999, 2005). Autistic individuals are now understood to employ both detail-oriented and big-picture processing, albeit in a slightly different manner from neurotypical individuals (Wang et al., 2007).
3.1 New ideas about bottom-up vs. top-down processing
In recent decades, local-global tasks have been heavily scrutinized by researchers (e.g., Smith, 2018). A view has solidified that local processing bias is a cognitive style and a preference rather than a disability (Booth & Happé, 2010; Gambra et al., 2024; Tassini et al., 2024; although note that this was also proposed in Happé, 1999). Evidence for this is that even when autistic individuals (including children) default to responding at the local level, they are able to report the global level when instructed to do so (Koldewyn et al., 2013; Mottron et al., 2006; Wang et al., 2007). An idea frequently discussed in the autism community is that autistic people process details first and then use the details to construct a big-picture perspective (Silvertant, 2018). Numerous online articles, blogs and popular science books now summarize and explain the view that autistics can construct big-picture views, but these are the end-goal of processing, not the starting point (e.g., Ampe & Rammant, 2023; Viter et al., 2022).
One autistic researcher noted that autistic people appear to be "details-before-the-concept" thinkers, while non-autistic people are "concept-before-the-details" thinkers (Silvertant, 2018). Some autistic advocates note that learning approaches may differ from neurotypical norms. Aiello (2025) advises: "If learning a scientific theory, an Autistic student may need to see multiple specific experiments and results before understanding the overarching concept, rather than absorbing a general explanation first." Numerous social media posts elaborate on the view that autistics reach the big picture via processing of details, rather than being limited to details (u/AutismTranslated, 2024). However, this view is currently not discussed by academic researchers (as determined in July 2025 via multiple literature searches, including queries using AI tools). This sea-change away from the original 'weak central coherence' view reduces the processing difference between weak central coherence and Gestalt processing in language. This raises the possibility that differences in the nature of tasks and one's expertise could shift the relative extent to which someone makes use of detail-oriented vs. holistic processing.
3.2 No paradox, because echolalic utterances are not holistic episodic memories
GLP has been criticized from within the field of speech and language pathology (e.g., Bryant et al., 2024; Hutchins et al., 2024) and by academic researchers (e.g., Beals, 2024), for being ill-defined and lacking robust empirical support. Hutchins et al. (2024) noted that both Prizant (1982) and Peters (1977) intended their description of Gestalt processing as provisional, awaiting future theoretical analysis. Yet Blanc (2012) adopted and developed the idea of a holistic style as if it were a theoretically and empirically vetted construct.
Some researchers have argued that clinical practices based on GLP could even be detrimental for children with neurodevelopmental and communication disabilities (Venker & Lorang, 2025). We focus here on the criticism of Hutchins et al. (2024) that GLP is ill-defined because echolalic utterances are not unanalyzed forms, and are not holistic representations. Regarding forms not being analyzed, Hutchins et al. (2024, p. 5) note that Prizant's examples of "Gestalt mode of processing" appeared to be better characterized as "need for sameness but not a lack of internal analysis of the situation." Prizant (1982) defined a Gestalt mode of processing as one "in which events are remembered or retained with relatively little analysis ... [which] must be viewed in contrast to an analytic mode in which experiences or events are analyzed and segmented into meaningful components based upon prior experience" (p. 300).
How can a phrase (which is a multi-word utterance) be remembered with relatively little analysis? Blanc and others proposed that Gestalt forms are supported by superior episodic memory, which allows information to exist in an unanalyzed, holistic form. Hutchins et al. (2024) claim this is a misunderstanding of episodic memory. Cognitive psychologists view episodic memory as flexible, hierarchically structured, and composed of multiple analyzed components (Boucher & Anns, 2018; Boucher et al., 2012). Episodic memory is a constructive process, not just a simple recording of past events. This constructive nature allows individuals to flexibly extract and recombine elements of past experiences, and to imagine or simulate future scenarios. Memories need to be reshaped slightly by new context, emotions, or suggestions during each retrieval. The reason for this is that the human memory system is designed for predicting future events, which means it needs to be adaptable, not a mechanism for freezing in place replicas of past experiences (Schacter et al., 2008).
Hutchins et al. (2024) argued that delayed echolalia would be better understood as reflecting strength in perceptual memory. This is a non-declarative form of memory that registers and retains relatively unprocessed 'snapshot' records of single items to compensate for episodic memory (Boucher, 2007, 2012; Boucher & Anns, 2018). Let us accept the point of Hutchins et al. (2024) that echolalic utterances are recordings in perceptual memory. These could also be understood as holistic. It is thus difficult to understand why relatively unanalyzed forms stored as unprocessed snapshots in perceptual memory are not Gestalt forms. The proposal by Hutchins et al. (2024) is thus either wrong or incomplete. Below, we propose several novel resolutions.
3.3 Echolalic utterances are details seized upon in a rich, complex conversational setting
If Gestalt forms are perceptual representations, they can plausibly exhibit enhanced sensory capacity. Consistent with existing research that autistics have enhanced auditory capacity, overheard phrases can be retained as an auditory trace. These can be holistic, in the sense that the phrase is mapped to some salient meaning in the environment, but not fully analyzed into constituent pieces. Some phonological analysis must have been performed, since with echolalia, phonemes are typically pronounced correctly, and appropriate intonation is employed. But if autistics have local processing bias and attention to detail, why don't they analyze the details of phrases, such as the component words? We propose that autistic learners may already be showing local processing bias regarding language input, but in a way that deviates from our conventional understanding of what it means to be detail-oriented in the domain of language. Consider the richness and complexity of a typical conversational setting.
Within this rich setting, a detail-oriented processor with high perceptual capacity may have selected the detail consisting of a phrase spoken to them or an overheard TV jingle. This is the detail that is selected for focus, but not integrated with diverse aspects of the larger setting in which the phrase is spoken, including objects handled by speakers and speakers' physical actions in space. Consistent with local processing bias, the detail is not integrated within the larger conversational setting, which includes speakers' communicative intentions. The result is one of the hallmarks of echolalia: unconventional, idiosyncratic meaning. Schuler and Prizant (1985) cited the example from Kanner (1943) of an autistic youth repeating "don't throw the dog off the balcony" to remind himself to exert self-control. On this account, there is no conflict between GLP and detail-oriented processing, because an echolalic utterance is a detail, albeit also a holistic, unanalyzed form.
Given that autistics appear to excel at identifying the small units of any input stream, why would this ability be absent in the case of speech? The standard explanation is that identifying word-level semantics requires sustained joint attention and close tracking of speakers' communicative intent (Tomasello, 2005). Autistics at the beginning of language learning may be focusing their attention on perceptual details of the speech stream, rather than tracking conversation partners' communicative goals.
A speculative proposal about why autistics sometimes do not break into language learning with words continues with the theme that autistics do have "enhanced auditory capacity." Infants' restricted working memory capacity was an assumption in Newport's (1990) 'less is more' hypothesis. Newport (1990) speculated that early childhood is a sensitive period for language learning, due in part to reduced working memory capacity.
According to the 'less is more' hypothesis, reduced capacity forces young children to focus on (and thus identify) smaller linguistic units that form the fundamental components of language (although see Brooks & Kempe, 2019). In contrast, the more capacious working memory of older children and adults allows them to encode large units of language, which then interferes with identifying the small units that form the building blocks of language.
3.4 Inherent differences between auditory and visual processing mean holistic processing is more easily observed in language than in vision
Auditory inputs can be easily repeated (re-produced) in a holistic fashion. This is as simple as listening to auditory input and repeating verbatim an overheard phrase or environmental sound. Any individual with a larger-than-expected sensory capacity can impress observers by repeating a long phrase. This can strike observers as 'holistic' if the speaker is not known to produce any of the phrase's component words on their own. Recall that local processing bias has been reported for autistic children and adults with visual stimuli. What would it mean to demonstrate holistic processing in vision? An example is drawing. Autistics do produce high-capacity visual stimuli in the form of accurate, detailed drawings of real-life scenes (Furniss, 2008). An example is the autistic artist Stephen Wiltshire, who regularly draws pictures of cityscapes after a one-time viewing from a plane (Roth, 2020). This ability to produce high-capacity representations of visual scenes has been observed for autistic individuals beyond those labeled as savants. High-capacity drawings may be less apparent to observers because drawing requires fine motor skills, an area of documented challenges for some autistics (Mohd Nordin et al., 2021).
Nonetheless, when autistic children produce such drawings, they are lauded and admired, but what gets remarked on by both caregivers and autism researchers is not the holistic nature of the drawings, but their details. We suggest that autistic visual processing is not just detail-oriented, but also has a holistic quality, given that some drawings are an accurate rendering of a coherent, big-picture stimulus, just as echolalic utterances are typically faithful copies of a heard phrase. On this analysis, autistics' enhanced sensory function allows them to repeat large sequences of auditory stimuli and re-present 2-D visual forms via drawing (or demonstrate knowledge via keen spatial navigation or rapid learning of visual material). We maintain that what differs between vision and audition is not that the former is detail-oriented and the latter holistic. What differs is that demonstrating to observers examples of holistic auditory forms is easily done through verbal repetition, but less easily done in the visual realm.
3.5 The paradox of weak central coherence vs. Gestalt processing is an illusion caused by observers' normative expectations, influenced by culture
Experts and parents alike expect that, when children learn language, words will be acquired first, then phrases. The building blocks should be acquired first, then combined into large forms. When autistics produce phrases, onlookers regard this as aberrant. The assumption that words must come first reflects a society that is educated about developmental psychology. This assumption is part of the educated middle-class parenting tool kit, not a universal of human language acquisition. Child language acquisition researchers have reported that typically developing children use multiple strategies in learning language, including phrasal productions (Bates et al., 1996; Nelson, 1973; Peters, 1977). In a different culture, the normative expectations may differ.
For example, cross-cultural psychologists report that North Americans process a visual scene by focusing on a central detail. For example, when shown an image of a fish foregrounded in a fishtank, Americans report seeing "a fish", while East Asians more often report the larger context, such as a fishtank (Nisbett & Miyamoto, 2005). Stages of learning may also be different in cultures where children begin learning via overhearing rather than direct verbal interaction with caregivers (Girouard-Hallam & Norris, 2024) The language domain has other examples where autistic children's patterns are ambiguous between supporting weak central coherence vs. a lack of orientation to normative expectations. Autistic children do not manifest a shape bias in laboratory experiments on new word learning (Abdelaziz et al., 2018; Potrzeba et al., 2015). In contrast, the non-autistic group robustly used shape to infer the meaning of new words starting at 24 months of age, a phenomenon that exists (Hou et al.. One interpretation is that learning to use shape to infer words' meaning is an example of an abstraction of a high-level principle, and that its absence indicates weak central coherence (specifically, lack of generalization). An alternative is that autistic children have not learned (or not attended to) the normative expectation that shape is routinely used to identify words' referents in typical language interactions. Agent-based modeling has been used to explain the shape bias as originating from pressures of communicative efficiency cultures (Portelance et al., 2021). The strength of the shape-bias holds in some cross-cultural comparisons (e.g., Hou et al., 2024), but not across others, suggesting that cultures vary in their normative expectations that object shape is relevant to inferring a word's referent (Jara-Ettinger et al., 2022). 
Discussion
Our purpose is to galvanize the research community to investigate whether autistics' local processing bias and high-capacity productions in vision and language share more similarities than is currently acknowledged. Visual outputs may have holistic properties (such as Wiltshire's expert drawings), and echolalic utterances may have more details (such as precise phonemic and intonational renderings) than researchers have recognized. Here, we discuss the relevance of these issues to open questions in autism research.
4.1 Some autistics may be bottom-up processors, others, top-down processors
Across a range of types of cognitive processing, abilities are normally distributed for the population at large, but abilities for autistic individuals may show a flatter distribution. This derives from more individuals at the low end of ability (corresponding to intellectual and social disabilities), but also a higher proportion of individuals with special skills, such as visual-spatial processing, music, and math ability (Soulières et al., 2009, 2011). A case where autistic abilities load more strongly on the tails of a distribution concerns sensory sensitivities. Some autistics are hypersensitive to sound and others are hyposensitive (Christopher, 2019; Ide et al., 2017). Hyposensitivity and hypersensitivity are not mutually exclusive but often co-occur within an individual (Baranek et al., 1997; Foss-Feig et al., 2012). The same individual may be hypersensitive to light but hyposensitive to sound. Individuals may even be hyper- and hyposensitive within the same sensory domain, with touch being the most commonly reported. We suggest the same variation can manifest in bottom-up vs. top-down processing. But could some autistics be primarily top-down processors?
The idea of an autistic gestalt-type of cognition was suggested as far back as Kanner (1973): "[Autistic children's] world must seem to them to be made up of elements that, once they have been experienced in a certain setting or sequence, cannot be tolerated in any other setting or sequence; nor can the setting or sequence be tolerated without all the original ingredients in the identical… order" (p. 41). We add here a contemporary update. Autistic adults who have read about Gestalt language processing have reflected that they feel their cognitive style has Gestalt elements. Examples include the following.
I am a Gestalt processor, and this shows up in many different ways in my life. It is very hard for me to follow steps and thought processes unless I know the endpoint/the whole picture. https://www.instagram.com/reel/C0ULMW3sdzT/#
…when it comes to learning and thinking about things I have to first understand the greater context before focusing on specific pieces….. [subsequent response] I have always knew my brain or "way of thinking" was top down, iterative, and associative but never had the words for it past that https://www.reddit.com/r/AuDHDWomen/comments/1add35h/gestalt_processing/#
4.2 Future Research
Whether and how much some autistic individuals have a Gestalt approach vs a detailed orientation is unknown, with heterogeneous findings on global-local tasks (Gambra et al., 2024). Future work can explore the relative strength of top-down vs. bottom-up processing across modalities, following what has been done by child language acquisition researchers (e.g., Bates et al., 1996). Some autistic individuals may:
• display a detail-focused approach to visual stimuli and a gestalt-focused approach to language and auditory processing.
• experience challenges when moving between details and gestalts, rather than being biased towards one or the other.
• have the ability to be detail-focused and holistic, with their major challenge located at an intermediary level of processing.
GLP is confined to language learning, whereas weak central coherence is assumed to be part of general processing, although most examples concern vision. Future work could disentangle this difference when trying to reconcile whether or not there are different processing strategies across modalities, including and beyond vision vs. language.
4.3 Conclusions
The scientific research literature has characterized the autistic cognitive style as detail-oriented. As noted, this contrasts with the Gestalt language perspective discussed by speech-language pathologists. We described several ways to resolve this, such as understanding how large perceptual capacity can facilitate detail-oriented processing in vision but holistic-seeming productions in speech. Autistic and neurotypical individuals differ in their connections between sensory processing and amodal integration systems (Kausel et al., 2024). Future research can reveal how variations in connectivity may differently prioritize top-down vs. bottom-up information processing. Pursuing these questions between and within autistic and neurotypical groups will illuminate a core question in the brain sciences, which is the nature of the relationship between low-level sensory information and higher-order cognition.
Keywords: autism, Language acquisition, Top-Down, Bottom - Up approach, Enhanced perceptual functioning
Received: 13 Jul 2025; Accepted: 20 Aug 2025.
Copyright: © 2025 Schwartz and Caldwell-Harris. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Catherine L. Caldwell-Harris, Boston University, Boston, United States
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.