The Margins of the Language Network in the Brain

This review paper summarizes the various brain modules that are involved in speech and language communication in addition to a left-dominant “core” language network that, for the present purpose, has been restricted to elementary formal-linguistic and more or less disembodied functions such as abstract phonology, syntax, and very basic lexical functions. This left-dominant perisylvian language network comprises parts of inferior frontal gyrus, premotor cortex, and upper temporal lobe, and a temporoparietal interface. After introducing this network, first, the various roles of neighboring and functionally connected brain regions are discussed. As a second approach, entire additional networks were considered rather than single regions, mainly motivated by resting-state studies indicating more or less stable connectivity patterns within these networks. Thirdly, some examples are provided for language tasks with functional demands exceeding the operating domain of the core language network. The rationale behind this approach is to present some outline of how the brain produces and perceives language, accounting, first, for a bulk of clinical studies showing typical forms of aphasia in case of left-hemispheric lesions in the core language network and second, for wide-spread activation patterns beyond this network in various experimental studies with language tasks. Roughly, the brain resources that complement the core language system in a task-specific way can be described as a number of brain structures and networks that are related to (1) motor representations, (2) sensory-related representations, (3) non-verbal memory structures, (4) affective/emotional processing, (5) social cognition and theory of mind, (6) meaning in context, and (7) cognitive control. After taking into account all these aspects, first, it seems clear that natural language communication cannot really work without additional systems. 
Second, it also becomes evident that during language acquisition the core language network has to be built up from outside, that is, from various neuronal activations that are related to sensory input, motor imitation, nursing, pre-linguistic sound communication, and pre-linguistic pragmatics. Furthermore, it might be worth considering that also in cases of aphasia the language network might be restored by being trained from outside.


INTRODUCTION
Following the idea that language areas in the brain may be subdivided into a core language network and additional areas that are required for real communication (Hagoort, 2017), the aim of this review paper is to provide an overview of an expanded language network in the brain that becomes relevant when language is used in natural contexts. To this end, section The Core Language Network introduces a "core" language network as a kind of update of historical models. Section Anatomical Neighbors and Connectivity then addresses the "neighborhood" of the core language network in terms of brain areas and connectivity patterns that are additionally activated during various language tasks, in order to demonstrate some functional-anatomical modularity and flexibility of language processing. A different approach is taken in section Functional Networks, which considers entire networks that participate in language tasks. Finally, section Examples of Language Use Beyond the "Code" of the Core Language Network, rather than primarily focusing on brain regions or networks, considers extensions or modifications of the language network from a functional point of view, addressing special situations of language use such as non-literal or affective language. The entire review is focused on the normal language communication process in a native spoken language without addressing particular issues such as sign language or second language acquisition.
The way of presenting the various aspects of language processing here may appear somewhat arbitrary, particularly with regard to the borders of the core language network. On the one hand, it might be desirable to integrate various language-related functions into this network such as, for example, the pragmatic assignment of word meanings in context, the sensorimotor embodiment of language, or the listener's assumption of a speaker's intention behind an utterance. On the other hand, if the core language module is made large enough to account for all potential language-related functions, more or less the entire brain must be considered as a network for language processing. This is because language is a powerful tool that not only can name all kinds of objects and express and store all kinds of thoughts, but also can evoke emotions and physiological responses, can support complex social functions such as teaching, flirting, division of labor, and conflict management, and, last but not least, can be used for complex logical reasoning. Thus, the core language network as presented here should be taken as a kind of working definition circumscribing a brain module for some elementary language functions while "elementary" remains a more or less vague attribute.
The selection of studies that are reviewed in sections Anatomical Neighbors and Connectivity, Functional Networks, and Examples of Language Use Beyond the "Code" of the Core Language Network might also appear somewhat arbitrary since the research questions of these sections are partially overlapping and, thus, may lead to some redundancy. In spite of these shortcomings, the present approach was adopted in order to demonstrate the different perspectives that are present in current language research. Section Anatomical Neighbors and Connectivity, focusing on particular regions in the brain, is mainly supported by studies demonstrating well-defined small activation spots associated with particular tasks. Section Functional Networks is supported by studies on stable, more or less task-unspecific connectivity patterns within entire brain networks that may be engaged in language tasks such as the multiple-demand or the default mode network, and section Examples of Language Use Beyond the "Code" of the Core Language Network is supported by studies on particular language tasks that might be considered as somewhat atypical with regard to language processing under laboratory conditions. The method of reviewing was driven by intuition; it was not restricted by pre-selected key words or data resources. Most of the search was performed with "Web of Science," "Pubmed," and "Google Scholar," using key words, known authors, cross-links, and "cited by" features. The entire review should be considered as work in progress. It does not claim to be exhaustive, but it may point toward some demand for communicating the results of recent research on brain and language to a broader community, including branches such as linguistics, psychology, speech and language pathology, and philosophy. It should be taken as a stimulus for considering various aspects and perspectives of language processing, as a kind of brainstorming for the next generation of textbooks.

THE CORE LANGUAGE NETWORK
Since the beginning of the twenty-first century the view of the functional neuroanatomy of language has received various important updates, motivated, among others, by Poeppel and Hickok (2004) addressing various issues such as linguistic specificity, more fine-grained localization in the brain, and connectivity patterns. Subsequently, various models have been established, comprising auditory, phonological, and lexical regions in the temporal lobe that are linked to Broca's area and premotor cortex in the frontal lobe via lexical ventral pathways and articulatory-phonological-syntactic dorsal pathways (Hickok and Poeppel, 2007; Hickok, 2009; Poeppel et al., 2012; Bornkessel-Schlesewsky et al., 2015; Skeide and Friederici, 2016). The importance of the dorsal connection has already been emphasized in historical language models (see, e.g., Geschwind, 1970), in a simplified manner as an "arcuate fasciculus" connecting "Broca's" and "Wernicke's" areas. More detailed neuroanatomical connectivity studies showed that the arcuate fasciculus, concomitant with the superior longitudinal fasciculus, has a complex structure that has heavily developed in human evolution, connecting multiple target areas in frontal cortex with multiple target areas in temporal and parietal cortex (Thiebaut de Schotten et al., 2012; Friederici, 2017; Pulvermüller, 2018). Regarding the language-relevant aspects of this structure, so far no definite model has been worked out, and various alternative models are still under discussion (Glasser and Rilling, 2008; Dick and Tremblay, 2012; Friederici, 2018).
While the dorsal pathway of the language system connects perception and production of language at the stage of perception-action correspondence predominantly related to phonology and syntax, the ventral pathway connects language to various aspects of meaning and, subsequently, meaning-dependent responses in frontal cortex. The major pathways of the ventral stream comprise the capsula extrema and the uncinate fasciculus for frontotemporal connectivity and the middle longitudinal fasciculus for anterior-posterior information exchange within the temporal lobe (Brauer et al., 2013), again with some variability across different authors and studies (Dick and Tremblay, 2012). Functionally, the ventral pathways connect various lexical-semantic representations in the temporal lobe to target regions in the frontal lobe including, among others, BA45, BA47, and insular cortex. In approximate analogy to the functional subdivision of language into phonology, syntax, and semantics, the left inferior frontal gyrus seems to exhibit a modular structure for language processing, as suggested by more or less distinct connectivity patterns toward the temporal lobe for semantic, syntactic, and phonological processing (Anwander et al., 2007).
The neocortex is characterized by a modular structure of primary sensory and motor areas mapping input signals or motor commands onto neuronal structures in a systematically organized and spatially coherent manner such as tonotopy, retinotopy, or somatotopy. Secondary sensory areas are also organized in a map-like way, leading to a spatial organization of mental objects (Cavina-Pratesi et al., 2010; Grill-Spector and Weiner, 2014; Rauschecker, 2018). Examples for the visual system are the representations of shape and color (Bartels and Zeki, 2000; Reddy and Kanwisher, 2006; Bushnell and Pasupathy, 2012), faces in the fusiform face area (Kanwisher and Yovel, 2006), or written words in the visual word form area (Dien, 2009; Dehaene-Lambertz et al., 2018). As concerns language and speech, auditory word forms have been considered as basic mental objects of language, represented in association areas of the ventral (anterior temporal) stream of the central auditory system (DeWitt and Rauschecker, 2012).
Some authors have separately worked out a speech articulation network with the focus on speech motor control rather than linguistic processing (Guenther and Vladusich, 2012). Considering clinical studies, deficits in the articulation network comprise dysarthria (deficient speech motor performance) and apraxia of speech (deficient speech motor planning) rather than aphasia.
While articulatory motor control represents the "lower" end of language processing, there is also an extension of the language network at the upper end, toward pragmatics and discourse processing exceeding the simple requirements of lexical access, syntax, and phonology (Xu et al., 2005). Considering the various language models in the brain, there is a tradeoff between language-specificity and compactness, on the one hand, and a model's explanatory power in complex language environments, on the other. Historical (Broca-, Wernicke-like) models, mostly based on clinical observations of prototypical aphasic symptoms, are small, simple, and largely bound to the left hemisphere. By contrast, functional imaging studies investigating healthy subjects largely show bilateral activations and a huge network that is partially active, depending on different language tasks. An example of such an expanded language model is represented in Price (2012), assuming that the classic Broca's and Wernicke's areas serve as "convergence zones" that receive and send signals to all other areas that are involved in perceiving and producing speech.
For the purpose of the present review paper a rather small network in the perisylvian region was considered as the core language network while all other regions that are involved in language processing will be addressed separately. However, just saying that the core language network comprises Broca's and Wernicke's areas seems problematic since these areas and their functions are characterized by a considerable inconsistency across authors and historical epochs (Tremblay and Dick, 2016). Thus, the core language network has to be outlined more explicitly corresponding to some core language functions and their cortical implementation. In this respect, language will be considered as a more or less disembodied (Mahon, 2015) abstract modality with auditory word forms as its basic units that are linked to lexical-semantic objects in the mental lexicon. At a sublexical level, these units are bound to a phonological structure organized in features, phonemes, phonological gestures, and syllables. At the supra-lexical level, lexical word forms are concatenated to phrases and sentences organized by a system of morphological and syntactic rules. For the present purpose, thus, the core language network will be restricted to largely left-lateralized cortical regions for word forms, simple lexical-semantic mapping, phonological processing, and syntax as outlined in Figure 1 (regions with numbers 1-5) and Table 1 in contrast to the "margins" regions (Figure 1, regions 6a−6g and Table 2). Regarding semantics, only basic lexical semantic areas in the temporal lobe will be considered since the semantic system, as a whole, is very large, extending into various nonlinguistic memory systems and cognitive functions (Binder and Desai, 2011).

Extended "Wernicke's" Area
Various studies have shown that the brain regions that historically had been labeled Wernicke's area comprise functionally different subregions. For example, Démonet et al. (1992), in a positron emission tomography (PET) study, already found the superior temporal gyrus to be bound to phonological processing whereas lexical-semantic processing was assigned to more inferior regions. Similarly, Wise et al. (2001), considering word production, encoding, and memorizing, found distinct functional-anatomic subsystems in the upper left posterior temporal cortex. Considering the ventral and the dorsal stream of auditory processing, DeWitt and Rauschecker (2013) proposed two different modules within Wernicke's area: an auditory word form area anterior to the primary auditory cortex and a more posterior region for representing inner speech. Dronkers et al. (2004, 2017), mainly based on clinical findings, also pointed out the existence of various functionally distinct subregions within and around "Wernicke's" area. As one of the most expanded models of the posterior language region within the human cortex, Binder (2017) outlined a huge area comprising almost the entire temporal lobe as well as part of inferior parietal cortex. Similarly, Ardila et al. (2016) argued in favor of an extended Wernicke's area toward inferior temporal, anterior temporal, and temporoparietal regions (BA 20, 37, 38, 39, 40) while the core "Wernicke's" area comprises upper and middle temporal regions (BA 21, 22, 41, 42). Regarding structural connectivity, the various areas of the extended semantic network are mainly interconnected via four white matter pathways: the uncinate fasciculus, the middle longitudinal fasciculus, the inferior fronto-occipital fasciculus, and the inferior longitudinal fasciculus (Bajada et al., 2015).
FIGURE 1 | Left-hemisphere schematic display of the auditory cortex (1), the core language network (2-5), and its margins (6). (1) Auditory cortex as the primary input structure for verbal communication (Rauschecker and Scott, 2009), (2) auditory word form area as a perceptual core region of the language modality (DeWitt and Rauschecker, 2013; Binder, 2015, 2017), (3) phonological areas linking an auditory-phonetic to an articulatory language code (Hickok and Poeppel, 2007; Price, 2012; Herman et al., 2013; Binder, 2015; Rogalsky et al., 2015; Battistella et al., 2019), (4) syntax processing, manipulating and detecting structures above word level (Uddén and Bahlmann, 2012; Goucha and Friederici, 2015; Binder, 2017; Regel et al., 2017; Matchin and Hickok, 2019), (5) lexical-semantic core areas linking phonological codes to lexical meanings (Price, 2012; Ardila et al., 2016; Binder, 2017; Wilson et al., 2018).
In the following subsections, the core areas within the temporal lobe will be constrained to the regions 2 (auditory word forms), 3b and 3c (phonological processing), 5 (elementary lexical semantics), and 4b (syntax) as outlined in Figure 1. As the "margins" of these core functions, three functionally different regions will explicitly be addressed, referring to (1) parietal and posterior temporal regions contributing to sensorimotor processing, processing of language in context, and theory of mind, (2) inferior temporal-occipital regions that are predominantly linked to visual object representations and their association with language processing, and (3) the temporal pole as a language interface toward the processing of emotions, valence, and social cognition. Such extensions may not be required for very simple laboratory language tasks, but they are largely relevant for natural language communication and for understanding sentences in context. They may even be active at the stage of single-word processing, based on the nature of word and concept representations as wide-spread cell assemblies (Pulvermüller, 1999).

Parietal and Inferior and Posterior Temporal Regions
Independent Component Analysis on fMRI connectivity data determined specific functional subregions in the region of the left temporoparietal junction (TPJ) that could be relevant for language processing (Igelstrom et al., 2015). These subregions comprise (1) an anterior region located in the supramarginal gyrus, connected to sensorimotor areas and the insula, (2) a ventral region in posterior STG linked to auditory processing, (3) a dorsal region in angular gyrus with fronto-parietal connections and some attentional functions, and (4) a posterior region centered at posterior STS with connectivity to various temporal and frontal areas and precuneus that, among others, may serve Theory of Mind (ToM) functions. The latter aspect seems particularly relevant for discourse comprehension, being related to semantic integration (Lin et al., 2018). Closely related to the ToM aspect, the left TPJ is part of an intention-processing network, integrating linguistic information into an "agential situation" (Tettamanti et al., 2017), which might be related to the temporoparietal region involved in the primate mirror network and the self-other distinction (Holden, 2004; Molnar-Szakacs and Uddin, 2013; Hogeveen et al., 2015). Furthermore, clinical data indicate that the left TPJ is involved in belief inference (Biervoye et al., 2016), which is a further aspect of ToM. Regarding linguistic tasks at the level of lexical words, inferior parietal regions were involved in a homonym-finding task (Balthasar et al., 2011) indicating, as the authors suggest, the presence of particular word representations for items with multiple meanings in this region.

Visual Association Areas and Audiovisual Interactions
In a similar way as auditory word forms are represented as auditory objects in the ventral stream of auditory processing (DeWitt and Rauschecker, 2013), written words have representations in the left-hemispheric ventral visual stream, a tool-related area in inferior occipitotemporal cortex in the neighborhood of face representations (Dehaene-Lambertz et al., 2018). This "visual word form area" is strongly connected to core language areas in upper temporal cortex, and the strength of this connectivity correlates with the performance in visual linguistic tasks.
TABLE 1 | Tentative function, network involvement, and most pronounced cortical connectivity patterns of the "core" language areas depicted in Figure 1 (including the auditory system). Columns: Brain structure; Language-related function; Functional network; Connectivity. Most of these language-related functions are more or less lateralized to the left hemisphere. The first column indicates the respective regions depicted in Figure 1. SMA, supplementary motor area; IFG, inferior frontal gyrus.
In natural interaction, particularly during language acquisition, the processing of speech can be considered as audiovisual due to visual representations of articulatory gestures, enabling lip reading (Giraud and Truy, 2002; Calvert and Campbell, 2003; Chu et al., 2013; Hauswald et al., 2018) and giving rise to audiovisual interactions such as the McGurk effect (MacDonald and McGurk, 1978; Hickok et al., 2018). These audiovisual interactions seem to rely on a network including, among others, bilateral fusiform gyrus (for visual face processing) and phonological areas within posterior superior temporal gyrus and sulcus (pSTG and pSTS) (Hertrich et al., 2011; Chu et al., 2013).
Apart from graphematic and visual articulatory-phonological representations, the fusiform gyrus, presumably by representing facial expressions during speech communication, is also active during mentalizing (Castelli et al., 2000) and emotion processing (Schindler et al., 2015), i.e., operations that are necessary for or supporting language comprehension in a natural social environment.
A further interaction of the visual system with language processing refers to non-speech gestures that can be paired with intelligible speech in terms of visual prosody. These gestures can be subdivided into categories such as beat gestures (related to timing and rhythm), on the one hand, and metaphoric gestures (related to the shape of semantic content), on the other, activating different cortical and cortical-cerebellar networks (Bernard et al., 2015). Both types of gestures may lead to multisensory integration at the semantic level, activating a left frontotemporal network. Furthermore, differential gamma-band activity was observed in right temporal cortex for iconic and metaphoric gestures, indicating specific dimensions of higher-level semantic processing in an audiovisual setting (He et al., 2018). In addition to direct visual involvement in language reception, secondary visual cortex may give rise to visual imagery even in the absence of a visual signal (Bergen et al., 2007), which may also cause cross-modal priming effects (Dils and Boroditsky, 2010). Regarding the general relationship between visual recognition (perception) and semantic language functions, Brodmann area 37 can be considered as a common node of the visual network and the language system (Ardila et al., 2015).

The Temporal Pole and Its Connectivity
While the extreme capsule, linking the temporal lobe with IFG, seems to be a core component of the semantic and syntactic system as part of the ventral stream, the linguistic role of the uncinate fasciculus, linking more anterior parts of the temporal lobe (temporal pole and some medial regions of the temporal lobe) with frontal cortex, is discussed differently in the literature (Catani et al., 2013; Friederici and Gierhan, 2013; Dick et al., 2014; Hau et al., 2016). Apart from language operations, anterior temporal lobes are generally involved in social cognition, theory of mind processing, social conceptual knowledge (Ross and Olson, 2010), and memory systems that are linked to reward and valence (Von der Heide et al., 2013). The vertical dimension of the temporal pole has a modality-related structure with visual input in inferior and auditory input in superior regions (Olson et al., 2007). Evidence for an additional language-related region in the left temporal pole was provided by an fMRI study on lexical-emotional processing (Ethofer et al., 2006). A further language-related function of the temporal pole seems to be the processing of proper names, as indicated by intracortical electrical stimulation experiments (Papagno, 2017).

Extensions of the Anterior Language Areas
Traditionally, the anterior language areas have been considered as being linked to language and speech production whereas posterior language areas were assigned to speech perception and language comprehension. However, since language processing comprises mirror mechanisms in terms of action-perception loops, perception and production aspects are closely interwoven. On the one hand, speech perception often includes considerable executive top-down activity of predictive processing (Callan et al., 2010; Garrod, 2013, 2014); on the other hand, production is closely associated with self-monitoring and forward imagery (Tian and Poeppel, 2015). Nevertheless, considering simple task conditions, the anterior-posterior distinction of action and perception still seems valid and, thus, the margins of the language network are also supposed to show a respective tendency.
However, in case of emotion processing, the binary distinction between perception and production might be too simplistic since emotions can be processed at different levels. In cognitive tasks such as a multiple choice test for emotion perception, for example, they might be handled as mental objects whereas in natural situations emotional speech might be perceived more directly at the level of feeling rather than thinking.

Motor System, Sensorimotor Pathway
In a similar way as actions can be performed as well as observed with an "inner" performance, language can be produced and perceived with inner motor representations such as inner speech and motor preparedness, which suggests that language has been implemented during human evolution as an extension of the mirror system of other primates (Holden, 2004; Pulvermüller, 2018). Speech motor aspects (e.g., tongue sounds vs. lip sounds), semantic content with motor aspects (e.g., foot-related vs. hand-related words), and even abstract words seem to have roots in the motor system, resulting in multiple interactions of the motor system with language processing (Carota et al., 2012; Pulvermüller, 2018; Zhang et al., 2018). Such interactions can be interpreted as an aspect of the embodiment of language, and primary motor cortex seems to have a facilitating role in language processing (Courson et al., 2018). Superordinate control mechanisms of this activity in medial frontal cortex seem to involve SMA proper rather than pre-SMA, indicating that these effects are due to the actual engagement of low-level motor control rather than imagery or abstract inferences (Courson et al., 2017). Additionally, the primary laryngeal motor representation seems to play a particular role in the formation of the human speech generation network (Simonyan and Fuertinger, 2015). As an evolutionary aspect, the direct pathway from motor cortex to laryngeal motoneurons in the brain stem is unique to humans, providing voluntary control over phonation as a prerequisite to the ability of speech production (Simonyan and Horwitz, 2011; Ackermann et al., 2014).

Mechanisms of Cognitive Control in the Frontal Lobe
Unlike human language, most instances of natural sound communication in mammals rely on innate instinct-driven mechanisms. The neural network underlying this phonatory behavior comprises anterior cingulate cortex as a preparatory sound-eliciting structure of the limbic system, the periaqueductal gray as a lower-order trigger mechanism, and a sound pattern generator in the reticular formation of the brain stem feeding the motoneurons that are required to produce the respective sounds (Jürgens, 2009). In humans, too, this phylogenetically old system is still intact, enabling subjects who cannot speak to produce emotional sounds such as spontaneous pain cries, laughing, and yawning. This system is also active during speaking, providing speech sounds with a natural-sounding "tone" in terms of an affective-prosodic modulation (Ziegler and Ackermann, 2017). Humans, and non-human primates insofar as they are capable of some limited vocal learning, engage additional mechanisms for sound communication including motor cortex and subcortical cerebellar-thalamic and basal ganglia-thalamic circuits (Jürgens, 2009) as well as superordinate control mechanisms for volitional sound initiation, comprising anterior cingulate cortex (motivational coding), prefrontal cortex (related to decision making prior to sound production and partially homologous to the Broca area), and pre-supplementary motor area (preparatory motor signal) (Hage and Nieder, 2016; Gavrilov et al., 2017). This cognitive control network, particularly the connectivity between the pre-SMA and the frontal language areas (BA44, BA45, and premotor cortex) via the frontal aslant tract, is involved in mechanisms that provide a continuous and temporally coherent flow of speech and manage repair in case of detected errors, both during speech production and speech perception (Hertrich et al., 2016).
There seems to be a functional anterior-posterior gradient in this system regarding semantic, syntactic, and phonological aspects of speech (Anwander et al., 2007; Ford et al., 2010). Dysfunction of this network causes stuttering behavior (Catani et al., 2013; Kemerdere et al., 2016) or even more severe language deficits (Dhakar et al., 2016; Chernoff et al., 2018).
Regarding functional gradients, most studies on the core language regions are more or less restricted to phonology, syntax, and semantics up to the sentence level. However, cognitive control of language and speech also has to consider the superordinate level of discourse generation and management, which seems to rely on dorsal and medial prefrontal regions beyond the "classical" Broca area (Kim et al., 2012; Bourguignon, 2014; Moss and Schunn, 2015; Rouault and Koechlin, 2018; Panikratova et al., 2020). Further examples for the recruitment of prefrontal cortex are the task of verb generation including the management of competition among words, with additional activation of anterior cingulate cortex (Bourguignon et al., 2018), and the management of discourse coherence at the perceptual level, with additional activity in angular gyrus and posterior cingulate cortex (Moss and Schunn, 2015).

Insula: Interface to the "Inner Being"
Based on clinical studies on apraxia of speech, the anterior insula has been considered as an area for articulatory programming (Dronkers, 1996), which has been further specified in a follow-up study (Baldo et al., 2011). However, other studies could not really confirm these findings, showing that insular activations were stronger in case of non-speech movements than speech articulation and that the particular role of insular cortex for apraxia of speech was overestimated due to the fact that damage of the insula often co-occurs with damage in neighboring regions such as premotor cortex and posterior IFG (Fedorenko et al., 2015). However, there is still clinical (Mandelli et al., 2016) and electrophysiological (Basilakos et al., 2017) evidence that the insula is involved in speech articulation, and connectivity patterns of the insula toward language-relevant regions suggest a number of different functions giving rise to some further discussion.
In general, the insula seems to be related to vegetative functions and, thus, might represent an interface for the coordination of vegetative functions with voluntary motion during speech articulation (Ackermann and Riecker, 2010). Ackermann and Riecker (2004) suggest that the insular contribution to speech motor control may reflect phylogenetic roots that have developed for motor coordination during swallowing. Accordingly, and in line with the authors' finding that insular activation was related to overt rather than silent speech, a more recent study found that swallowing-related insular activity was present during execution rather than preparation of swallowing (Toogood et al., 2017).
Apart from motor control, the insula contains various target regions for inner perception and monitoring of inner states including reception of harmful as well as pleasant stimuli. As an example for pleasant stimuli, the perceptual target region of c-tactile fibers mediating soft touch and stroke signals from the skin is located in a posterior insular region, with functional connectivity to the emotion and reward system (Olausson et al., 2002;Sailer et al., 2016). Regarding adverse stimuli, various regions of the insula are sensitive to pain-eliciting stimulation (Duerden and Albanese, 2013). Furthermore, insular cortex seems to be an interface between pain sensation and cognitive components of pain perception such as attention, awareness, salience, and memory (Albanese et al., 2007). Thus, insular cortex responds not only to painful stimuli but also to neutral pictures that were previously paired with such stimuli (Forkmann et al., 2015). This latter aspect also seems important for language processing since language can also "hurt" in some sense. For example, bilateral insula was found to be activated by negative prosody paired with positive statements (Matsui et al., 2016). A comprehensive review of clinical data (Ardila, 1999) and meta-analytic connectivity analyses (Ardila et al., 2014) suggest that the insula is involved in multiple language functions related to different types of aphasia in case of dysfunction.
Regarding connectivity toward BA44 ("Broca"), BA22 ("Wernicke"), BA37 (object representations), BA40 (sensorimotor interface), BA7 (perspective taking and theory of mind), BA9 (among others: working memory), cingulate cortex, and SMA (cognitive control), insular cortex seems to have a central function that could refer to the integration of meaning with regard to affective and social cognitive functions, in line with other studies emphasizing the role of the insula for emotion processing, cognitive and behavioral control, and social networking (Cai et al., 2016;Langner et al., 2018;Spagna et al., 2018).
Considering the anterior-posterior dimension of the insula, most of the cognitive and motor control functions seem to be located in the anterior insula whereas posterior regions are more related to perception. For example, posterior insula was activated by words with high-arousing compared to low-arousing sounds at the sub-lexical level (Aryani et al., 2018), and clinical studies associated posterior insula lesions with word deafness and semantic conduction aphasia (Marshall et al., 1996;Ardila, 1999). However, so far the number of studies providing evidence for language-related functions of the posterior insula is still limited.

Orbitofrontal Cortex and Emotion Processing
As already mentioned when the temporal pole and the uncinate fasciculus were addressed, left orbitofrontal cortex plays a role in emotional-semantic processing, particularly when emotions have to be perceived in a cognitive task (Ethofer et al., 2006). By contrast, the production of emotional speech seems to be served by the anterior cingulate cortex and subcortical mechanisms through the basal ganglia. Apart from emotion processing, orbitofrontal cortex comprises an area for secondary olfactory representations which is activated by lexical items that semantically refer to olfactory sensation (Pomp et al., 2018). Considering human evolution, an increase in the size of orbitofrontal cortex seems to be a "modern" extension of the limbic system, linking emotional processing to cognitive reasoning and, thus, providing an interface between phylogenetically old, instinct-driven communication mechanisms, on the one side, and human social communication via language, on the other (Semendeferi, 2018). In line with this bridging function, orbitofrontal cortex seems to play a role in intuitive (instinct-driven) responses of adults to child cues, including a modification of their speaking behavior such as "motherese" speech when communicating with small children (Parsons et al., 2017). Additionally, orbitofrontal cortex shows differential activation patterns during parent-infant communication depending on the child's gender (Mascaro et al., 2017), which might also be considered as a kind of "intuitive" behavior. Regarding lexical-syntactic interactions (abstract vs. concrete verbs, used in transitive vs. intransitive syntax), orbitofrontal cortex (IFG pars orbitalis) showed stronger activity in case of intransitive as compared to transitive sentences, and this effect was stronger for abstract as compared to concrete verbs, indicating an integration of word-level and construction-level meaning (Van Dam and Desai, 2016). Regarding the test materials of this study in detail, abstract intransitive verbs referred to feelings and emotional processing, in line with the above references on emotion-related processing in orbitofrontal cortex.

Right-Hemisphere Homologs of Left-Dominant Language Areas
Prosody
In a similar way as propositional language (providing explicit semantic information) is organized in the left hemisphere, giving rise to aphasic symptoms in case of dysfunction, the right hemisphere seems to manage the affective-prosodic content of language, as indicated by clinical studies on right-hemispheric brain lesions (Ross, 1981;Ross and Monnot, 2008) and TMS experiments (Hartwigsen and Siebner, 2012). However, the particular role of the right hemisphere in speech processing is not restricted to affective prosody since also linguistic prosody elicits right-lateralized activity in dorsolateral frontal cortex (Wildgruber et al., 2004). As shown by Kreitewolf et al. (2014), the laterality patterns in fMRI studies are largely dependent on the respective control condition since, in fact, linguistic prosody processing relies on a bilateral mechanism in that right-dominant representations of prosodic cues are combined with left-dominant linguistic structures.
When contrasted to a phonetic task, the processing of affective prosody showed right-lateralized activity in superior temporal sulcus and IFG (BA45 and BA47), and when contrasted to a linguistic task (emotional word meaning), the right hemisphere showed stronger activity in posterior temporal cortex (BA21) and bilateral middle/inferior frontal gyrus (BA45/46) (Ethofer et al., 2006). A facilitating role of affective prosody for linguistic encoding was addressed in a further fMRI study, also showing right-lateralized activity in the temporal lobe, at least for prosodic expression of fear and happiness (Leitman et al., 2010). It was also shown that right-lateralized processing of affective prosody is linked to understanding a verbal message in terms of intentions and mentalizing (Hellbernd and Sammler, 2018). In a similar way as the left hemisphere shows a ventral (more lexical) and a dorsal (more phonetic/phonological) stream of processing, for the right hemisphere such a dual stream hypothesis was formulated with regard to the processing of prosody (Sammler et al., 2015).
Regarding the early auditory stages of speech encoding, an "asymmetric sampling in time" hypothesis was set up, suggesting that the left hemisphere has a preferred temporal integration window of 20-40 ms, adapted to detect phonetic features that are related to formant transitions in single speech sounds, whereas the right hemisphere preferably processes longer time intervals (150-250 ms) that are related to suprasegmental aspects of speech such as the syllabic modulation (Poeppel, 2003;Poeppel et al., 2008). Furthermore, the right hemisphere seems to be dominant with respect to the processing of pitch, as has been shown in dichotic listening experiments (Jia et al., 2013;Wu et al., 2017) as well as in studies on phase locking to pitch periodicity in auditory cortex. Further evidence for right-lateralized early representation of pitch was provided in studies on music processing (Jantzen et al., 2014;Wengenroth et al., 2014).

Right Frontal Non-verbal Working Memory Functions, Semantic Monitoring, and Inhibitory Control
The classical view of working memory distinguishes a left-lateralized, phonologically organized verbal working memory overlapping with left-hemispheric regions for language and speech generation and a right-lateralized "visual sketchpad" comprising, among others, right-hemisphere homolog areas within the frontal cortex, in addition to some secondary visual areas (Baddeley, 2003). Apart from visual memory structures, right prefrontal cortex also seems to be involved in a pragmatic working memory (Ptak and Schnider, 2004) and a working memory for emotional prosody (Mitchell, 2007). Further evidence for right-frontal memory functions is provided by a study on patients with episodic memory deficits associated with hypometabolism in right inferolateral prefrontal cortex (Brand et al., 2009). In line with these suggestions, right-hemispheric functions seem to be involved in discourse mapping by building up context representations (Robertson et al., 2000).
Various studies have shown that right-hemispheric IFG has an inhibitory control function on left-hemispheric IFG with regard to language, speech, and action processing (Aron et al., 2014;Heiss, 2016;Neef et al., 2016;D'Alberto et al., 2017). Due to connectivity of right IFG with right-prefrontal memory functions (Cai et al., 2014), the combination of episodic/pragmatic memory with the inhibitory functions of right IFG could be used for semantic/pragmatic monitoring and error management in terms of a semantic/pragmatic plausibility check. There is also some clinical evidence for semantic monitoring in right IFG (Sims et al., 2016). However, experimental studies explicitly addressing the semantic/pragmatic memory and monitoring function of right frontal cortex are still rare and might be an interesting field for further research.
A particular inhibitory control function of the right dorsolateral prefrontal cortex (rDLPFC) seems related to language switching in case of bilingual subjects. After anodal or cathodal direct current stimulation of rDLPFC, speech reaction time was prolonged in switch trials as compared to non-switch trials in Chinese (L1)/English (L2) bilinguals, concomitant with an altered electrophysiological late positive component (Liu et al., 2020). By contrast, stimulation of the left DLPFC did not yield a behavioral impact on language switching, as reported in a transcranial stimulation study (Pestalozzi et al., 2020). Furthermore, an MEG study has shown that language switching is associated with two different processes: earlier left IFG activity during presentation of the language cue (from L1 to L2) and a later right IFG magnetic field component associated with the presentation of the naming cue, presumably indicating an inhibitory effect to suppress the word in the dominant language (Zhu et al., 2020).

Subcortical Circuits
The important role of subcortical circuits for speech motor control is known from clinical studies on dysarthrias due to basal ganglia or cerebellar disorders. While the basal ganglia have a long "tradition" in the evolution of sound communication in terrestrial vertebrates, the involvement of the cerebellum seems to be phylogenetically linked to the development of human language in a more specific way (Ziegler and Ackermann, 2017). Apart from sensorimotor and timing functions, subcortical structures are also involved in cognitive tasks and higher-order language functions (Booth et al., 2007;Bouvier et al., 2017;Kang et al., 2017).

Basal Ganglia
Concerning clinical aspects, Parkinson's disease (PD) has been considered as a paradigm of basal ganglia dysfunction, causing "hypokinetic dysarthria," a motor speech disorder with reduced and imprecise articulation and a lack of prosodic modulation (Ackermann and Ziegler, 1991;Duffy, 2013;Whitfield and Goberman, 2014). In addition to motor execution, various aspects of the entire communication process seem to be affected in PD such as motor planning (Spencer and Rogers, 2005), speech fluency (Goberman et al., 2010), and various cognitive functions including initiation, sensory integration, self-monitoring, and turn taking (McNamara et al., 1992;McNamara and Durso, 2003;Sapir, 2014), up to linguistic functions such as lexical retrieval (Saldert et al., 2014) and pragmatic processing at the sentence level (Monetta et al., 2009;Holtgraves and McNamara, 2010). Apart from PD, other disorders or lesions of the basal ganglia may also lead to symptoms of subcortical aphasia such as anomia, reduced word fluency, and poor speech comprehension. In general, these deficits show up in more complex and demanding language tasks at the discourse or syntax level, and they seem to be more severe in language production as compared to comprehension (Bouvier et al., 2017).
However, the detailed pathomechanisms of subcortical language symptoms are difficult to assess, and in many cases additional cortical or white matter lesions cannot strictly be ruled out (Radanovic and Mansur, 2017). Nevertheless, fMRI studies on language processing have shown that the basal ganglia are involved in a variety of specific language-related tasks such as perceptual category learning (Lim et al., 2014), the management of word categories (Bonhage et al., 2015), emotional prosody, ambiguity resolution (Ketteler et al., 2008), control of speaking rate, and control of infant-directed speech (Matsuda et al., 2014).

Cerebellum
Clinical studies on ataxia and dysmetria in cerebellar patients (slowed and imprecise motor performance and deficits in motion perception) as well as experimental studies in healthy subjects emphasize the role of the cerebellum for speech (and non-speech) motor control (Manto et al., 2012). This cerebellar function seems to be involved in both the production of overt speech and the organization of internal speech representations (Ackermann, 2008). Additionally, the cerebellum serves a variety of cognitive language-related functions, part of which are summarized in the book "The Linguistic Cerebellum" (Mariën and Manto, 2015) and addressed in a "consensus paper" (Mariën et al., 2014). Some examples for such cerebellar functions are: activation of the dentate nucleus in a verb generation task (Thurling et al., 2011), clinical studies on cerebellar-induced aphasia (Mariën et al., 1996), cerebellar agrammatism (Adamaszek et al., 2012), cerebellar involvement in visual (gestural) prosody (Bernard et al., 2015), and a cerebellar contribution to an auditory speech representation during silent reading (Moberget et al., 2016).
The cerebellum is mostly connected to the cortex in a crossed way. Thus, linguistic functions of the cerebellum tend to be lateralized to the right cerebellar hemisphere (Jansen et al., 2005), with the exception of subjects with right-hemispheric cortical language dominance (Hubrich-Ungureanu et al., 2002). Considering the locations within the cerebellar hemispheres, motor representations seem to be anterior to cognitive and affective functions (Stoodley and Schmahmann, 2010). A more detailed topography of cerebellar functions was established by a combination of resting state cerebellar-cortical connectivity analysis and the evaluation of task-specific fMRI activation maps considering motor, working memory, language, and social processing tasks (Guell et al., 2018c). Furthermore, it has been shown that most cognitive tasks are associated with three regionally distinct cerebellar activations, the differential function of which is still an open question (Guell et al., 2018b).
In general, the cerebellum is involved in action-perception coupling (Christensen et al., 2014). Thus, cerebellar activity in motor regions of the cerebellum during language tasks can demonstrate an aspect of sensorimotor embodiment of language (Garcia et al., 2017). Consideration of cerebellar involvement in embodied cognition gave rise to a more generalizing theory of cerebellar function, in terms of a Universal Cerebellar Transform Theory, assuming a similar structure of cerebro-cerebellar information exchange for motor, affective, and cognitive representations (Guell et al., 2018a). In case of cerebellar dysfunction, a "Dysmetria of Thought" hypothesis was formulated for the cerebellar-cognitive-affective syndrome (Gomez-Beldarrain and Garcia-Monco, 1998), in analogy to motor dysmetria in cerebellar ataxia.
Cerebellar functions seem to be particularly important during the phase of language acquisition (Riva and Giorgi, 2000;Vias and Dick, 2017), which might be related to the demand of establishing procedural memory structures prior to integration of stored information into a declarative mental lexicon (Fawcett, 2007, 2011;Clark and Lum, 2017;West et al., 2018). A further aspect of cerebellar language functions is the role of the cerebellum in linguistic prediction, i.e., language processing in an anticipatory manner by making forward simulations of upcoming content (Runnquist et al., 2016;Pleger and Timmann, 2018).

Thalamus
As a third subcortical structure, the thalamus has to be mentioned. First, this central structure of the brain serves as an afferent gate toward the cortex by directing sensory input signals to their modality-specific cortical target regions, including early speech-relevant auditory processing (Bartlett, 2013). Second, it serves as the output gate from the basal ganglia and the cerebellum to the cortex (Kotz and Schwartze, 2010;Habas et al., 2019) and, third, it serves as a coordinator for distant cortical areas by managing cortico-cortical connectivity and bandwidth for information exchange (Klostermann, 2013). Regarding language processing, some evidence for thalamic functions arises from clinical studies, showing aphasic deficits in case of thalamic lesions (Jonas, 1982;Crosson, 1985;Pergola et al., 2013). Summarizing the language-related thalamic functions, Crosson (2013) emphasizes (1) the control for selective engagement of task-relevant cortical areas, (2) information transfer from one cortical area to another, (3) sharpening the focus on task-relevant information, and (4) selection of one language unit over another in the expression of a concept.

FUNCTIONAL NETWORKS
As an alternative to the consideration of particular brain areas as additional modules for language processing, distinct, more or less elementary networks can each be considered as a whole. Such networks may comprise conceptual aspects such as modality-specific object representations and memory structures, implementational executive aspects such as action planning, and mediational aspects of cognitive control facilitating the communication between conceptual and executive networks (Wig, 2017). Segregated networks may serve various specific or domain-general functions such as cognitive control, affective-emotional processing, social cognition, or theory of mind. In the following, some of these structures will be introduced, each containing one or more brain regions that have been addressed in section Anatomical Neighbors and Connectivity.

The Semantic System
While the core language network as defined here is largely restricted to phonological and elementary lexical-semantic functions, semantic processing, as a whole, comprises a huge network that is deeply embodied in various ways. It includes all kinds of world knowledge and comprises multiple areas in the brain such as modality-specific representations, sensorimotor regions, and emotion systems (Binder and Desai, 2011). Furthermore, "convergence zones" toward more generalizing and abstract categories in temporal and inferior parietal regions play an important role in semantic processing, as do dorsomedial and inferior prefrontal cortices, which control the goal-directed activation and selection of semantic information (Binder and Desai, 2011).
Based on structural connectivity analyses, three major subcomponents of the semantic system have been outlined comprising (1) a large-distance orbitofrontal-temporal-occipital network assembling object properties, (2) a middle and inferior frontal-subcortical module serving executive control of semantic processing, and (3) a medial temporal module as an interface to episodic memory (Fang et al., 2015). Regarding object representations, the semantic system is organized in a system of gradients in cortical features from sensory and sensorimotor to transmodal areas (Huntenburg et al., 2018). The medial temporal module of the semantic system overlaps with the "hippocampal-cortical memory system" (Eichenbaum et al., 1996;Pan and Tsukada, 2006) as a general interface for memory storage, management, and retrieval (Cooper and Ritchey, 2019). Regarding memory content, there seems to be a lateral-medial gradient in semantic representations where lateral regions relate to external knowledge and processes while medial regions relate to self-processing and autobiographic episodic memory (Maguire et al., 2000;Jouen et al., 2018).
Another semantic model considered three functional networks as the basis of semantic processing, comprising (1) a perisylvian "language-supported system" (partially overlapping with the core language network as defined here), (2) a "multimodal experiential system" also addressed as the "default mode network" integrating experience-based knowledge across multiple modalities (see below), and (3) a left-dominant frontoparietal network as a semantic control system (Xu et al., 2017). These three networks are linked together in hub regions, comprising the anterior temporal lobe, posterior middle temporal gyrus, posterior intraparietal sulcus, angular gyrus, and parts of superior and middle frontal gyrus. In general, depending on task demands, for semantic processing various memory systems may be recruited and temporarily linked together, or single subsystems can locally get expanded or diminished, as outlined by a "multiple memory systems theory" (Ferbinteanu, 2019).
Within the temporal lobe (extending into parietal cortex), the semantic system shows an organization linking various subcomponents, as suggested by structural and functional connectivity analyses (Jouen et al., 2018). First, a lateral system, comprising angular gyrus and the superior temporal pole, associates semantic representations of the external world in temporoparietal cortex (part of the dorsal language pathway) to the representation of abstract concepts in the anterior temporal lobe (part of the ventral pathway). Second, a medial system, linking retrosplenial with parahippocampal cortex, seems to be related to the default mode network (see The Default Mode Network and Self-Processing), to memory processes, to the perception of the "inner world," and to egocentric perspective-taking.
Regarding structural connectivity within the temporal lobe, the middle longitudinal fascicle seems to play a role for language processing by linking the angular gyrus with the temporal pole (Makris et al., 2013). This anterior-posterior connectivity in the temporal lobe seems to be part of a larger network also including superior medial prefrontal cortex. The function of this larger network is related to social-semantic and ToM aspects of discourse comprehension (Lin et al., 2018).
While semantics, as a whole, is a wide field exceeding the domain of the present review, the distinction between concrete and abstract concepts seems to be of particular interest (Montefinese, 2019), being closely related to the nature of language and its sensory embodiment or disembodiment. Thus, the embodiment of abstract concepts is more complex than that of concrete concepts (Buccino et al., 2019), abstract items are more related to emotional processing (Lindquist et al., 2015), they have a stronger representation in left inferior frontal gyrus (Shallice and Cooper, 2013) and left temporoparietal cortex (Skipper-Kallal et al., 2015), and they are characterized by longer processing time and different electrophysiological responses in the N400 domain and later potentials (West and Holcomb, 2000).

The Multiple Demand System
The "multiple demand system" (MDS) represents a domain-general fronto-parietal network for controlling all kinds of actions, including a superordinate cognitive control of language communication. Its activity correlates with general intelligence (Duncan, 2010). In spite of some variability across studies and task demands (Camilleri et al., 2018;Marek and Dosenbach, 2018), some bilateral core regions of the MDS have been listed, comprising parts of inferior, middle, and orbitofrontal cortex, precentral gyrus, insula, supplementary motor area, anterior cingulate cortex, and the inferior and superior parietal cortex (Müller et al., 2015;Mineroff et al., 2018). Within this network, two subsystems have been distinguished, a "frontoparietal" part related to task rules, comprising dorsolateral prefrontal cortex, inferior frontal junction, and intraparietal sulcus, and a cingulo-opercular part related to salience processing, comprising anterior cingulate cortex, anterior insula, and anterior prefrontal cortex (Crittenden et al., 2016).
On the one hand, the MDS can be considered as distinct from the core language network (Mineroff et al., 2018;Woolgar et al., 2018). On the other hand, the MDS seems necessary for language, at least in case of difficult task and memory requirements (Campbell and Tyler, 2018), and it seems to be engaged in language learning (Sliwinska et al., 2017). A particular network overlapping with the MDS has been outlined for language control in bilingual speakers, comprising dorsal and ventral parts of the frontal lobe, parietal cortex, subcortical areas and cerebellar regions (Wu et al., 2019).
Based on connectivity analyses, a somewhat extended multidemand network has been described, comprising three major parts: (1) a subcortical sensation/action-related part, (2) a frontal lobe part related to attention, language, working memory, and sensation, and (3) a large-distance network comprising the inferior frontal junction, inferior parietal sulcus, dorsal premotor cortex, and left inferior temporal gyrus, serving, among others, abstract thinking and some language functions (Camilleri et al., 2018). In general, the MDS, and in particular the dorsolateral prefrontal cortex as one of its centers, also serves as a link for connecting various other networks and memory systems, depending on task demands (Ferbinteanu, 2019).

The Default Mode Network and Self-Processing
The default mode network (DMN) was originally described as a set of brain areas that are deactivated rather than activated when subjects have to perform various tasks (Raichle et al., 2001). It seems to be active when individuals are dealing with themselves, considering autobiographical memory, simulating the future, or taking the perspectives of others. This system seems to be distinct from both the core language system and the multiple demand system (Mineroff et al., 2018). Neuroanatomically, it comprises medial temporal lobe (memory processing), medial prefrontal cortex (self-relevant mental simulations), and posterior cingulate cortex integrating these two processes (Buckner et al., 2008). Furthermore, the angular gyrus (or at least part of it) has been assigned to the DMN, as a supramodal hub area with various functions that are relevant for language processing in context, but that are also active in case of internal mentation when people are not engaged in external interactions (Seghier, 2012). Apart from activity during resting state, the DMN seems to be particularly active during embodied simulation of another's physical and mental states for the purpose of social cognition (Molnar-Szakacs and Uddin, 2013). Regarding language functions, language tasks that require access to episodic and semantic memories seem to engage the DMN (Binder et al., 2009;Geranmayeh et al., 2014). Thereby, the DMN seems to be engaged in memory-based top-down simulations, preferably processing concrete and self-related rather than abstract and external items (Xu et al., 2017).

The Theory of Mind System
Theory of Mind (ToM) refers to the ability to understand the minds of others (Siegal and Varley, 2002;Molenberghs et al., 2016), comprising a cognitive and an affective component (Westby, 2014). Neuroanatomically, it is a largely bilateral network comprising the temporo-parietal junction, medial parts of the temporal lobe, the temporal pole, parts of medial frontal cortex, and the precuneus (Molenberghs et al., 2016;Wellman, 2018). Furthermore, various cerebro-cerebellar circuits seem to play a major role for ToM processing, representing a Cerebro-Cerebellar Mentalizing Network (D'Mello and Stoodley, 2015;Ryan et al., 2017). Regarding language functions, the ToM system is primarily engaged in pragmatic processing when individual- or situation-specific meanings must be derived, or when inferences have to be made such as required for understanding indirect requests (Van Ackeren et al., 2012). Although the ToM and the language network can be considered as distinct networks, they can get synchronized during language comprehension (Paunov et al., 2019). In some respect, theory of mind processing is also related to the Default Mode Network, as has been shown in a study about ToM abilities as a function of aging (Hughes et al., 2019).

The Salience Network
A further network that is relevant for some language tasks is the Salience Network, engaged in some aspects of cognitive control, integrating sensory input with regard to salience, in terms of conscious or unconscious relevance, in order to guide attention, and to recruit brain resources for potential responses (Peters et al., 2016). In some way, the salience network represents the "intensity of experience" (Toyomaki and Murohashi, 2013). Neuroanatomically, it comprises various parts of the insula, anterior cingulate cortex, and subcortical loops for integrating various kinds of input signals (Uddin, 2015). Regarding connectivity to other systems, the salience network can dynamically be linked to medial frontal cortex for internally directed responses and to dorsolateral prefrontal cortex for externally directed actions (Uddin, 2015). Regarding language processing, the salience network has been found relevant, for example, for working memory and narrative comprehension (Twait et al., 2018), atypical prosody understanding in case of a foreign accent (Hernández et al., 2019), and residual language functions in aphasic patients.

EXAMPLES OF LANGUAGE USE BEYOND THE "CODE" OF THE CORE LANGUAGE NETWORK
Just as the language network in the brain is interwoven with various other brain structures and functions, linguistic functions cannot be considered in isolation as long as we are interested in natural language, since human language is an open system with various interfaces to functions that are related to perception, cognition, acting, experiencing, and social interaction.

Emotional Language
In addition to the communication of a propositional meaning, natural language often conveys emotional messages. These can be addressed by using emotional words that are linked to a lexical-emotional pathway and/or by the way of speaking, in terms of affective prosody.

Affective Prosody
Affective prosody comprises a system of acoustic features such as pitch height, modulation depth of pitch, speaking rate, and voice quality (Banse and Scherer, 1996;Hammerschmidt and Juergens, 2007;Patel et al., 2011) that is relatively consistent even across languages and cultures (Scherer et al., 2001). Regarding functional brain anatomy, affective prosody, particularly for the modulation of pitch, engages the phylogenetically "old" phonatory control system of primates including, among others, anterior cingulate cortex (Belyk and Brown, 2016). Furthermore, it seems to rely on right-hemisphere analogs of the left perisylvian language regions, as indicated by clinical studies on brain lesions, e.g., in right IFG and right supramarginal gyrus (Patel et al., 2018;Wright et al., 2018). Considering white matter connectivity of the affective prosody network, a bilateral dual (dorsal and ventral) pathway structure has been outlined, linking upper temporal cortex to IFG (Frühholz et al., 2015). In particular, a right-hemispheric ventral pathway should be mentioned here, linking an "emotional voice area" in primary and secondary auditory cortex of the right hemisphere to ipsilateral IFG (Ethofer et al., 2012). The input signal into the emotional voice area again comprises a dual pathway, related to (1) the recognition of a human voice as the sound source via lateral parts of the auditory system and (2) emotional valence via the amygdala (Grisendi et al., 2019).
There seems to be a functional neuroanatomic differentiation of activation patterns of emotional voice processing, first, regarding the various sequential or parallel processes such as auditory representation, categorization, and evaluation (Schirmer and Kotz, 2006;Leitman et al., 2010;Patel et al., 2018) and, second, with respect to distinct emotions (Ethofer et al., 2009;Kotz et al., 2013). An fMRI study on affective and linguistic prosody found largely overlapping right-lateralized activation patterns for both tasks in superior temporal, dorsolateral and medial frontal, insular/fronto-opercular cortex, and cerebellum while contrast analysis between the two conditions showed bilateral orbitofrontal activity for affective > linguistic and left inferior frontal activity for linguistic > affective processing (Wildgruber et al., 2004).

Emotional Words
Apart from prosody, emotions can also be signaled linguistically by using lexical items that are linked to affective content. Some studies have shown that emotion words may cause particular priming and memory effects. For example, emotion or taboo words can be remembered better than neutral words (Jay et al., 2008). Furthermore, affect-arousing distractor words can lead to an "emotion-induced blindness" (less accurate encoding of words following an emotional item in a word list) which may interact with the "attentional blink effect" measuring the encoding of a stimulus depending on the time delay from a preceding stimulus (Anderson, 2005; Mathewson et al., 2008). Electrophysiological recordings have shown that affect words can elicit increased early posterior negativity (400-450 ms) and an increased late positive potential (520-600 ms), reflecting early semantic activation and consolidation in working memory, respectively (MacLeod et al., 2017).
A lexical decision task with a visual hemifield design showed a significant valence effect, i.e., shorter response times to positive-emotional words in comparison to neutral or negative items, concomitant with an enhanced right hemifield advantage (indicating left-hemisphere processing) (Martin and Altarriba, 2017). Considering functional neuroanatomy, the left temporal pole and its connectivity to orbitofrontal cortex via the uncinate fasciculus seem to be important for lexical-emotional processing (Olson et al., 2007; Ethofer et al., 2009), which is also confirmed by a speech production experiment considering words with emotional connotations (Crosson et al., 1999). Regarding the enhancing memory effects of emotion words (Jay et al., 2008), valence and arousal seem to be processed with differential connectivity patterns toward the hippocampus: a fast pathway via the amygdala in case of arousal, and a slower and more controlled pathway via prefrontal cortex for valence (Kensinger and Corkin, 2004). In contrast to other emotion-inducing stimuli, memory-enhancing effects for emotional words seem to work in a language-specific way rather than just by activation of the autonomic nervous system (Bayer et al., 2011).
To some degree, the implementation of emotion words, bound to particular contexts during language acquisition, may differ across languages and cultures (Altarriba, 2003; Basnight-Brown and Altarriba, 2018). Furthermore, lexical emotional effects may differ between a subject's native language (L1) and a second language that was acquired later in life (L2), due to the fact that L1 is acquired in emotionally richer contexts as compared to more neutral scholarly environments for L2 (Ivaz et al., 2016). At the level of brain activity, L1/L2 language effects have been shown in electrophysiological parameters as well as hemodynamic activations. For example, positive emotion words elicited larger early posterior negativity and a smaller late positive component in L1 concomitant with reduced occipital and left cerebellar activity, indicating rapid and automatic attention effects, while in L2 emotion-related activity seemed to be associated with semantic retrieval.

Non-literal Language
A comprehensive meta-analysis of fMRI studies on non-literal language found a widespread network comprising 409 activation foci of which 129 were in the right hemisphere (Rapp et al., 2012). These activations were largely overlapping with the core language network, presumably because non-literal language imposes an increased cognitive load on this network. The following sections will report some examples demonstrating the spectrum of task-specific recruitment of brain resources related to non-literal language processing.

Irony
For understanding irony, literal meaning must be inverted or negated, which may require more complex brain activity than literal understanding. Furthermore, detecting the demand for an inversion may require an additional evaluation of pragmatic information. While the bulk of activity that is additionally required for ironic speech processing takes place within the left perisylvian language regions (Rapp et al., 2012), some right-hemispheric functions seem to be essential for understanding irony, as indicated by a clinical study in which right-frontal brain-damaged patients had problems with the understanding of irony (Champagne-Lavau et al., 2018). Obviously, the irony-triggering context was not accessible in these patients, confirming other clinical studies that indicate pragmatic memory deficits in case of right frontal cortex dysfunction (Ptak and Schnider, 2004). An electrophysiological study addressed the cognitive load due to irony in comparison to syntactic difficulty (Regel et al., 2014). Both conditions elicited a P600 potential with similar latency, but with different field distributions (irony slightly more right-lateralized) and differences in alpha- and theta-band amplitudes, indicating that different networks are engaged in processing irony vs. syntax.
Apart from the integration of right-hemispheric pragmatic cues, left middle temporal gyrus (lMTG) seems to be specifically involved in irony processing, as indicated by a study comparing fMRI effects of deceitful vs. ironic language, while activation in left frontal cortex and right cerebellum was comparable for the two conditions (Bosco et al., 2017). Further evidence for the importance of MTG was provided in a study on healthy schizotypal subjects, showing an inverse relationship between the subjects' schizotypal personality score (SPQ) and irony-related activity in MTG whereas a positive correlation was found between SPQ and IFG activity, which was interpreted as a compensatory strategy (Rapp et al., 2010).
As a special case of irony, sarcasm in spoken language can be associated with a prosodic modulation of an utterance. In an fMRI study, such modulations activated an affective-prosodic network comprising bilateral insula, left inferior frontal cortex, and cingulate cortex, while the anterior part of left IFG (BA47) seems to be particularly important for the integration of affective-prosodic cues into a pragmatic meaning (Matsui et al., 2016).

Metaphors
Metaphors, characterized by a non-literal associative meaning, may require additional lexical operations which are reflected by additional activity in middle temporal gyrus and anterior IFG (Rapp et al., 2004). Often metaphors are related to sensation and, thus, associated with brain activity in sensory regions. For example, "body" areas in secondary visual cortex are active during the comprehension of metaphors that are related to body parts (Lacey et al., 2017). Similarly, lexical items referring to olfaction can activate olfactory orbitofrontal cortex in case of both metaphorical and literal context (Pomp et al., 2018). In addition to sensory associations, metaphors may evoke affective brain responses, indicated by activation of left amygdala (Citron et al., 2016). Furthermore, the processing of metaphors can interact with emotionality of the context in terms of an emotion-induced mental simulation of the literal meaning of a metaphor (Samur et al., 2015).
In contrast to conventional, more or less lexicalized metaphors, novel metaphors require additional cognitive operations associated with switching from the literal representation to the search for a new non-literal meaning. In this case, the precuneus, left angular gyrus, and right intraparietal sulcus seem to be involved, with connectivity to right anterior insula, preceding later coupling of left angular gyrus and dorsolateral prefrontal cortex (Beaty et al., 2017). More detailed information about the time course of processing was obtained in an electrophysiological study showing right frontal P200 attenuation preceding the N400 effect of novel metaphors, which was interpreted in terms of context-sensitive early semantic scanning of the incoming words, facilitating later stages of decision making about the meaningfulness of the respective sentence (Schneider et al., 2014). While most of the processing of metaphors seems to rely on left-hemispheric functions, particularly the right posterior temporal lobe seems to play a role in case of novel metaphors, as indicated by transcranial magnetic stimulation studies (Hartwigsen and Siebner, 2012).
Various clinical groups have difficulty understanding metaphors, for example, subjects with schizophrenia or autism spectrum disorder (ASD). In case of schizophrenia, mainly the left frontotemporal language network seems affected, apart from some other regions such as parts of the theory-of-mind network (Rossetti et al., 2018). By contrast, ASD individuals tend toward increased activation in regions related to verbal memory, semantic associations, and basic visual processing, presumably as a compensatory strategy. Functional cortico-cortical and cortico-subcortical connectivity was largely reduced in these patients, which was interpreted in terms of a more global impairment in cognitive control pathways (Chouinard et al., 2017).

Idioms
A further class of non-literal language features refers to idioms, that is, conventionalized phrases that are often used by particular social groups or in a particular environment or communicative setting. In contrast to metaphors and irony, idioms, depending on context and speaker group, may be used quite frequently and, thus, in a highly automatized manner. Thus, the non-literal meaning can be largely lexicalized and may even occur more frequently as compared to the literal meaning, which gave rise to the formulation of a "Graded Salience Hypothesis" (Giora, 1997). Rather than distinguishing a literal from a non-literal meaning, it seems important to distinguish a more salient, frequent, or kind of default meaning from a less frequent, non-standard meaning. In the case of ambiguity, both meanings have to be kept in memory until a disambiguation process is performed, which seems to be implemented in the brain in a bilateral network including inferior frontal and middle temporal gyri for semantic representations as well as an anterior prefrontal area for cognitive control (Papagno and Romero Lauro, 2010). While the bulk of processing idioms seems to be performed by the left hemisphere (Häuser et al., 2016), the right hemisphere seems to play a particular role for the inhibition and/or maintenance of non-salient meanings (Mashal et al., 2008) or for accessory functions such as non-language (e.g., visuospatial) semantic representations of literal and non-literal meanings (Papagno et al., 2006).

Non-lexical Aspects of Speech
While non-literal speech still relies on particular lexical meanings that can be inferred by pragmatic inference, speech may comprise additionally or exclusively features beyond any lexical-semantic meaning. On the one hand, such features comprise a kind of "natural," non-arbitrary code of speech sounds such as onomatopoeia, sound symbolism, or iconicity. On the other hand, in particular situations formulaic language-like utterances may be produced without a compositional semantic structure, serving direct emotional expression.

Non-arbitrary Coding
In principle, phonological-lexical coding of language is arbitrary, i.e., any word form may be combined with any kind of meaning. However, observable across different languages, there is a significant above-chance probability of a direct phonetic-to-semantic mapping which may either directly be related to acoustic sound generation such as imitation of animal sounds or to cross-modal analogies of coding (Lockwood and Dingemanse, 2015; Svantesson, 2017). The functional relevance of non-arbitrary coding seems to be that it has a facilitating function for language acquisition, as has been shown, for example, for 2- and 3-year-old children (Imai et al., 2008). This effect seems to be particularly strong at early stages, indicated by a significant correlation between the iconicity of words and the average age of their acquisition (Perry et al., 2015). However, adults are also still sensitive to sound symbolism, as has been shown in artificial language experiments (Nielsen and Rendall, 2013; Sidhu and Pexman, 2017). Apart from sensory aspects related to sound, size, and shape of objects, emotional aspects may also be directly conveyed by speech sounds (Ullrich et al., 2016), and the motor system also seems to play a direct role for iconicity, due to a parallel structure in tongue and hand movement control (Vainio et al., 2017).
At the level of brain activity, iconicity may affect wide-spread networks including sensory association areas as well as the motor system. Electrophysiological studies have shown various components related to iconicity such as an early posterior effect over the visual system with a latency of about 160 ms (Kovic et al., 2010), effects of affective processing at about 200 ms (Ullrich et al., 2016), and later semantic effects exceeding 400 ms (Kovic et al., 2010; Ullrich et al., 2016). Even in 1-year-old infants, early auditory-phonetic (latency ca. 200 ms) as well as later (ca. 400 ms) sound-to-meaning effects of sound symbolism have been demonstrated (Arata et al., 2010; Asano et al., 2015).
In spite of various significant iconicity effects, the advantage of non-arbitrary coding seems to be limited; otherwise languages would be much more iconic. As suggested in a recent opinion paper, some pressure in the direction of arbitrary coding seems to be at work, related to a language's demand for generalization and abstractness (Lupyan and Winter, 2018). A further study showed that preferably those words show iconicity that are either closely related to sensory experience or have a low semantic neighborhood density (Sidhu and Pexman, 2018). Thus, with increasing semantic neighborhood density and distance from concrete perception (toward abstractness and generalization), iconic coding seems to become inefficient.

Formulaic Language
Language, in general, is related to propositional meaning and more or less hierarchically structured into phonological features, lexical words, and syntactic phrases. However, in particular situations language seems to be used in a different way and to be structured differently. An example of quasi non-compositional speech is formulaic language when it is used for direct affective expression. In this mode of language use, rather than the normal left-dominant language network, a right-hemispheric/subcortical network seems to be active, as indicated by clinical studies as well as hemodynamic activation patterns (Sidtis and Sidtis, 2018). Sidtis and Sidtis propose a dual process model for linguistic behavior comprising distinct lines of cognitive activity that are differentially lateralized to the two hemispheres, one for grammatical processing of language and one for obtaining direct affective, attitudinal, or emphatic information. While formulaic language often comprises single words such as expletives, another form of formulaic language may comprise stretches of speech such as in poems and prayers. Also in this case, utterances are produced in a highly automatized way without explicit lexical access, largely relying on subcortical structures, as indicated by respective deficits in patients with Parkinson's disease (Bridges et al., 2013). As a further clinical example and some evidence for right-hemispheric control of formulaic speech, an aphasic Japanese patient with a lesion in the left temporal lobe seemed to use a kind of "sutra" mode of communication, comprising not only Buddhist prayers, but also stereotypic expressions such as greetings (Shinoura et al., 2010).

Speaking in Tongues
Glossolalia or speaking in tongues is a particular mode of non-lexical speaking representing "god-inspired" speech. It has been defined as some kind of babbling with phonological similarity to a language, but without a consistent syntagmatic structure (Samarin, 1973). In some respect, it resembles speech in cases of mental disorders, characterized by a deactivation of some mechanisms of cognitive control (Chouiter and Annoni, 2018). Regarding electrophysiology, glossolalia has been observed in association with particular activity at temporal EEG electrodes (Persinger, 1984; Reeves et al., 2014), and a preliminary SPECT study has found decreased cerebral blood flow during glossolalia (as compared to singing) in prefrontal cortices, left caudate, and left temporal pole whereas an increase was observed in the left superior parietal lobe and right amygdala (Newberg et al., 2006). The latter activation, indicating an involvement of the affective system, might be related to a finding that glossolalia in Apostolic Pentecostals is associated with systematic changes in biomarkers of stress and arousal (Lynn et al., 2011).

The Door for Mental Representations Into Language
The core language network, comprising phonological, lexical, and syntactic processing, relies on the existence of lexical representations. However, such representations may not be available for expressing something, first, during language acquisition and, second, if something new has to be verbalized. Thus, the formation of language in the brain must be understood as a dynamic process in terms of an open system that is able to implement a structure of language-coded information. This process has been described in terms of "neuroconstructivism," considering the individual implementation of cognitive brain functions as a developmental trajectory, or "neuroemergentism," accounting for the fact that new functions may arise by reuse and reconfiguration of existing brain modules (D'Souza and Filippi, 2017; Campos et al., 2019; Dick and Krishnan, 2019; Hernandez et al., 2019).

Deictic Language
An interesting phenomenon in language communication is the use of underspecified lexical items such as spatial demonstratives, requiring contextual or additional non-language information such as pointing gestures. In such cases, a common spatial reference system has to be built up comprising, at the level of brain activity, right-hemispheric spatial processing concomitant with activation of the mentalizing system including medial prefrontal cortex and the temporoparietal junction (de Langavant et al., 2011; Peeters et al., 2017; Vanlangendonck et al., 2018). An fMRI study with word-level time resolution during the perception of narrative speech found spatial demonstratives to recruit dorsal parieto-frontal "where" pathways that are also active in extra-linguistic visuospatial cognition and attentional orienting (Rocca et al., 2020).
The combination of pointing gestures with underspecified speech is also important in child communication when the vocabulary for some items has not yet been established. So during language acquisition infants have to rely on the pragmatics of cooperative communication in which shared spatial experience plays a crucial role (Liebal et al., 2009; Grassmann and Tomasello, 2010).

New Vocabulary-Long-Term Shaping of the Mental Lexicon
Since natural language is an open system, even adults may come across new words and concepts that have not been entered into their mental lexicon before. Furthermore, existing words are continuously subject to change in meaning and/or phonological structure, which in the end may result in diachronic change of a language. An attempt to outline a model for integrating new items and meanings into the language system at the level of brain activity has been made by Rodríguez-Fornells et al. (2009). In principle, it relies on the dynamic mechanisms of pragmatics, resulting in the fact that each pragmatic operation, such as the context-specific modification of a word meaning, leaves some traces in the semantic system of the mental lexicon. The neural substrate of this process seems to involve temporoparietal regions attached to the dorsal path of language processing as indicated by studies on word learning using transcranial electrical DC stimulation (Perceval et al., 2017). In order to further understand these dynamic processes, the nature of concepts and mental objects has to be considered (Mahon and Hickok, 2016), as well as attempts to explain ongoing changes in the mental lexicon in terms of "exemplar" theories assuming that concepts are represented as remembered category instances during category learning (Murphy, 2016). However, beyond emphasizing that the integration of new lexical items requires neuronal resources that largely exceed the core language network, this aspect of long-term formation of language exceeds the scope of the present review.

Interference of Speaker Characteristics and Speaker Identity With Language Processing
One of the central functions of speech communication is the transfer of knowledge from one person to another. In this respect, not only the propositional content of speech is important, but also the authenticity and competence behind the words. For example, there is electrophysiological evidence that brain responses to verbal messages depend on the listeners' belief about the origin of this speech, e.g., being produced by a human speaker or a machine, suggesting that "who is perceived as saying something can be as relevant as what is said" (Schindler and Kissler, 2016). Just as voice identification interacts with the processing of affective communication in animals (Kato et al., 2018), human utterances are perceived and evaluated differently depending on the respective speaker or the individual speaker-listener relation, and the uptake of information, for example in terms of learning from speech, is largely selective with regard to speaker attributes such as competence, age, and confidence (Poulin-Dubois and Brosseau-Liard, 2016).
During speech communication, auditory speaker identification relies on various cues such as voice pitch, spectral characteristics, and prototypical features. These cues are processed in regions near right Heschl's Gyrus, bilateral posterior STG and adjacent temporoparietal regions, and right anterior STG (Lattner et al., 2005). Apart from auditory cues, non-auditory information about the origin of a linguistic message may be used such as visual information that may interact with voice processing, e.g., via connectivity of the fusiform face area with voice areas in the upper temporal lobe (Benetti et al., 2018).
Apart from permanent speaker characteristics, situational speaker variability, which might be indicated by prosodic cues, can also interact with linguistic processing. For example, an electrophysiological study has shown that "motivational language" is processed differently compared to messages in a neutral tone: Cues signaling some pressure cannot be ignored and may lead to preferential and more in-depth processing as early as the P200 potential reflecting phonological encoding (Zougkou et al., 2017). Similarly, it can be assumed that the recipient's perception of a speaker's emotion, arousal, attitude, or temporary mental state interacts with linguistic processing, as well as any declarative knowledge about the speaker.
A particular speaker/listener effect is evoked when the listener's own name is uttered by a familiar voice. A study on traumatic patients in vegetative state or unresponsive wakefulness syndrome has shown that the subjects' brain responses to such auditory stimulation yielded a high prognostic value for later recovery (Wang et al., 2015).

DISCUSSION
The above sections have shown that the core language network is linked to other brain regions and entire networks in various ways and that this connectivity is related to various functions of speech and language communication. In natural situations, the core language areas may be used like a tool or module, but the entire communication process, depending on its particular aim and function, exceeds the domain of this tool in many aspects. Such aspects may comprise social interaction, knowledge exchange, or the report of an experience. Depending on the particular purpose of the communicative action, the language network is connected to other functional networks related to various domains such as theory of mind and social interaction, non-verbal memory structures (e.g., in case of coherent text reception), an expanded mental lexicon representing world knowledge (e.g., in case of explanations), motor representations (e.g., in case of verbal descriptions of motor actions), or the affective system (e.g., in case of emotion communication or language-induced emotional responses). Presumably, many additional networks are also linked to the language system in a task- or situation-specific way. Furthermore, after considering the margins we may come back to the question about the border of a language network in general. Depending on task demands, it may only partially be recruited. During reading, for example, speech articulation might be partially switched off, and during speech communication via telephone, the interface to the visual system might be partially switched off.
There seems to be a shift in the consideration of language in the brain: Rather than making the language network larger and larger, it seems useful to keep the core language network small and consider larger structures (additional brain regions or entire networks) as additional modules that are recruited depending on the respective task. The definition of subnetworks in the brain largely depends on the analysis methods, and there is still a large variability across different studies and meta-studies. Furthermore, it should also be considered that even the core language areas are not reserved for language only, since language and domain-general functions may be closely neighbored, for example, in "Broca's" area (Fedorenko et al., 2012).
Developmental aspects are not the primary focus of the present review, but the question may arise how the language system gets implemented and "finds" its place within the developing brain. Obviously, it must be programmed from outside through its margins, by starting with pragmatic deictic/iconic communication until incrementally a meaningful vocabulary can be formed (Liebal et al., 2009; Grassmann and Tomasello, 2010; Paulus and Fikkert, 2014; Macoun and Sweller, 2016). Thus, although pragmatics in linguistic science often has been considered as more complex and less easy to comprehend as compared to phonology and lexical semantics, in fact, it is the more elementary and basic aspect of communication as compared to the "core" language functions.
In the above description, the core language network was characterized as "disembodied" because language functions can be more or less de-coupled from the original embodied way through which they have been implemented. However, in spite of this aspect of disembodiment, language is very efficiently bound to the overall behavioral/experiential world of its user, in a largely effortless and highly automatized manner. Thus, in some respect language processing could be considered as "superembodied." This term has been used for representations of Japanese spiritual beings as "superembodied" forms of agency (Jensen et al., 2016). Furthermore, superembodiment refers to pathological states such as the Cotard syndrome or missing proprioception in which, however, patients are still able to effortlessly perform automatized body movements such as gesticulating or even car driving (Gallagher and Cole, 1995; Gallagher, 2005, 2006). While they perceive their physical body as largely "dead," they still have precise behaviorally relevant body representations, even beyond the physical body in case of car driving. Similarly, language seems to establish a secondary modality-like structure in the brain that is partially detached from its embodied roots through which it was originally acquired (e.g., imitation of audiovisual perception of articulatory movements). So the "superembodied" language system comprises a phonologically organized structure of representations, similar to somatotopy in the motor system or tonotopy in the auditory system, that is closely linked to a structure of highly elaborated behavioral and perceptual patterns.
As a perspective for treating language disorders, the present review may provide some ideas for language training by approaching the language network from its margins. In this respect, domain-general networks may be particularly relevant for language recovery in stroke patients (Geranmayeh et al., 2017). Similarly, brain structures outside the core language network are relevant for children during language acquisition since at an early stage they do not have a mental lexicon of word forms and meanings onto which the incoming speech signal can be mapped. Studies on early blind subjects have shown that mental objects such as shapes, textures, and categories that in sighted subjects are bound to the ventral visual stream approximately "find" their "correct" place in the brain even in the absence of any visual input (Handjaras et al., 2016). This process seems to be driven by top-down mechanisms from modality-independent higher-order semantic regions (e.g., generalized object representations) and by cross-modal interactions with other modalities such as audition and somatosensation. For example, the "visual word form area" is established in blind subjects as a language region (Botthali et al., 2014) that is active during both somatosensory Braille reading and auditory language input (Kim et al., 2017). Other category-selective visual regions can also be activated during tasks that are related to these categories without any visual experience (Peelen and Downing, 2017). Furthermore, this topographic organization is subject to continuous plasticity throughout life (Striem-Amit et al., 2015). Thus, in a similar way as higher-order visual regions can be organized from "outside," language areas might also be restored or retrained in the neighborhood of lesions by idiosyncratic stimulation from outside the core language system, using a multimodal setting that is ecologically relevant to the patients' communication demands.
A limitation of the present review is that it does not really present a comprehensive model of language processing in the brain. The subdivision into a core language area and its margins is more or less arbitrary, and there is some redundancy across sections Anatomical Neighbors and Connectivity, Functional Networks, and Examples of Language Use Beyond the "Code" of the Core Language Network referring to brain areas, networks, and particular linguistic functions, respectively. However, language in the brain appears as a structure that, at least for the moment, is too complex to be described in a single model. This is due to the nature of language that is able to express almost everything and, thus, to activate more or less any brain region. Furthermore, it is an open system that can at any time create new lexical items, new stylistic features, and new topics of communication, associated with the generation of new sub-networks in the brain.
Further limitations of the present review refer to various aspects that could not comprehensively be addressed, such as effects of age, gender, and neuroplasticity induced by pathological conditions or second language learning. Furthermore, most of the considered brain imaging studies relied on simplified experimental conditions while natural language use, particularly at the production side, might be underrepresented.
As a tentative conclusion, the "margins" of the language network comprise single brain regions as well as entire networks that are bound to various aspects of communication. Roughly, these margins comprise (1) the motor system for speech articulation as well as motor representations of semantic content, (2) sensory representations that are connected to word forms in the mental lexicon, (3) nonverbal memory systems that are built up during narrative language structures and can be used for pragmatic inference and a plausibility check, (4) theory of mind functions for managing subject identity, attitudes, beliefs, intentions, and perspective taking, (5) the affective network for the processing of emotions, valence, arousal, and emphasis, and (6) domain-general networks for cognitive and attentional control to initiate the communication process, to keep it fluent, and to interrupt the ongoing activity in case of error detection.

AUTHOR CONTRIBUTIONS
All authors contributed to the present article. IH: primary writing. HA and SD: substantial contributions to the conception and editing.

FUNDING
This work was supported by the German Research Council (DFG Project HE 1573/6-2), the Hertie Institute for Clinical Brain Research (Tübingen, Germany), and, regarding publication fees, by the University of Tübingen and the German Research Council.