Off to a Good Start: The Early Development of the Neural Substrates Underlying Visual Working Memory

Current neuroscientific models describe the functional neural architecture of visual working memory (VWM) as an interaction of the frontal-parietal control network and more posterior areas in the ventral visual stream (Jonides et al., 2008; D'Esposito and Postle, 2015; Eriksson et al., 2015). These models are primarily based on adult neuroimaging studies. However, VWM undergoes significant development in infancy and early childhood, and the goal of this mini-review is to examine how recent findings from neuroscientific studies of early VWM development can be reconciled with this model. We surveyed 29 recent empirical reports that present neuroimaging findings in infants, toddlers, and preschoolers (using EEG, fNIRS, rs-fMRI) and neonatal lesion studies in non-human primates. We conclude that (1) both the frontal-parietal control network and the posterior cortical storage areas are active from early infancy; (2) this system undergoes focalization and some reorganization during early development; and (3) the medial temporal lobe (MTL) plays a significant role in this process as well. Motivated by both theoretical and methodological considerations, we offer some recommendations for future directions for the field.


INTRODUCTION
Abbreviations: DR, delayed response; DNMS, delayed non-match to sample; DTI, diffusion tensor imaging; EEG, electroencephalography; fNIRS, functional near infrared spectroscopy; fMRI, functional magnetic resonance imaging; HR, heart rate; dlPFC, dorsolateral prefrontal cortex; LTM, long-term memory; MTL, medial temporal lobe; Neo-HC, neonatally lesioned in the hippocampus; Neo-PRh, neonatally lesioned in the perirhinal cortex; Obj-SO, object self-ordered pointing task; PRh, perirhinal cortex; rs-fMRI, resting state fMRI; SOMT, Serial Order Memory Task; VoE, Violation of Expectation; vlPFC, ventrolateral prefrontal cortex; VWM, visual working memory; WM, working memory.
Working memory is a limited-capacity system for the maintenance and manipulation of information in service of ongoing tasks. The classic model of working memory (WM, Baddeley and Hitch, 1974) distinguishes the central executive system and two different sensory buffers for the temporary storage of visual and auditory information (an additional system, the episodic buffer, was later added: Baddeley, 1986). This multicomponent model has framed essentially all research on WM for more than 20 years. More recent "state-based" WM models (Cowan, 1988; Oberauer, 2002; McElree, 2006), however, question basic assumptions of the multicomponent model, claiming that there are no separate WM-specific storage systems in the brain; instead, representations held in WM are temporarily activated long-term memory (LTM) representations. According to this view, storage of sensory information involves posterior cortices; visual WM (VWM) representations, for example, have been localized in various stages of the ventral stream, starting in the occipital cortex (Harrison and Tong, 2009; Serences et al., 2009) and continuing to inferior temporal cortex (Miller et al., 1991).
Maintenance and manipulation of WM representations (the functions of the central executive) depend upon a frontal-parietal network (Awh and Jonides, 2001; Curtis and D'Esposito, 2003), in particular, anterior insula, lateral prefrontal cortex (PFC), dorsal anterior cingulate cortex, and areas within and surrounding the intraparietal sulcus (Seeley et al., 2007).
This conceptualization of WM is grounded in an extensive body of neuroscientific research, the majority of which has been conducted with human adults (for reviews, see Jonides et al., 2008; D'Esposito and Postle, 2015; Eriksson et al., 2015). WM undergoes significant postnatal development, with far-reaching consequences for cognitive development in general (Bull et al., 2008). Behavioral studies have shown that the ability to hold information in VWM emerges in infancy (Leslie, 2003, 2005; Ross-Sheehy et al., 2003; Zosh and Feigenson, 2012), and gradually improves throughout childhood (Riggs et al., 2006; Cowan et al., 2010; Simmering, 2012) and adolescence (Isbell et al., 2015). It is outside of the scope of this mini-review to provide a comprehensive overview of the entire behavioral literature (see Kibbe, 2015; Cowan, 2016; Reynolds and Romano, 2016, in this Research Topic); instead, we will examine whether recent findings from neuroscientific studies of early VWM development can be fit into the adult model above.
We limit our focus to studies that examine VWM in the first 5 years of life. While there is an abundant fMRI literature on children older than 6-7 years of age (e.g., Geier et al., 2009; von Allmen et al., 2014), this method currently cannot be used with very young children, and here we focus on what is known about these mechanisms before this age. The studies reviewed here employ a variety of neurophysiological methods (primarily electroencephalography, EEG, and functional Near-Infrared Spectroscopy, fNIRS) in human infants and young children (Table 1) and lesions in young primates (Table 2).
Structural and functional brain development progress in parallel. Both classic anatomical studies of synaptic density (Huttenlocher and Dabholkar, 1997) and more recent structural connectivity studies using DTI (Qiu et al., 2015) found a posterior-to-anterior progression during the first few years of life, with white matter developing in the occipital and temporal cortices before frontal areas. While our focus in this mini-review is on the functional development of the system underlying VWM, we will also discuss a few groundbreaking studies where researchers were able to link behavioral performance in a VWM task with myelination of a specific network (Short et al., 2013; Meng et al., 2014).

NEURODEVELOPMENT OF THE HUMAN VWM SYSTEM: INFANCY (0-2 YEARS)
Many of the neuroimaging studies examining infant VWM development employed the classic A-not-B task in conjunction with optical imaging (fNIRS) or EEG. In this task, an object is hidden at one of two locations and the infant is allowed to manually search for it. Once the infant repeatedly succeeds at one location, the object is then hidden at the other location. In the looking-based version of this task, looking times to the two locations are contrasted.
In one of the first studies to measure regional blood-flow changes in infants using fNIRS, Baird et al. (2002) found that prefrontal cortex (PFC) activity increased with success on an object maintenance task. More recently, EEG power and coherence measures from the entire scalp have been used to examine VWM task-related and age-related changes in the frontal-parietal network of infants (Bell and Wolfe, 2007; Cuevas and Bell, 2011; Bell, 2012; Cuevas et al., 2012a,b,c). Cuevas et al. (2012a), for example, found that frontal EEG power and heart rate predicted VWM performance in infants at 10 months, but not at 5 months. In another study, successful performance on the A-not-B task was found to be related to increased frontal-parietal coherence at 8 months (Bell, 2012; Cuevas et al., 2012b). These findings suggest that the frontal-parietal network supports successful VWM performance between 8 and 10 months.
During the infancy period, functional connectivity of the VWM network appears to become less diffuse with age. Cuevas et al. (2012a) found an increase in EEG coherence relative to baseline across the entire scalp in 5-month-olds but only between the medial frontal and occipital electrode sites in 10-month-olds. This finding is additionally supported by the observation of increased focalization of frontal-parietal network activity between 8 months and 4.5 years of age, which may reflect more efficient communication (Bell and Wolfe, 2007).
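For readers less familiar with the measure, EEG coherence between two electrode sites is commonly defined as the magnitude-squared coherence shown below. This is the general textbook form, offered here as an assumption about the kind of measure involved; the studies reviewed above may have used particular estimators, frequency bands, or baseline corrections that this review does not specify. The symbols $S_{xy}$, $S_{xx}$, and $S_{yy}$ are notation introduced here for illustration.

```latex
C_{xy}(f) \;=\; \frac{\lvert S_{xy}(f) \rvert^{2}}{S_{xx}(f)\, S_{yy}(f)},
\qquad 0 \le C_{xy}(f) \le 1
```

Here $S_{xy}(f)$ is the cross-spectral density of the signals recorded at sites $x$ and $y$ at frequency $f$, and $S_{xx}(f)$ and $S_{yy}(f)$ are the corresponding power spectra. Values near 1 indicate a consistent linear relationship between the two signals at that frequency, which is why a drop in scalp-wide coherence alongside preserved coherence between specific site pairs can be read as focalization.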
Resting-state fMRI (rs-fMRI) has been used to identify functional connections between brain regions in the absence of any task. This latter aspect makes this method particularly attractive for studies of early development, as infants can be scanned during sleep. In a pioneering study, Alcauter et al. (2014) tracked the development of resting-state networks in infants from birth to 2 years of age and their VWM performance. In addition to significant gains in synchrony among prefrontal and parietal regions at age one, it was found that connectivity between the thalamus and the salience network (which includes the insula, the cingulate, and frontal cortices, and is considered a sub-network of the frontal-parietal network in adults, see Elton and Gao, 2014) at age one predicted VWM performance at age two. In a DTI tractography study, the same group found that myelination of the tracts connecting frontal and parietal cortices predicted VWM performance in 1-year-old infants (Short et al., 2013). These studies thus corroborate the EEG findings that frontal-parietal connectivity is present before the end of the first year, and is related to VWM development. However, because salience network activity is functionally dissociated from WM performance in adults (Seeley et al., 2007; Elton and Gao, 2014), it is likely this network undergoes functional reorganization between toddlerhood and adulthood.
The involvement of posterior cortical areas in infant VWM has primarily been examined using more modern behavioral paradigms, such as Violation-of-Expectation (VoE), in conjunction with fNIRS or EEG. Using fNIRS, Wilcox and colleagues found that the anterior temporal cortex showed consistent activation when infants noticed a change in the features of an object that they held in mind when it reappeared from behind an occluder (thus, this feature change "violated" their expectations; Wilcox et al., 2005, 2008, 2009, 2010, 2014). Task-related activation in the posterior temporal cortex gradually decreased from 5 to 12 months, and the occipital cortex was active during all object maintenance tasks. This decrease in activation in posterior temporal cortex may reflect functional reorganization of object processing areas over the course of development (Wilcox et al., 2012, 2014; Wilcox and Biondi, 2016). Converging evidence for maintenance-related activity in posterior storage areas has been reported by Kaufman et al. (2003, 2005) using EEG. They found that increased gamma-band (20-60 Hz) activity in the right temporal cortex of 6-month-olds was associated with the maintenance of object representations behind an occluder (Kaufman et al., 2003, 2005). More recently, Kaufman and colleagues showed that the same response was higher in the right occipital cortex when infants kept two vs. one object in VWM (Leung et al., 2016). This result raises the possibility of finding a load-dependent neural signature of information storage in infant VWM.
In sum, the literature concerning the neural substrates of VWM systems in infants points toward an early emerging frontal-parietal network, one that is present and active even before age one (see the EEG studies by Bell, Cuevas, and their colleagues, and the connectivity studies reviewed above). Studies by Wilcox, Kaufman, and their colleagues found storage-related VWM activity in the temporal and occipital cortices as well, which may mirror similar findings in adults in the ventral visual stream (for a recent review, see Lee and Baker, 2016).

NEURODEVELOPMENT OF THE HUMAN VWM SYSTEM: EARLY CHILDHOOD (3-5 YEARS)
To date, only a handful of neuroimaging studies have examined VWM during the early childhood period, and all used fNIRS. The scarcity of neuroimaging studies (both structural and functional) in this notoriously challenging age range is primarily due to practical limitations: Preschool-age children require special experimental designs as they are rarely willing to participate for an extended time, and they often do not follow verbal instructions reliably. One notable limitation of three of the four fNIRS studies reviewed below is that hemodynamic responses were measured only in the frontal areas (or in Buss et al., 2014, in the frontal and the parietal cortices). Thus, conclusions were necessarily constrained to these regions. Tsujimoto et al. (2004) found that lateral PFC activity in 5.5-year-old children was very similar to adults' during a change detection task. In this task, one of the most widely used paradigms in adult VWM research, participants are briefly presented with a set of to-be-remembered items, and following a short delay are tested on whether or not the items have changed (Pashler, 1988; Luck and Vogel, 1997). Using the same task with a small longitudinal sample, Tsujii et al. (2009) found that between 5 and 7 years of age, increased VWM performance correlated with right lateralization of frontal activity. More recently, Buss et al. (2014) found that the frontal-parietal network was active in 3- and 4-year-olds during a change detection task, where load was systematically manipulated. Overall, they demonstrated greater involvement of parietal cortical areas relative to frontal areas, as well as increased parietal activity in 4-year-olds relative to 3-year-olds. Prior studies found that, in adults, activity in the parietal cortex was load-dependent for small set sizes, and leveled off at the behaviorally-defined capacity limit (Todd and Marois, 2004; Palva et al., 2011). In 3- and 4-year-olds this activity was load-dependent, but continued to increase beyond the observed capacity limit, a finding that warrants further investigation. In a similar investigation of delay-dependent activity, Perlman et al. (2016) manipulated the length of delays (2 vs. 6 s) and found age-dependent activation in lateral PFC in children between 3 and 7 years of age, and that children recruited this area more during longer delays. As the ventrolateral PFC is involved in maintenance, this finding suggests increased active rehearsal of information with age.
Table 2 (fragment): Object maintenance with updating is impaired in Neo-PRh adults, but serial order monitoring is preserved. Table abbreviations: DTI, Diffusion Tensor Imaging; SU-DNMS, session-unique Delayed Nonmatch-to-Sample; SOMT, Serial Order Memory Test; Obj-SO, Object Self-Order task; Neo-HC adults, adult macaques who received neurotoxic hippocampal lesions during the first 2 weeks of life; Neo-PRh adults, adult macaques who received neurotoxic lesions in the perirhinal region during the first 2 weeks of life; L, longitudinal; C, cross-sectional; "one," only one age group tested.
Frontiers in Systems Neuroscience | www.frontiersin.org
In sum, it appears that the frontal-parietal network becomes increasingly adult-like throughout early childhood. Increased recruitment of prefrontal and parietal areas points to increased focalization of the frontal-parietal system, while increased lateralization to the right hemisphere suggests adult-like specialization of this network for visuospatial tasks (Thomason et al., 2009). Because recordings were not made from the temporal and occipital areas, we cannot currently draw any conclusions about the involvement of the posterior cortices. The paucity of research in this age range creates a gap in our understanding of the development of VWM.

NEURODEVELOPMENT OF THE NON-HUMAN PRIMATE VWM SYSTEM: EFFECTS OF NEONATAL LESIONS
Both the frontal-parietal network and the posterior storage areas (e.g., IT) have multiple connections to the medial temporal lobe (MTL; Lavenex et al., 2002). While most current neuroscientific methods used in young children (fNIRS, EEG) do not allow access to these deep structures, primate lesion studies have provided a wealth of findings about the role of these structures in early development. Unlike adult lesion studies, which can only provide information about the relative contribution of a brain structure in a fully-formed system, neonatal lesion studies have the advantage of examining the downstream effects of a lesion on the developing system (Footnote 1). In the following section, we will focus on the role of the MTL in the development of the frontal-parietal network.
Footnote 1: The earliest neuroscientific studies of the development of the frontal cortex used these techniques as well (Goldman, 1971; Miller et al., 1973), and demonstrated the role of both the dorsolateral and the ventrolateral PFC (dlPFC and vlPFC) in VWM. By connecting findings in PFC-lesioned macaques and human infants, Diamond and Goldman-Rakic (1989) laid one of the first building blocks of developmental cognitive neuroscience.
Heuer and Bachevalier (2011) examined the contribution of the hippocampus to the development of VWM abilities. Here they utilized a delayed response task (also widely used in classic behavioral studies with infants; e.g., Diamond and Doar, 1989), where participants are presented with one object (the sample), followed by a delay, and then a choice between a matching object and a non-matching object. In the delayed-nonmatch-to-sample (DNMS) version of this task, participants are rewarded for selecting the non-matching object. Results showed that adult macaques that received neonatal hippocampal lesions (henceforth: Neo-HC) performed as well as sham-operates on a DNMS task (requires maintenance and putatively relies on the vlPFC, see Petrides, 1995). However, these macaques failed to even meet training criterion on an object self-ordered pointing task (Obj-SO) in which participants selected baited food wells in a different order on successive trials (requires manipulation, specifically, monitoring serial order, and putatively relies on the dlPFC; Petrides, 1995).
Follow-up studies using other dlPFC-associated VWM tasks have provided supporting evidence: Neo-HC macaques made significantly more errors than controls on a serial-order memory (SOMT) task (Heuer and Bachevalier, 2013), and in a foraging task were more likely than controls to return to boxes they had already visited, especially if that box previously contained the animal's preferred food (Glavis-Bloom et al., 2013). Thus early hippocampal lesions lead to deficits in VWM manipulation, but not in maintenance. The finding that early hippocampal damage leads to deficits on a task that taps dlPFC has been replicated in human patients who suffered hypoxic-ischaemic events early in life (Geva et al., 2016).
In addition to hippocampal lesions, Weiss et al. (2016) found that neonatal lesions to another area of MTL, the perirhinal cortex, impacted VWM performance on tasks believed to rely on the vlPFC. In their study, macaques with neonatal lesions of perirhinal cortex (Neo-PRh) were impaired on a DNMS task at short delays, as well as an Obj-SO task; both repeated stimuli across trials, and thus required trial-to-trial updating of information in VWM. In contrast, Neo-PRh animals performed well on a task that used novel stimuli across trials (SOMT) and thus did not require updating, suggesting that the perirhinal cortex is not involved in manipulation of WM contents per se, but rather in interference resolution or associated executive functions (e.g., inhibition).
These findings suggest that the MTL gives rise to the development of PFC-associated VWM skills, such as manipulation and interference resolution, likely through reciprocal neuroanatomical connections (Goldman-Rakic et al., 1984; Aggleton et al., 2015). Two recent connectivity studies provide converging evidence for this. Early hippocampal damage led to both reduced white matter (Meng et al., 2014) and decreased resting-state connectivity (Meng et al., 2016) between the dlPFC and the medial PFC and several posterior areas, such as IT and V4, in adult macaques. These anatomical and functional impairments correlated with poorer performance on the SOMT (Meng et al., 2014, 2016). This correlation underscores the importance of the hippocampus, as well as the frontal-parietal network, in the development of VWM abilities: By adulthood, Neo-HC macaques had not developed compensatory mechanisms for VWM. This stands in stark contrast to a similar neonatal lesion study demonstrating compensatory mechanisms for rule learning and recognition memory following lesions to the vlPFC (Malkova et al., 2016).

SUMMARY AND FUTURE DIRECTIONS
The goal of this mini-review was to examine the neurophysiological evidence regarding the early emergence of the VWM network that involves both the frontal-parietal control network and the posterior storage areas that have been identified in adults. Our first conclusion is that both of these systems seem to be active from as early as the second half of the first year in humans.
A handful of longitudinal and cross-sectional studies reviewed here point to a gradual focalization of the frontal-parietal system throughout development (see the works of Bell and her colleagues). We also see some evidence for the functional reorganization of the network during the early life period: for example, a shift away from the salience network from infancy to adulthood (Alcauter et al., 2014), and increasing reliance on the parietal cortex during the preschool years (Buss et al., 2014). These changes may reflect a specialization within the network. Furthermore, findings from non-human primates have demonstrated the significance of the medial temporal lobe in the development of the lateral PFC (see the works of Bachevalier and her colleagues).
Related to the emergence of posterior information storage areas, a number of studies found object-maintenance related activity in both occipital and temporal lobes in infancy (see the works of Kaufman and his colleagues). Studies on VWM mechanisms in early childhood have not recorded from these posterior areas, so our understanding of how these areas support VWM in this age range is, at the moment, limited.
A defining characteristic of VWM is its limited capacity. This functional characteristic can serve as a signature to identify VWM storage-related mechanisms: In these structures, activity is expected to gradually increase with the number of to-be-remembered items, and then remain constant when capacity limit has been reached (e.g., Palva et al., 2011). In studies that aim to find this signature, the adult cognitive neuroscience literature has adopted a useful psychophysical measure to quantify VWM capacity (Cowan's k; Footnote 2). Research on school-age children has recently begun to examine how memory load affects the recruitment of different parts of the VWM system using this measure (e.g., Shimi et al., 2014; Kharitonova et al., 2015). Importantly, this approach has already been applied successfully in preschoolers (Buss et al., 2014, reviewed above).
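To make the capacity measure and the expected storage signature concrete, the following minimal sketch may help. It assumes the commonly used single-probe form of Cowan's k, k = N × (H − FA), where N is set size, H the hit rate on change trials, and FA the false-alarm rate on no-change trials; the exact variant intended by the studies cited above is not specified in this review. All function names, argument names, and numbers are hypothetical and do not come from any reviewed study.

```python
"""Toy illustration of Cowan's k and the capacity-limited activity signature."""


def cowans_k(set_size, hits, change_trials, false_alarms, no_change_trials):
    """Estimate capacity as k = N * (H - FA) for one set size.

    H is the hit rate on change trials and FA is the false-alarm rate on
    no-change trials. Argument names are illustrative assumptions.
    """
    if change_trials <= 0 or no_change_trials <= 0:
        raise ValueError("trial counts must be positive")
    hit_rate = hits / change_trials
    false_alarm_rate = false_alarms / no_change_trials
    return set_size * (hit_rate - false_alarm_rate)


def idealized_storage_signal(set_sizes, capacity):
    """Idealized storage-related activity (arbitrary units).

    Activity rises with load and stays flat once capacity is reached.
    """
    return [min(n, capacity) for n in set_sizes]


def keeps_rising_beyond_capacity(set_sizes, activity, capacity):
    """Return True if activity above capacity exceeds the peak at or below it."""
    below = [a for n, a in zip(set_sizes, activity) if n <= capacity]
    above = [a for n, a in zip(set_sizes, activity) if n > capacity]
    if not below or not above:
        return False
    return max(above) > max(below)


if __name__ == "__main__":
    # Hypothetical counts for a set size of 4 (not data from any study).
    k = cowans_k(set_size=4, hits=45, change_trials=60,
                 false_alarms=9, no_change_trials=60)
    print(f"estimated k = {k:.2f}")

    loads = [1, 2, 3, 4, 5, 6]
    print(idealized_storage_signal(loads, capacity=3))
```

In this idealization, a region whose activity matches `idealized_storage_signal` would be a candidate storage site, whereas a pattern flagged by `keeps_rising_beyond_capacity` would resemble the continued increase reported for preschoolers by Buss et al. (2014). Real analyses would of course need noise-aware statistics rather than a simple comparison of maxima.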
Based on both theoretical and methodological considerations, the ideal design to study neurodevelopmental change in the VWM system has the following attributes: (a) Uses the same behavioral task across ages from infants to adults (or at least between infants and young children or young children and older ones). (b) Quantifies capacity behaviorally (e.g., with Cowan's k), and measures neural activity as a function of capacity, in order to identify storage-related mechanisms. (c) Uses a longitudinal sample, to minimize the effects of interindividual variability.
Some of the studies to date have two of these features, but none have all three. Because of its versatility and low task demands, the change detection paradigm is the best positioned to meet criterion (a) in the near future. Thus, a crucial open question for future studies is how neural activity in the VWM network changes in children under 3 years of age using this task. In addition, future studies with preschool-age children that record from posterior cortices (using whole-brain nets, see e.g., Sato et al., 2012) should elucidate the role of these structures in VWM beyond infancy.
Despite all the methodological challenges that are involved in studying brain functions in infants, young children, and young primates, research on early VWM neurodevelopment has gotten off to an exciting start. Several different physiological methods have already yielded converging results, and recent advances in neuroimaging methods (e.g., Cutini and Brigadoi, 2014; Graham et al., 2015) will likely lead to an expansion of research in the near future. We look forward to an exciting period in the study of the early developmental unfolding of the VWM system.

AUTHOR CONTRIBUTIONS
AF and HS contributed equally to this manuscript by selecting and summarizing relevant studies and writing multiple sections of this review. SG contributed significantly to the section on EEG and rs-fMRI studies. ZK developed the theoretical perspective with the help of AF and HS. All four authors contributed to the writing and editing of the paper.