<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="brief-report">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Comput. Neurosci.</journal-id>
<journal-title>Frontiers in Computational Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Comput. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-5188</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fncom.2020.586671</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Brief Research Report</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Leveraging Prior Concept Learning Improves Generalization From Few Examples in Computational Models of Human Object Recognition</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Rule</surname> <given-names>Joshua S.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1026852/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Riesenhuber</surname> <given-names>Maximilian</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1158964/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology</institution>, <addr-line>Cambridge, MA</addr-line>, <country>United States</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Neuroscience, Georgetown University Medical Center</institution>, <addr-line>Washington, DC</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Germ&#x000E1;n Mato, Bariloche Atomic Centre (CNEA), Argentina</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Dami&#x000E1;n G. Hern&#x000E1;ndez, Bariloche Atomic Centre (CNEA), Argentina; Jian K. Liu, University of Leicester, United Kingdom</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Maximilian Riesenhuber <email>max.riesenhuber&#x00040;georgetown.edu</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>12</day>
<month>01</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2020</year>
</pub-date>
<volume>14</volume>
<elocation-id>586671</elocation-id>
<history>
<date date-type="received">
<day>23</day>
<month>07</month>
<year>2020</year>
</date>
<date date-type="accepted">
<day>30</day>
<month>11</month>
<year>2020</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2021 Rule and Riesenhuber.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Rule and Riesenhuber</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>Humans quickly and accurately learn new visual concepts from sparse data, sometimes just a single example. The impressive performance of artificial neural networks that hierarchically pool afferents across scales and positions suggests that the hierarchical organization of the human visual system is critical to its accuracy. These approaches, however, require orders of magnitude more examples than human learners. We used a benchmark deep learning model to show that the hierarchy can also be leveraged to vastly improve the speed of learning. We specifically show how previously learned but broadly tuned conceptual representations can be used to learn visual concepts from as few as two positive examples; reusing visual representations from earlier in the visual hierarchy, as in prior approaches, requires significantly more examples to perform comparably. These results suggest techniques for learning even more efficiently and provide a biologically plausible way to learn new visual concepts from few examples.</p></abstract>
<kwd-group>
<kwd>transfer learning</kwd>
<kwd>few-shot learning</kwd>
<kwd>semantic cognition</kwd>
<kwd>artificial neural networks</kwd>
<kwd>object recognition</kwd>
</kwd-group>
<contract-num rid="cn001">COMP-19-ERD-007</contract-num>
<contract-num rid="cn001">DE-AC52-07NA27344</contract-num>
<contract-num rid="cn002">1745302</contract-num>
<contract-num rid="cn002">1026934</contract-num>
<contract-num rid="cn002">1122374</contract-num>
<contract-num rid="cn002">1232530</contract-num>
<contract-sponsor id="cn001">Lawrence Livermore National Laboratory<named-content content-type="fundref-id">10.13039/100006227</named-content></contract-sponsor>
<contract-sponsor id="cn002">National Science Foundation<named-content content-type="fundref-id">10.13039/100000001</named-content></contract-sponsor>
<counts>
<fig-count count="2"/>
<table-count count="0"/>
<equation-count count="0"/>
<ref-count count="60"/>
<page-count count="8"/>
<word-count count="6277"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>Humans have the remarkable ability to quickly learn new concepts from sparse data. Preschoolers, for example, can acquire and use new words on the basis of sometimes just a single example (Carey and Bartlett, <xref ref-type="bibr" rid="B10">1978</xref>), and adults can reliably discriminate and name new categories after just one or two training trials (Coutanche and Thompson-Schill, <xref ref-type="bibr" rid="B13">2014</xref>, <xref ref-type="bibr" rid="B15">2015b</xref>; Lake et al., <xref ref-type="bibr" rid="B28">2015</xref>). Given that principled generalization is impossible without leveraging prior knowledge (Watanabe, <xref ref-type="bibr" rid="B54">1969</xref>), this impressive performance raises the question of how the brain might use prior knowledge to establish new concepts from such sparse data.</p>
<p>Several decades of anatomical, computational, and experimental work suggest that the brain builds a representation of the visual world by way of the so-called ventral visual stream, along which information is processed by a simple-to-complex hierarchy up to neurons in ventral temporal cortex that are selective for complex objects such as faces, objects and words (Kravitz et al., <xref ref-type="bibr" rid="B27">2013</xref>). According to computational models (Nosofsky, <xref ref-type="bibr" rid="B34">1986</xref>; Riesenhuber and Poggio, <xref ref-type="bibr" rid="B41">2000</xref>; Thomas et al., <xref ref-type="bibr" rid="B51">2001</xref>; Freedman et al., <xref ref-type="bibr" rid="B20">2003</xref>; Ashby and Spiering, <xref ref-type="bibr" rid="B2">2004</xref>) as well as human functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) studies (Jiang et al., <xref ref-type="bibr" rid="B25">2007</xref>; Scholl et al., <xref ref-type="bibr" rid="B44">2014</xref>), these object-selective neurons in high-level visual cortex can then provide input to downstream cortical areas, such as prefrontal cortex (PFC) and the anterior temporal lobe (ATL), to mediate the identification, discrimination, or categorization of stimuli, as well as more broadly throughout cortex for task-specific needs (Hebart et al., <xref ref-type="bibr" rid="B22">2018</xref>). 
It is at this level that these theories of object categorization in the brain connect with influential theories of semantic cognition that have proposed that the ATL may act as a <italic>semantic hub</italic> (Ralph et al., <xref ref-type="bibr" rid="B38">2017</xref>), based on neuropsychological findings (Hodges et al., <xref ref-type="bibr" rid="B23">2000</xref>; Mion et al., <xref ref-type="bibr" rid="B33">2010</xref>; Jefferies, <xref ref-type="bibr" rid="B24">2013</xref>) and studies that have used fMRI (Vandenberghe et al., <xref ref-type="bibr" rid="B52">1996</xref>; Coutanche and Thompson-Schill, <xref ref-type="bibr" rid="B14">2015a</xref>; Malone et al., <xref ref-type="bibr" rid="B30">2016</xref>; Chen et al., <xref ref-type="bibr" rid="B12">2017</xref>) or intracranial EEG (iEEG; Chan et al., <xref ref-type="bibr" rid="B11">2011</xref>) to decode category representations in the anteroventral temporal lobe.</p>
<p>Computational work suggests that hierarchical structure is a key architectural feature of the ventral stream for flexibly learning novel recognition tasks (Poggio, <xref ref-type="bibr" rid="B37">2012</xref>). For instance, the increasing tolerance to scaling and translation in progressively higher layers of the processing hierarchy due to pooling of afferents preferring the same feature across scales and positions supports robust learning of novel object recognition tasks by reducing the problem&#x00027;s sample complexity (Poggio, <xref ref-type="bibr" rid="B37">2012</xref>). Indeed, computational models based on this hierarchical structure, such as the HMAX model (Riesenhuber and Poggio, <xref ref-type="bibr" rid="B40">1999</xref>) and, more recently, convolutional neural network (CNN)-based approaches have been shown to achieve human-like performance in object recognition tasks given sufficient numbers of training examples (Jiang et al., <xref ref-type="bibr" rid="B26">2006</xref>; Serre et al., <xref ref-type="bibr" rid="B46">2007a</xref>; Crouzet and Serre, <xref ref-type="bibr" rid="B16">2011</xref>; Yamins et al., <xref ref-type="bibr" rid="B56">2013</xref>, <xref ref-type="bibr" rid="B57">2014</xref>) and even to accurately predict human neural activity (Schrimpf et al., <xref ref-type="bibr" rid="B45">2018</xref>).</p>
<p>In addition to their invariance properties, the complex shape selectivity of intermediate features in the brain, e.g., in V4 or posterior inferotemporal cortex (IT), is thought to span a feature space well-matched to the appearance of objects in the natural world (Serre et al., <xref ref-type="bibr" rid="B46">2007a</xref>; Yamins et al., <xref ref-type="bibr" rid="B57">2014</xref>). Indeed, it has been shown that reusing the same intermediate features permits the efficient learning of novel recognition tasks (Serre et al., <xref ref-type="bibr" rid="B46">2007a</xref>; Donahue et al., <xref ref-type="bibr" rid="B18">2013</xref>; Oquab et al., <xref ref-type="bibr" rid="B35">2014</xref>; Razavian et al., <xref ref-type="bibr" rid="B39">2014</xref>; Yosinski et al., <xref ref-type="bibr" rid="B60">2014</xref>), and the reuse of existing representations at different levels of the object processing hierarchy is at the core of models of hierarchical learning in the brain (Ahissar and Hochstein, <xref ref-type="bibr" rid="B1">2004</xref>). These theories and prior computational work are limited, however, to reuse of existing representations at the level of objects and below. Yet, as mentioned before, processing hierarchies in the brain do not end at the object level but extend to the level of concepts and beyond, e.g., in the ATL, downstream from object-level representations in IT. These representations are importantly different from the earlier visual representations, generalizing over exemplars to support category-sensitive behavior at the expense of exemplar-specific details (Bankson et al., <xref ref-type="bibr" rid="B3">2018</xref>). 
Intuitively, leveraging these previously learned visual <italic>concept</italic> representations could substantially facilitate the learning of novel concepts, along the lines of &#x0201C;a platypus looks a bit like a duck, a beaver, and a sea otter.&#x0201D; In fact, there is intriguing evidence that the brain might leverage existing concept representations to facilitate the learning of novel concepts: in <italic>fast mapping</italic> (Carey and Bartlett, <xref ref-type="bibr" rid="B10">1978</xref>; Coutanche and Thompson-Schill, <xref ref-type="bibr" rid="B13">2014</xref>, <xref ref-type="bibr" rid="B15">2015b</xref>), a novel concept is inferred from a single example by contrasting it with a related but already known concept, both of which are relevant to answering some query. Fast mapping is more generally consistent with the intuition that the relationships between concepts and categories are crucial to understanding the concepts themselves (Miller and Johnson-Laird, <xref ref-type="bibr" rid="B32">1976</xref>; Woods, <xref ref-type="bibr" rid="B55">1981</xref>; Carey, <xref ref-type="bibr" rid="B8">1985</xref>, <xref ref-type="bibr" rid="B9">2009</xref>). The brain&#x00027;s ability to quickly master new visual categories may then depend on the size and scope of the bank of visual categories it has already mastered. Indeed, it has been posited that the brain&#x00027;s ability to perform fast mapping might depend on its ability to relate the new knowledge to existing schemas in the ATL (Sharon et al., <xref ref-type="bibr" rid="B49">2011</xref>). Yet, there is no computational demonstration that such leveraging of prior learning can indeed facilitate the learning of novel concepts. 
Showing that leveraging existing concept representations can dramatically reduce the number of examples needed to learn novel concepts would not only provide an explanation for the brain&#x00027;s superior ability to learn novel concepts from few examples, but would also be of significant interest for artificial intelligence, given that current deep learning systems still require substantially more training examples to reach human-like performance (Lake et al., <xref ref-type="bibr" rid="B29">2017</xref>; Schrimpf et al., <xref ref-type="bibr" rid="B45">2018</xref>).</p>
<p>We show that leveraging prior learning at the concept level in a benchmark deep learning model leads to vastly improved abilities to learn from few examples. While visual learning and reasoning involves a wide variety of skills&#x02014;including memory (Brady et al., <xref ref-type="bibr" rid="B7">2008</xref>, <xref ref-type="bibr" rid="B6">2011</xref>), compositional reasoning (Lake et al., <xref ref-type="bibr" rid="B28">2015</xref>; Overlan et al., <xref ref-type="bibr" rid="B36">2017</xref>), and multimodal integration (Yildirim and Jacobs, <xref ref-type="bibr" rid="B58">2013</xref>, <xref ref-type="bibr" rid="B59">2015</xref>)&#x02014;we focus here on the task of object recognition. This ability to classify visual stimuli into categories is a key skill underlying many of our other visual abilities. We specifically find that broadly tuned conceptual representations can be used to learn visual concepts from as few as two positive examples, accurately discriminating positive examples of the concept from a wide variety of negative examples; visual representations from earlier in the visual hierarchy require significantly more examples to reach comparable levels of performance.</p>
</sec>
<sec sec-type="methods" id="s2">
<title>Methods</title>
<sec>
<title>ImageNet</title>
<p>ImageNet (<ext-link ext-link-type="uri" xlink:href="http://www.image-net.org">www.image-net.org</ext-link>) organizes more than 14 million images into 21,841 categories following the WordNet hierarchy (Deng et al., <xref ref-type="bibr" rid="B17">2009</xref>). Crucially, these images come from multiple sources and vary widely on dimensions such as pose, position, occlusion, clutter, lighting, image size, and aspect ratio. This image set has been designed and used to test large-scale computer vision systems (Russakovsky et al., <xref ref-type="bibr" rid="B43">2015</xref>), including models of primate and human visual object recognition (Yamins et al., <xref ref-type="bibr" rid="B57">2014</xref>; Schrimpf et al., <xref ref-type="bibr" rid="B45">2018</xref>). We similarly use disjoint subsets of ImageNet to both train and validate a modified GoogLeNet and to train and test a series of binary classifiers.</p>
<p>To train and validate GoogLeNet, we randomly selected 2,000 categories from 3,177 ImageNet categories providing both bounding boxes and more than 732 total images (the minimum number of images per category in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2015), thus ensuring each category represented a concrete noun with significant variation, as can be seen in <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 1</xref>. One of the authors further reviewed each category to ensure it represented a concrete visual category. We set aside 25 images from each category to serve as validation images and used the remainder as training images. We thus used a total of 2,401,763 images across 2,000 categories for training and 50,000 images across those same 2,000 categories for validation. To reduce computational complexity, all images were resized to 256 pixels on the shortest edge while preserving orientation and aspect ratio and then automatically cropped to 256 &#x000D7; 256 pixels during training and validation. While it is possible for this strategy to crop the object of interest out of the image, previous work with the GoogLeNet architecture (Szegedy et al., <xref ref-type="bibr" rid="B50">2014</xref>) suggests that the impact on performance is marginal.</p>
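<p>The shortest-edge resizing described above is simple arithmetic; the sketch below (our illustrative reconstruction, not the authors' code) computes the resized dimensions and a deterministic center-crop box. The paper only says images were cropped automatically, so the center-crop rule shown here is an assumption.</p>

```python
def resize_dims(width, height, target=256):
    """New (width, height) after scaling the shortest edge to `target`
    while preserving orientation and aspect ratio."""
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)

def center_crop_box(width, height, size=256):
    """(left, top, right, bottom) box for a centered size x size crop;
    a center crop is only one possible automatic cropping rule."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size

# e.g., a 400 x 300 image is resized to 341 x 256, then cropped to 256 x 256
w, h = resize_dims(400, 300)
crop = center_crop_box(w, h)
```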
<p>To train and test our binary classifiers, we used the training and validation images from 100 of the 1,000 categories from the ILSVRC2015 challenge (Russakovsky et al., <xref ref-type="bibr" rid="B43">2015</xref>). As with the GoogLeNet images, all images were resized to 256 pixels on the shortest edge while preserving orientation and aspect ratio and then automatically cropped to 256 &#x000D7; 256 pixels during feature extraction. These 100 test categories are all novel relative to the 2,000 training categories in that there are no exact duplicates across the training and test categories. There are test categories providing significant visual overlap with training categories, such as <italic>car wheel</italic> sharing similar structure with <italic>bicycle wheel, wheelchair, steering wheel, bicycle, Ferris wheel</italic>, and so on. It is central to the hypothesis of this paper that these kinds of visual similarities can be leveraged to more quickly learn new categories. In this case, <italic>car wheel</italic> is an unknown category: no category in the visual lexicon mastered by GoogLeNet corresponds exactly to <italic>car wheel</italic>. It might be learned more quickly, however, by noting that it is relatively visually similar to <italic>bicycle wheel</italic> and <italic>wheelchair</italic> but relatively dissimilar to, for example, <italic>fence, bugle</italic>, or <italic>footbridge</italic>. The particular pattern of similarity and dissimilarity at the level of visual categories can be used as a signature for identifying car wheels.</p>
</sec>
<sec>
<title>GoogLeNet</title>
<p>GoogLeNet is a high-performing (Szegedy et al., <xref ref-type="bibr" rid="B50">2014</xref>) deep neural network (DNN) designed for large-scale visual object recognition (Russakovsky et al., <xref ref-type="bibr" rid="B43">2015</xref>). Because prior work has shown that the performance of DNNs is correlated with their ability to predict neural activations (Yamins et al., <xref ref-type="bibr" rid="B56">2013</xref>, <xref ref-type="bibr" rid="B57">2014</xref>) and that GoogLeNet in particular is a comparatively good predictor of neural activity (Schrimpf et al., <xref ref-type="bibr" rid="B45">2018</xref>), we use GoogLeNet as a model of human visual object recognition. Because the exact motivation for GoogLeNet and the details of its construction have been reported elsewhere, we focus here on the details relevant to our investigation. We used the Caffe BVLC GoogLeNet implementation with one notable alteration: we increased the size of the final layer from 1,000 to 2,000 units, commensurate with the 2,000 categories we used to train the network. We trained the network for &#x0007E;133 epochs (1E7 iterations of 32 images) using a training schedule similar to that in Szegedy et al. (<xref ref-type="bibr" rid="B50">2014</xref>) (fixed learning rate starting at 0.01 and decreasing by 4% every 3.2E5 images with 0.9 momentum), achieving 44.9% top-1 performance and 73.0% top-5 performance across all 2,000 categories.</p>
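<p>The numbers in the training schedule above fully determine the learning rate at any point in training. As a sketch reconstructed from those numbers (not the original Caffe solver configuration), the stepwise decay can be written as:</p>

```python
def learning_rate(images_seen, base_lr=0.01, decay=0.96, step=320_000):
    """Stepwise schedule: the learning rate starts at `base_lr` and
    drops by 4% every `step` training images (3.2E5 in the text)."""
    return base_lr * decay ** (images_seen // step)

# ~133 epochs over ~2.4M training images were used in total
lr_at_start = learning_rate(0)
lr_after_ten_steps = learning_rate(3_200_000)
```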
</sec>
<sec>
<title>Main Simulation</title>
<p>To study how previously learned visual concepts could facilitate the learning of novel visual concepts, we trained a series of one-vs-all binary classifiers (elastic net logistic regression) to recognize 100 new categories from the ILSVRC2015 challenge. The 100 categories, listed in <xref ref-type="supplementary-material" rid="SM2">Supplementary Table 2</xref>, were chosen uniformly at random and remained constant across all feature sets.</p>
<p>The primary hypothesis of this paper is that prior learning about visual concepts can significantly improve learning about new visual concepts from few examples. Learning new categories in terms of existing category-selective features is thus of primary interest, so we compared several feature sets to test the effectiveness of learning from category-selective features relative to other feature types. We specifically compared the following feature sets:</p>
<list list-type="bullet">
<list-item><p>Conceptual: 2,000 features extracted from the loss3/classifier, a fully connected layer of GoogLeNet just prior to the softmax operation producing the final output.</p></list-item>
<list-item><p>Generic<sub>1</sub>: 4,096 features extracted from pool5/7x7_s1, an average pooling layer of GoogLeNet (kernel: 7, stride: 1) used in computing the final output.</p></list-item>
<list-item><p>Generic<sub>2</sub>: 13,200 features extracted from the loss2/ave_pool, an average pooling layer of GoogLeNet (kernel: 5, stride: 3) mid-way through the architecture used in computing a second training loss.</p></list-item>
<list-item><p>Generic<sub>3</sub>: 12,800 features extracted from the loss1/ave_pool, an average pooling layer of GoogLeNet (kernel: 5, stride: 3) early in the architecture used in computing a third training loss.</p></list-item>
<list-item><p>Generic<sub>1</sub> &#x0002B; Conceptual: 4,096 Generic<sub>1</sub> features combined with 2,000 Conceptual features for a total of 6,096 features.</p></list-item>
</list>
<p>All features were selected for broad tuning to encourage generalization. The Conceptual features&#x02014;being as close to the final output as possible but without the task-specific response sharpening of the softmax operation&#x02014;represent what should be the most category-sensitive features of GoogLeNet (i.e., individual features serve as more reliable signals of category membership than features from other feature sets; see <xref ref-type="supplementary-material" rid="SM3">Supplementary Data</xref>). The various Generic feature sets were chosen as controls against which to compare the conceptual features. Based on prior work using GoogLeNet, these layers likely correspond to high-level visual cortex (e.g., V4, IT, fusiform cortex) (Yamins et al., <xref ref-type="bibr" rid="B57">2014</xref>; Schrimpf et al., <xref ref-type="bibr" rid="B45">2018</xref>). The Generic<sub>1</sub> features act as close controls against which to compare the conceptual features. These features provide a representative basis in which many visual categories can be accurately described while themselves being relatively category-agnostic, as shown in <xref ref-type="supplementary-material" rid="SM3">Supplementary Data</xref>. We chose a layer near the end of the network but before the fully connected layers that recombine the intermediate features into category-specific features. The GoogLeNet architecture defines two auxiliary classifiers&#x02014;smaller convolutional networks connected to intermediate layers to provide additional gradient signal and regularization during training&#x02014;at multiple depths in the network. We define the Generic<sub>2</sub> and Generic<sub>3</sub> features using layers from these auxiliary networks that correspond to the layer from the primary classifier used to define Generic<sub>1</sub>.</p>
<p>We measured feature set performance by training a series of one-vs-all binary classifiers (elastic net logistic regression) for each feature set, meaning that each feature set served in a sub-simulation as the sole input to the classifiers. For each feature set, we trained 14,000 classifiers&#x02014;one for each combination of test category, training set size, and random training split&#x02014;and measured performance using d&#x02032;. Our ImageNet ILSVRC-based image set had 100 categories (see section &#x0201C;ImageNet&#x0201D; above). Positive examples were randomly drawn from the target category, while negative examples were randomly drawn from the other 99 categories. Because we were interested in how prior knowledge helps with learning from few examples, we tested classifiers trained with <italic>n</italic> &#x003F5; {2, 4, 8, 16, 32, 64, 128} total training examples, evenly split between positive and negative examples. To better estimate performance and average out the effects of the classifiers&#x00027; random choices, we repeated each simulation by generating 20 random training/testing splits unique to each combination of test category and training set size.</p>
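<p>A minimal sketch of one such one-vs-all classifier and the d&#x02032; metric, using synthetic stand-in features rather than GoogLeNet activations (the elastic net implementation, clipping constant, and toy data are our assumptions for illustration):</p>

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

def dprime(y_true, y_pred, eps=1e-3):
    """d' = z(hit rate) - z(false-alarm rate), with both rates clipped
    away from 0 and 1 so the z-transform stays finite."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    hit = np.clip(y_pred[y_true == 1].mean(), eps, 1 - eps)
    fa = np.clip(y_pred[y_true == 0].mean(), eps, 1 - eps)
    return norm.ppf(hit) - norm.ppf(fa)

# Toy stand-ins for feature activations: positives vs. pooled negatives
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, (64, 20)),    # positive examples
               rng.normal(-1.0, 1.0, (64, 20))])  # negative examples
y = np.array([1] * 64 + [0] * 64)

# One one-vs-all binary classifier (elastic net logistic regression)
clf = LogisticRegression(penalty="elasticnet", solver="saga",
                         l1_ratio=0.5, max_iter=5000, random_state=0)
clf.fit(X[::2], y[::2])                        # balanced training split
score = dprime(y[1::2], clf.predict(X[1::2]))  # held-out performance
```

<p>In the simulations, this procedure was repeated for every combination of test category, training set size, and random split within each feature set.</p>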
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<p>To explore whether concept-level leveraging of prior learning leads to superior ability to learn novel concepts compared to leveraging learning at lower levels, we conducted large-scale analyses using state-of-the-art CNNs (we also conducted similar analyses using the HMAX model (Riesenhuber and Poggio, <xref ref-type="bibr" rid="B40">1999</xref>; Serre et al., <xref ref-type="bibr" rid="B47">2007b</xref>), obtaining qualitatively similar results, albeit with overall lower performance levels). Specifically, we examined concept learning performance as a function of training examples for four feature sets (Conceptual, Generic<sub>1</sub>, Generic<sub>2</sub>, Generic<sub>3</sub>) extracted from a deep neural network (GoogLeNet; Szegedy et al., <xref ref-type="bibr" rid="B50">2014</xref>) as shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. Based on prior work using GoogLeNet, we hypothesize that the Conceptual features best model semantic cortex (e.g., ATL), while the Generic layers best model high-level visual cortex (e.g., V4, IT, fusiform cortex) (Yamins et al., <xref ref-type="bibr" rid="B57">2014</xref>; Schrimpf et al., <xref ref-type="bibr" rid="B45">2018</xref>). We predicted that higher levels would support improved generalization from few examples, and in particular that leveraging representations for previously learned concepts would strongly improve learning performance for few examples. To test this latter hypothesis, we modified the GoogLeNet architecture to perform 2,000-way classification. We then trained the modified network to recognize 2,000 concepts from ImageNet (Deng et al., <xref ref-type="bibr" rid="B17">2009</xref>), listed in <xref ref-type="supplementary-material" rid="SM1">Supplementary Table 1</xref>. 
We examined the activations of each feature set for images drawn from 100 additional concepts from ImageNet, distinct from the previously learned 2,000 concepts and listed in <xref ref-type="supplementary-material" rid="SM2">Supplementary Table 2</xref>.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>A schematic of the GoogLeNet neural network (Szegedy et al., <xref ref-type="bibr" rid="B50">2014</xref>) as used in these simulations (main figure) and a schematic of the network&#x00027;s Inception Module (gray inset on lower right). We modified the network to produce 2,000-way outputs, simulating representations for 2,000 previously learned categories. We then investigated how well representations at different levels of the hierarchy supported the learning of novel concepts. To encourage generalization, we wanted each layer to be broadly tuned, so we drew our conceptual layer not from the task-specific and sharply tuned final decision layer (Softmax), but the immediately preceding layer. Multiples (i.e., x2 or x3) indicate several identical layers being connected in series.</p></caption>
<graphic xlink:href="fncom-14-586671-g0001.tif"/>
</fig>
<p>For our scheme to work, conceptual features must support generalization by being broadly tuned. All the feature sets we analyzed are thus part of the standard GoogLeNet architecture and come before the network&#x00027;s final decision layer. The binary classifiers we trained for this analysis, however, were separate from GoogLeNet. We do not claim that they are part of the visual hierarchy so much as we use them to straightforwardly assess the usefulness of different parts of that hierarchy for sample-efficient learning.</p>
<p>The concepts GoogLeNet learns are based on visual information only and therefore do not capture the fullness of the rich and nuanced concepts used in everyday cognition. Yet, they provide a further level of abstraction beyond the object level and could be used in a straightforward fashion to participate in the downstream representations of supramodal concepts (see section Discussion).</p>
<p>To test our hypothesis, we compared the performance of each feature set for several small numbers of training examples. The results in <xref ref-type="fig" rid="F2">Figure 2</xref> confirm the predictions: for small numbers of training examples, feature sets extracted later in the visual hierarchy generally outperformed feature sets extracted earlier in the visual hierarchy. Critically, as predicted, we see that the Conceptual features dramatically outperform Generic<sub>1</sub> features for small numbers of training examples (particularly for 2, 4, and 8 positive examples, but including 16 and 32 as well). In addition, Conceptual and Generic<sub>1</sub> features outperform Generic<sub>2</sub>, which outperforms Generic<sub>3</sub>. These results suggest that combinations of Generic<sub>1</sub> features are frequently consistent across small sets of examples without generalizing well to the entire category; patterns among categorical features, by contrast, tend to generalize much better for small numbers of examples.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Mean performance (y-axis) of classifiers in our analysis by category (dots), feature set (color), and number of positive training examples (x-axis). Performance in both plots is measured as d&#x02032;. Cross bars show means across categories with bootstrapped 95% CIs.</p></caption>
<graphic xlink:href="fncom-14-586671-g0002.tif"/>
</fig>
<p>To verify this pattern quantitatively, we constructed a linear mixed effects model predicting d&#x02032; from main effects of training set size and feature set, as well as an interaction between feature set and training set size, with a random effect of category. A Type III ANOVA using Satterthwaite&#x00027;s method finds main effects of feature set [<italic>F</italic>(3, 55,873) = 9105.5, <italic>p</italic> &#x0003C; 0.001] and training set size [<italic>F</italic>(6, 55,873) = 15,833.5, <italic>p</italic> &#x0003C; 0.001], as well as an interaction between feature set and training set size [<italic>F</italic>(18, 55,873) = 465.1, <italic>p</italic> &#x0003C; 0.001]. We further find via single term deletion that the random effect of category explains significant variance [&#x003C7;<sup>2</sup>(1) = 20,646.5, <italic>p</italic> &#x0003C; 0.001].</p>
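<p>A Type III ANOVA with Satterthwaite&#x00027;s method is typically computed in R (e.g., via lmerTest). Purely to make the model structure concrete, the same fixed and random effects can be sketched in Python with statsmodels on synthetic stand-in data (all names and numbers below are illustrative, not the study&#x00027;s results):</p>

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for cat in range(10):                       # random effect: category
    cat_offset = rng.normal(0, 0.5)
    for fs in ["Conceptual", "Generic1"]:   # fixed effect: feature set
        for n in [2, 8, 32]:                # fixed effect: training set size
            for _ in range(5):
                d = (1.0 + (0.5 if fs == "Conceptual" else 0.0)
                     + 0.3 * np.log2(n) + cat_offset + rng.normal(0, 0.2))
                rows.append(dict(category=cat, feature_set=fs, n=n, dprime=d))
df = pd.DataFrame(rows)

# d' ~ feature set * training set size, with a random intercept per category
model = smf.mixedlm("dprime ~ C(feature_set) * C(n)", df,
                    groups=df["category"])
fit = model.fit()
```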
<p>Having established a main effect of feature set, we further analyzed differences in performance between feature sets by computing pairwise differences in estimated marginal mean performance. Critically, we found that the Conceptual features outperformed the Generic<sub>1</sub>, Generic<sub>2</sub>, and Generic<sub>3</sub> features, Generic<sub>1</sub> outperformed Generic<sub>2</sub> and Generic<sub>3</sub>, and Generic<sub>2</sub> outperformed Generic<sub>3</sub> (all <italic>p</italic>s &#x0003C; 0.001).</p>
<p>The interaction between feature set and training set size is also supported by pairwise differences in estimated marginal mean d&#x02032;. Critically, we find that Conceptual features outperform the Generic<sub>1</sub> features for 2&#x02013;32 positive training examples (<italic>p</italic>s &#x0003C; 0.001) and marginally outperform them for 64 positive training examples (performance difference = 0.041, <italic>p</italic> = 0.074). Thus, as predicted, leveraging prior concept learning leads to dramatic improvements in the ability of deep learning systems to learn novel concepts from few examples.</p>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>A striking feature of the human visual system is its ability to learn novel concepts from few examples, in sharp contrast to current computational models of visual processing in cortex, all of which require larger numbers of training examples (Serre et al., <xref ref-type="bibr" rid="B47">2007b</xref>; Yamins et al., <xref ref-type="bibr" rid="B57">2014</xref>; Schrimpf et al., <xref ref-type="bibr" rid="B45">2018</xref>). Conversely, previous models of visual category learning from computer science that perform well for small numbers of examples (Fei-Fei et al., <xref ref-type="bibr" rid="B19">2006</xref>; Vinyals et al., <xref ref-type="bibr" rid="B53">2016</xref>; albeit not at the level of current state-of-the-art approaches) were not explicitly motivated by how the brain might solve this problem and do not provide biologically plausible mechanisms. It has therefore been unclear how the brain could learn novel visual concepts from few examples. In this report, we have shown how leveraging prior concept learning can dramatically improve performance for few training examples. Crucially, this performance was obtained in a model architecture that directly builds on and extends our current understanding of how the visual cortex, in particular inferotemporal cortex, represents objects (Yamins et al., <xref ref-type="bibr" rid="B57">2014</xref>): by using a &#x0201C;conceptual&#x0201D; layer, akin to the concept representations identified downstream from IT in anterior temporal cortex (Binder et al., <xref ref-type="bibr" rid="B5">2009</xref>; Binder and Desai, <xref ref-type="bibr" rid="B4">2011</xref>; Malone et al., <xref ref-type="bibr" rid="B30">2016</xref>; Ralph et al., <xref ref-type="bibr" rid="B38">2017</xref>), new concepts can be learned from just two examples. This suggests that the human brain could likewise achieve its superior ability to learn by leveraging prior learning, specifically concept representations in ATL. How could this hypothesis be tested? If disjoint neuronal populations coding for related concepts learned at different times can be identified, causality measures such as Granger causality (Granger, <xref ref-type="bibr" rid="B21">1969</xref>; Seth et al., <xref ref-type="bibr" rid="B48">2015</xref>; Martin et al., <xref ref-type="bibr" rid="B31">2019</xref>) could provide evidence for their directed connectivity. At a coarser level, longer latencies of neuronal signals coding for more recently learned concepts relative to previously learned concepts would likewise be compatible with novel concept learning leveraging previously learned concepts.</p>
<p>Intuitively, the requirement of two examples to successfully learn novel concepts makes sense, as two examples allow the identification of commonalities among members of the target class relative to non-members. However, the phenomenon of fast mapping suggests that, under certain conditions, humans can learn concepts even from a single positive and a single negative example. In contrast, our system&#x00027;s performance in this scenario was generally poor. Yet, in principle, one positive and one negative example should already suffice if the negative example is drawn from a related category, so that it establishes a crucial, category-defining difference; this is precisely what conventional fast mapping paradigms in the literature do. In the simulations presented in this paper, the negative example was chosen randomly, so we would not necessarily expect good generalization from a single positive example. How variations in the choice of negative examples can further improve the ability to learn novel concepts from few examples is thus an interesting question for future work, one that can easily be addressed within the existing framework.</p>
<p>Another interesting question is whether there are conditions under which leveraging prior learning leads to suboptimal results compared to learning with features at lower levels of the hierarchy. In particular, Generic<sub>1</sub> features are as good as Conceptual features for larger numbers of training examples. Future work could explore whether there is some point at which features similar to Generic<sub>1</sub> outperform learning based on Conceptual features: for instance, when sufficiently many examples are available, does it help to learn the category boundaries directly based on shape rather than by relating the new category to previously learned ones? Answering these questions will be essential to understanding how the brain leverages prior learning to efficiently establish new visual concepts.</p>
</sec>
<sec sec-type="data-availability-statement" id="s5">
<title>Data Availability Statement</title>
<p>The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: <ext-link ext-link-type="uri" xlink:href="https://osf.io/jgep7">https://osf.io/jgep7</ext-link> (Open Science Foundation).</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>MR and JR conceived and designed the work, analyzed the data, and wrote the paper. JR implemented the models and acquired the data. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ack><p>The authors thank Jacob G. Martin for helpful conversations and Benjamin Maltbie for help with running simulations. This manuscript has been released as a pre-print at BioRxiv (Rule and Riesenhuber, <xref ref-type="bibr" rid="B42">2020</xref>).</p>
</ack>
<sec sec-type="supplementary-material" id="s7">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fncom.2020.586671/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fncom.2020.586671/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Table_1.CSV" id="SM1" mimetype="text/csv" xmlns:xlink="http://www.w3.org/1999/xlink">
<label>Supplementary Table 1</label>
<caption><p>2,000 ImageNet categories used to train the GoogLeNet object recognition network. A comma-delimited table listing the WordNet ID, a short natural-language title, and a short natural language gloss for each of the 2,000 categories used to train the modified GoogLeNet object recognition network used in this paper.</p></caption>
</supplementary-material>
<supplementary-material xlink:href="Table_2.CSV" id="SM2" mimetype="text/csv" xmlns:xlink="http://www.w3.org/1999/xlink">
<label>Supplementary Table 2</label>
<caption><p>100 ImageNet categories used to compare feature sets. A comma-delimited table listing the WordNet ID, a short natural-language title, and a short natural language gloss for each of the 100 categories used to compare feature sets extracted from GoogLeNet.</p></caption>
</supplementary-material>
<supplementary-material xlink:href="Data_Sheet_1.pdf" id="SM3" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink">
<label>Supplementary Data</label>
<caption><p>Category selectivity analysis. An additional analysis showing that, for the four feature sets examined in this paper, the closer a feature set is to the final output of the network, the more category-selective that feature set is (i.e., individual features more reliably signal category membership).</p></caption>
</supplementary-material>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ahissar</surname> <given-names>M.</given-names></name> <name><surname>Hochstein</surname> <given-names>S.</given-names></name></person-group> (<year>2004</year>). <article-title>The reverse hierarchy theory of visual perceptual learning</article-title>. <source>Trends Cogn. Sci.</source> <volume>8</volume>, <fpage>457</fpage>&#x02013;<lpage>464</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2004.08.011</pub-id><pub-id pub-id-type="pmid">15450510</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ashby</surname> <given-names>F. G.</given-names></name> <name><surname>Spiering</surname> <given-names>B. J.</given-names></name></person-group> (<year>2004</year>). <article-title>The neurobiology of category learning</article-title>. <source>Behav. cogn. Neurosci. Rev.</source> <volume>3</volume>, <fpage>101</fpage>&#x02013;<lpage>113</lpage>. <pub-id pub-id-type="doi">10.1177/1534582304270782</pub-id><pub-id pub-id-type="pmid">15537987</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bankson</surname> <given-names>B. B.</given-names></name> <name><surname>Hebart</surname> <given-names>M. N.</given-names></name> <name><surname>Groen</surname> <given-names>I. I. A.</given-names></name> <name><surname>Baker</surname> <given-names>C. I.</given-names></name></person-group> (<year>2018</year>). <article-title>The temporal evolution of conceptual object representations revealed through models of behavior, semantics, and deep neural networks</article-title>. <source>NeuroImage</source> <volume>178</volume>, <fpage>172</fpage>&#x02013;<lpage>182</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2018.05.037</pub-id><pub-id pub-id-type="pmid">29777825</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Binder</surname> <given-names>J. R.</given-names></name> <name><surname>Desai</surname> <given-names>R. H.</given-names></name></person-group> (<year>2011</year>). <article-title>The neurobiology of semantic memory</article-title>. <source>Trends Cogn. Sci.</source> <volume>15</volume>, <fpage>527</fpage>&#x02013;<lpage>536</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2011.10.001</pub-id><pub-id pub-id-type="pmid">22001867</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Binder</surname> <given-names>J. R.</given-names></name> <name><surname>Desai</surname> <given-names>R. H.</given-names></name> <name><surname>Graves</surname> <given-names>W. W.</given-names></name> <name><surname>Conant</surname> <given-names>L. L.</given-names></name></person-group> (<year>2009</year>). <article-title>Where is the semantic system? a critical review and meta-analysis of 120 functional neuroimaging studies</article-title>. <source>Cereb. Cortex</source> <volume>19</volume>, <fpage>2767</fpage>&#x02013;<lpage>2796</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhp055</pub-id><pub-id pub-id-type="pmid">19329570</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brady</surname> <given-names>T. F.</given-names></name> <name><surname>Konkle</surname> <given-names>T.</given-names></name> <name><surname>Alvarez</surname> <given-names>G. A.</given-names></name></person-group> (<year>2011</year>). <article-title>A review of visual memory capacity: beyond individual items and toward structured representations</article-title>. <source>J. Vis.</source> <volume>11</volume>:<fpage>4</fpage>. <pub-id pub-id-type="doi">10.1167/11.5.4</pub-id><pub-id pub-id-type="pmid">21617025</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brady</surname> <given-names>T. F.</given-names></name> <name><surname>Konkle</surname> <given-names>T.</given-names></name> <name><surname>Alvarez</surname> <given-names>G. A.</given-names></name> <name><surname>Oliva</surname> <given-names>A.</given-names></name></person-group> (<year>2008</year>). <article-title>Visual long-term memory has a massive storage capacity for object details</article-title>. <source>PNAS</source> <volume>105</volume>, <fpage>14325</fpage>&#x02013;<lpage>14329</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0803390105</pub-id><pub-id pub-id-type="pmid">18787113</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Carey</surname> <given-names>S.</given-names></name></person-group> (<year>1985</year>). <source>Conceptual Change in Childhood</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT Press</publisher-name>.</citation></ref>
<ref id="B9">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Carey</surname> <given-names>S.</given-names></name></person-group> (<year>2009</year>). <source>The Origin of Concepts</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>.</citation></ref>
<ref id="B10">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Carey</surname> <given-names>S.</given-names></name> <name><surname>Bartlett</surname> <given-names>E.</given-names></name></person-group> (<year>1978</year>). <article-title>&#x0201C;Acquiring a single new word,&#x0201D;</article-title> in <source>Proceedings of the Stanford Child Language Conference</source> (<publisher-loc>Stanford, CA</publisher-loc>), <fpage>17</fpage>&#x02013;<lpage>29</lpage>.</citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chan</surname> <given-names>A. M.</given-names></name> <name><surname>Baker</surname> <given-names>J. M.</given-names></name> <name><surname>Eskandar</surname> <given-names>E.</given-names></name> <name><surname>Schomer</surname> <given-names>D.</given-names></name> <name><surname>Ulbert</surname> <given-names>I.</given-names></name> <name><surname>Marinkovic</surname> <given-names>K.</given-names></name> <etal/></person-group>. (<year>2011</year>). <article-title>First-pass selectivity for semantic categories in human anteroventral temporal lobe</article-title>. <source>J. Neurosci.</source> <volume>31</volume>, <fpage>18119</fpage>&#x02013;<lpage>18129</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.3122-11.2011</pub-id></citation>
</ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Q.</given-names></name> <name><surname>Garcea</surname> <given-names>F. E.</given-names></name> <name><surname>Almeida</surname> <given-names>J.</given-names></name> <name><surname>Mahon</surname> <given-names>B. Z.</given-names></name></person-group> (<year>2017</year>). <article-title>Connectivity-based constraints on category-specificity in the ventral object processing pathway</article-title>. <source>Neuropsychologia</source> <volume>105</volume>, <fpage>184</fpage>&#x02013;<lpage>196</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuropsychologia.2016.11.014</pub-id></citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coutanche</surname> <given-names>M. N.</given-names></name> <name><surname>Thompson-Schill</surname> <given-names>S. L.</given-names></name></person-group> (<year>2014</year>). <article-title>Fast mapping rapidly integrates information into existing memory networks</article-title>. <source>J. Exp. Psychol. Gen.</source> <volume>143</volume>, <fpage>2296</fpage>&#x02013;<lpage>2303</lpage>. <pub-id pub-id-type="doi">10.1037/xge0000020</pub-id><pub-id pub-id-type="pmid">25222265</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coutanche</surname> <given-names>M. N.</given-names></name> <name><surname>Thompson-Schill</surname> <given-names>S. L.</given-names></name></person-group> (<year>2015a</year>). <article-title>Creating concepts from converging features in human cortex</article-title>. <source>Cereb. Cortex</source> <volume>25</volume>, <fpage>2584</fpage>&#x02013;<lpage>2593</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhu057</pub-id><pub-id pub-id-type="pmid">24692512</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Coutanche</surname> <given-names>M. N.</given-names></name> <name><surname>Thompson-Schill</surname> <given-names>S. L.</given-names></name></person-group> (<year>2015b</year>). <article-title>Rapid consolidation of new knowledge in adulthood via fast mapping</article-title>. <source>Trends Cogn. Sci.</source> <fpage>486</fpage>&#x02013;<lpage>488</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2015.06.001</pub-id><pub-id pub-id-type="pmid">26139618</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Crouzet</surname> <given-names>S. M.</given-names></name> <name><surname>Serre</surname> <given-names>T.</given-names></name></person-group> (<year>2011</year>). <article-title>What are the visual features underlying rapid object recognition?</article-title> <source>Front. Psychol.</source> <volume>2</volume>, <fpage>1</fpage>&#x02013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.3389/fpsyg.2011.00326</pub-id><pub-id pub-id-type="pmid">22110461</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Deng</surname> <given-names>J.</given-names></name> <name><surname>Dong</surname> <given-names>W.</given-names></name> <name><surname>Socher</surname> <given-names>R.</given-names></name> <name><surname>Li</surname> <given-names>L.-J.</given-names></name> <name><surname>Li</surname> <given-names>K.</given-names></name> <name><surname>Li</surname> <given-names>F.-F.</given-names></name></person-group> (<year>2009</year>). <article-title>&#x0201C;ImageNet: a large-scale hierarchical image database,&#x0201D;</article-title> in <source>2009 IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Miami, FL</publisher-loc>), <fpage>248</fpage>&#x02013;<lpage>255</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2009.5206848</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Donahue</surname> <given-names>J.</given-names></name> <name><surname>Jia</surname> <given-names>Y.</given-names></name> <name><surname>Vinyals</surname> <given-names>O.</given-names></name> <name><surname>Hoffman</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>N.</given-names></name> <name><surname>Tzeng</surname> <given-names>E.</given-names></name> <etal/></person-group> (<year>2013</year>). <source>DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. in arXiv:1310.1531 [cs]</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1310.1531">http://arxiv.org/abs/1310.1531</ext-link> (accessed March 13, 2020).</citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fei-Fei</surname> <given-names>L.</given-names></name> <name><surname>Fergus</surname> <given-names>R.</given-names></name> <name><surname>Perona</surname> <given-names>P.</given-names></name></person-group> (<year>2006</year>). <article-title>One-shot learning of object categories</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>28</volume>, <fpage>594</fpage>&#x02013;<lpage>611</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2006.79</pub-id><pub-id pub-id-type="pmid">16566508</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Freedman</surname> <given-names>D. J.</given-names></name> <name><surname>Riesenhuber</surname> <given-names>M.</given-names></name> <name><surname>Poggio</surname> <given-names>T.</given-names></name> <name><surname>Miller</surname> <given-names>E. K.</given-names></name></person-group> (<year>2003</year>). <article-title>A comparison of primate prefrontal and inferior temporal cortices during visual categorization</article-title>. <source>J. Neurosci.</source> <volume>23</volume>, <fpage>5235</fpage>&#x02013;<lpage>5246</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.23-12-05235.2003</pub-id><pub-id pub-id-type="pmid">12832548</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Granger</surname> <given-names>C. W.</given-names></name></person-group> (<year>1969</year>). <article-title>Investigating causal relations by econometric models and cross-spectral methods</article-title>. <source>Econom. J. Econom. Soc.</source> <volume>37</volume>, <fpage>424</fpage>&#x02013;<lpage>438</lpage>. <pub-id pub-id-type="doi">10.2307/1912791</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hebart</surname> <given-names>M. N.</given-names></name> <name><surname>Bankson</surname> <given-names>B. B.</given-names></name> <name><surname>Harel</surname> <given-names>A.</given-names></name> <name><surname>Baker</surname> <given-names>C. I.</given-names></name> <name><surname>Cichy</surname> <given-names>R. M.</given-names></name></person-group> (<year>2018</year>). <article-title>The representational dynamics of task and object processing in humans</article-title>. <source>Elife</source> <volume>7</volume>:<fpage>e32816</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.32816</pub-id><pub-id pub-id-type="pmid">29384473</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hodges</surname> <given-names>J. R.</given-names></name> <name><surname>Bozeat</surname> <given-names>S.</given-names></name> <name><surname>Ralph</surname> <given-names>M. A. L.</given-names></name> <name><surname>Patterson</surname> <given-names>K.</given-names></name> <name><surname>Spatt</surname> <given-names>J.</given-names></name></person-group> (<year>2000</year>). <article-title>The role of conceptual knowledge in object use evidence from semantic dementia</article-title>. <source>Brain</source> <volume>123</volume>, <fpage>1913</fpage>&#x02013;<lpage>1925</lpage>. <pub-id pub-id-type="doi">10.1093/brain/123.9.1913</pub-id><pub-id pub-id-type="pmid">10960055</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jefferies</surname> <given-names>E.</given-names></name></person-group> (<year>2013</year>). <article-title>The neural basis of semantic cognition: converging evidence from neuropsychology, neuroimaging, and TMS</article-title>. <source>Cortex</source> <volume>49</volume>, <fpage>611</fpage>&#x02013;<lpage>625</lpage>. <pub-id pub-id-type="doi">10.1016/j.cortex.2012.10.008</pub-id><pub-id pub-id-type="pmid">23260615</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname> <given-names>X.</given-names></name> <name><surname>Bradley</surname> <given-names>E.</given-names></name> <name><surname>Rini</surname> <given-names>R. A.</given-names></name> <name><surname>Zeffiro</surname> <given-names>T.</given-names></name> <name><surname>VanMeter</surname> <given-names>J.</given-names></name> <name><surname>Riesenhuber</surname> <given-names>M.</given-names></name></person-group> (<year>2007</year>). <article-title>Categorization training results in shape- and category-selective human neural plasticity</article-title>. <source>Neuron</source> <volume>53</volume>, <fpage>891</fpage>&#x02013;<lpage>903</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2007.02.015</pub-id><pub-id pub-id-type="pmid">17359923</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname> <given-names>X.</given-names></name> <name><surname>Rosen</surname> <given-names>E.</given-names></name> <name><surname>Zeffiro</surname> <given-names>T.</given-names></name> <name><surname>VanMeter</surname> <given-names>J.</given-names></name> <name><surname>Blanz</surname> <given-names>V.</given-names></name> <name><surname>Riesenhuber</surname> <given-names>M.</given-names></name></person-group> (<year>2006</year>). <article-title>Evaluation of a shape-based model of human face discrimination using fMRI and behavioral techniques</article-title>. <source>Neuron</source> <volume>50</volume>, <fpage>159</fpage>&#x02013;<lpage>172</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2006.03.012</pub-id><pub-id pub-id-type="pmid">16600863</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kravitz</surname> <given-names>D. J.</given-names></name> <name><surname>Saleem</surname> <given-names>K. S.</given-names></name> <name><surname>Baker</surname> <given-names>C. I.</given-names></name> <name><surname>Ungerleider</surname> <given-names>L. G.</given-names></name> <name><surname>Mishkin</surname> <given-names>M.</given-names></name></person-group> (<year>2013</year>). <article-title>The ventral visual pathway: an expanded neural framework for the processing of object quality</article-title>. <source>Trends Cogn. Sci.</source> <volume>17</volume>, <fpage>26</fpage>&#x02013;<lpage>49</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2012.10.011</pub-id><pub-id pub-id-type="pmid">23265839</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lake</surname> <given-names>B. M.</given-names></name> <name><surname>Salakhutdinov</surname> <given-names>R.</given-names></name> <name><surname>Tenenbaum</surname> <given-names>J. B.</given-names></name></person-group> (<year>2015</year>). <article-title>Human-level concept learning through probabilistic program induction</article-title>. <source>Science</source> <volume>350</volume>, <fpage>1332</fpage>&#x02013;<lpage>1338</lpage>. <pub-id pub-id-type="doi">10.1126/science.aab3050</pub-id><pub-id pub-id-type="pmid">26659050</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lake</surname> <given-names>B. M.</given-names></name> <name><surname>Ullman</surname> <given-names>T. D.</given-names></name> <name><surname>Tenenbaum</surname> <given-names>J. B.</given-names></name> <name><surname>Gershman</surname> <given-names>S. J.</given-names></name></person-group> (<year>2017</year>). <article-title>Building machines that learn and think like people</article-title>. <source>Behav. Brain Sci.</source> <volume>40</volume>:<fpage>e253</fpage>. <pub-id pub-id-type="doi">10.1017/S0140525X16001837</pub-id><pub-id pub-id-type="pmid">27881212</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Malone</surname> <given-names>P. S.</given-names></name> <name><surname>Glezer</surname> <given-names>L. S.</given-names></name> <name><surname>Kim</surname> <given-names>J.</given-names></name> <name><surname>Jiang</surname> <given-names>X.</given-names></name> <name><surname>Riesenhuber</surname> <given-names>M.</given-names></name></person-group> (<year>2016</year>). <article-title>Multivariate pattern analysis reveals category-related organization of semantic representations in anterior temporal cortex</article-title>. <source>J. Neurosci.</source> <volume>36</volume>, <fpage>10089</fpage>&#x02013;<lpage>10096</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.1599-16.2016</pub-id><pub-id pub-id-type="pmid">27683905</pub-id></citation></ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Martin</surname> <given-names>J. G.</given-names></name> <name><surname>Cox</surname> <given-names>P. H.</given-names></name> <name><surname>Scholl</surname> <given-names>C. A.</given-names></name> <name><surname>Riesenhuber</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>A crash in visual processing: interference between feedforward and feedback of successive targets limits detection and categorization</article-title>. <source>J. Vis.</source> <volume>19</volume>:<fpage>20</fpage>.<pub-id pub-id-type="doi">10.1167/19.12.20</pub-id><pub-id pub-id-type="pmid">31644785</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Miller</surname> <given-names>G. A.</given-names></name> <name><surname>Johnson-Laird</surname> <given-names>P. N.</given-names></name></person-group> (<year>1976</year>). <source>Language and Perception.</source> <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>Belknap Press</publisher-name>.</citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mion</surname> <given-names>M.</given-names></name> <name><surname>Patterson</surname> <given-names>K.</given-names></name> <name><surname>Acosta-Cabronero</surname> <given-names>J.</given-names></name> <name><surname>Pengas</surname> <given-names>G.</given-names></name> <name><surname>Izquierdo-Garcia</surname> <given-names>D.</given-names></name> <name><surname>Hong</surname> <given-names>Y. T.</given-names></name> <etal/></person-group>. (<year>2010</year>). <article-title>What the left and right anterior fusiform gyri tell us about semantic memory</article-title>. <source>Brain</source> <volume>133</volume>, <fpage>3256</fpage>&#x02013;<lpage>3268</lpage>. <pub-id pub-id-type="doi">10.1093/brain/awq272</pub-id><pub-id pub-id-type="pmid">20952377</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nosofsky</surname> <given-names>R. M.</given-names></name></person-group> (<year>1986</year>). <article-title>Attention, similarity, and the identification&#x02013;categorization relationship</article-title>. <source>J. Exp. Psychol. Gen.</source> <volume>115</volume>:<fpage>39</fpage>. <pub-id pub-id-type="doi">10.1037/0096-3445.115.1.39</pub-id><pub-id pub-id-type="pmid">2937873</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Oquab</surname> <given-names>M.</given-names></name> <name><surname>Bottou</surname> <given-names>L.</given-names></name> <name><surname>Laptev</surname> <given-names>I.</given-names></name> <name><surname>Sivic</surname> <given-names>J.</given-names></name></person-group> (<year>2014</year>). <article-title>&#x0201C;Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks,&#x0201D;</article-title> in <source>The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</source> (<publisher-loc>Columbus, OH</publisher-loc>). <pub-id pub-id-type="doi">10.1109/CVPR.2014.222</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Overlan</surname> <given-names>M. C.</given-names></name> <name><surname>Jacobs</surname> <given-names>R. A.</given-names></name> <name><surname>Piantadosi</surname> <given-names>S. T.</given-names></name></person-group> (<year>2017</year>). <article-title>Learning abstract visual concepts via probabilistic program induction in a Language of Thought</article-title>. <source>Cognition</source> <volume>168</volume>, <fpage>320</fpage>&#x02013;<lpage>334</lpage>. <pub-id pub-id-type="doi">10.1016/j.cognition.2017.07.005</pub-id><pub-id pub-id-type="pmid">28772189</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poggio</surname> <given-names>T.</given-names></name></person-group> (<year>2012</year>). <article-title>The computational magic of the ventral stream: towards a theory</article-title>. <source>Nat. Preced.</source> <pub-id pub-id-type="doi">10.1038/npre.2011.6117</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ralph</surname> <given-names>M. A. L.</given-names></name> <name><surname>Jefferies</surname> <given-names>E.</given-names></name> <name><surname>Patterson</surname> <given-names>K.</given-names></name> <name><surname>Rogers</surname> <given-names>T. T.</given-names></name></person-group> (<year>2017</year>). <article-title>The neural and computational bases of semantic cognition</article-title>. <source>Nat. Rev. Neurosci.</source> <volume>18</volume>, <fpage>42</fpage>&#x02013;<lpage>55</lpage>. <pub-id pub-id-type="doi">10.1038/nrn.2016.150</pub-id><pub-id pub-id-type="pmid">27881854</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Razavian</surname> <given-names>A. S.</given-names></name> <name><surname>Azizpour</surname> <given-names>H.</given-names></name> <name><surname>Sullivan</surname> <given-names>J.</given-names></name> <name><surname>Carlsson</surname> <given-names>S.</given-names></name></person-group> (<year>2014</year>). <source>CNN Features off-the-shelf: an Astounding Baseline for Recognition. arXiv:1403.6382 [cs]</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1403.6382">http://arxiv.org/abs/1403.6382</ext-link> (accessed August 19, 2019). <pub-id pub-id-type="doi">10.1109/CVPRW.2014.131</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Riesenhuber</surname> <given-names>M.</given-names></name> <name><surname>Poggio</surname> <given-names>T.</given-names></name></person-group> (<year>1999</year>). <article-title>Hierarchical models of object recognition in cortex</article-title>. <source>Nat. Neurosci.</source> <volume>2</volume>, <fpage>1019</fpage>&#x02013;<lpage>1025</lpage>. <pub-id pub-id-type="doi">10.1038/14819</pub-id><pub-id pub-id-type="pmid">10526343</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Riesenhuber</surname> <given-names>M.</given-names></name> <name><surname>Poggio</surname> <given-names>T.</given-names></name></person-group> (<year>2000</year>). <article-title>Models of object recognition</article-title>. <source>Nat. Neurosci.</source> <volume>3</volume>, <fpage>1199</fpage>&#x02013;<lpage>1204</lpage>. <pub-id pub-id-type="doi">10.1038/81479</pub-id><pub-id pub-id-type="pmid">11127838</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rule</surname> <given-names>J. S.</given-names></name> <name><surname>Riesenhuber</surname> <given-names>M.</given-names></name></person-group> (<year>2020</year>). <article-title>Leveraging prior concept learning improves ability to generalize from few examples in computational models of human object recognition</article-title>. <source>bioRxiv. [Preprint]</source>. <pub-id pub-id-type="doi">10.1101/2020.02.18.944702</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Russakovsky</surname> <given-names>O.</given-names></name> <name><surname>Deng</surname> <given-names>J.</given-names></name> <name><surname>Su</surname> <given-names>H.</given-names></name> <name><surname>Krause</surname> <given-names>J.</given-names></name> <name><surname>Satheesh</surname> <given-names>S.</given-names></name> <name><surname>Ma</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>ImageNet large scale visual recognition challenge</article-title>. <source>Int. J. Comput. Vis.</source> <volume>115</volume>, <fpage>211</fpage>&#x02013;<lpage>252</lpage>. <pub-id pub-id-type="doi">10.1007/s11263-015-0816-y</pub-id></citation></ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scholl</surname> <given-names>C. A.</given-names></name> <name><surname>Jiang</surname> <given-names>X.</given-names></name> <name><surname>Martin</surname> <given-names>J. G.</given-names></name> <name><surname>Riesenhuber</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>Time course of shape and category selectivity revealed by EEG rapid adaptation</article-title>. <source>J. Cogn. Neurosci.</source> <volume>26</volume>, <fpage>408</fpage>&#x02013;<lpage>421</lpage>. <pub-id pub-id-type="doi">10.1162/jocn_a_00477</pub-id><pub-id pub-id-type="pmid">24001003</pub-id></citation></ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schrimpf</surname> <given-names>M.</given-names></name> <name><surname>Kubilius</surname> <given-names>J.</given-names></name> <name><surname>Hong</surname> <given-names>H.</given-names></name> <name><surname>Majaj</surname> <given-names>N. J.</given-names></name> <name><surname>Rajalingham</surname> <given-names>R.</given-names></name> <name><surname>Issa</surname> <given-names>E. B.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Brain-score: Which artificial neural network for object recognition is most brain-like?</article-title> <source>bioRxiv. [Preprint]</source>. <pub-id pub-id-type="doi">10.1101/407007</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Serre</surname> <given-names>T.</given-names></name> <name><surname>Oliva</surname> <given-names>A.</given-names></name> <name><surname>Poggio</surname> <given-names>T.</given-names></name></person-group> (<year>2007a</year>). <article-title>A feedforward architecture accounts for rapid categorization</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>104</volume>, <fpage>6424</fpage>&#x02013;<lpage>6429</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0700622104</pub-id><pub-id pub-id-type="pmid">17404214</pub-id></citation></ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Serre</surname> <given-names>T.</given-names></name> <name><surname>Wolf</surname> <given-names>L.</given-names></name> <name><surname>Bileschi</surname> <given-names>S.</given-names></name> <name><surname>Riesenhuber</surname> <given-names>M.</given-names></name> <name><surname>Poggio</surname> <given-names>T.</given-names></name></person-group> (<year>2007b</year>). <article-title>Robust object recognition with cortex-like mechanisms</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>29</volume>, <fpage>411</fpage>&#x02013;<lpage>426</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2007.56</pub-id><pub-id pub-id-type="pmid">17224612</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seth</surname> <given-names>A. K.</given-names></name> <name><surname>Barrett</surname> <given-names>A. B.</given-names></name> <name><surname>Barnett</surname> <given-names>L.</given-names></name></person-group> (<year>2015</year>). <article-title>Granger causality analysis in neuroscience and neuroimaging</article-title>. <source>J. Neurosci.</source> <volume>35</volume>, <fpage>3293</fpage>&#x02013;<lpage>3297</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.4399-14.2015</pub-id><pub-id pub-id-type="pmid">25716830</pub-id></citation></ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sharon</surname> <given-names>T.</given-names></name> <name><surname>Moscovitch</surname> <given-names>M.</given-names></name> <name><surname>Gilboa</surname> <given-names>A.</given-names></name></person-group> (<year>2011</year>). <article-title>Rapid neocortical acquisition of long-term arbitrary associations independent of the hippocampus</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>108</volume>, <fpage>1146</fpage>&#x02013;<lpage>1151</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1005238108</pub-id><pub-id pub-id-type="pmid">21199935</pub-id></citation></ref>
<ref id="B50">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Szegedy</surname> <given-names>C.</given-names></name> <name><surname>Liu</surname> <given-names>W.</given-names></name> <name><surname>Jia</surname> <given-names>Y.</given-names></name> <name><surname>Sermanet</surname> <given-names>P.</given-names></name> <name><surname>Reed</surname> <given-names>S.</given-names></name> <name><surname>Anguelov</surname> <given-names>D.</given-names></name> <etal/></person-group> (<year>2014</year>). <article-title>Going deeper with convolutions</article-title>. <source>arXiv:1409.4842 [cs]</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1409.4842">http://arxiv.org/abs/1409.4842</ext-link> (accessed September 24, 2020).</citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thomas</surname> <given-names>E.</given-names></name> <name><surname>Van Hulle</surname> <given-names>M. M.</given-names></name> <name><surname>Vogels</surname> <given-names>R.</given-names></name></person-group> (<year>2001</year>). <article-title>Encoding of categories by noncategory-specific neurons in the inferior temporal cortex</article-title>. <source>J. Cogn. Neurosci.</source> <volume>13</volume>, <fpage>190</fpage>&#x02013;<lpage>200</lpage>. <pub-id pub-id-type="doi">10.1162/089892901564252</pub-id><pub-id pub-id-type="pmid">11244545</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vandenberghe</surname> <given-names>R.</given-names></name> <name><surname>Price</surname> <given-names>C.</given-names></name> <name><surname>Wise</surname> <given-names>R.</given-names></name> <name><surname>Josephs</surname> <given-names>O.</given-names></name> <name><surname>Frackowiak</surname> <given-names>R. S. J.</given-names></name></person-group> (<year>1996</year>). <article-title>Functional anatomy of a common semantic system for words and pictures</article-title>. <source>Nature</source> <volume>383</volume>, <fpage>254</fpage>&#x02013;<lpage>256</lpage>. <pub-id pub-id-type="doi">10.1038/383254a0</pub-id><pub-id pub-id-type="pmid">8805700</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Vinyals</surname> <given-names>O.</given-names></name> <name><surname>Blundell</surname> <given-names>C.</given-names></name> <name><surname>Lillicrap</surname> <given-names>T.</given-names></name> <name><surname>Kavukcuoglu</surname> <given-names>K.</given-names></name> <name><surname>Wierstra</surname> <given-names>D.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Matching networks for one shot learning,&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems</source> (<publisher-loc>Barcelona</publisher-loc>), <fpage>3630</fpage>&#x02013;<lpage>3638</lpage>.</citation></ref>
<ref id="B54">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Watanabe</surname> <given-names>S.</given-names></name></person-group> (<year>1969</year>). <source>Knowing and Guessing: A Quantitative Study of Inference and Information.</source> <publisher-loc>Hoboken, NJ</publisher-loc>: <publisher-name>John Wiley and Sons</publisher-name>.</citation></ref>
<ref id="B55">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Woods</surname> <given-names>W.</given-names></name></person-group> (<year>1981</year>). <article-title>&#x0201C;Procedural semantics as a theory of meaning,&#x0201D;</article-title> in <source>Elements of Discourse Understanding</source>, eds <person-group person-group-type="editor"><name><surname>Joshi</surname> <given-names>A. K.</given-names></name> <name><surname>Webber</surname> <given-names>B. L.</given-names></name> <name><surname>Sag</surname> <given-names>I. A.</given-names></name></person-group> (<publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>), <fpage>300</fpage>&#x02013;<lpage>334</lpage>.</citation></ref>
<ref id="B56">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yamins</surname> <given-names>D. L.</given-names></name> <name><surname>Hong</surname> <given-names>H.</given-names></name> <name><surname>Cadieu</surname> <given-names>C.</given-names></name> <name><surname>DiCarlo</surname> <given-names>J. J.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Hierarchical modular optimization of convolutional networks achieves representations similar to macaque IT and human ventral stream,&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems</source> (<publisher-loc>Lake Tahoe, NV</publisher-loc>), <fpage>3093</fpage>&#x02013;<lpage>3101</lpage>.</citation></ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yamins</surname> <given-names>D. L. K.</given-names></name> <name><surname>Hong</surname> <given-names>H.</given-names></name> <name><surname>Cadieu</surname> <given-names>C. F.</given-names></name> <name><surname>Solomon</surname> <given-names>E. A.</given-names></name> <name><surname>Seibert</surname> <given-names>D.</given-names></name> <name><surname>DiCarlo</surname> <given-names>J. J.</given-names></name></person-group> (<year>2014</year>). <article-title>Performance-optimized hierarchical models predict neural responses in higher visual cortex</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>111</volume>, <fpage>8619</fpage>&#x02013;<lpage>8624</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1403112111</pub-id></citation>
</ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yildirim</surname> <given-names>I.</given-names></name> <name><surname>Jacobs</surname> <given-names>R. A.</given-names></name></person-group> (<year>2013</year>). <article-title>Transfer of object category knowledge across visual and haptic modalities: experimental and computational studies</article-title>. <source>Cognition</source> <volume>126</volume>, <fpage>135</fpage>&#x02013;<lpage>148</lpage>. <pub-id pub-id-type="doi">10.1016/j.cognition.2012.08.005</pub-id><pub-id pub-id-type="pmid">23102553</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yildirim</surname> <given-names>I.</given-names></name> <name><surname>Jacobs</surname> <given-names>R. A.</given-names></name></person-group> (<year>2015</year>). <article-title>Learning multisensory representations for auditory-visual transfer of sequence category knowledge: a probabilistic language of thought approach</article-title>. <source>Psychon. Bull. Rev.</source> <volume>22</volume>, <fpage>673</fpage>&#x02013;<lpage>686</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-014-0734-y</pub-id></citation>
</ref>
<ref id="B60">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Yosinski</surname> <given-names>J.</given-names></name> <name><surname>Clune</surname> <given-names>J.</given-names></name> <name><surname>Bengio</surname> <given-names>Y.</given-names></name> <name><surname>Lipson</surname> <given-names>H.</given-names></name></person-group> (<year>2014</year>). <article-title>&#x0201C;How transferable are features in deep neural networks?&#x0201D;</article-title> in <source>Advances in Neural Information Processing Systems</source> <volume>27</volume>, eds <person-group person-group-type="editor"><name><surname>Ghahramani</surname> <given-names>Z.</given-names></name> <name><surname>Welling</surname> <given-names>M.</given-names></name> <name><surname>Cortes</surname> <given-names>C.</given-names></name> <name><surname>Lawrence</surname> <given-names>N. D.</given-names></name> <name><surname>Weinberger</surname> <given-names>K. Q.</given-names></name></person-group> (<publisher-loc>Red Hook, NY</publisher-loc>: <publisher-name>Curran Associates, Inc.</publisher-name>), <fpage>3320</fpage>&#x02013;<lpage>3328</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://papers.nips.cc/paper/5347-how-transferable-are-features-in-deep-neural-networks.pdf">http://papers.nips.cc/paper/5347-how-transferable-are-features-in-deep-neural-networks.pdf</ext-link> (accessed August 19, 2019).</citation></ref>
</ref-list>
<fn-group>
<fn fn-type="financial-disclosure"><p><bold>Funding.</bold> This work was supported in part by Lawrence Livermore National Laboratory (<ext-link ext-link-type="uri" xlink:href="https://llnl.gov">https://llnl.gov</ext-link>) under the auspices of the U.S. Department of Energy under Contract DE-AC52-07NA27344 and the LLNL-LDRD Program under Project No. COMP-19-ERD-007 (MR), and by the National Science Foundation (<ext-link ext-link-type="uri" xlink:href="https://nsf.gov">https://nsf.gov</ext-link>) Brain and Cognitive Sciences Grants 1026934 and 1232530 (MR), and Graduate Research Fellowship Grants 1122374 and 1745302 (JR). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</p>
</fn>
</fn-group>
</back>
</article>