<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="review-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Comput. Neurosci.</journal-id>
<journal-title>Frontiers in Computational Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Comput. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-5188</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fncom.2022.929348</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Computational Neuroscience</subject>
<subj-group>
<subject>Review</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Efficient Temporal Coding in the Early Visual System: Existing Evidence and Future Directions</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Price</surname> <given-names>Byron H.</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/1832530/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Gavornik</surname> <given-names>Jeffrey P.</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1311823/overview"/>
</contrib>
</contrib-group>
<aff><institution>Center for Systems Neuroscience, Graduate Program in Neuroscience, Department of Biology, Boston University</institution>, <addr-line>Boston, MA</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Paul Miller, Brandeis University, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Benoit R. Cottereau, UMR 5549 Centre de Recherche Cerveau et Cognition (CerCo), France; Jochen Triesch, Goethe University Frankfurt, Germany</p></fn>
<corresp id="c001">&#x002A;Correspondence: Jeffrey P. Gavornik, <email>gavornik@bu.edu</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>04</day>
<month>07</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>16</volume>
<elocation-id>929348</elocation-id>
<history>
<date date-type="received">
<day>26</day>
<month>04</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>13</day>
<month>06</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2022 Price and Gavornik.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Price and Gavornik</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>While it is universally accepted that the brain makes predictions, there is little agreement about how this is accomplished and under which conditions. Accurate prediction requires neural circuits to learn and store spatiotemporal patterns observed in the natural environment, but it is not obvious how such information should be stored, or encoded. Information theory provides a mathematical formalism that can be used to measure the efficiency and utility of different coding schemes for data transfer and storage. This theory shows that codes become efficient when they remove predictable, redundant spatial and temporal information. Efficient coding has been used to understand retinal computations and may also be relevant to understanding more complicated temporal processing in visual cortex. However, the literature on efficient coding in cortex is varied and can be confusing since the same terms are used to mean different things in different experimental and theoretical contexts. In this work, we attempt to provide a clear summary of the theoretical relationship between efficient coding and temporal prediction, and review evidence that efficient coding principles explain computations in the retina. We then apply the same framework to computations occurring in early visuocortical areas, arguing that data from rodents are largely consistent with the predictions of this model. Finally, we review and respond to criticisms of efficient coding and suggest ways that this theory might be used to design future experiments, with particular focus on understanding the extent to which neural circuits make predictions from efficient representations of environmental statistics.</p>
</abstract>
<kwd-group>
<kwd>efficient coding</kwd>
<kwd>predictive coding</kwd>
<kwd>time</kwd>
<kwd>temporal representations</kwd>
<kwd>visual cortex</kwd>
</kwd-group>
<contract-num rid="cn001">R01EY030200</contract-num>
<contract-sponsor id="cn001">National Eye Institute<named-content content-type="fundref-id">10.13039/100000053</named-content></contract-sponsor>
<counts>
<fig-count count="3"/>
<table-count count="4"/>
<equation-count count="11"/>
<ref-count count="164"/>
<page-count count="15"/>
<word-count count="14264"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1" sec-type="intro">
<title>Introduction</title>
<p>An interesting feature of the human mind is how it tricks us into thinking complex tasks are simple. One semi-apocryphal illustration of this was Marvin Minsky asking an undergraduate to program a computer to &#x201C;describe what it saw&#x201D; through a camera, over the course of a single summer. Minsky&#x2019;s wild underestimate of how hard this would be stemmed from an intuition that it can&#x2019;t be terribly hard to do something so effortless. Similarly, our innate understanding of time as the inescapable dimension along which life proceeds seems effortless. Surely, the mechanistic underpinnings of this ability should be easy to describe? Not quite. It is not even easy to define what &#x201C;time&#x201D; is. Though multiple research groups have made important theoretical and experimental contributions to understanding time in the brain (<xref ref-type="bibr" rid="B88">Mauk and Buonomano, 2004</xref>; <xref ref-type="bibr" rid="B23">Buonomano and Laje, 2010</xref>; <xref ref-type="bibr" rid="B1">Allman et al., 2014</xref>; <xref ref-type="bibr" rid="B45">Eichenbaum, 2014</xref>; <xref ref-type="bibr" rid="B90">Merchant et al., 2014</xref>, <xref ref-type="bibr" rid="B91">2015</xref>; <xref ref-type="bibr" rid="B145">Tucci et al., 2014</xref>; <xref ref-type="bibr" rid="B49">Finnerty et al., 2015</xref>; <xref ref-type="bibr" rid="B108">Petter et al., 2018</xref>; <xref ref-type="bibr" rid="B154">Wang et al., 2018</xref>; <xref ref-type="bibr" rid="B8">Balasubramaniam et al., 2021</xref>), neuroscience provides few satisfying answers to the big questions of how time is explicitly represented in cortex, how temporal relationships are stored in memories, and how memories of temporal relationships are used to make predictions.</p>
<p>The ability to make accurate predictions confers clear competitive advantages. Accurately extrapolating the trajectory of a moving object to predict its future state, for example, is very useful for both prey capture and predator evasion. An interesting idea is that prediction might emerge as a natural consequence of resource optimization, thus providing dual benefits for adaptive behavior and energy efficiency. The concept of <italic>efficient coding</italic> from information theory describes how to achieve such resource optimization for data storage or transmission (<xref ref-type="bibr" rid="B128">Shannon, 1948</xref>; <xref ref-type="bibr" rid="B33">Cover and Thomas, 2006b</xref>; <xref ref-type="bibr" rid="B140">Stone, 2018</xref>). When data are extended in space or time, efficient coding suggests that a <italic>predictive coding</italic> scheme can be used to compress information and save energy (<xref ref-type="bibr" rid="B47">Elias, 1955</xref>). Though these ideas were inspired by problems in telecommunications engineering, they have found significant application in biology.</p>
<p>Efficient coding was first introduced to neuroscience by <xref ref-type="bibr" rid="B5">Attneave (1954)</xref> and <xref ref-type="bibr" rid="B12">Barlow (1961)</xref>, who argued that retinal circuits use efficient coding to transform light patterns into the neural code transmitted through the optic nerve. The basic concept has evolved through the years and been used to explain a wide variety of experimental results across the visual system (<xref ref-type="bibr" rid="B136">Srinivasan et al., 1982</xref>; <xref ref-type="bibr" rid="B3">Atick and Redlich, 1992</xref>; <xref ref-type="bibr" rid="B42">Dong and Atick, 1995</xref>; <xref ref-type="bibr" rid="B37">Dan et al., 1996</xref>; <xref ref-type="bibr" rid="B102">Olshausen and Field, 1996</xref>; <xref ref-type="bibr" rid="B89">Meister and Berry, 1999</xref>; <xref ref-type="bibr" rid="B41">Doi et al., 2012</xref>; <xref ref-type="bibr" rid="B110">Pitkow and Meister, 2012</xref>). Many basic functional properties of the retina are likely the result of efficient coding, for example the unequal distribution of ON and OFF ganglion cell types (<xref ref-type="bibr" rid="B10">Balasubramanian and Sterling, 2009</xref>). <xref ref-type="bibr" rid="B136">Srinivasan et al. (1982)</xref> were the first to show that retinal ganglion cells (RGCs) effectively act as linear predictive coders. Extending this notion, <xref ref-type="bibr" rid="B100">Ocko et al. (2018)</xref> provided a plausible explanation for the existence of multiple ganglion cell subtypes. 
Efficient coding may also provide a framework to explain how the evolution and plasticity of cortical circuitry is shaped by natural environmental statistics (<xref ref-type="bibr" rid="B102">Olshausen and Field, 1996</xref>; <xref ref-type="bibr" rid="B16">Bell and Sejnowski, 1997</xref>; <xref ref-type="bibr" rid="B116">Rao and Ballard, 1999</xref>; <xref ref-type="bibr" rid="B158">Wiskott and Sejnowski, 2002</xref>; <xref ref-type="bibr" rid="B70">Jehee et al., 2006</xref>; <xref ref-type="bibr" rid="B36">Creutzig and Sprekeler, 2008</xref>). For example, <xref ref-type="bibr" rid="B102">Olshausen and Field (1996</xref>, <xref ref-type="bibr" rid="B103">1997)</xref> created a spatial efficient coding model that mimics primary visual cortex (V1) receptive fields when trained on natural scenes. <xref ref-type="bibr" rid="B116">Rao and Ballard (1999)</xref> famously introduced a hierarchical predictive coding model to explain the classical and extra-classical receptive field properties of V1 neurons. Related models have since been suggested to provide a general framework for understanding cortical function (<xref ref-type="bibr" rid="B52">Friston, 2005</xref>; <xref ref-type="bibr" rid="B15">Bastos et al., 2012</xref>; <xref ref-type="bibr" rid="B135">Spratling, 2017</xref>; <xref ref-type="bibr" rid="B157">Whittington and Bogacz, 2019</xref>; <xref ref-type="bibr" rid="B93">Millidge et al., 2021</xref>).</p>
<p>Overall, there is a rich literature on both efficient and predictive coding. In the visual system, however, much of the experimental and theoretical focus has been on the spatial domain. This may reflect the inherent &#x201C;spatial-ness&#x201D; of vision as a sensory modality, but we argue that time is also fundamental, even beyond motion processing. In addition, the notion of predictive coding has become somewhat restrictive in its potential instantiations in neural circuitry; prediction and predictive coding may still be relevant to understanding cortical function even if hierarchical predictive coding turns out to be an inappropriate model. To that end, this review begins with a primer on efficient coding wherein we explicitly derive the close relationship between efficient and predictive coding. Next, we review evidence for efficient coding in the retina and dorsal lateral geniculate nucleus (dLGN). We then move to later visual areas, with particular emphasis on how efficient coding principles in visual cortex may underlie a variety of time-dependent computational tasks, including visual flow processing, spatiotemporal sequence learning, and adaptation. In the end, we hope to clarify the sometimes-confusing relationship between efficient and predictive coding, and to discuss ways in which these theories may guide experiments and provide clues about how the nervous system codes temporal relationships to make predictions.</p>
</sec>
<sec id="S2">
<title>Efficient Coding Primer</title>
<p>Efficient coding is a concept from information theory describing how data can be transmitted or stored with minimal use of energy, time, and resources. Pioneered in the 1920s&#x2013;1950s by Harry Nyquist, Ralph Hartley, and Claude Shannon of Bell Telephone Labs, information theory provided a quantitative framework for analyzing then-emergent telecommunications technology and for helping their employer save money on telegram and telephone transmission. The goal was to design a system capable of reliable message transmission and storage.</p>
<p>As a concrete example, consider constructing a message from a four-letter alphabet, &#x03B8; = &#x007B;<italic>A, B, C, D</italic>&#x007D;. There are 4<sup>10</sup> possible 10-letter messages, a typical example of which might be BACAAABDAA. If we assume that each letter appears independently within a message according to the following probabilities:</p>
<disp-formula id="S2.E1"><mml:math id="M1"><mml:mrow><mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>A</mml:mi><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac></mml:mstyle></mml:mrow><mml:mo rspace="7.5pt">,</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>B</mml:mi><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mn>1</mml:mn><mml:mn>4</mml:mn></mml:mfrac></mml:mstyle></mml:mrow><mml:mo rspace="7.5pt">,</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mn>1</mml:mn><mml:mn>6</mml:mn></mml:mfrac></mml:mstyle></mml:mrow><mml:mo rspace="7.5pt">,</mml:mo><mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mn>1</mml:mn><mml:mn>12</mml:mn></mml:mfrac></mml:mstyle></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>then certain messages are much more likely to occur than others (DDDDDDDDDD is very unlikely, for example). It makes intuitive sense to choose an <italic>encoding scheme</italic> that takes advantage of this non-uniformity. With this in mind, Shannon introduced the idea of <italic>entropy</italic> to quantify the minimum average number of binary digits (bits) per symbol required to store or send any such message (<xref ref-type="bibr" rid="B128">Shannon, 1948</xref>; <xref ref-type="bibr" rid="B34">Cover and Thomas, 2006c</xref>). In his formulation, information is simply <italic>I</italic> (<italic>X</italic>) = &#x2212;log<sub>2</sub> (<italic>P</italic>(<italic>X</italic>)) bits, which is the number of binary digits required to store an outcome that occurs with probability <italic>P</italic> (<italic>X</italic>). Because it grows as probability shrinks, information is a measure of epistemic surprise relative to expectation, and entropy is just the expected value of the information:</p>
<disp-formula id="S2.E2"><mml:math id="M2"><mml:mrow><mml:mrow><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x225C;</mml:mo><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>I</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>The average total amount of information in a message of <italic>N</italic> symbols is simply <italic>N</italic> &#x00D7; <italic>H</italic> (&#x03B8;). In our example,</p>
<disp-formula id="S2.E3"><mml:math id="M3"><mml:mrow><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo rspace="0pt">-</mml:mo><mml:mstyle displaystyle="true"><mml:munder><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>&#x007B;</mml:mo><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo>&#x007D;</mml:mo></mml:mrow></mml:mrow></mml:munder></mml:mstyle><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mtext>log</mml:mtext><mml:mn>2</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mpadded width="+5pt"><mml:mn>1.73</mml:mn></mml:mpadded><mml:mtext>bits</mml:mtext></mml:mrow></mml:math></disp-formula>
<p>Shannon&#x2019;s <italic>source coding theorem</italic> proves that the entropy is the minimum average number of bits per symbol that can be used to represent our messages without losing information. For this value of entropy, the average information in a 10-symbol message would be 17.3 bits. If we had assumed that all letters were equally likely, then entropy would be 2 bits per symbol and the average 10-symbol message would require 20 bits. As this example demonstrates, the source coding theorem implies that data from different distributions carry differing amounts of information and can therefore be stored with differing numbers of bits. The question is how to design an encoding scheme that takes advantage of this theorem and minimizes the number of bits used.</p>
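<p>The entropy values quoted above can be checked numerically. The following is a minimal sketch in Python, not part of the original derivation; the names <italic>information_bits</italic> and <italic>entropy_bits</italic> are illustrative choices, and the only inputs are the four symbol probabilities given earlier.</p>
<preformat>
```python
from fractions import Fraction
from math import log2

# Symbol probabilities of the four-letter source described in the text.
probs = {
    "A": Fraction(1, 2),
    "B": Fraction(1, 4),
    "C": Fraction(1, 6),
    "D": Fraction(1, 12),
}
assert sum(probs.values()) == 1  # a valid distribution sums to one


def information_bits(p):
    """Information, in bits, of an outcome that occurs with probability p."""
    return -log2(float(p))


def entropy_bits(distribution):
    """Entropy in bits per symbol: the expected value of the information."""
    return sum(float(p) * information_bits(p) for p in distribution.values())


h_source = entropy_bits(probs)
h_uniform = entropy_bits({symbol: Fraction(1, 4) for symbol in "ABCD"})

print(f"non-uniform source: {h_source:.2f} bits per symbol")
print(f"uniform source:     {h_uniform:.2f} bits per symbol")
print(f"10 symbols: {10 * h_source:.1f} bits vs {10 * h_uniform:.1f} bits")
```
</preformat>
<p>Under these assumptions the sketch reproduces the 1.73 and 2 bits per symbol, and the 17.3 and 20 bits per 10-symbol message, discussed above.</p>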
<p>According to the source coding theorem, an encoding scheme is efficient if messages are, on average, transmitted with a number of bits approaching the entropy of the source (<xref ref-type="bibr" rid="B128">Shannon, 1948</xref>; <xref ref-type="bibr" rid="B12">Barlow, 1961</xref>; <xref ref-type="bibr" rid="B2">Atick, 1992</xref>; <xref ref-type="bibr" rid="B32">Cover and Thomas, 2006a</xref>; <xref ref-type="bibr" rid="B137">Sterling and Laughlin, 2015</xref>). Efficient codes specify messages using the minimum possible number of bits and do so by removing predictable information.</p>
<p>To get a sense of what this means in practice, consider representing our symbol alphabet using a simple binary encoding scheme:</p>
<table-wrap position="float">
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<tbody>
<tr>
<td valign="top" align="left">A</td>
<td valign="top" align="center">00</td>
</tr>
<tr>
<td valign="top" align="left">B</td>
<td valign="top" align="center">01</td>
</tr>
<tr>
<td valign="top" align="left">C</td>
<td valign="top" align="center">10</td>
</tr>
<tr>
<td valign="top" align="left">D</td>
<td valign="top" align="center">11</td>
</tr>
<tr>
<td valign="top" align="left"></td>
<td valign="top" align="center"/></tr>
</tbody>
</table>
</table-wrap>
<p>Transmitting a letter/symbol with this scheme requires 2 bits, which is more than the theoretical minimum of 1.73 bits implied by the source coding theorem. What exactly makes this code inefficient? The answer is redundancy: messages transmitted with this code are, on average, predictable. To see this, consider transmitting a 0 for the first bit. Since <inline-formula><mml:math id="INEQ6"><mml:mrow><mml:mrow><mml:mpadded lspace="5pt" width="+5pt"><mml:mi>P</mml:mi></mml:mpadded><mml:mrow><mml:mo>(</mml:mo><mml:mi>A</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="INEQ7"><mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>B</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>4</mml:mn></mml:mfrac></mml:mrow></mml:math></inline-formula>, the probability that the second bit will also be a zero is 2/3. A more efficient code would reduce such redundancies.</p>
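<p>The predictability of individual bits under the 2-bit code can be worked out directly from the symbol probabilities. The sketch below is illustrative only (the helper names are hypothetical); it uses exact fractions so that the conditional probability appears as a ratio rather than a rounded decimal.</p>
<preformat>
```python
from fractions import Fraction

probs = {
    "A": Fraction(1, 2),
    "B": Fraction(1, 4),
    "C": Fraction(1, 6),
    "D": Fraction(1, 12),
}
fixed_code = {"A": "00", "B": "01", "C": "10", "D": "11"}


def prob_first_bit(bit):
    """P(first bit == bit) for one symbol sent with the 2-bit code."""
    return sum(p for s, p in probs.items() if fixed_code[s][0] == bit)


def prob_second_given_first(second, first):
    """P(second bit == second | first bit == first) under the 2-bit code."""
    joint = sum(p for s, p in probs.items() if fixed_code[s] == first + second)
    return joint / prob_first_bit(first)


p_first_zero = prob_first_bit("0")
p_zero_after_zero = prob_second_given_first("0", "0")
print(p_first_zero, p_zero_after_zero)  # prints: 3/4 2/3
```
</preformat>
<p>A bit that carried a full bit of information would take either value with probability 1/2; here both the first bit (3/4) and the second bit given the first (2/3) are biased toward 0.</p>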
<p>Formally, Shannon redundancy is given by <xref ref-type="bibr" rid="B2">Atick (1992)</xref>:</p>
<disp-formula id="S2.E4"><mml:math id="M4"><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>R</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:mfrac><mml:mrow><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mi>C</mml:mi></mml:mfrac></mml:mstyle></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>where <italic>C</italic> is the average number of bits used per symbol (2 bits in our encoding scheme). For any lossless code, <italic>C</italic> &#x2265; <italic>H</italic> (&#x03B8;), so the redundancy is always between 0 and 1, with a perfectly efficient code having <italic>H</italic> (&#x03B8;) = <italic>C</italic> and <italic>R = 0</italic>. In this case, <italic>C = 2</italic> and <italic>H</italic> (&#x03B8;) = 1.73, so <italic>R</italic> = 0.135.</p>
<p>Consider an alternative, variable-length encoding scheme that uses up to 3 bits per symbol:</p>
<table-wrap position="float">
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<tbody>
<tr>
<td valign="top" align="right">A</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="right">B</td>
<td valign="top" align="center">10</td>
</tr>
<tr>
<td valign="top" align="right">C</td>
<td valign="top" align="center">110</td>
</tr>
<tr>
<td valign="top" align="right">D</td>
<td valign="top" align="center">111</td>
</tr>
<tr>
<td valign="top" align="right"></td>
<td valign="top" align="center"/></tr>
</tbody>
</table>
</table-wrap>
<p>Intuition might suggest that allowing 3-bit codewords will decrease efficiency, but this is incorrect when we consider the underlying message-generating process. While C and D both require 3 bits to transmit, this increase in length is more than compensated by the fact that the most common symbol, A, requires only 1 bit. For the entire 3-bit scheme, the average number of bits per symbol is:</p>
<disp-formula id="S2.E5"><mml:math id="M5"><mml:mrow><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>A</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mpadded width="+5pt"><mml:mn>1</mml:mn></mml:mpadded><mml:mtext>bit</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>B</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mpadded width="+5pt"><mml:mn>2</mml:mn></mml:mpadded><mml:mtext>bits</mml:mtext></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mpadded width="+5pt"><mml:mn>3</mml:mn></mml:mpadded><mml:mtext>bits</mml:mtext></mml:mrow><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mpadded width="+5pt"><mml:mn>1.75</mml:mn></mml:mpadded><mml:mtext>bits</mml:mtext></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>The redundancy is now <italic>R</italic> = 0.011, an order of magnitude smaller than that of the 2-bit scheme. As long as the source statistics remain stationary, this 3-bit scheme is a very efficient code for transmitting our messages and illustrates the fundamental principle of efficient coding: <italic>use relatively fewer symbols to encode prevalent/expected messages and relatively more symbols to encode rare/unexpected messages</italic> (<xref ref-type="bibr" rid="B32">Cover and Thomas, 2006a</xref>).</p>
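<p>The comparison between the two encoding schemes can be sketched in a few lines of Python. This is an illustrative check rather than part of the original analysis; the function names are hypothetical, and the example message is the 10-letter string used earlier in this section.</p>
<preformat>
```python
from math import log2

probs = {"A": 1 / 2, "B": 1 / 4, "C": 1 / 6, "D": 1 / 12}
fixed_code = {"A": "00", "B": "01", "C": "10", "D": "11"}
variable_code = {"A": "0", "B": "10", "C": "110", "D": "111"}

# Source entropy in bits per symbol.
entropy = -sum(p * log2(p) for p in probs.values())


def mean_length(code):
    """Average number of bits per symbol, weighted by symbol probability."""
    return sum(probs[s] * len(word) for s, word in code.items())


def redundancy(code):
    """Shannon redundancy R = 1 - H / C for the given code."""
    return 1 - entropy / mean_length(code)


def encode(message, code):
    """Concatenate the codewords of each symbol in the message."""
    return "".join(code[s] for s in message)


message = "BACAAABDAA"  # the 10-letter example message from the text
for name, code in (("2-bit", fixed_code), ("3-bit", variable_code)):
    bits = encode(message, code)
    print(f"{name}: C = {mean_length(code):.2f}, R = {redundancy(code):.3f}, "
          f"example message uses {len(bits)} bits")
```
</preformat>
<p>For the example message, the 2-bit scheme uses 20 bits and the 3-bit scheme uses 16. Note that the sketch uses the unrounded entropy, so the redundancy of the 3-bit scheme comes out slightly higher (about 0.012) than the value obtained from the rounded entropy of 1.73 bits.</p>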
<p>Our example also demonstrates the principle that deviations from uniformity <italic>decrease</italic> entropy (1.73 bits is less than the 2 bits of a uniform distribution). As more complex statistical dependencies are added to the source, entropy often drops even further. Consider, for example, a Markov chain with the following transition probability matrix, <italic>T</italic>, in which each row gives the current symbol and each column the symbol that follows it:</p>
<table-wrap position="float">
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<tbody>
<tr>
<td valign="top" align="right"></td>
<td valign="top" align="center">A</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">C</td>
<td valign="top" align="center">D</td>
</tr>
<tr>
<td valign="top" align="right">A</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.1</td>
</tr>
<tr>
<td valign="top" align="right">B</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">0.1</td>
</tr>
<tr>
<td valign="top" align="right">C</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.7</td>
</tr>
<tr>
<td valign="top" align="right">D</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0.1</td>
</tr>
<tr>
<td valign="top" align="right"></td>
<td valign="top" align="center"/><td valign="top" align="center"/><td valign="top" align="center"/><td valign="top" align="center"/></tr>
</tbody>
</table>
</table-wrap>
<p>The transition matrix provides conditional probabilities such as <italic>P</italic> (<italic>B</italic> &#x007C; <italic>A</italic>) = 0.7, the probability that <italic>B</italic> follows <italic>A</italic>. This Markov chain generates sequences that tend to repeat the pattern ABCD, so neighboring message elements are no longer statistically independent (<italic>P</italic> (<italic>AB</italic>) &#x2260; <italic>P</italic> (<italic>A</italic>) <italic>P</italic> (<italic>B</italic>)). According to the mathematics of Markov chains, the asymptotic probability of element occurrence, &#x03C0;, solves the equation &#x03C0; = &#x03C0;<italic>T</italic>. Thus, &#x03C0; is the left eigenvector of the transition matrix with eigenvalue 1, normalized so that its entries sum to 1. In this case,</p>
<disp-formula id="S2.E6"><mml:math id="M6"><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x03C0;</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>X</mml:mi><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>&#x007B;</mml:mo><mml:mtable rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mn>0.25</mml:mn><mml:mo>,</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>A</mml:mi></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mn>0.25</mml:mn><mml:mo>,</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>B</mml:mi></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mn>0.25</mml:mn><mml:mo>,</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mn>0.25</mml:mn><mml:mo>,</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>D</mml:mi></mml:mrow></mml:mtd></mml:mtr></mml:mtable><mml:mi/></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>This is known as the stationary distribution of the Markov chain, representing how often we expect each symbol to occur regardless of its position within the message. For Markov chains, the so-called <italic>entropy rate</italic> becomes (<xref ref-type="bibr" rid="B35">Cover and Thomas, 2006d</xref>):</p>
<disp-formula id="S2.E7"><mml:math id="M7"><mml:mrow><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x225C;</mml:mo><mml:mo>-</mml:mo><mml:mstyle displaystyle="true"><mml:munder><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo stretchy="false">&#x007B;</mml:mo><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo stretchy="false">&#x007D;</mml:mo></mml:mrow></mml:mrow></mml:munder></mml:mstyle><mml:mi mathvariant="normal">&#x03C0;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)&#x00A0;</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:munder><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>y</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo stretchy="false">&#x007B;</mml:mo><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo stretchy="false">&#x007D;</mml:mo></mml:mrow></mml:mrow></mml:munder></mml:mstyle><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>Y</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x007C;</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:msub><mml:mtext>log</mml:mtext><mml:mn>2</mml:mn></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>Y</mml:mi><mml:mo rspace="5.8pt">=</mml:mo><mml:mi>y</mml:mi><mml:mo stretchy="false">&#x007C;</mml:mo><mml:mi>X</mml:mi><mml:mo 
rspace="5.8pt">=</mml:mo><mml:mi>x</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>For the present example:</p>
<disp-formula id="S2.E8"><mml:math id="M8"><mml:mrow><mml:mrow><mml:mi>H</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mpadded width="+5pt"><mml:mn>1.357</mml:mn></mml:mpadded><mml:mtext>bits</mml:mtext></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>This represents a significant reduction in entropy compared to the 2 bits of a 4-symbol uniform distribution, brought about by the statistical dependence between neighboring message elements. <italic>A</italic> is now predictive of <italic>B</italic>, and <italic>B</italic> of <italic>C</italic>, etc. despite each letter being equally likely overall.</p>
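<p>The entropy value above can be checked numerically. The following Python sketch assumes a transition matrix in which each letter is followed by its successor (<italic>A</italic> to <italic>B</italic>, <italic>B</italic> to <italic>C</italic>, <italic>C</italic> to <italic>D</italic>, <italic>D</italic> to <italic>A</italic>) with probability 0.7 and by each of the other three letters with probability 0.1, together with a uniform stationary distribution &#x03C0;. These numbers are an assumption, chosen because they reproduce the stated 1.357 bits; the function name <monospace>entropy_rate</monospace> is hypothetical.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from math import log2

SYMBOLS = "ABCD"

# Assumed transition probabilities P(Y = y | X = x): 0.7 for the next letter
# in the cycle A -> B -> C -> D -> A, and 0.1 for each of the other letters.
P = {
    x: {y: (0.7 if y == SYMBOLS[(i + 1) % 4] else 0.1) for y in SYMBOLS}
    for i, x in enumerate(SYMBOLS)
}

# Assumed stationary distribution pi(X = x): every letter equally likely.
PI = {x: 0.25 for x in SYMBOLS}


def entropy_rate(pi, p):
    """Return -sum_x pi(x) sum_y P(y|x) log2 P(y|x), in bits per symbol."""
    return -sum(
        pi[x] * sum(p[x][y] * log2(p[x][y]) for y in p[x] if p[x][y] > 0)
        for x in pi
    )


H = entropy_rate(PI, P)
print(f"H = {H:.3f} bits")  # compare with the 2 bits of a uniform source
```
]]></preformat>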
<p>An efficient code for this Markov distribution might look like this:</p>
<table-wrap position="float">
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<tbody>
<tr>
<td valign="top" align="right"></td>
<td valign="top" align="center">A</td>
<td valign="top" align="center">B</td>
<td valign="top" align="center">C</td>
<td valign="top" align="center">D</td>
</tr>
<tr>
<td valign="top" align="right">A</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">110</td>
<td valign="top" align="center">111</td>
</tr>
<tr>
<td valign="top" align="right">B</td>
<td valign="top" align="center">111</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">110</td>
</tr>
<tr>
<td valign="top" align="right">C</td>
<td valign="top" align="center">110</td>
<td valign="top" align="center">111</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="right">D</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">110</td>
<td valign="top" align="center">111</td>
<td valign="top" align="center">10</td>
</tr>
<tr>
<td valign="top" align="right"></td>
<td valign="top" align="center"/><td valign="top" align="center"/><td valign="top" align="center"/><td valign="top" align="center"/></tr>
</tbody>
</table>
</table-wrap>
<p>Decoding proceeds row by row: if the previous symbol was <italic>A</italic> and we receive a 0, the row for <italic>A</italic> in the encoding matrix tells us that the next symbol must be <italic>B</italic>. If we then receive another 0, the row for <italic>B</italic> gives <italic>C</italic>. Here is an example sequence generated by this Markov process and its corresponding binary representation (assuming all messages start with <italic>A</italic>):</p>
<disp-formula id="S2.E9"><mml:math id="M9"><mml:mtable rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mtext>A</mml:mtext><mml:mo rspace="5.3pt">)</mml:mo></mml:mrow><mml:mi>BCBCDBABCCDCDA</mml:mi></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mn>00111001101110010011100</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
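<p>The encoding matrix and the example sequence above can be reproduced with a short Python sketch. The dictionary below transcribes the matrix with rows indexed by the previous symbol; the function names <monospace>encode</monospace> and <monospace>decode</monospace> are hypothetical, and the sketch assumes, as in the text, that every message starts from <italic>A</italic>.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# Codeword for each next symbol, indexed by the previous symbol
# (rows of the encoding matrix above).
CODE = {
    "A": {"A": "10", "B": "0", "C": "110", "D": "111"},
    "B": {"A": "111", "B": "10", "C": "0", "D": "110"},
    "C": {"A": "110", "B": "111", "C": "10", "D": "0"},
    "D": {"A": "0", "B": "110", "C": "111", "D": "10"},
}


def encode(message, start="A"):
    """Encode each symbol with the codeword selected by its predecessor."""
    bits, prev = [], start
    for sym in message:
        bits.append(CODE[prev][sym])
        prev = sym
    return "".join(bits)


def decode(bits, start="A"):
    """Invert encode(): grow a buffer until it matches a codeword in the current row."""
    out, prev, buf = [], start, ""
    for b in bits:
        buf += b
        inverse = {cw: sym for sym, cw in CODE[prev].items()}
        if buf in inverse:
            prev = inverse[buf]
            out.append(prev)
            buf = ""
    if buf:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)


message = "BCBCDBABCCDCDA"
bits = encode(message)
print(bits)
print(decode(bits) == message)
```
]]></preformat>
<p>Because every row of the matrix is a prefix code (no codeword is the beginning of another), the decoder never needs to look ahead: it can emit a symbol as soon as the buffered bits match a codeword.</p>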
<p>In the absence of noise, we can perfectly reconstruct the original message from its binary counterpart. The average, long-run per-symbol message length is:</p>
<disp-formula id="S2.E10"><mml:math id="M10"><mml:mrow><mml:mi>C</mml:mi><mml:mo>=</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>A</mml:mi><mml:mo>&#x007C;</mml:mo><mml:mi>A</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mn>2</mml:mn><mml:mo>+</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>B</mml:mi><mml:mo>&#x007C;</mml:mo><mml:mi>A</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>C</mml:mi><mml:mo>&#x007C;</mml:mo><mml:mi>A</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mn>3</mml:mn><mml:mo>+</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>&#x007C;</mml:mo><mml:mi>A</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mpadded width="+3.3pt"><mml:mn>3</mml:mn></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mpadded width="+5pt"><mml:mn>1.5</mml:mn></mml:mpadded><mml:mtext>bits</mml:mtext></mml:mrow></mml:math></disp-formula>
<p>Redundancy is therefore <italic>R</italic> = 0.096. There is still some room for improvement, but this <italic>predictive coding</italic> scheme is much more efficient than schemes that ignore the statistical dependence between neighboring elements (<xref ref-type="bibr" rid="B47">Elias, 1955</xref>; <xref ref-type="bibr" rid="B68">Huang and Rao, 2011</xref>; <xref ref-type="bibr" rid="B135">Spratling, 2017</xref>). As before, an efficient code uses more information to represent unexpected events, but inclusion of an ordinal (or temporal) relationship changes our interpretation. Now, our efficient code uses more information (more bits) to represent <italic>prediction errors</italic> relative to expectation: when <italic>D</italic> comes after <italic>C</italic>, we expect that and use only 1 bit. We do not expect <italic>A</italic> to follow <italic>C</italic> and use correspondingly more bits (3) when this occurs. This is precisely the predictive coding algorithm as originally proposed for efficient transmission of telecommunications data (<xref ref-type="bibr" rid="B47">Elias, 1955</xref>). <xref ref-type="fig" rid="F1">Figure 1</xref> shows how predictive coding might be used to efficiently transfer information, including in a neural environment.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p><bold>(A)</bold> This cartoon illustrates how data can be efficiently transferred between a source and receiver by using a predictive coder to first remove and then recover redundant information. Ideally, and in the absence of internal noise, data transmitted between source and receiver is fully compressed with no redundancy or predictability. While this diagram is based on efficient coding theory as formalized for data transfer and storage in telecommunication systems, the same principles may apply to neural circuits as well. <bold>(B)</bold> A simple model illustrating how a predictive coder could be implemented by a neural circuit. Output neurons (green) receive excitatory inputs and a delayed inhibitory input transformed by a weight matrix A (orange-to-green connections). Neurons at the output transmit the residual difference between the current input, x<sub><italic>t</italic></sub>, and the predicted input, Ax<sub>(<italic>t</italic>&#x2013;1)</sub>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-16-929348-g001.tif"/>
</fig>
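<p>The residual computation described for the circuit in Figure 1B can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a model of any particular circuit: the weight matrix A is reduced to a single scalar, the input is a simulated first-order autoregressive signal, and the coder is assumed to already know the correct weight.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import random
from statistics import pvariance

random.seed(0)

A = 0.9        # assumed scalar stand-in for the weight matrix A
N = 5000       # number of time steps to simulate

# Autocorrelated input: x_t = A * x_(t-1) + noise, a first-order autoregressive process.
x = [0.0]
for _ in range(N - 1):
    x.append(A * x[-1] + random.gauss(0.0, 1.0))

# Output of the predictive coder: current input minus the prediction A * x_(t-1).
residual = [x[t] - A * x[t - 1] for t in range(1, N)]

print(f"input variance:    {pvariance(x):.2f}")
print(f"residual variance: {pvariance(residual):.2f}")
```
]]></preformat>
<p>The residual carries only the unpredictable part of the input, so its variance is much smaller than that of the raw signal. The receiver can recover the input by adding its own prediction back to each residual.</p>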
<p>To conclude, it is worth noting that efficient coding becomes more complicated in the presence of internal noise, such as when bits randomly flip during transmission. If the original encoding scheme is optimal for a noiseless channel, flipped bits cause information to be lost at the receiving end. Such a code is no longer efficient by definition, so in the presence of noise, efficient codes ultimately represent a trade-off between minimizing the average message length and preserving redundancy to be robust to noise (<xref ref-type="bibr" rid="B2">Atick, 1992</xref>; <xref ref-type="bibr" rid="B3">Atick and Redlich, 1992</xref>; <xref ref-type="bibr" rid="B148">van Hateren, 1992</xref>; <xref ref-type="bibr" rid="B42">Dong and Atick, 1995</xref>; <xref ref-type="bibr" rid="B131">Simoncelli, 2003</xref>). Efficient coding is therefore typically formalized as a maximization of mutual information between sensory inputs and neural responses, rather than a minimization of redundancy (<xref ref-type="bibr" rid="B41">Doi et al., 2012</xref>). This encourages neural codes that have low redundancy, but also high discriminability of different stimuli with low trial-to-trial variability for the same stimulus. Many authors include an additional energy or metabolic constraint to encourage some form of sparseness in the ultimate solution (<xref ref-type="bibr" rid="B102">Olshausen and Field, 1996</xref>; <xref ref-type="bibr" rid="B29">Chalk et al., 2018</xref>; <xref ref-type="bibr" rid="B100">Ocko et al., 2018</xref>).</p>
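<p>The cost of bit flips can be made concrete with the simplest noisy channel, a binary symmetric channel that flips each transmitted bit independently with some probability. For equiprobable input bits, the mutual information between sent and received bits is one minus the binary entropy of the flip probability. The Python sketch below evaluates this standard result; the channel model and the function names are illustrative assumptions, not part of the models cited above.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a binary variable that equals 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def bsc_mutual_information(flip_prob):
    """Mutual information (bits per transmitted bit) for equiprobable inputs."""
    return 1.0 - binary_entropy(flip_prob)


for flip_prob in (0.0, 0.01, 0.1, 0.5):
    info = bsc_mutual_information(flip_prob)
    print(f"flip probability {flip_prob:4.2f}: {info:.3f} bits per transmitted bit")
```
]]></preformat>
<p>As the flip probability grows, each transmitted bit conveys less information, which is why a code with no redundancy is fragile once noise is present.</p>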
<sec id="S2.SS1">
<title>Lessons for Predictive Coding in the Nervous System</title>
<p>Lessons from this primer relevant to understanding temporal processing in the nervous system are: (1) Predictive coding is a direct consequence of efficient coding, particularly when applied to sequential or autocorrelated data. (2) A code that is efficient for one statistical distribution is inefficient for other distributions, implying that neural codes should be optimized for the natural environment and ought to adapt to changing environmental statistics to maintain efficiency. (3) To efficiently encode sequential data, it is necessary to learn sequence order and, if extended into the temporal domain, timing as well. Prediction emerges naturally as a consequence of efficient coding without requiring a separate representational framework.</p>
<p>The theory suggests that significant energy savings are possible. Achieving them, however, requires knowledge of the distribution of sensory inputs, ideally even the joint distribution of inputs and motor outputs, and the process of learning the relevant distributions often falls beyond the purview of traditional efficient coding theory. Neural structures supporting efficient coding can, in principle, develop generationally through evolution (<xref ref-type="bibr" rid="B161">Zador, 2019</xref>) or within a lifetime through unsupervised or self-supervised learning algorithms (<xref ref-type="bibr" rid="B101">Oja, 2002</xref>; <xref ref-type="bibr" rid="B67">Hosoya et al., 2005</xref>; <xref ref-type="bibr" rid="B147">van den Oord et al., 2018</xref>; <xref ref-type="bibr" rid="B7">Bakhtiari et al., 2021</xref>; <xref ref-type="bibr" rid="B163">Zhuang et al., 2021</xref>). It should be noted that there is some terminological ambiguity around the phrase &#x201C;predictive coding,&#x201D; which often refers to specific unsupervised learning algorithms, premised on various assumptions about neural structure and function (<xref ref-type="bibr" rid="B116">Rao and Ballard, 1999</xref>; <xref ref-type="bibr" rid="B52">Friston, 2005</xref>; <xref ref-type="bibr" rid="B68">Huang and Rao, 2011</xref>; <xref ref-type="bibr" rid="B135">Spratling, 2017</xref>; <xref ref-type="bibr" rid="B157">Whittington and Bogacz, 2019</xref>; <xref ref-type="bibr" rid="B59">Grossberg, 2021</xref>; <xref ref-type="bibr" rid="B93">Millidge et al., 2021</xref>). These do not necessarily comport with Shannon&#x2019;s efficient coding formalism. 
Further adding to the ambiguity, predictive processing is often described as a general principle of cortical function, encompassing development, learning, and efficient neural encoding (<xref ref-type="bibr" rid="B158">Wiskott and Sejnowski, 2002</xref>; <xref ref-type="bibr" rid="B19">Bialek et al., 2007</xref>; <xref ref-type="bibr" rid="B104">Palmer et al., 2015</xref>; <xref ref-type="bibr" rid="B76">Keller and Mrsic-Flogel, 2018</xref>; <xref ref-type="bibr" rid="B132">Singer et al., 2018</xref>; <xref ref-type="bibr" rid="B7">Bakhtiari et al., 2021</xref>). Here, we generally refer to predictive coding as a way to encode and compress data from certain distributions, rather than as a learning algorithm. To see how efficient coding relates to nervous system function, we now summarize how the concept has served our understanding of the retina.</p>
</sec>
</sec>
<sec id="S3">
<title>Efficient Coding in Retina and Thalamus</title>
<p>From an information-theoretic perspective, neural coding in retinal photoreceptors is inefficient because the statistics of natural visual scenes create activity patterns that are highly correlated in space and time. Any code that simply recapitulates this structure would be inefficient in space, energy, and resource utilization (<xref ref-type="bibr" rid="B77">Landauer, 1976</xref>; <xref ref-type="bibr" rid="B10">Balasubramanian and Sterling, 2009</xref>; <xref ref-type="bibr" rid="B137">Sterling and Laughlin, 2015</xref>). Such inefficiency would then be exacerbated by spiking RGCs, because spikes are particularly expensive in terms of energy consumption, and higher average firing rates require increasingly greater axonal volumes (<xref ref-type="bibr" rid="B80">Laughlin et al., 1998</xref>; <xref ref-type="bibr" rid="B6">Attwell and Laughlin, 2001</xref>; <xref ref-type="bibr" rid="B9">Balasubramanian et al., 2001</xref>; <xref ref-type="bibr" rid="B79">Laughlin, 2001</xref>; <xref ref-type="bibr" rid="B107">Perge et al., 2009</xref>). Space leaving the retina is very limited (in humans, there are two orders of magnitude fewer axons in the optic nerve, 10<sup>6</sup>, than there are photoreceptors, 10<sup>8</sup>), so a more efficient representation is advantageous. <xref ref-type="bibr" rid="B5">Attneave (1954)</xref> and <xref ref-type="bibr" rid="B12">Barlow (1961</xref>, <xref ref-type="bibr" rid="B13">1989)</xref> were the first to recognize this and apply Shannon&#x2019;s theory to neuroscience. Barlow proposed that RGCs encode and transmit visual information using a simple efficient coding heuristic: reduce redundancy by generating fewer action potentials for expected visual inputs and more spikes for unexpected ones. 
The efficient code is actually created <italic>via</italic> filtering operations in retinal circuits, which throttle firing rates and decrease redundancy by removing many of the input correlations imparted by natural scene statistics (<xref ref-type="bibr" rid="B3">Atick and Redlich, 1992</xref>; <xref ref-type="bibr" rid="B89">Meister and Berry, 1999</xref>; <xref ref-type="bibr" rid="B110">Pitkow and Meister, 2012</xref>).</p>
<p>Barlow&#x2019;s hypothesis appears to be approximately correct (see <xref ref-type="bibr" rid="B89">Meister and Berry, 1999</xref>; <xref ref-type="bibr" rid="B10">Balasubramanian and Sterling, 2009</xref>; <xref ref-type="bibr" rid="B68">Huang and Rao, 2011</xref> for reviews). RGCs are effectively linear predictive coders (<xref ref-type="bibr" rid="B47">Elias, 1955</xref>; <xref ref-type="bibr" rid="B136">Srinivasan et al., 1982</xref>; <xref ref-type="bibr" rid="B89">Meister and Berry, 1999</xref>; <xref ref-type="bibr" rid="B68">Huang and Rao, 2011</xref>; <xref ref-type="bibr" rid="B41">Doi et al., 2012</xref>) with receptive fields resembling those predicted by efficient coding models (<xref ref-type="bibr" rid="B3">Atick and Redlich, 1992</xref>; <xref ref-type="bibr" rid="B41">Doi et al., 2012</xref>; <xref ref-type="bibr" rid="B100">Ocko et al., 2018</xref>). In the temporal domain, linear RGC receptive field filters are biphasic and compute the difference between recent past and present. Spatially, RGCs have a difference-of-Gaussians organization that compares luminance between center and surround regions. In both cases, RGCs can be understood to predict correlations, responding minimally when they are present (e.g., when luminance patterns are constant in time or uniform across the spatial extent of the receptive field) and responding maximally to expectation violations (e.g., when luminance changes rapidly in space or time). These retinal filters have even been shown to adapt over rapid timescales to more efficiently encode visual information from novel distributions (<xref ref-type="bibr" rid="B67">Hosoya et al., 2005</xref>). 
Beyond the retina, relay neurons in the dLGN continue this process, especially in the temporal domain (<xref ref-type="bibr" rid="B123">Saul and Humphrey, 1990</xref>; <xref ref-type="bibr" rid="B62">Hartveit, 1992</xref>; <xref ref-type="bibr" rid="B42">Dong and Atick, 1995</xref>; <xref ref-type="bibr" rid="B37">Dan et al., 1996</xref>) where whitening occurs in a manner consistent with the efficient coding hypothesis (<xref ref-type="bibr" rid="B42">Dong and Atick, 1995</xref>; <xref ref-type="bibr" rid="B37">Dan et al., 1996</xref>). For example, the efficient coding model of Dong and Atick explains the existence of lagged and non-lagged dLGN relay neurons, as observed in physiological data (<xref ref-type="bibr" rid="B62">Hartveit, 1992</xref>).</p>
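<p>The spatial half of this argument can be illustrated with a one-dimensional difference-of-Gaussians filter. The Python sketch below is a toy example under stated assumptions: the center and surround widths are arbitrary, both Gaussians are normalized to unit sum, and the function names are hypothetical. It compares the filter output for a uniform luminance field with the output at a luminance step.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
from math import exp


def gaussian_kernel(sigma, radius):
    """Discrete Gaussian on [-radius, radius], normalized to sum to 1."""
    w = [exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]


RADIUS = 12
# Assumed widths: narrow center, broad surround (arbitrary units of position).
center = gaussian_kernel(1.0, RADIUS)
surround = gaussian_kernel(3.0, RADIUS)
dog = [c - s for c, s in zip(center, surround)]  # difference of Gaussians


def response(luminance, position):
    """Filter output for a receptive field centered on `position`."""
    window = luminance[position - RADIUS: position + RADIUS + 1]
    return sum(w * v for w, v in zip(dog, window))


uniform = [1.0] * 100                # constant luminance everywhere
edge = [0.0] * 50 + [1.0] * 50       # dark-to-light step at position 50

print(f"uniform field: {response(uniform, 50):+.3f}")
print(f"at the edge:   {response(edge, 50):+.3f}")
```
]]></preformat>
<p>Because the center and surround weights cancel, the filter is silent for uniform luminance and responds only where luminance departs from what the surround predicts.</p>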
<p>Despite this evidence, it is not clear that the same principles are sufficient to explain the early visual system in all its complexity. One common critique is that information theory derives from the general principle that all information is created equal. In the real world, some sources of information are more relevant to an organism than others. Frogs, for example, are better served by a visual system evolutionarily tuned to detect and locate flies than by an abstract requirement to efficiently compress all visual information equally.</p>
<p>Several groups have proposed variations on efficient coding that call for more nuanced perspectives by taking messy biological imperatives and constraints into account. <xref ref-type="bibr" rid="B10">Balasubramanian and Sterling (2009)</xref> and <xref ref-type="bibr" rid="B137">Sterling and Laughlin (2015)</xref>, for example, argue that it is advantageous to minimize per-bit computational costs, even while acknowledging that some information sources may be relatively privileged due to their ethological importance. These researchers and others have gone to great lengths to measure the computational cost of information transmission in terms of quantities like axonal volume and ATP consumption. Their more holistic approach leads to a variety of predictions regarding the structure and function of the retina that are well-supported by empirical data (<xref ref-type="bibr" rid="B21">Borghuis et al., 2008</xref>; <xref ref-type="bibr" rid="B10">Balasubramanian and Sterling, 2009</xref>; <xref ref-type="bibr" rid="B107">Perge et al., 2009</xref>; <xref ref-type="bibr" rid="B57">Gjorgjieva et al., 2014</xref>).</p>
<p>Another variation, proposed by <xref ref-type="bibr" rid="B162">Zhao et al. (2012)</xref>, <xref ref-type="bibr" rid="B143">Teuli&#x00E8;re et al. (2015)</xref>, <xref ref-type="bibr" rid="B83">Lelais et al. (2019)</xref>, and <xref ref-type="bibr" rid="B44">Eckmann et al. (2020)</xref>, is known as <italic>active efficient coding</italic> and is built around an interesting observation: since environmental statistics are partially governed by the animal&#x2019;s own behavior, changing behavior can make a neural code more or less efficient. This idea leads to a variety of empirical predictions regarding the relationship between sensation and action. <xref ref-type="bibr" rid="B143">Teuli&#x00E8;re et al. (2015)</xref> have shown, for example, that both vergence and smooth pursuit eye movements can be learned <italic>de novo</italic> in artificial systems that optimize coding efficiency by simultaneously adjusting both the neural representation and eye movements (<xref ref-type="bibr" rid="B162">Zhao et al., 2012</xref>; <xref ref-type="bibr" rid="B83">Lelais et al., 2019</xref>).</p>
<p>Finally, and particularly relevant to our focus on time, is a hypothesis proposed by <xref ref-type="bibr" rid="B19">Bialek et al. (2007)</xref>. Efficient codes compress data down to Shannon&#x2019;s source coding limit. Beyond that limit, rate-distortion theory provides a principled way to select certain information for deletion. Inspired by this idea and a related framework known as the information bottleneck (<xref ref-type="bibr" rid="B144">Tishby et al., 2000</xref>), Bialek et al. suggested that sensory systems preferentially delete information about the past. They reasoned that predictive information (about the future) is uniquely useful for action and decision-making, and should therefore be prioritized. Such predictive information, which is inconsistent with traditional models of visual processing, has been observed in the early visual system of various species (<xref ref-type="bibr" rid="B104">Palmer et al., 2015</xref>; <xref ref-type="bibr" rid="B121">Salisbury and Palmer, 2015</xref>; <xref ref-type="bibr" rid="B119">Sachdeva et al., 2021</xref>; <xref ref-type="bibr" rid="B155">Wang et al., 2021</xref>).</p>
<p>While the retina and dLGN provide some of the best support for the idea that neural circuits are shaped by environmental statistics to efficiently encode information, there is evidence that similar considerations may hold in other visual areas as well.</p>
</sec>
<sec id="S4">
<title>Efficient Coding in Primary Visual Cortex</title>
<p>Information sources that exhibit statistical dependences across space and time can be compressed by efficient encoding schemes. In the visual system, part of this process occurs in the retina and dLGN, which perform spatial and temporal decorrelation relative to the statistical structure of natural visual inputs (<xref ref-type="bibr" rid="B136">Srinivasan et al., 1982</xref>; <xref ref-type="bibr" rid="B42">Dong and Atick, 1995</xref>; <xref ref-type="bibr" rid="B37">Dan et al., 1996</xref>; <xref ref-type="bibr" rid="B67">Hosoya et al., 2005</xref>; <xref ref-type="bibr" rid="B68">Huang and Rao, 2011</xref>; <xref ref-type="bibr" rid="B110">Pitkow and Meister, 2012</xref>). However, temporal decorrelation appears limited to brief timescales, on the order of &#x223C;30&#x2013;300 ms, and processing is largely linear, subtracting the mean and removing pairwise correlations (though see, for example, <xref ref-type="bibr" rid="B104">Palmer et al., 2015</xref> for more complex processing in the retina). Longer-timescale and higher-order correlations present in natural visual inputs survive the initial processing stages. This suggests that information reaching V1 is still inefficiently encoded. Regardless of the ultimate computational goal or task, an inefficient representation is in general more energetically costly and more difficult for downstream regions to process, as argued by <xref ref-type="bibr" rid="B11">Barlow (1990)</xref>. V1 may therefore construct a more efficient representation, acting on longer timescales and reducing higher-order correlations. Examples of higher-order correlations in natural vision are edges in the spatial domain (representing correlations between spatially adjacent center-surround receptive fields) and brief trajectories in the temporal domain [which V1 can rapidly learn to predict (<xref ref-type="bibr" rid="B159">Xu et al., 2012</xref>)]. 
Canonical V1 receptive fields can be understood to operate on both forms of correlation in an efficient coding sense. &#x201C;Edge detectors&#x201D; in V1 eliminate spatial correlations found in natural scenes by integrating information across multiple dLGN inputs (<xref ref-type="bibr" rid="B102">Olshausen and Field, 1996</xref>; <xref ref-type="bibr" rid="B16">Bell and Sejnowski, 1997</xref>). V1 neurons with &#x201C;space-time inseparable,&#x201D; or direction-selective receptive fields, similarly eliminate higher-order correlations associated with motion trajectories.</p>
<p>Predictive coding models consistent with efficient coding have been used to explain both classical and extra-classical receptive field properties of V1 neurons, especially in the spatial domain (<xref ref-type="bibr" rid="B134">Spratling, 2010</xref>; <xref ref-type="bibr" rid="B68">Huang and Rao, 2011</xref>; <xref ref-type="bibr" rid="B76">Keller and Mrsic-Flogel, 2018</xref>). The well-known <xref ref-type="bibr" rid="B116">Rao and Ballard (1999)</xref> hierarchical predictive coding model learns V1-like receptive fields when trained on natural scenes, showing both classical Gabor-like spatial structure and extra-classical effects like end stopping (neural firing evoked by elongated bars increases with bar length up to some critical length, beyond which firing rapidly decreases). Cells with this property were originally called hypercomplex by <xref ref-type="bibr" rid="B69">Hubel and Wiesel (1965)</xref>, but are now termed end stopped (<xref ref-type="bibr" rid="B56">Gilbert, 1977</xref>). Such <italic>contextual effects</italic> are often consistent with efficient coding models but difficult to reconcile with strictly feedforward models of visual processing (<xref ref-type="bibr" rid="B126">Schwartz et al., 2007</xref>; <xref ref-type="bibr" rid="B68">Huang and Rao, 2011</xref>; <xref ref-type="bibr" rid="B27">Carandini and Heeger, 2012</xref>; but see <xref ref-type="bibr" rid="B114">Priebe and Ferster, 2012</xref> for an alternative explanation of some contextual effects).</p>
<p>In the following sections, we review evidence for temporal efficient coding in V1, especially focusing on data from rodents.</p>
<sec id="S4.SS1">
<title>Visual Flow</title>
<p>Visual flow caused by self-motion is responsible for large amounts of neural activity in the early visual system. Given the canonical properties of neurons in V1 (acting as edge detectors, responding to increments or decrements of light but not generally to steady-state luminance sources, and showing direction selectivity), an animal&#x2019;s natural motion through the environment, along with associated body and head-orienting movements, ought to evoke significant firing. In addition to the purely visual information evoked by such behaviors, body movements associated with locomotion and head orienting also evoke activity in rodent V1 (<xref ref-type="bibr" rid="B150">Vinck et al., 2015</xref>; <xref ref-type="bibr" rid="B141">Stringer et al., 2019</xref>; <xref ref-type="bibr" rid="B60">Guitchounts et al., 2020</xref>; <xref ref-type="bibr" rid="B106">Parker et al., 2022</xref>). This activity persists even in the dark, and may represent corollary discharge signals from motor areas or perhaps even predictions of the sensory consequences of movement (<xref ref-type="bibr" rid="B78">Lappe et al., 1999</xref>; <xref ref-type="bibr" rid="B127">Shadmehr et al., 2010</xref>; <xref ref-type="bibr" rid="B82">Leinweber et al., 2017</xref>; <xref ref-type="bibr" rid="B124">Sawtell, 2017</xref>). The predictable spatiotemporal correlations created by visual flow, head movements, and locomotion therefore make these strong candidates for efficient predictive coding.</p>
<p>There are at least two different ways in which efficient coding may shape cortical responses to natural visual flow. The first is by forming compressed representations of flow-like inputs by eliminating statistically predictable dependences between neighboring moments in time (as in our example discussing efficient encoding of Markov chains). Second, though related, external motion relative to the animal creates an unexpected visual input with respect to locally predicted visual flow. In both cases, efficient representations would be expected to generate relatively larger responses when unexpected flow patterns violate spatiotemporal predictions (<xref ref-type="fig" rid="F2">Figure 2</xref>). Even if the system does not take advantage of such a mechanism to conserve energy, it could still benefit from knowing the distribution of flow signals, which necessarily involves some ability to predict the future.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Movement in the direction shown causes objects in the visual field to progress along predictable trajectories. Right: A tree that is in the distance at time <italic>t</italic><sub>1</sub> becomes larger and moves toward the right at later time points. Efficient coding suggests that the visual response resulting from this expected apparent motion in visual space ought to be relatively small. Left: If this progression were scrambled in time, the resulting &#x201C;unexpected optic flow&#x201D; would cause the same visual images to produce relatively larger neural responses signifying an expectation violation.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-16-929348-g002.tif"/>
</fig>
<p>A series of papers by Keller and colleagues has extensively studied visual flow in the context of predictive coding (<xref ref-type="bibr" rid="B75">Keller et al., 2012</xref>; <xref ref-type="bibr" rid="B164">Zmarz and Keller, 2016</xref>; <xref ref-type="bibr" rid="B4">Attinger et al., 2017</xref>; <xref ref-type="bibr" rid="B82">Leinweber et al., 2017</xref>; <xref ref-type="bibr" rid="B71">Jordan and Keller, 2020</xref>). They first looked at dense, topographically organized projections from secondary motor cortex to V1 layer 2/3, finding functional evidence for corollary discharge and visual flow feedback transmitted to V1 (<xref ref-type="bibr" rid="B82">Leinweber et al., 2017</xref>). They also discovered <italic>mismatch receptive fields</italic> in V1, which seemed to signal prediction errors relative to the expected visual flow: about 10&#x2013;20% of recorded neurons were tuned to the properties of visual flow perturbations, such as halts in video playback during active, coupled locomotion (<xref ref-type="bibr" rid="B164">Zmarz and Keller, 2016</xref>). In the latter study, mice were trained to navigate a virtual reality environment, with their movement on a spherical treadmill controlling motion in the environment. This closed-loop coupling between the animal&#x2019;s motion and its perceptual experience was crucial to the generation of mismatch signals (<xref ref-type="bibr" rid="B75">Keller et al., 2012</xref>; <xref ref-type="bibr" rid="B149">Vasilevskaya et al., 2022</xref>).</p>
<p>A more recent paper questioned these results, however, arguing that mismatch signals could be explained by canonical V1 response properties such as locomotion gain, and orientation or direction selectivity (<xref ref-type="bibr" rid="B97">Muzzu and Saleem, 2021</xref>). The authors presented mice with visual-flow-mimicking drifting gratings that randomly halted. A subset of neurons responded robustly to the perturbations. Responses were enhanced by locomotion and congruent with the neurons&#x2019; orientation selectivity. Muzzu and Saleem also reasoned that a mouse&#x2019;s tendency to move forward would, under efficient coding, establish a preference for front-to-back optic flow, but they found no such preference. While this result is interesting and suggestive of further experimentation, it is not necessarily conclusive. Most importantly, the experiment was performed in open loop with drifting gratings, quite distinct from closed-loop natural visual flow inputs. Predictive and efficient coding models predict that violations of expected visual flow will generate mismatch or error signals, based on the statistics of the <italic>natural environment</italic>. It is very difficult to know how the system ought to respond to non-natural visual flow inputs, especially when those are decoupled from the animal&#x2019;s movement. As argued in a recent rebuttal to the Muzzu and Saleem paper from the Keller lab (<xref ref-type="bibr" rid="B149">Vasilevskaya et al., 2022</xref>), closed-loop coupling between locomotion and visual flow is crucial: responses to coupled perturbations (termed mismatches) were at least twice as large as responses during yoked open-loop perturbations (after controlling for locomotion speed). This difference between closed- and open-loop perturbations was absent in mice raised in an environment with no visuomotor coupling (<xref ref-type="bibr" rid="B4">Attinger et al., 2017</xref>). 
Furthermore, visual inputs during complex behaviors, for example rearing and turns, occur in all directions and so a preference for front-to-back visual flow is not necessarily expected.</p>
<p>As this example suggests, tests of efficient and predictive coding should be performed with stimuli matched to the animal&#x2019;s natural environment, for example with a mouse freely exploring a nest or grassy field, or after sufficient training for the system to have learned the statistics of an unnatural environment (assuming such learning is possible). Indeed, a recent large-scale survey of neural activity across the mouse visual system showed substantial differences in neural tuning properties and overall activity in response to different types of visual input (<xref ref-type="bibr" rid="B38">De Vries et al., 2020</xref>). Many cells were unresponsive to entire classes of visual stimuli, such as natural scenes, while responding robustly to other classes, like drifting gratings. A related example comes from a study in Michael Stryker&#x2019;s lab (<xref ref-type="bibr" rid="B43">Dyballa et al., 2018</xref>). Dyballa et al. analyzed the responses of V1 neurons to flow-like videos designed to imitate a mouse&#x2019;s motion through grass and found robust visual responsiveness to spatial frequencies as high as about 1.5 cycles per degree, significantly greater than traditional visual acuity estimates of &#x223C;0.5 cycles per degree measured using sinusoidal gratings (<xref ref-type="bibr" rid="B112">Porciatti et al., 1999</xref>; <xref ref-type="bibr" rid="B115">Prusky et al., 2000</xref>; <xref ref-type="bibr" rid="B98">Niell and Stryker, 2008</xref>). This difference may represent an <italic>in vivo</italic> demonstration of the principle that codes are efficient only for the statistical environment to which they are matched. For future studies of efficient coding, experiments like those performed in the Keller and Stryker labs may provide a good compromise between experimental tractability and naturalistic stimuli and behavior.</p>
</sec>
<sec id="S4.SS2">
<title>Sequential Visual Data</title>
<p>Real-world information streams exhibit both ordinal and temporal statistical dependences. A dancer might observe the continuous sequence of body movements required to perform a routine, or a driver might learn the discrete order and timing of turns along a route. Any accurate model of these streams must describe <italic>how often</italic> different elements occur, <italic>what order</italic> they follow, and <italic>when</italic> they occur relative to each other. If the resulting codes are efficient, unexpected stimuli should evoke excess activity, or prediction errors, relative to expected stimuli. Depending on the nature of the encoding, prediction errors could be elicited by unexpected elements introduced to a sequence, expected elements rearranged within the sequence or omitted altogether, or expected elements presented at unexpected times. The literature contains a variety of terms describing these effects, including surprise-related enhancement, mismatch negativity, and prediction error. Ideally, responses would scale as the negative log of event probability (the surprisal), &#x2212;log[<italic>P</italic>(<italic>X</italic>)] (<xref ref-type="bibr" rid="B128">Shannon, 1948</xref>; <xref ref-type="bibr" rid="B32">Cover and Thomas, 2006a</xref>; <xref ref-type="bibr" rid="B140">Stone, 2018</xref>), though this will ultimately depend on the details of the nervous system&#x2019;s model of its natural environment and the ability of a non-negative, discrete signal (spikes) to encode probabilities. Note that contrived experimental sequential stimuli may evoke prediction errors, but only if the system has previously learned the frequency, order, and/or timing of those sequences.</p>
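<p>As a minimal numerical sketch of the &#x2212;log[<italic>P</italic>(<italic>X</italic>)] scaling described above, the following Python snippet computes the information content, in bits, of two events. The probabilities (0.9 and 0.1), the event labels, and the function name <monospace>surprisal_bits</monospace> are illustrative assumptions rather than values taken from any study.</p>
<preformat>
```python
import math

def surprisal_bits(p):
    """Shannon surprisal, -log2(p), in bits, for an event of probability p."""
    if not (p > 0.0 and 1.0 >= p):
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)

# Hypothetical event probabilities (illustrative assumption, not data).
probabilities = {"expected": 0.9, "unexpected": 0.1}
surprisal = {name: surprisal_bits(p) for name, p in probabilities.items()}

# Entropy is the probability-weighted average surprisal of the source.
entropy_bits = sum(p * surprisal[name] for name, p in probabilities.items())

for name, bits in surprisal.items():
    print(f"{name}: {bits:.3f} bits")
print(f"entropy: {entropy_bits:.3f} bits")
```
</preformat>
<p>Under these assumed probabilities, the rare event carries roughly 3.3 bits and the common event roughly 0.15 bits, so a response that tracked surprisal would be about twenty times larger for the unexpected stimulus. Whether spiking responses can follow such a scale is an empirical question.</p>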
<p>Experimental sequences of discrete stimuli are usually designed to be predictable, sometimes obeying a Markov chain, and have provided evidence to support efficient coding models. <xref ref-type="bibr" rid="B151">Vinken et al. (2017)</xref> recorded single-unit responses in rat V1 and latero-intermediate area (LI) to random sequences of standard (90%) and oddball (10%) images. In this experiment, sequence order was irrelevant, so the thing to be learned was element frequency. In both areas, responses to standard elements were suppressed in a manner consistent with known adaptation mechanisms, but oddball elements drove significantly greater responses than control elements only in area LI. This effect was difficult to explain through adaptation. No such oddball response was observed in a similar experiment performed in monkey inferotemporal (IT) cortex (<xref ref-type="bibr" rid="B72">Kaliukhovich and Vogels, 2014</xref>). However, both studies exposed animals to sequences only during individual recording sessions. When monkeys were passively exposed to sequences containing ordinal information for much longer periods of time, there was evidence in IT for an enhanced response to order-violating stimuli (<xref ref-type="bibr" rid="B92">Meyer and Olson, 2011</xref>; <xref ref-type="bibr" rid="B96">Muckli et al., 2020</xref>). In other sensory modalities, especially audition, similar effects have been observed (<xref ref-type="bibr" rid="B118">Rubin et al., 2016</xref>; <xref ref-type="bibr" rid="B63">Heilbron and Chait, 2018</xref>; <xref ref-type="bibr" rid="B85">Maheu et al., 2019</xref>; <xref ref-type="bibr" rid="B39">Denham and Winkler, 2020</xref>). Given the more strictly temporal nature of auditory information, models in that modality may provide a source of inspiration for studies of visual temporal processing.</p>
<p>Many related studies have shown evidence for cortical novelty responses across brain regions, with wide variation in the effort to control for adaptation (<xref ref-type="bibr" rid="B73">Kato et al., 2015</xref>; <xref ref-type="bibr" rid="B86">Makino and Komiyama, 2015</xref>; <xref ref-type="bibr" rid="B53">Garrett et al., 2020</xref>; <xref ref-type="bibr" rid="B111">Poort et al., 2021</xref>; <xref ref-type="bibr" rid="B125">Schulz et al., 2021</xref>; <xref ref-type="bibr" rid="B66">Homann et al., 2022</xref>; <xref ref-type="bibr" rid="B95">Montgomery et al., 2022</xref>). In most cases, consistent with efficient coding, novel stimuli evoke more spiking activity than familiar stimuli. However, some studies have reported that familiar stimuli drive larger responses than novel stimuli. An example of this was seen in V1, where <xref ref-type="bibr" rid="B55">Gavornik and Bear (2014)</xref> repeatedly presented mice with a sequence of rapidly flashed sinusoidal gratings. Over a learning period of 5 days, the magnitude of visually evoked potentials increased dramatically in response to the trained sequence. Sequences violating trained expectations (including novel order, novel timing, and omitted elements) elicited responses that could be interpreted as error signals, but these were smaller than responses to the trained sequence. This result could reflect the fact that local field potentials largely represent synaptic currents in the dendrites (i.e., inputs) rather than neural spiking (i.e., outputs; <xref ref-type="bibr" rid="B74">Katzner et al., 2009</xref>). In more recent experiments in our lab, we have found that expectation-violating stimuli tend to elicit more spiking activity (<xref ref-type="bibr" rid="B113">Price et al., 2022</xref>). 
In addition to this unsupervised learning paradigm, there is also evidence for timing information in V1 following reinforcement learning (<xref ref-type="bibr" rid="B129">Shuler and Bear, 2006</xref>; <xref ref-type="bibr" rid="B61">Hangya and Kepecs, 2015</xref>; <xref ref-type="bibr" rid="B84">Levy et al., 2017</xref>). Interestingly, there is evidence that both the Gavornik and Bear sequence learning and Shuler and Bear reward timing paradigms (<xref ref-type="bibr" rid="B30">Chubykin et al., 2013</xref>) require cholinergic signaling, suggesting that this neurotransmitter may be uniquely required for plasticity that encodes temporal expectations into cortical circuits.</p>
<p>Other studies have found evidence for ordinal or temporal information in V1 using continuous-time stimuli (rather than discrete as above). One recent series of papers on &#x201C;perceptual straightening&#x201D; is particularly relevant to addressing whether the cortex produces efficient codes of spatiotemporal information (<xref ref-type="bibr" rid="B65">H&#x00E9;naff et al., 2019</xref>, <xref ref-type="bibr" rid="B64">2021</xref>). The authors argue that prediction is a fundamental cortical computation, and that it is easier to make predictions in V1 if the complex spike trajectories generated by the retina are &#x201C;straightened&#x201D; so that they evolve according to more-nearly linear dynamics. They find evidence for straightening in both human psychophysics experiments and macaque V1 (though the monkey data were recorded under anesthesia). Another study found that navigation within a virtual environment creates responses in mouse V1 that are increasingly predictive of upcoming stimuli, such that omissions of expected stimuli drive high activity (<xref ref-type="bibr" rid="B51">Fiser et al., 2016</xref>). This work ties into recent evidence for a strong functional relationship between V1 and hippocampus in the mouse, with V1 showing spatial modulation in virtual environments consistent with the hippocampal representation of space (<xref ref-type="bibr" rid="B120">Saleem et al., 2018</xref>; <xref ref-type="bibr" rid="B40">Diamanti et al., 2021</xref>). Interestingly, the Gavornik and Bear result was recently shown to require an intact hippocampus for plasticity induction (<xref ref-type="bibr" rid="B50">Finnie et al., 2021</xref>). Overall, these results blur the distinction between visual coding and memory and illustrate how difficult it is to establish an experimental paradigm that tests the efficient coding hypothesis in cortex based on visual inputs alone.</p>
</sec>
<sec id="S4.SS3">
<title>Adaptation</title>
<p>Adaptation describes the time-varying behavior of neurons as they adjust their firing properties to changes in environmental statistics. A classic example is the change in dynamic range of retinal photoreceptors in response to changes in overall light intensity (<xref ref-type="bibr" rid="B99">Normann and Perlman, 1979</xref>; <xref ref-type="bibr" rid="B27">Carandini and Heeger, 2012</xref>). After adapting to a dark environment, photoreceptor responses saturate at daytime light intensities and are thereby rendered temporarily unable to transmit information at these higher intensities (<xref ref-type="bibr" rid="B99">Normann and Perlman, 1979</xref>). This is precisely the behavior predicted by efficient coding, since it allows neurons to maximize information throughput under changing conditions. Comparable effects have also been observed in blowfly H1 neurons under a variety of experimental conditions, where adaptation provably maximizes information transmission (<xref ref-type="bibr" rid="B22">Brenner et al., 2000</xref>; <xref ref-type="bibr" rid="B48">Fairhall et al., 2001</xref>). Adaptation, at least in the very early visual system, can therefore be understood as a consequence of efficient coding principles (<xref ref-type="bibr" rid="B14">Barlow and F&#x00F6;ldi&#x00E1;k, 1989</xref>; <xref ref-type="bibr" rid="B89">Meister and Berry, 1999</xref>; <xref ref-type="bibr" rid="B153">Wainwright, 1999</xref>; <xref ref-type="bibr" rid="B156">Weber et al., 2019</xref>).</p>
<p>Natural scenes are non-stationary and dynamic at various timescales. In the context of this review, we are primarily interested in how patterns in the temporal domain create probabilistic dependences between different moments in time and how these can be used to make predictions. Efficient coding suggests neural circuits should learn these dependences and remove them from the neural code, creating time-invariant representations of objects and other environmental features and allowing for prediction of future states. In this framing, adaptation can be a confound to tests of efficient or predictive coding.</p>
<p>As a simple illustration, consider an experiment that presents a sequence of two visual stimuli where each element presentation is separated by 100 ms, for example AAABABBBBBAAB. Due to synaptic depression and other known forms of adaptation (<xref ref-type="bibr" rid="B26">Carandini and Ferster, 1997</xref>; <xref ref-type="bibr" rid="B31">Chung et al., 2002</xref>; <xref ref-type="bibr" rid="B20">Blitz et al., 2004</xref>), cortical responses to one particular stimulus will often decrease with repeated presentations, e.g., BBBBB. The response to A following this run will be greater not only than the last response to B but also than the average response to A (which includes diminished responses from runs of A&#x2019;s). Paired-pulse-like facilitation may also occur for patterns like AA or ABA, creating even more complex response profiles. The wide range of adaptation-like mechanisms observed in neural tissue therefore presents an obvious challenge to tests of efficient and predictive coding, because adaptation does not seem to require knowledge of an underlying probability distribution. At the same time, we know adaptation can serve the principle of efficient coding, as described above and outlined in an excellent review (<xref ref-type="bibr" rid="B156">Weber et al., 2019</xref>). Adaptation may therefore provide a mechanistic implementation of efficient coding for certain stimulus distributions, without the need for any form of long-term plasticity to encode temporal relationships (see <xref ref-type="fig" rid="F3">Figure 3</xref>). This may be demonstrated by a recent paper showing that novel stimuli presented within repeatable sequences evoke excess activity, as predicted by efficient coding (<xref ref-type="bibr" rid="B66">Homann et al., 2022</xref>). The proposed mechanism, however, was consistent with a straightforward adaptation model. 
In certain cases, it is possible to dissociate predictive coding from adaptation (<xref ref-type="bibr" rid="B142">Tang et al., 2018</xref>), though not all adaptive mechanisms are known or understood, and empirical predictions for predictive and efficient coding in cortical circuitry are not as well developed as they have been in the retina and dLGN.</p>
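<p>The two-stimulus illustration above (AAABABBBBBAAB) can be sketched with a toy depression model. The Python snippet below is a hedged illustration only: the function name <monospace>simulate_adaptation</monospace>, the multiplicative depletion rule, and the parameter values (<monospace>depletion</monospace> = 0.4, <monospace>recovery</monospace> = 0.2) are assumptions chosen for clarity, not a model taken from the cited studies. Each stimulus is assumed to drive its own population, whose response is proportional to an available resource that is depleted by each presentation and recovers between presentations.</p>
<preformat>
```python
def simulate_adaptation(sequence, depletion=0.4, recovery=0.2):
    """Return the response to each element under a toy depression model.

    Each stimulus identity drives its own population, whose available
    resource (between 0 and 1) sets the response amplitude.
    """
    resource = {stim: 1.0 for stim in set(sequence)}
    responses = []
    for stim in sequence:
        responses.append(resource[stim])       # response scales with resource
        resource[stim] *= 1.0 - depletion      # the driven population depresses
        for key in resource:                   # every population partly recovers
            resource[key] += recovery * (1.0 - resource[key])
    return responses

sequence = "AAABABBBBBAAB"
responses = simulate_adaptation(sequence)

last_b_in_run = responses[9]    # fifth B of the BBBBB run
a_after_b_run = responses[10]   # the A that follows BBBBB
a_responses = [r for s, r in zip(sequence, responses) if s == "A"]
mean_a = sum(a_responses) / len(a_responses)

print(f"last B in run:       {last_b_in_run:.3f}")
print(f"A after BBBBB:       {a_after_b_run:.3f}")
print(f"mean response to A:  {mean_a:.3f}")
```
</preformat>
<p>With these assumed parameters, the response to A after the run of B&#x2019;s exceeds both the last response to B and the mean response to A, even though the model stores no information about stimulus probabilities. This is the sense in which an apparent prediction error can arise from adaptation alone.</p>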
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Neural adaptation caused by repetition of a single stimulus over a short period of time can cause neural responses to decrease in magnitude. Relatively large responses, as when A follows BBBBB, can be interpreted as signifying either a prediction violation or a simple lack of adaptation in the population of neurons selective for A. Depending on the input statistics, these responses could be efficient, as expected stimuli (assuming we expect repeats) are represented with less activity than unexpected stimuli. For this reason, it is not always clear if adaptation is a confound to studying efficient temporal coding, a mechanism implementing it, or some mixture of the two.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fncom-16-929348-g003.tif"/>
</fig>
</sec>
</sec>
<sec id="S5">
<title>Criticisms</title>
<p>Though efficient coding and information theory are clearly relevant to neural computation, there is much debate regarding the extent to which these ideas explain what we see in the nervous system, especially beyond the retina and dLGN. We will walk through a point-counterpoint debate that emphasizes three prominent criticisms of efficient and predictive coding as theories of cortical function (see <xref ref-type="bibr" rid="B131">Simoncelli, 2003</xref> for a complementary perspective). (1) The massive expansion in the number of neurons from dLGN to V1 would appear to increase redundancy, contrary to the efficient coding hypothesis. (2) There is functional evidence that contradicts predictive coding theory. (3) Efficient coding makes very precise, testable predictions in certain contexts, but, in general, information-theoretic measures are very difficult, if not impossible, to estimate (<xref ref-type="bibr" rid="B105">Paninski, 2003</xref>) making the overall utility of applying efficient coding to the cortex unclear.</p>
<p>The first criticism is suggestive of the inherent difficulty in translating information-theoretic ideas to the nervous system. Unlike a telephone system, where engineers need only concern themselves with transferring data efficiently, cortical circuits must both encode information and operate on it. For computer hard drives, efficiency is defined by minimizing the average number of bits utilized while preserving information from known sources of noise. By comparison, efficiency in the nervous system might mean minimizing the average number of spikes per second while preserving information, or maximizing the average number of bits per molecule of ATP, or maximizing bits per volume of axon (<xref ref-type="bibr" rid="B80">Laughlin et al., 1998</xref>; <xref ref-type="bibr" rid="B10">Balasubramanian and Sterling, 2009</xref>; <xref ref-type="bibr" rid="B137">Sterling and Laughlin, 2015</xref>; <xref ref-type="bibr" rid="B140">Stone, 2018</xref>). In the retina, <xref ref-type="bibr" rid="B10">Balasubramanian and Sterling (2009)</xref> proposed the following instantiation of efficient coding theory: &#x201C;Given the information required for behavior, the retina minimizes its computational costs.&#x201D; By precisely measuring the metabolic and computational costs of information processing under certain conditions, these researchers and others (<xref ref-type="bibr" rid="B80">Laughlin et al., 1998</xref>) have found that each additional bit of information transmitted along the optic nerve requires correspondingly more neural resources, space, and energy, creating a dramatic law of diminishing returns (<xref ref-type="bibr" rid="B107">Perge et al., 2009</xref>). 
This explains many of the functional properties of the retina, including its differentiation into multiple parallel processing streams (<xref ref-type="bibr" rid="B21">Borghuis et al., 2008</xref>; <xref ref-type="bibr" rid="B10">Balasubramanian and Sterling, 2009</xref>; <xref ref-type="bibr" rid="B57">Gjorgjieva et al., 2014</xref>; <xref ref-type="bibr" rid="B137">Sterling and Laughlin, 2015</xref>; <xref ref-type="bibr" rid="B100">Ocko et al., 2018</xref>). Overall, such considerations reflect the subtlety of the problem and the need for very precise specifications of the theory. In V1, the fact that we find more neurons than in dLGN does constitute an increase in physical resources such as space and protein molecules, but it could well cause a decrement in the average number of spikes or in the redundancy of messages transmitted beyond V1. The expansion might also reflect a requirement for additional neural resources in the cortex as V1 integrates information across multiple modalities, or reflect cells that are being used for other cortical functions.</p>
<p>Regarding the second criticism, there are many examples in the literature that seem to contradict efficient and predictive coding. One such example comes from a study by <xref ref-type="bibr" rid="B17">Benucci et al. (2009)</xref>, where the authors studied neural responses to sequences of oriented gratings in anesthetized cat visual cortex. Membrane potentials measured in response to the sequences were highly predictable from a simple linear model of the responses to individual gratings. Thus, the temporal context in which the gratings were displayed was irrelevant. The authors therefore concluded &#x201C;spatial and temporal codes in area V1 operate largely independently.&#x201D; A more recent study by Solomon et al. looked at neural responses in awake macaque and human visual cortex in response to rapidly flashed sequences of sinusoidal gratings (<xref ref-type="bibr" rid="B133">Solomon et al., 2021</xref>). They showed <italic>standard</italic> sequences on 80% of trials and <italic>deviant</italic> sequences on 20%, expecting to observe prediction errors in response to deviant stimuli. Instead, they found minimal evidence for prediction errors, with the responses to deviant and standard stimuli being almost identical.</p>
<p>Both studies establish crucial limitations to efficient and predictive coding, but do not invalidate the theories. Importantly, the studies presented sequences of non-natural sinusoidal grating stimuli within individual recording sessions, leaving little time for the neural system to learn the new statistics. Efficient and predictive coding both suggest that expectations are either evolved or learned relative to the natural visual environment, so a random sequence of sinusoidal gratings is always novel/unexpected from that perspective (regardless of whether it came from a standard or deviant set). There is no <italic>a priori</italic> reason to expect the visual system would learn to differentiate standard from deviant stimuli within a recording session. The Solomon et al. result therefore might suggest a limitation to predictive coding: monkeys and humans do not seem to learn non-natural sequences of stimuli on a timescale of minutes to hours (though see also <xref ref-type="bibr" rid="B46">Ekman et al., 2017</xref>, which demonstrated anticipatory cue-evoked pre-play of expected visual trajectories in human V1 after a brief period of training). Given more exposure time, they may or may not learn such sequences, depending on the ability of the visual system to flexibly adapt and modify its internal expectations. The use of anesthetized cats in the Benucci et al. study is particularly problematic from an interpretive standpoint, since anesthesia seems to preferentially disrupt cortical processing (<xref ref-type="bibr" rid="B152">Voss et al., 2019</xref>). The absence of contextual modulation may reveal little about how the awake brain exploits temporal relationships to make predictions.</p>
<p>The third criticism is perhaps the most difficult, as demonstrated by a simple thought experiment. Suppose we hypothesize that V1 compresses information arriving from dLGN before sending it to V2 and that we want to test this hypothesis in the spatial domain. We might devise an experiment to measure the entropy of dLGN and V1 projection neurons in response to natural scenes. The sum of the entropies of all V1-projecting dLGN neurons is the average &#x201C;message length&#x201D; of that population, and likewise for the sum of the entropies of V1 neurons transmitting to V2. Formally, our hypothesis would be:</p>
<disp-formula id="S5.E11"><mml:math id="M11"><mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>i</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover></mml:mstyle><mml:mrow><mml:mi>H</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>L</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo rspace="5.8pt" stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">&gt;</mml:mo><mml:mrow><mml:mstyle displaystyle="true"><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>j</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>M</mml:mi></mml:munderover></mml:mstyle><mml:mrow><mml:mi>H</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>V</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mn>&#x00A0;&#x00A0;</mml:mn><mml:mi>I</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>S</mml:mi><mml:mo>;</mml:mo><mml:mi>L</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">&#x2248;</mml:mo><mml:mrow><mml:mi>I</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>S</mml:mi><mml:mo>;</mml:mo><mml:mi>V</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>where <italic>H</italic>(<italic>L</italic><sub><italic>i</italic></sub>) is the entropy of the i-th dLGN projection neuron (of <italic>N</italic> total), <italic>H</italic>(<italic>V</italic><sub><italic>j</italic></sub>) is the entropy of the j-th V1 projection neuron (of <italic>M</italic> total), <italic>I</italic>(<italic>S</italic>; <italic>L</italic>) is the mutual information between sensory inputs and the population response in dLGN, and <italic>I</italic>(<italic>S</italic>; <italic>V</italic>) is the corresponding quantity for V1. Under this hypothesis, input information is preserved at the output of V1 but in a compressed form. In theory, we would need to record from a very large number of neurons for a very long time to test this hypothesis. Accurate estimation of the individual entropies is tractable under certain assumptions, but estimation of the mutual information would be impossible in any realistic neuroscience experiment due to the curse of dimensionality (<xref ref-type="bibr" rid="B105">Paninski, 2003</xref>). Were we to include the temporal domain as well, by showing natural movies for example, neural responses at different timepoints would no longer be independent, and entropies and mutual information would become exponentially more difficult to estimate. Considering the complexity of the circuit, rife with feedback and interconnections, makes the problem even more difficult to specify. Therefore, while it is fairly straightforward to devise hypotheses around efficient coding, it is difficult to see how they will ever be tested. This is a fair criticism, but it is not unique to this specific theory: there is rarely an easy way to compare neural data to theory. Further, there are many ways to test ideas of efficient coding. Most are inconclusive or incomplete individually, but supporting evidence can still accumulate over time.</p>
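<p>The sampling problem behind this criticism can be illustrated with a small simulation. The Python sketch below is a hedged example under stated assumptions: it uses a naive plug-in entropy estimator (swapped in here for simplicity, not a method proposed in this review), hypothetical binary neurons that are active independently with probability 0.5, and an arbitrary budget of 1,000 recorded response words. Under these assumptions the true entropy of the population word equals the number of neurons in bits, which gives a known value to compare against.</p>
<preformat>
```python
import math
import random
from collections import Counter

def plugin_entropy_bits(samples):
    """Naive (plug-in) entropy estimate, in bits, from observed counts."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n_trials = 1000          # hypothetical number of recorded response words

estimates = {}
for n_neurons in (2, 5, 10, 20):
    # Each neuron is silent (0) or active (1) independently with p = 0.5,
    # so the true entropy of the population word is exactly n_neurons bits.
    words = [rng.getrandbits(n_neurons) for _ in range(n_trials)]
    estimates[n_neurons] = plugin_entropy_bits(words)
    print(f"{n_neurons:2d} neurons: true = {n_neurons} bits, "
          f"plug-in = {estimates[n_neurons]:.2f} bits")
```
</preformat>
<p>Because a plug-in estimate from 1,000 samples can never exceed log2(1000), or about 10 bits, the estimate for 20 neurons necessarily falls far short of the true 20 bits, while the estimate for 2 neurons is close to correct. Real populations are larger and their responses are correlated across cells and time, so the number of possible response words, and with it the amount of data required, grows much faster than any experiment can accommodate.</p>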
</sec>
<sec id="S6" sec-type="discussion">
<title>Discussion</title>
<p>One of the things that makes it so difficult to understand the brain is the mechanistic overlap between computation and representation. The conscious percept of a particular thought or idea or image somehow emerges from the combined activity of populations of neurons, and it is probably correct to say that the population activity defines the neural code representing the idea. This same population, however, participates in the input-output transformations responsible for computation. In a real sense, computation and representation are inseparable aspects of neural activity and there are interpretive dangers in focusing exclusively on either. Given these challenges, it is natural to question the extent to which a mathematical framework developed to help optimize data transfer and storage in engineered telecommunication systems can provide insight into brain function. This review has highlighted some of the difficulties in applying information theory to the visual cortex, and it seems unlikely that this (or perhaps any) theory will fully explain the brain&#x2019;s complex neurobiology.</p>
<p>That said, information theory provides a useful framework to understand how evolutionary pressure toward efficient resource utilization can create predictive coding schemes with an intrinsic role for time. The complex, spatiotemporal distribution of visual information means that if the brain uses an efficient coding strategy anywhere, visual areas are an ideal candidate. Natural visual scenes exhibit autocorrelations that are useful for implying causality and predicting the future or reconstructing the past. A key insight is that a drive toward efficiency encourages temporal relationships to be represented in the neural code. Efficient coding theory also implies that there ought to be selective pressure to learn approximate space-time distributions over natural visual inputs and provides an account of how sensory data ought to be encoded. In particular, neuroscientists may expect to find evidence that data is compressed by removal of predictable spatial and temporal information, thus displaying a degree of spatial and temporal invariance.</p>
<p>The theory also implies that unexpected or unpredictable patterns ought to elicit error signals that would most likely be coded by <italic>increased</italic> firing rates at either the individual neuron or population level (e.g., an unexpected stimulus could also increase the size of the response population). Neurons in the retina, dLGN, and V1 all show functional properties consistent with this hypothesis (as reviewed above) and higher visual areas may be consistent with this theory as well (<xref ref-type="bibr" rid="B70">Jehee et al., 2006</xref>; <xref ref-type="bibr" rid="B18">Beyeler et al., 2016</xref>; <xref ref-type="bibr" rid="B109">Piasini et al., 2021</xref>). An important thing to note is that predictions are based on the environmental statistics responsible for creating the internal model, and it is not clear what to expect when the system is challenged by inputs with different statistics. This implies that experiments testing neural coding in the visual system should either use stimuli with naturalistic statistics or incorporate a period of training sufficient to encode new statistics into the neural circuits before looking for evidence of predictive processing.</p>
<p>Based on the current state of the field, there are many open questions for future research. In our opinion, one of the first steps should be to more fully characterize the visual system&#x2019;s model of the visual environment. To what extent does that model incorporate the temporal dimension? On what timescales? Does the system predict the sensory consequences of the animal&#x2019;s behavior (i.e., is it a joint model of inputs and outputs)? To what extent is the model capable of incorporating new statistics? It is generally a good idea to dissociate characterizations of the modeled environmental distribution from determinations of whether that distribution is efficiently encoded (not least because there are multiple possible dimensions along which efficiency could be measured). Most experimental stimuli include spatiotemporal content that is distinct from the animal&#x2019;s normal perceptual experience. If the visual system cares about both space and time, and is flexible enough to learn, then presentation of these novel stimuli ought to induce a learning process. Another important experimental goal is therefore to characterize the extent to which the visual system can learn novel distributions, the timescale over which this learning occurs, and the overlap with known plasticity mechanisms.</p>
<p>We have focused this review largely on work in the early rodent visual system, but there is a larger body of literature relevant to this discussion in other brain regions and model systems, and other experimental paradigms, that could be adapted to address the issue; for example, the long-standing hypothesis that &#x201C;what&#x201D; and &#x201C;where&#x201D; information are processed in parallel ventral and dorsal pathways in primates (<xref ref-type="bibr" rid="B146">Ungerleider and Mishkin, 1982</xref>; <xref ref-type="bibr" rid="B58">Goodale and Milner, 1992</xref>; <xref ref-type="bibr" rid="B94">Milner and Goodale, 2008</xref>). A similar division seems to exist in the mouse visual system as well (<xref ref-type="bibr" rid="B87">Marshel et al., 2011</xref>; <xref ref-type="bibr" rid="B54">Garrett et al., 2014</xref>; <xref ref-type="bibr" rid="B7">Bakhtiari et al., 2021</xref>; <xref ref-type="bibr" rid="B130">Siegle et al., 2021</xref>). This functional segregation of space-like and time-like pathways leads to the assumption that while time is explicitly required in the dorsal stream to process motion, in the ventral stream it is useful only to integrate over noisy sensory data. Consequently, many visual processing models work only on static images. This is especially true for models of object recognition, but it also holds for models of efficient and predictive coding (<xref ref-type="bibr" rid="B102">Olshausen and Field, 1996</xref>; <xref ref-type="bibr" rid="B16">Bell and Sejnowski, 1997</xref>; <xref ref-type="bibr" rid="B116">Rao and Ballard, 1999</xref>; <xref ref-type="bibr" rid="B117">Riesenhuber and Poggio, 1999</xref>; <xref ref-type="bibr" rid="B25">Carandini et al., 2005</xref>; <xref ref-type="bibr" rid="B160">Yamins et al., 2014</xref>; <xref ref-type="bibr" rid="B24">Cadena et al., 2019</xref>; <xref ref-type="bibr" rid="B122">Sanchez-Giraldo et al., 2019</xref>). 
There are theoretical arguments, though, that time could be explicitly used for computations in the ventral pathway. For example, temporal information can be explicitly useful for object recognition (<xref ref-type="bibr" rid="B138">Stone, 1998</xref>, <xref ref-type="bibr" rid="B139">1999</xref>), or to identify kinetic borders when camouflaged objects move through the visual environment (<xref ref-type="bibr" rid="B28">Cavanagh and Mather, 1989</xref>; <xref ref-type="bibr" rid="B81">Layton and Yazdanbakhsh, 2015</xref>). This suggests that experiments manipulating temporal expectation could be used in both the ventral and dorsal streams to determine the extent to which object recognition, localization, border assignment, etc. rely on efficient spatiotemporal coding principles. Given the approximate homogeneity of cortical circuits in visual and non-visual areas, it is likely that the principles used to encode visual information will be useful to understand general cortical processing algorithms as well.</p>
</sec>
<sec id="S7">
<title>Author Contributions</title>
<p>BP and JG wrote the manuscript. Both authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="pudiscl1" sec-type="disclaimer">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<sec id="S8" sec-type="funding-information">
<title>Funding</title>
<p>This work was supported by NEI R01EY030200.</p>
</sec>
<ack>
<p>The authors thank all colleagues in the Gavornik Lab for their insights and support.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Allman</surname> <given-names>M. J.</given-names></name> <name><surname>Teki</surname> <given-names>S.</given-names></name> <name><surname>Griffiths</surname> <given-names>T. D.</given-names></name> <name><surname>Meck</surname> <given-names>W. H.</given-names></name></person-group> (<year>2014</year>). <article-title>Properties of the internal clock: first- and second-order principles of subjective time.</article-title> <source><italic>Annu. Rev. Psychol.</italic></source> <volume>65</volume> <fpage>743</fpage>&#x2013;<lpage>771</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-psych-010213-115117</pub-id> <pub-id pub-id-type="pmid">24050187</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Atick</surname> <given-names>J. J.</given-names></name></person-group> (<year>1992</year>). <article-title>Could information theory provide an ecological theory of sensory processing?</article-title> <source><italic>Netw. Comput. Neural Syst.</italic></source> <volume>22</volume> <fpage>213</fpage>&#x2013;<lpage>251</lpage>. <pub-id pub-id-type="doi">10.3109/0954898X.2011.638888</pub-id> <pub-id pub-id-type="pmid">22149669</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Atick</surname> <given-names>J. J.</given-names></name> <name><surname>Redlich</surname> <given-names>A. N.</given-names></name></person-group> (<year>1992</year>). <article-title>What does the retina know about natural scenes?</article-title> <source><italic>Neural Comput.</italic></source> <volume>4</volume> <fpage>196</fpage>&#x2013;<lpage>210</lpage>.</citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Attinger</surname> <given-names>A.</given-names></name> <name><surname>Wang</surname> <given-names>B.</given-names></name> <name><surname>Keller</surname> <given-names>G. B.</given-names></name></person-group> (<year>2017</year>). <article-title>Visuomotor coupling shapes the functional development of mouse visual cortex.</article-title> <source><italic>Cell</italic></source> <volume>169</volume> <fpage>1291.e14</fpage>&#x2013;<lpage>1302.e14</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2017.05.023</pub-id> <pub-id pub-id-type="pmid">28602353</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Attneave</surname> <given-names>F.</given-names></name></person-group> (<year>1954</year>). <article-title>Some informational aspects of visual perception.</article-title> <source><italic>Psychol. Rev.</italic></source> <volume>61</volume> <fpage>183</fpage>&#x2013;<lpage>193</lpage>.</citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Attwell</surname> <given-names>D.</given-names></name> <name><surname>Laughlin</surname> <given-names>S. B.</given-names></name></person-group> (<year>2001</year>). <article-title>An energy budget for signaling in the grey matter of the brain.</article-title> <source><italic>J. Cereb. Blood Flow Metab.</italic></source> <volume>21</volume> <fpage>1133</fpage>&#x2013;<lpage>1145</lpage>.</citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bakhtiari</surname> <given-names>S.</given-names></name> <name><surname>Mineault</surname> <given-names>P.</given-names></name> <name><surname>Lillicrap</surname> <given-names>T.</given-names></name> <name><surname>Pack</surname> <given-names>C. C.</given-names></name> <name><surname>Richards</surname> <given-names>B. A.</given-names></name></person-group> (<year>2021</year>). <article-title>The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning.</article-title> <source><italic>bioRxiv</italic></source> [<comment>Preprint</comment>]. <pub-id pub-id-type="doi">10.1101/2021.06.18.448989</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Balasubramaniam</surname> <given-names>R.</given-names></name> <name><surname>Haegens</surname> <given-names>S.</given-names></name> <name><surname>Jazayeri</surname> <given-names>M.</given-names></name> <name><surname>Merchant</surname> <given-names>H.</given-names></name> <name><surname>Sternad</surname> <given-names>D.</given-names></name> <name><surname>Song</surname> <given-names>J. H.</given-names></name></person-group> (<year>2021</year>). <article-title>Neural encoding and representation of time for sensorimotor control and learning.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>41</volume> <fpage>866</fpage>&#x2013;<lpage>872</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.1652-20.2020</pub-id> <pub-id pub-id-type="pmid">33380468</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Balasubramanian</surname> <given-names>V.</given-names></name> <name><surname>Kimber</surname> <given-names>D.</given-names></name> <name><surname>Berry</surname> <given-names>M. J.</given-names></name></person-group> (<year>2001</year>). <article-title>Metabolically efficient information processing.</article-title> <source><italic>Neural Comput.</italic></source> <volume>13</volume> <fpage>799</fpage>&#x2013;<lpage>815</lpage>. <pub-id pub-id-type="doi">10.1162/089976601300014358</pub-id> <pub-id pub-id-type="pmid">11255570</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Balasubramanian</surname> <given-names>V.</given-names></name> <name><surname>Sterling</surname> <given-names>P.</given-names></name></person-group> (<year>2009</year>). <article-title>Receptive fields and functional architecture in the retina.</article-title> <source><italic>J. Physiol.</italic></source> <volume>587</volume>(<issue>Pt 12</issue>), <fpage>2753</fpage>&#x2013;<lpage>2767</lpage>. <pub-id pub-id-type="doi">10.1113/jphysiol.2009.170704</pub-id> <pub-id pub-id-type="pmid">19525561</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barlow</surname> <given-names>H.</given-names></name></person-group> (<year>1990</year>). <article-title>Conditions for versatile learning, Helmholtz&#x2019;s unconscious inference, and the task of perception.</article-title> <source><italic>Vis. Res.</italic></source> <volume>30</volume> <fpage>1561</fpage>&#x2013;<lpage>1571</lpage>. <pub-id pub-id-type="doi">10.1016/0042-6989(90)90144-A</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barlow</surname> <given-names>H. B.</given-names></name></person-group> (<year>1961</year>). <source><italic>Possible Principles Underlying the Transformations of Sensory Messages.</italic></source> <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>The MIT Press</publisher-name>.</citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barlow</surname> <given-names>H. B.</given-names></name></person-group> (<year>1989</year>). <article-title>Unsupervised learning.</article-title> <source><italic>Neural Comput.</italic></source> <volume>1</volume> <fpage>295</fpage>&#x2013;<lpage>311</lpage>. <pub-id pub-id-type="doi">10.1162/neco.1989.1.3.295</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barlow</surname> <given-names>H. B.</given-names></name> <name><surname>F&#x00F6;ldi&#x00E1;k</surname> <given-names>P.</given-names></name></person-group> (<year>1989</year>). &#x201C;<article-title>Adaptation and decorrelation in the cortex</article-title>,&#x201D; in <source><italic>The Computing Neuron</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Durbin</surname> <given-names>R.</given-names></name> <name><surname>Miall</surname> <given-names>C.</given-names></name> <name><surname>Mitchison</surname> <given-names>G.</given-names></name></person-group> (<publisher-loc>Wokingham</publisher-loc>: <publisher-name>Addison-Wesley</publisher-name>), <fpage>54</fpage>&#x2013;<lpage>72</lpage>.</citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bastos</surname> <given-names>A. M.</given-names></name> <name><surname>Usrey</surname> <given-names>W. M.</given-names></name> <name><surname>Adams</surname> <given-names>R. A.</given-names></name> <name><surname>Mangun</surname> <given-names>G. R.</given-names></name> <name><surname>Fries</surname> <given-names>P.</given-names></name> <name><surname>Friston</surname> <given-names>K. J.</given-names></name></person-group> (<year>2012</year>). <article-title>Canonical microcircuits for predictive coding.</article-title> <source><italic>Neuron</italic></source> <volume>76</volume> <fpage>695</fpage>&#x2013;<lpage>711</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2012.10.038</pub-id> <pub-id pub-id-type="pmid">23177956</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bell</surname> <given-names>A. J.</given-names></name> <name><surname>Sejnowski</surname> <given-names>T. J.</given-names></name></person-group> (<year>1997</year>). <article-title>Edges are the &#x201C;Independent components&#x201D; of natural scenes.</article-title> <source><italic>Adv. Neural Inform. Process. Syst.</italic></source> <volume>96</volume> <fpage>831</fpage>&#x2013;<lpage>837</lpage>.</citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Benucci</surname> <given-names>A.</given-names></name> <name><surname>Ringach</surname> <given-names>D. L.</given-names></name> <name><surname>Carandini</surname> <given-names>M.</given-names></name></person-group> (<year>2009</year>). <article-title>Coding of stimulus sequences by population responses in visual cortex.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>12</volume> <fpage>1317</fpage>&#x2013;<lpage>1324</lpage>. <pub-id pub-id-type="doi">10.1038/nn.2398</pub-id> <pub-id pub-id-type="pmid">19749748</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Beyeler</surname> <given-names>M.</given-names></name> <name><surname>Dutt</surname> <given-names>N.</given-names></name> <name><surname>Krichmar</surname> <given-names>J. L.</given-names></name></person-group> (<year>2016</year>). <article-title>3D visual response properties of MSTd emerge from an efficient, sparse population code.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>36</volume> <fpage>8399</fpage>&#x2013;<lpage>8415</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.0396-16.2016</pub-id> <pub-id pub-id-type="pmid">27511012</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bialek</surname> <given-names>W.</given-names></name> <name><surname>De Ruyter Van Steveninck</surname> <given-names>R. R.</given-names></name> <name><surname>Tishby</surname> <given-names>N.</given-names></name></person-group> (<year>2007</year>). <article-title>Efficient representation as a design principle for neural coding and computation.</article-title> <source><italic>arXiv</italic></source> [<comment>Preprint</comment>]. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/0712.4381">https://arxiv.org/abs/0712.4381</ext-link> <comment>(accessed February 10, 2022)</comment>.</citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blitz</surname> <given-names>D. M.</given-names></name> <name><surname>Foster</surname> <given-names>K. A.</given-names></name> <name><surname>Regehr</surname> <given-names>W. G.</given-names></name></person-group> (<year>2004</year>). <article-title>Short-term synaptic plasticity: a comparison of two synapses.</article-title> <source><italic>Nat. Rev. Neurosci.</italic></source> <volume>5</volume> <fpage>630</fpage>&#x2013;<lpage>640</lpage>. <pub-id pub-id-type="doi">10.1038/nrn1475</pub-id> <pub-id pub-id-type="pmid">15263893</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Borghuis</surname> <given-names>B. G.</given-names></name> <name><surname>Ratliff</surname> <given-names>C. P.</given-names></name> <name><surname>Smith</surname> <given-names>R. G.</given-names></name> <name><surname>Sterling</surname> <given-names>P.</given-names></name> <name><surname>Balasubramanian</surname> <given-names>V.</given-names></name></person-group> (<year>2008</year>). <article-title>Design of a neuronal array.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>28</volume> <fpage>3178</fpage>&#x2013;<lpage>3189</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.5259-07.2008</pub-id> <pub-id pub-id-type="pmid">18354021</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brenner</surname> <given-names>N.</given-names></name> <name><surname>Bialek</surname> <given-names>W.</given-names></name> <name><surname>De Ruyter Van Steveninck</surname> <given-names>R.</given-names></name></person-group> (<year>2000</year>). <article-title>Adaptive rescaling maximizes information transmission.</article-title> <source><italic>Neuron</italic></source> <volume>26</volume> <fpage>695</fpage>&#x2013;<lpage>702</lpage>. <pub-id pub-id-type="doi">10.1016/S0896-6273(00)81205-2</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Buonomano</surname> <given-names>D. V.</given-names></name> <name><surname>Laje</surname> <given-names>R.</given-names></name></person-group> (<year>2010</year>). <article-title>Population clocks: motor timing with neural dynamics.</article-title> <source><italic>Trends Cogn. Sci.</italic></source> <volume>14</volume> <fpage>520</fpage>&#x2013;<lpage>527</lpage>.</citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cadena</surname> <given-names>S. A.</given-names></name> <name><surname>Denfield</surname> <given-names>G. H.</given-names></name> <name><surname>Walker</surname> <given-names>E. Y.</given-names></name> <name><surname>Gatys</surname> <given-names>L. A.</given-names></name> <name><surname>Tolias</surname> <given-names>A. S.</given-names></name> <name><surname>Bethge</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Deep convolutional models improve predictions of macaque V1 responses to natural images.</article-title> <source><italic>PLoS Comput. Biol.</italic></source> <volume>15</volume>:<fpage>e1006897</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1006897</pub-id> <pub-id pub-id-type="pmid">31013278</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Carandini</surname> <given-names>M.</given-names></name> <name><surname>Demb</surname> <given-names>J. B.</given-names></name> <name><surname>Mante</surname> <given-names>V.</given-names></name> <name><surname>Tolhurst</surname> <given-names>D. J.</given-names></name> <name><surname>Dan</surname> <given-names>Y.</given-names></name> <name><surname>Olshausen</surname> <given-names>B. A.</given-names></name><etal/></person-group> (<year>2005</year>). <article-title>Do we know what the early visual system does?</article-title> <source><italic>J. Neurosci.</italic></source> <volume>25</volume> <fpage>10577</fpage>&#x2013;<lpage>10597</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.3726-05.2005</pub-id> <pub-id pub-id-type="pmid">16291931</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Carandini</surname> <given-names>M.</given-names></name> <name><surname>Ferster</surname> <given-names>D.</given-names></name></person-group> (<year>1997</year>). <article-title>A tonic hyperpolarization underlying contrast adaptation in cat visual cortex.</article-title> <source><italic>Science</italic></source> <volume>276</volume> <fpage>949</fpage>&#x2013;<lpage>952</lpage>. <pub-id pub-id-type="doi">10.1126/science.276.5314.949</pub-id> <pub-id pub-id-type="pmid">9139658</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Carandini</surname> <given-names>M.</given-names></name> <name><surname>Heeger</surname> <given-names>D. J.</given-names></name></person-group> (<year>2012</year>). <article-title>Normalization as a canonical neural computation.</article-title> <source><italic>Nat. Rev. Neurosci.</italic></source> <volume>13</volume> <fpage>51</fpage>&#x2013;<lpage>62</lpage>. <pub-id pub-id-type="doi">10.1038/nrn3136</pub-id> <pub-id pub-id-type="pmid">22108672</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cavanagh</surname> <given-names>P.</given-names></name> <name><surname>Mather</surname> <given-names>G.</given-names></name></person-group> (<year>1989</year>). <article-title>Motion: the long and short of it.</article-title> <source><italic>Spatial Vis.</italic></source> <volume>4</volume> <fpage>103</fpage>&#x2013;<lpage>129</lpage>. <pub-id pub-id-type="doi">10.1163/156856889x00077</pub-id> <pub-id pub-id-type="pmid">2487159</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chalk</surname> <given-names>M.</given-names></name> <name><surname>Marre</surname> <given-names>O.</given-names></name> <name><surname>Tka&#x010D;ik</surname> <given-names>G.</given-names></name></person-group> (<year>2018</year>). <article-title>Toward a unified theory of efficient, predictive, and sparse coding.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>115</volume> <fpage>186</fpage>&#x2013;<lpage>191</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1711114115</pub-id> <pub-id pub-id-type="pmid">29259111</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chubykin</surname> <given-names>A. A.</given-names></name> <name><surname>Roach</surname> <given-names>E. B.</given-names></name> <name><surname>Bear</surname> <given-names>M. F.</given-names></name> <name><surname>Shuler</surname> <given-names>M. G. H.</given-names></name></person-group> (<year>2013</year>). <article-title>A cholinergic mechanism for reward timing within primary visual cortex.</article-title> <source><italic>Neuron</italic></source> <volume>77</volume> <fpage>723</fpage>&#x2013;<lpage>735</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2012.12.039</pub-id> <pub-id pub-id-type="pmid">23439124</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chung</surname> <given-names>S.</given-names></name> <name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Nelson</surname> <given-names>S. B.</given-names></name></person-group> (<year>2002</year>). <article-title>Short-term depression at thalamocortical synapses contributes to rapid adaptation of cortical sensory responses in vivo.</article-title> <source><italic>Neuron</italic></source> <volume>34</volume> <fpage>437</fpage>&#x2013;<lpage>446</lpage>. <pub-id pub-id-type="doi">10.1016/S0896-6273(02)00659-1</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cover</surname> <given-names>T. M.</given-names></name> <name><surname>Thomas</surname> <given-names>J. A.</given-names></name></person-group> (<year>2006a</year>). &#x201C;<article-title>Data compression</article-title>,&#x201D; in <source><italic>Elements of Information Theory</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Thomas</surname> <given-names>J. A.</given-names></name> <name><surname>Cover</surname> <given-names>T. M.</given-names></name></person-group> (<publisher-loc>Hoboken, NJ</publisher-loc>: <publisher-name>Wiley</publisher-name>), <fpage>103</fpage>&#x2013;<lpage>157</lpage>.</citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cover</surname> <given-names>T. M.</given-names></name> <name><surname>Thomas</surname> <given-names>J. A.</given-names></name></person-group> (<year>2006b</year>). &#x201C;<article-title>Elements of information theory</article-title>,&#x201D; in <source><italic>Wiley Series in Telecommunications and Signal Processing</italic></source>, <edition>2nd Edn</edition>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Gagliardi</surname> <given-names>R. M.</given-names></name> <name><surname>Kuo</surname> <given-names>S. M.</given-names></name> <name><surname>Morgan</surname> <given-names>D. R.</given-names></name></person-group> (<publisher-loc>Hoboken, NJ</publisher-loc>: <publisher-name>Wiley</publisher-name>).</citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cover</surname> <given-names>T. M.</given-names></name> <name><surname>Thomas</surname> <given-names>J. A.</given-names></name></person-group> (<year>2006c</year>). &#x201C;<article-title>Entropy, relative entropy, and mutual information</article-title>,&#x201D; in <source><italic>Elements of Information Theory</italic></source>, <edition>2nd Edn</edition>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Thomas</surname> <given-names>J. A.</given-names></name> <name><surname>Cover</surname> <given-names>T. M.</given-names></name></person-group> (<publisher-loc>Hoboken, NJ</publisher-loc>: <publisher-name>Wiley</publisher-name>), <fpage>13</fpage>&#x2013;<lpage>55</lpage>.</citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cover</surname> <given-names>T. M.</given-names></name> <name><surname>Thomas</surname> <given-names>J. A.</given-names></name></person-group> (<year>2006d</year>). &#x201C;<article-title>Entropy rates of a stochastic process</article-title>,&#x201D; in <source><italic>Elements of Information Theory</italic></source>, <edition>2nd Edn</edition>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Thomas</surname> <given-names>J. A.</given-names></name> <name><surname>Cover</surname> <given-names>T. M.</given-names></name></person-group> (<publisher-loc>Hoboken, NJ</publisher-loc>: <publisher-name>Wiley</publisher-name>), <fpage>71</fpage>&#x2013;<lpage>102</lpage>.</citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Creutzig</surname> <given-names>F.</given-names></name> <name><surname>Sprekeler</surname> <given-names>H.</given-names></name></person-group> (<year>2008</year>). <article-title>Predictive coding and the slowness principle: an information-theoretic approach.</article-title> <source><italic>Neural Comput.</italic></source> <volume>20</volume> <fpage>1026</fpage>&#x2013;<lpage>1041</lpage>. <pub-id pub-id-type="doi">10.1162/neco.2008.01-07-455</pub-id> <pub-id pub-id-type="pmid">18085988</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dan</surname> <given-names>Y.</given-names></name> <name><surname>Atick</surname> <given-names>J. J.</given-names></name> <name><surname>Reid</surname> <given-names>R. C.</given-names></name></person-group> (<year>1996</year>). <article-title>Efficient coding of natural scenes in the lateral geniculate nucleus: experimental test of a computational theory.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>16</volume> <fpage>3351</fpage>&#x2013;<lpage>3362</lpage>. <pub-id pub-id-type="doi">10.1523/jneurosci.16-10-03351.1996</pub-id> <pub-id pub-id-type="pmid">8627371</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>De Vries</surname> <given-names>E. J.</given-names></name> <name><surname>Lecoq</surname> <given-names>J. A.</given-names></name> <name><surname>Buice</surname> <given-names>M. A.</given-names></name> <name><surname>De Vries</surname> <given-names>S. E. J.</given-names></name> <name><surname>Groblewski</surname> <given-names>P. A.</given-names></name> <name><surname>Ocker</surname> <given-names>G. K.</given-names></name><etal/></person-group> (<year>2020</year>). <article-title>A large-scale standardized physiological survey reveals functional organization of the mouse visual cortex.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>23</volume> <fpage>138</fpage>&#x2013;<lpage>151</lpage>. <pub-id pub-id-type="doi">10.1038/s41593-019-0550-9</pub-id> <pub-id pub-id-type="pmid">31844315</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Denham</surname> <given-names>S. L.</given-names></name> <name><surname>Winkler</surname> <given-names>I.</given-names></name></person-group> (<year>2020</year>). <article-title>Predictive coding in auditory perception: challenges and unresolved questions.</article-title> <source><italic>Eur. J. Neurosci.</italic></source> <volume>51</volume> <fpage>1151</fpage>&#x2013;<lpage>1160</lpage>. <pub-id pub-id-type="doi">10.1111/ejn.13802</pub-id> <pub-id pub-id-type="pmid">29250827</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Diamanti</surname> <given-names>E. M.</given-names></name> <name><surname>Reddy</surname> <given-names>C. B.</given-names></name> <name><surname>Schr&#x00F6;der</surname> <given-names>S.</given-names></name> <name><surname>Muzzu</surname> <given-names>T.</given-names></name> <name><surname>Harris</surname> <given-names>K. D.</given-names></name> <name><surname>Saleem</surname> <given-names>A. B.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>Spatial modulation of visual responses arises in cortex with active navigation.</article-title> <source><italic>eLife</italic></source> <volume>10</volume>:<fpage>e63705</fpage>. <pub-id pub-id-type="doi">10.7554/elife.63705</pub-id> <pub-id pub-id-type="pmid">33538692</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Doi</surname> <given-names>E.</given-names></name> <name><surname>Gauthier</surname> <given-names>J. L.</given-names></name> <name><surname>Field</surname> <given-names>G. D.</given-names></name> <name><surname>Shlens</surname> <given-names>J.</given-names></name> <name><surname>Sher</surname> <given-names>A.</given-names></name> <name><surname>Greschner</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2012</year>). <article-title>Efficient coding of spatial information in the primate retina.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>32</volume> <fpage>16256</fpage>&#x2013;<lpage>16264</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.4036-12.2012</pub-id> <pub-id pub-id-type="pmid">23152609</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dong</surname> <given-names>D.</given-names></name> <name><surname>Atick</surname> <given-names>J.</given-names></name></person-group> (<year>1995</year>). <article-title>Temporal decorrelation: a theory of lagged and nonlagged responses in the lateral geniculate nucleus.</article-title> <source><italic>Netw. Comput. Neural Syst.</italic></source> <volume>6</volume> <fpage>159</fpage>&#x2013;<lpage>178</lpage>. <pub-id pub-id-type="doi">10.1088/0954-898x/6/2/003</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dyballa</surname> <given-names>L.</given-names></name> <name><surname>Hoseini</surname> <given-names>M. S.</given-names></name> <name><surname>Dadarlat</surname> <given-names>M. C.</given-names></name> <name><surname>Zucker</surname> <given-names>S. W.</given-names></name> <name><surname>Stryker</surname> <given-names>M. P.</given-names></name></person-group> (<year>2018</year>). <article-title>Flow stimuli reveal ecologically appropriate responses in mouse visual cortex.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>115</volume> <fpage>11304</fpage>&#x2013;<lpage>11309</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1811265115</pub-id> <pub-id pub-id-type="pmid">30327345</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eckmann</surname> <given-names>S.</given-names></name> <name><surname>Klimmasch</surname> <given-names>L.</given-names></name> <name><surname>Shi</surname> <given-names>B. E.</given-names></name> <name><surname>Triesch</surname> <given-names>J.</given-names></name></person-group> (<year>2020</year>). <article-title>Active efficient coding explains the development of binocular vision and its failure in amblyopia.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>117</volume> <fpage>6156</fpage>&#x2013;<lpage>6162</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1908100117</pub-id> <pub-id pub-id-type="pmid">32123102</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eichenbaum</surname> <given-names>H.</given-names></name></person-group> (<year>2014</year>). <article-title>Time cells in the hippocampus: a new dimension for mapping memories.</article-title> <source><italic>Nat. Rev. Neurosci.</italic></source> <volume>15</volume> <fpage>732</fpage>&#x2013;<lpage>744</lpage>. <pub-id pub-id-type="doi">10.1038/nrn3827</pub-id> <pub-id pub-id-type="pmid">25269553</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ekman</surname> <given-names>M.</given-names></name> <name><surname>Kok</surname> <given-names>P.</given-names></name> <name><surname>De Lange</surname> <given-names>F. P.</given-names></name></person-group> (<year>2017</year>). <article-title>Time-compressed preplay of anticipated events in human primary visual cortex.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>8</volume>:<fpage>15276</fpage>. <pub-id pub-id-type="doi">10.1038/ncomms15276</pub-id> <pub-id pub-id-type="pmid">28534870</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Elias</surname> <given-names>P.</given-names></name></person-group> (<year>1955</year>). <article-title>Predictive coding&#x2014;part I &#x0026; II.</article-title> <source><italic>IRE Trans. Inform. Theory</italic></source> <volume>1</volume> <fpage>16</fpage>&#x2013;<lpage>33</lpage>. <pub-id pub-id-type="doi">10.1109/TIT.1955.1055126</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fairhall</surname> <given-names>A. L.</given-names></name> <name><surname>Lewen</surname> <given-names>G. D.</given-names></name> <name><surname>Bialek</surname> <given-names>W.</given-names></name> <name><surname>De Ruyter Van Steveninck</surname> <given-names>R. R.</given-names></name></person-group> (<year>2001</year>). <article-title>Efficiency and ambiguity in an adaptive neural code.</article-title> <source><italic>Nature</italic></source> <volume>412</volume> <fpage>787</fpage>&#x2013;<lpage>792</lpage>. <pub-id pub-id-type="doi">10.1038/35090500</pub-id> <pub-id pub-id-type="pmid">11518957</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Finnerty</surname> <given-names>G. T.</given-names></name> <name><surname>Shadlen</surname> <given-names>M. N.</given-names></name> <name><surname>Jazayeri</surname> <given-names>M.</given-names></name> <name><surname>Nobre</surname> <given-names>A. C.</given-names></name> <name><surname>Buonomano</surname> <given-names>D. V.</given-names></name></person-group> (<year>2015</year>). <article-title>Time in cortical circuits.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>35</volume> <fpage>13912</fpage>&#x2013;<lpage>13916</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.2654-15.2015</pub-id> <pub-id pub-id-type="pmid">26468192</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Finnie</surname> <given-names>P. S. B.</given-names></name> <name><surname>Komorowski</surname> <given-names>R. W.</given-names></name> <name><surname>Bear</surname> <given-names>M. F.</given-names></name></person-group> (<year>2021</year>). <article-title>The spatiotemporal organization of experience dictates hippocampal involvement in primary visual cortical plasticity.</article-title> <source><italic>Curr. Biol.</italic></source> <volume>31</volume> <fpage>3996</fpage>&#x2013;<lpage>4008</lpage>. <pub-id pub-id-type="doi">10.1101/2021.03.01.433430</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fiser</surname> <given-names>A.</given-names></name> <name><surname>Mahringer</surname> <given-names>D.</given-names></name> <name><surname>Oyibo</surname> <given-names>H. K.</given-names></name> <name><surname>Petersen</surname> <given-names>A. V.</given-names></name> <name><surname>Leinweber</surname> <given-names>M.</given-names></name> <name><surname>Keller</surname> <given-names>G. B.</given-names></name></person-group> (<year>2016</year>). <article-title>Experience-dependent spatial expectations in mouse visual cortex.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>19</volume> <fpage>1658</fpage>&#x2013;<lpage>1664</lpage>. <pub-id pub-id-type="doi">10.1038/nn.4385</pub-id> <pub-id pub-id-type="pmid">27618309</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K.</given-names></name></person-group> (<year>2005</year>). <article-title>A theory of cortical responses.</article-title> <source><italic>Philos. Trans. R. Soc. B Biol. Sci.</italic></source> <volume>360</volume> <fpage>815</fpage>&#x2013;<lpage>836</lpage>. <pub-id pub-id-type="doi">10.1098/rstb.2005.1622</pub-id> <pub-id pub-id-type="pmid">15937014</pub-id></citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Garrett</surname> <given-names>M. E.</given-names></name> <name><surname>Manavi</surname> <given-names>S.</given-names></name> <name><surname>Roll</surname> <given-names>K.</given-names></name> <name><surname>Ollerenshaw</surname> <given-names>D. R.</given-names></name> <name><surname>Groblewski</surname> <given-names>P. A.</given-names></name> <name><surname>Kiggins</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2020</year>). <article-title>Experience shapes activity dynamics and stimulus coding of VIP inhibitory and excitatory cells in visual cortex.</article-title> <source><italic>eLife</italic></source> <volume>9</volume>:<fpage>e50340</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.50340</pub-id></citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Garrett</surname> <given-names>M. E.</given-names></name> <name><surname>Nauhaus</surname> <given-names>I.</given-names></name> <name><surname>Marshel</surname> <given-names>J. H.</given-names></name> <name><surname>Callaway</surname> <given-names>E. M.</given-names></name></person-group> (<year>2014</year>). <article-title>Topography and areal organization of mouse visual cortex.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>34</volume> <fpage>12587</fpage>&#x2013;<lpage>12600</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.1124-14.2014</pub-id> <pub-id pub-id-type="pmid">25209296</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gavornik</surname> <given-names>J. P.</given-names></name> <name><surname>Bear</surname> <given-names>M. F.</given-names></name></person-group> (<year>2014</year>). <article-title>Learned spatiotemporal sequence recognition and prediction in primary visual cortex.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>17</volume> <fpage>732</fpage>&#x2013;<lpage>737</lpage>. <pub-id pub-id-type="doi">10.1038/nn.3683</pub-id> <pub-id pub-id-type="pmid">24657967</pub-id></citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gilbert</surname> <given-names>C. D.</given-names></name></person-group> (<year>1977</year>). <article-title>Laminar differences in receptive field properties of cells in cat primary visual cortex.</article-title> <source><italic>J. Physiol.</italic></source> <volume>268</volume> <fpage>391</fpage>&#x2013;<lpage>421</lpage>.</citation></ref>
<ref id="B57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gjorgjieva</surname> <given-names>J.</given-names></name> <name><surname>Sompolinsky</surname> <given-names>H.</given-names></name> <name><surname>Meister</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>Benefits of pathway splitting in sensory coding.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>34</volume> <fpage>12127</fpage>&#x2013;<lpage>12144</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.1032-14.2014</pub-id> <pub-id pub-id-type="pmid">25186757</pub-id></citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goodale</surname> <given-names>M. A.</given-names></name> <name><surname>Milner</surname> <given-names>A. D.</given-names></name></person-group> (<year>1992</year>). <article-title>Separate visual pathways for perception and action.</article-title> <source><italic>Trends Neurosci.</italic></source> <volume>15</volume> <fpage>20</fpage>&#x2013;<lpage>25</lpage>.</citation></ref>
<ref id="B59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grossberg</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <source><italic>Conscious Mind, Resonant Brain: How Each Brain Makes a Mind.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>.</citation></ref>
<ref id="B60"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guitchounts</surname> <given-names>G.</given-names></name> <name><surname>Mas&#x00ED;s</surname> <given-names>J.</given-names></name> <name><surname>Wolff</surname> <given-names>S. B. E.</given-names></name> <name><surname>Cox</surname> <given-names>D.</given-names></name></person-group> (<year>2020</year>). <article-title>Encoding of 3D head orienting movements in the primary visual cortex.</article-title> <source><italic>Neuron</italic></source> <volume>108</volume> <fpage>512.e4</fpage>&#x2013;<lpage>525.e4</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2020.07.014</pub-id> <pub-id pub-id-type="pmid">32783881</pub-id></citation></ref>
<ref id="B61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hangya</surname> <given-names>B.</given-names></name> <name><surname>Kepecs</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>Vision: how to train visual cortex to predict reward time.</article-title> <source><italic>Curr. Biol.</italic></source> <volume>25</volume> <fpage>R490</fpage>&#x2013;<lpage>R492</lpage>. <pub-id pub-id-type="doi">10.1016/j.cub.2015.04.048</pub-id> <pub-id pub-id-type="pmid">26079076</pub-id></citation></ref>
<ref id="B62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hartveit</surname> <given-names>E.</given-names></name></person-group> (<year>1992</year>). <article-title>Simultaneous recording of lagged and nonlagged cells in the cat dorsal lateral geniculate nucleus.</article-title> <source><italic>Exp. Brain Res.</italic></source> <volume>88</volume> <fpage>229</fpage>&#x2013;<lpage>232</lpage>.</citation></ref>
<ref id="B63"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Heilbron</surname> <given-names>M.</given-names></name> <name><surname>Chait</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>Great expectations: is there evidence for predictive coding in auditory cortex?</article-title> <source><italic>Neuroscience</italic></source> <volume>389</volume> <fpage>54</fpage>&#x2013;<lpage>73</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroscience.2017.07.061</pub-id> <pub-id pub-id-type="pmid">28782642</pub-id></citation></ref>
<ref id="B64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>H&#x00E9;naff</surname> <given-names>O. J.</given-names></name> <name><surname>Bai</surname> <given-names>Y.</given-names></name> <name><surname>Charlton</surname> <given-names>J. A.</given-names></name> <name><surname>Nauhaus</surname> <given-names>I.</given-names></name> <name><surname>Simoncelli</surname> <given-names>E. P.</given-names></name> <name><surname>Goris</surname> <given-names>R. L. T.</given-names></name></person-group> (<year>2021</year>). <article-title>Primary visual cortex straightens natural video trajectories.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>12</volume>:<fpage>5982</fpage>. <pub-id pub-id-type="doi">10.1038/s41467-021-25939-z</pub-id> <pub-id pub-id-type="pmid">34645787</pub-id></citation></ref>
<ref id="B65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>H&#x00E9;naff</surname> <given-names>O. J.</given-names></name> <name><surname>Goris</surname> <given-names>R. L. T.</given-names></name> <name><surname>Simoncelli</surname> <given-names>E. P.</given-names></name></person-group> (<year>2019</year>). <article-title>Perceptual straightening of natural videos.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>22</volume> <fpage>984</fpage>&#x2013;<lpage>991</lpage>. <pub-id pub-id-type="doi">10.1038/s41593-019-0377-4</pub-id> <pub-id pub-id-type="pmid">31036946</pub-id></citation></ref>
<ref id="B66"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Homann</surname> <given-names>J.</given-names></name> <name><surname>Koay</surname> <given-names>S. A.</given-names></name> <name><surname>Chen</surname> <given-names>K. S.</given-names></name> <name><surname>Tank</surname> <given-names>D. W.</given-names></name> <name><surname>Berry</surname> <given-names>M. J.</given-names></name></person-group> (<year>2022</year>). <article-title>Novel stimuli evoke excess activity in the mouse primary visual cortex.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>119</volume>:<fpage>e2108882119</fpage>. <pub-id pub-id-type="doi">10.1073/pnas.2108882119</pub-id> <pub-id pub-id-type="pmid">35101916</pub-id></citation></ref>
<ref id="B67"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hosoya</surname> <given-names>T.</given-names></name> <name><surname>Baccus</surname> <given-names>S. A.</given-names></name> <name><surname>Meister</surname> <given-names>M.</given-names></name></person-group> (<year>2005</year>). <article-title>Dynamic predictive coding by the retina.</article-title> <source><italic>Nature</italic></source> <volume>436</volume> <fpage>71</fpage>&#x2013;<lpage>77</lpage>. <pub-id pub-id-type="doi">10.1038/nature03689</pub-id> <pub-id pub-id-type="pmid">16001064</pub-id></citation></ref>
<ref id="B68"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>Y.</given-names></name> <name><surname>Rao</surname> <given-names>R. P. N.</given-names></name></person-group> (<year>2011</year>). <article-title>Predictive coding.</article-title> <source><italic>Wiley Interdiscip. Rev. Cogn. Sci.</italic></source> <volume>2</volume> <fpage>580</fpage>&#x2013;<lpage>593</lpage>. <pub-id pub-id-type="doi">10.1002/WCS.142</pub-id> <pub-id pub-id-type="pmid">26302308</pub-id></citation></ref>
<ref id="B69"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hubel</surname> <given-names>D. H.</given-names></name> <name><surname>Wiesel</surname> <given-names>T. N.</given-names></name></person-group> (<year>1965</year>). <article-title>Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat.</article-title> <source><italic>J. Neurophysiol.</italic></source> <volume>28</volume> <fpage>229</fpage>&#x2013;<lpage>289</lpage>. <pub-id pub-id-type="doi">10.1152/jn.1965.28.2.229</pub-id> <pub-id pub-id-type="pmid">14283058</pub-id></citation></ref>
<ref id="B70"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jehee</surname> <given-names>J. F. M.</given-names></name> <name><surname>Rothkopf</surname> <given-names>C.</given-names></name> <name><surname>Beck</surname> <given-names>J. M.</given-names></name> <name><surname>Ballard</surname> <given-names>D. H.</given-names></name></person-group> (<year>2006</year>). <article-title>Learning receptive fields using predictive feedback.</article-title> <source><italic>J. Physiol. Paris</italic></source> <volume>100</volume> <fpage>125</fpage>&#x2013;<lpage>132</lpage>. <pub-id pub-id-type="doi">10.1016/j.jphysparis.2006.09.011</pub-id> <pub-id pub-id-type="pmid">17067787</pub-id></citation></ref>
<ref id="B71"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jordan</surname> <given-names>R.</given-names></name> <name><surname>Keller</surname> <given-names>G. B.</given-names></name></person-group> (<year>2020</year>). <article-title>Opposing influence of top-down and bottom-up input on different types of excitatory layer 2/3 neurons in mouse visual cortex.</article-title> <source><italic>bioRxiv</italic></source> [<comment>Preprint</comment>]. <pub-id pub-id-type="doi">10.1101/2020.03.25.008607</pub-id></citation></ref>
<ref id="B72"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kaliukhovich</surname> <given-names>D. A.</given-names></name> <name><surname>Vogels</surname> <given-names>R.</given-names></name></person-group> (<year>2014</year>). <article-title>Neurons in macaque inferior temporal cortex show no surprise response to deviants in visual oddball sequences.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>34</volume> <fpage>12801</fpage>&#x2013;<lpage>12815</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.2154-14.2014</pub-id> <pub-id pub-id-type="pmid">25232116</pub-id></citation></ref>
<ref id="B73"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kato</surname> <given-names>H. K.</given-names></name> <name><surname>Gillet</surname> <given-names>S. N.</given-names></name> <name><surname>Isaacson</surname> <given-names>J. S.</given-names></name></person-group> (<year>2015</year>). <article-title>Flexible sensory representations in auditory cortex driven by behavioral relevance.</article-title> <source><italic>Neuron</italic></source> <volume>88</volume> <fpage>1027</fpage>&#x2013;<lpage>1039</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2015.10.024</pub-id> <pub-id pub-id-type="pmid">26586181</pub-id></citation></ref>
<ref id="B74"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Katzner</surname> <given-names>S.</given-names></name> <name><surname>Nauhaus</surname> <given-names>I.</given-names></name> <name><surname>Benucci</surname> <given-names>A.</given-names></name> <name><surname>Bonin</surname> <given-names>V.</given-names></name> <name><surname>Ringach</surname> <given-names>D. L.</given-names></name> <name><surname>Carandini</surname> <given-names>M.</given-names></name></person-group> (<year>2009</year>). <article-title>Local origin of field potentials in visual cortex.</article-title> <source><italic>Neuron</italic></source> <volume>61</volume> <fpage>35</fpage>&#x2013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2008.11.016</pub-id> <pub-id pub-id-type="pmid">19146811</pub-id></citation></ref>
<ref id="B75"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Keller</surname> <given-names>G. B.</given-names></name> <name><surname>Bonhoeffer</surname> <given-names>T.</given-names></name> <name><surname>H&#x00FC;bener</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Sensorimotor mismatch signals in primary visual cortex of the behaving mouse.</article-title> <source><italic>Neuron</italic></source> <volume>74</volume> <fpage>809</fpage>&#x2013;<lpage>815</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2012.03.040</pub-id> <pub-id pub-id-type="pmid">22681686</pub-id></citation></ref>
<ref id="B76"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Keller</surname> <given-names>G. B.</given-names></name> <name><surname>Mrsic-Flogel</surname> <given-names>T. D.</given-names></name></person-group> (<year>2018</year>). <article-title>Predictive processing: a canonical cortical computation.</article-title> <source><italic>Neuron</italic></source> <volume>100</volume> <fpage>424</fpage>&#x2013;<lpage>435</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2018.10.003</pub-id> <pub-id pub-id-type="pmid">30359606</pub-id></citation></ref>
<ref id="B77"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Landauer</surname> <given-names>R.</given-names></name></person-group> (<year>1991</year>). <article-title>Information is physical.</article-title> <source><italic>Phys. Today</italic></source> <volume>44</volume>:<fpage>23</fpage>. <pub-id pub-id-type="doi">10.1063/1.881299</pub-id></citation></ref>
<ref id="B78"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lappe</surname> <given-names>M.</given-names></name> <name><surname>Bremmer</surname> <given-names>F.</given-names></name> <name><surname>Van Den Berg</surname> <given-names>A. V.</given-names></name></person-group> (<year>1999</year>). <article-title>Perception of self-motion from visual flow.</article-title> <source><italic>Trends Cogn. Sci.</italic></source> <volume>3</volume> <fpage>329</fpage>&#x2013;<lpage>336</lpage>. <pub-id pub-id-type="doi">10.1016/S1364-6613(99)01364-9</pub-id></citation></ref>
<ref id="B79"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Laughlin</surname> <given-names>S. B.</given-names></name></person-group> (<year>2001</year>). <article-title>Energy as a constraint on the coding and processing of sensory information.</article-title> <source><italic>Curr. Opin. Neurobiol.</italic></source> <volume>11</volume> <fpage>475</fpage>&#x2013;<lpage>480</lpage>. <pub-id pub-id-type="doi">10.1016/S0959-4388(00)00237-3</pub-id></citation></ref>
<ref id="B80"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Laughlin</surname> <given-names>S. B.</given-names></name> <name><surname>De Ruyter Van Steveninck</surname> <given-names>R. R.</given-names></name> <name><surname>Anderson</surname> <given-names>J. C.</given-names></name></person-group> (<year>1998</year>). <article-title>The metabolic cost of neural information.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>1</volume> <fpage>36</fpage>&#x2013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1038/236</pub-id> <pub-id pub-id-type="pmid">10195106</pub-id></citation></ref>
<ref id="B81"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Layton</surname> <given-names>O. W.</given-names></name> <name><surname>Yazdanbakhsh</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>A neural model of border-ownership from kinetic occlusion.</article-title> <source><italic>Vis. Res.</italic></source> <volume>106</volume> <fpage>64</fpage>&#x2013;<lpage>80</lpage>. <pub-id pub-id-type="doi">10.1016/j.visres.2014.11.002</pub-id> <pub-id pub-id-type="pmid">25448117</pub-id></citation></ref>
<ref id="B82"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leinweber</surname> <given-names>M.</given-names></name> <name><surname>Ward</surname> <given-names>D. R.</given-names></name> <name><surname>Sobczak</surname> <given-names>J. M.</given-names></name> <name><surname>Attinger</surname> <given-names>A.</given-names></name> <name><surname>Keller</surname> <given-names>G. B.</given-names></name></person-group> (<year>2017</year>). <article-title>A sensorimotor circuit in mouse cortex for visual flow predictions.</article-title> <source><italic>Neuron</italic></source> <volume>95</volume> <fpage>1420.e5</fpage>&#x2013;<lpage>1432.e5</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2017.08.036</pub-id> <pub-id pub-id-type="pmid">28910624</pub-id></citation></ref>
<ref id="B83"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lelais</surname> <given-names>A.</given-names></name> <name><surname>Mahn</surname> <given-names>J.</given-names></name> <name><surname>Narayan</surname> <given-names>V.</given-names></name> <name><surname>Zhang</surname> <given-names>C.</given-names></name> <name><surname>Shi</surname> <given-names>B. E.</given-names></name> <name><surname>Triesch</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). <article-title>Autonomous development of active binocular and motion vision through active efficient coding.</article-title> <source><italic>Front. Neurorobotics</italic></source> <volume>13</volume>:<fpage>49</fpage>. <pub-id pub-id-type="doi">10.3389/fnbot.2019.00049</pub-id> <pub-id pub-id-type="pmid">31379548</pub-id></citation></ref>
<ref id="B84"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Levy</surname> <given-names>J. M.</given-names></name> <name><surname>Zold</surname> <given-names>C. L.</given-names></name> <name><surname>Namboodiri</surname> <given-names>V. K. M.</given-names></name> <name><surname>Shuler</surname> <given-names>M. G. H.</given-names></name></person-group> (<year>2017</year>). <article-title>The timing of reward-seeking action tracks visually-cued theta oscillations in primary visual cortex.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>37</volume> <fpage>10408</fpage>&#x2013;<lpage>10420</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.0923-17.2017</pub-id> <pub-id pub-id-type="pmid">28947572</pub-id></citation></ref>
<ref id="B85"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maheu</surname> <given-names>M.</given-names></name> <name><surname>Dehaene</surname> <given-names>S.</given-names></name> <name><surname>Meyniel</surname> <given-names>F.</given-names></name></person-group> (<year>2019</year>). <article-title>Brain signatures of a multiscale process of sequence learning in humans.</article-title> <source><italic>eLife</italic></source> <volume>8</volume>:<fpage>e41541</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.41541</pub-id> <pub-id pub-id-type="pmid">30714904</pub-id></citation></ref>
<ref id="B86"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Makino</surname> <given-names>H.</given-names></name> <name><surname>Komiyama</surname> <given-names>T.</given-names></name></person-group> (<year>2015</year>). <article-title>Learning enhances the relative impact of top-down processing in the visual cortex.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>18</volume> <fpage>1116</fpage>&#x2013;<lpage>1122</lpage>. <pub-id pub-id-type="doi">10.1038/nn.4061</pub-id> <pub-id pub-id-type="pmid">26167904</pub-id></citation></ref>
<ref id="B87"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marshel</surname> <given-names>J. H.</given-names></name> <name><surname>Garrett</surname> <given-names>M. E.</given-names></name> <name><surname>Nauhaus</surname> <given-names>I.</given-names></name> <name><surname>Callaway</surname> <given-names>E. M.</given-names></name></person-group> (<year>2011</year>). <article-title>Functional specialization of seven mouse visual cortical areas.</article-title> <source><italic>Neuron</italic></source> <volume>72</volume> <fpage>1040</fpage>&#x2013;<lpage>1054</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2011.12.004</pub-id> <pub-id pub-id-type="pmid">22196338</pub-id></citation></ref>
<ref id="B88"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mauk</surname> <given-names>M. D.</given-names></name> <name><surname>Buonomano</surname> <given-names>D. V.</given-names></name></person-group> (<year>2004</year>). <article-title>The neural basis of temporal processing.</article-title> <source><italic>Annu. Rev. Neurosci.</italic></source> <volume>27</volume> <fpage>307</fpage>&#x2013;<lpage>340</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.neuro.27.070203.144247</pub-id> <pub-id pub-id-type="pmid">15217335</pub-id></citation></ref>
<ref id="B89"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meister</surname> <given-names>M.</given-names></name> <name><surname>Berry</surname> <given-names>M. J.</given-names></name></person-group> (<year>1999</year>). <article-title>The neural code of the retina.</article-title> <source><italic>Neuron</italic></source> <volume>22</volume> <fpage>435</fpage>&#x2013;<lpage>450</lpage>. <pub-id pub-id-type="doi">10.1016/S0896-6273(00)80700-X</pub-id></citation></ref>
<ref id="B90"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Merchant</surname> <given-names>H.</given-names></name> <name><surname>Bartolo</surname> <given-names>R.</given-names></name> <name><surname>P&#x00E9;rez</surname> <given-names>O.</given-names></name> <name><surname>M&#x00E9;ndez</surname> <given-names>J. C.</given-names></name> <name><surname>Mendoza</surname> <given-names>G.</given-names></name> <name><surname>G&#x00E1;mez</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2014</year>). <source><italic>Neurophysiology of Timing in the Hundreds of Milliseconds: Multiple Layers of Neuronal Clocks in the Medial Premotor Areas.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer</publisher-name>, <fpage>143</fpage>&#x2013;<lpage>154</lpage>.</citation></ref>
<ref id="B91"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Merchant</surname> <given-names>H.</given-names></name> <name><surname>Grahn</surname> <given-names>J.</given-names></name> <name><surname>Trainor</surname> <given-names>L.</given-names></name> <name><surname>Rohrmeier</surname> <given-names>M.</given-names></name> <name><surname>Fitch</surname> <given-names>W. T.</given-names></name></person-group> (<year>2015</year>). <article-title>Finding the beat: a neural perspective across humans and non-human primates.</article-title> <source><italic>Philos. Trans. R. Soc. Lond.</italic></source> <volume>370</volume>:<fpage>20140093</fpage>. <pub-id pub-id-type="doi">10.1098/rstb.2014.0093</pub-id> <pub-id pub-id-type="pmid">25646516</pub-id></citation></ref>
<ref id="B92"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meyer</surname> <given-names>T.</given-names></name> <name><surname>Olson</surname> <given-names>C. R.</given-names></name></person-group> (<year>2011</year>). <article-title>Statistical learning of visual transitions in monkey inferotemporal cortex.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>108</volume> <fpage>19401</fpage>&#x2013;<lpage>19406</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1112895108</pub-id> <pub-id pub-id-type="pmid">22084090</pub-id></citation></ref>
<ref id="B93"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Millidge</surname> <given-names>B.</given-names></name> <name><surname>Seth</surname> <given-names>A.</given-names></name> <name><surname>Buckley</surname> <given-names>C. L.</given-names></name></person-group> (<year>2021</year>). <article-title>Predictive coding: a theoretical and experimental review.</article-title> <source><italic>arXiv</italic></source> [<comment>Preprint</comment>]. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2107.12979">https://arxiv.org/abs/2107.12979</ext-link> <comment>(accessed March 14, 2022)</comment>.</citation></ref>
<ref id="B94"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Milner</surname> <given-names>A. D.</given-names></name> <name><surname>Goodale</surname> <given-names>M. A.</given-names></name></person-group> (<year>2008</year>). <article-title>Two visual systems re-viewed.</article-title> <source><italic>Neuropsychologia</italic></source> <volume>46</volume> <fpage>774</fpage>&#x2013;<lpage>785</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuropsychologia.2007.10.005</pub-id> <pub-id pub-id-type="pmid">18037456</pub-id></citation></ref>
<ref id="B95"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Montgomery</surname> <given-names>D. P.</given-names></name> <name><surname>Hayden</surname> <given-names>D. J.</given-names></name> <name><surname>Chaloner</surname> <given-names>F. A.</given-names></name> <name><surname>Cooke</surname> <given-names>S. F.</given-names></name> <name><surname>Bear</surname> <given-names>M. F.</given-names></name></person-group> (<year>2022</year>). <article-title>Stimulus-selective response plasticity in primary visual cortex: progress and puzzles.</article-title> <source><italic>Front. Neural Circ.</italic></source> <volume>15</volume>:<fpage>815554</fpage>. <pub-id pub-id-type="doi">10.3389/fncir.2021.815554</pub-id> <pub-id pub-id-type="pmid">35173586</pub-id></citation></ref>
<ref id="B96"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Muckli</surname> <given-names>L.</given-names></name> <name><surname>Bisley</surname> <given-names>J.</given-names></name> <name><surname>Vergnieux</surname> <given-names>V.</given-names></name> <name><surname>Vogels</surname> <given-names>R.</given-names></name></person-group> (<year>2020</year>). <article-title>Statistical learning signals for complex visual images in macaque early visual cortex.</article-title> <source><italic>Front. Neurosci.</italic></source> <volume>14</volume>:<fpage>789</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2020.00789</pub-id> <pub-id pub-id-type="pmid">32848562</pub-id></citation></ref>
<ref id="B97"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Muzzu</surname> <given-names>T.</given-names></name> <name><surname>Saleem</surname> <given-names>A. B.</given-names></name></person-group> (<year>2021</year>). <article-title>Feature selectivity can explain mismatch signals in mouse visual cortex.</article-title> <source><italic>Cell Rep.</italic></source> <volume>37</volume>:<fpage>109772</fpage>. <pub-id pub-id-type="doi">10.1016/j.celrep.2021.109772</pub-id> <pub-id pub-id-type="pmid">34610298</pub-id></citation></ref>
<ref id="B98"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Niell</surname> <given-names>C. M.</given-names></name> <name><surname>Stryker</surname> <given-names>M. P.</given-names></name></person-group> (<year>2008</year>). <article-title>Highly selective receptive fields in mouse visual cortex.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>28</volume> <fpage>7520</fpage>&#x2013;<lpage>7536</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.0623-08.2008</pub-id> <pub-id pub-id-type="pmid">18650330</pub-id></citation></ref>
<ref id="B99"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Normann</surname> <given-names>R. A.</given-names></name> <name><surname>Perlman</surname> <given-names>I.</given-names></name></person-group> (<year>1979</year>). <article-title>The effects of background illumination on the photoresponses of red and green cones.</article-title> <source><italic>J. Physiol.</italic></source> <volume>286</volume> <fpage>491</fpage>&#x2013;<lpage>507</lpage>.</citation></ref>
<ref id="B100"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ocko</surname> <given-names>S. A.</given-names></name> <name><surname>Lindsey</surname> <given-names>J.</given-names></name> <name><surname>Ganguli</surname> <given-names>S.</given-names></name> <name><surname>Deny</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). &#x201C;<article-title>The emergence of multiple retinal cell types through efficient coding of natural movies</article-title>,&#x201D; in <source><italic>Advances in Neural Information Processing Systems 31</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Bengio</surname> <given-names>S.</given-names></name> <name><surname>Wallach</surname> <given-names>H.</given-names></name> <name><surname>Larochelle</surname> <given-names>H.</given-names></name> <name><surname>Grauman</surname> <given-names>K.</given-names></name> <name><surname>Cesa-Bianchi</surname> <given-names>N.</given-names></name> <name><surname>Garnett</surname> <given-names>R.</given-names></name></person-group> (<publisher-loc>Red Hook, NY</publisher-loc>: <publisher-name>Curran Associates, Inc.</publisher-name>), <fpage>9389</fpage>&#x2013;<lpage>9400</lpage>.</citation></ref>
<ref id="B101"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oja</surname> <given-names>E.</given-names></name></person-group> (<year>2002</year>). <article-title>Unsupervised learning in neural computation.</article-title> <source><italic>Theor. Comput. Sci.</italic></source> <volume>287</volume> <fpage>187</fpage>&#x2013;<lpage>207</lpage>. <pub-id pub-id-type="doi">10.1016/S0304-3975(02)00160-3</pub-id></citation></ref>
<ref id="B102"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Olshausen</surname> <given-names>B. A.</given-names></name> <name><surname>Field</surname> <given-names>D. J.</given-names></name></person-group> (<year>1996</year>). <article-title>Emergence of simple-cell receptive field properties by learning a sparse code for natural images.</article-title> <source><italic>Nature</italic></source> <volume>381</volume> <fpage>607</fpage>&#x2013;<lpage>609</lpage>. <pub-id pub-id-type="doi">10.1038/381607a0</pub-id> <pub-id pub-id-type="pmid">8637596</pub-id></citation></ref>
<ref id="B103"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Olshausen</surname> <given-names>B. A.</given-names></name> <name><surname>Field</surname> <given-names>D. J.</given-names></name></person-group> (<year>1997</year>). <article-title>Sparse coding with an overcomplete basis set: a strategy employed by V1?</article-title> <source><italic>Vis. Res.</italic></source> <volume>37</volume> <fpage>3311</fpage>&#x2013;<lpage>3325</lpage>. <pub-id pub-id-type="doi">10.1016/S0042-6989(97)00169-7</pub-id></citation></ref>
<ref id="B104"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Palmer</surname> <given-names>S. E.</given-names></name> <name><surname>Marre</surname> <given-names>O.</given-names></name> <name><surname>Berry</surname> <given-names>M. J.</given-names></name> <name><surname>Bialek</surname> <given-names>W.</given-names></name></person-group> (<year>2015</year>). <article-title>Predictive information in a sensory population.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>112</volume> <fpage>6908</fpage>&#x2013;<lpage>6913</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1506855112</pub-id> <pub-id pub-id-type="pmid">26038544</pub-id></citation></ref>
<ref id="B105"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Paninski</surname> <given-names>L.</given-names></name></person-group> (<year>2003</year>). <article-title>Estimation of entropy and mutual information.</article-title> <source><italic>Neural Comput.</italic></source> <volume>15</volume> <fpage>1191</fpage>&#x2013;<lpage>1253</lpage>.</citation></ref>
<ref id="B106"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Parker</surname> <given-names>P. R. L.</given-names></name> <name><surname>Abe</surname> <given-names>E. T. T.</given-names></name> <name><surname>Leonard</surname> <given-names>E. S. P.</given-names></name> <name><surname>Martins</surname> <given-names>D. M.</given-names></name> <name><surname>Niell</surname> <given-names>C. M.</given-names></name></person-group> (<year>2022</year>). <article-title>Joint coding of visual input and eye/head position in V1 of freely moving mice.</article-title> <source><italic>bioRxiv</italic></source> [<comment>Preprint</comment>]. <pub-id pub-id-type="doi">10.1101/2022.02.01.478733</pub-id></citation></ref>
<ref id="B107"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Perge</surname> <given-names>J. A.</given-names></name> <name><surname>Koch</surname> <given-names>K.</given-names></name> <name><surname>Miller</surname> <given-names>R.</given-names></name> <name><surname>Sterling</surname> <given-names>P.</given-names></name> <name><surname>Balasubramanian</surname> <given-names>V.</given-names></name></person-group> (<year>2009</year>). <article-title>How the optic nerve allocates space, energy capacity, and information.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>29</volume> <fpage>7917</fpage>&#x2013;<lpage>7928</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.5200-08.2009</pub-id> <pub-id pub-id-type="pmid">19535603</pub-id></citation></ref>
<ref id="B108"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Petter</surname> <given-names>E. A.</given-names></name> <name><surname>Gershman</surname> <given-names>S. J.</given-names></name> <name><surname>Meck</surname> <given-names>W. H.</given-names></name></person-group> (<year>2018</year>). <article-title>Integrating models of interval timing and reinforcement learning.</article-title> <source><italic>Trends Cogn. Sci.</italic></source> <volume>22</volume> <fpage>911</fpage>&#x2013;<lpage>922</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2018.08.004</pub-id> <pub-id pub-id-type="pmid">30266150</pub-id></citation></ref>
<ref id="B109"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Piasini</surname> <given-names>E.</given-names></name> <name><surname>Soltuzu</surname> <given-names>L.</given-names></name> <name><surname>Muratore</surname> <given-names>P.</given-names></name> <name><surname>Caramellino</surname> <given-names>R.</given-names></name> <name><surname>Vinken</surname> <given-names>K.</given-names></name> <name><surname>Op De Beeck</surname> <given-names>H.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>Temporal stability of stimulus representation increases along rodent visual cortical hierarchies.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>12</volume>:<fpage>4448</fpage>. <pub-id pub-id-type="doi">10.1038/s41467-021-24456-3</pub-id> <pub-id pub-id-type="pmid">34290247</pub-id></citation></ref>
<ref id="B110"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pitkow</surname> <given-names>X.</given-names></name> <name><surname>Meister</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Decorrelation and efficient coding by retinal ganglion cells.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>15</volume> <fpage>628</fpage>&#x2013;<lpage>635</lpage>. <pub-id pub-id-type="doi">10.1038/nn.3064</pub-id> <pub-id pub-id-type="pmid">22406548</pub-id></citation></ref>
<ref id="B111"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poort</surname> <given-names>J.</given-names></name> <name><surname>Wilmes</surname> <given-names>K. A.</given-names></name> <name><surname>Blot</surname> <given-names>A.</given-names></name> <name><surname>Chadwick</surname> <given-names>A.</given-names></name> <name><surname>Sahani</surname> <given-names>M.</given-names></name> <name><surname>Clopath</surname> <given-names>C.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>Learning and attention increase visual response selectivity through distinct mechanisms.</article-title> <source><italic>Neuron</italic></source> <volume>110</volume> <fpage>1</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2021.11.016</pub-id> <pub-id pub-id-type="pmid">34906356</pub-id></citation></ref>
<ref id="B112"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Porciatti</surname> <given-names>V.</given-names></name> <name><surname>Pizzorusso</surname> <given-names>T.</given-names></name> <name><surname>Maffei</surname> <given-names>L.</given-names></name></person-group> (<year>1999</year>). <article-title>The visual physiology of the wild type mouse determined with pattern VEPs.</article-title> <source><italic>Vis. Res.</italic></source> <volume>39</volume> <fpage>3071</fpage>&#x2013;<lpage>3081</lpage>. <pub-id pub-id-type="doi">10.1016/S0042-6989(99)00022-X</pub-id></citation></ref>
<ref id="B113"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Price</surname> <given-names>B. H.</given-names></name> <name><surname>Jensen</surname> <given-names>C. M.</given-names></name> <name><surname>Khoudary</surname> <given-names>A. A.</given-names></name> <name><surname>Gavornik</surname> <given-names>J. P.</given-names></name></person-group> (<year>2022</year>). <article-title>Expectation violations produce error signals in mouse V1.</article-title> <source><italic>bioRxiv</italic></source> [<comment>Preprint</comment>]. <pub-id pub-id-type="doi">10.1101/2021.12.31.474652</pub-id></citation></ref>
<ref id="B114"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Priebe</surname> <given-names>N. J.</given-names></name> <name><surname>Ferster</surname> <given-names>D.</given-names></name></person-group> (<year>2012</year>). <article-title>Mechanisms of neuronal computation in mammalian visual cortex.</article-title> <source><italic>Neuron</italic></source> <volume>75</volume> <fpage>194</fpage>&#x2013;<lpage>208</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2012.06.011</pub-id> <pub-id pub-id-type="pmid">22841306</pub-id></citation></ref>
<ref id="B115"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Prusky</surname> <given-names>G. T.</given-names></name> <name><surname>West</surname> <given-names>P. W. R.</given-names></name> <name><surname>Douglas</surname> <given-names>R. M.</given-names></name></person-group> (<year>2000</year>). <article-title>Behavioral assessment of visual acuity in mice and rats.</article-title> <source><italic>Vis. Res.</italic></source> <volume>40</volume> <fpage>2201</fpage>&#x2013;<lpage>2209</lpage>. <pub-id pub-id-type="doi">10.1016/S0042-6989(00)00081-X</pub-id></citation></ref>
<ref id="B116"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rao</surname> <given-names>R. P. N.</given-names></name> <name><surname>Ballard</surname> <given-names>D. H.</given-names></name></person-group> (<year>1999</year>). <article-title>Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>2</volume> <fpage>79</fpage>&#x2013;<lpage>87</lpage>. <pub-id pub-id-type="doi">10.1038/4580</pub-id> <pub-id pub-id-type="pmid">10195184</pub-id></citation></ref>
<ref id="B117"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Riesenhuber</surname> <given-names>M.</given-names></name> <name><surname>Poggio</surname> <given-names>T.</given-names></name></person-group> (<year>1999</year>). <article-title>Hierarchical models of object recognition in cortex.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>2</volume> <fpage>1019</fpage>&#x2013;<lpage>1025</lpage>.</citation></ref>
<ref id="B118"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>J.</given-names></name> <name><surname>Ulanovsky</surname> <given-names>N.</given-names></name> <name><surname>Nelken</surname> <given-names>I.</given-names></name> <name><surname>Tishby</surname> <given-names>N.</given-names></name></person-group> (<year>2016</year>). <article-title>The representation of prediction error in auditory cortex.</article-title> <source><italic>PLoS Comput. Biol.</italic></source> <volume>12</volume>:<fpage>e1005058</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1005058</pub-id> <pub-id pub-id-type="pmid">27490251</pub-id></citation></ref>
<ref id="B119"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sachdeva</surname> <given-names>V.</given-names></name> <name><surname>Mora</surname> <given-names>T.</given-names></name> <name><surname>Walczak</surname> <given-names>A. M.</given-names></name> <name><surname>Palmer</surname> <given-names>S. E.</given-names></name></person-group> (<year>2021</year>). <article-title>Optimal prediction with resource constraints using the information bottleneck.</article-title> <source><italic>PLoS Comput. Biol.</italic></source> <volume>17</volume>:<fpage>e1008743</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1008743</pub-id> <pub-id pub-id-type="pmid">33684112</pub-id></citation></ref>
<ref id="B120"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saleem</surname> <given-names>A. B.</given-names></name> <name><surname>Diamanti</surname> <given-names>M.</given-names></name> <name><surname>Fournier</surname> <given-names>J.</given-names></name> <name><surname>Harris</surname> <given-names>K. D.</given-names></name> <name><surname>Carandini</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>Coherent encoding of subjective spatial position in visual cortex and hippocampus.</article-title> <source><italic>Nature</italic></source> <volume>562</volume> <fpage>124</fpage>&#x2013;<lpage>127</lpage>. <pub-id pub-id-type="doi">10.1038/s41586-018-0516-1</pub-id> <pub-id pub-id-type="pmid">30202092</pub-id></citation></ref>
<ref id="B121"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salisbury</surname> <given-names>J.</given-names></name> <name><surname>Palmer</surname> <given-names>S. E.</given-names></name></person-group> (<year>2015</year>). <article-title>Optimal prediction and natural scene statistics in the retina.</article-title> <source><italic>arXiv</italic></source> [<comment>Preprint</comment>]. Available online at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1507.00125">https://arxiv.org/abs/1507.00125</ext-link> <comment>(accessed February 25, 2022)</comment>.</citation></ref>
<ref id="B122"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sanchez-Giraldo</surname> <given-names>L. G.</given-names></name> <name><surname>Laskar</surname> <given-names>M. N. U.</given-names></name> <name><surname>Schwartz</surname> <given-names>O.</given-names></name></person-group> (<year>2019</year>). <article-title>Normalization and pooling in hierarchical models of natural images.</article-title> <source><italic>Curr. Opin. Neurobiol.</italic></source> <volume>55</volume> <fpage>65</fpage>&#x2013;<lpage>72</lpage>. <pub-id pub-id-type="doi">10.1016/j.conb.2019.01.008</pub-id> <pub-id pub-id-type="pmid">30785005</pub-id></citation></ref>
<ref id="B123"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saul</surname> <given-names>A. B.</given-names></name> <name><surname>Humphrey</surname> <given-names>A. L.</given-names></name></person-group> (<year>1990</year>). <article-title>Spatial and temporal response properties of lagged and nonlagged cells in cat lateral geniculate nucleus.</article-title> <source><italic>J. Neurophysiol.</italic></source> <volume>64</volume> <fpage>206</fpage>&#x2013;<lpage>224</lpage>. <pub-id pub-id-type="doi">10.1152/jn.1990.64.1.206</pub-id> <pub-id pub-id-type="pmid">2388066</pub-id></citation></ref>
<ref id="B124"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sawtell</surname> <given-names>N. B.</given-names></name></person-group> (<year>2017</year>). <article-title>Neural mechanisms for predicting the sensory consequences of behavior: insights from electrosensory systems.</article-title> <source><italic>Annu. Rev. Physiol.</italic></source> <volume>79</volume> <fpage>381</fpage>&#x2013;<lpage>399</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-physiol-021115-105003</pub-id> <pub-id pub-id-type="pmid">27813831</pub-id></citation></ref>
<ref id="B125"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schulz</surname> <given-names>A.</given-names></name> <name><surname>Miehl</surname> <given-names>C.</given-names></name> <name><surname>Berry</surname> <given-names>M. J.</given-names></name> <name><surname>Gjorgjieva</surname> <given-names>J.</given-names></name></person-group> (<year>2021</year>). <article-title>The generation of cortical novelty responses through inhibitory plasticity.</article-title> <source><italic>eLife</italic></source> <volume>10</volume>:<fpage>e65309</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.65309</pub-id> <pub-id pub-id-type="pmid">34647889</pub-id></citation></ref>
<ref id="B126"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwartz</surname> <given-names>O.</given-names></name> <name><surname>Hsu</surname> <given-names>A.</given-names></name> <name><surname>Dayan</surname> <given-names>P.</given-names></name></person-group> (<year>2007</year>). <article-title>Space and time in visual context.</article-title> <source><italic>Nat. Rev. Neurosci.</italic></source> <volume>8</volume> <fpage>522</fpage>&#x2013;<lpage>535</lpage>. <pub-id pub-id-type="doi">10.1038/nrn2155</pub-id> <pub-id pub-id-type="pmid">17585305</pub-id></citation></ref>
<ref id="B127"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shadmehr</surname> <given-names>R.</given-names></name> <name><surname>Smith</surname> <given-names>M. A.</given-names></name> <name><surname>Krakauer</surname> <given-names>J. W.</given-names></name></person-group> (<year>2010</year>). <article-title>Error correction, sensory prediction, and adaptation in motor control.</article-title> <source><italic>Annu. Rev. Neurosci.</italic></source> <volume>33</volume> <fpage>89</fpage>&#x2013;<lpage>108</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-neuro-060909-153135</pub-id> <pub-id pub-id-type="pmid">20367317</pub-id></citation></ref>
<ref id="B128"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shannon</surname> <given-names>C. E.</given-names></name></person-group> (<year>1948</year>). <article-title>A mathematical theory of communication.</article-title> <source><italic>Bell Syst. Tech. J.</italic></source> <volume>27</volume> <fpage>623</fpage>&#x2013;<lpage>656</lpage>. <pub-id pub-id-type="doi">10.1002/j.1538-7305.1948.tb00917.x</pub-id></citation></ref>
<ref id="B129"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shuler</surname> <given-names>M. G.</given-names></name> <name><surname>Bear</surname> <given-names>M. F.</given-names></name></person-group> (<year>2006</year>). <article-title>Reward timing in the primary visual cortex.</article-title> <source><italic>Science</italic></source> <volume>311</volume> <fpage>1606</fpage>&#x2013;<lpage>1609</lpage>. <pub-id pub-id-type="doi">10.1126/science.1123513</pub-id> <pub-id pub-id-type="pmid">16543459</pub-id></citation></ref>
<ref id="B130"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Siegle</surname> <given-names>J. H.</given-names></name> <name><surname>Jia</surname> <given-names>X.</given-names></name> <name><surname>Durand</surname> <given-names>S.</given-names></name> <name><surname>Gale</surname> <given-names>S.</given-names></name> <name><surname>Bennett</surname> <given-names>C.</given-names></name> <name><surname>Graddis</surname> <given-names>N.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>Survey of spiking in the mouse visual system reveals functional hierarchy.</article-title> <source><italic>Nature</italic></source> <volume>592</volume> <fpage>86</fpage>&#x2013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1038/s41586-020-03171-x</pub-id> <pub-id pub-id-type="pmid">33473216</pub-id></citation></ref>
<ref id="B131"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Simoncelli</surname> <given-names>E. P.</given-names></name></person-group> (<year>2003</year>). <article-title>Vision and the statistics of the visual environment.</article-title> <source><italic>Curr. Opin. Neurobiol.</italic></source> <volume>13</volume> <fpage>144</fpage>&#x2013;<lpage>149</lpage>. <pub-id pub-id-type="doi">10.1016/S0959-4388(03)00047-3</pub-id></citation></ref>
<ref id="B132"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Singer</surname> <given-names>Y.</given-names></name> <name><surname>Teramoto</surname> <given-names>Y.</given-names></name> <name><surname>Willmore</surname> <given-names>B. D. B.</given-names></name> <name><surname>King</surname> <given-names>A. J.</given-names></name> <name><surname>Schnupp</surname> <given-names>J. W. H.</given-names></name> <name><surname>Harper</surname> <given-names>N. S.</given-names></name></person-group> (<year>2018</year>). <article-title>Sensory cortex is optimised for prediction of future input.</article-title> <source><italic>eLife</italic></source> <volume>7</volume>:<fpage>e31557</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.31557</pub-id> <pub-id pub-id-type="pmid">29911971</pub-id></citation></ref>
<ref id="B133"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Solomon</surname> <given-names>S. S.</given-names></name> <name><surname>Tang</surname> <given-names>H.</given-names></name> <name><surname>Sussman</surname> <given-names>E.</given-names></name> <name><surname>Kohn</surname> <given-names>A.</given-names></name></person-group> (<year>2021</year>). <article-title>Limited evidence for sensory prediction error responses in visual cortex of macaques and humans.</article-title> <source><italic>Cereb. Cortex</italic></source> <volume>31</volume> <fpage>3136</fpage>&#x2013;<lpage>3152</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhab014</pub-id> <pub-id pub-id-type="pmid">33683317</pub-id></citation></ref>
<ref id="B134"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Spratling</surname> <given-names>M. W.</given-names></name></person-group> (<year>2010</year>). <article-title>Predictive coding as a model of response properties in cortical area V1.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>30</volume> <fpage>3531</fpage>&#x2013;<lpage>3543</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.4911-09.2010</pub-id> <pub-id pub-id-type="pmid">20203213</pub-id></citation></ref>
<ref id="B135"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Spratling</surname> <given-names>M. W.</given-names></name></person-group> (<year>2017</year>). <article-title>A review of predictive coding algorithms.</article-title> <source><italic>Brain Cogn.</italic></source> <volume>112</volume> <fpage>92</fpage>&#x2013;<lpage>97</lpage>. <pub-id pub-id-type="doi">10.1016/j.bandc.2015.11.003</pub-id> <pub-id pub-id-type="pmid">26809759</pub-id></citation></ref>
<ref id="B136"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Srinivasan</surname> <given-names>M. V.</given-names></name> <name><surname>Laughlin</surname> <given-names>S. B.</given-names></name> <name><surname>Dubs</surname> <given-names>A.</given-names></name></person-group> (<year>1982</year>). <article-title>Predictive coding: a fresh view of inhibition in the retina.</article-title> <source><italic>Proc. R. Soc. Lond.</italic></source> <volume>216</volume> <fpage>427</fpage>&#x2013;<lpage>459</lpage>. <pub-id pub-id-type="doi">10.1098/rspb.1982.0085</pub-id> <pub-id pub-id-type="pmid">6129637</pub-id></citation></ref>
<ref id="B137"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sterling</surname> <given-names>P.</given-names></name> <name><surname>Laughlin</surname> <given-names>S.</given-names></name></person-group> (<year>2015</year>). <source><italic>Principles of Neural Design.</italic></source> <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>The MIT Press</publisher-name>.</citation></ref>
<ref id="B138"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stone</surname> <given-names>J. V.</given-names></name></person-group> (<year>1998</year>). <article-title>Object recognition using spatiotemporal signatures.</article-title> <source><italic>Vis. Res.</italic></source> <volume>38</volume> <fpage>947</fpage>&#x2013;<lpage>951</lpage>. <pub-id pub-id-type="doi">10.1016/S0042-6989(97)00301-5</pub-id></citation></ref>
<ref id="B139"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stone</surname> <given-names>J. V.</given-names></name></person-group> (<year>1999</year>). <article-title>Object recognition: view-specificity and motion-specificity.</article-title> <source><italic>Vis. Res.</italic></source> <volume>39</volume> <fpage>4032</fpage>&#x2013;<lpage>4044</lpage>. <pub-id pub-id-type="doi">10.1016/S0042-6989(99)00123-6</pub-id></citation></ref>
<ref id="B140"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stone</surname> <given-names>J. V.</given-names></name></person-group> (<year>2018</year>). <source><italic>Principles of Neural Information Theory: Computational Neuroscience and Metabolic Efficiency</italic></source>, <edition>1st Edn</edition>. <publisher-loc>Sheffield</publisher-loc>: <publisher-name>Sebtel Press</publisher-name>.</citation></ref>
<ref id="B141"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stringer</surname> <given-names>C.</given-names></name> <name><surname>Pachitariu</surname> <given-names>M.</given-names></name> <name><surname>Steinmetz</surname> <given-names>N.</given-names></name> <name><surname>Reddy</surname> <given-names>C. B.</given-names></name> <name><surname>Carandini</surname> <given-names>M.</given-names></name> <name><surname>Harris</surname> <given-names>K. D.</given-names></name></person-group> (<year>2019</year>). <article-title>Spontaneous behaviors drive multidimensional, brainwide activity.</article-title> <source><italic>Science</italic></source> <volume>364</volume>:<fpage>255</fpage>. <pub-id pub-id-type="doi">10.1126/science.aav7893</pub-id> <pub-id pub-id-type="pmid">31000656</pub-id></citation></ref>
<ref id="B142"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tang</surname> <given-names>M. F.</given-names></name> <name><surname>Smout</surname> <given-names>C. A.</given-names></name> <name><surname>Arabzadeh</surname> <given-names>E.</given-names></name> <name><surname>Mattingley</surname> <given-names>J. B.</given-names></name></person-group> (<year>2018</year>). <article-title>Prediction error and repetition suppression have distinct effects on neural representations of visual information.</article-title> <source><italic>eLife</italic></source> <volume>7</volume>:<fpage>e33123</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.33123</pub-id> <pub-id pub-id-type="pmid">30547881</pub-id></citation></ref>
<ref id="B143"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Teuli&#x00E8;re</surname> <given-names>C.</given-names></name> <name><surname>Forestier</surname> <given-names>S.</given-names></name> <name><surname>Lonini</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>C.</given-names></name> <name><surname>Zhao</surname> <given-names>Y.</given-names></name> <name><surname>Shi</surname> <given-names>B.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Self-calibrating smooth pursuit through active efficient coding.</article-title> <source><italic>Robot. Auton. Syst.</italic></source> <volume>71</volume> <fpage>3</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1016/j.robot.2014.11.006</pub-id></citation></ref>
<ref id="B144"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tishby</surname> <given-names>N.</given-names></name> <name><surname>Pereira</surname> <given-names>F. C.</given-names></name> <name><surname>Bialek</surname> <given-names>W.</given-names></name></person-group> (<year>2000</year>). <article-title>The information bottleneck method.</article-title> <source><italic>arXiv</italic></source> [<comment>Preprint</comment>]. Available online at: <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/physics/0004057">http://arxiv.org/abs/physics/0004057</ext-link> <comment>(accessed March 1, 2021)</comment>.</citation></ref>
<ref id="B145"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tucci</surname> <given-names>V.</given-names></name> <name><surname>Buhusi</surname> <given-names>C. V.</given-names></name> <name><surname>Gallistel</surname> <given-names>R.</given-names></name> <name><surname>Meck</surname> <given-names>W. H.</given-names></name></person-group> (<year>2014</year>). <article-title>Towards an integrated understanding of the biology of timing.</article-title> <source><italic>Philos. Trans. R. Soc. Lond.</italic></source> <volume>369</volume>:<fpage>20120470</fpage>. <pub-id pub-id-type="doi">10.1098/rstb.2012.0470</pub-id> <pub-id pub-id-type="pmid">24446503</pub-id></citation></ref>
<ref id="B146"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ungerleider</surname> <given-names>L. G.</given-names></name> <name><surname>Mishkin</surname> <given-names>M.</given-names></name></person-group> (<year>1982</year>). &#x201C;<article-title>Two cortical visual systems</article-title>,&#x201D; in <source><italic>Analysis of Visual Behavior</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Ingle</surname> <given-names>D. J.</given-names></name> <name><surname>Goodale</surname> <given-names>M. A.</given-names></name> <name><surname>Mansfield</surname> <given-names>R. J. W.</given-names></name></person-group> (<publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT Press</publisher-name>).</citation></ref>
<ref id="B147"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>van den Oord</surname> <given-names>A.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Vinyals</surname> <given-names>O.</given-names></name></person-group> (<year>2018</year>). <article-title>Representation learning with contrastive predictive coding.</article-title> <source><italic>arXiv</italic></source> [<comment>Preprint</comment>]. Available online at: <ext-link ext-link-type="uri" xlink:href="http://arxiv.org/abs/1807.03748">http://arxiv.org/abs/1807.03748</ext-link> <comment>(accessed January 10, 2022)</comment>.</citation></ref>
<ref id="B148"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>van Hateren</surname> <given-names>J. H.</given-names></name></person-group> (<year>1992</year>). <article-title>A theory of maximizing sensory information.</article-title> <source><italic>Biol. Cybernet.</italic></source> <volume>68</volume> <fpage>23</fpage>&#x2013;<lpage>29</lpage>. <pub-id pub-id-type="doi">10.1007/s00422-003-0455-1</pub-id></citation></ref>
<ref id="B149"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vasilevskaya</surname> <given-names>A.</given-names></name> <name><surname>Widmer</surname> <given-names>F. C.</given-names></name> <name><surname>Keller</surname> <given-names>G. B.</given-names></name> <name><surname>Jordan</surname> <given-names>R.</given-names></name></person-group> (<year>2022</year>). <article-title>Locomotion-induced gain of visual responses cannot explain visuomotor mismatch responses in layer 2/3 of primary visual cortex.</article-title> <source><italic>bioRxiv</italic></source> [<comment>Preprint</comment>]. <pub-id pub-id-type="doi">10.1101/2022.02.11.479795</pub-id></citation></ref>
<ref id="B150"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vinck</surname> <given-names>M.</given-names></name> <name><surname>Batista-Brito</surname> <given-names>R.</given-names></name> <name><surname>Knoblich</surname> <given-names>U.</given-names></name> <name><surname>Cardin</surname> <given-names>J. A.</given-names></name></person-group> (<year>2015</year>). <article-title>Arousal and locomotion make distinct contributions to cortical activity patterns and visual encoding.</article-title> <source><italic>Neuron</italic></source> <volume>86</volume> <fpage>740</fpage>&#x2013;<lpage>754</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2015.03.028</pub-id> <pub-id pub-id-type="pmid">25892300</pub-id></citation></ref>
<ref id="B151"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vinken</surname> <given-names>K.</given-names></name> <name><surname>Vogels</surname> <given-names>R.</given-names></name> <name><surname>Op de Beeck</surname> <given-names>H.</given-names></name></person-group> (<year>2017</year>). <article-title>Recent visual experience shapes visual processing in rats through stimulus-specific adaptation and response enhancement.</article-title> <source><italic>Curr. Biol.</italic></source> <volume>27</volume> <fpage>914</fpage>&#x2013;<lpage>919</lpage>. <pub-id pub-id-type="doi">10.1016/j.cub.2017.02.024</pub-id> <pub-id pub-id-type="pmid">28262485</pub-id></citation></ref>
<ref id="B152"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Voss</surname> <given-names>L. J.</given-names></name> <name><surname>Garcia</surname> <given-names>P. S.</given-names></name> <name><surname>Hentschke</surname> <given-names>H.</given-names></name> <name><surname>Banks</surname> <given-names>M. I.</given-names></name></person-group> (<year>2019</year>). <article-title>Understanding the effects of general anesthetics on cortical network activity using ex vivo preparations.</article-title> <source><italic>Anesthesiology</italic></source> <volume>130</volume> <fpage>1049</fpage>&#x2013;<lpage>1063</lpage>. <pub-id pub-id-type="doi">10.1097/ALN.0000000000002554</pub-id> <pub-id pub-id-type="pmid">30694851</pub-id></citation></ref>
<ref id="B153"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wainwright</surname> <given-names>M. J.</given-names></name></person-group> (<year>1999</year>). <article-title>Visual adaptation as optimal information transmission.</article-title> <source><italic>Vis. Res.</italic></source> <volume>39</volume> <fpage>3960</fpage>&#x2013;<lpage>3974</lpage>. <pub-id pub-id-type="doi">10.1016/S0042-6989(99)00101-7</pub-id></citation></ref>
<ref id="B154"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Narain</surname> <given-names>D.</given-names></name> <name><surname>Hosseini</surname> <given-names>E. A.</given-names></name> <name><surname>Jazayeri</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>Flexible timing by temporal scaling of cortical responses.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>21</volume> <fpage>102</fpage>&#x2013;<lpage>112</lpage>. <pub-id pub-id-type="doi">10.1038/s41593-017-0028-6</pub-id> <pub-id pub-id-type="pmid">29203897</pub-id></citation></ref>
<ref id="B155"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>S.</given-names></name> <name><surname>Segev</surname> <given-names>I.</given-names></name> <name><surname>Borst</surname> <given-names>A.</given-names></name> <name><surname>Palmer</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>Maximally efficient prediction in the early fly visual system may support evasive flight maneuvers.</article-title> <source><italic>PLoS Comput. Biol.</italic></source> <volume>17</volume>:<fpage>e1008965</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pcbi.1008965</pub-id> <pub-id pub-id-type="pmid">34014926</pub-id></citation></ref>
<ref id="B156"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weber</surname> <given-names>A. I.</given-names></name> <name><surname>Krishnamurthy</surname> <given-names>K.</given-names></name> <name><surname>Fairhall</surname> <given-names>A. L.</given-names></name></person-group> (<year>2019</year>). <article-title>Coding principles in adaptation.</article-title> <source><italic>Annu. Rev. Vis. Sci.</italic></source> <volume>5</volume> <fpage>223</fpage>&#x2013;<lpage>246</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-vision-091718</pub-id></citation></ref>
<ref id="B157"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Whittington</surname> <given-names>J. C. R.</given-names></name> <name><surname>Bogacz</surname> <given-names>R.</given-names></name></person-group> (<year>2019</year>). <article-title>Theories of error back-propagation in the brain.</article-title> <source><italic>Trends Cogn. Sci.</italic></source> <volume>23</volume> <fpage>235</fpage>&#x2013;<lpage>250</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2018.12.005</pub-id> <pub-id pub-id-type="pmid">30704969</pub-id></citation></ref>
<ref id="B158"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wiskott</surname> <given-names>L.</given-names></name> <name><surname>Sejnowski</surname> <given-names>T. J.</given-names></name></person-group> (<year>2002</year>). <article-title>Slow feature analysis: unsupervised learning of invariances.</article-title> <source><italic>Neural Comput.</italic></source> <volume>14</volume> <fpage>715</fpage>&#x2013;<lpage>770</lpage>.</citation></ref>
<ref id="B159"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>S.</given-names></name> <name><surname>Jiang</surname> <given-names>W.</given-names></name> <name><surname>Poo</surname> <given-names>M.</given-names></name> <name><surname>Dan</surname> <given-names>Y.</given-names></name></person-group> (<year>2012</year>). <article-title>Activity recall in a visual cortical ensemble.</article-title> <source><italic>Nat. Neurosci.</italic></source> <volume>15</volume> <fpage>449</fpage>&#x2013;<lpage>455</lpage>. <pub-id pub-id-type="doi">10.1038/nn.3036</pub-id> <pub-id pub-id-type="pmid">22267160</pub-id></citation></ref>
<ref id="B160"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yamins</surname> <given-names>D. L. K.</given-names></name> <name><surname>Hong</surname> <given-names>H.</given-names></name> <name><surname>Cadieu</surname> <given-names>C. F.</given-names></name> <name><surname>Solomon</surname> <given-names>E. A.</given-names></name> <name><surname>Seibert</surname> <given-names>D.</given-names></name> <name><surname>DiCarlo</surname> <given-names>J. J.</given-names></name></person-group> (<year>2014</year>). <article-title>Performance-optimized hierarchical models predict neural responses in higher visual cortex.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>111</volume> <fpage>8619</fpage>&#x2013;<lpage>8624</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1403112111</pub-id> <pub-id pub-id-type="pmid">24812127</pub-id></citation></ref>
<ref id="B161"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zador</surname> <given-names>A. M.</given-names></name></person-group> (<year>2019</year>). <article-title>A critique of pure learning and what artificial neural networks can learn from animal brains.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>10</volume>:<fpage>3770</fpage>. <pub-id pub-id-type="doi">10.1038/s41467-019-11786-6</pub-id> <pub-id pub-id-type="pmid">31434893</pub-id></citation></ref>
<ref id="B162"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>Y.</given-names></name> <name><surname>Rothkopf</surname> <given-names>C. A.</given-names></name> <name><surname>Triesch</surname> <given-names>J.</given-names></name> <name><surname>Shi</surname> <given-names>B. E.</given-names></name></person-group> (<year>2012</year>). &#x201C;<article-title>A unified model of the joint development of disparity selectivity and vergence control</article-title>,&#x201D; in <source><italic>Proceedings of the IEEE International Conference on Development and Learning and Epigenetic Robotics</italic></source> (<publisher-loc>Piscataway, NJ</publisher-loc>: <publisher-name>IEEE</publisher-name>).</citation></ref>
<ref id="B163"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhuang</surname> <given-names>C.</given-names></name> <name><surname>Yan</surname> <given-names>S.</given-names></name> <name><surname>Nayebi</surname> <given-names>A.</given-names></name> <name><surname>Schrimpf</surname> <given-names>M.</given-names></name> <name><surname>Frank</surname> <given-names>M. C.</given-names></name> <name><surname>DiCarlo</surname> <given-names>J. J.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>Unsupervised neural network models of the ventral visual stream.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>118</volume>:<fpage>e2014196118</fpage>. <pub-id pub-id-type="doi">10.1073/pnas.2014196118</pub-id> <pub-id pub-id-type="pmid">33431673</pub-id></citation></ref>
<ref id="B164"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zmarz</surname> <given-names>P.</given-names></name> <name><surname>Keller</surname> <given-names>G. B.</given-names></name></person-group> (<year>2016</year>). <article-title>Mismatch receptive fields in mouse visual cortex.</article-title> <source><italic>Neuron</italic></source> <volume>92</volume> <fpage>766</fpage>&#x2013;<lpage>772</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2016.09.057</pub-id> <pub-id pub-id-type="pmid">27974161</pub-id></citation></ref>
</ref-list>
</back>
</article>