<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Comput. Neurosci.</journal-id>
<journal-title>Frontiers in Computational Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Comput. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-5188</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fncom.2014.00150</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research Article</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Multi-dimensional classification of GABAergic interneurons with Bayesian network-modeled label uncertainty</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Mihaljevi&#x00107;</surname> <given-names>Bojan</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://community.frontiersin.org/people/u/131279"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Bielza</surname> <given-names>Concha</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://community.frontiersin.org/people/u/99617"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Benavides-Piccione</surname> <given-names>Ruth</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<uri xlink:href="http://community.frontiersin.org/people/u/27665"/>
</contrib>
<contrib contrib-type="author">
<name><surname>DeFelipe</surname> <given-names>Javier</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<uri xlink:href="http://community.frontiersin.org/people/u/5"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Larra&#x000F1;aga</surname> <given-names>Pedro</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://community.frontiersin.org/people/u/113559"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Computational Intelligence Group, Departamento de Inteligencia Artificial, Escuela T&#x000E9;cnica Superior de Ingenieros Inform&#x000E1;ticos, Universidad Polit&#x000E9;cnica de Madrid</institution> <country>Madrid, Spain</country></aff>
<aff id="aff2"><sup>2</sup><institution>Laboratorio Cajal de Circuitos Corticales, Centro de Tecnolog&#x000ED;a Biom&#x000E9;dica, Universidad Polit&#x000E9;cnica de Madrid</institution> <country>Madrid, Spain</country></aff>
<aff id="aff3"><sup>3</sup><institution>Instituto Cajal, Consejo Superior de Investigaciones Cient&#x000ED;ficas</institution> <country>Madrid, Spain</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: David Hansel, University of Paris, France</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Volker Steuber, University of Hertfordshire, UK; Benjamin Torben-Nielsen, Okinawa Institute of Science and Technology Graduate University, Japan</p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x0002A;Correspondence: Bojan Mihaljevi&#x00107;, Departamento de Inteligencia Artificial, Escuela T&#x000E9;cnica Superior de Ingenieros Inform&#x000E1;ticos, Universidad Polit&#x000E9;cnica de Madrid, Campus de Montegancedo s/n, Boadilla del Monte 28660, Madrid, Spain e-mail: <email>bmihaljevic&#x00040;fi.upm.es</email></p></fn>
<fn fn-type="other" id="fn002"><p>This article was submitted to the journal Frontiers in Computational Neuroscience.</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>25</day>
<month>11</month>
<year>2014</year>
</pub-date>
<pub-date pub-type="collection">
<year>2014</year>
</pub-date>
<volume>8</volume>
<elocation-id>150</elocation-id>
<history>
<date date-type="received">
<day>13</day>
<month>06</month>
<year>2014</year>
</date>
<date date-type="accepted">
<day>03</day>
<month>11</month>
<year>2014</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2014 Mihaljevi&#x00107;, Bielza, Benavides-Piccione, DeFelipe and Larra&#x000F1;aga.</copyright-statement>
<copyright-year>2014</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract><p>Interneuron classification is an important and long-debated topic in neuroscience. A recent study provided a data set of digitally reconstructed interneurons classified by 42 leading neuroscientists according to a pragmatic classification scheme composed of five categorical variables, namely, of the interneuron type and four features of axonal morphology. From this data set we learned a model that can classify interneurons, on the basis of their axonal morphometric parameters, into these five descriptive variables simultaneously. Because of differences in opinion among the neuroscientists, especially regarding neuronal type, for many interneurons we lacked a unique, agreed-upon classification that we could use to guide model learning. Instead, we guided model learning with a probability distribution over the neuronal type and the axonal features, obtained, for each interneuron, from the neuroscientists&#x00027; classification choices. We conveniently encoded such probability distributions with Bayesian networks, calling them <italic>label Bayesian networks</italic> (LBNs), and developed a method to predict them. This method predicts an LBN by forming a probabilistic consensus among the LBNs of the interneurons most similar to the one being classified. We used 18 axonal morphometric parameters as predictor variables, 13 of which we introduce in this paper as quantitative counterparts to the categorical axonal features. We were able to accurately predict interneuronal LBNs. Furthermore, when extracting crisp (i.e., non-probabilistic) predictions from the predicted LBNs, our method outperformed related work on interneuron classification. Our results indicate that our method is adequate for multi-dimensional classification of interneurons with probabilistic labels. 
Moreover, the introduced morphometric parameters are good predictors of interneuron type and the four features of axonal morphology and thus may serve as objective counterparts to the subjective, categorical axonal features.</p></abstract>
<kwd-group>
<kwd>probabilistic labels</kwd>
<kwd>consensus</kwd>
<kwd>distance-weighted k nearest neighbors</kwd>
<kwd>multiple annotators</kwd>
<kwd>neuronal morphology</kwd>
</kwd-group>
<counts>
<fig-count count="3"/>
<table-count count="5"/>
<equation-count count="5"/>
<ref-count count="77"/>
<page-count count="13"/>
<word-count count="10323"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="introduction" id="s1">
<title>1. Introduction</title>
<p>There are two main neuron subpopulations in the cerebral cortex: excitatory glutamatergic neurons, constituting approximately 80% of all cortical neurons, and inhibitory GABAergic interneurons, representing the remaining 20%. Although less numerous, GABAergic interneurons (for simplicity, interneurons) perform multiple critical cortical functions and are highly heterogeneous with regard to their morphological, electrophysiological, and molecular properties (Ascoli et al., <xref ref-type="bibr" rid="B3">2007</xref>). Neuroscientists consider these differences to indicate that distinct interneuron types exist and that the differences among them are functionally relevant. Although many different classification schemes have been proposed so far (e.g., Fair&#x000E9;n et al., <xref ref-type="bibr" rid="B24">1992</xref>; Kawaguchi, <xref ref-type="bibr" rid="B40">1993</xref>; Cauli et al., <xref ref-type="bibr" rid="B11">1997</xref>; Somogyi et al., <xref ref-type="bibr" rid="B68">1998</xref>; Gupta et al., <xref ref-type="bibr" rid="B32">2000</xref>; Maccaferri and Lacaille, <xref ref-type="bibr" rid="B45">2003</xref>), there is no universally accepted catalog of interneuron types (DeFelipe et al., <xref ref-type="bibr" rid="B15">2013</xref>), making it hard to share and organize data and the knowledge derived from them. Ascoli et al. (<xref ref-type="bibr" rid="B3">2007</xref>) have identified a large set of morphological, electrophysiological, and molecular properties which can be used to distinguish among interneuron types. However, gathering such comprehensive data entails considerable practical burdens (DeFelipe et al., <xref ref-type="bibr" rid="B15">2013</xref>), making it hard to follow such a classification in practice.</p>
<p>Therefore, DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) proposed an alternative, pragmatic classification scheme, based on patterns of axonal arborization. The scheme classifies interneurons according to their type and four other features of axonal morphology. It comprises ten types, most of them well established in the literature, such as Martinotti and chandelier, and provides rather precise definitions of their axonal and dendritic morphology. The remaining axonal features are categorical properties such as the axon&#x00027;s columnar and laminar reach (i.e., whether it is intra- or trans-columnar; intra- or trans-laminar)<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref>. To assess the viability of this classification scheme, that is, whether it is useful for cataloging interneurons, DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) convened 42 leading neuroscientists to classify 320 interneurons. While the experts easily distinguished some of the neuronal types and the four remaining features, they found some types to be somewhat confusing.</p>
<p>Nonetheless, the data they gathered provides a basis for building an objective, automated classifier, which would map quantitative neuronal properties to interneuron types and the categories of axonal features. Automatic classification of neurons has mainly been done in an unsupervised fashion (Jain, <xref ref-type="bibr" rid="B37">2010</xref>), seeking to discover groups on the basis of quantitative properties alone (Cauli et al., <xref ref-type="bibr" rid="B12">2000</xref>; Tsiola et al., <xref ref-type="bibr" rid="B72">2003</xref>; Karagiannis et al., <xref ref-type="bibr" rid="B39">2009</xref>; McGarry et al., <xref ref-type="bibr" rid="B50">2010</xref>). However, the availability of expert-provided input on interneuron type membership and their axonal features allows us to learn a model in a supervised fashion (Duda et al., <xref ref-type="bibr" rid="B19">2000</xref>), as done by, e.g., Marin et al. (<xref ref-type="bibr" rid="B48">2002</xref>) and Druckmann et al. (<xref ref-type="bibr" rid="B18">2013</xref>). When such supervision information is available, supervised learning can yield more accurate models than unsupervised learning (Guerra et al., <xref ref-type="bibr" rid="B31">2011</xref>). In addition, a model obtained in this way can be used to replace experts, as it can, given an interneuron, automatically predict its properties (the type and axonal features).</p>
<p>Using the neuroscientists&#x00027; classification choices as input for supervised classification is challenging due to the ambiguity in type membership and axonal features of the interneurons. While this ambiguity varied across our data, some interneurons were especially ambiguous: e.g., one was assigned to six different types, with at most 14 (out of 42) experts agreeing on one of these types. Previous efforts to predict the neuronal type and axonal features (DeFelipe et al., <xref ref-type="bibr" rid="B15">2013</xref>; Mihaljevi&#x00107; et al., <xref ref-type="bibr" rid="B51">2014a</xref>,<xref ref-type="bibr" rid="B52">b</xref>) considered such majority choices as <italic>ground truth</italic>, i.e., as the true type and axonal features, and therefore, for each interneuron, disregarded the opinions of the neuroscientists who disagreed with the majority for that interneuron. While Mihaljevi&#x00107; et al. (<xref ref-type="bibr" rid="B52">2014b</xref>) only predicted the neuronal type, DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) and Mihaljevi&#x00107; et al. (<xref ref-type="bibr" rid="B51">2014a</xref>) built an independent model for each axonal feature, although these features are complementary.</p>
<p>In this paper, we predict interneuron type and axonal features simultaneously, while accounting for class label ambiguity in a principled way. Namely, for each interneuron, we encode the neuroscientists&#x00027; input with a joint probability distribution over the five class variables<xref ref-type="fn" rid="fn0002"><sup>2</sup></xref>. That is, we consider that each interneuron has a certain probability of belonging to each possible combination of the five axonal features. Assuming that all experts were equally good at classifying interneurons, these probabilities are given by the relative frequencies of such combinations in the expert-provided input. This way, we take the opinions of all annotator neuroscientists into account. Such probability distributions can be compactly encoded with Bayesian networks (Pearl, <xref ref-type="bibr" rid="B56">1988</xref>; Koller and Friedman, <xref ref-type="bibr" rid="B42">2009</xref>), given sufficient conditional independencies among the variables. We will therefore represent these joint probability distributions over class variables with Bayesian networks and call them <italic>label Bayesian networks</italic> (LBNs). As a first step in the present study, we will obtain LBNs from the experts&#x00027; input; subsequently, we will train and evaluate our model using LBNs as input.</p>
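As a concrete illustration of how such a distribution arises from the experts' choices, the relative frequencies of the chosen label combinations can be tabulated as follows. This is a minimal sketch with hypothetical labels; in the paper, the resulting joint distribution is encoded compactly as a Bayesian network rather than as an explicit table.

```python
from collections import Counter

def label_distribution(annotations):
    """Joint distribution over the five class variables estimated from expert
    annotations, assuming all annotators are equally reliable. Each annotation
    is one expert's choice: a tuple (c1, c2, c3, c4, c5)."""
    counts = Counter(annotations)
    n = len(annotations)
    return {combo: k / n for combo, k in counts.items()}

# Hypothetical choices by four experts for a single interneuron:
votes = [
    ("translaminar", "transcolumnar", "displaced", "ascending", "MA"),
    ("translaminar", "transcolumnar", "displaced", "ascending", "MA"),
    ("translaminar", "intracolumnar", "displaced", "ascending", "MA"),
    ("translaminar", "transcolumnar", "displaced", "ascending", "CT"),
]
p = label_distribution(votes)  # the majority combination gets probability 2/4
```

With more than a few class variables, the number of possible combinations grows quickly, which is why a compact Bayesian-network encoding becomes attractive.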
<p>To the best of our knowledge, this is the first paper tackling multi-dimensional classification (i.e., with multiple class variables; Van Der Gaag and De Waal, <xref ref-type="bibr" rid="B74">2006</xref>; Bielza et al., <xref ref-type="bibr" rid="B7">2011</xref>) with probabilistic labels. Multi-dimensional classification is hard because of dependencies among class variables: ignoring them, by building a separate model for each variable, is suboptimal, while modeling them can result in data scarcity if there are more than a few class variables. Instead of identifying global dependencies among class variables, we predict the LBN of an interneuron by looking at the interneurons most similar to it (i.e., its neighbors in the space of predictor variables), following the lazy-learning <italic>k</italic>-nearest neighbors method (<italic>k</italic>-nn) (Fix and Hodges, <xref ref-type="bibr" rid="B25">1989</xref>). Having found the neighbors of an interneuron, we predict its LBN by forming a consensus Bayesian network (e.g., Matzkevich and Abramson, <xref ref-type="bibr" rid="B49">1992</xref>) among the neighbors&#x00027; LBNs. In order to give more weight in the consensus distribution to the LBNs of the closer neighbors, we adapt the Bayesian network consensus method developed by Lopez-Cruz et al. (<xref ref-type="bibr" rid="B44">2014</xref>).</p>
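The prediction step can be sketched as follows. For illustration only, each neighbor's label distribution is held as an explicit dictionary and the consensus is a distance-weighted mixture; the paper instead forms a consensus of the neighbors' label Bayesian networks, adapting the method of Lopez-Cruz et al. (2014). All names here are hypothetical.

```python
import math

def knn_consensus(query, train, k=3):
    """Distance-weighted k-nn consensus sketch: find the k training instances
    closest to `query` and mix their label distributions, giving larger
    weights to closer neighbors."""
    # train: list of (predictor_vector, label_distribution) pairs
    scored = sorted(((math.dist(query, x), lp) for x, lp in train),
                    key=lambda t: t[0])[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in scored]  # closer => larger weight
    z = sum(weights)
    consensus = {}
    for w, (_, lp) in zip(weights, scored):
        for labels, prob in lp.items():
            consensus[labels] = consensus.get(labels, 0.0) + (w / z) * prob
    return consensus
```

Because the weights are normalized, the consensus is again a probability distribution, dominated by the labels of the closest neighbors.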
<p>Note that our method takes LBNs, rather than the expert-provided labels, as input, thus abstracting away the annotators. In a similar real-world scenario, this might be useful for hiding the annotators&#x00027; labels from the data analyst, for reasons such as confidentiality protection. Furthermore, an LBN could be obtained in multiple ways: by learning from data, eliciting from an expert, or combining expert knowledge and learning from data.</p>
<p>In order to predict the neuronal type and axonal features, we introduce 13 morphometric parameters of the axon to be used as predictor variables. We defined these parameters seeking to capture the concepts represented by the four axonal features (other than neuronal type) and implemented software that computes them from digital reconstructions of neuronal morphology. In addition, we used five other axonal morphometric parameters, computed with NeuroExplorer (Glaser and Glaser, <xref ref-type="bibr" rid="B27">1990</xref>), which were already used as predictors of neuronal type by DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) and Mihaljevi&#x00107; et al. (<xref ref-type="bibr" rid="B52">2014b</xref>). In total, we used 18 axonal morphometric parameters as predictor variables.</p>
<p>We found that our method accurately predicted the probability distributions encoded by the LBNs. Also, for comparison with previous work on interneuron classification, we assessed the prediction of the majority class labels, and found that we outperformed DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) in per-class majority-label accuracy.</p>
<p>The rest of this paper is structured as follows. Section 2 describes the data set, the interneuron nomenclature due to DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>), the morphometric parameters, including the ones we introduce in this paper, and the extraction of LBNs from expert-provided labels; it also describes the proposed method&#x02014;the distance-weighted consensus of <italic>k</italic> nearest Bayesian networks&#x02014;the related methods, the metrics for assessing our method&#x00027;s predictive performance, and, finally, the experimental setting. We provide our results in Section 3, discuss them in Section 4, and conclude in Section 5.</p>
</sec>
<sec sec-type="materials and methods" id="s2">
<title>2. Materials and methods</title>
<sec>
<title>2.1. Neuronal reconstructions</title>
<p>We used neuronal reconstructions and expert neuroscientists&#x00027; terminological choices that were gathered by DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>). Of the 320 interneurons classified in that study, 241 were digitally reconstructed cells (retrieved by DeFelipe et al., <xref ref-type="bibr" rid="B15">2013</xref> from NeuroMorpho.Org, Ascoli et al., <xref ref-type="bibr" rid="B3">2007</xref>), coming from different areas and layers of the cerebral cortex of the mouse, rat, and monkey. Forty of the reconstructions had one or more axonal processes with interrupted (i.e., non-continuous) tracing; when deemed feasible (36 cells), we unified the axonal processes using Neurolucida (MicroBrightField, Inc., Williston, VT, USA). We omitted the remaining four cells from our study, reducing our data sample to 237 cells.</p>
</sec>
<sec>
<title>2.2. Axonal feature-based nomenclature</title>
<p>DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) asked 42 expert neuroscientists to classify the above-described interneurons according to the interneuron nomenclature they proposed. The nomenclature consists of six categorical features of axonal arborization. The features&#x00027; categories are the following:
<list list-type="order">
<list-item><p>Axonal feature 1 (<italic>C</italic><sub>1</sub>): <monospace>intralaminar</monospace> and <monospace>translaminar</monospace></p></list-item>
<list-item><p>Axonal feature 2 (<italic>C</italic><sub>2</sub>): <monospace>intracolumnar</monospace> and <monospace>transcolumnar</monospace></p></list-item>
<list-item><p>Axonal feature 3 (<italic>C</italic><sub>3</sub>): <monospace>centered</monospace> and <monospace>displaced</monospace></p></list-item>
<list-item><p>Axonal feature 4 (<italic>C</italic><sub>4</sub>): <monospace>ascending</monospace>, <monospace>descending</monospace>, <monospace>both</monospace>, and <monospace>no</monospace></p></list-item>
<list-item><p>Axonal feature 5 (<italic>C</italic><sub>5</sub>): <monospace>arcade</monospace> (<monospace>AR</monospace>), <monospace>Cajal-Retzius</monospace> (<monospace>CR</monospace>), <monospace>chandelier</monospace> (<monospace>CH</monospace>), <monospace>common basket</monospace> (<monospace>CB</monospace>), <monospace>common type</monospace> (<monospace>CT</monospace>), <monospace>horse-tail</monospace> (<monospace>HT</monospace>), <monospace>large basket</monospace> (<monospace>LB</monospace>), <monospace>Martinotti</monospace> (<monospace>MA</monospace>), <monospace>neurogliaform</monospace> (<monospace>NG</monospace>), and <monospace>other</monospace> (<monospace>OT</monospace>)</p></list-item>
<list-item><p>Axonal feature 6 (<italic>C</italic><sub>6</sub>): <monospace>characterized</monospace> and <monospace>uncharacterized</monospace></p></list-item>
</list></p>
<p>Cells whose axon is predominantly in soma&#x00027;s cortical layer are <monospace>intralaminar</monospace> in <italic>C</italic><sub>1</sub>; the rest are <monospace>translaminar</monospace>. Similarly, regarding <italic>C</italic><sub>2</sub>, interneurons with the axon predominantly in soma&#x00027;s cortical column are <monospace>intracolumnar</monospace>; the rest are <monospace>transcolumnar</monospace>. A cell whose dendritic arbor is mainly located in the center of the axonal arborization is <monospace>centered</monospace> (<italic>C</italic><sub>3</sub>); otherwise it is <monospace>displaced</monospace>. <italic>C</italic><sub>4</sub> further distinguishes between <monospace>translaminar</monospace> (<italic>C</italic><sub>1</sub>) and <monospace>displaced</monospace> (<italic>C</italic><sub>3</sub>) cells: cells with an axon mainly ascending toward the cortical surface are <monospace>ascending</monospace>, cells with an axon mainly descending toward the white matter are <monospace>descending</monospace>, whereas those with both ascending and descending arbors are termed <monospace>both</monospace>. To those cells that were not <monospace>translaminar</monospace> (<italic>C</italic><sub>1</sub>) and <monospace>displaced</monospace> (<italic>C</italic><sub>3</sub>) we assigned <monospace>no</monospace> in <italic>C</italic><sub>4</sub> (this category was not included in the original nomenclature). Class <italic>C</italic><sub>5</sub> is the interneuron type. A cell is <monospace>uncharacterized</monospace> in <italic>C</italic><sub>6</sub> if it is not suitable for characterization according to features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub>, due to, e.g., insufficient reconstruction; otherwise, a cell is <monospace>characterized</monospace>. An expert who considered that a neuron was uncharacterized did not categorize it according to features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub>. 
Figure <xref ref-type="fig" rid="F1">1</xref> shows two interneurons characterized according to axonal features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub>.</p>
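The dependence of C4 on C1 and C3 amounts to a simple rule, sketched here with hypothetical category strings:

```python
def c4_categories(c1, c3):
    """C4 further distinguishes only cells that are both translaminar (C1)
    and displaced (C3); every other cell receives the added category 'no'."""
    if c1 == "translaminar" and c3 == "displaced":
        return ("ascending", "descending", "both")
    return ("no",)
```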
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>Examples of interneurons of different types and axonal features. (A)</bold> Is an <monospace>intralaminar</monospace>, <monospace>intracolumnar</monospace>, <monospace>centered</monospace>, and <monospace>no</monospace> cell, according to 37 (out of 42) experts. Most of its axon (shown in blue) is at less than 200 &#x003BC;m from the soma (shown in red; the grid lines are separated by 100 &#x003BC;m) and thus appears to be mainly located in soma&#x00027;s cortical layer; it is within soma&#x00027;s cortical column (the gray vertical shadows depict a 300 &#x003BC;m-wide cortical column); and it seems to be centered around the dendritic arbor (also shown in red). It is <monospace>NG</monospace> according to 18 experts, <monospace>CB</monospace> according to 17 experts, <monospace>CT</monospace> according to 3 experts, and <monospace>OT</monospace> and <monospace>AR</monospace> according to one expert each. <bold>(B)</bold> Is a <monospace>translaminar</monospace>, <monospace>transcolumnar</monospace>, <monospace>displaced</monospace>, and <monospace>ascending</monospace> cell according to 39 experts. Its axon reaches around 800 &#x003BC;m above soma (i.e., it seems to extend to another layer); a significant portion of its axon is outside of soma&#x00027;s cortical column; its dendrites are not in the center of the axonal arborization; and its axon is predominantly above the soma. According to 34 experts, this is a <monospace>MA</monospace> cell.</p></caption>
<graphic xlink:href="fncom-08-00150-g0001.tif"/>
</fig>
</sec>
<sec>
<title>2.3. Predictor variables</title>
<p>We used 18 parameters of axonal morphology as predictor variables. Five of these parameters were computed with NeuroExplorer and were already used to predict interneuron types by DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) and Mihaljevi&#x00107; et al. (<xref ref-type="bibr" rid="B52">2014b</xref>). In addition, we introduce 13 parameters of axonal morphology, seeking to capture the concepts represented by axonal features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>4</sub>. We computed these parameters from 3D interneuron reconstruction files in Neurolucida&#x00027;s ASCII (<sup>&#x0002A;</sup>.asc) format.</p>
<p>The five parameters we computed with NeuroExplorer are:
<list list-type="order">
<list-item><p><italic>X</italic><sub>1</sub>: 2D convex hull perimeter (in <italic>Z</italic> projection).</p></list-item>
<list-item><p><italic>X</italic><sub>2</sub>: Axon length.</p></list-item>
<list-item><p><italic>X</italic><sub>3</sub>: Axon length at less than 150 &#x003BC;m from the soma.</p></list-item>
<list-item><p><italic>X</italic><sub>4</sub>: Axon length at more than 150 and less than 300 &#x003BC;m from the soma.</p></list-item>
<list-item><p><italic>X</italic><sub>5</sub>: Axon length at more than 300 &#x003BC;m from the soma.</p></list-item>
</list></p>
<p>Parameters <italic>X</italic><sub>3</sub>&#x02013;<italic>X</italic><sub>5</sub> are meant to measure axonal arborization with respect to the cortical column. Namely, parameter <italic>X</italic><sub>3</sub> approximates arborization length within a (300 &#x003BC;m wide) cortical column (at less than 150 &#x003BC;m from the soma); <italic>X</italic><sub>4</sub> approximates the length outside but not far from the column (more than 150 and less than 300 &#x003BC;m from the soma); and <italic>X</italic><sub>5</sub> approximates axonal length far from the column (more than 300 &#x003BC;m from the soma). <italic>X</italic><sub>1</sub> and <italic>X</italic><sub>2</sub> were used by DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) while Mihaljevi&#x00107; et al. (<xref ref-type="bibr" rid="B51">2014a</xref>) used <italic>X</italic><sub>3</sub>&#x02013;<italic>X</italic><sub>5</sub> as predictor variables.</p>
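As an illustration, X3-X5 can be approximated from a reconstruction by assigning each axonal segment to a distance bin according to its midpoint. The midpoint criterion is a simplifying assumption of this sketch; the actual values were computed with NeuroExplorer.

```python
import math

def column_length_bins(segments, soma, near=150.0, far=300.0):
    """Illustrative computation of X3-X5: total axonal length at <150 um,
    150-300 um, and >300 um from the soma. Each segment is a pair of 3D
    points and is binned by the distance of its midpoint to the soma."""
    x3 = x4 = x5 = 0.0
    for a, b in segments:
        length = math.dist(a, b)
        midpoint = tuple((ai + bi) / 2.0 for ai, bi in zip(a, b))
        d = math.dist(midpoint, soma)
        if d < near:
            x3 += length
        elif d < far:
            x4 += length
        else:
            x5 += length
    return x3, x4, x5
```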
<p>We introduce the following axonal morphometric parameters:
<list list-type="order">
<list-item><p><italic>X</italic><sub>6</sub>: Axon length within soma&#x00027;s layer.</p></list-item>
<list-item><p><italic>X</italic><sub>7</sub>: Axon length outside soma&#x00027;s layer.</p></list-item>
<list-item><p><italic>X</italic><sub>8</sub>: Proportion of axon length contained within soma&#x00027;s layer, <inline-formula><mml:math id="M6"><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn>6</mml:mn></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn>6</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mn>7</mml:mn></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>.</p></list-item>
<list-item><p><italic>X</italic><sub>9</sub>: Axon length within soma&#x00027;s cortical column.</p></list-item>
<list-item><p><italic>X</italic><sub>10</sub>: Axon length outside soma&#x00027;s cortical column.</p></list-item>
<list-item><p><italic>X</italic><sub>11</sub>: Proportion of axon length within soma&#x00027;s cortical column, <inline-formula><mml:math id="M7"><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn>9</mml:mn></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn>9</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>10</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>.</p></list-item>
<list-item><p><italic>X</italic><sub>12</sub>: Distance, in dimensions <italic>X</italic> and <italic>Y</italic>, from axon&#x00027;s centroid to the soma.</p></list-item>
<list-item><p><italic>X</italic><sub>13</sub>: Distance from the centroid of the above-the-soma part of the axon to the soma.</p></list-item>
<list-item><p><italic>X</italic><sub>14</sub>: Distance from the centroid of the below-the-soma part of the axon to the soma.</p></list-item>
<list-item><p><italic>X</italic><sub>15</sub>: Proportion of distances <italic>X</italic><sub>13</sub> and <italic>X</italic><sub>14</sub>, <inline-formula><mml:math id="M8"><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>13</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>13</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>14</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>.</p></list-item>
<list-item><p><italic>X</italic><sub>16</sub>: Axon length above the soma.</p></list-item>
<list-item><p><italic>X</italic><sub>17</sub>: Axon length below the soma.</p></list-item>
<list-item><p><italic>X</italic><sub>18</sub>: Proportion of axon length above soma, <inline-formula><mml:math id="M9"><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>16</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>16</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mn>17</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>.</p></list-item>
</list></p>
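As an example of how these length-based parameters and their proportions can be computed, here is a sketch of X16-X18, classifying each axonal segment by its midpoint y-coordinate relative to the soma. The midpoint criterion is a simplifying assumption of this sketch, not necessarily the exact definition used.

```python
import math

def above_below(segments, soma_y):
    """Sketch of X16-X18: axonal length above (X16) and below (X17) the soma,
    and the proportion of length above the soma (X18 = X16 / (X16 + X17))."""
    x16 = x17 = 0.0
    for a, b in segments:  # each segment: pair of (x, y, z) points
        length = math.dist(a, b)
        if (a[1] + b[1]) / 2.0 >= soma_y:
            x16 += length
        else:
            x17 += length
    total = x16 + x17
    x18 = x16 / total if total > 0 else float("nan")
    return x16, x17, x18
```

The ratio-style parameters X8, X11, and X15 follow the same pattern: a length (or distance) divided by the sum of its two complementary parts.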
<p>We computed these parameters following assumptions made by DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>), namely: (a) cortical layer thickness is (roughly) determined by species and cortical area (see following paragraphs for details); and (b) the cortical column is a cylinder whose axis passes through the soma and has a diameter of 300 &#x003BC;m. We measured the distance to soma as the distance to soma&#x00027;s centroid. We computed a centroid of a set of points (e.g., of all the points comprising the reconstructed axon) by averaging those points.</p>
<p>When computing parameters <italic>X</italic><sub>6</sub> and <italic>X</italic><sub>7</sub> we looked up the approximate layer thickness according to the neuron&#x00027;s species and cortical area. DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) defined an approximate layer thickness for every species/area/layer combination present in their data, and provided it as additional information for the experts who classified the interneurons. This information can be accessed at <ext-link ext-link-type="uri" xlink:href="http://cajalbbp.cesvima.upm.es/gardenerclassification/">http://cajalbbp.cesvima.upm.es/gardenerclassification/</ext-link>. DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) specified the approximate thickness in the form of an interval (e.g., stating that layer II/III of the mouse&#x00027;s visual cortex is 200&#x02013;300 &#x003BC;m thick); we used the interval&#x00027;s midpoint (250 &#x003BC;m for the previous example) as an estimate of layer thickness. Also, we assumed that a soma is equidistant from the top and bottom confines of the layer (i.e., a 250 &#x003BC;m thick layer reaches 125 &#x003BC;m above and 125 &#x003BC;m below the soma).</p>
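These two assumptions can be stated compactly: the layer thickness is the midpoint of the reported interval, and the layer extends half that thickness above and below the soma. A sketch of the rule:

```python
def layer_reach(lo, hi):
    """Reach of the soma's layer above (and, equally, below) the soma, in
    micrometers, assuming thickness equal to the midpoint of the reported
    interval (lo, hi) and a soma equidistant from the layer's confines."""
    thickness = (lo + hi) / 2.0
    return thickness / 2.0

reach = layer_reach(200.0, 300.0)  # layer II/III, mouse visual cortex
```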
<p>For 16 mouse interneurons, seven from the somatosensory cortex and nine from the visual cortex, the cortical layer was not provided. In order to compute variables <italic>X</italic><sub>6</sub> and <italic>X</italic><sub>7</sub> for these cells, we assumed that they belonged to a hypothetical &#x0201C;average layer,&#x0201D; with an assumed thickness of 197 &#x003BC;m in the visual cortex and 237 &#x003BC;m in the somatosensory cortex. Although only an approximation, we consider this a better-informed approximation to the &#x0201C;true&#x0201D; values of these variables than the one a distance-computing rule (see Subsections 2.6 and 2.8) would have produced had we left these values unspecified.</p>
</sec>
<sec>
<title>2.4. Data selection and summary</title>
<p>Axonal feature <italic>C</italic><sub>6</sub> is not a &#x0201C;proper&#x0201D; morphological feature but more of a &#x0201C;filter feature&#x0201D; which indicates whether the remaining axonal features can be reliably identified given a reconstructed interneuron. We therefore omitted <italic>C</italic><sub>6</sub> from consideration in this paper. Consequently, we removed from our data set 11 interneurons considered as <monospace>uncharacterized</monospace> by a majority (i.e., at least 21) of neuroscientists, considering that these interneurons cannot be reliably classified according to <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub>, thereby reducing our data sample to 226 interneurons.</p>
<p>Thus, we have <italic>N</italic> &#x0003D; 226 interneurons, each of them quantified by a vector <bold>X</bold> of <italic>m</italic> &#x0003D; 18 real-valued predictor variables (i.e., <bold>x</bold> &#x02208; &#x0211D;<sup>18</sup>). We also have <italic>d</italic> &#x0003D; 5 discrete class (i.e., target) variables <bold>C</bold> &#x0003D; (<italic>C</italic><sub>1</sub>,&#x02026;,<italic>C</italic><sub>5</sub>), with <bold>c</bold> &#x02208; &#x003A9;<sub><italic>C</italic><sub>1</sub></sub> &#x000D7; &#x02026; &#x000D7; &#x003A9;<sub><italic>C</italic><sub>5</sub></sub>. Each interneuron, <bold>x</bold><sup>(<italic>j</italic>)</sup>, is associated with a <italic>N</italic><sub><italic>j</italic></sub> &#x000D7; 5 (<italic>N</italic><sub><italic>j</italic></sub> &#x02264; 42) matrix <inline-graphic xlink:href="fncom-08-00150-i0001.tif"/><sup>(<italic>j</italic>)</sup> in which each row is an observation of <bold>C</bold> due to one annotator neuroscientist, i.e., <inline-graphic xlink:href="fncom-08-00150-i0001.tif"/><sup>(<italic>j</italic>)</sup><sub><italic>i</italic>,<italic>a</italic></sub> is the label for class variable <italic>C</italic><sub><italic>i</italic></sub> assigned to interneuron <bold>x</bold><sup>(<italic>j</italic>)</sup> by expert neuroscientist <italic>a</italic><xref ref-type="fn" rid="fn0003"><sup>3</sup></xref>.</p>
<p>Our method, however, requires each interneuron to be associated with an LBN rather than with the provided multi-annotator label matrix <inline-graphic xlink:href="fncom-08-00150-i0001.tif"/>. We therefore obtained the LBNs using standard procedures for learning Bayesian networks from data, and then learned and evaluated our model using these LBNs as input, omitting <inline-graphic xlink:href="fncom-08-00150-i0001.tif"/> from further consideration (see the next subsection).</p>
</sec>
<sec>
<title>2.5. From multi-annotator labels to label Bayesian networks</title>
<p>Prior to applying our method, we learned LBNs from multi-annotator class label matrices <inline-graphic xlink:href="fncom-08-00150-i0001.tif"/>.</p>
<p>An LBN is a Bayesian network over the class variables <bold>C</bold>. A Bayesian network (Pearl, <xref ref-type="bibr" rid="B56">1988</xref>; Koller and Friedman, <xref ref-type="bibr" rid="B42">2009</xref>) <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/> is a pair <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/> &#x0003D; (<inline-graphic xlink:href="fncom-08-00150-i0003.tif"/>,&#x00398;) where <inline-graphic xlink:href="fncom-08-00150-i0003.tif"/>, the structure of the network, is a directed acyclic graph whose vertices correspond to the class variables <bold>C</bold> and whose arcs encode the conditional independencies in the joint distribution over <bold>C</bold>, while &#x00398; are the parameters of the conditional probability distributions into which the joint distribution is factorized.</p>
<p>Learning a Bayesian network <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/> from data consists of two steps: learning the network structure, <inline-graphic xlink:href="fncom-08-00150-i0003.tif"/> (i.e., the conditional independencies it encodes), and, having obtained the structure, learning its parameters (Neapolitan, <xref ref-type="bibr" rid="B55">2004</xref>; Koller and Friedman, <xref ref-type="bibr" rid="B42">2009</xref>). While the second step is generally straightforward, many methods exist for performing the first step. We applied a method belonging to the well-known family of search&#x0002B;score structure learning methods (see Subsection 2.8).</p>
<p>We wanted the learned LBNs to be similar to the empirical probabilities observed in the class label matrices, <inline-graphic xlink:href="fncom-08-00150-i0001.tif"/>. In other words, we wanted the probability distribution factorized by an LBN, <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>j</italic>)</sup></sub>, to be similar to the empirical distribution, <italic>p</italic><sub>&#x003F5;<sup>(<italic>j</italic>)</sup></sub>&#x02014;the relative frequency of each possible state of <bold>C</bold> in <inline-graphic xlink:href="fncom-08-00150-i0001.tif"/><sup>(<italic>j</italic>)</sup>. We used this similarity as the criterion for selecting the network learning method (see Subsection 2.8) and measured it with Jensen-Shannon divergence (see Subsection 2.9.1).</p>
<p>Finally, having learned the LBNs, our final data set was <inline-graphic xlink:href="fncom-08-00150-i0004.tif"/> &#x0003D; {(<bold>x</bold><sup>(<italic>j</italic>)</sup>, <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>j</italic>)</sup>)}<sup><italic>N</italic></sup><sub><italic>j</italic> &#x0003D; 1</sub>. Figure <xref ref-type="fig" rid="F2">2</xref> depicts the LBNs for interneurons shown in Figure <xref ref-type="fig" rid="F1">1</xref>, along with the predicted LBNs for those interneurons.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p><bold>Examples of true (A,B) and predicted (C,D) label Bayesian networks (LBNs) for neurons shown in Figure <xref ref-type="fig" rid="F1">1</xref></bold>. The leftmost networks <bold>(A,C)</bold> correspond to interneuron <bold>(A)</bold> in Figure <xref ref-type="fig" rid="F1">1</xref> whereas the right-hand ones <bold>(B,D)</bold> correspond to neuron <bold>(B)</bold> in Figure <xref ref-type="fig" rid="F1">1</xref>. The Bayesian networks are depicted with their nodes (shown as rectangles), arcs, and each node&#x00027;s marginal probability distribution. The predicted distributions are similar to the true ones for many nodes&#x02014;e.g., 93 vs. 98% for <monospace>IC</monospace> (node <italic>C</italic><sub>2</sub>) for interneuron <bold>(A)</bold>. Some marginal probabilities do differ, such as that of the <monospace>NG</monospace> type for neuron <bold>(A)</bold>&#x02014;14% predicted vs. 45% true; a lot of its probability mass was assigned to the more numerous <monospace>CT</monospace> type.</p></caption>
<graphic xlink:href="fncom-08-00150-g0002.tif"/>
</fig>
</sec>
<sec>
<title>2.6. Multi-dimensional classification with label Bayesian networks</title>
<p>Recall that we have <italic>m</italic> predictor variables <bold>X</bold>, with <bold>x</bold> &#x02208; &#x0211D;<sup><italic>m</italic></sup>, that describe the domain under study, and <italic>d</italic> discrete class (or target) variables <bold>C</bold>, with <bold>c</bold> &#x02208; &#x003A9;<sub><italic>C</italic><sub>1</sub></sub> &#x000D7; &#x02026; &#x000D7; &#x003A9;<sub><italic>C</italic><sub><italic>d</italic></sub></sub>, that we wish to predict on the basis of observations of <bold>X</bold>. We observe a data set, <inline-graphic xlink:href="fncom-08-00150-i0004.tif"/> &#x0003D; {(<bold>x</bold><sup>(<italic>j</italic>)</sup>, <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>j</italic>)</sup>)}<sup><italic>N</italic></sup><sub><italic>j</italic> &#x0003D; 1</sub>, where <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/> is a label Bayesian network encoding a joint probability distribution over the multi-dimensional class variable <bold>C</bold>.</p>
<p>We predict the LBN of an unseen instance <bold>x</bold><sup>(<italic>u</italic>)</sup> by forming a consensus Bayesian network among the LBNs of its <italic>k</italic> nearest neighbors (1 &#x02264; <italic>k</italic> &#x0003C; <italic>N</italic>) in the space of predictor variables. We form the consensus by adapting the method developed by Lopez-Cruz et al. (<xref ref-type="bibr" rid="B44">2014</xref>) to weigh the effect of each neighbor&#x00027;s LBN in proportion to that neighbor&#x00027;s relative closeness to <bold>x</bold><sup>(<italic>u</italic>)</sup>. Figure <xref ref-type="fig" rid="F3">3</xref> summarizes our approach.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p><bold>A schematic representation of multi-dimensional classification with label Bayesian networks (LBNs)</bold>. The figure depicts the assessment of our method&#x00027;s predictive performance. First (step 1; upper left), an instance <bold>x</bold><sup>(<italic>u</italic>)</sup> with LBN <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>u</italic>)</sup> is retrieved from the test set. Then (step 2; lower part), we identify <italic>k</italic> (<italic>k</italic> &#x0003D; 3 in this example) nearest neighbors of <bold>x</bold><sup>(<italic>u</italic>)</sup> and record their distances to <bold>x</bold><sup>(<italic>u</italic>)</sup>; the blue, green, and orange Bayesian networks (lower right) depict the LBNs of the three nearest neighbors of <bold>x</bold><sup>(<italic>u</italic>)</sup>. Then (step 3; upper right), we obtain the predicted LBN, <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;(<italic>u</italic>)</sup>, by forming a consensus Bayesian network from the LBNs of the three nearest neighbors. Here, a thicker arrow indicates a greater weight of that neighbor&#x00027;s LBN in the consensus: the orange arrow is thicker than the blue and green arrows (orange is the closest neighbor of <bold>x</bold><sup>(<italic>u</italic>)</sup>, see lower left). Finally (step 4; upper middle), we compare true and predicted probability distributions, <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>u</italic>)</sup></sub> and <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;(<italic>u</italic>)</sup></sub>, with Jensen-Shannon divergence.</p></caption>
<graphic xlink:href="fncom-08-00150-g0003.tif"/>
</fig>
<p><italic>k</italic> nearest neighbors (<italic>k</italic>-nn; Fix and Hodges, <xref ref-type="bibr" rid="B25">1989</xref>) is an instance-based (i.e., model-less) classifier, popular in uni-dimensional classification (Duda et al., <xref ref-type="bibr" rid="B19">2000</xref>). It classifies a data instance <bold>x</bold><sup>(<italic>u</italic>)</sup> by identifying its <italic>k</italic> nearest neighbors in the predictor space, according to some distance measure&#x02014;a common choice is the Euclidean distance&#x02014;and assigning the majority label among those neighbors.</p>
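For reference, the classic uni-dimensional k-nn rule just described (Euclidean distance plus majority vote) can be sketched as follows; this is an illustrative toy example with made-up data, not the multi-dimensional method proposed in this paper:

```python
from collections import Counter

def knn_classify(x_u, instances, labels, k=3):
    """Classic uni-dimensional k-nn: Euclidean distance + majority vote."""
    # Pair each training instance's distance to x_u with its label, then sort.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x_u, x)) ** 0.5, y)
        for x, y in zip(instances, labels)
    )
    neighbor_labels = [y for _, y in dists[:k]]
    return Counter(neighbor_labels).most_common(1)[0][0]

X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
y = ["A", "A", "B", "B"]
knn_classify((0.05, 0.1), X, y, k=3)  # two "A" neighbors outvote one "B"
```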
<sec>
<title>2.6.1. Distance-weighted consensus of Bayesian networks</title>
<p>Combining multiple Bayesian networks into a consensus Bayesian network is a recurring topic of interest. The standard methods for combining the parameters of a joint distribution, disregarding its underlying graphical structure (i.e., the conditional independencies), can yield undesirable results: for example, combining distributions with identical structures may render a consensus distribution with a different structure (Pennock and Wellman, <xref ref-type="bibr" rid="B58">1999</xref>). It is therefore common to first combine network structures (e.g., Matzkevich and Abramson, <xref ref-type="bibr" rid="B49">1992</xref>; Pennock and Wellman, <xref ref-type="bibr" rid="B58">1999</xref>; Del Sagrado and Moral, <xref ref-type="bibr" rid="B16">2003</xref>; Pe&#x000F1;a, <xref ref-type="bibr" rid="B57">2011</xref>) and combine the parameters afterwards (e.g., Pennock and Wellman, <xref ref-type="bibr" rid="B58">1999</xref>; Etminani et al., <xref ref-type="bibr" rid="B23">2013</xref>). The cited structure-combining methods produce distributions which only contain independencies that are common to all networks, rendering them too complex (i.e., having too many parameters) to be useful in practice.</p>
<p>An alternative is to draw samples from the different Bayesian networks and learn the consensus network from the generated data, using standard methods for learning Bayesian networks from data (Neapolitan, <xref ref-type="bibr" rid="B55">2004</xref>; Koller and Friedman, <xref ref-type="bibr" rid="B42">2009</xref>), as proposed by Lopez-Cruz et al. (<xref ref-type="bibr" rid="B44">2014</xref>). Lopez-Cruz et al. (<xref ref-type="bibr" rid="B44">2014</xref>) weighted the influence of each Bayesian network on the consensus by sampling from it a number of instances proportional to its weight. We can readily adapt this method to weigh the effect of neighbors&#x00027; label Bayesian networks in proportion to their closeness to the instance being classified, <bold>x</bold><sup>(<italic>u</italic>)</sup>, by defining an appropriate weighting function. Before defining the weighting function, let us state the setting more formally.</p>
<p>We want to generate a database <inline-graphic xlink:href="fncom-08-00150-i0004.tif"/><sub><italic>u</italic></sub> by sampling from <italic>k</italic> Bayesian networks {<inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>j</italic>)</sup>}<sup><italic>k</italic></sup><sub><italic>j</italic> &#x0003D; 1</sub> associated with <italic>k</italic> instances at distances <italic>d</italic><sub>1</sub>, &#x02026;, <italic>d</italic><sub><italic>k</italic></sub> from the unseen instance <bold>x</bold><sup>(<italic>u</italic>)</sup>; from this database, <inline-graphic xlink:href="fncom-08-00150-i0004.tif"/><sub><italic>u</italic></sub>, we will learn the consensus Bayesian network, <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;(<italic>u</italic>)</sup>. We want the number of samples in <inline-graphic xlink:href="fncom-08-00150-i0004.tif"/><sub><italic>u</italic></sub> that are drawn from <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>j</italic>)</sup> to be proportional to how close <bold>x</bold><sup>(<italic>j</italic>)</sup> is to <bold>x</bold><sup>(<italic>u</italic>)</sup>. We measure closeness relative to the remaining <italic>k</italic> &#x02212; 1 neighboring instances. 
Thus, if <italic>M</italic> is the desired size of <inline-graphic xlink:href="fncom-08-00150-i0004.tif"/><sub><italic>u</italic></sub> and <bold>w</bold> &#x0003D; (<italic>w</italic><sub>1</sub>, &#x02026;, <italic>w</italic><sub><italic>k</italic></sub>) the weights assigned to the <italic>k</italic> Bayesian networks (with <inline-formula><mml:math id="M10"><mml:mrow><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mtext>&#x0200A;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>&#x0200A;</mml:mtext><mml:mn>1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:msubsup><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula> and <italic>w</italic> &#x02265; 0), the number of samples from <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>j</italic>)</sup>, <italic>M</italic><sub><italic>j</italic></sub>, is then <italic>w</italic><sub><italic>j</italic></sub> &#x000D7; <italic>M</italic>. We compute the weights as</p>
<disp-formula id="E1"><mml:math id="M1"><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:msubsup><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>d</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>k</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:msubsup><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mfrac><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
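A minimal Python sketch of this weighting scheme (illustrative, not the original implementation; it assumes k >= 2 and a nonzero distance sum). Note that the weights sum to 1 and that a closer neighbor (smaller d_j) receives a larger weight:

```python
def neighbor_weights(distances):
    """w_j = ((sum_i d_i) - d_j) / ((k - 1) * sum_i d_i); weights sum to 1."""
    k = len(distances)
    total = sum(distances)
    return [(total - d) / ((k - 1) * total) for d in distances]

def sample_counts(distances, M):
    """Samples to draw from each neighbor's LBN: M_j = w_j * M (rounded)."""
    return [round(w * M) for w in neighbor_weights(distances)]

w = neighbor_weights([1.0, 2.0, 3.0])
# closest neighbor (d = 1.0) gets the largest weight: [5/12, 4/12, 3/12]
counts = sample_counts([1.0, 2.0, 3.0], M=1200)  # [500, 400, 300]
```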
</sec>
</sec>
<sec>
<title>2.7. Related methods</title>
<sec>
<title>2.7.1. Multiple annotators</title>
<p>A setting similar to ours, where many annotators provide class labels, occurs when <italic>learning from a crowd</italic> of annotators (Snow et al., <xref ref-type="bibr" rid="B67">2008</xref>; Sorokin and Forsyth, <xref ref-type="bibr" rid="B69">2008</xref>; Raykar et al., <xref ref-type="bibr" rid="B61">2010</xref>; Welinder et al., <xref ref-type="bibr" rid="B76">2010</xref>; Raykar and Yu, <xref ref-type="bibr" rid="B60">2012</xref>). Yet, the crowd may include annotators of different skills and therefore learning a classifier involves estimating the ground truth label from the possibly noisy ones. Methods such as those due to Dawid and Skene (<xref ref-type="bibr" rid="B14">1979</xref>); Whitehill et al. (<xref ref-type="bibr" rid="B77">2009</xref>); Raykar et al. (<xref ref-type="bibr" rid="B61">2010</xref>); Welinder et al. (<xref ref-type="bibr" rid="B76">2010</xref>); Raykar and Yu (<xref ref-type="bibr" rid="B60">2012</xref>) aim to detect the less reliable annotators and decrease their influence on the ground truth estimate. In our case, however, all annotators are domain experts; furthermore, there is currently no better approximation to ground truth than the opinions of this group of leading experts, as there is no unequivocal or objective way of determining it<xref ref-type="fn" rid="fn0004"><sup>4</sup></xref>. We thus consider that every expert&#x00027;s opinion is equally valid and that interneuron type membership and axonal features are uncertain whenever the experts do not completely agree. This allows us to represent interneuron type membership and axonal features of an interneuron with a joint probability distribution over these five class variables.</p>
</sec>
<sec>
<title>2.7.2. Probabilistic labels</title>
<p>Probabilistic labels have already been used in machine learning (Ambroise et al., <xref ref-type="bibr" rid="B1">2001</xref>; Grandvallet, <xref ref-type="bibr" rid="B30">2002</xref>; Thiel et al., <xref ref-type="bibr" rid="B71">2007</xref>; Schwenker and Trentin, <xref ref-type="bibr" rid="B65">2014</xref>). Some methods consider these to be imprecise versions of a crisp (i.e., non-probabilistic) ground truth label, which they then try to estimate, while others (Thiel et al., <xref ref-type="bibr" rid="B71">2007</xref>; Schwenker and Trentin, <xref ref-type="bibr" rid="B65">2014</xref>), more in line with our setting, assume that probabilistic labels represent intrinsic ambiguity in class membership and consider them as ground truth. Methods such as <italic>k</italic>-nn (El Gayar et al., <xref ref-type="bibr" rid="B22">2006</xref>) and support vector machines (Thiel et al., <xref ref-type="bibr" rid="B71">2007</xref>; Scherer et al., <xref ref-type="bibr" rid="B63">2013</xref>) have been adapted to deal with probabilistic labels, while regression-based methods, such as multi-layer perceptrons, can handle them without being adapted (Schwenker and Trentin, <xref ref-type="bibr" rid="B65">2014</xref>). Yet, all of these methods are aimed at predicting a single class variable.</p>
</sec>
<sec>
<title>2.7.3. Multi-dimensional classification</title>
<p>Multi-dimensional classification is more general than the related multi-label classification, which has already been considered in neuroscience (Turner et al., <xref ref-type="bibr" rid="B73">2013</xref>). It is hard because the number of possible assignments to the class variables is exponential in their number. Predicting each class variable with an independent model is suboptimal because the variables are, generally, correlated. Modeling many of these dependencies, on the other hand, can lead to data scarcity. Multi-dimensional Bayesian network classifiers (Bielza et al., <xref ref-type="bibr" rid="B7">2011</xref>; Borchani et al., <xref ref-type="bibr" rid="B8">2013</xref>) can balance model complexity and the modeling of dependencies. However, they require crisp class labels in order to be trained and thus cannot be directly applied to our setting.</p>
</sec>
<sec>
<title>2.7.4. K-nearest neighbors</title>
<p><italic>k</italic> nearest neighbors is a popular instance-based (i.e., model-less) classifier. Among other extensions, the original <italic>k</italic>-nn classifier has been adapted to weight the effect of a neighbor&#x00027;s class label in proportion to how close that neighbor is to the data instance being classified (e.g., Dudani, <xref ref-type="bibr" rid="B20">1976</xref>; MacLeod et al., <xref ref-type="bibr" rid="B46">1987</xref>; Denoeux, <xref ref-type="bibr" rid="B17">1995</xref>; Yazdani et al., <xref ref-type="bibr" rid="B78">2009</xref>). It has also been adapted to deal with non-crisp labels (J&#x000F3;&#x0017A;wik, <xref ref-type="bibr" rid="B38">1983</xref>; Keller et al., <xref ref-type="bibr" rid="B41">1985</xref>; Denoeux, <xref ref-type="bibr" rid="B17">1995</xref>); these non-crisp labels, however, are not probabilistic but rather possibilistic (encoded with Dempster-Shafer theory) or fuzzy. Besides the handling of non-crisp labels, methods due to Denoeux (<xref ref-type="bibr" rid="B17">1995</xref>) and Keller et al. (<xref ref-type="bibr" rid="B41">1985</xref>) are similar to ours in that they weigh the neighbors&#x00027; effect on prediction according to their closeness to the data instance being classified. They differ from our method, however, in that they neither use probabilistic labels nor tackle the prediction of multiple class variables.</p>
</sec>
</sec>
<sec>
<title>2.8. Experimental setting</title>
<p>We identified the nearest interneurons by measuring Euclidean distance. Thus, for a pair of interneurons <bold>x</bold><sup>(<italic>j</italic>)</sup> and <bold>x</bold><sup>(<italic>o</italic>)</sup>, the distance <italic>d</italic><sub><italic>jo</italic></sub> is given by</p>
<disp-formula id="E2"><mml:math id="M2"><mml:mrow><mml:msub><mml:mi>d</mml:mi><mml:mrow><mml:mi>j</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>m</mml:mi></mml:munderover><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>j</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>&#x02212;</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>o</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mstyle></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac></mml:mrow></mml:msup><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
<p>Prior to computing distances, we standardized all predictor variables <italic>X</italic><sub>1</sub>, &#x02026;, <italic>X</italic><sub><italic>m</italic></sub> (i.e., for each <italic>X</italic><sub><italic>i</italic></sub>, we subtracted its mean and divided by standard deviation).</p>
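The standardization and distance computations can be sketched as follows (illustrative Python; here we use the population standard deviation, a choice the original does not specify):

```python
def standardize(columns):
    """Z-score each predictor variable: subtract its mean, divide by its SD."""
    standardized = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5  # population SD
        standardized.append([(v - mean) / sd for v in col])
    return standardized

def euclidean(xj, xo):
    """Euclidean distance between two (standardized) predictor vectors."""
    return sum((a - b) ** 2 for a, b in zip(xj, xo)) ** 0.5

z = standardize([[1.0, 2.0, 3.0]])[0]  # zero mean after standardization
```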
<p>We drew samples from the neighboring networks using probabilistic logic sampling (Henrion, <xref ref-type="bibr" rid="B36">1986</xref>). We sought to draw enough samples from each distribution so as to represent it accurately. We therefore set <italic>M</italic>, the total number of samples drawn from the <italic>k</italic> nearest neighbors&#x00027; distributions (see Section 2.6.1), as <italic>k</italic> &#x02217; 500 &#x02217; <italic>c</italic>, where <italic>c</italic> was the maximal number of free parameters among the <italic>k</italic> networks whose consensus is being sought. The number of free parameters of a Bayesian network is the number of parameters that suffice to fully specify the network&#x00027;s probability distribution (recall that a network consists of a structure, <inline-graphic xlink:href="fncom-08-00150-i0003.tif"/>, and parameters &#x00398;; see Subsection 2.5).</p>
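To illustrate the free-parameter count c: in a discrete Bayesian network, a node with r_i states whose parents take q_i joint configurations contributes (r_i - 1) * q_i free parameters. A sketch with hypothetical node names (not tied to the paper's actual networks):

```python
def free_parameters(cardinalities, parents):
    """Free parameters of a discrete Bayesian network.

    `cardinalities` maps each node to its number of states r_i; `parents`
    maps a node to the list of its parents. Each node contributes
    (r_i - 1) times the product of its parents' cardinalities.
    """
    total = 0
    for node, r in cardinalities.items():
        q = 1
        for p in parents.get(node, []):
            q *= cardinalities[p]
        total += (r - 1) * q
    return total

# Toy two-node network C1 -> C2 with C1 binary and C2 ternary:
# C1 contributes (2 - 1) * 1 = 1, C2 contributes (3 - 1) * 2 = 4.
c = free_parameters({"C1": 2, "C2": 3}, {"C2": ["C1"]})  # c = 5
M = 3 * 500 * c  # with k = 3 neighbors, M = 7500 samples in total
```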
<p>Once we had generated the data set of sample points, we applied a Bayesian network learning algorithm to obtain the consensus probability distribution.</p>
<sec>
<title>2.8.1. Learning Bayesian networks from data</title>
<p>There were two instances in which we learned Bayesian networks from data: when learning LBNs from expert-provided class label matrices (see Subsection 2.5) and when learning the consensus network from sampled data points (Subsection 2.6.1). We considered three options for the learning procedure and chose the one that we considered most adequate for learning LBNs, according to the criterion described in Subsection 2.5. We then applied this chosen procedure in both instances of network learning.</p>
<p>The Bayesian network learning procedure we used follows the well-known search&#x0002B;score approach. Such a procedure consists of (a) a search procedure for traversing the space of possible network structures and (b) a scoring function. We searched the structure space with the tabu metaheuristic (Glover, <xref ref-type="bibr" rid="B28">1989</xref>, <xref ref-type="bibr" rid="B29">1990</xref>), a local search procedure which employs adaptive memory to improve efficiency and escape local minima, and considered three network scores: Bayesian Information Criterion (BIC; Schwarz, <xref ref-type="bibr" rid="B64">1978</xref>), K2 (Cooper and Herskovits, <xref ref-type="bibr" rid="B13">1992</xref>) and Bayesian Dirichlet equivalence (BDe; Heckerman et al., <xref ref-type="bibr" rid="B33">1995</xref>). We compared the LBNs produced by the different scores according to how well they approximated the empirical distributions, <italic>p</italic><sub>&#x003F5;</sub>, (see Subsection 2.5) and their complexity (i.e., number of free parameters).</p>
<p>We estimated parameters by maximum likelihood estimation.</p>
</sec>
<sec>
<title>2.8.2. Software and assessment</title>
<p>We implemented the computation of the 13 axonal morphometric parameters introduced here from scratch. We performed Bayesian network learning and sampling with the <monospace>bnlearn</monospace> (Scutari, <xref ref-type="bibr" rid="B66">2010</xref>; Nagarajan et al., <xref ref-type="bibr" rid="B54">2013</xref>) package for the <monospace>R</monospace> statistical software environment (R Core Team, <xref ref-type="bibr" rid="B59">2014</xref>).</p>
<p>In traditional uni-dimensional classification, it is common to perform stratified cross-validation, that is, to have similar class proportions in train and test sets. However, such stratification is problematic in the multi-dimensional setting, due to the high number of combinations of class variables. Therefore, instead of stratified cross-validation, we evaluated our model with 20 repetitions of plain (unstratified) 10-fold cross-validation.</p>
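The index generation for 20 repetitions of plain 10-fold cross-validation can be sketched as follows (illustrative Python, not the code used in the paper):

```python
import random

def repeated_cv_splits(n, n_folds=10, repetitions=20, seed=0):
    """Yield (train, test) index lists for repeated, unstratified k-fold CV."""
    rng = random.Random(seed)
    indices = list(range(n))
    for _ in range(repetitions):
        rng.shuffle(indices)             # no stratification by class
        for f in range(n_folds):
            test = set(indices[f::n_folds])
            train = [i for i in indices if i not in test]
            yield train, sorted(test)

splits = list(repeated_cv_splits(n=226))
# 20 repetitions x 10 folds = 200 train/test splits of the 226 interneurons
```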
</sec>
</sec>
<sec>
<title>2.9. Assessing results</title>
<p>We were primarily interested in predicting LBNs. We assessed this prediction with Jensen-Shannon divergence, a divergence measure which we describe below.</p>
<p>However, for comparison with related work on interneuron classification, we also assessed how well our method predicted crisp (i.e., non-probabilistic) labels. Such an evaluation is negatively biased against our method since we take label ambiguity into account when learning the model, while the evaluation proceeds as though a true crisp label existed (i.e., as if there were no ambiguity). Below we describe how we obtained crisp labels and present accuracy metrics for multi-dimensional classification.</p>
<sec>
<title>2.9.1. Comparing probability distributions</title>
<p>We measured the dissimilarity between two probability distributions, say <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>u</italic>)</sup></sub> and <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;(<italic>u</italic>)</sup></sub>, with Jensen-Shannon divergence,</p>
<graphic xlink:href="fncom-08-00150-e0001.tif"/>
<p>where <inline-graphic xlink:href="fncom-08-00150-i0005.tif"/> and <italic>d</italic><sub><italic>KL</italic></sub>(<italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>u</italic>)</sup></sub>, <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;(<italic>u</italic>)</sup></sub>) is the Kullback-Leibler divergence (Kullback and Leibler, <xref ref-type="bibr" rid="B43">1951</xref>) between <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>u</italic>)</sup></sub> and <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;(<italic>u</italic>)</sup></sub>,</p>
<graphic xlink:href="fncom-08-00150-e0002.tif"/>
<p>Unlike Kullback-Leibler divergence, Jensen-Shannon divergence is symmetric, it does not require absolute continuity (i.e., that <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;(<italic>u</italic>)</sup></sub>(<bold>c</bold>) &#x0003D; 0 &#x021D2; <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>(<italic>u</italic>)</sup></sub>(<bold>c</bold>) &#x0003D; 0), its square root is a metric, and it is bounded: 0 &#x02264; <italic>d</italic><sub><italic>JS</italic></sub> &#x02264; 1.</p>
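One standard formulation of the Jensen-Shannon divergence averages the Kullback-Leibler divergences of the two distributions to their mixture; with base-2 logarithms it satisfies the 0 to 1 bound stated above. An illustrative Python sketch over distributions given as probability vectors:

```python
from math import log2

def kl(p, q):
    """Kullback-Leibler divergence d_KL(p || q) in bits (0 log 0 := 0)."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence via the mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

js([1.0, 0.0], [0.0, 1.0])  # disjoint supports give the maximum value, 1.0
```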
</sec>
<sec>
<title>2.9.2. Obtaining crisp labels</title>
<p>In order to assess the prediction of crisp labels, we needed to obtain a &#x0201C;true&#x0201D; crisp class label vector for each interneuron <bold>x</bold><sup>(<italic>j</italic>)</sup>. We assumed that such &#x0201C;true&#x0201D; labels were given by the choice of the majority of the experts. There were two alternative majority choices: (a) the most commonly selected class label vector, i.e., the most common row in a class labels matrix <inline-graphic xlink:href="fncom-08-00150-i0001.tif"/>; and (b) the concatenation of per-class majority labels, i.e., the vector formed by the most common choice for <italic>C</italic><sub>1</sub>, the most common choice for <italic>C</italic><sub>2</sub>, and so on, until <italic>C</italic><sub>5</sub>. We refer to the former as the <italic>joint truth</italic> and to the latter as <italic>marginal truth</italic>; the latter was used in related works on interneuron classification (DeFelipe et al., <xref ref-type="bibr" rid="B15">2013</xref>; Mihaljevi&#x00107; et al., <xref ref-type="bibr" rid="B51">2014a</xref>,<xref ref-type="bibr" rid="B52">b</xref>) since they predicted the axonal features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub> independently. We compared our predicted crisp labels to both &#x0201C;truths.&#x0201D;</p>
<p>We also needed to extract crisp predictions from a predicted LBN. The two straightforward methods are analogous to those described above: (a) choosing the <italic>most probable explanation</italic> (MPE), i.e., the most likely joint assignment to <bold>C</bold> according to LBN <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;</sup>; and (b) concatenating the marginally most likely assignments to each of the class variables. For simplicity, we only used the MPE as the predicted crisp class label vector.</p>
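<p>The two majority rules above can be sketched as follows; <monospace>joint_truth</monospace> and <monospace>marginal_truth</monospace> are hypothetical helper names, and the expert votes are invented. The example is chosen so that the two &#x0201C;truths&#x0201D; differ, which is precisely why we compared against both.</p>

```python
from collections import Counter

def joint_truth(votes):
    # (a) "Joint truth": the most common full class-label vector (row)
    return Counter(map(tuple, votes)).most_common(1)[0][0]

def marginal_truth(votes):
    # (b) "Marginal truth": concatenation of the per-class majority labels
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*votes))

# Hypothetical votes of 7 experts on one cell, over two class variables
votes = [("a", "x")] * 3 + [("b", "x")] * 2 + [("b", "y")] * 2
print(joint_truth(votes))     # most frequent row: ("a", "x")
print(marginal_truth(votes))  # column-wise majorities: ("b", "x")
```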
</sec>
<sec>
<title>2.9.3. Multi-dimensional classification accuracy metrics</title>
<p>We assessed crisp labels prediction with accuracy metrics for multi-dimensional classification (Bielza et al., <xref ref-type="bibr" rid="B7">2011</xref>):
<list list-type="order">
<list-item><p>The <italic>mean accuracy</italic> over <italic>d</italic> (<italic>d</italic> &#x0003D; 5 in our case) class variables:
<disp-formula id="E3"><mml:math id="M3"><mml:mrow><mml:mover accent='true'><mml:mrow><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi></mml:mrow><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>d</mml:mi></mml:mfrac><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>d</mml:mi></mml:munderover><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac></mml:mrow></mml:mstyle><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>u</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mi>&#x003B4;</mml:mi></mml:mstyle><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msubsup><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mrow><mml:mo>&#x02217;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mi>u</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>u</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
where <italic>c</italic><sup>&#x0002A;(<italic>u</italic>)</sup><sub><italic>l</italic></sub> is the predicted value of <italic>C</italic><sub><italic>l</italic></sub> for the <italic>u</italic>-th instance, <italic>c</italic><sup>(<italic>u</italic>)</sup><sub><italic>l</italic></sub> is the corresponding true value, and &#x003B4;(<italic>a</italic>, <italic>b</italic>) &#x0003D; 1 when <italic>a</italic> &#x0003D; <italic>b</italic> and 0 otherwise.</p></list-item>
<list-item><p>The <italic>global accuracy</italic> over <italic>d</italic> class variables:
<disp-formula id="E4"><mml:math id="M4"><mml:mrow><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:mi>c</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>u</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mi>&#x003B4;</mml:mi></mml:mstyle><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>c</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02217;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mi>u</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>c</mml:mi></mml:mstyle><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>u</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x000B7;</mml:mo></mml:mrow></mml:math></disp-formula></p></list-item>
</list></p>
<p>Note that global accuracy is demanding, as it rewards only full matches between the predicted vector and the true one. We also measured the uni-dimensional <italic>marginal accuracy</italic> for each class variable,</p>
<disp-formula id="E5"><mml:math id="M5"><mml:mrow><mml:mi>A</mml:mi><mml:mi>c</mml:mi><mml:msub><mml:mi>c</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mi>N</mml:mi></mml:mfrac><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>u</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:munderover><mml:mi>&#x003B4;</mml:mi></mml:mstyle><mml:mo stretchy='false'>(</mml:mo><mml:msubsup><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mrow><mml:mo>&#x02217;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mi>u</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>u</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo></mml:mrow></mml:math></disp-formula>
<p>When computing global and mean accuracy, we used the &#x0201C;joint truth&#x0201D; crisp labels. When computing per-class-variable marginal accuracy, we used the &#x0201C;marginal truth&#x0201D; crisp labels vector.</p>
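<p>A minimal sketch of the three accuracy metrics, using hypothetical true and predicted label vectors (the helper names are ours, not from the original software):</p>

```python
def marginal_accuracies(true, pred):
    # Acc_l for each class variable C_l: fraction of instances matching in slot l
    n, d = len(true), len(true[0])
    return [sum(t[l] == p[l] for t, p in zip(true, pred)) / n for l in range(d)]

def mean_accuracy(true, pred):
    # Mean accuracy: average of the d per-class-variable accuracies
    accs = marginal_accuracies(true, pred)
    return sum(accs) / len(accs)

def global_accuracy(true, pred):
    # Global accuracy: rewards only full matches of the predicted vector
    return sum(tuple(t) == tuple(p) for t, p in zip(true, pred)) / len(true)

# Three hypothetical instances with d = 2 class variables
true = [("a", "x"), ("b", "y"), ("a", "y")]
pred = [("a", "x"), ("b", "x"), ("b", "y")]
print(global_accuracy(true, pred))   # only the first instance fully matches
print(mean_accuracy(true, pred))
```

<p>Since a full-vector match requires every slot to match, global accuracy can never exceed mean accuracy, which is why the global figures reported below are markedly lower.</p>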
</sec>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>3. Results</title>
<sec>
<title>3.1. From multi-annotator labels to label Bayesian networks</title>
<p>We first studied whether any network score was particularly adequate for transforming multi-expert labels into LBNs. Different scores yielded networks of different degrees of complexity, but all approximated well the empirical probability distribution over the expert-provided labels, <italic>p</italic><sub>&#x003F5;</sub> (see Table <xref ref-type="table" rid="T1">1</xref>). We used the score that yielded the best approximation, BDe, in the remainder of this paper. Namely, we used it to (a) transform multi-expert labels into LBNs; and (b) learn consensus networks from the generated samples.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p><bold>Transforming multi-expert labels into label Bayesian networks using different network scores</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center"><bold>BIC</bold></th>
<th align="center"><bold>K2</bold></th>
<th align="center"><bold>BDe</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">JS divergence</td>
<td align="center">0.10 &#x000B1; 0.05</td>
<td align="center">0.07 &#x000B1; 0.04</td>
<td align="center">0.06 &#x000B1; 0.04</td>
</tr>
<tr>
<td align="left">Free parameters</td>
<td align="center">18.22 &#x000B1; 1.83</td>
<td align="center">31.08 &#x000B1; 20.58</td>
<td align="center">60.34 &#x000B1; 31.14</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Upper row: average Jensen-Shannon (JS) divergence between the empirical probability distribution over the labels, <italic>p</italic><sub>&#x003F5;</sub>, and the one encoded by the learned label Bayesian network, <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/></sub>; lower row: average number of free parameters per learned network. Averaged across the entire data set.</italic></p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec>
<title>3.2. Predicting label Bayesian networks</title>
<p>We considered four different values of <italic>k</italic> (the number of nearest neighbors)&#x02014;namely, 3, 5, 7, and 9&#x02014;and obtained the best results with <italic>k</italic> &#x02208; {5, 7}. As Table <xref ref-type="table" rid="T2">2</xref> shows, we predicted the label Bayesian networks relatively accurately, with a Jensen-Shannon divergence of 0.29 for <italic>k</italic> &#x02208; {5, 7}.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p><bold>Predicting label Bayesian networks and crisp labels</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center"><bold>JS</bold></th>
<th align="center"><bold>Global acc. (%)</bold></th>
<th align="center"><bold>Mean acc. (%)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><italic>k</italic> &#x0003D; 3</td>
<td align="center">0.30 &#x000B1; 0.00</td>
<td align="center">41.29 &#x000B1; 1.57</td>
<td align="center">79.10 &#x000B1; 0.74</td>
</tr>
<tr>
<td align="left"><italic>k</italic> &#x0003D; 5</td>
<td align="center">0.29 &#x000B1; 0.00</td>
<td align="center">43.84 &#x000B1; 1.48</td>
<td align="center">79.52 &#x000B1; 0.79</td>
</tr>
<tr>
<td align="left"><italic>k</italic> &#x0003D; 7</td>
<td align="center">0.29 &#x000B1; 0.00</td>
<td align="center">43.99 &#x000B1; 1.26</td>
<td align="center">79.88 &#x000B1; 0.34</td>
</tr>
<tr>
<td align="left"><italic>k</italic> &#x0003D; 9</td>
<td align="center">0.30 &#x000B1; 0.00</td>
<td align="center">39.46 &#x000B1; 1.67</td>
<td align="center">78.58 &#x000B1; 0.52</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The JS column shows the Jensen-Shannon divergence between the predicted label Bayesian networks, <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;</sup></sub>, and the label Bayesian networks <italic>p</italic><sub><inline-graphic xlink:href="fncom-08-00150-i0002.tif"/></sub> learned from <inline-graphic xlink:href="fncom-08-00150-i0001.tif"/>. The remaining columns show global and mean accuracy for predicting the joint truth crisp label vector, i.e., the class label vector most often selected by the experts. Obtained with 20 runs of 10-fold cross-validation.</italic></p>
</table-wrap-foot>
</table-wrap>
<p>Figure <xref ref-type="fig" rid="F2">2</xref> depicts the true and predicted LBNs for two interneurons, one with largely unambiguous axonal features and another with an ambiguous type. As the figure suggests, the LBN of the former interneuron was accurately predicted, whereas for the latter the marginal probability of the type (<italic>C</italic><sub>5</sub>) was predicted only moderately well.</p>
</sec>
<sec>
<title>3.3. Predicting crisp labels</title>
<p>We predicted the joint truth (the class label vectors selected by a majority of experts; see Section 2.9.2) relatively accurately&#x02014;with a mean accuracy of 80% and global accuracy of 44% for <italic>k</italic> &#x02208; {5, 7} (see Table <xref ref-type="table" rid="T2">2</xref>). The latter result means that 44% of the MPEs of the predicted LBNs (<inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;</sup>) were equivalent to the joint truth vectors.</p>
<p>We also assessed the marginal accuracy for each axonal feature <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub>. Here we compared the <inline-graphic xlink:href="fncom-08-00150-i0002.tif"/><sup>&#x0002A;</sup> MPE with the marginal truth, class variable by class variable. We predicted features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>4</sub> with over 80% accuracy&#x02014;up to 88% in the case of <italic>C</italic><sub>1</sub>&#x02014;and feature <italic>C</italic><sub>5</sub> with 64% accuracy with <italic>k</italic> &#x0003D; 7 (see Table <xref ref-type="table" rid="T3">3</xref>). Although it may seem low, the latter result is better than chance: DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) showed that even the 40.25% accuracy for <italic>C</italic><sub>5</sub> obtained by one of the classifiers they used was better than chance. It should also be recalled that the ten neuronal types were often hard to distinguish even for expert neuroscientists (DeFelipe et al., <xref ref-type="bibr" rid="B15">2013</xref>). Regarding the prediction of individual types, accurately predicted ones included the <monospace>MA</monospace> and <monospace>HT</monospace> types, which were easy for the experts to identify, as well as types that were numerous but less clear to the experts, such as <monospace>CB</monospace> and <monospace>LB</monospace>. The least clear of the numerous types, <monospace>CT</monospace>, was predicted with relatively low accuracy (see Table <xref ref-type="table" rid="T4">4</xref>).</p>
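<p>Per-type sensitivity, as reported in the rightmost column of Table <xref ref-type="table" rid="T4">4</xref>, is the fraction of cells of a given true type (a confusion-matrix row) whose type was correctly predicted (the diagonal entry). A sketch, using the counts from Table <xref ref-type="table" rid="T4">4</xref> (the helper name is hypothetical):</p>

```python
def per_type_sensitivity(confusion, types):
    # Sensitivity of type t: diagonal count / row total (0 if the row is empty)
    out = {}
    for i, t in enumerate(types):
        total = sum(confusion[i])
        out[t] = confusion[i][i] / total if total else 0.0
    return out

types = ["CB", "CH", "CT", "HT", "LB", "MA", "NG"]
# Rows = true type, columns = predicted type (counts from Table 4)
confusion = [
    [41, 0, 10, 0, 6, 3, 0],   # CB
    [2, 0, 1, 0, 0, 0, 0],     # CH
    [10, 0, 25, 4, 8, 11, 0],  # CT
    [0, 0, 3, 10, 1, 0, 0],    # HT
    [9, 0, 3, 0, 25, 3, 1],    # LB
    [0, 0, 2, 0, 4, 36, 0],    # MA
    [8, 0, 0, 0, 0, 0, 0],     # NG
]
sens = per_type_sensitivity(confusion, types)
print({t: round(v, 2) for t, v in sens.items()})  # e.g., CB -> 0.68, MA -> 0.86
```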
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p><bold>Accuracy (in %) for each of the five axonal features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub></bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center"><italic><bold>C</bold></italic><sub><bold>1</bold></sub></th>
<th align="center"><italic><bold>C</bold></italic><sub><bold>2</bold></sub></th>
<th align="center"><italic><bold>C</bold></italic><sub><bold>3</bold></sub></th>
<th align="center"><italic><bold>C</bold></italic><sub><bold>4</bold></sub></th>
<th align="center"><italic><bold>C</bold></italic><sub><bold>5</bold></sub></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><italic>k</italic> &#x0003D; 3</td>
<td align="center">86.15 &#x000B1; 1.12</td>
<td align="center">83.17 &#x000B1; 0.98</td>
<td align="center">86.50 &#x000B1; 0.88</td>
<td align="center">83.11 &#x000B1; 0.90</td>
<td align="center">62.69 &#x000B1; 1.24</td>
</tr>
<tr>
<td align="left"><italic>k</italic> &#x0003D; 5</td>
<td align="center">86.49 &#x000B1; 0.98</td>
<td align="center">83.25 &#x000B1; 0.79</td>
<td align="center">86.05 &#x000B1; 0.79</td>
<td align="center">84.18 &#x000B1; 0.65</td>
<td align="center">63.78 &#x000B1; 1.11</td>
</tr>
<tr>
<td align="left"><italic>k</italic> &#x0003D; 7</td>
<td align="center">88.07 &#x000B1; 1.01</td>
<td align="center">83.12 &#x000B1; 0.72</td>
<td align="center">85.29 &#x000B1; 0.55</td>
<td align="center">84.06 &#x000B1; 0.74</td>
<td align="center">64.33 &#x000B1; 1.52</td>
</tr>
<tr>
<td align="left"><italic>k</italic> &#x0003D; 9</td>
<td align="center">87.16 &#x000B1; 1.03</td>
<td align="center">83.06 &#x000B1; 0.78</td>
<td align="center">85.39 &#x000B1; 0.78</td>
<td align="center">83.88 &#x000B1; 0.71</td>
<td align="center">63.79 &#x000B1; 1.59</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Here we compared the marginal true labels to the most probable explanation of the predicted label Bayesian networks. Obtained with 20 runs of 10-fold cross-validation.</italic></p>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p><bold>Confusion matrix for predicting <italic>C</italic><sub>5</sub> with <italic>k</italic> &#x0003D; 7</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center"><monospace><bold>CB</bold></monospace></th>
<th align="center"><monospace><bold>CH</bold></monospace></th>
<th align="center"><monospace><bold>CT</bold></monospace></th>
<th align="center"><monospace><bold>HT</bold></monospace></th>
<th align="center"><monospace><bold>LB</bold></monospace></th>
<th align="center"><monospace><bold>MA</bold></monospace></th>
<th align="center"><monospace><bold>NG</bold></monospace></th>
<th align="center"><bold>Per-type sensitivity</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><monospace>CB</monospace></td>
<td align="center">41</td>
<td align="center">0</td>
<td align="center">10</td>
<td align="center">0</td>
<td align="center">6</td>
<td align="center">3</td>
<td align="center">0</td>
<td align="center">0.68</td>
</tr>
<tr>
<td align="left"><monospace>CH</monospace></td>
<td align="center">2</td>
<td align="center">0</td>
<td align="center">1</td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">0.00</td>
</tr>
<tr>
<td align="left"><monospace>CT</monospace></td>
<td align="center">10</td>
<td align="center">0</td>
<td align="center">25</td>
<td align="center">4</td>
<td align="center">8</td>
<td align="center">11</td>
<td align="center">0</td>
<td align="center">0.43</td>
</tr>
<tr>
<td align="left"><monospace>HT</monospace></td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">3</td>
<td align="center">10</td>
<td align="center">1</td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">0.71</td>
</tr>
<tr>
<td align="left"><monospace>LB</monospace></td>
<td align="center">9</td>
<td align="center">0</td>
<td align="center">3</td>
<td align="center">0</td>
<td align="center">25</td>
<td align="center">3</td>
<td align="center">1</td>
<td align="center">0.61</td>
</tr>
<tr>
<td align="left"><monospace>MA</monospace></td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">2</td>
<td align="center">0</td>
<td align="center">4</td>
<td align="center">36</td>
<td align="center">0</td>
<td align="center">0.86</td>
</tr>
<tr>
<td align="left"><monospace>NG</monospace></td>
<td align="center">8</td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">0</td>
<td align="center">0.00</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Here we compared the marginal true label for <italic>C</italic><sub>5</sub> (rows) to the <italic>C</italic><sub>5</sub> value of the most probable explanation of the predicted label Bayesian network (columns). The rightmost column shows per-type sensitivity. Types <monospace>AR</monospace>, <monospace>CR</monospace>, and <monospace>OT</monospace> are omitted since no cell&#x00027;s crisp label was of any of these types. Obtained from a single run of 10-fold cross-validation.</italic></p>
</table-wrap-foot>
</table-wrap>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>4. Discussion</title>
<p>Previous studies on interneuron classification (DeFelipe et al., <xref ref-type="bibr" rid="B15">2013</xref>; Mihaljevi&#x00107; et al., <xref ref-type="bibr" rid="B51">2014a</xref>,<xref ref-type="bibr" rid="B52">b</xref>) used majority crisp labels, estimated for each axonal feature independently, to train and evaluate their models. Mihaljevi&#x00107; et al. (<xref ref-type="bibr" rid="B51">2014a</xref>) only considered <italic>C</italic><sub>5</sub>, whereas DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) and Mihaljevi&#x00107; et al. (<xref ref-type="bibr" rid="B52">2014b</xref>) predicted axonal features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub> with an independent model for each of them. There were non-methodological differences among these two studies and the present work, and therefore any comparison of results ought to be performed with some caution. DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>), for example, used 15 cells more than we did (see Section 2.4), had the values of several variables corrupted by imperfections in the reconstructions of 36 cells&#x02014;which we corrected&#x02014;and used only three values for <italic>C</italic><sub>4</sub>&#x02014;<monospace>ascending</monospace>, <monospace>descending</monospace>, and <monospace>both</monospace>. Furthermore, they used different morphometric predictor parameters (over 2000 of them) and applied a possibly more optimistic accuracy estimation technique&#x02014;leave-one-out estimation. Mihaljevi&#x00107; et al. (<xref ref-type="bibr" rid="B51">2014a</xref>) considered multiple subsets of the data, formed according to the degree of class label ambiguity of the included cells, and obtained their best results with the least ambiguous cells (e.g., with 46 cells for <italic>C</italic><sub>5</sub>). Their best results were thus obtained with a small subset of the 226 cells that we used. When using most of the cells, their results were similar to those of DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>).</p>
<p>Differences aside, in Table <xref ref-type="table" rid="T5">5</xref> we compare the accuracies from the present study with those from DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>). We outperformed DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) in predictive accuracy for every axonal feature, even though we used a single model to predict all features simultaneously. We especially outperformed their approach in predicting <italic>C</italic><sub>3</sub> and, even more, in predicting <italic>C</italic><sub>4</sub>. The latter was likely affected by the use of the additional category <monospace>no</monospace> (see subsection 2.2).</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p><bold>Our best predictive accuracy (in %) vs. best accuracy from DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>), for each of the axonal features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub></bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th align="center"><italic><bold>C</bold></italic><sub><bold>1</bold></sub></th>
<th align="center"><italic><bold>C</bold></italic><sub><bold>2</bold></sub></th>
<th align="center"><italic><bold>C</bold></italic><sub><bold>3</bold></sub></th>
<th align="center"><italic><bold>C</bold></italic><sub><bold>4</bold></sub></th>
<th align="center"><italic><bold>C</bold></italic><sub><bold>5</bold></sub></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Present study</td>
<td align="center">88.07</td>
<td align="center">83.25</td>
<td align="center">86.50</td>
<td align="center">84.18</td>
<td align="center">64.33</td>
</tr>
<tr>
<td align="left">DeFelipe et al., <xref ref-type="bibr" rid="B15">2013</xref></td>
<td align="center">85.48</td>
<td align="center">81.33</td>
<td align="center">73.86</td>
<td align="center">60.17</td>
<td align="center">62.24</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Our results were obtained with 20 runs of 10-fold cross-validation; those of DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>) were obtained with leave-one-out cross-validation.</italic></p>
</table-wrap-foot>
</table-wrap>
<p>Despite the non-methodological differences with the study by DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>), the better accuracies that we achieved might suggest some or all of the following: (a) the introduced morphometric parameters are useful for predicting interneuron type and axonal features; (b) we adequately assigned the value <monospace>no</monospace> for cells to which the other values of <italic>C</italic><sub>4</sub> did not apply; and (c) our method is adequate for classifying interneurons.</p>
<p>The above results, along with the relatively high global accuracy achieved, 44%, suggest that axonal features <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub> are interrelated and that it is useful to attempt predicting them simultaneously.</p>
<p>Finally, several other efforts to classify neurons in general have taken into account other morphological, molecular, and/or electrophysiological properties (e.g., Bota and Swanson, <xref ref-type="bibr" rid="B9">2007</xref>; Ascoli et al., <xref ref-type="bibr" rid="B2">2009</xref>; Brown and Hestrin, <xref ref-type="bibr" rid="B10">2009</xref>; Battaglia et al., <xref ref-type="bibr" rid="B4">2013</xref>; S&#x000FC;mb&#x000FC;l et al., <xref ref-type="bibr" rid="B70">2014</xref>). These studies indicate that, in spite of a large diversity of neuronal types, certain clear correlations exist between axonal features and dendritic morphologies, and between these anatomical characteristics and some molecular and electrical attributes. Nevertheless, the classification of neurons is still under intense study from different angles, including anatomical, physiological, and molecular criteria, and using a variety of mathematical approaches, such as hierarchical clustering (Cauli et al., <xref ref-type="bibr" rid="B12">2000</xref>; Wang et al., <xref ref-type="bibr" rid="B75">2002</xref>; Tsiola et al., <xref ref-type="bibr" rid="B72">2003</xref>; Benavides-Piccione et al., <xref ref-type="bibr" rid="B5">2006</xref>; Dumitriu et al., <xref ref-type="bibr" rid="B21">2007</xref>; Helmstaedter et al., <xref ref-type="bibr" rid="B34">2009a</xref>,<xref ref-type="bibr" rid="B34a">b</xref>), <italic>k</italic>-means (e.g., Karagiannis et al., <xref ref-type="bibr" rid="B39">2009</xref>), affinity propagation (Santana et al., <xref ref-type="bibr" rid="B62">2013</xref>), linear discriminant analysis (Marin et al., <xref ref-type="bibr" rid="B48">2002</xref>; Druckmann et al., <xref ref-type="bibr" rid="B18">2013</xref>), Bayesian network classifiers (Mihaljevi&#x00107; et al., <xref ref-type="bibr" rid="B51">2014a</xref>), and semi-supervised model-based clustering (Mihaljevi&#x00107; et al., <xref ref-type="bibr" rid="B52">2014b</xref>).</p>
<sec>
<title>4.1. Computing axonal morphometric parameters</title>
<p>In order to compute some of the newly introduced axonal morphometric parameters&#x02014;namely, those relating to laminar and cortical distribution&#x02014;we followed a series of assumptions originating from DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>). These assumptions (simplifications) should be kept in mind when interpreting our results. First, we assumed arbitrary columnar and laminar demarcations: we considered the diameter of the hypothetical cortical column to be 300 &#x003BC;m (Malach, <xref ref-type="bibr" rid="B47">1994</xref>; Mountcastle, <xref ref-type="bibr" rid="B53">1998</xref>), whereas laminar thickness was estimated for each neuron from its original paper, when such a paper was available, and from relevant literature otherwise. Second, we assumed that a soma is always located at the center of its layer and cortical column.</p>
</sec>
</sec>
<sec sec-type="conclusion" id="s5">
<title>5. Conclusion</title>
<p>We built a model that can automatically classify an interneuron, on the basis of a set of its axonal morphometric parameters, according to five properties which constitute the pragmatic classification scheme proposed by DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>), namely, the interneuron type and four other categorical axonal features. We guided model construction with a Bayesian network-encoded probability distribution indicating the type and axonal features of each interneuron. We obtained these probability distributions from classification choices provided by a group of leading neuroscientists. We then developed an instance-based supervised classifier which could learn from such multi-dimensional probabilistic input, predicting the output by forming a consensus among a set of Bayesian networks.</p>
<p>We accurately predicted the probabilistic labels over the interneuron type and the four remaining axonal features. Furthermore, we outperformed previous work when predicting crisp (i.e., non-probabilistic) labels. Importantly, and unlike previous work, we predicted the five axonal features simultaneously (i.e., with a single model), which is useful since these features are complementary. Our results suggest that the interneuron type and the four remaining axonal features are related and that it is useful to predict them jointly.</p>
<p>We introduced 13 axonal morphometric parameters which we defined as quantitative counterparts of the four categorical axonal features. Our results suggest that these parameters are useful for predicting the type and the four axonal features. Thus, they might be considered as objective replacements, or surrogates, of the subjective categorical axonal features.</p>
<p>This paper demonstrates a useful application of Bayesian networks in neuroscience, whose potential has been largely unexploited in this field (one exception is functional connectivity analysis from neuroimaging data; see Bielza and Larra&#x000F1;aga, <xref ref-type="bibr" rid="B6">2014</xref>).</p>
<p>It would be interesting to relax the assumption that all the neuroscientists who classified our data are equally accurate at classifying all types of interneurons, since some may be more familiar with certain interneuron types than with others. Accounting for expert competence in our model would be similar in spirit to methods for learning from a crowd of annotators, such as those of Raykar et al. (<xref ref-type="bibr" rid="B61">2010</xref>) and Welinder et al. (<xref ref-type="bibr" rid="B76">2010</xref>).</p>
<p>We also intend to consider new methods for forming a consensus among Bayesian networks.</p>
<sec>
<title>5.1. Data sharing</title>
<p>The data set and the software reproducing our study are available online, at <ext-link ext-link-type="uri" xlink:href="http://cig.fi.upm.es/bojan/gardener/">http://cig.fi.upm.es/bojan/gardener/</ext-link>.</p>
</sec>
</sec>
<sec>
<title>Author contributions</title>
<p>Bojan Mihaljevi&#x00107;, Concha Bielza, and Pedro Larra&#x000F1;aga designed the method and the empirical study. Ruth Benavides-Piccione corrected the faulty interneuron reconstructions. Ruth Benavides-Piccione and Bojan Mihaljevi&#x00107; defined the morphological variables introduced here. Bojan Mihaljevi&#x00107; performed the data analysis, implemented the necessary software, and wrote the paper. Concha Bielza, Ruth Benavides-Piccione, Javier DeFelipe, and Pedro Larra&#x000F1;aga critically revised the paper.</p>
<sec>
<title>Conflict of interest statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p></sec>
</sec>
</body>
<back>
<ack>
<p>This work has been partially supported by the Spanish Ministry of Economy and Competitiveness through the Cajal Blue Brain (C080020-09; the Spanish partner of the Blue Brain initiative from EPFL) and TIN2013-41592-P projects, by the S2013/ICE-2845-CASI-CAM-CM project, and by the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 604102 (Human Brain Project). The authors thank Pedro L. L&#x000F3;pez-Cruz and Luis Guerra for useful comments regarding the definition of the morphometric parameters introduced here.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ambroise</surname> <given-names>C.</given-names></name> <name><surname>Denoeux</surname> <given-names>T.</given-names></name> <name><surname>Govaert</surname> <given-names>G.</given-names></name> <name><surname>Smets</surname> <given-names>P.</given-names></name></person-group> (<year>2001</year>). <article-title>Learning from an imprecise teacher: probabilistic and evidential approaches</article-title>, in <source>10th International Symposium on Applied Stochastic Models and Data Analysis</source>, <volume>Vol. 1</volume> (<publisher-loc>Compiegne</publisher-loc>), <fpage>101</fpage>&#x02013;<lpage>105</lpage>.</citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ascoli</surname> <given-names>G. A.</given-names></name> <name><surname>Brown</surname> <given-names>K. M.</given-names></name> <name><surname>Calixto</surname> <given-names>E.</given-names></name> <name><surname>Card</surname> <given-names>J. P.</given-names></name> <name><surname>Galvan</surname> <given-names>E.</given-names></name> <name><surname>Perez-Rosello</surname> <given-names>T.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>Quantitative morphometry of electrophysiologically identified CA3b interneurons reveals robust local geometry and distinct cell classes</article-title>. <source>J. Comp. Neurol</source>. <volume>515</volume>, <fpage>677</fpage>&#x02013;<lpage>695</lpage>. <pub-id pub-id-type="doi">10.1002/cne.22082</pub-id><pub-id pub-id-type="pmid">19496174</pub-id></citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ascoli</surname> <given-names>G. A.</given-names></name> <name><surname>Donohue</surname> <given-names>D. E.</given-names></name> <name><surname>Halavi</surname> <given-names>M.</given-names></name></person-group> (<year>2007</year>). <article-title>Neuromorpho.org: a central resource for neuronal morphologies</article-title>. <source>J. Neurosci</source>. <volume>27</volume>, <fpage>9247</fpage>&#x02013;<lpage>9251</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.2055-07.2007</pub-id><pub-id pub-id-type="pmid">17728438</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Battaglia</surname> <given-names>D.</given-names></name> <name><surname>Karagiannis</surname> <given-names>A.</given-names></name> <name><surname>Gallopin</surname> <given-names>T.</given-names></name> <name><surname>Gutch</surname> <given-names>H. W.</given-names></name> <name><surname>Cauli</surname> <given-names>B.</given-names></name></person-group> (<year>2013</year>). <article-title>Beyond the frontiers of neuronal types</article-title>. <source>Front. Neural Circuits</source> <volume>7</volume>:<issue>13</issue>. <pub-id pub-id-type="doi">10.3389/fncir.2013.00013</pub-id><pub-id pub-id-type="pmid">23403725</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Benavides-Piccione</surname> <given-names>R.</given-names></name> <name><surname>Hamzei-Sichani</surname> <given-names>F.</given-names></name> <name><surname>Ballesteros-Y&#x000E1;&#x000F1;ez</surname> <given-names>I.</given-names></name> <name><surname>DeFelipe</surname> <given-names>J.</given-names></name> <name><surname>Yuste</surname> <given-names>R.</given-names></name></person-group> (<year>2006</year>). <article-title>Dendritic size of pyramidal neurons differs among mouse cortical regions</article-title>. <source>Cereb. Cortex</source> <volume>16</volume>, <fpage>990</fpage>&#x02013;<lpage>1001</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhj041</pub-id><pub-id pub-id-type="pmid">16195469</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bielza</surname> <given-names>C.</given-names></name> <name><surname>Larra&#x000F1;aga</surname> <given-names>P.</given-names></name></person-group> (<year>2014</year>). <article-title>Bayesian networks in neuroscience: a survey</article-title>. <source>Front. Comput. Neurosci</source>. <volume>8</volume>:<issue>131</issue>. <pub-id pub-id-type="doi">10.3389/fncom.2014.00131</pub-id><pub-id pub-id-type="pmid">25360109</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bielza</surname> <given-names>C.</given-names></name> <name><surname>Li</surname> <given-names>G.</given-names></name> <name><surname>Larra&#x000F1;aga</surname> <given-names>P.</given-names></name></person-group> (<year>2011</year>). <article-title>Multi-dimensional classification with Bayesian networks</article-title>. <source>Int. J. Approx. Reason</source>. <volume>52</volume>, <fpage>705</fpage>&#x02013;<lpage>727</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijar.2011.01.007</pub-id></citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Borchani</surname> <given-names>H.</given-names></name> <name><surname>Bielza</surname> <given-names>C.</given-names></name> <name><surname>Toro</surname> <given-names>C.</given-names></name> <name><surname>Larra&#x000F1;aga</surname> <given-names>P.</given-names></name></person-group> (<year>2013</year>). <article-title>Predicting human immunodeficiency virus inhibitors using multi-dimensional Bayesian network classifiers</article-title>. <source>Artif. Intell. Med</source>. <volume>57</volume>, <fpage>219</fpage>&#x02013;<lpage>229</lpage>. <pub-id pub-id-type="doi">10.1016/j.artmed.2012.12.005</pub-id><pub-id pub-id-type="pmid">23375464</pub-id></citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bota</surname> <given-names>M.</given-names></name> <name><surname>Swanson</surname> <given-names>L. W.</given-names></name></person-group> (<year>2007</year>). <article-title>The neuron classification problem</article-title>. <source>Brain Res. Rev</source>. <volume>56</volume>, <fpage>79</fpage>&#x02013;<lpage>88</lpage>. <pub-id pub-id-type="doi">10.1016/j.brainresrev.2007.05.005</pub-id><pub-id pub-id-type="pmid">17582506</pub-id></citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brown</surname> <given-names>S. P.</given-names></name> <name><surname>Hestrin</surname> <given-names>S.</given-names></name></person-group> (<year>2009</year>). <article-title>Cell-type identity: a key to unlocking the function of neocortical circuits</article-title>. <source>Curr. Opin. Neurobiol</source>. <volume>19</volume>, <fpage>415</fpage>&#x02013;<lpage>421</lpage>. <pub-id pub-id-type="doi">10.1016/j.conb.2009.07.011</pub-id><pub-id pub-id-type="pmid">19674891</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cauli</surname> <given-names>B.</given-names></name> <name><surname>Audinat</surname> <given-names>E.</given-names></name> <name><surname>Lambolez</surname> <given-names>B.</given-names></name> <name><surname>Angulo</surname> <given-names>M. C.</given-names></name> <name><surname>Ropert</surname> <given-names>N.</given-names></name> <name><surname>Tsuzuki</surname> <given-names>K.</given-names></name> <etal/></person-group>. (<year>1997</year>). <article-title>Molecular and physiological diversity of cortical nonpyramidal cells</article-title>. <source>J. Neurosci</source>. <volume>17</volume>, <fpage>3894</fpage>&#x02013;<lpage>3906</lpage>. <pub-id pub-id-type="pmid">9133407</pub-id></citation>
</ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cauli</surname> <given-names>B.</given-names></name> <name><surname>Porter</surname> <given-names>J. T.</given-names></name> <name><surname>Tsuzuki</surname> <given-names>K.</given-names></name> <name><surname>Lambolez</surname> <given-names>B.</given-names></name> <name><surname>Rossier</surname> <given-names>J.</given-names></name> <name><surname>Quenet</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2000</year>). <article-title>Classification of fusiform neocortical interneurons based on unsupervised clustering</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>97</volume>, <fpage>6144</fpage>&#x02013;<lpage>6149</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.97.11.6144</pub-id><pub-id pub-id-type="pmid">10823957</pub-id></citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cooper</surname> <given-names>G. F.</given-names></name> <name><surname>Herskovits</surname> <given-names>E.</given-names></name></person-group> (<year>1992</year>). <article-title>A Bayesian method for the induction of probabilistic networks from data</article-title>. <source>Mach. Learn</source>. <volume>9</volume>, <fpage>309</fpage>&#x02013;<lpage>347</lpage>. <pub-id pub-id-type="doi">10.1007/BF00994110</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dawid</surname> <given-names>A. P.</given-names></name> <name><surname>Skene</surname> <given-names>A. M.</given-names></name></person-group> (<year>1979</year>). <article-title>Maximum likelihood estimation of observer error-rates using the EM algorithm</article-title>. <source>J. R. Stat. Soc. Ser. C Appl. Stat</source>. <volume>28</volume>, <fpage>20</fpage>&#x02013;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.2307/2346806</pub-id></citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>DeFelipe</surname> <given-names>J.</given-names></name> <name><surname>L&#x000F3;pez-Cruz</surname> <given-names>P. L.</given-names></name> <name><surname>Benavides-Piccione</surname> <given-names>R.</given-names></name> <name><surname>Bielza</surname> <given-names>C.</given-names></name> <name><surname>Larra&#x000F1;aga</surname> <given-names>P.</given-names></name> <name><surname>Anderson</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>New insights into the classification and nomenclature of cortical GABAergic interneurons</article-title>. <source>Nat. Rev. Neurosci</source>. <volume>14</volume>, <fpage>202</fpage>&#x02013;<lpage>216</lpage>. <pub-id pub-id-type="doi">10.1038/nrn3444</pub-id><pub-id pub-id-type="pmid">23385869</pub-id></citation>
</ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Del Sagrado</surname> <given-names>J.</given-names></name> <name><surname>Moral</surname> <given-names>S.</given-names></name></person-group> (<year>2003</year>). <article-title>Qualitative combination of Bayesian networks</article-title>. <source>Int. J. Intell. Syst</source>. <volume>18</volume>, <fpage>237</fpage>&#x02013;<lpage>249</lpage>. <pub-id pub-id-type="doi">10.1002/int.10086</pub-id></citation>
</ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Denoeux</surname> <given-names>T.</given-names></name></person-group> (<year>1995</year>). <article-title>A k-nearest neighbor classification rule based on Dempster-Shafer theory</article-title>. <source>IEEE Trans. Syst. Man Cybern</source>. <volume>25</volume>, <fpage>804</fpage>&#x02013;<lpage>813</lpage>. <pub-id pub-id-type="doi">10.1109/21.376493</pub-id></citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Druckmann</surname> <given-names>S.</given-names></name> <name><surname>Hill</surname> <given-names>S.</given-names></name> <name><surname>Sch&#x000FC;rmann</surname> <given-names>F.</given-names></name> <name><surname>Markram</surname> <given-names>H.</given-names></name> <name><surname>Segev</surname> <given-names>I.</given-names></name></person-group> (<year>2013</year>). <article-title>A hierarchical structure of cortical interneuron electrical diversity revealed by automated statistical analysis</article-title>. <source>Cereb. Cortex</source> <volume>23</volume>, <fpage>2994</fpage>&#x02013;<lpage>3006</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhs290</pub-id><pub-id pub-id-type="pmid">22989582</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Duda</surname> <given-names>R. O.</given-names></name> <name><surname>Hart</surname> <given-names>P. E.</given-names></name> <name><surname>Stork</surname> <given-names>D. G.</given-names></name></person-group> (<year>2000</year>). <source>Pattern Classification, 2nd Edn</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Wiley-Interscience</publisher-name>.</citation>
</ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dudani</surname> <given-names>S. A.</given-names></name></person-group> (<year>1976</year>). <article-title>The distance-weighted k-nearest-neighbor rule</article-title>. <source>IEEE Trans. Syst. Man Cybern</source>. <volume>SMC-6</volume>, <fpage>325</fpage>&#x02013;<lpage>327</lpage>. <pub-id pub-id-type="doi">10.1109/TSMC.1976.5408784</pub-id></citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dumitriu</surname> <given-names>D.</given-names></name> <name><surname>Cossart</surname> <given-names>R.</given-names></name> <name><surname>Huang</surname> <given-names>J.</given-names></name> <name><surname>Yuste</surname> <given-names>R.</given-names></name></person-group> (<year>2007</year>). <article-title>Correlation between axonal morphologies and synaptic input kinetics of interneurons from mouse visual cortex</article-title>. <source>Cereb. Cortex</source> <volume>17</volume>, <fpage>81</fpage>&#x02013;<lpage>91</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhj126</pub-id><pub-id pub-id-type="pmid">16467567</pub-id></citation>
</ref>
<ref id="B22">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>El Gayar</surname> <given-names>N.</given-names></name> <name><surname>Schwenker</surname> <given-names>F.</given-names></name> <name><surname>Palm</surname> <given-names>G.</given-names></name></person-group> (<year>2006</year>). <article-title>A study of the robustness of k-NN classifiers trained using soft labels</article-title>, in <source>Artificial Neural Networks in Pattern Recognition, Vol. 4087 of Lecture Notes in Computer Science</source>, eds <person-group person-group-type="editor"><name><surname>Schwenker</surname> <given-names>F.</given-names></name> <name><surname>Marinai</surname> <given-names>S.</given-names></name></person-group> (<publisher-loc>Berlin, Heidelberg</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>67</fpage>&#x02013;<lpage>80</lpage>.</citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Etminani</surname> <given-names>K.</given-names></name> <name><surname>Naghibzadeh</surname> <given-names>M.</given-names></name> <name><surname>Pe&#x000F1;a</surname> <given-names>J. M.</given-names></name></person-group> (<year>2013</year>). <article-title>DemocraticOP: a democratic way of aggregating Bayesian network parameters</article-title>. <source>Int. J. Approx. Reason</source>. <volume>54</volume>, <fpage>602</fpage>&#x02013;<lpage>614</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijar.2012.12.002</pub-id></citation>
</ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fair&#x000E9;n</surname> <given-names>A.</given-names></name> <name><surname>Regidor</surname> <given-names>J.</given-names></name> <name><surname>Kruger</surname> <given-names>L.</given-names></name></person-group> (<year>1992</year>). <article-title>The cerebral cortex of the mouse (a first contribution - the &#x02018;acoustic&#x02019; cortex)</article-title>. <source>Somatosens. Mot. Res</source>. <volume>9</volume>, <fpage>3</fpage>&#x02013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.3109/08990229209144760</pub-id><pub-id pub-id-type="pmid">1317625</pub-id></citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fix</surname> <given-names>E.</given-names></name> <name><surname>Hodges</surname> <given-names>J. L.</given-names></name></person-group> (<year>1989</year>). <article-title>Discriminatory analysis. Nonparametric discrimination: consistency properties</article-title>. <source>Int. Statist. Rev</source>. <volume>57</volume>, <fpage>238</fpage>&#x02013;<lpage>247</lpage>.</citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Glaser</surname> <given-names>J. R.</given-names></name> <name><surname>Glaser</surname> <given-names>E. M.</given-names></name></person-group> (<year>1990</year>). <article-title>Neuron imaging with Neurolucida &#x02014; A PC-based system for image combining microscopy</article-title>. <source>Comput. Med. Imaging Graph</source>. <volume>14</volume>, <fpage>307</fpage>&#x02013;<lpage>317</lpage>. <pub-id pub-id-type="doi">10.1016/0895-6111(90)90105-K</pub-id><pub-id pub-id-type="pmid">2224829</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Glover</surname> <given-names>F.</given-names></name></person-group> (<year>1989</year>). <article-title>Tabu search-part I</article-title>. <source>ORSA J. Comput</source>. <volume>1</volume>, <fpage>190</fpage>&#x02013;<lpage>206</lpage>. <pub-id pub-id-type="doi">10.1287/ijoc.1.3.190</pub-id></citation>
</ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Glover</surname> <given-names>F.</given-names></name></person-group> (<year>1990</year>). <article-title>Tabu search-part II</article-title>. <source>ORSA J. Comput</source>. <volume>2</volume>, <fpage>4</fpage>&#x02013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1287/ijoc.2.1.4</pub-id></citation>
</ref>
<ref id="B30">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Grandvalet</surname> <given-names>Y.</given-names></name></person-group> (<year>2002</year>). <article-title>Logistic regression for partial labels</article-title>, in <source>9th International Conference on Information Processing and Management of Uncertainty in Knowledge-based Systems (IPMU &#x00027;02)</source>, <volume>Vol. 3</volume> (<publisher-loc>Annecy</publisher-loc>), <fpage>1935</fpage>&#x02013;<lpage>1941</lpage>.</citation>
</ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guerra</surname> <given-names>L.</given-names></name> <name><surname>McGarry</surname> <given-names>L. M.</given-names></name> <name><surname>Robles</surname> <given-names>V.</given-names></name> <name><surname>Bielza</surname> <given-names>C.</given-names></name> <name><surname>Larra&#x000F1;aga</surname> <given-names>P.</given-names></name> <name><surname>Yuste</surname> <given-names>R.</given-names></name></person-group> (<year>2011</year>). <article-title>Comparison between supervised and unsupervised classifications of neuronal cell types: a case study</article-title>. <source>Dev. Neurobiol</source>. <volume>71</volume>, <fpage>71</fpage>&#x02013;<lpage>82</lpage>. <pub-id pub-id-type="doi">10.1002/dneu.20809</pub-id><pub-id pub-id-type="pmid">21154911</pub-id></citation>
</ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gupta</surname> <given-names>A.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Markram</surname> <given-names>H.</given-names></name></person-group> (<year>2000</year>). <article-title>Organizing principles for a diversity of GABAergic interneurons and synapses in the neocortex</article-title>. <source>Science</source> <volume>287</volume>, <fpage>273</fpage>&#x02013;<lpage>278</lpage>. <pub-id pub-id-type="doi">10.1126/science.287.5451.273</pub-id><pub-id pub-id-type="pmid">10634775</pub-id></citation>
</ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Heckerman</surname> <given-names>D.</given-names></name> <name><surname>Geiger</surname> <given-names>D.</given-names></name> <name><surname>Chickering</surname> <given-names>D. M.</given-names></name></person-group> (<year>1995</year>). <article-title>Learning Bayesian networks: the combination of knowledge and statistical data</article-title>. <source>Mach. Learn</source>. <volume>20</volume>, <fpage>197</fpage>&#x02013;<lpage>243</lpage>. <pub-id pub-id-type="doi">10.1007/BF00994016</pub-id></citation>
</ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Helmstaedter</surname> <given-names>M.</given-names></name> <name><surname>Sakmann</surname> <given-names>B.</given-names></name> <name><surname>Feldmeyer</surname> <given-names>D.</given-names></name></person-group> (<year>2009a</year>). <article-title>L2/3 interneuron groups defined by multiparameter analysis of axonal projection, dendritic geometry, and electrical excitability</article-title>. <source>Cereb. Cortex</source> <volume>19</volume>, <fpage>951</fpage>&#x02013;<lpage>962</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhn130</pub-id><pub-id pub-id-type="pmid">18802122</pub-id></citation>
</ref>
<ref id="B34a">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Helmstaedter</surname> <given-names>M.</given-names></name> <name><surname>Sakmann</surname> <given-names>B.</given-names></name> <name><surname>Feldmeyer</surname> <given-names>D.</given-names></name></person-group> (<year>2009b</year>). <article-title>The relation between dendritic geometry, electrical excitability, and axonal projections of L2/3 interneurons in rat barrel cortex</article-title>. <source>Cereb. Cortex</source> <volume>19</volume>, <fpage>938</fpage>&#x02013;<lpage>950</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhn138</pub-id><pub-id pub-id-type="pmid">18787231</pub-id></citation>
</ref>
<ref id="B36">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Henrion</surname> <given-names>M.</given-names></name></person-group> (<year>1986</year>). <article-title>Propagating uncertainty in Bayesian networks by probabilistic logic sampling</article-title>, in <source>Second Annual Conference on Uncertainty in Artificial Intelligence, UAI &#x00027;86</source> (<publisher-loc>Philadelphia, PA</publisher-loc>), <fpage>149</fpage>&#x02013;<lpage>164</lpage>.</citation>
</ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jain</surname> <given-names>A. K.</given-names></name></person-group> (<year>2010</year>). <article-title>Data clustering: 50 years beyond k-means</article-title>. <source>Patt. Recogn. Lett</source>. <volume>31</volume>, <fpage>651</fpage>&#x02013;<lpage>666</lpage>. <pub-id pub-id-type="doi">10.1016/j.patrec.2009.09.011</pub-id></citation>
</ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>J&#x000F3;&#x0017A;wik</surname> <given-names>A.</given-names></name></person-group> (<year>1983</year>). <article-title>A learning scheme for a fuzzy k-NN rule</article-title>. <source>Patt. Recogn. Lett</source>. <volume>1</volume>, <fpage>287</fpage>&#x02013;<lpage>289</lpage>. <pub-id pub-id-type="doi">10.1016/0167-8655(83)90064-8</pub-id></citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Karagiannis</surname> <given-names>A.</given-names></name> <name><surname>Gallopin</surname> <given-names>T.</given-names></name> <name><surname>D&#x000E1;vid</surname> <given-names>C.</given-names></name> <name><surname>Battaglia</surname> <given-names>D.</given-names></name> <name><surname>Geoffroy</surname> <given-names>H.</given-names></name> <name><surname>Rossier</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2009</year>). <article-title>Classification of NPY-expressing neocortical interneurons</article-title>. <source>J. Neurosci</source>. <volume>29</volume>, <fpage>3642</fpage>&#x02013;<lpage>3659</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.0058-09.2009</pub-id><pub-id pub-id-type="pmid">19295167</pub-id></citation>
</ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kawaguchi</surname> <given-names>Y.</given-names></name></person-group> (<year>1993</year>). <article-title>Physiological, morphological, and histochemical characterization of three classes of interneurons in rat neostriatum</article-title>. <source>J. Neurosci</source>. <volume>13</volume>, <fpage>4908</fpage>&#x02013;<lpage>4923</lpage>. <pub-id pub-id-type="pmid">7693897</pub-id></citation>
</ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Keller</surname> <given-names>J. M.</given-names></name> <name><surname>Gray</surname> <given-names>M. R.</given-names></name> <name><surname>Givens</surname> <given-names>J. A.</given-names></name></person-group> (<year>1985</year>). <article-title>A fuzzy k-nearest neighbor algorithm</article-title>. <source>IEEE Trans. Syst. Man Cybern</source>. <volume>SMC-15</volume>, <fpage>580</fpage>&#x02013;<lpage>585</lpage>. <pub-id pub-id-type="doi">10.1109/TSMC.1985.6313426</pub-id></citation>
</ref>
<ref id="B42">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Koller</surname> <given-names>D.</given-names></name> <name><surname>Friedman</surname> <given-names>N.</given-names></name></person-group> (<year>2009</year>). <source>Probabilistic Graphical Models: Principles and Techniques</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT press</publisher-name>.</citation>
</ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kullback</surname> <given-names>S.</given-names></name> <name><surname>Leibler</surname> <given-names>R. A.</given-names></name></person-group> (<year>1951</year>). <article-title>On information and sufficiency</article-title>. <source>Ann. Math. Stat</source>. <volume>22</volume>, <fpage>79</fpage>&#x02013;<lpage>86</lpage>. <pub-id pub-id-type="doi">10.1214/aoms/1177729694</pub-id></citation>
</ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>L&#x000F3;pez-Cruz</surname> <given-names>P. L.</given-names></name> <name><surname>Larra&#x000F1;aga</surname> <given-names>P.</given-names></name> <name><surname>DeFelipe</surname> <given-names>J.</given-names></name> <name><surname>Bielza</surname> <given-names>C.</given-names></name></person-group> (<year>2014</year>). <article-title>Bayesian network modeling of the consensus between experts: an application to neuron classification</article-title>. <source>Int. J. Approx. Reason</source>. <volume>55</volume>, <fpage>3</fpage>&#x02013;<lpage>22</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijar.2013.03.011</pub-id></citation>
</ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maccaferri</surname> <given-names>G.</given-names></name> <name><surname>Lacaille</surname> <given-names>J.-C.</given-names></name></person-group> (<year>2003</year>). <article-title>Interneuron diversity series: hippocampal interneuron classifications&#x02013;making things as simple as possible, not simpler</article-title>. <source>Trends Neurosci</source>. <volume>26</volume>, <fpage>564</fpage>&#x02013;<lpage>571</lpage>. <pub-id pub-id-type="doi">10.1016/j.tins.2003.08.002</pub-id><pub-id pub-id-type="pmid">14522150</pub-id></citation>
</ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>MacLeod</surname> <given-names>J. E.</given-names></name> <name><surname>Luk</surname> <given-names>A.</given-names></name> <name><surname>Titterington</surname> <given-names>D.</given-names></name></person-group> (<year>1987</year>). <article-title>A re-examination of the distance-weighted k-nearest neighbor classification rule</article-title>. <source>IEEE Trans. Syst. Man Cybern</source>. <volume>17</volume>, <fpage>689</fpage>&#x02013;<lpage>696</lpage>. <pub-id pub-id-type="doi">10.1109/TSMC.1987.289362</pub-id></citation>
</ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Malach</surname> <given-names>R.</given-names></name></person-group> (<year>1994</year>). <article-title>Cortical columns as devices for maximizing neuronal diversity</article-title>. <source>Trends Neurosci</source>. <volume>17</volume>, <fpage>101</fpage>&#x02013;<lpage>104</lpage>. <pub-id pub-id-type="doi">10.1016/0166-2236(94)90113-9</pub-id><pub-id pub-id-type="pmid">7515522</pub-id></citation>
</ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marin</surname> <given-names>E. C.</given-names></name> <name><surname>Jefferis</surname> <given-names>G. S.</given-names></name> <name><surname>Komiyama</surname> <given-names>T.</given-names></name> <name><surname>Zhu</surname> <given-names>H.</given-names></name> <name><surname>Luo</surname> <given-names>L.</given-names></name></person-group> (<year>2002</year>). <article-title>Representation of the glomerular olfactory map in the Drosophila brain</article-title>. <source>Cell</source> <volume>109</volume>, <fpage>243</fpage>&#x02013;<lpage>255</lpage>. <pub-id pub-id-type="doi">10.1016/S0092-8674(02)00700-6</pub-id><pub-id pub-id-type="pmid">12007410</pub-id></citation>
</ref>
<ref id="B49">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Matzkevich</surname> <given-names>I.</given-names></name> <name><surname>Abramson</surname> <given-names>B.</given-names></name></person-group> (<year>1992</year>). <article-title>The topological fusion of Bayes nets</article-title>, in <source>Proceedings of the Eighth International Conference on Uncertainty in Artificial Intelligence</source> (<publisher-loc>Stanford, CA</publisher-loc>: <publisher-name>Morgan Kaufmann Publishers Inc.</publisher-name>), <fpage>191</fpage>&#x02013;<lpage>198</lpage>.</citation>
</ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>McGarry</surname> <given-names>L. M.</given-names></name> <name><surname>Packer</surname> <given-names>A. M.</given-names></name> <name><surname>Fino</surname> <given-names>E.</given-names></name> <name><surname>Nikolenko</surname> <given-names>V.</given-names></name> <name><surname>Sippy</surname> <given-names>T.</given-names></name> <name><surname>Yuste</surname> <given-names>R.</given-names></name></person-group> (<year>2010</year>). <article-title>Quantitative classification of somatostatin-positive neocortical interneurons identifies three interneuron subtypes</article-title>. <source>Front. Neural Circuits</source> <volume>4</volume>:<issue>12</issue>. <pub-id pub-id-type="doi">10.3389/fncir.2010.00012</pub-id><pub-id pub-id-type="pmid">20617186</pub-id></citation>
</ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mihaljevi&#x00107;</surname> <given-names>B.</given-names></name> <name><surname>Benavides-Piccione</surname> <given-names>R.</given-names></name> <name><surname>Bielza</surname> <given-names>C.</given-names></name> <name><surname>DeFelipe</surname> <given-names>J.</given-names></name> <name><surname>Larra&#x000F1;aga</surname> <given-names>P.</given-names></name></person-group> (<year>2014a</year>). <article-title>Bayesian network classifiers for categorizing cortical GABAergic interneurons</article-title>. <source>Neuroinformatics</source>. (in press). <pub-id pub-id-type="doi">10.1007/s12021-014-9254-1</pub-id></citation>
</ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mihaljevi&#x00107;</surname> <given-names>B.</given-names></name> <name><surname>Benavides-Piccione</surname> <given-names>R.</given-names></name> <name><surname>Guerra</surname> <given-names>L.</given-names></name> <name><surname>DeFelipe</surname> <given-names>J.</given-names></name> <name><surname>Larra&#x000F1;aga</surname> <given-names>P.</given-names></name> <name><surname>Bielza</surname> <given-names>C.</given-names></name></person-group> (<year>2014b</year>). <article-title>Classifying GABAergic interneurons with semi-supervised projected model-based clustering</article-title>. <source>Artif. Intell. Med</source>. (in press).</citation>
</ref>
<ref id="B53">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Mountcastle</surname> <given-names>V. B.</given-names></name></person-group> (<year>1998</year>). <source>Perceptual Neuroscience: The Cerebral Cortex</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>Harvard University Press</publisher-name>.</citation>
</ref>
<ref id="B54">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Nagarajan</surname> <given-names>R.</given-names></name> <name><surname>Scutari</surname> <given-names>M.</given-names></name> <name><surname>L&#x000E9;bre</surname> <given-names>S.</given-names></name></person-group> (<year>2013</year>). <source>Bayesian Networks in R: with Applications in Systems Biology (Use R!)</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer</publisher-name>. <pub-id pub-id-type="doi">10.1007/978-1-4614-6446-4</pub-id></citation>
</ref>
<ref id="B55">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Neapolitan</surname> <given-names>R. E.</given-names></name></person-group> (<year>2004</year>). <source>Learning Bayesian Networks</source>. <publisher-loc>Upper Saddle River, NJ</publisher-loc>: <publisher-name>Pearson Prentice Hall</publisher-name>.</citation>
</ref>
<ref id="B56">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Pearl</surname> <given-names>J.</given-names></name></person-group> (<year>1988</year>). <source>Probabilistic Reasoning in Intelligent Systems</source>. <publisher-loc>San Francisco, CA</publisher-loc>: <publisher-name>Morgan Kaufmann</publisher-name>.</citation>
</ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pe&#x000F1;a</surname> <given-names>J. M.</given-names></name></person-group> (<year>2011</year>). <article-title>Finding consensus Bayesian network structures</article-title>. <source>J. Artif. Intell. Res</source>. <volume>42</volume>, <fpage>661</fpage>&#x02013;<lpage>687</lpage>.</citation>
</ref>
<ref id="B58">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Pennock</surname> <given-names>D. M.</given-names></name> <name><surname>Wellman</surname> <given-names>M. P.</given-names></name></person-group> (<year>1999</year>). <article-title>Graphical representations of consensus belief</article-title>, in <source>Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence</source> (<publisher-loc>Stockholm</publisher-loc>: <publisher-name>Morgan Kaufmann Publishers Inc.</publisher-name>), <fpage>531</fpage>&#x02013;<lpage>540</lpage>.</citation>
</ref>
<ref id="B59">
<citation citation-type="book"><person-group person-group-type="author"><collab>R Core Team.</collab></person-group> (<year>2014</year>). <source><monospace>R</monospace>: A Language and Environment for Statistical Computing</source>. <publisher-loc>Vienna</publisher-loc>: <publisher-name>R Foundation for Statistical Computing</publisher-name>.</citation>
</ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Raykar</surname> <given-names>V. C.</given-names></name> <name><surname>Yu</surname> <given-names>S.</given-names></name></person-group> (<year>2012</year>). <article-title>Eliminating spammers and ranking annotators for crowdsourced labeling tasks</article-title>. <source>J. Mach. Learn. Res</source>. <volume>13</volume>, <fpage>491</fpage>&#x02013;<lpage>518</lpage>.</citation>
</ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Raykar</surname> <given-names>V. C.</given-names></name> <name><surname>Yu</surname> <given-names>S.</given-names></name> <name><surname>Zhao</surname> <given-names>L. H.</given-names></name> <name><surname>Valadez</surname> <given-names>G. H.</given-names></name> <name><surname>Florin</surname> <given-names>C.</given-names></name> <name><surname>Bogoni</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2010</year>). <article-title>Learning from crowds</article-title>. <source>J. Mach. Learn. Res</source>. <volume>11</volume>, <fpage>1297</fpage>&#x02013;<lpage>1322</lpage>.</citation>
</ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Santana</surname> <given-names>R.</given-names></name> <name><surname>McGarry</surname> <given-names>L. M.</given-names></name> <name><surname>Bielza</surname> <given-names>C.</given-names></name> <name><surname>Larra&#x000F1;aga</surname> <given-names>P.</given-names></name> <name><surname>Yuste</surname> <given-names>R.</given-names></name></person-group> (<year>2013</year>). <article-title>Classification of neocortical interneurons using affinity propagation</article-title>. <source>Front. Neural Circuits</source> <volume>7</volume>:<issue>185</issue>. <pub-id pub-id-type="doi">10.3389/fncir.2013.00185</pub-id><pub-id pub-id-type="pmid">24348339</pub-id></citation>
</ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scherer</surname> <given-names>S.</given-names></name> <name><surname>Kane</surname> <given-names>J.</given-names></name> <name><surname>Gobl</surname> <given-names>C.</given-names></name> <name><surname>Schwenker</surname> <given-names>F.</given-names></name></person-group> (<year>2013</year>). <article-title>Investigating fuzzy-input fuzzy-output support vector machines for robust voice quality classification</article-title>. <source>Comput. Speech Lang</source>. <volume>27</volume>, <fpage>263</fpage>&#x02013;<lpage>287</lpage>. <pub-id pub-id-type="doi">10.1016/j.csl.2012.06.001</pub-id></citation>
</ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwarz</surname> <given-names>G.</given-names></name></person-group> (<year>1978</year>). <article-title>Estimating the dimension of a model</article-title>. <source>Ann. Stat</source>. <volume>6</volume>, <fpage>461</fpage>&#x02013;<lpage>464</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1176344136</pub-id></citation>
</ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwenker</surname> <given-names>F.</given-names></name> <name><surname>Trentin</surname> <given-names>E.</given-names></name></person-group> (<year>2014</year>). <article-title>Pattern classification and clustering: a review of partially supervised learning approaches</article-title>. <source>Patt. Recogn. Lett</source>. <volume>37</volume>, <fpage>4</fpage>&#x02013;<lpage>14</lpage>. <pub-id pub-id-type="doi">10.1016/j.patrec.2013.10.017</pub-id></citation>
</ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scutari</surname> <given-names>M.</given-names></name></person-group> (<year>2010</year>). <article-title>Learning Bayesian networks with the bnlearn <monospace>R</monospace> package</article-title>. <source>J. Stat. Soft</source>. <volume>35</volume>, <fpage>1</fpage>&#x02013;<lpage>22</lpage>.</citation>
</ref>
<ref id="B67">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Snow</surname> <given-names>R.</given-names></name> <name><surname>O&#x00027;Connor</surname> <given-names>B.</given-names></name> <name><surname>Jurafsky</surname> <given-names>D.</given-names></name> <name><surname>Ng</surname> <given-names>A. Y.</given-names></name></person-group> (<year>2008</year>). <article-title>Cheap and fast&#x02014;but is it good?: evaluating non-expert annotations for natural language tasks</article-title>, in <source>Proceedings of the Conference on Empirical Methods in Natural Language Processing</source> (<publisher-loc>Honolulu, HI</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>), <fpage>254</fpage>&#x02013;<lpage>263</lpage>.</citation>
</ref>
<ref id="B68">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Somogyi</surname> <given-names>P.</given-names></name> <name><surname>Tam&#x000E1;s</surname> <given-names>G.</given-names></name> <name><surname>Lujan</surname> <given-names>R.</given-names></name> <name><surname>Buhl</surname> <given-names>E. H.</given-names></name></person-group> (<year>1998</year>). <article-title>Salient features of synaptic organisation in the cerebral cortex</article-title>. <source>Brain Res. Rev</source>. <volume>26</volume>, <fpage>113</fpage>&#x02013;<lpage>135</lpage>. <pub-id pub-id-type="doi">10.1016/S0165-0173(97)00061-1</pub-id><pub-id pub-id-type="pmid">9651498</pub-id></citation>
</ref>
<ref id="B69">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sorokin</surname> <given-names>A.</given-names></name> <name><surname>Forsyth</surname> <given-names>D.</given-names></name></person-group> (<year>2008</year>). <article-title>Utility data annotation with Amazon Mechanical Turk</article-title>, in <source>First IEEE Workshop on Internet Vision at CVPR</source> (<publisher-loc>Anchorage, AK</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>8</lpage>.</citation>
</ref>
<ref id="B70">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>S&#x000FC;mb&#x000FC;l</surname> <given-names>U.</given-names></name> <name><surname>Song</surname> <given-names>S.</given-names></name> <name><surname>McCulloch</surname> <given-names>K.</given-names></name> <name><surname>Becker</surname> <given-names>M.</given-names></name> <name><surname>Lin</surname> <given-names>B.</given-names></name> <name><surname>Sanes</surname> <given-names>J. R.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>A genetic and computational approach to structurally classify neuronal types</article-title>. <source>Nat. Commun</source>. <volume>5</volume>, <fpage>3512</fpage>. <pub-id pub-id-type="pmid">24662602</pub-id></citation>
</ref>
<ref id="B71">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Thiel</surname> <given-names>C.</given-names></name> <name><surname>Scherer</surname> <given-names>S.</given-names></name> <name><surname>Schwenker</surname> <given-names>F.</given-names></name></person-group> (<year>2007</year>). <article-title>Fuzzy-input fuzzy-output one-against-all support vector machines</article-title>, in <source>Knowledge-Based Intelligent Information and Engineering Systems, Volume 4694 of Lecture Notes in Computer Science</source> (<publisher-loc>Vietri sul Mare</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>156</fpage>&#x02013;<lpage>165</lpage>.</citation>
</ref>
<ref id="B72">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tsiola</surname> <given-names>A.</given-names></name> <name><surname>Hamzei-Sichani</surname> <given-names>F.</given-names></name> <name><surname>Peterlin</surname> <given-names>Z.</given-names></name> <name><surname>Yuste</surname> <given-names>R.</given-names></name></person-group> (<year>2003</year>). <article-title>Quantitative morphologic classification of layer 5 neurons from mouse primary visual cortex</article-title>. <source>J. Comp. Neurol</source>. <volume>461</volume>, <fpage>415</fpage>&#x02013;<lpage>428</lpage>. <pub-id pub-id-type="doi">10.1002/cne.10628</pub-id><pub-id pub-id-type="pmid">12746859</pub-id></citation>
</ref>
<ref id="B73">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Turner</surname> <given-names>M. D.</given-names></name> <name><surname>Chakrabarti</surname> <given-names>C.</given-names></name> <name><surname>Jones</surname> <given-names>T. B.</given-names></name> <name><surname>Xu</surname> <given-names>J. F.</given-names></name> <name><surname>Fox</surname> <given-names>P. T.</given-names></name> <name><surname>Luger</surname> <given-names>G. F.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Automated annotation of functional imaging experiments via multi-label classification</article-title>. <source>Front. Neurosci</source>. <volume>7</volume>:<issue>240</issue>. <pub-id pub-id-type="doi">10.3389/fnins.2013.00240</pub-id><pub-id pub-id-type="pmid">24409112</pub-id></citation>
</ref>
<ref id="B74">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Van Der Gaag</surname> <given-names>L. C.</given-names></name> <name><surname>De Waal</surname> <given-names>P. R.</given-names></name></person-group> (<year>2006</year>). <article-title>Multi-dimensional Bayesian network classifiers</article-title>, in <source>Probabilistic Graphical Models</source> (<publisher-loc>Prague</publisher-loc>), <fpage>107</fpage>&#x02013;<lpage>114</lpage>.</citation>
</ref>
<ref id="B75">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Gupta</surname> <given-names>A.</given-names></name> <name><surname>Toledo-Rodriguez</surname> <given-names>M.</given-names></name> <name><surname>Wu</surname> <given-names>C. Z.</given-names></name> <name><surname>Markram</surname> <given-names>H.</given-names></name></person-group> (<year>2002</year>). <article-title>Anatomical, physiological, molecular and circuit properties of nest basket cells in the developing somatosensory cortex</article-title>. <source>Cereb. Cortex</source> <volume>12</volume>, <fpage>395</fpage>&#x02013;<lpage>410</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/12.4.395</pub-id><pub-id pub-id-type="pmid">11884355</pub-id></citation>
</ref>
<ref id="B76">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Welinder</surname> <given-names>P.</given-names></name> <name><surname>Branson</surname> <given-names>S.</given-names></name> <name><surname>Belongie</surname> <given-names>S.</given-names></name> <name><surname>Perona</surname> <given-names>P.</given-names></name></person-group> (<year>2010</year>). <article-title>The multidimensional wisdom of crowds</article-title>, in <source>Advances in Neural Information Processing Systems 23 (NIPS)</source> (<publisher-loc>Vancouver, BC</publisher-loc>), <fpage>2424</fpage>&#x02013;<lpage>2432</lpage>.</citation>
</ref>
<ref id="B77">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Whitehill</surname> <given-names>J.</given-names></name> <name><surname>Ruvolo</surname> <given-names>P.</given-names></name> <name><surname>Wu</surname> <given-names>T.</given-names></name> <name><surname>Bergsma</surname> <given-names>J.</given-names></name> <name><surname>Movellan</surname> <given-names>J. R.</given-names></name></person-group> (<year>2009</year>). <article-title>Whose vote should count more: optimal integration of labels from labelers of unknown expertise</article-title>, in <source>Advances in Neural Information Processing Systems (NIPS)</source>, <volume>Vol. 22</volume> (<publisher-loc>Vancouver, BC</publisher-loc>), <fpage>2035</fpage>&#x02013;<lpage>2043</lpage>.</citation>
</ref>
<ref id="B78">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yazdani</surname> <given-names>A.</given-names></name> <name><surname>Ebrahimi</surname> <given-names>T.</given-names></name> <name><surname>Hoffmann</surname> <given-names>U.</given-names></name></person-group> (<year>2009</year>). <article-title>Classification of EEG signals using Dempster Shafer theory and a k-nearest neighbor classifier</article-title>, in <source>4th International IEEE/EMBS Conference on Neural Engineering, NER&#x00027;09</source> (<publisher-loc>Antalya</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>327</fpage>&#x02013;<lpage>330</lpage>.</citation>
</ref>
</ref-list>
<fn-group>
<fn id="fn0001"><p><sup>1</sup>Following DeFelipe et al. (<xref ref-type="bibr" rid="B15">2013</xref>), we will often refer to the type and the four axonal features simply as <italic>axonal features</italic> (i.e., interneuron type is also encompassed by this term).</p></fn>
<fn id="fn0002"><p><sup>2</sup>From this point on, we adopt some machine learning terminology. Namely, in machine learning, each (discrete) predicted variable&#x02014;in our case, the five axonal features&#x02014;is called a <italic>class variable</italic>; this term, therefore, applies to each of the five axonal features, even though only the neuronal type is a class in the usual meaning of this term in neuroscience. <italic>Class labels</italic> are the assignments to the class variables associated with the data points (interneurons); e.g., an interneuron might be labeled as intralaminar. We will sometimes refer to expert neuroscientists as <italic>annotators</italic> because they annotated (labeled) the data with class labels. <italic>Predictor variables</italic> are the independent variables in a model; e.g., we use morphometric properties of an interneuron as predictor variables to predict interneuron type and other axonal features (the latter are the class variables). We will not use the term &#x0201C;features&#x0201D; as a synonym of &#x0201C;predictor variables,&#x0201D; as is commonly done in machine learning, in order to avoid confusion with the term &#x0201C;axonal features.&#x0201D; We will use the terms &#x0201C;axonal features&#x0201D; and &#x0201C;class variables&#x0201D; interchangeably.</p></fn>
<fn id="fn0003"><p><sup>3</sup>Only neuroscientists who considered that <bold>x</bold><sup>(<italic>j</italic>)</sup> was <monospace>characterized</monospace> (axonal feature <italic>C</italic><sub>6</sub>) labeled <bold>x</bold><sup>(<italic>j</italic>)</sup> according to <italic>C</italic><sub>1</sub>&#x02013;<italic>C</italic><sub>5</sub>. Therefore, <italic>N</italic><sub><italic>j</italic></sub> may be less than 42 and varies across interneurons.</p></fn>
<fn id="fn0004"><p><sup>4</sup>The ground truth might be better approximated by consulting additional leading neuroscientists. Our group, nonetheless, includes many of the leading experts involved in the interneuron nomenclature effort (Ascoli et al., <xref ref-type="bibr" rid="B3">2007</xref>).</p></fn>
</fn-group>
</back>
</article>
