<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fgene.2021.631505</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Feature Selection Using Approximate Conditional Entropy Based on Fuzzy Information Granule for Gene Expression Data Classification</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Zhang</surname> <given-names>Hengyi</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1146046/overview"/>
</contrib>
</contrib-group>
<aff><institution>College of Animal Science and Technology, Northwest A&#x0026;F University</institution>, <addr-line>Yangling</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Wilson Wen Bin Goh, Nanyang Technological University, Singapore</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Guosheng Han, Xiangtan University, China; Yusen Zhang, Shandong University, China</p></fn>
<corresp id="c001">&#x002A;Correspondence: Hengyi Zhang, <email>zhanghengyi2000@163.com</email></corresp>
<fn fn-type="other" id="fn004"><p>This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>30</day>
<month>03</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>12</volume>
<elocation-id>631505</elocation-id>
<history>
<date date-type="received">
<day>20</day>
<month>11</month>
<year>2020</year>
</date>
<date date-type="accepted">
<day>12</day>
<month>03</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2021 Zhang.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Zhang</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Classification is widely used in gene expression data analysis. Feature selection is usually performed before classification because gene expression data contain a large number of genes but only a small number of samples. In this article, a novel feature selection algorithm using approximate conditional entropy based on fuzzy information granule is proposed, and the correctness of the method is proved via the monotonicity of entropy. First, the fuzzy relation matrix is established by the Laplacian kernel. Second, the approximately equal relation on fuzzy sets is defined. Then, the approximate conditional entropy based on fuzzy information granule and the importance of internal attributes are defined. Approximate conditional entropy can measure the uncertainty of knowledge from the two perspectives of information theory and algebra theory. Finally, a greedy algorithm based on the approximate conditional entropy is designed for feature selection. Experimental results on six large-scale gene datasets show that our algorithm not only greatly reduces the dimensionality of the gene datasets but also outperforms five state-of-the-art algorithms in classification accuracy.</p>
</abstract>
<kwd-group>
<kwd>feature selection</kwd>
<kwd>Laplacian kernel</kwd>
<kwd>fuzzy information granule</kwd>
<kwd>fuzzy relation matrix</kwd>
<kwd>approximate conditional entropy</kwd>
</kwd-group>
<counts>
<fig-count count="1"/>
<table-count count="9"/>
<equation-count count="9"/>
<ref-count count="30"/>
<page-count count="8"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1">
<title>Introduction</title>
<p>The development of DNA microarray technology has produced large volumes of gene expression data. Analyzing these data and mining the knowledge behind them is a hot topic in bioinformatics (<xref ref-type="bibr" rid="B21">Sun et al., 2019b</xref>). As the most basic data mining method, classification is widely used in the analysis of gene expression data. Because of the small sample size and high dimensionality of gene expression data, traditional classification methods are often ineffective when applied to such data directly (<xref ref-type="bibr" rid="B4">Fu and Wang, 2003</xref>; <xref ref-type="bibr" rid="B15">Mitra et al., 2011</xref>; <xref ref-type="bibr" rid="B17">Phan et al., 2012</xref>; <xref ref-type="bibr" rid="B13">Konstantina et al., 2015</xref>). It has become a consensus in the academic community that dimensionality should be reduced before classification. Feature selection is the most widely used dimensionality reduction method for gene expression data because it preserves the biological meaning of each feature. Feature selection not only reduces the time and space complexity of classification learning algorithms, avoids the curse of dimensionality, and improves prediction accuracy, but also helps to explain biological phenomena.</p>
<p>Feature selection methods are generally divided into three categories: filter, wrapper, and embedded methods (<xref ref-type="bibr" rid="B5">Hu et al., 2018</xref>). A filter method obtains an optimal feature subset by judging the similarity between the features and the objective function based on the statistical characteristics of the data. A wrapper method uses a specific model for multiple rounds of training; after each round, several features are removed according to the score of the objective function, and the next round of training is carried out on the reduced feature set. This recursion is repeated until the number of remaining features reaches the required number. An embedded method first uses a machine learning algorithm to obtain a weight coefficient for each feature and then selects features in descending order of weight. Wrapper and embedded methods impose a heavy computational burden and are not suitable for large-scale gene datasets. Our feature selection method is a filter method, in which a heuristic search algorithm finds an optimal feature subset using approximate conditional entropy based on fuzzy information granule for gene expression data classification.</p>
<p>Attribute reduction is a fundamental research topic and an important application of granular computing (<xref ref-type="bibr" rid="B3">Dong et al., 2018</xref>; <xref ref-type="bibr" rid="B25">Wang et al., 2019</xref>), and it can be used for feature selection. Granular computing is a new computing paradigm for information processing that is mainly used to handle fuzzy and uncertain information (<xref ref-type="bibr" rid="B18">Qian et al., 2011</xref>).</p>
<p><xref ref-type="bibr" rid="B16">Pawlak (1982)</xref> proposed rough set theory, a mathematical tool for dealing with fuzziness and uncertainty; granular computing is one of its important research topics. Because it is built on equivalence relations, classical rough set theory is suitable only for the discrete data widely found in real life. When attribute reduction is applied to continuous data in classical rough set theory, a discretization step is often used to convert the continuous data into discrete data, but discretization inevitably causes information loss (<xref ref-type="bibr" rid="B2">Dai and Xu, 2012</xref>). To overcome this drawback, Hu et al. proposed the neighborhood rough set model (<xref ref-type="bibr" rid="B6">Hu et al., 2008</xref>, <xref ref-type="bibr" rid="B8">2011</xref>). Using the neighborhood rough set model to select attributes of a decision table containing continuous data preserves classification ability well and requires no discretization. Existing neighborhood rough set attribute reduction methods take the perspective of either algebra or information theory. The algebraic definition of attribute significance describes only the influence of attributes on the definite classification subsets contained in the universe, whereas the information-theoretic definition describes only their influence on the uncertain classification subsets. A single perspective is therefore not comprehensive (<xref ref-type="bibr" rid="B11">Jiang et al., 2015</xref>).</p>
<p><xref ref-type="bibr" rid="B30">Zadeh (1979)</xref> proposed the concept of information granulation based on fuzzy set theory. Objects in the universe are granulated into a set of fuzzy information granules by a fuzzy-binary relation (<xref ref-type="bibr" rid="B24">Tsang et al., 2008</xref>; <xref ref-type="bibr" rid="B10">Jensen and Shen, 2009</xref>).</p>
<p>In this article, a heuristic feature selection algorithm based on fuzzy information granules and approximate conditional entropy is designed to improve the classification performance of gene expression data sets. The experimental results for several gene expression data sets show that the proposed algorithm can find optimal reduction sets with few genes and high classification accuracy.</p>
<p>The remainder of this article is organized as follows. Section &#x201C;Materials and Methods&#x201D; describes the gene expression datasets used in the experiments and our feature selection algorithm. Section &#x201C;Experimental Results and Analysis&#x201D; presents and analyzes the experimental results. Section &#x201C;Conclusion and Discussion&#x201D; summarizes this study and discusses future research directions.</p>
</sec>
<sec id="S2" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec id="S2.SS1">
<title>Gene Expression Data Sets</title>
<p>The following six gene expression datasets are used in this article.</p>
<list list-type="simple">
<list-item>
<label>(1)</label>
<p>Leukemia1 dataset consists of 7129 genes and 72 samples with two subtypes: patients and healthy people (<xref ref-type="bibr" rid="B20">Sun et al., 2019a</xref>).</p>
</list-item>
<list-item>
<label>(2)</label>
<p>Leukemia2 dataset consists of 5327 genes and 72 samples with three subtypes: ALL-T (acute lymphoblastic leukemia, T-cell), ALL-B (acute lymphoblastic leukemia, B-cell), and AML (acute myeloid leukemia) (<xref ref-type="bibr" rid="B3">Dong et al., 2018</xref>).</p>
</list-item>
<list-item>
<label>(3)</label>
<p>Brain Tumor dataset consists of 10,367 genes and 50 samples with four subtypes (<xref ref-type="bibr" rid="B9">Huang et al., 2017</xref>).</p>
</list-item>
<list-item>
<label>(4)</label>
<p>9_Tumors dataset consists of 5726 genes and 60 samples with nine subtypes: non-small cell lung cancer, colon cancer, breast cancer, ovarian cancer, leukemia, kidney cancer, melanoma, prostate cancer, and central nervous system cancer (<xref ref-type="bibr" rid="B28">Ye et al., 2019</xref>).</p>
</list-item>
<list-item>
<label>(5)</label>
<p>Robert dataset consists of 23,416 genes and 194 samples with two subtypes: Musculus CD8+T-cells and L1210 cells (<xref ref-type="bibr" rid="B12">Kimmerling et al., 2016</xref>).</p>
</list-item>
<list-item>
<label>(6)</label>
<p>Ting dataset consists of 21,583 genes and 187 samples with seven subtypes: GMP cells, MEF cells, MP cells, nb508 cells, TuGMP cells, TuMP cells, and WBC cells (<xref ref-type="bibr" rid="B23">Ting et al., 2014</xref>).</p>
</list-item>
</list>
<p>The six gene expression datasets are summarized in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Description of six experimental datasets.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">No.</td>
<td valign="top" align="left">Datasets</td>
<td valign="top" align="left">Genes</td>
<td valign="top" align="left">Samples</td>
<td valign="top" align="left">Classes</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Leukemia1</td>
<td valign="top" align="left">7129</td>
<td valign="top" align="left">72</td>
<td valign="top" align="left">2 (47/25)</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Leukemia2</td>
<td valign="top" align="left">5327</td>
<td valign="top" align="left">72</td>
<td valign="top" align="left">3 (9/38/25)</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="left">Brain_Tumor</td>
<td valign="top" align="left">10,367</td>
<td valign="top" align="left">50</td>
<td valign="top" align="left">4 (14/7/14/15)</td>
</tr>
<tr>
<td valign="top" align="left">4</td>
<td valign="top" align="left">9_Tumors</td>
<td valign="top" align="left">5726</td>
<td valign="top" align="left">60</td>
<td valign="top" align="left">9 (9/7/8/6/6/8/8/2/6)</td>
</tr>
<tr>
<td valign="top" align="left">5</td>
<td valign="top" align="left">Robert</td>
<td valign="top" align="left">23,416</td>
<td valign="top" align="left">194</td>
<td valign="top" align="left">2 (88/106)</td>
</tr>
<tr>
<td valign="top" align="left">6</td>
<td valign="top" align="left">Ting</td>
<td valign="top" align="left">21,583</td>
<td valign="top" align="left">187</td>
<td valign="top" align="left">7 (18/12/75/16/20/34/12)</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="S2.SS2">
<title>Fuzzy Sets and Fuzzy-Binary Relation</title>
<p>Let <italic>U</italic> = {<italic>x</italic><sub>1</sub>, <italic>x</italic><sub>2</sub>, &#x2026;, <italic>x</italic><sub><italic>n</italic></sub>} be a nonempty finite set, called the universe. Let <italic>I</italic> = [0, 1]; then <italic>I<sup>U</sup></italic> denotes the set of all fuzzy sets on <italic>U</italic>.</p>
<p>Fuzzy sets are regarded as extensions of classical sets (<xref ref-type="bibr" rid="B29">Zadeh, 1965</xref>).</p>
<p>If <italic>F</italic> is a fuzzy set on <italic>U</italic>, i.e., a mapping <italic>F</italic>: <italic>U</italic> &#x2192; <italic>I</italic>, then <italic>F</italic>(<italic>x</italic><sub><italic>i</italic></sub>) is the membership degree of <italic>x</italic><sub><italic>i</italic></sub> in <italic>F</italic>.</p>
<p>The cardinality of <italic>F</italic> &#x2208; <italic>I<sup>U</sup></italic> is <inline-formula><mml:math id="INEQ7"><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>F</mml:mi><mml:mo>|</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:mrow><mml:mi>F</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo rspace="5.8pt" stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
<p>A fuzzy-binary relation is a fuzzy set on the product of two universes. <italic>I</italic><sup><italic>U</italic>&#x00D7;<italic>U</italic></sup> denotes the set of all fuzzy-binary relations on <italic>U</italic> &#x00D7; <italic>U</italic>.</p>
<p>A fuzzy-binary relation <italic>R</italic> can be represented by the matrix</p>
<disp-formula id="S2.E1"><label>(1)</label><mml:math id="M1"><mml:mrow><mml:msub><mml:mi>M</mml:mi><mml:mi>R</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:msub><mml:mi>r</mml:mi><mml:mn>11</mml:mn></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>r</mml:mi><mml:mn>12</mml:mn></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:mi mathvariant="normal">&#x22EF;</mml:mi></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:msub><mml:mi>r</mml:mi><mml:mn>21</mml:mn></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>r</mml:mi><mml:mn>22</mml:mn></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:mi mathvariant="normal">&#x22EF;</mml:mi></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mn>2</mml:mn><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mi mathvariant="normal">&#x22EF;</mml:mi></mml:mtd><mml:mtd columnalign="center"><mml:mi mathvariant="normal">&#x22EF;</mml:mi></mml:mtd><mml:mtd columnalign="center"><mml:mi mathvariant="normal">&#x22EF;</mml:mi></mml:mtd><mml:mtd columnalign="center"><mml:mi mathvariant="normal">&#x22EF;</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:mi mathvariant="normal">&#x22EF;</mml:mi></mml:mtd><mml:mtd 
columnalign="center"><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>where <italic>r</italic><sub><italic>ij</italic></sub> = <italic>R</italic>(<italic>x</italic><sub><italic>i</italic></sub>, <italic>x</italic><sub><italic>j</italic></sub>) &#x2208; <italic>I</italic> is the similarity of <italic>x</italic><sub><italic>i</italic></sub> and <italic>x</italic><sub><italic>j</italic></sub>.</p>
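As a minimal illustrative sketch (not part of the original article), a fuzzy set on a finite universe can be stored as a membership vector, its cardinality as the sum of memberships, and a fuzzy-binary relation as an n-by-n similarity matrix; the values below are arbitrary toy numbers chosen for illustration:

```python
import numpy as np

# A fuzzy set F on a universe U = {x1, x2, x3}: the i-th entry is the
# membership degree F(x_i) in [0, 1].
F = np.array([0.9, 0.4, 0.7])

# Cardinality |F| = sum of the membership degrees (Section "Fuzzy Sets").
card_F = float(F.sum())  # 0.9 + 0.4 + 0.7 = 2.0

# A fuzzy-binary relation R on U x U: entry r_ij is the similarity of
# x_i and x_j, also in [0, 1]. Similarity matrices are typically symmetric
# with unit diagonal (each object is fully similar to itself).
R = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
```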
</sec>
<sec id="S2.SS3">
<title>Information Systems and Rough Sets</title>
<p><bold>Definition 2.1</bold> (<xref ref-type="bibr" rid="B14">Li et al., 2017</xref>). Let <italic>U</italic> be a set of objects and <italic>A</italic> a set of attributes. Suppose that <italic>U</italic> and <italic>A</italic> are finite sets. If each attribute <italic>a</italic> &#x2208; <italic>A</italic> determines an information function <italic>a</italic>:<italic>U</italic>&#x2192;<italic>V</italic><sub><italic>a</italic></sub>, where <italic>V</italic><sub><italic>a</italic></sub> is the set of function values of attribute <italic>a</italic>, then the pair (<italic>U</italic>, <italic>A</italic>) is called an information system.</p>
<p>Moreover, if <italic>A</italic> = <italic>C</italic>&#x22C3;<italic>D</italic>, <italic>C</italic> is a condition attribute set and <italic>D</italic> is a decision attribute set, then the pair (<italic>U</italic>, <italic>A</italic>) is called a decision information system.</p>
<p>If (<italic>U</italic>, <italic>A</italic>) is an information system and <italic>P</italic> &#x2286; <italic>A</italic>, then an equivalence relation (or indiscernibility relation) <italic>ind</italic>(<italic>P</italic>) can be defined by (<italic>x</italic>, <italic>y</italic>) &#x2208; <italic>ind</italic>(<italic>P</italic>)&#x21D4;&#x2200;<italic>a</italic> &#x2208; <italic>P</italic>, <italic>a</italic>(<italic>x</italic>) = <italic>a</italic>(<italic>y</italic>).</p>
<p>Obviously, <inline-formula><mml:math id="INEQ22"><mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>P</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:munder><mml:mo mathsize="160%" movablelimits="false" stretchy="false">&#x22C2;</mml:mo><mml:mrow><mml:mi>a</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>P</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mi>a</mml:mi><mml:mo stretchy="false">}</mml:mo></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
<p>For <italic>P</italic> &#x2286; <italic>A</italic> and <italic>x</italic> &#x2208; <italic>U</italic>, denote [<italic>x</italic>]<sub><italic>ind</italic>(<italic>P</italic>)</sub> = {<italic>y</italic>|(<italic>x</italic>, <italic>y</italic>) &#x2208; <italic>ind</italic>(<italic>P</italic>)} and <italic>U</italic>/<italic>ind</italic>(<italic>P</italic>) = {[<italic>x</italic>]<sub><italic>ind</italic>(<italic>P</italic>)</sub>|<italic>x</italic>&#x2208;<italic>U</italic>}.</p>
<p>Usually, <italic>[x]</italic><sub><italic>ind(P)</italic></sub> and <italic>U</italic>/<italic>ind</italic>(<italic>P</italic>) are briefly denoted by <italic>[x]</italic><sub><italic>P</italic></sub> and <italic>U</italic>/<italic>P</italic>, respectively.</p>
<p>According to the rough set theory, for <italic>P</italic> &#x2286; <italic>A</italic>, <italic>X</italic> &#x2286; <italic>U</italic> is characterized by <inline-formula><mml:math id="INEQ31"><mml:mrow><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo stretchy="false">&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="INEQ32"><mml:mrow><mml:munder accentunder="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="INEQ33"><mml:mrow><mml:mrow><mml:munder accentunder="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C3;</mml:mo><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:mrow><mml:mrow><mml:mi>Y</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mi>U</mml:mi><mml:mo>/</mml:mo><mml:mi>P</mml:mi></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mi>Y</mml:mi><mml:mo> &#x2286; </mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:mrow><mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="INEQ34"><mml:mrow><mml:mrow><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C3;</mml:mo><mml:mrow><mml:mo 
stretchy="false">{</mml:mo><mml:mi>Y</mml:mi><mml:mo>|</mml:mo><mml:mrow><mml:mrow><mml:mi>Y</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mi>U</mml:mi><mml:mo>/</mml:mo><mml:mi>P</mml:mi></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mrow><mml:mi>Y</mml:mi><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">&#x2260;</mml:mo><mml:mpadded width="+3.3pt"><mml:mi mathvariant="normal">&#x03D5;</mml:mi></mml:mpadded></mml:mrow></mml:mrow><mml:mo rspace="5.8pt" stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
<p><inline-formula><mml:math id="INEQ35"><mml:mrow><mml:munder accentunder="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="INEQ36"><mml:mrow><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo stretchy="false">&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> are referred to as the lower and upper approximations of <italic>X</italic>, respectively.</p>
<p><italic>X</italic> is crisp if <inline-formula><mml:math id="INEQ37"><mml:mrow><mml:mrow><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo stretchy="false">&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:munder accentunder="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> and <italic>X</italic> is rough if <inline-formula><mml:math id="INEQ38"><mml:mrow><mml:mrow><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo stretchy="false">&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x2260;</mml:mo><mml:mrow><mml:munder accentunder="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
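As a brief sketch under toy assumptions (the universe and partition below are illustrative, not from the article), the lower approximation collects the blocks of U/P contained in X, and the upper approximation collects the blocks that intersect X:

```python
# Lower and upper approximations of a subset X under a partition U/P.
U = {1, 2, 3, 4, 5, 6}
partition = [{1, 2}, {3, 4}, {5, 6}]   # U / P: equivalence classes of ind(P)
X = {1, 2, 3}

# Lower approximation: union of blocks Y with Y a subset of X.
lower = set().union(*(Y for Y in partition if Y <= X))

# Upper approximation: union of blocks Y with Y intersecting X.
upper = set().union(*(Y for Y in partition if Y & X))

# Here lower = {1, 2} and upper = {1, 2, 3, 4}; since they differ,
# X is rough with respect to P.
```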
</sec>
<sec id="S2.SS4">
<title>The Approximately Equal Relation on Fuzzy Sets</title>
<p>Given <italic>F</italic>,<italic>G</italic> &#x2208; <italic>I<sup>U</sup></italic>. For <italic>x</italic> &#x2208; <italic>U</italic>, <italic>F</italic>(<italic>x</italic>), <italic>G</italic>(<italic>x</italic>) &#x2208; [0,1] are the membership degrees of <italic>x</italic> in the fuzzy sets <italic>F</italic> and <italic>G</italic>, respectively. In practice, it is very difficult to guarantee that the exact equality <italic>F</italic>(<italic>x</italic>) = <italic>G</italic>(<italic>x</italic>) holds. For this reason, we propose the following approximately equal relation on fuzzy sets.</p>
<p><bold>Definition 2.2</bold> Given <italic>A</italic>,<italic>B</italic> &#x2208; <italic>I<sup>U</sup></italic>. If there exists <italic>k</italic> &#x2208; <italic>N</italic>(<italic>k</italic>&#x2265;2) such that for any <italic>x</italic> &#x2208; <italic>U</italic>, <italic>A</italic>(<italic>x</italic>),<italic>B</italic>(<italic>x</italic>) &#x2208; [0,1/<italic>k</italic>) or <italic>A</italic>(<italic>x</italic>),<italic>B</italic>(<italic>x</italic>) &#x2208; [1/<italic>k</italic>,2/<italic>k</italic>)&#x2026;or <italic>A</italic>(<italic>x</italic>),<italic>B</italic>(<italic>x</italic>) &#x2208; [(<italic>k</italic>&#x2212;1)/<italic>k</italic>,1], then we say that <italic>A</italic> is approximately equal to <italic>B</italic>, and denote it by <inline-formula><mml:math id="INEQ52"><mml:mrow><mml:mi>A</mml:mi><mml:mrow><mml:mover><mml:mo movablelimits="false">&#x2248;</mml:mo><mml:mi>k</mml:mi></mml:mover><mml:mi>B</mml:mi></mml:mrow></mml:mrow></mml:math></inline-formula>, where <italic>k</italic> is regarded as a threshold value.</p>
<p><bold>Definition 2.3</bold> For each <italic>a</italic> &#x2208; <italic>U</italic>, define <italic>x<sup>R</sup></italic>:<italic>U</italic>&#x2192;[0,1] by <italic>x<sup>R</sup></italic>(<italic>a</italic>) = <italic>R</italic>(<italic>x</italic>,<italic>a</italic>) (<italic>x</italic> &#x2208; <italic>U</italic>); the fuzzy set <italic>x<sup>R</sup></italic> gives, for each <italic>a</italic>, the membership degree of <italic>a</italic> to <italic>x</italic>.</p>
<p><bold>Definition 2.4</bold> <inline-formula><mml:math id="INEQ56"><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mi>R</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mi>y</mml:mi><mml:mo>|</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:msup><mml:mi>x</mml:mi><mml:mi>R</mml:mi></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>a</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mover><mml:mo movablelimits="false">&#x2248;</mml:mo><mml:mi>k</mml:mi></mml:mover><mml:mrow><mml:msup><mml:mi>y</mml:mi><mml:mi>R</mml:mi></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>a</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mi>U</mml:mi></mml:mrow><mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, <italic>[x]</italic><sub><italic>R</italic></sub> is referred to as the fuzzy equal class of <italic>x</italic> induced by the fuzzy relation <italic>R</italic> on <italic>U</italic>.</p>
<p><bold>Definition 2.5</bold> [<italic>x</italic><sub><italic>i</italic></sub>]<sub><italic>R</italic></sub> (<italic>i</italic> = 1,2,&#x2026;,|<italic>U</italic>|) is called the fuzzy information granule induced by the fuzzy relation <italic>R</italic> on <italic>U</italic>.</p>
<p><bold>Definition 2.6</bold><italic>G</italic>(<italic>R</italic>) = {[<italic>x</italic><sub>1</sub>]<sub><italic>R</italic></sub>,[<italic>x</italic><sub>2</sub>]<sub><italic>R</italic></sub>,&#x2026;,[<italic>x</italic><sub><italic>n</italic></sub>]<sub><italic>R</italic></sub>} is referred to as the fuzzy-binary granular structure of the universe <italic>U</italic> induced by <italic>R</italic>.</p>
<p>It is easy to prove: <inline-formula><mml:math id="INEQ59"><mml:mrow><mml:mrow><mml:munder accentunder="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mi>x</mml:mi><mml:mo>|</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mi>R</mml:mi></mml:msub><mml:mo> &#x2286; </mml:mo><mml:mi>X</mml:mi></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mi>R</mml:mi></mml:msub><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mi>G</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>R</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="INEQ60"><mml:mrow><mml:mrow><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mi>x</mml:mi><mml:mo>|</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mi>R</mml:mi></mml:msub><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:mrow><mml:mo>&#x2260;</mml:mo><mml:mi mathvariant="normal">&#x03D5;</mml:mi></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mi>x</mml:mi><mml:mo 
stretchy="false">]</mml:mo></mml:mrow><mml:mi>R</mml:mi></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">&#x2208;</mml:mo><mml:mrow><mml:mi>G</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>R</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo rspace="5.8pt" stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
</sec>
<sec id="S2.SS5">
<title>Fuzzy-Binary Relation Based on Laplacian Kernel</title>
<p><xref ref-type="bibr" rid="B7">Hu et al. (2010)</xref> found relationships between rough sets and the Gaussian kernel method, so the Gaussian kernel has been used to obtain fuzzy relations. Compared with the Gaussian kernel, the Laplacian kernel has a higher peak, faster decay, and a smoother tail; it therefore describes the similarity between objects better than the Gaussian kernel. In this article, we use the Laplacian kernel <inline-formula><mml:math id="INEQ61"><mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mi>exp</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mo fence="true" maxsize="142%" minsize="142%">||</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow><mml:mo fence="true" maxsize="142%" minsize="142%">||</mml:mo></mml:mrow><mml:mi mathvariant="normal">&#x03C3;</mml:mi></mml:mfrac></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> to extract the similarity between two objects from a decision information system, where ||<italic>x</italic><sub><italic>i</italic></sub>&#x2212;<italic>x</italic><sub><italic>j</italic></sub>|| is the Euclidean distance between objects <italic>x</italic><sub><italic>i</italic></sub> and <italic>x</italic><sub><italic>j</italic></sub>. In general, &#x03C3; is a given positive value.</p>
<p>Obviously, <italic>k</italic>(<italic>x</italic><sub><italic>i</italic></sub>, <italic>x</italic><sub><italic>j</italic></sub>) satisfies:</p>
<list list-type="simple">
<list-item>
<label>(1)</label>
<p><italic>k</italic>(<italic>x</italic><sub><italic>i</italic></sub>, <italic>x</italic><sub><italic>j</italic></sub>)&#x2208;(0,1].</p>
</list-item>
<list-item>
<label>(2)</label>
<p><italic>k</italic>(<italic>x</italic><sub><italic>i</italic></sub>, <italic>x</italic><sub><italic>j</italic></sub>) = <italic>k</italic>(<italic>x</italic><sub><italic>j</italic></sub>,<italic>x</italic><sub><italic>i</italic></sub>).</p>
</list-item>
<list-item>
<label>(3)</label>
<p><italic>k</italic>(<italic>x</italic><sub><italic>i</italic></sub>, <italic>x</italic><sub><italic>i</italic></sub>) = 1.</p>
</list-item>
</list>
<p>Let <italic>R</italic> = (<italic>k</italic>(<italic>x</italic><sub><italic>i</italic></sub>, <italic>x</italic><sub><italic>j</italic></sub>))<sub><italic>n</italic>&#x00D7;<italic>n</italic></sub>; then <italic>R</italic> is called the fuzzy relation matrix induced by the Laplacian kernel.</p>
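<p>The kernel formula and the matrix <italic>R</italic> above can be sketched as follows (a minimal Python/NumPy illustration, not the paper's MATLAB implementation; the toy array <monospace>X</monospace> is hypothetical data):</p>

```python
import numpy as np

def laplacian_relation(X, sigma=0.5):
    """Fuzzy relation matrix R = (k(x_i, x_j))_{n x n} with the
    Laplacian kernel k(x_i, x_j) = exp(-||x_i - x_j|| / sigma)."""
    # pairwise Euclidean distances between rows of X via broadcasting
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return np.exp(-dist / sigma)

# toy data: three objects described by two attributes (made-up values)
X = np.array([[0.1, 0.2],
              [0.1, 0.2],
              [0.9, 0.8]])
R = laplacian_relation(X, sigma=0.5)
```

<p>The resulting matrix satisfies properties (1)&#x2013;(3) above: all entries lie in (0,1], <italic>R</italic> is symmetric, and the diagonal is 1 (each object is maximally similar to itself).</p>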
</sec>
<sec id="S2.SS6">
<title>Feature Selection Using Approximate Conditional Entropy Based on Fuzzy Information Granule</title>
<sec id="S2.SS6.SSS1">
<title>Approximate Accuracy and Approximate Conditional Entropy</title>
<p><bold>Definition 2.7</bold> Given a decision information system (<italic>U</italic>, <italic>C</italic>&#x22C3;<italic>D</italic>), &#x2200;<italic>X</italic> &#x2286; <italic>U</italic>, <italic>X</italic> &#x2260; &#x03D5; (&#x03D5; denotes the empty set), the approximate accuracy of <italic>X</italic> is defined as</p>
<disp-formula id="S2.E2"><label>(2)</label><mml:math id="M2"><mml:mrow><mml:mrow><mml:mi>a</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:munder accentunder="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo stretchy="false">&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
<p>where |&#x00B7;| denotes the cardinality of a set. Obviously, 0&#x2264;<italic>a</italic>(<italic>X</italic>)&#x2264;1.</p>
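<p>Equation (2) can be transcribed directly, assuming the fuzzy-binary granules [<italic>x</italic>]<sub><italic>R</italic></sub> have already been extracted as crisp sets of objects (the granule dictionary below is a hypothetical example, not data from the paper):</p>

```python
def approx_accuracy(granules, X):
    """a(X) = |lower(X)| / |upper(X)| (Eq. 2), where
    lower(X) = {x : [x]_R subset of X} and
    upper(X) = {x : [x]_R intersects X}."""
    lower = [x for x, g in granules.items() if g <= X]
    upper = [x for x, g in granules.items() if g & X]
    return len(lower) / len(upper)

# hypothetical granular structure over U = {0, 1, 2, 3}
granules = {0: {0, 1}, 1: {0, 1}, 2: {2}, 3: {2, 3}}
# lower({0,1,2}) = {0,1,2}; upper({0,1,2}) = {0,1,2,3}
a = approx_accuracy(granules, {0, 1, 2})
```

<p>Because each object belongs to its own granule, the upper approximation of a non-empty <italic>X</italic> is non-empty, so the ratio is always defined.</p>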
<p><bold>Definition 2.8</bold> Given a decision information system (<italic>U</italic>,<italic>C</italic>&#x22C3;<italic>D</italic>), &#x2200;<italic>B</italic> &#x2286; <italic>C</italic>, let the fuzzy information granule of object <italic>x</italic> under <italic>B</italic> be [<italic>x</italic>]<sub><italic>R</italic><sub><italic>B</italic></sub></sub> and the partition of <italic>U</italic> derived from <italic>D</italic> be {<italic>X</italic><sub>1</sub>,<italic>X</italic><sub>2</sub>,&#x2026;,<italic>X</italic><sub><italic>k</italic></sub>}; then the conditional entropy of <italic>D</italic> relative to <italic>B</italic> is defined as</p>
<disp-formula id="S2.E3"><label>(3)</label><mml:math id="M3"><mml:mrow><mml:mrow><mml:mi>H</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>/</mml:mo><mml:mi>B</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:mrow><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:munderover><mml:mrow><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:msub><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:msub><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mo 
stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>where <italic>R</italic><sub><italic>B</italic></sub> denotes the fuzzy relation based on attribute set <italic>B</italic> and <italic>log</italic> denotes the base-2 logarithm.</p>
<p>The approximate accuracy can effectively measure the imprecision of the set caused by the boundary region, while the conditional entropy can effectively measure the knowledge uncertainty caused by the information granularity. We combine the two to propose approximate conditional entropy.</p>
<p><bold>Definition 2.9</bold> Let (<italic>U</italic>,<italic>C</italic>&#x22C3;<italic>D</italic>) be a decision information system, &#x2200;<italic>B</italic> &#x2286; <italic>C</italic>; let the fuzzy information granule of object <italic>x</italic> under <italic>B</italic> be [<italic>x</italic>]<sub><italic>R</italic><sub><italic>B</italic></sub></sub>, the partition of <italic>U</italic> derived from <italic>D</italic> be {<italic>X</italic><sub>1</sub>,<italic>X</italic><sub>2</sub>,&#x2026;,<italic>X</italic><sub><italic>k</italic></sub>}, and <italic>a</italic><sub><italic>B</italic></sub>(<italic>X</italic><sub><italic>i</italic></sub>) be the approximate accuracy of <italic>X</italic><sub><italic>i</italic></sub> under <italic>R</italic><sub><italic>B</italic></sub>; then the approximate conditional entropy of <italic>D</italic> relative to <italic>B</italic> is defined as</p>
<disp-formula id="S2.Ex1"><mml:math id="M4"><mml:mrow><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>/</mml:mo><mml:mi>B</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>k</mml:mi></mml:munderover><mml:mrow><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:munderover><mml:mrow><mml:mrow><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>a</mml:mi><mml:mi>B</mml:mi></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:msub><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="S2.E4"><label>(4)</label><mml:math id="M5"><mml:mrow><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:msub><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
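<p>Combining Equations (2) and (4), the approximate conditional entropy admits a direct transcription (a minimal Python sketch, again assuming crisp granules and base-2 logarithms; all sets below are toy examples):</p>

```python
import math

def approx_conditional_entropy(granules, partition):
    """H_ace(D/B) = -sum_j sum_i log(2 - a_B(X_j))
                     * (|[x_i] ∩ X_j| / |U|) * log(|[x_i] ∩ X_j| / |[x_i]|)."""
    n = len(granules)
    h = 0.0
    for Xj in partition:
        # a_B(X_j) = |lower(X_j)| / |upper(X_j)|  (Eq. 2)
        lower = sum(1 for g in granules.values() if g <= Xj)
        upper = sum(1 for g in granules.values() if g & Xj)
        a = lower / upper if upper else 1.0
        w = math.log2(2 - a)
        for g in granules.values():
            m = len(g & Xj)
            if m:  # empty intersections contribute nothing
                h -= w * (m / n) * math.log2(m / len(g))
    return h

# Theorem 2.1(1): coarsest granules ([x] = U) and singleton classes
coarse = {0: {0, 1}, 1: {0, 1}}
h_max = approx_conditional_entropy(coarse, [{0}, {1}])   # |U| log |U|
# Theorem 2.1(2): granules finer than the decision classes
fine = {0: {0}, 1: {1}}
h_min = approx_conditional_entropy(fine, [{0}, {1}])     # 0
```

<p>The two toy cases reproduce the extremes of Theorem 2.1: with |<italic>U</italic>| = 2, the coarsest case yields 2&#x2009;log&#x2009;2 = 2 and the consistent case yields 0.</p>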
<p><bold>Theorem 2.1</bold> Let (<italic>U</italic>,<italic>C</italic>&#x22C3;<italic>D</italic>) be a decision information system, &#x2200;<italic>B</italic> &#x2286; <italic>C</italic>; let the fuzzy information granule of object <italic>x</italic> under <italic>B</italic> be [<italic>x</italic>]<sub><italic>R</italic><sub><italic>B</italic></sub></sub> and the partition of <italic>U</italic> derived from <italic>D</italic> be {<italic>X</italic><sub>1</sub>,<italic>X</italic><sub>2</sub>,&#x2026;,<italic>X</italic><sub><italic>k</italic></sub>}.</p>
<list list-type="simple">
<list-item>
<label>(1)</label>
<p><italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>) attains the maximum value |<italic>U</italic>|<italic>log</italic>&#x2061;|<italic>U</italic>| if and only if [<italic>x</italic><sub><italic>i</italic></sub>]<sub><italic>R</italic><sub><italic>B</italic></sub></sub> = <italic>U</italic> (<italic>i</italic> = 1,2,&#x2026;,<italic>n</italic>) and |<italic>X</italic><sub><italic>j</italic></sub>| = 1 (<italic>j</italic> = 1,2,&#x2026;,<italic>k</italic> = <italic>n</italic>).</p>
</list-item>
<list-item>
<label>(2)</label>
<p><italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>) attains the minimum value 0 if and only if [<italic>x</italic><sub><italic>i</italic></sub>]<sub><italic>R</italic><sub><italic>B</italic></sub></sub> &#x2286; [<italic>x</italic><sub><italic>i</italic></sub>]<sub><italic>R</italic><sub><italic>D</italic></sub></sub> (<italic>i</italic> = 1,2,&#x2026;,<italic>n</italic>).</p>
</list-item>
</list>
<p><bold>Proof.</bold> (1) Since [<italic>x</italic><sub><italic>i</italic></sub>]<sub><italic>R</italic><sub><italic>B</italic></sub></sub> = <italic>U</italic> (<italic>i</italic> = 1,2,&#x2026;,<italic>n</italic>) and |<italic>X</italic><sub><italic>j</italic></sub>| = 1 (<italic>j</italic> = 1,2,&#x2026;,<italic>k</italic>), the lower approximation of each <italic>X</italic><sub><italic>j</italic></sub> is empty, so <italic>a</italic><sub><italic>B</italic></sub>(<italic>X</italic><sub><italic>j</italic></sub>) = 0 (<italic>j</italic> = 1,2,&#x2026;,<italic>k</italic>) according to Definition 2.7.</p>
<p>Thus, <italic>log</italic>(2&#x2212;<italic>a</italic><sub><italic>B</italic></sub>(<italic>X</italic><sub><italic>j</italic></sub>)) = 1 (<italic>j</italic> = 1,2,&#x2026;,<italic>k</italic>).</p>
<p>Clearly, <inline-formula><mml:math id="INEQ94"><mml:mrow><mml:mrow><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:msub><mml:mstyle scriptlevel="-1"><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi mathsize="71%">j</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:msub><mml:mstyle scriptlevel="-1"><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi mathsize="71%">j</mml:mi></mml:msub></mml:mrow></mml:mstyle></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>B</mml:mi></mml:msub></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mpadded 
width="+3.3pt"><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mpadded></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
<p>By Definition 2.9, we have <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>) = |<italic>U</italic>|<italic>log</italic>&#x2061;|<italic>U</italic>|.</p>
<p>The converse is also true.</p>
<p>(2) Since [<italic>x</italic><sub><italic>i</italic></sub>]<sub><italic>R</italic><sub><italic>B</italic></sub></sub> &#x2286; [<italic>x</italic><sub><italic>i</italic></sub>]<sub><italic>R</italic><sub><italic>D</italic></sub></sub> (<italic>i</italic> = 1,2,&#x2026;,<italic>n</italic>), we have <italic>a</italic><sub><italic>B</italic></sub>(<italic>X</italic><sub><italic>j</italic></sub>) = 1 (<italic>j</italic> = 1,2,&#x2026;,<italic>k</italic>) according to Definition 2.7. Thus <italic>log</italic>(2&#x2212;<italic>a</italic><sub><italic>B</italic></sub>(<italic>X</italic><sub><italic>j</italic></sub>)) = 0 (<italic>j</italic> = 1,2,&#x2026;,<italic>k</italic>). Obviously, <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>) = 0 according to Definition 2.9.</p>
<p>The converse is also true.</p>
<p><bold>Theorem 2.2</bold> Let (<italic>U</italic>,<italic>C</italic>&#x22C3;<italic>D</italic>) be a decision information system, &#x2200;<italic>L</italic>,<italic>M</italic> &#x2286; <italic>C</italic>, if <italic>M</italic> &#x2286; <italic>L</italic>, then <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>M</italic>)&#x2265;<italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>L</italic>).</p>
<p><bold>Proof.</bold> Since <italic>M</italic> &#x2286; <italic>L</italic> &#x2286; <italic>C</italic>, we have <inline-formula><mml:math id="INEQ105"><mml:mrow><mml:mrow><mml:munder accentunder="true"><mml:msub><mml:mi>P</mml:mi><mml:mi>M</mml:mi></mml:msub><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo> &#x2286; </mml:mo><mml:mrow><mml:munder accentunder="true"><mml:msub><mml:mi>P</mml:mi><mml:mi>L</mml:mi></mml:msub><mml:mo>&#x00AF;</mml:mo></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="INEQ106"><mml:mrow><mml:mrow><mml:mover accent="true"><mml:msub><mml:mi>P</mml:mi><mml:mi>M</mml:mi></mml:msub><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x2287;</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:msub><mml:mi>P</mml:mi><mml:mi>L</mml:mi></mml:msub><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:mpadded><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>X</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
<p>Then <italic>a</italic><sub><italic>M</italic></sub>(<italic>X</italic>)&#x2264;<italic>a</italic><sub><italic>L</italic></sub>(<italic>X</italic>) according to Definition 2.7.</p>
<p>By <italic>M</italic> &#x2286; <italic>L</italic> and <italic>U</italic>/<italic>D</italic> = {<italic>X</italic><sub>1</sub>,<italic>X</italic><sub>2</sub>,&#x2026;,<italic>X</italic><sub><italic>k</italic></sub>}, we have</p>
<disp-formula id="S2.Ex2"><mml:math id="M6"><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>M</mml:mi></mml:msub></mml:msub><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>M</mml:mi></mml:msub></mml:msub><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>M</mml:mi></mml:msub></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="S2.E5"><label>(5)</label><mml:math id="M7"><mml:mrow><mml:mo>&#x2265;</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>L</mml:mi></mml:msub></mml:msub><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>U</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:mi>log</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>L</mml:mi></mml:msub></mml:msub><mml:mrow><mml:mo largeop="true" mathsize="160%" stretchy="false" symmetric="true">&#x22C2;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mi>L</mml:mi></mml:msub></mml:msub><mml:mo>|</mml:mo></mml:mrow></mml:mfrac></mml:mrow></mml:mrow></mml:mrow><mml:mo>&#x2265;</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></disp-formula>
<p>Consequently, <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>M</italic>)&#x2265;<italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>L</italic>) according to Definition 2.9.</p>
<p>Theorem 2.2 shows that <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>) decreases monotonically as the number of attributes in <italic>B</italic> increases, which is essential for constructing a forward greedy attribute reduction algorithm.</p>
<p><bold>Definition 2.10</bold> Let (<italic>U</italic>,<italic>C</italic>&#x22C3;<italic>D</italic>) be a decision information system and <italic>B</italic> &#x2286; <italic>C</italic>. If <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>) = <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>C</italic>) and <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/(<italic>B</italic>&#x2212;{<italic>b</italic>})) &#x003E; <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>C</italic>) for all <italic>b</italic> &#x2208; <italic>B</italic>, then <italic>B</italic> is called a reduction of <italic>C</italic> relative to <italic>D</italic>.</p>
<p>The first condition guarantees that the selected attribute subset has the same amount of information as the whole attribute set. The second condition guarantees that there is no redundancy in the attribute reduction set.</p>
<p><bold>Definition 2.11</bold> Let (<italic>U</italic>,<italic>C</italic>&#x22C3;<italic>D</italic>) be a decision information system. &#x2200;<italic>c</italic> &#x2208; <italic>C</italic>, define the following indicator:</p>
<disp-formula id="S2.E6"><label>(6)</label><mml:math id="M8"><mml:mrow><mml:mrow><mml:mi>I</mml:mi><mml:mi>I</mml:mi><mml:mi>A</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>c</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>/</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>C</mml:mi><mml:mo>-</mml:mo><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mi>c</mml:mi><mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>/</mml:mo><mml:mi>C</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>then <italic>IIA</italic>(<italic>c</italic>,<italic>C</italic>,<italic>D</italic>) is called the internal attribute importance of <italic>c</italic> in <italic>C</italic> relative to <italic>D</italic>.</p>
<p><bold>Definition 2.12</bold> Let (<italic>U</italic>,<italic>C</italic>&#x22C3;<italic>D</italic>) be a decision information system. &#x2200;<italic>c</italic> &#x2208; <italic>C</italic>, if <italic>IIA</italic>(<italic>c</italic>,<italic>C</italic>,<italic>D</italic>) &#x003E; 0, then attribute <italic>c</italic> is called a core attribute of <italic>C</italic> relative to <italic>D</italic>.</p>
<p><bold>Definition 2.13</bold> Let (<italic>U</italic>,<italic>C</italic>&#x22C3;<italic>D</italic>) be a decision information system, <italic>B</italic> &#x2286; <italic>C</italic>. &#x2200;<italic>d</italic> &#x2208; <italic>C</italic>&#x2212;<italic>B</italic>, define the following indicator:</p>
<disp-formula id="S2.E7"><label>(7)</label><mml:math id="M9"><mml:mrow><mml:mrow><mml:mi>I</mml:mi><mml:mi>E</mml:mi><mml:mi>A</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>d</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>/</mml:mo><mml:mi>B</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>D</mml:mi><mml:mo>/</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>B</mml:mi><mml:mrow><mml:mo largeop="true" mathsize="160%" movablelimits="false" stretchy="false" symmetric="true">&#x22C3;</mml:mo><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mi>d</mml:mi><mml:mo stretchy="false">}</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>then <italic>IEA</italic>(<italic>d</italic>,<italic>B</italic>,<italic>C</italic>,<italic>D</italic>) is called the external attribute importance of <italic>d</italic> to <italic>B</italic> relative to <italic>D</italic>.</p>
<p><italic>IEA</italic>(<italic>d</italic>,<italic>B</italic>,<italic>C</italic>,<italic>D</italic>) measures the change in approximate conditional entropy after attribute <italic>d</italic> is added to <italic>B</italic>. The larger <italic>IEA</italic>(<italic>d</italic>,<italic>B</italic>,<italic>C</italic>,<italic>D</italic>) is, the more important <italic>d</italic> is to <italic>B</italic> relative to <italic>D</italic>.</p>
</sec>
<sec id="S2.SS6.SSS2">
<title>Feature Selection Algorithm Using Approximate Conditional Entropy</title>
<p>In this article, a novel feature selection algorithm using approximate conditional entropy (FSACE) is proposed and described as follows.</p>
<table-wrap>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<tbody>
<tr>
<td valign="top" align="left">Input: A decision information system (<italic>U</italic>,<italic>C</italic>&#x22C3;<italic>D</italic>) and &#x03C3;.</td>
</tr>
<tr>
<td valign="top" align="left">Output: A selected gene subset <italic>B</italic>.</td>
</tr>
<tr>
<td valign="top" align="left">Step 1. Initialize <italic>B</italic> = &#x03D5;.</td>
</tr>
<tr>
<td valign="top" align="left">Step 2. Compute <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>C</italic>).</td>
</tr>
<tr>
<td valign="top" align="left">Step 3. &#x2200;<italic>c</italic> &#x2208; <italic>C</italic>, compute <italic>IIA</italic>(<italic>c</italic>,<italic>C</italic>,<italic>D</italic>); if <italic>IIA</italic>(<italic>c</italic>,<italic>C</italic>,<italic>D</italic>) &#x003E; 0, then <italic>B</italic> = <italic>B</italic>&#x22C3;{<italic>c</italic>}.</td>
</tr>
<tr>
<td valign="top" align="left">Step 4. If <italic>B</italic> = &#x03D5;, then go to Step 5. If <italic>B</italic> &#x2260; &#x03D5;, compute <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>). If <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>) = <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>C</italic>), then go to Step 6; otherwise, go to Step 5.</td>
</tr>
<tr>
<td valign="top" align="left">Step 5. Let <italic>M</italic> = <italic>C</italic>&#x2212;<italic>B</italic>, and select an attribute <italic>m</italic> &#x2208; <italic>M</italic> satisfying <inline-formula><mml:math id="INEQ141"><mml:mrow><mml:mrow><mml:mi>I</mml:mi><mml:mi>E</mml:mi><mml:mi>A</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>m</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:munder><mml:mo movablelimits="false">max</mml:mo><mml:mrow><mml:mi>x</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mi>I</mml:mi><mml:mi>E</mml:mi><mml:mi>A</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>B</mml:mi><mml:mo>,</mml:mo><mml:mi>C</mml:mi><mml:mo>,</mml:mo><mml:mi>D</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>. Let <italic>B</italic> = <italic>B</italic>&#x22C3;{<italic>m</italic>} and compute <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>). If <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>) = <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>C</italic>), then go to Step 6; otherwise, repeat Step 5.</td>
</tr>
<tr>
<td valign="top" align="left">Step 6. The feature selection subset <italic>B</italic> is obtained, andthe algorithm ends.</td>
</tr>
</tbody>
</table>
</table-wrap>
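<p>The selection loop in Steps 3&#x2013;6 can be sketched as a greedy forward search. The sketch below assumes placeholder callables h_ace, iia, and iea standing in for <italic>H</italic><sub><italic>ace</italic></sub>(<italic>D</italic>/<italic>B</italic>), <italic>IIA</italic>, and <italic>IEA</italic>; it illustrates only the control flow of the algorithm, not the paper's fuzzy-granule definitions of these measures.</p>

```python
def fsace(C, D, h_ace, iia, iea, tol=1e-12):
    """Greedy forward selection of a feature subset B (Steps 3-6)."""
    # Step 3: start from the attributes whose inner significance is positive.
    B = {c for c in C if iia(c, C, D) > 0}
    target = h_ace(D, C)  # conditional entropy of the full attribute set
    # Steps 4-5: while B does not yet preserve H_ace(D/C), add the
    # attribute of C - B with the largest outer significance IEA.
    while not B or abs(h_ace(D, B) - target) > tol:
        M = set(C) - B
        if not M:  # nothing left to add
            break
        B.add(max(M, key=lambda x: iea(x, B, C, D)))
    return B  # Step 6: B preserves the conditional entropy of C
```

<p>Any entropy measure that is monotone in <italic>B</italic>, such as the approximate conditional entropy of the paper, can be plugged in for h_ace.</p>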
</sec>
</sec>
</sec>
<sec id="S3">
<title>Experimental Results and Analysis</title>
<p>All experiments are performed on a personal computer running Windows 10 with an Intel(R) Core(TM) i7-4790 CPU at 3.60 GHz and 8 GB of memory, using MATLAB R2019a. Three classifiers (KNN, CART, and SVM) are used to verify classification accuracy, with <italic>k</italic> = 3 in KNN and a Gaussian kernel function in SVM. All other parameters of the three algorithms are the software defaults.</p>
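<p>The evaluation protocol (10-fold cross-validation with a <italic>k</italic> = 3 nearest-neighbor classifier) can be sketched as follows. This is a minimal NumPy illustration of the protocol only; the paper uses the MATLAB R2019a toolbox implementations of KNN, CART, and SVM.</p>

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote over the k nearest training samples (Euclidean)."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)      # distances to train set
        nearest = y_train[np.argsort(d)[:k]]         # labels of k nearest
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])        # majority vote
    return np.array(preds)

def cv_accuracy(X, y, k=3, folds=10, seed=0):
    """Mean accuracy over a shuffled 10-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    acc = []
    for f in range(folds):
        test = idx[f::folds]                         # every folds-th sample
        train = np.setdiff1d(idx, test)
        pred = knn_predict(X[train], y[train], X[test], k)
        acc.append(np.mean(pred == y[test]))
    return float(np.mean(acc))
```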
<sec id="S3.SS1">
<title>Influence of Different Values of &#x03C3; on Classification Performance</title>
<p>In this part, the classification accuracy obtained with different values of the Laplacian kernel parameter <bold>&#x03C3;</bold> is tested. For gene expression data, feature selection aims to improve classification accuracy by eliminating redundant genes. The value of <bold>&#x03C3;</bold> determines the size of the granulated gene data and therefore affects both the classification accuracy and the composition of the selected gene subset. To obtain a suitable <bold>&#x03C3;</bold> and a good gene subset, the classification accuracy of the selected subset should be examined in detail over a range of <bold>&#x03C3;</bold> values.</p>
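<p>The role of <bold>&#x03C3;</bold> can be illustrated with the Laplacian kernel itself: the fuzzy similarity of two samples on one gene is exp(&#x2212;|difference|/&#x03C3;), so a larger <bold>&#x03C3;</bold> yields larger similarities and hence coarser information granules. The helper below is a hypothetical illustration, not the paper's full granulation procedure.</p>

```python
import math

def laplacian_similarity(values, sigma):
    """Pairwise fuzzy similarity matrix for one gene's expression values."""
    n = len(values)
    return [[math.exp(-abs(values[i] - values[j]) / sigma)
             for j in range(n)] for i in range(n)]
```

<p>For two samples with expression values 0.0 and 1.0, &#x03C3; = 0.5 gives an off-diagonal similarity of exp(&#x2212;2) &#x2248; 0.135, while &#x03C3; = 1.0 gives exp(&#x2212;1) &#x2248; 0.368, a coarser granulation.</p>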
<p>The corresponding experiments are performed to graphically illustrate the classification accuracy of FSACE under different values of <bold>&#x03C3;</bold>. The results are shown in <xref ref-type="fig" rid="F1">Figure 1</xref>, where the horizontal axis denotes &#x03C3; &#x2208; [0.05,1] at intervals of 0.05, and the vertical axis represents the classification accuracy.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Classification accuracy for six gene expression data sets with different values of <bold>&#x03C3;</bold>.</p></caption>
<graphic xlink:href="fgene-12-631505-g001.tif"/>
</fig>
<p><xref ref-type="fig" rid="F1">Figure 1</xref> shows that <bold>&#x03C3;</bold> greatly influences the classification performance of FSACE. <bold>&#x03C3;</bold> is usually set to make the classification accuracy highest. Thus, the appropriate parameter values of <bold>&#x03C3;</bold> can be obtained for each data set from <xref ref-type="fig" rid="F1">Figure 1</xref>. In <xref ref-type="fig" rid="F1">Figure 1A</xref>, for Leukemia1 data set, when <bold>&#x03C3;</bold> is 0.95, the classification accuracy is the highest. In <xref ref-type="fig" rid="F1">Figure 1B</xref>, for Leukemia2 data set, when <bold>&#x03C3;</bold> is 0.55, the classification accuracy is the highest. In <xref ref-type="fig" rid="F1">Figure 1C</xref>, for Brain tumor data set, when <bold>&#x03C3;</bold> is 0.80, the classification accuracy is the highest. In <xref ref-type="fig" rid="F1">Figure 1D</xref>, for 9-tumors data set, when <bold>&#x03C3;</bold> is 0.75, the classification accuracy is the highest. In <xref ref-type="fig" rid="F1">Figure 1E</xref>, for Robert data set, when <bold>&#x03C3;</bold> is 0.60, the classification accuracy is the highest. In <xref ref-type="fig" rid="F1">Figure 1F</xref>, for Ting data set, when <bold>&#x03C3;</bold> is 0.75, the classification accuracy is the highest. Therefore, the appropriate values of <bold>&#x03C3;</bold> for different data sets are determined.</p>
</sec>
<sec id="S3.SS2">
<title>The Feature Selection Results and Classification Performance of FSACE</title>
<p><xref ref-type="table" rid="T2">Table 2</xref> shows the classification results of FSACE on the test data, obtained from the three classifiers (KNN, CART, and SVM) with 10-fold cross-validation.</p>
<table-wrap position="float" id="T2">
<label>TABLE 2</label>
<caption><p>Classification results of six gene expression data sets.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Data sets</td>
<td valign="top" align="center" colspan="4">Original data<hr/></td>
<td valign="top" align="left" colspan="4">Feature selection data using FSACE<hr/></td>
</tr>
<tr>
<td/>
<td valign="top" align="center">Genes</td>
<td valign="top" align="center">CART</td>
<td valign="top" align="center">KNN</td>
<td valign="top" align="center">SVM</td>
<td valign="top" align="center">Genes</td>
<td valign="top" align="center">CART</td>
<td valign="top" align="center">KNN</td>
<td valign="top" align="center">SVM</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Leukemia1</td>
<td valign="top" align="center">7129</td>
<td valign="top" align="center">0.822</td>
<td valign="top" align="center">0.839</td>
<td valign="top" align="center">0.917</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">0.911</td>
<td valign="top" align="center">0.947</td>
<td valign="top" align="center">0.931</td>
</tr>
<tr>
<td valign="top" align="left">Leukemia2</td>
<td valign="top" align="center">5327</td>
<td valign="top" align="center">0.849</td>
<td valign="top" align="center">0.820</td>
<td valign="top" align="center">0.834</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">0.891</td>
<td valign="top" align="center">0.894</td>
<td valign="top" align="center">0.878</td>
</tr>
<tr>
<td valign="top" align="left">Brain tumor</td>
<td valign="top" align="center">10,367</td>
<td valign="top" align="center">0.571</td>
<td valign="top" align="center">0.604</td>
<td valign="top" align="center">0.737</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0.743</td>
<td valign="top" align="center">0.631</td>
<td valign="top" align="center">0.614</td>
</tr>
<tr>
<td valign="top" align="left">9-tumors</td>
<td valign="top" align="center">5726</td>
<td valign="top" align="center">0.273</td>
<td valign="top" align="center">0.349</td>
<td valign="top" align="center">0.334</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">0.318</td>
<td valign="top" align="center">0.359</td>
<td valign="top" align="center">0.355</td>
</tr>
<tr>
<td valign="top" align="left">Robert</td>
<td valign="top" align="center">23,416</td>
<td valign="top" align="center">0.947</td>
<td valign="top" align="center">0.928</td>
<td valign="top" align="center">0.933</td>
<td valign="top" align="center">14</td>
<td valign="top" align="center">0.985</td>
<td valign="top" align="center">0.974</td>
<td valign="top" align="center">0.990</td>
</tr>
<tr>
<td valign="top" align="left">Ting</td>
<td valign="top" align="center">21,583</td>
<td valign="top" align="center">0.864</td>
<td valign="top" align="center">0.826</td>
<td valign="top" align="center">0.841</td>
<td valign="top" align="center">17</td>
<td valign="top" align="center">0.873</td>
<td valign="top" align="center">0.847</td>
<td valign="top" align="center">0.882</td>
</tr>
<tr>
<td valign="top" align="left">Average</td>
<td valign="top" align="center">12,258</td>
<td valign="top" align="center">0.721</td>
<td valign="top" align="center">0.728</td>
<td valign="top" align="center">0.766</td>
<td valign="top" align="center">9.333</td>
<td valign="top" align="center">0.787</td>
<td valign="top" align="center">0.775</td>
<td valign="top" align="center">0.775</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="table" rid="T2">Table 2</xref> shows that FSACE not only greatly reduces the dimensionality of all six gene expression data sets, but also improves the classification accuracy.</p>
<p>The feature genes selected by FSACE from the six gene expression data sets are shown in <xref ref-type="table" rid="T3">Table 3</xref>.</p>
<table-wrap position="float" id="T3">
<label>TABLE 3</label>
<caption><p>The selected feature genes on six gene expression data sets using FSACE.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Data sets</td>
<td valign="top" align="left">The selected feature gene subsets</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Leukemia1</td>
<td valign="top" align="left">(758,1144,1630,2659,3897,4196,5552,6471,6584)</td>
</tr>
<tr>
<td valign="top" align="left">Leukemia2</td>
<td valign="top" align="left">(568,848,861,1610,2197,3256,3358,4688,5032)</td>
</tr>
<tr>
<td valign="top" align="left">Brain tumor</td>
<td valign="top" align="left">(642,7169,7844,9413,9794)</td>
</tr>
<tr>
<td valign="top" align="left">9-tumors</td>
<td valign="top" align="left">(1677,2590)</td>
</tr>
<tr>
<td valign="top" align="left">Robert</td>
<td valign="top" align="left">(12883,1600,9892,16398,8720,4510,18137,2320,14931, 14679,10352,12481,18034,406)</td>
</tr>
<tr>
<td valign="top" align="left">Ting</td>
<td valign="top" align="left">(4754,5676,2503,5379,3304,4752,6015,2193,15687,641, 7938,2629,6837,4653,19016,8621,4267)</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="S3.SS3">
<title>Comparison of the Classification Performance of Several Entropy-Based Feature Selection Algorithms</title>
<p>To evaluate the performance of FSACE in terms of classification accuracy, the FSACE algorithm is compared with several state-of-the-art feature selection algorithms, including ECGS (<xref ref-type="bibr" rid="B14">Li et al., 2017</xref>), EGGS-FS (<xref ref-type="bibr" rid="B7">Hu et al., 2010</xref>), MEAR (<xref ref-type="bibr" rid="B1">Chen et al., 2017</xref>), Fisher (<xref ref-type="bibr" rid="B19">Saqlain et al., 2019</xref>), and Lasso (<xref ref-type="bibr" rid="B22">Tibshirani, 1996</xref>). According to the trend of the Fisher scores on the six gene data sets, the top 200 genes are selected as the reduction set for the Fisher algorithm.</p>
<p><xref ref-type="table" rid="T4">Tables 4</xref>&#x2013;<xref ref-type="table" rid="T9">9</xref> show the experimental results of six gene expression data sets using six different feature selection methods.</p>
<table-wrap position="float" id="T4">
<label>TABLE 4</label>
<caption><p>Classification accuracy of Leukemia1 using six different feature selection algorithms.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Feature selection method</td>
<td valign="top" align="center">Genes</td>
<td valign="top" align="center">CART</td>
<td valign="top" align="center">KNN</td>
<td valign="top" align="center">SVM</td>
<td valign="top" align="center">Average</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ECGS (<xref ref-type="bibr" rid="B14">Li et al., 2017</xref>)</td>
<td valign="top" align="center">8</td>
<td valign="top" align="center">0.744</td>
<td valign="top" align="center">0.619</td>
<td valign="top" align="center">0.813</td>
<td valign="top" align="center">0.725</td>
</tr>
<tr>
<td valign="top" align="left">EGGS-FS (<xref ref-type="bibr" rid="B7">Hu et al., 2010</xref>)</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0.821</td>
<td valign="top" align="center">0.794</td>
<td valign="top" align="center">0.701</td>
<td valign="top" align="center">0.772</td>
</tr>
<tr>
<td valign="top" align="left">MEAR (<xref ref-type="bibr" rid="B1">Chen et al., 2017</xref>)</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.939</td>
<td valign="top" align="center">0.919</td>
<td valign="top" align="center">0.925</td>
<td valign="top" align="center">0.928</td>
</tr>
<tr>
<td valign="top" align="left">Fisher (<xref ref-type="bibr" rid="B19">Saqlain et al., 2019</xref>)</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.639</td>
<td valign="top" align="center">0.857</td>
<td valign="top" align="center">0.778</td>
<td valign="top" align="center">0.758</td>
</tr>
<tr>
<td valign="top" align="left">Lasso (<xref ref-type="bibr" rid="B22">Tibshirani, 1996</xref>)</td>
<td valign="top" align="center">52</td>
<td valign="top" align="center">0.857</td>
<td valign="top" align="center">0.960</td>
<td valign="top" align="center">0.972</td>
<td valign="top" align="center">0.929</td>
</tr>
<tr>
<td valign="top" align="left">FSACE</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">0.911</td>
<td valign="top" align="center">0.947</td>
<td valign="top" align="center">0.931</td>
<td valign="top" align="center">0.930</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T5">
<label>TABLE 5</label>
<caption><p>Classification accuracy of Leukemia2 using six different feature selection algorithms.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Feature selection method</td>
<td valign="top" align="center">Genes</td>
<td valign="top" align="center">CART</td>
<td valign="top" align="center">KNN</td>
<td valign="top" align="center">SVM</td>
<td valign="top" align="center">Average</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ECGS (<xref ref-type="bibr" rid="B14">Li et al., 2017</xref>)</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.571</td>
<td valign="top" align="center">0.509</td>
<td valign="top" align="center">0.557</td>
<td valign="top" align="center">0.546</td>
</tr>
<tr>
<td valign="top" align="left">EGGS-FS (<xref ref-type="bibr" rid="B7">Hu et al., 2010</xref>)</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">0.907</td>
<td valign="top" align="center">0.871</td>
<td valign="top" align="center">0.874</td>
<td valign="top" align="center">0.884</td>
</tr>
<tr>
<td valign="top" align="left">MEAR (<xref ref-type="bibr" rid="B1">Chen et al., 2017</xref>)</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0.903</td>
<td valign="top" align="center">0.829</td>
<td valign="top" align="center">0.872</td>
<td valign="top" align="center">0.868</td>
</tr>
<tr>
<td valign="top" align="left">Fisher (<xref ref-type="bibr" rid="B19">Saqlain et al., 2019</xref>)</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.726</td>
<td valign="top" align="center">0.803</td>
<td valign="top" align="center">0.846</td>
<td valign="top" align="center">0.792</td>
</tr>
<tr>
<td valign="top" align="left">Lasso (<xref ref-type="bibr" rid="B22">Tibshirani, 1996</xref>)</td>
<td valign="top" align="center">37</td>
<td valign="top" align="center">0.817</td>
<td valign="top" align="center">0.914</td>
<td valign="top" align="center">0.909</td>
<td valign="top" align="center">0.880</td>
</tr>
<tr>
<td valign="top" align="left">FSACE</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">0.891</td>
<td valign="top" align="center">0.894</td>
<td valign="top" align="center">0.878</td>
<td valign="top" align="center">0.888</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T6">
<label>TABLE 6</label>
<caption><p>Classification accuracy of Brain tumor using six different feature selection algorithms.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Feature selection method</td>
<td valign="top" align="center">Genes</td>
<td valign="top" align="center">CART</td>
<td valign="top" align="center">KNN</td>
<td valign="top" align="center">SVM</td>
<td valign="top" align="center">Average</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ECGS (<xref ref-type="bibr" rid="B14">Li et al., 2017</xref>)</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">0.515</td>
<td valign="top" align="center">0.491</td>
<td valign="top" align="center">0.544</td>
<td valign="top" align="center">0.517</td>
</tr>
<tr>
<td valign="top" align="left">EGGS-FS (<xref ref-type="bibr" rid="B7">Hu et al., 2010</xref>)</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0.388</td>
<td valign="top" align="center">0.490</td>
<td valign="top" align="center">0.531</td>
<td valign="top" align="center">0.470</td>
</tr>
<tr>
<td valign="top" align="left">MEAR (<xref ref-type="bibr" rid="B1">Chen et al., 2017</xref>)</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="left">Fisher (<xref ref-type="bibr" rid="B19">Saqlain et al., 2019</xref>)</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.630</td>
<td valign="top" align="center">0.704</td>
<td valign="top" align="center">0.617</td>
<td valign="top" align="center">0.650</td>
</tr>
<tr>
<td valign="top" align="left">Lasso (<xref ref-type="bibr" rid="B22">Tibshirani, 1996</xref>)</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="left">FSACE</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0.743</td>
<td valign="top" align="center">0.631</td>
<td valign="top" align="center">0.614</td>
<td valign="top" align="center">0.663</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T7">
<label>TABLE 7</label>
<caption><p>Classification accuracy of 9-tumors using six different feature selection algorithms.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Feature selection method</td>
<td valign="top" align="center">Genes</td>
<td valign="top" align="center">CART</td>
<td valign="top" align="center">KNN</td>
<td valign="top" align="center">SVM</td>
<td valign="top" align="center">Average</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ECGS (<xref ref-type="bibr" rid="B14">Li et al., 2017</xref>)</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.177</td>
<td valign="top" align="center">0.102</td>
<td valign="top" align="center">0.672</td>
<td valign="top" align="center">0.317</td>
</tr>
<tr>
<td valign="top" align="left">EGGS-FS (<xref ref-type="bibr" rid="B7">Hu et al., 2010</xref>)</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.224</td>
<td valign="top" align="center">0.203</td>
<td valign="top" align="center">0.393</td>
<td valign="top" align="center">0.273</td>
</tr>
<tr>
<td valign="top" align="left">MEAR (<xref ref-type="bibr" rid="B1">Chen et al., 2017</xref>)</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="left">Fisher (<xref ref-type="bibr" rid="B19">Saqlain et al., 2019</xref>)</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.249</td>
<td valign="top" align="center">0.335</td>
<td valign="top" align="center">0.414</td>
<td valign="top" align="center">0.333</td>
</tr>
<tr>
<td valign="top" align="left">Lasso (<xref ref-type="bibr" rid="B22">Tibshirani, 1996</xref>)</td>
<td valign="top" align="center">27</td>
<td valign="top" align="center">0.199</td>
<td valign="top" align="center">0.361</td>
<td valign="top" align="center">0.322</td>
<td valign="top" align="center">0.294</td>
</tr>
<tr>
<td valign="top" align="left">FSACE</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">0.318</td>
<td valign="top" align="center">0.359</td>
<td valign="top" align="center">0.355</td>
<td valign="top" align="center">0.344</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T8">
<label>TABLE 8</label>
<caption><p>Classification accuracy of Robert using six different feature selection algorithms.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Feature selection method</td>
<td valign="top" align="center">Genes</td>
<td valign="top" align="center">CART</td>
<td valign="top" align="center">KNN</td>
<td valign="top" align="center">SVM</td>
<td valign="top" align="center">Average</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ECGS (<xref ref-type="bibr" rid="B14">Li et al., 2017</xref>)</td>
<td valign="top" align="center">11</td>
<td valign="top" align="center">0.948</td>
<td valign="top" align="center">0.937</td>
<td valign="top" align="center">0.964</td>
<td valign="top" align="center">0.950</td>
</tr>
<tr>
<td valign="top" align="left">EGGS-FS (<xref ref-type="bibr" rid="B7">Hu et al., 2010</xref>)</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">0.957</td>
<td valign="top" align="center">0.954</td>
<td valign="top" align="center">0.975</td>
<td valign="top" align="center">0.962</td>
</tr>
<tr>
<td valign="top" align="left">MEAR (<xref ref-type="bibr" rid="B1">Chen et al., 2017</xref>)</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="left">Fisher (<xref ref-type="bibr" rid="B19">Saqlain et al., 2019</xref>)</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.976</td>
<td valign="top" align="center">0.990</td>
<td valign="top" align="center">0.989</td>
<td valign="top" align="center">0.985</td>
</tr>
<tr>
<td valign="top" align="left">Lasso (<xref ref-type="bibr" rid="B22">Tibshirani, 1996</xref>)</td>
<td valign="top" align="center">21</td>
<td valign="top" align="center">0.984</td>
<td valign="top" align="center">0.991</td>
<td valign="top" align="center">0.989</td>
<td valign="top" align="center">0.988</td>
</tr>
<tr>
<td valign="top" align="left">FSACE</td>
<td valign="top" align="center">14</td>
<td valign="top" align="center">0.993</td>
<td valign="top" align="center">0.991</td>
<td valign="top" align="center">0.985</td>
<td valign="top" align="center">0.990</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T9">
<label>TABLE 9</label>
<caption><p>Classification accuracy of Ting using six different feature selection algorithms.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Feature selection method</td>
<td valign="top" align="center">Genes</td>
<td valign="top" align="center">CART</td>
<td valign="top" align="center">KNN</td>
<td valign="top" align="center">SVM</td>
<td valign="top" align="center">Average</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ECGS (<xref ref-type="bibr" rid="B14">Li et al., 2017</xref>)</td>
<td valign="top" align="center">12</td>
<td valign="top" align="center">0.793</td>
<td valign="top" align="center">0.781</td>
<td valign="top" align="center">0.651</td>
<td valign="top" align="center">0.742</td>
</tr>
<tr>
<td valign="top" align="left">EGGS-FS (<xref ref-type="bibr" rid="B7">Hu et al., 2010</xref>)</td>
<td valign="top" align="center">9</td>
<td valign="top" align="center">0.745</td>
<td valign="top" align="center">0.717</td>
<td valign="top" align="center">0.626</td>
<td valign="top" align="center">0.696</td>
</tr>
<tr>
<td valign="top" align="left">MEAR (<xref ref-type="bibr" rid="B1">Chen et al., 2017</xref>)</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="left">Fisher (<xref ref-type="bibr" rid="B19">Saqlain et al., 2019</xref>)</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.833</td>
<td valign="top" align="center">0.779</td>
<td valign="top" align="center">0.770</td>
<td valign="top" align="center">0.794</td>
</tr>
<tr>
<td valign="top" align="left">Lasso (<xref ref-type="bibr" rid="B22">Tibshirani, 1996</xref>)</td>
<td valign="top" align="center">56</td>
<td valign="top" align="center">0.833</td>
<td valign="top" align="center">0.833</td>
<td valign="top" align="center">0.845</td>
<td valign="top" align="center">0.837</td>
</tr>
<tr>
<td valign="top" align="left">FSACE</td>
<td valign="top" align="center">17</td>
<td valign="top" align="center">0.833</td>
<td valign="top" align="center">0.833</td>
<td valign="top" align="center">0.872</td>
<td valign="top" align="center">0.846</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As shown in <xref ref-type="table" rid="T4">Tables 4</xref>, <xref ref-type="table" rid="T5">5</xref>, FSACE has the highest average classification accuracy for Leukemia1 and Leukemia2, and exhibits better classification performance than the other five algorithms.</p>
<p>As shown in <xref ref-type="table" rid="T6">Tables 6</xref>, <xref ref-type="table" rid="T7">7</xref>, MEAR cannot work on Brain Tumor data set and 9-tumors data set, its results are denoted by the sign &#x2013;. FSACE obtains the highest average classification accuracy among the five feature selection algorithms for Brain Tumor data set and 9-tumors data set.</p>
<p><xref ref-type="table" rid="T8">Tables 8</xref>, <xref ref-type="table" rid="T9">9</xref> shows that MEAR still can not work on Robert data set and Ting data set, which indicates that the algorithm is not stable. Our algorithm still has the highest classification accuracy among all the algorithms. Although the classification accuracy of our algorithm is only a little higher than lasso algorithm, the number of attributes reduced by our algorithm is much less than lasso algorithm.</p>
<p><xref ref-type="table" rid="T4">Tables 4</xref>&#x2013;<xref ref-type="table" rid="T9">9</xref> show that the average number of attributes reduced by our algorithm is slightly more than that of MEAR, ECGS, and EGGS-FS, but the average classification accuracy is much higher than that of these three algorithms.</p>
<p>Therefore, FSACE can not only effectively remove noise and redundant data from the original data, but also improve the classification accuracy of gene expression data sets.</p>
</sec>
</sec>
<sec id="S4">
<title>Conclusion and Discussion</title>
<p>First, the concept of approximate conditional entropy is introduced and its monotonicity is proved in this article. Approximate conditional entropy describes the uncertainty of knowledge from two aspects: the boundary and the information granule. Then, a novel feature selection algorithm, FSACE, is proposed based on approximate conditional entropy. Finally, the effectiveness of the proposed algorithm is verified on several gene expression data sets. Experimental results show that, compared with several state-of-the-art feature selection algorithms, FSACE not only obtains compact feature subsets but also improves classification performance. The time complexity of FSACE is <italic>O</italic>(|<italic>U</italic>|<sup>2</sup>|<italic>C</italic>|<sup>2</sup>); because gene expression data sets usually contain a large number of genes, this complexity is high. In addition, FSACE does not consider the interaction between attributes. Therefore, reducing the time complexity of FSACE and designing more efficient feature selection algorithms that account for attribute interaction are two issues we will study in the future.</p>
</sec>
<sec id="S5">
<title>Data Availability Statement</title>
<p>Publicly available datasets were analyzed in this study. This data can be found here: <ext-link ext-link-type="uri" xlink:href="http://portals.broadinstitute.org/cgi-bin/cancer/datasets.cgi">http://portals.broadinstitute.org/cgi-bin/cancer/datasets.cgi</ext-link> (cancer Program Legacy Publication Resources).</p>
</sec>
<sec id="S6">
<title>Author Contributions</title>
<p>The author confirms being the sole contributor of this work and has approved it for publication.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Zheng</surname> <given-names>J.</given-names></name> <name><surname>Ying</surname> <given-names>M.</given-names></name> <name><surname>Yu</surname> <given-names>X.</given-names></name></person-group> (<year>2017</year>). <article-title>Gene selection for tumor classification using neighborhood rough sets and entropy measures.</article-title> <source><italic>J. Biomed. Inform</italic>.</source> <volume>67</volume> <fpage>59</fpage>&#x2013;<lpage>68</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbi.2017.02.007</pub-id> <pub-id pub-id-type="pmid">28215562</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dai</surname> <given-names>J.</given-names></name> <name><surname>Xu</surname> <given-names>Q.</given-names></name></person-group> (<year>2012</year>). <article-title>Approximations and uncertainty measures in incomplete information systems.</article-title> <source><italic>Inf. Sci</italic>.</source> <volume>198</volume> <fpage>62</fpage>&#x2013;<lpage>80</lpage>. <pub-id pub-id-type="doi">10.1016/j.ins.2012.02.032</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dong</surname> <given-names>H.</given-names></name> <name><surname>Li</surname> <given-names>T.</given-names></name> <name><surname>Ding</surname> <given-names>R.</given-names></name> <name><surname>Sun</surname> <given-names>J.</given-names></name></person-group> (<year>2018</year>). <article-title>A novel hybrid genetic algorithm with granular information for feature selection and optimization.</article-title> <source><italic>Appl. Soft Comput</italic>.</source> <volume>65</volume> <fpage>33</fpage>&#x2013;<lpage>46</lpage>. <pub-id pub-id-type="doi">10.1016/j.asoc.2017.12.048</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fu</surname> <given-names>X.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name></person-group> (<year>2003</year>). <article-title>Data dimensionality reduction with application to simplifying RBF network structure and improving classification performance.</article-title> <source><italic>IEEE Trans. Syst. Man Cybern. Part B Cybern</italic>.</source> <volume>33</volume> <fpage>399</fpage>&#x2013;<lpage>409</lpage>. <pub-id pub-id-type="doi">10.1109/tsmcb.2003.810911</pub-id> <pub-id pub-id-type="pmid">18238187</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>L.</given-names></name> <name><surname>Gao</surname> <given-names>W.</given-names></name> <name><surname>Zhao</surname> <given-names>K.</given-names></name> <name><surname>Zhang</surname> <given-names>P.</given-names></name> <name><surname>Wang</surname> <given-names>F.</given-names></name></person-group> (<year>2018</year>). <article-title>Feature selection considering two types of feature relevancy and feature interdependency.</article-title> <source><italic>Expert Syst. Appl</italic>.</source> <volume>93</volume> <fpage>423</fpage>&#x2013;<lpage>434</lpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2017.10.016</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>Q.</given-names></name> <name><surname>Yu</surname> <given-names>D.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Wu</surname> <given-names>C.</given-names></name></person-group> (<year>2008</year>). <article-title>Neighborhood rough set based heterogeneous feature subset selection.</article-title> <source><italic>Inf. Sci</italic>.</source> <volume>178</volume> <fpage>3577</fpage>&#x2013;<lpage>3594</lpage>. <pub-id pub-id-type="doi">10.1016/j.ins.2008.05.024</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>Q.</given-names></name> <name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Chen</surname> <given-names>D.</given-names></name> <name><surname>Witold</surname> <given-names>P.</given-names></name> <name><surname>Daren</surname> <given-names>Y.</given-names></name></person-group> (<year>2010</year>). <article-title>Gaussian kernel based fuzzy rough sets: model, uncertainty measures and applications.</article-title> <source><italic>Int. J. Approx. Reason</italic>.</source> <volume>51</volume> <fpage>453</fpage>&#x2013;<lpage>471</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijar.2010.01.004</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>Q.</given-names></name> <name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>D.</given-names></name> <name><surname>Wei</surname> <given-names>P.</given-names></name> <name><surname>Shuang</surname> <given-names>A.</given-names></name> <name><surname>Witold</surname> <given-names>P.</given-names></name></person-group> (<year>2011</year>). <article-title>Measuring relevance between discrete and continuous features based on neighborhood mutual information.</article-title> <source><italic>Expert Syst. Appl</italic>.</source> <volume>38</volume> <fpage>10737</fpage>&#x2013;<lpage>10750</lpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2011.01.023</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>L.</given-names></name> <name><surname>Wang</surname> <given-names>B.</given-names></name> <name><surname>Li</surname> <given-names>F. Z.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name></person-group> (<year>2017</year>). <article-title>Feature clustering based support vector machine recursive feature elimination for gene selection.</article-title> <source><italic>Appl. Intell</italic>.</source> <volume>48</volume> <fpage>1</fpage>&#x2013;<lpage>14</lpage>.</citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jensen</surname> <given-names>R.</given-names></name> <name><surname>Shen</surname> <given-names>Q.</given-names></name></person-group> (<year>2009</year>). <article-title>New approaches to fuzzy-rough feature selection.</article-title> <source><italic>IEEE Trans. Fuzzy Syst</italic>.</source> <volume>17</volume> <fpage>824</fpage>&#x2013;<lpage>838</lpage>. <pub-id pub-id-type="doi">10.1109/tfuzz.2008.924209</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname> <given-names>F.</given-names></name> <name><surname>Wang</surname> <given-names>S.</given-names></name> <name><surname>Du</surname> <given-names>J.</given-names></name> <name><surname>Sui</surname> <given-names>Y. F.</given-names></name></person-group> (<year>2015</year>). <article-title>Attribute reduction based on approximation decision entropy.</article-title> <source><italic>Control and Decis</italic>.</source> <volume>30</volume> <fpage>65</fpage>&#x2013;<lpage>70</lpage>. <pub-id pub-id-type="doi">10.3390/e20010065</pub-id> <pub-id pub-id-type="pmid">33265152</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kimmerling</surname> <given-names>R.</given-names></name> <name><surname>Szeto</surname> <given-names>G.</given-names></name> <name><surname>Li</surname> <given-names>J.</given-names></name> <name><surname>Alex</surname> <given-names>S. G.</given-names></name> <name><surname>Samuel</surname> <given-names>W. K.</given-names></name> <name><surname>Kristofor</surname> <given-names>R. P.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>A microfluidic platform enabling single-cell RNA-seq of multigenerational lineages.</article-title> <source><italic>Nat. Commun</italic>.</source> <volume>7</volume>:<issue>10220</issue>.</citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Konstantina</surname> <given-names>K.</given-names></name> <name><surname>Themis</surname> <given-names>P.</given-names></name> <name><surname>Konstantinos</surname> <given-names>P. E.</given-names></name> <name><surname>Michalis</surname> <given-names>V. K.</given-names></name> <name><surname>Dimitrios</surname> <given-names>I. F.</given-names></name></person-group> (<year>2015</year>). <article-title>Machine learning applications in cancer prognosis and prediction.</article-title> <source><italic>Comput. Struct. Biotechnol. J</italic>.</source> <volume>13</volume> <fpage>8</fpage>&#x2013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.1016/j.csbj.2014.11.005</pub-id> <pub-id pub-id-type="pmid">25750696</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>Z.</given-names></name> <name><surname>Liu</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>G.</given-names></name> <name><surname>Xie</surname> <given-names>N.</given-names></name> <name><surname>Wang</surname> <given-names>S.</given-names></name></person-group> (<year>2017</year>). <article-title>A multi-granulation decision-theoretic rough set method for distributed fc-decision information systems: an application in medical diagnosis.</article-title> <source><italic>Appl. Soft Comput</italic>.</source> <volume>56</volume> <fpage>233</fpage>&#x2013;<lpage>244</lpage>. <pub-id pub-id-type="doi">10.1016/j.asoc.2017.02.033</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mitra</surname> <given-names>S.</given-names></name> <name><surname>Das</surname> <given-names>R.</given-names></name> <name><surname>Hayashi</surname> <given-names>Y.</given-names></name></person-group> (<year>2011</year>). <article-title>Genetic networks and soft computing.</article-title> <source><italic>IEEE/ACM Trans. Comput. Biol. Bioinform</italic>.</source> <volume>8</volume> <fpage>94</fpage>&#x2013;<lpage>107</lpage>.</citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pawlak</surname> <given-names>Z.</given-names></name></person-group> (<year>1982</year>). <article-title>Rough sets.</article-title> <source><italic>Int. J. Comput. Inf. Sci</italic>.</source> <volume>11</volume> <fpage>341</fpage>&#x2013;<lpage>356</lpage>.</citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Phan</surname> <given-names>J.</given-names></name> <name><surname>Quo</surname> <given-names>C.</given-names></name> <name><surname>Wang</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Cardiovascular genomics: a biomarker identification pipeline.</article-title> <source><italic>IEEE Trans. Inf. Technol. Biomed</italic>.</source> <volume>16</volume> <fpage>809</fpage>&#x2013;<lpage>822</lpage>. <pub-id pub-id-type="doi">10.1109/titb.2012.2199570</pub-id> <pub-id pub-id-type="pmid">22614726</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qian</surname> <given-names>Y.</given-names></name> <name><surname>Liang</surname> <given-names>J.</given-names></name> <name><surname>Wu</surname> <given-names>W.</given-names></name> <name><surname>Dang</surname> <given-names>C.</given-names></name></person-group> (<year>2011</year>). <article-title>Information granularity in fuzzy binary GrC model.</article-title> <source><italic>IEEE Trans. Fuzzy Syst</italic>.</source> <volume>19</volume> <fpage>253</fpage>&#x2013;<lpage>264</lpage>. <pub-id pub-id-type="doi">10.1109/tfuzz.2010.2095461</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saqlain</surname> <given-names>S. M.</given-names></name> <name><surname>Sher</surname> <given-names>M.</given-names></name> <name><surname>Shah</surname> <given-names>F. A.</given-names></name> <name><surname>Khan</surname> <given-names>I.</given-names></name> <name><surname>Ashraf</surname> <given-names>M. U.</given-names></name> <name><surname>Awais</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Fisher score and Matthews correlation coefficient-based feature subset selection for heart disease diagnosis using support vector machines[J].</article-title> <source><italic>Knowl. Inf. Syst</italic>.</source> <volume>58</volume> <fpage>139</fpage>&#x2013;<lpage>167</lpage>. <pub-id pub-id-type="doi">10.1007/s10115-018-1185-y</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>L.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name> <name><surname>Xu</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>S.</given-names></name></person-group> (<year>2019a</year>). <article-title>A neighborhood rough sets-based attribute reduction method using Lebesgue and entropy measures.</article-title> <source><italic>Entropy</italic></source> <volume>21</volume> <fpage>1</fpage>&#x2013;<lpage>26</lpage>.</citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>X.</given-names></name> <name><surname>Qian</surname> <given-names>Y.</given-names></name> <name><surname>Xu</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>S.</given-names></name></person-group> (<year>2019b</year>). <article-title>Feature selection using neighborhood entropy-based uncertainty measures for gene expression data classification.</article-title> <source><italic>Inf. Sci</italic>.</source> <volume>502</volume> <fpage>18</fpage>&#x2013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1016/j.ins.2019.05.072</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tibshirani</surname> <given-names>R.</given-names></name></person-group> (<year>1996</year>). <article-title>Regression shrinkage and selection via the lasso.</article-title> <source><italic>J. R. Stat. Soc. Series B Stat. Methodol</italic>.</source> <volume>58</volume> <fpage>267</fpage>&#x2013;<lpage>288</lpage>. <pub-id pub-id-type="doi">10.1111/j.2517-6161.1996.tb02080.x</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ting</surname> <given-names>D.</given-names></name> <name><surname>Wittner</surname> <given-names>B.</given-names></name> <name><surname>Ligorio</surname> <given-names>M.</given-names></name> <name><surname>Brian</surname> <given-names>W. B.</given-names></name> <name><surname>Ajay</surname> <given-names>M. S.</given-names></name> <name><surname>Xega</surname> <given-names>K.</given-names></name><etal/></person-group> (<year>2014</year>). <article-title>Single-cell RNA sequencing identifies extracellular matrix gene expression by pancreatic circulating tumor cells.</article-title> <source><italic>Cell Rep</italic>.</source> <volume>8</volume> <fpage>1905</fpage>&#x2013;<lpage>1918</lpage>. <pub-id pub-id-type="doi">10.1016/j.celrep.2014.08.029</pub-id> <pub-id pub-id-type="pmid">25242334</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tsang</surname> <given-names>E.</given-names></name> <name><surname>Chen</surname> <given-names>D.</given-names></name> <name><surname>Yeung</surname> <given-names>D.</given-names></name> <name><surname>Wang</surname> <given-names>X. Z.</given-names></name> <name><surname>Lee</surname> <given-names>J. W. T.</given-names></name></person-group> (<year>2008</year>). <article-title>Attributes reduction using fuzzy rough sets.</article-title> <source><italic>IEEE Trans. Fuzzy Syst</italic>.</source> <volume>16</volume> <fpage>1130</fpage>&#x2013;<lpage>1141</lpage>. <pub-id pub-id-type="doi">10.1109/tfuzz.2006.889960</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>C.</given-names></name> <name><surname>Shi</surname> <given-names>Y.</given-names></name> <name><surname>Fan</surname> <given-names>X.</given-names></name> <name><surname>Shao</surname> <given-names>M. W.</given-names></name></person-group> (<year>2019</year>). <article-title>Attribute reduction based on k-nearest neighborhood rough sets.</article-title> <source><italic>Int. J. Approx. Reason</italic>.</source> <volume>106</volume> <fpage>18</fpage>&#x2013;<lpage>31</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijar.2018.12.013</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>F.</given-names></name> <name><surname>Miao</surname> <given-names>D.</given-names></name> <name><surname>Wei</surname> <given-names>L.</given-names></name></person-group> (<year>2009</year>). <article-title>Fuzzy-rough attribute reduction via mutual information with an application to cancer classification.</article-title> <source><italic>Comput. Math. Appl</italic>.</source> <volume>57</volume> <fpage>1010</fpage>&#x2013;<lpage>1017</lpage>. <pub-id pub-id-type="doi">10.1016/j.camwa.2008.10.027</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>J.</given-names></name> <name><surname>Liu</surname> <given-names>Y.</given-names></name> <name><surname>Feng</surname> <given-names>C.</given-names></name> <name><surname>Zhu</surname> <given-names>G. Q.</given-names></name></person-group> (<year>2016</year>). <article-title>Applying the fisher score to identify Alzheimer&#x2019;s disease-related genes.</article-title> <source><italic>Genet. Mol. Res</italic>.</source> <volume>15</volume> <fpage>1</fpage>&#x2013;<lpage>9</lpage>.</citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ye</surname> <given-names>C.</given-names></name> <name><surname>Pan</surname> <given-names>J.</given-names></name> <name><surname>Jin</surname> <given-names>Q.</given-names></name></person-group> (<year>2019</year>). <article-title>An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection.</article-title> <source><italic>Future Gener. Comput. Syst</italic>.</source> <volume>92</volume> <fpage>407</fpage>&#x2013;<lpage>418</lpage>. <pub-id pub-id-type="doi">10.1016/j.future.2018.10.008</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zadeh</surname> <given-names>L.</given-names></name></person-group> (<year>1965</year>). <article-title>Fuzzy sets.</article-title> <source><italic>Inf. Control</italic></source> <volume>8</volume> <fpage>338</fpage>&#x2013;<lpage>353</lpage>.</citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zadeh</surname> <given-names>L.</given-names></name></person-group> (<year>1979</year>). <source><italic>Fuzzy Sets and Information Granularity, Advance in Fuzzy Set Theory &#x0026; Application.</italic></source> <publisher-loc>Amsterdam</publisher-loc>: <publisher-name>North Holland Publishing</publisher-name>, <fpage>3</fpage>&#x2013;<lpage>18</lpage>.</citation></ref>
</ref-list>
</back>
</article>
