AUTHOR=Das Pijush , Roychowdhury Anirban , Das Subhadeep , Roychoudhury Susanta , Tripathy Sucheta TITLE=sigFeature: Novel Significant Feature Selection Method for Classification of Gene Expression Data Using Support Vector Machine and t Statistic JOURNAL=Frontiers in Genetics VOLUME=Volume 11 - 2020 YEAR=2020 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2020.00247 DOI=10.3389/fgene.2020.00247 ISSN=1664-8021 ABSTRACT=Biological data are accumulating at an ever faster rate, but interpreting them remains a problem. Classifying biological data into distinct groups is the first step in understanding them. Classifying data by response to a given treatment is especially important when making present/absent calls for differentially expressed genes (DEGs). Many feature selection algorithms have been developed, including the support vector machine recursive feature elimination procedure (SVM-RFE) and its variants. SVM-RFE methods are greedy: they search for the best-performing combination of features for binary classification, and the features they select may not be biologically significant. To overcome this limitation of SVM-RFE, we propose a novel feature selection algorithm, based on SVM-RFE and the t-statistic, that discovers differentially significant features while maintaining good classification performance. The R package “sigFeature” is centred around a function named “sigFeature()”, which provides automatic feature selection for binary classification of data. Using six publicly available microarray datasets with different biological attributes, we further compared the performance of “sigFeature” to three other feature selection algorithms. A small number of features selected by “sigFeature” also shows higher classification accuracy.
When we performed Gene Set Enrichment Analysis (GSEA) with the features (genes) selected by “sigFeature” and compared the results with the outputs of the other algorithms, we observed that “sigFeature” accurately predicts the signature of four of the six microarray datasets. Thus, “sigFeature” is not only better at discovering differentially significant features but also attains higher classification accuracy than the competing methods.
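
To make the t-statistic half of the approach concrete, the sketch below ranks features by the absolute value of a two-sample (Welch) t statistic between the two classes. It is only an illustration under stated assumptions: the abstract does not give the exact criterion that sigFeature uses or how it is combined with the SVM-RFE weights, so the function names (`welch_t`, `rank_features_by_t`), the choice of the Welch form, and the toy data are all hypothetical and are not the package's API. Python is used here for illustration only; the package itself is written in R.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic for two lists of expression values."""
    # Standard error of the difference in means, with unequal variances.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_features_by_t(samples, labels):
    """Return feature indices sorted by decreasing |t| between the classes.

    samples: list of rows (one per sample), each a list of feature values.
    labels:  list of 0/1 class labels, one per sample.
    """
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, y in zip(samples, labels) if y == 1]
        neg = [row[j] for row, y in zip(samples, labels) if y == 0]
        scores.append(abs(welch_t(pos, neg)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy data (made up): 6 samples x 3 genes; only gene 0 separates the classes.
toy_samples = [
    [5.1, 2.0, 3.3],
    [4.9, 2.4, 3.0],
    [5.3, 1.9, 3.1],
    [1.0, 2.1, 3.2],
    [1.2, 2.3, 2.9],
    [0.8, 2.0, 3.4],
]
toy_labels = [1, 1, 1, 0, 0, 0]
ranking = rank_features_by_t(toy_samples, toy_labels)
```

In a recursive elimination scheme, a score of this kind could be recomputed or combined with classifier weights at each round before the lowest-ranked features are dropped; how sigFeature does this is described in the full article, not in the abstract.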