Fuzzy Discretization and Rough Set based Feature Selection for High-Dimensional Classification

Author(s)

1 Prema Ramasamy, Assistant Professor, New Horizon College of Engineering, Bangalore. E-mail: premabit@gmail.com
2 Professor, Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam.

(Received May 11 2018, accepted July 16 2018)

Abstract

Contemporary biological technologies such as gene expression microarrays produce extremely high-dimensional datasets with limited samples. Analysis of gene expression data is essential in microarray studies in order to retrieve the required information. Gene expression data generally contain a large number of genes but a small number of samples. The complicated relations among the different genes make analysis more difficult, and removing irrelevant genes improves the quality of results. In this regard, a new feature selection algorithm based on rough set theory, called 2-level MRMS, is presented. It selects a set of genes from microarray data by maximizing the relevance and significance of the selected genes. The paper also presents a novel discretization method based on fuzzy logic, Gaussian Fuzzy Discretization, to discretize the continuous gene expression values. The performance of the proposed algorithm, along with a comparison with other related feature selection methods, is studied using the classification accuracy of k-Nearest Neighbor (kNN) and Support Vector Machine (SVM) classifiers on four microarray data sets. The experimental results show that the genes selected using 2-level MRMS feature selection give higher classification accuracy than those selected by the other methods.
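The abstract does not spell out how Gaussian Fuzzy Discretization works, so the following is only a minimal sketch of one plausible reading, not the paper's definition. It assumes three fuzzy sets (low, medium, high) with Gaussian membership functions centered at mean - std, mean, and mean + std of a gene's expression values, and assigns each value the label of its strongest membership. The function names, the number of fuzzy sets, and the choice of centers and widths are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def gaussian_membership(x, center, width):
    """Gaussian membership degree of x in a fuzzy set centered at `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def fuzzy_discretize(values):
    """Map continuous values to 0 (low), 1 (medium), or 2 (high).

    Assumption (not from the paper): three fuzzy sets centered at
    mean - std, mean, and mean + std, each with width std. A value
    receives the label of the set in which its membership is largest.
    """
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for constant features
    centers = [mu - sigma, mu, mu + sigma]
    labels = []
    for x in values:
        degrees = [gaussian_membership(x, c, sigma) for c in centers]
        labels.append(degrees.index(max(degrees)))
    return labels


# Hypothetical expression values for a single gene across six samples.
expression = [0.1, 0.2, 1.0, 1.1, 2.0, 2.1]
labels = fuzzy_discretize(expression)
```

Discretized labels of this kind are what a rough set based selector would then consume, since rough set measures such as relevance and significance are defined over discrete attribute values.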