Asian Journal of Computer Science and Technology (AJCST)
Selection of Features on Mining Techniques for Classification
Authors: Gurrampally Kumar, S. Mohan and G. Prabakaran
Volume 7 No.1 Special Issue: November 2018, pp. 108-111
Several data mining techniques have been developed for feature selection in classification. Some existing approaches cannot remove data that are irrelevant to a class from the dataset. Appropriate features must therefore be selected, since they play a central role in classification. To this end, the paper uses a statistical measure, the correlation coefficient, to identify the features in the feature set whose data are most important for the existing classes. Several methods, such as Gaussian process, linear regression and Euclidean distance, are also taken into consideration for clarity of classification. The experimental results reveal that the proposed method identifies the relevant features for several classes.
Feature Selection, Data Mining, Classification, Correlation Coefficient
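The abstract's central idea, scoring features by their correlation with a class, can be illustrated with a minimal sketch. This is not the authors' implementation: the Pearson coefficient, the one-vs-rest class indicator, the function names (pearson, rank_features, select_features), the 0.5 threshold and the toy data are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, labels, target_class):
    """Score each feature by |r| against a one-vs-rest indicator (assumed scheme)."""
    indicator = [1.0 if lab == target_class else 0.0 for lab in labels]
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((j, abs(pearson(column, indicator))))
    # Highest absolute correlation first
    return sorted(scores, key=lambda item: item[1], reverse=True)


def select_features(rows, labels, target_class, threshold=0.5):
    """Keep feature indices whose score reaches the (hypothetical) threshold."""
    ranked = rank_features(rows, labels, target_class)
    return [j for j, score in ranked if score >= threshold]


# Toy data, invented for illustration: feature 0 separates the classes,
# feature 1 is unrelated to the class, feature 2 is constant.
rows = [
    [1.0, 5.0, 0.3],
    [1.1, 3.0, 0.3],
    [0.9, 4.0, 0.3],
    [3.0, 5.0, 0.3],
    [3.2, 3.0, 0.3],
    [2.9, 4.0, 0.3],
]
labels = ["a", "a", "a", "b", "b", "b"]

selected = select_features(rows, labels, "b")
print(selected)
```

On this toy data only feature 0 passes the threshold. In a multi-class setting the same scoring could be repeated once per class, which would give a class-specific set of relevant features.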