
Asian Journal of Computer Science and Technology (AJCST)
An Efficient Closed Maximal Pattern Sequences Mining on High Dimensional Datasets
Author: J. Krishna and M. Haritha
Volume 8 No. 3 Special Issue: June 2019, pp. 50-53
Abstract
Previous work has argued convincingly that the complete set of frequent patterns is too large to be used effectively. A compact but high-quality set of patterns, such as closed patterns and maximal patterns, is needed. Most existing maximal pattern sequence mining algorithms for high-dimensional sequences, such as biological data sets, work under a single support threshold. In this paper, an efficient algorithm, Closed Maximal Pattern Sequences (CMPS-Mine), is proposed for mining closed maximal patterns based on multi-support. Careful experiments on Beta-globin gene sequences show that CMPS-Mine uses less memory and less run time than PrefixSpan. It generates compact results and two kinds of interesting patterns.
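The difference between closed and maximal patterns mentioned above can be illustrated with a minimal Python sketch. It is not CMPS-Mine: it enumerates subsequences by brute force on a toy database, uses a single support threshold rather than multi-support, and all function names and the sample sequences are hypothetical. It assumes the standard definitions: a frequent pattern is closed if no proper super-sequence has the same support, and maximal if no proper super-sequence is frequent.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(symbol in it for symbol in pattern)


def support(pattern, database):
    """Number of database sequences that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, min_support):
    """Brute-force enumeration of frequent subsequences (toy inputs only)."""
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add("".join(seq[i] for i in idx))
    frequent = {}
    for pattern in candidates:
        count = support(pattern, database)
        if count >= min_support:
            frequent[pattern] = count
    return frequent


def closed_patterns(frequent):
    """Keep patterns with no proper super-sequence of equal support."""
    return {
        p: s for p, s in frequent.items()
        if not any(len(q) > len(p) and frequent[q] == s and is_subsequence(p, q)
                   for q in frequent)
    }


def maximal_patterns(frequent):
    """Keep patterns with no frequent proper super-sequence."""
    return {
        p: s for p, s in frequent.items()
        if not any(len(q) > len(p) and is_subsequence(p, q) for q in frequent)
    }


# Hypothetical toy database of short DNA-like strings.
database = ["ACGT", "ACG", "AGT"]
frequent = frequent_patterns(database, min_support=2)
print(sorted(closed_patterns(frequent).items()))   # [('ACG', 2), ('AGT', 2), ('AG', 3)]
print(sorted(maximal_patterns(frequent).items()))  # [('ACG', 2), ('AGT', 2)]
```

In this toy case the 11 frequent patterns reduce to three closed patterns and two maximal ones. The closed set keeps AG because its support (3) differs from that of its super-sequences, while the maximal set drops it, which is why maximal patterns are more compact but lose support information.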
Keywords
Multi Support, Sequential Pattern Mining, Maximal Pattern, High Dimensional Sequence