An Empirical Review on Data Feature Selection and Big Data Clustering
Authors: Venkata Rao Maddumala, R. Arunkumar and S. Arivalagan
Volume 7, No. 1, Special Issue: November 2018, pp. 96-100
With the rapid growth of Big Data, Big Data technologies have emerged as a key tool for data analysis, in which feature extraction and data clustering algorithms are considered essential components. However, limited research has addressed the challenges that span Big Data, so proposing a research agenda is important for clarifying the research challenges of clustering Big Data. Focusing on this specific aspect, clustering algorithms in Big Data, this paper reviews Big Data technologies related to feature selection and data clustering algorithms, together with their possible applications. Based on this review, the paper identifies a set of research challenges that can serve as a research agenda for Big Data clustering. This agenda aims to identify and bridge the research gaps between Big Data feature selection and clustering algorithms.
Big Data, Clustering, Feature Selection