Importance of MapReduce for Big Data Applications: A Survey

Authors: M. Durairaj and T. S. Poornappriya
Volume 7, No. 1, January-June 2018, pp. 112-118

Abstract

The MapReduce framework has attracted significant attention across a wide range of fields. It is now a practical model for data-intensive applications because of its simple programming interface, high elasticity, and fault tolerance. It is also capable of processing large volumes of data in distributed computing environments (DCE), and it has repeatedly proved applicable to a wide range of domains. MapReduce is a parallel programming model and an associated implementation introduced by Google. In the programming model, a user specifies the computation with two functions, Map and Reduce. The underlying MapReduce library automatically parallelizes the computation and handles complicated issues such as data distribution, load balancing, and fault tolerance. Large data sets spread across many machines must be processed in parallel, so the framework moves the data and provides scheduling and fault tolerance. This paper presents a literature survey of MapReduce programming in different domains. A research direction is identified on the basis of the literature review.
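As an illustration of the two-function model described in the abstract, the following is a minimal single-process sketch of a word count job. It is not taken from the paper: the names `map_fn`, `shuffle`, `reduce_fn`, and `run_job` are hypothetical, and the grouping step that a real framework would perform across machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict
from itertools import chain


def map_fn(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_fn(key, values):
    """Reduce: combine all values emitted for one key into a single result."""
    return key, sum(values)


def run_job(documents):
    """Run map, shuffle, and reduce sequentially over an in-memory list of records."""
    intermediate = chain.from_iterable(map_fn(doc) for doc in documents)
    grouped = shuffle(intermediate)
    return dict(reduce_fn(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data needs MapReduce", "MapReduce splits big jobs"]
    print(run_job(docs))
```

The user writes only `map_fn` and `reduce_fn`. In a distributed deployment the library would take over everything `run_job` and `shuffle` do here, including splitting the input, scheduling tasks, and re-running failed ones.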

Keywords

Big Data, Hadoop, Distributed File System, MapReduce Programming, Cloud Computing

Asian Journal of Computer Science and Technology is a peer-reviewed international journal that publishes high-quality scientific articles (both theory and practice) and research papers covering all aspects of future computer and Information Technology areas. Topics include, but are not limited to:

Foundations of High-performance Computing

Theory of algorithms and computability

Parallel & distributed computing

Computer networks

Neural networks

LAN/WAN/MAN

Database theory & practice

Mobile Computing for e-Commerce

Future Internet architecture

Protocols and services

Mobile and ubiquitous networks

Green networking

Internet content search

Opportunistic networking

Network applications

Network scaling and limits

Artificial Intelligence

Pattern/Image Recognition

Communication Network

Information Security

Knowledge Management

Management Information systems

Multimedia communications

Operations research

Optical networks

Software Engineering

Virtual reality

Web Technologies

Wireless technology

© 2015 The Research Publication. All rights reserved.
