On the optimization of Hadoop MapReduce default job scheduling through dynamic job prioritization

Peyravi, Narges; Moeini, Ali

doi:10.22059/jac.2020.79266

Document Type : Research Paper

Authors

Narges Peyravi ¹
Ali Moeini ²

¹ Department of Computer Engineering and Information Technology, Faculty of Engineering, University of Qom, Qom, Iran

² Department of Algorithms and Computation, School of Engineering Science, College of Engineering, University of Tehran

Abstract

Apache Hadoop MapReduce is one of the most popular frameworks for big data processing. The default Hadoop scheduler uses a queue system, but it does not assign any specific priority to the jobs in the MapReduce programming model. In this paper, a new dynamic priority score is developed to improve the performance of the default Hadoop MapReduce scheduler. The score is computed from factors such as estimated job runtime, input data size, waiting time, and the length (busyness) of the waiting queue. Compared to the default Hadoop scheduler, the proposed scheduling method based on this score not only improves CPU and memory performance, but also reduces waiting time and average turnaround time by approximately $45\%$ and $40\%$, respectively.

Journal of Algorithms and Computation

Volume 52, Issue 2 - Serial Number 2
December 2020
Pages 109-126
