Document Type : Research Paper

Authors

Department of Computer Science, Shahid Bahonar University of Kerman, Kerman, Iran

Abstract

A data-intensive computing platform, encountered in some grid and cloud computing applications, includes numerous tasks that process, transfer, or analyze large data files. In such environments, a large number of geographically distributed users need access to these large data sets. Data management is one of the main challenges of distributed computing environments, since data plays a central role in them. Dynamic data replication techniques have been widely applied to improve data access and availability. To design an appropriate data replication algorithm, four important problems must be solved: 1) which file should be replicated; 2) how many new replicas should be created; 3) where the new replicas should be placed; and 4) which replica should be deleted to make room for new copies. In this paper, we focus on the replica replacement issue, which makes a significant difference in the efficiency of a replication algorithm. We survey replica replacement approaches developed for both grid and cloud environments between 2004 and 2018. The review examines the replica replacement problem from a technological perspective, and it differs significantly from previous reviews in its comprehensiveness and integrated discussion. We present the parameters involved in the replacement process and summarize the key points of recent algorithms in a table covering all of those factors. We also report open issues and new challenges in the area.

Keywords