Purpose

Data replication is one technique for improving the performance of distributed systems: new replicas are created to improve accessibility and fault tolerance and to lower the cost of accessing data. In this paper, the authors propose a community-based solution for managing data replication, based on a graph model of the communication latency between computing and storage nodes. Communities are clusters of nodes within which communication latency is minimal. The purpose of this study is to use this method to minimize the latency and access cost of the data.

Design/methodology/approach

This paper used the Louvain algorithm to find the best communities. In the proposed algorithm, whenever a node requested a file, the cost of accessing a copy located outside the requesting node's community was calculated and accumulated. When the accumulated cost exceeded a specified threshold, a new replica of the file was created in the requester's community. In addition, the number of replicas of each file is limited to prevent the system from creating useless and redundant data.
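The threshold rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `CommunityReplicaManager`, its parameters, the toy latency table, the choice to place the new replica on the requesting node and the reset of the accumulated cost after replication are all assumptions made for illustration. The communities are taken as given (for example, the output of a Louvain run) rather than computed here.

```python
from collections import defaultdict


class CommunityReplicaManager:
    """Toy replica manager keeping one accumulated cost per (file, community)."""

    def __init__(self, node_community, latency, threshold, max_replicas):
        self.node_community = node_community   # node -> community id (assumed precomputed)
        self.latency = latency                 # (node_a, node_b) -> latency, symmetric
        self.threshold = threshold             # accumulated cost that triggers replication
        self.max_replicas = max_replicas       # cap on copies per file
        self.replicas = defaultdict(set)       # file -> nodes holding a copy
        self.accumulated = defaultdict(float)  # (file, community) -> remote access cost

    def _cost(self, a, b):
        # Access cost is modelled as pairwise latency; a local access costs nothing.
        if a == b:
            return 0.0
        return self.latency.get((a, b), self.latency.get((b, a)))

    def add_file(self, file_id, node):
        self.replicas[file_id].add(node)

    def request(self, file_id, node):
        """Serve a request from the nearest copy and replicate if the threshold is passed."""
        holders = self.replicas[file_id]
        nearest = min(holders, key=lambda h: self._cost(node, h))
        cost = self._cost(node, nearest)
        community = self.node_community[node]
        if self.node_community[nearest] == community:
            return cost  # served inside the community, nothing is accumulated
        key = (file_id, community)
        self.accumulated[key] += cost
        if self.accumulated[key] > self.threshold and len(holders) < self.max_replicas:
            holders.add(node)            # assumption: replica goes on the requesting node
            self.accumulated[key] = 0.0  # assumption: counter restarts after replication
        return cost


# Two communities, {a1, a2} and {b1, b2}, with low latency inside and high latency across.
node_community = {"a1": 0, "a2": 0, "b1": 1, "b2": 1}
latency = {
    ("a1", "a2"): 1, ("b1", "b2"): 1,
    ("a1", "b1"): 10, ("a1", "b2"): 10, ("a2", "b1"): 10, ("a2", "b2"): 10,
}
mgr = CommunityReplicaManager(node_community, latency, threshold=25, max_replicas=2)
mgr.add_file("f", "a1")
costs = [mgr.request("f", n) for n in ("b1", "b2", "b1", "b2")]
print(costs, sorted(mgr.replicas["f"]))
```

In this toy run the first three requests from community 1 are served remotely; the third pushes the accumulated cost past the threshold, so a replica is created there and the fourth request is served within the community at a much lower cost.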

Findings

To evaluate the method, four metrics were introduced and measured: communication latency, response time, data access cost and data redundancy. The results indicated acceptable improvement in all four.

Originality/value

To date, this is the first study that aims to manage replicas using community detection algorithms. It opens many opportunities for further studies in this area.

Licensed re-use rights only