Title

Remote data integrity checking with server-side repair

Document Type

Article

Publication Date

8-24-2017

Abstract

Distributed storage systems store data redundantly at multiple servers that are geographically spread throughout the world. This basic approach would be sufficient for handling server failure due to natural faults, because when one server fails, data from healthy servers can be used to restore the desired redundancy level. However, in a setting where servers are untrusted and can behave maliciously, data redundancy must be used in tandem with Remote Data Checking (RDC) to ensure that the redundancy level of the storage system is maintained over time. All previous RDC schemes for distributed systems impose a heavy burden on the data owner (client) during data maintenance: to repair data at a faulty server, the data owner must first download a large amount of data, re-generate the data to be stored at a new server, and then upload this data to a new healthy server. We explore a new concept, server-side repair, in which the servers are responsible for repairing the corruption, whereas the client acts as a lightweight coordinator during repair. We propose two novel RDC schemes for replication-based distributed storage systems, RDC-SR and ERDC-SR, which enable server-side repair (thus taking advantage of the premium connections available between the data centers of a cloud storage provider, or CSP) and minimize the load on the client side. Although both schemes achieve a similar objective, RDC-SR assumes that the computational power of the CSP will not grow over time, whereas ERDC-SR relaxes this assumption and considers a CSP whose computational power can increase over time. Our guidelines on choosing the parameters of these schemes provide insights into their practical usage and also reveal that, although ERDC-SR can handle more powerful adversaries, it also imposes a minimum file size. Finally, we evaluate the performance of the two schemes. For RDC-SR, we build a prototype on Amazon AWS; the experimental results support its effectiveness and validate the practicality of this new approach. For ERDC-SR, our performance analysis shows that the scheme is an order of magnitude more efficient than a simple extension of RDC-SR to defend against the stronger adversarial model.

Publisher's Statement

Publisher's version of record: http://dx.doi.org/10.3233/JCS-16868

Publication Title

Journal of Computer Security

First Page

537

Last Page

584