Sciweavers

FAST
2010

I/O Deduplication: Utilizing Content Similarity to Improve I/O Performance

Duplication of data in storage systems is becoming increasingly common. We introduce I/O Deduplication, a storage optimization that uses content similarity to improve I/O performance by eliminating I/O operations and reducing mechanical delays during I/O operations. I/O Deduplication consists of three main techniques: content-based caching, dynamic replica retrieval, and selective duplication. Each technique is motivated by our observations of I/O workload traces obtained from actively used production storage systems, all of which revealed surprisingly high levels of content similarity in both stored and accessed data. Evaluation of a prototype implementation using these workloads showed an overall improvement in disk I/O performance of 28-47%. A further breakdown showed that each of the three techniques contributed significantly to the overall improvement.
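To make the first of the three techniques concrete, here is a minimal Python sketch of a content-based read cache. It is an illustration under stated assumptions, not the authors' implementation: the class name `ContentAddressedCache`, the `read_block` callable, the use of SHA-256, and the LRU eviction policy are all hypothetical choices made for this example. The idea it shows is that cached blocks are keyed by a digest of their content rather than by sector number, so sectors holding identical data share one cache entry.

```python
import hashlib
from collections import OrderedDict


class ContentAddressedCache:
    """Toy read cache keyed by content digest instead of sector number.

    Hypothetical sketch: names, hash choice, and LRU policy are assumptions.
    """

    def __init__(self, capacity, read_block):
        self.capacity = capacity        # max number of distinct block contents held
        self.read_block = read_block    # assumed callable: sector -> bytes (the "disk")
        self.sector_to_digest = {}      # sector -> digest of the content last seen there
        self.blocks = OrderedDict()     # digest -> bytes, ordered from LRU to MRU
        self.disk_reads = 0             # number of reads that went to the disk

    def read(self, sector):
        digest = self.sector_to_digest.get(sector)
        if digest is not None and digest in self.blocks:
            # Hit: the content is cached, possibly because a *different*
            # sector with identical content was read earlier.
            self.blocks.move_to_end(digest)
            return self.blocks[digest]

        # Miss: fetch from disk, then remember which content lives at this sector.
        data = self.read_block(sector)
        self.disk_reads += 1
        digest = hashlib.sha256(data).digest()
        self.sector_to_digest[sector] = digest
        self.blocks[digest] = data
        self.blocks.move_to_end(digest)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used content
        return data


# Sectors 0 and 2 hold identical content; sector 1 differs.
disk = {0: b"A" * 512, 1: b"B" * 512, 2: b"A" * 512}
cache = ContentAddressedCache(capacity=1, read_block=disk.__getitem__)
for sector in (0, 2, 0, 2):
    cache.read(sector)
```

With room for only one block, a cache keyed by sector number would go to disk on all four reads in this access pattern. The content-keyed sketch goes to disk twice (once per sector, to learn its digest) and serves the remaining two reads from a single shared entry. A real system would also have to update the sector-to-digest map on every write so that the cache never returns stale data; that path is omitted here.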
Ricardo Koller, Raju Rangaswami
Added: 02 Oct 2010
Updated: 02 Oct 2010
Type: Conference
Year: 2010
Where: FAST
Authors: Ricardo Koller, Raju Rangaswami