An Approximation to the Greedy Algorithm for Differential Compression of Very Large Files

We present a new differential compression algorithm that combines the hash value techniques and suffix array techniques of previous work. Differential compression refers to encoding a file (a version file) as a set of changes with respect to another file (a reference file). Previous differential compression algorithms can be shown empirically to run in linear time, but they share a drawback: they do not find the best matches for every offset of the version file. Our algorithm finds the best matches for every offset of the version file, with respect to a certain granularity (or block size) and above a certain length threshold. It has two variations, depending on how the block size is chosen. If we keep the block size fixed, we show that the compression performance of our algorithm is similar to that of the greedy algorithm, without the expensive space and time requirements. If we vary the block size linearly with the reference file size, we show that our algorithm can run in...
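
To make the terms above concrete, the following is a minimal sketch of block-based differential encoding. It is not the algorithm from the paper, which combines hash values with suffix arrays. It is a simplified illustration that assumes a plain dictionary index of block-aligned reference blocks and a forward-only match extension. The function names (`build_index`, `encode_delta`, `decode_delta`), the `("copy", ...)` / `("add", ...)` operation format, and the default block size of 4 are all hypothetical choices made for this example.

```python
from collections import defaultdict


def build_index(reference: bytes, block_size: int) -> dict:
    """Map each block-aligned block of the reference file to its offsets."""
    index = defaultdict(list)
    for off in range(0, len(reference) - block_size + 1, block_size):
        index[reference[off:off + block_size]].append(off)
    return index


def encode_delta(reference: bytes, version: bytes, block_size: int = 4) -> list:
    """Encode the version file as copy/add operations against the reference.

    Assumption: a match must start on a block boundary of the reference and
    is only extended forward, so this is an illustration, not an optimal coder.
    """
    index = build_index(reference, block_size)
    ops, literal, pos = [], bytearray(), 0
    while pos < len(version):
        best_off, best_len = -1, 0
        # Try every reference block equal to the block at this version offset.
        for off in index.get(version[pos:pos + block_size], ()):
            length = block_size
            while (off + length < len(reference)
                   and pos + length < len(version)
                   and reference[off + length] == version[pos + length]):
                length += 1
            if length > best_len:
                best_off, best_len = off, length
        if best_len:
            if literal:  # flush pending unmatched bytes first
                ops.append(("add", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", best_off, best_len))
            pos += best_len
        else:
            literal.append(version[pos])
            pos += 1
    if literal:
        ops.append(("add", bytes(literal)))
    return ops


def decode_delta(reference: bytes, ops: list) -> bytes:
    """Rebuild the version file from the reference file and the operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out += reference[off:off + length]
        else:
            out += op[1]
    return bytes(out)


reference = b"the quick brown fox jumps over the lazy dog"
version = b"the quick red fox jumps over the lazy cat"
delta = encode_delta(reference, version)
restored = decode_delta(reference, delta)
```

In this sketch a smaller block size can find more matches but produces a larger index, while a larger block size shrinks the index and misses short matches. That is the granularity trade-off the abstract refers to.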

Ramesh C. Agarwal, Suchitra Amalapurapu, Shaili Jain

Block Size | Computer Graphics | DCC 2004 | Greedy Algorithm | Reference File Size

Post Info

Added	25 Dec 2009
Updated	25 Dec 2009
Type	Conference
Year	2004
Where	DCC
Authors	Ramesh C. Agarwal, Suchitra Amalapurapu, Shaili Jain
