Benchmarking file and storage systems on large file-system images is important, but difficult and often infeasible. Running benchmarks on such large disk setups is a frequent source of frustration for file-system evaluators; the scale alone acts as a strong deterrent against using larger, albeit realistic, benchmarks. To address this problem, we develop David: a system that makes it practical to run large benchmarks using the modest storage or memory capacity readily available on most computers. David creates a “compressed” version of the original file-system image by omitting all file data and laying out metadata more efficiently; an online storage model determines the runtime of the benchmark workload on the original uncompressed image. David works under any file system, as demonstrated in this paper with ext3 and btrfs. We find that David reduces storage requirements by orders of magnitude; David is able to emulate a 1 TB target workload using only an 80 ...
Nitin Agrawal, Leo Arulraj, Andrea C. Arpaci-Dusse