Sciweavers

PVLDB
2008

Online maintenance of very large random samples on flash storage

Recent advances in flash media have made it an attractive alternative for data storage in a wide spectrum of computing devices, such as embedded sensors, mobile phones, PDAs, laptops, and even servers. However, flash media has many unique characteristics that make existing data management/analytics algorithms designed for magnetic disks perform poorly with flash storage. For example, while random (page) reads are as fast as sequential reads, random (page) writes and in-place data updates are orders of magnitude slower than sequential writes. In this paper, we consider an important fundamental problem that would seem to be particularly challenging for flash storage: efficiently maintaining a very large (100 MB or more) random sample of a data stream (e.g., of sensor readings). First, we show that previous algorithms such as reservoir sampling and geometric file are not readily adapted to flash. Second, we propose B-FILE, an energy-efficient abstraction for flash media to store self-expi...
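For illustration only, classic reservoir sampling, the baseline the abstract names and not the B-FILE technique the paper proposes, can be sketched as follows. The function name `reservoir_sample`, the `overwrites` counter, and the in-memory list standing in for storage are assumptions made for this sketch. Each replacement overwrites a randomly chosen slot in place, which is the kind of random write the abstract describes as slow on flash.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of up to k items from a stream (Algorithm R).

    Returns the sample and the number of in-place overwrites performed.
    The list `reservoir` is a hypothetical stand-in for on-flash storage.
    """
    rng = rng or random.Random()
    reservoir = []
    overwrites = 0
    for i, item in enumerate(stream):
        if i < k:
            # Fill phase: the first k items are appended sequentially.
            reservoir.append(item)
        else:
            # Item i (0-based) is kept with probability k / (i + 1).
            j = rng.randint(0, i)  # uniform over [0, i], both ends inclusive
            if j < k:
                # Random in-place overwrite of an existing slot.
                reservoir[j] = item
                overwrites += 1
    return reservoir, overwrites


if __name__ == "__main__":
    sample, overwrites = reservoir_sample(range(100_000), 1_000, random.Random(42))
    print(len(sample), "items kept;", overwrites, "random in-place overwrites")
```

In this sketch every overwrite lands at an unpredictable position, so a direct port to flash would issue one random write per replacement.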
Suman Nath, Phillip B. Gibbons
Added 28 Dec 2010
Updated 28 Dec 2010
Type Journal
Year 2008
Where PVLDB
Authors Suman Nath, Phillip B. Gibbons