Network monitoring systems that support data archival and after-the-fact (retrospective) queries are useful for a multitude of purposes, such as anomaly detection and network and security forensics. Data archival for such systems, however, is complicated by (a) data arrival rate, which may be hundreds of thousands of packets per second per link, and (b) the need for online indexing of this data to support retrospective queries. At these data rates, both common database index structures and general-purpose file systems perform poorly. This paper describes Hyperion, a system for archival, indexing, and on-line retrieval of high-volume data streams. We employ a write-optimized stream file system for high-speed storage of simultaneous data streams, and a novel use of signature file indexes in a distributed multi-level index. We implement Hyperion on commodity hardware and conduct a detailed evaluation using synthetic data and real network traces. Our streaming file system, StreamFS, i...
Peter Desnoyers, Prashant J. Shenoy