Storage Optimization for Large-Scale Distributed Stream Processing Systems



We consider storage in an extremely large-scale distributed computer system designed for stream processing applications. In such systems, incoming data and intermediate results may need to be stored to enable future analyses. The quantity of such data would dominate even the largest storage system. Thus, a mechanism is needed to keep the most useful data. One recently introduced approach is to employ retention value functions, which effectively assign each data object a value that changes over time [5]. Storage space is then reclaimed automatically by deleting data of lowest current value. In such large systems, there will naturally be multiple file systems available, each with different properties. Choosing the right file system for a given incoming data stream presents a challenge. In this paper we provide a novel and effective scheme for optimizing the placement of data within a distributed storage subsystem employing retention value functions. The goal is to keep the data of hig...
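
The reclamation idea in the abstract (each object carries a value that changes over time, and space is freed by deleting the objects of lowest current value) can be illustrated with a minimal sketch. This is not the paper's scheme: the names StoredObject, linear_decay, and reclaim are hypothetical, the linear decay shape is an assumption chosen only for illustration, and the sketch covers a single store rather than placement across multiple file systems.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StoredObject:
    """Hypothetical stored item with its own retention value function."""
    name: str
    size: int                            # space consumed, in arbitrary units
    created: float                       # creation time, in arbitrary units
    retention: Callable[[float], float]  # maps age -> current value

    def value_at(self, now: float) -> float:
        # The object's value depends on how old it is at time `now`.
        return self.retention(now - self.created)


def linear_decay(initial: float, lifetime: float) -> Callable[[float], float]:
    """Assumed example shape: value falls linearly to zero over `lifetime`."""
    def value(age: float) -> float:
        return max(0.0, initial * (1.0 - age / lifetime))
    return value


def reclaim(objects: List[StoredObject], capacity: int,
            now: float) -> Tuple[List[StoredObject], List[StoredObject]]:
    """Delete lowest-current-value objects until the rest fit in `capacity`.

    Returns (kept, deleted). Kept objects are ordered by ascending value.
    """
    kept = sorted(objects, key=lambda o: o.value_at(now))
    used = sum(o.size for o in kept)
    deleted = []
    while kept and used > capacity:
        victim = kept.pop(0)  # lowest current value goes first
        used -= victim.size
        deleted.append(victim)
    return kept, deleted
```

Because values are evaluated at the moment of reclamation, an object that started with a high value can still be deleted before a newer object whose initial value was lower, which is the behavior the abstract attributes to time-varying retention values.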

Kirsten Hildrum, Fred Douglis, Joel L. Wolf, Phili


Distributed And Parallel Computing | Incoming Data | IPPS 2007 | Largest Storage System | Retention Value Functions |


» Applying Database Support for Large Scale Data Driven Science in Distributed Environments

» Fault-Tolerant Replication Management in Large-Scale Distributed Storage Systems

» Self Management for Large-Scale Distributed Systems: An Overview of the SELFMAN Project

» Active Storage for Large-Scale Data Mining and Multimedia

» Semantic Routing and Filtering for Large-Scale Video Streams Monitoring

» Historical data storage for large scale sensor networks

» Distributing the Kalman Filter for Large-Scale Systems

» REMO: Resource-Aware Application State Monitoring for Large-Scale Distributed Systems

Post Info

Added	03 Jun 2010
Updated	03 Jun 2010
Type	Conference
Year	2007
Where	IPPS
Authors	Kirsten Hildrum, Fred Douglis, Joel L. Wolf, Philip S. Yu, Lisa Fleischer, Akshay Katta
