Just-in-Time Analytics on Large File Systems

As file systems reach the petabytes scale, users and administrators are increasingly interested in acquiring high-level analytical information for file management and analysis. Two particularly important tasks are the processing of aggregate and top-k queries which, unfortunately, cannot be quickly answered by hierarchical file systems such as ext3 and NTFS. Existing pre-processing based solutions, e.g., file system crawling and index building, consume a significant amount of time and space (for generating and maintaining the indexes) which in many cases cannot be justified by the infrequent usage of such solutions. In this paper, we advocate that user interests can often be sufficiently satisfied by approximate (i.e., statistically accurate) answers. We develop Glance, a just-in-time sampling-based system which, after consuming a small number of disk accesses, is capable of producing extremely accurate answers for a broad class of aggregate and top-k queries over a file syste...
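
To make the idea of answering an aggregate query from a few sampled directory visits concrete, here is a minimal sketch of a random-descent estimator for the total file count. It is an illustration only and is not taken from the paper: the abstract does not describe Glance's algorithm in detail, and the in-memory tree layout (a dict maps to a directory, any other value to a file) and the function names `random_descent_count`, `estimate_file_count`, and `exact_file_count` are assumptions made for this example.

```python
import random


def random_descent_count(tree, rng):
    """Walk one random root-to-leaf path and return an estimate of the
    total number of files in the tree.

    Each directory on the path contributes (files in it) / (probability
    of having reached it), which makes the estimate unbiased.
    """
    estimate = 0.0
    prob = 1.0  # probability that a descent reaches the current directory
    node = tree
    while True:
        files = [v for v in node.values() if not isinstance(v, dict)]
        subdirs = [v for v in node.values() if isinstance(v, dict)]
        estimate += len(files) / prob
        if not subdirs:
            return estimate
        # Choose one subdirectory uniformly at random and descend into it.
        prob /= len(subdirs)
        node = rng.choice(subdirs)


def estimate_file_count(tree, descents, seed=0):
    """Average several independent descents to reduce the variance."""
    rng = random.Random(seed)
    total = sum(random_descent_count(tree, rng) for _ in range(descents))
    return total / descents


def exact_file_count(tree):
    """Full crawl, used here only to check the estimate."""
    return sum(
        exact_file_count(v) if isinstance(v, dict) else 1
        for v in tree.values()
    )


if __name__ == "__main__":
    # Hypothetical toy tree: dict values are directories, ints are file sizes.
    toy = {
        "a.txt": 10,
        "d1": {"b.txt": 5, "c.txt": 7},
        "d2": {"e.txt": 1, "d3": {"f.txt": 2, "g.txt": 3, "h.txt": 4}},
    }
    print("exact:", exact_file_count(toy))
    print("estimate:", estimate_file_count(toy, descents=1000))
```

Each descent touches only the directories along one path, so its cost grows with the depth of the tree rather than its size. The price is variance: on skewed trees many descents may be needed before the average settles near the exact count.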

H. Howie Huang, Nan Zhang 0004, Wei Wang, Gautam Das, Alexander S. Szalay

Disk Accesses | FAST 2011 | Justi | Operating System | Time Sampling |

» Performance Analysis of Resource Selection Schemes for a Large Scale Video-on-Demand System

» On-Line Analytical Processing on Large Databases Managed by Computational Grids

» Large-Scale Simulation of Replica Placement Algorithms for a Serverless Distributed File Sy...

» PreDatA preparatory data analytics on petascale machines

» Feasibility of a serverless distributed file system deployed on an existing set of desktop...

» Relational versus nonrelational database systems for data warehousing

» Denial-of-service resilience in peer-to-peer file sharing systems

» Optimal Scheduling of Peer-to-Peer File Dissemination

Post Info

Added	28 Aug 2011
Updated	28 Aug 2011
Type	Conference
Year	2011
Where	FAST
Authors	H. Howie Huang, Nan Zhang 0004, Wei Wang, Gautam Das, Alexander S. Szalay
