Log summarization and anomaly detection for troubleshooting distributed systems



Today’s system monitoring tools can detect system failures such as host failures, OS errors, and network partitions in near-real time. Unfortunately, the same cannot yet be said of the end-to-end distributed software stack. Any given action, for example, reliably transferring a directory of files, can involve a wide range of complex and interrelated actions across multiple pieces of software: checking user certificates and permissions, getting details for all files, performing third-party transfers, understanding retry policy decisions, etc. We present an infrastructure for troubleshooting complex middleware, a general-purpose technique for configurable log summarization, and an anomaly detection technique that works in near-real time on running Grid middleware. We present results gathered using this infrastructure from instrumented Grid middleware and applications running on the Emulab testbed. From these results, we analyze the effectiveness of several algori...

Dan Gunter, Brian Tierney, Aaron Brown, D. Martin Swany, John Bresnahan, Jennifer M. Schopf


Distributed And Parallel Computing | GRID 2007 | Grid Middleware | Instrumented Grid Middleware | Near-real Time


Post Info

Added	07 Jun 2010
Updated	07 Jun 2010
Type	Conference
Year	2007
Where	GRID
Authors	Dan Gunter, Brian Tierney, Aaron Brown, D. Martin Swany, John Bresnahan, Jennifer M. Schopf
