Large-scale distributed systems provide the backbone for numerous distributed applications and online services. These systems span a multitude of computing nodes located at d...
This paper reports results from a failure analysis (i.e., incorrect query construction) of 51,473 queries from 18,113 users of Excite, a major Web search engine. Given that many d...
Tracing is a dynamic analysis technique that continuously captures events of interest in a running program. The occurrence of a statement, the invocation of a function, and the trigg...
Log preprocessing, a process applied to the raw log before applying a predictive method, is of paramount importance to failure prediction and diagnosis. While existing filtering ...
Ziming Zheng, Zhiling Lan, Byung-Hoon Park, Al Gei...