Accurate fault detection and localization is essential to the efficient and economical operation of ISP networks. In addition, it affects the performance of Internet applicatio...
A significant fraction of software failures in large-scale Internet systems are cured by rebooting, even when the exact failure causes are unknown. However, rebooting can be expen...
George Candea, Shinichi Kawamoto, Yuichi Fujiki, G...
We investigate the performance of TCP under three representatives of packet scheduling algorithms at the router. Our main focus is to investigate how fair service can be...
Go Hasegawa, Takahiro Matsuo, Masayuki Murata, Hid...
We describe a novel application of using data mining and statistical learning methods to automatically monitor and detect abnormal execution traces from console logs in ...
Wei Xu, Ling Huang, Armando Fox, David Patterson, ...
Operator mistakes are a significant source of unavailability in modern Internet services. In this paper, we first characterize these mistakes by performing an extensive set of exp...