A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
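As a generic illustration of URL de-duplication (a common normalization baseline, not the method of the paper excerpted above), equivalent URLs can be collapsed by rewriting each one to a canonical form before comparison; the tracking-parameter list below is a hypothetical example:

    from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

    # Hypothetical list of query parameters that do not affect content.
    TRACKING_PARAMS = {"utm_source", "utm_medium", "sessionid"}

    def normalize(url: str) -> str:
        parts = urlsplit(url)
        # Lowercase scheme and host, drop the fragment, strip tracking
        # parameters, and sort the remaining query parameters so that
        # equivalent URLs map to a single canonical string.
        query = sorted(
            (k, v) for k, v in parse_qsl(parts.query)
            if k not in TRACKING_PARAMS
        )
        return urlunsplit((
            parts.scheme.lower(),
            parts.netloc.lower(),
            parts.path or "/",
            urlencode(query),
            "",
        ))

    def dedupe(urls):
        # Keep the first URL seen for each canonical form.
        seen, unique = set(), []
        for u in urls:
            key = normalize(u)
            if key not in seen:
                seen.add(key)
                unique.append(u)
        return unique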
While scalable data mining methods are expected to cope with massive Web data, handling evolving trends in noisy data in a continuous fashion, and without any unnecessary stopp...
The TaskTracer system allows knowledge workers to define a set of activities that characterize their desktop work. It then associates with each user-defined activity the set of ...
Jianqiang Shen, Jed Irvine, Xinlong Bao, Michael G...
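A minimal sketch of the association the abstract describes, assuming a simple in-memory mapping from each user-defined activity to the set of desktop resources touched while it is active; the class and method names here are hypothetical, not TaskTracer's actual API:

    from collections import defaultdict

    class ActivityTracker:
        """Hypothetical sketch of the activity-to-resource association
        described in the TaskTracer abstract (not its actual API)."""

        def __init__(self):
            self.current_activity = None
            # Each user-defined activity maps to the set of resources
            # (file paths, email IDs, URLs) touched while it was active.
            self.resources = defaultdict(set)

        def switch_activity(self, activity: str) -> None:
            # The user declares which activity they are working on.
            self.current_activity = activity

        def record_event(self, resource: str) -> None:
            # Every open/save/visit event is attributed to the
            # currently declared activity.
            if self.current_activity is not None:
                self.resources[self.current_activity].add(resource)

        def resources_for(self, activity: str) -> set:
            return self.resources[activity]

    t = ActivityTracker()
    t.switch_activity("Write grant proposal")
    t.record_event("/home/me/proposal.tex")
    t.record_event("https://nsf.gov/funding")
    print(t.resources_for("Write grant proposal"))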
We study the design of cryptographic primitives resilient to key-leakage attacks, where an attacker can repeatedly and adaptively learn information about the secret key, subject o...
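One common formalization of such a bounded-leakage model, given here as background rather than as this paper's exact definition: the attacker adaptively chooses leakage functions f_i and learns f_i(sk), with the total number of leaked bits capped by a parameter λ:

    % Bounded-leakage model (background sketch). The adversary A
    % adaptively picks leakage functions and receives their values
    % on the secret key sk, subject to a total leakage bound lambda.
    \[
      \text{for } i = 1, \dots, t:\quad
      \mathcal{A} \text{ chooses } f_i \colon \{0,1\}^{|sk|} \to \{0,1\}^{*}
      \text{ and learns } f_i(sk),
      \qquad \text{subject to } \sum_{i=1}^{t} |f_i(sk)| \le \lambda .
    \]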
Experimental performance studies on computer systems, including Grids, require a deep understanding of their workload characteristics. The need arises from two important and closel...