The detection of repeated subsequences, time series motifs, is a problem which has been shown to have great utility for several higher-level data mining algorithms, including clas...
The information resources on the Web are vast, but much of the Web is based on a browsing paradigm that requires someone to actively seek information. Instead, one would like to h...
Disk-based deduplication storage has emerged as the new-generation storage system for enterprise data protection to replace tape libraries. Deduplication removes redundant data se...
Motivated by increased concern over energy consumption in modern data centers, we propose a new, distributed computing platform called Nano Data Centers (NaDa). NaDa uses ISP-cont...
Vytautas Valancius, Nikolaos Laoutaris, Laurent Ma...
Efficiently detecting outliers or anomalies is an important problem in many areas of science, medicine and information technology. Applications range from data cleaning to clinica...
Matthew Eric Otey, Amol Ghoting, Srinivasan Partha...