We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...
The aggregation and comparison of behavioral patterns on the WWW represent a tremendous opportunity for understanding past behaviors and predicting future behaviors. In this paper...
Eytan Adar, Daniel S. Weld, Brian N. Bershad, Stev...
A fundamental problem in data management is to draw and maintain a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With la...
Graham Cormode, S. Muthukrishnan, Ke Yi, Qin Zhang
This paper presents an algorithm for discovering conjunction rules with high reliability from data sets. The discovery of conjunction rules, each of which is a restricted form of ...
The Passive Aggressive framework [1] is a principled approach to online linear classification that advocates minimal weight updates i.e., the least required so that the current tr...