Fast similarity search at large scale is of great importance to many Information Retrieval (IR) applications. A promising way to accelerate similarity search is sem...
We present a new approach to semi-supervised anomaly detection. Given a set of training examples believed to come from the same distribution or class, the task is to learn a model ...
Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated c...
Erik Linstead, Paul Rigor, Sushil Krishna Bajracha...
We present Schism, a novel workload-aware approach for database partitioning and replication designed to improve scalability of shared-nothing distributed databases. Because distri...
Carlo Curino, Yang Zhang, Evan P. C. Jones, Samuel...