Sampling is crucial for controlling resource consumption by internet traffic flow measurements. Routers use Packet Sampled NetFlow [9], and completed flow records are sampled in...
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
A real-time AI system in the real world needs to monitor an immense volume of data. To do this, the system must filter out much of the incoming data. However, it must remain re ...
We consider the problem of efficiently sampling Web search engine query results. In turn, using a small random sample instead of the full set of results leads to efficient approxi...
Aris Anagnostopoulos, Andrei Z. Broder, David Carm...
We address the problem of computing approximate answers to continuous sliding-window joins over data streams when the available memory may be insufficient to keep the entire join...