Abstract. This text is an informal review of several randomized algorithms that have appeared over the past two decades and have proved instrumental in efficiently extracting quantitative characteristics of very large data sets. The algorithms are by nature probabilistic and based on hashing. They exploit properties of simple discrete probabilistic models, and their design is tightly coupled with their analysis, itself often founded on methods from analytic combinatorics. Singularly efficient solutions have been found that defy information-theoretic lower bounds applicable to deterministic algorithms. Characteristics like the total number of elements, cardinality (the number of distinct elements), and frequency moments, as well as unbiased samples, can be gathered with little loss of information and only a small probability of failure. The algorithms are applicable to traffic monitoring in networks, to database query optimization, and to some of the basic tasks of data mining. They apply to...