
ASIAN
2004
Springer

Counting by Coin Tossings

Abstract. This text is an informal review of several randomized algorithms that have appeared over the past two decades and have proved instrumental in efficiently extracting quantitative characteristics of very large data sets. The algorithms are by nature probabilistic and based on hashing. They exploit properties of simple discrete probabilistic models, and their design is tightly coupled with their analysis, itself often founded on methods from analytic combinatorics. Singularly efficient solutions have been found that defy information-theoretic lower bounds applicable to deterministic algorithms. Characteristics like the total number of elements, cardinality (the number of distinct elements), frequency moments, as well as unbiased samples can be gathered with little loss of information and only a small probability of failure. The algorithms are applicable to traffic monitoring in networks, to database query optimization, and to some of the basic tasks of data mining. They apply to...
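To make the idea of hash-based cardinality estimation concrete, here is a minimal Python sketch of a k-minimum-values estimator. It is one well-known technique of this general kind, chosen here for brevity; it is not claimed to be one of the algorithms the review presents. The function names `hash01` and `estimate_cardinality`, the use of SHA-256, and the default `k = 1024` are all illustrative assumptions.

```python
import hashlib
import heapq


def hash01(item: str) -> float:
    """Map an item to a pseudo-uniform value in [0, 1).

    Assumption: the first 64 bits of SHA-256 behave like a uniform hash.
    """
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_cardinality(stream, k: int = 1024) -> float:
    """Estimate the number of distinct items, storing only k hash values."""
    heap = []        # max-heap (negated values) of the k smallest hashes seen
    members = set()  # the same hashes, for duplicate detection
    for item in stream:
        h = hash01(item)
        if h in members:
            continue  # repeated item: same hash, no new information
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # h is smaller than the current k-th smallest: swap it in
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        # Fewer than k distinct hashes: the count is exact (barring collisions)
        return float(len(heap))
    kth_smallest = -heap[0]
    # If n distinct hashes are uniform on [0, 1), the k-th smallest is
    # near k / n, so (k - 1) / kth_smallest estimates n.
    return (k - 1) / kth_smallest


if __name__ == "__main__":
    # 50,000 records containing 10,000 distinct values
    stream = (f"item-{i % 10000}" for i in range(50000))
    print(round(estimate_cardinality(stream)))
```

Memory use depends only on `k`, not on the length of the stream, and the relative error shrinks roughly as 1/sqrt(k). That is the kind of trade between space and accuracy the abstract alludes to.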
Philippe Flajolet
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where ASIAN
Authors Philippe Flajolet