Sciweavers

» Lower Bounds on the Size of Test Data Sets
COLT
1999
Springer
14 years 12 hours ago
Beating the Hold-Out: Bounds for K-fold and Progressive Cross-Validation
The empirical error on a test set, the hold-out estimate, often is a more reliable estimate of generalization error than the observed error on the training set, the training estim...
Avrim Blum, Adam Kalai, John Langford
SIAMCOMP
2000
13 years 7 months ago
Dual-Bounded Generating Problems: Partial and Multiple Transversals of a Hypergraph
We consider two natural generalizations of the notion of transversal to a finite hypergraph, arising in data-mining and machine learning, the so-called multiple and parti...
Endre Boros, Vladimir Gurvich, Leonid Khachiyan, K...
STACS
2009
Springer
14 years 2 months ago
Error-Correcting Data Structures
We study data structures in the presence of adversarial noise. We want to encode a given object in a succinct data structure that enables us to efficiently answer specific queries...
Ronald de Wolf
CIKM
2005
Springer
14 years 1 month ago
Towards Estimating the Number of Distinct Value Combinations for a Set of Attributes
Accurately and efficiently estimating the number of distinct values for some attribute(s) or sets of attributes in a data set is of critical importance to many database operation...
Xiaohui Yu, Calisto Zuzarte, Kenneth C. Sevcik
SIGMOD
2003
ACM
14 years 7 months ago
Processing Set Expressions over Continuous Update Streams
There is growing interest in algorithms for processing and querying continuous data streams (i.e., data that is seen only once in a fixed order) with limited memory resources. In ...
Sumit Ganguly, Minos N. Garofalakis, Rajeev Rastog...