Sciweavers

801 search results
» The Inefficiency of Batch Training for Large Training Sets
SAC
2004
ACM
14 years 2 months ago
Forest trees for on-line data
This paper presents a hybrid adaptive system for induction of a forest of trees from data streams. The Ultra Fast Forest Tree system (UFFT) is an incremental algorithm, with consta...
João Gama, Pedro Medas, Ricardo Rocha
CIKM
2003
Springer
14 years 2 months ago
Online duplicate document detection: signature reliability in a dynamic retrieval environment
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. Few users wish to retri...
Jack G. Conrad, Xi S. Guo, Cindy P. Schriber
SDM
2010
SIAM
13 years 10 months ago
Predictive Modeling with Heterogeneous Sources
Lack of labeled training examples is a common problem for many applications. At the same time, there is usually an abundance of labeled data from related tasks. But they have diff...
Xiaoxiao Shi, Qi Liu, Wei Fan, Qiang Yang, Philip ...
NIPS
2000
13 years 10 months ago
A Neural Probabilistic Language Model
A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dim...
Yoshua Bengio, Réjean Ducharme, Pascal Vinc...
CVPR
1998
IEEE
14 years 11 months ago
A Methodology for Deriving Probabilistic Correctness Measures from Recognizers
This paper describes the derivation of probability of correctness from scores assigned by most recognizers. Motivation for this research is three-fold: (i) probability values can be...
Djamel Bouchaffra, Venu Govindaraju, Sargur N. Sri...