Sciweavers

1403 search results - page 44 / 281
» Set cover algorithms for very large datasets
CEAS
2006
Springer
14 years 7 days ago
Fast Uncertainty Sampling for Labeling Large E-mail Corpora
One of the biggest challenges in building effective anti-spam solutions is designing systems to defend against the ever-evolving bag of tricks spammers use to defeat them. Because ...
Richard Segal, Ted Markowitz, William Arnold
ICIP
2009
IEEE
13 years 6 months ago
Automatic discovery of image families: Global vs. local features
Gathering a large collection of images has been made quite easy by social and image sharing websites, e.g. flickr.com. However, using such collections faces the problem that they ...
Mohamed Aly, Peter Welinder, Mario E. Munich, Piet...
ICDCS
2012
IEEE
11 years 11 months ago
MOVE: A Large Scale Keyword-Based Content Filtering and Dissemination System
The Web 2.0 era is characterized by the emergence of a very large amount of live content. A real-time and fine-grained content filtering approach can precisely keep users up-to-...
Weixiong Rao, Lei Chen 0002, Pan Hui, Sasu Tarkoma
ACL
2008
13 years 10 months ago
Smoothing a Tera-word Language Model
Frequency counts from very large corpora, such as the Web 1T dataset, have recently become available for language modeling. Omission of low frequency n-gram counts is a practical ...
Deniz Yuret
KDD
2006
ACM
14 years 9 months ago
Mining distance-based outliers from large databases in any metric space
Let R be a set of objects. An object o ∈ R is an outlier if there exist fewer than k objects in R whose distances to o are at most r. The values of k, r, and the distance metric ar...
Yufei Tao, Xiaokui Xiao, Shuigeng Zhou
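The outlier definition quoted in the abstract above can be illustrated with a brute-force check. This is only a minimal sketch of the definition, not the algorithm from the paper: the function name distance_based_outliers and the sample data are hypothetical, the scan is quadratic in the number of objects, and it assumes that o itself is not counted among its own neighbours (the truncated abstract does not say either way).

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def distance_based_outliers(
    objects: Sequence[T],
    dist: Callable[[T, T], float],
    k: int,
    r: float,
) -> List[T]:
    """Return every object with fewer than k other objects within distance r.

    Assumption: an object is not counted as its own neighbour.
    """
    outliers = []
    for i, o in enumerate(objects):
        neighbours = 0
        for j, p in enumerate(objects):
            if i != j and dist(o, p) <= r:
                neighbours += 1
                if neighbours >= k:
                    # Enough neighbours found; o cannot be an outlier.
                    break
        if neighbours < k:
            outliers.append(o)
    return outliers


# Hypothetical one-dimensional data with absolute difference as the metric.
points = [0.0, 0.1, 0.2, 0.3, 5.0]
result = distance_based_outliers(points, lambda a, b: abs(a - b), k=2, r=0.5)
```

Any function that satisfies the metric axioms can be passed as dist, which matches the "any metric space" setting in the title; the early break only avoids counting past k and does not change the result.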