Sciweavers

313 search results for "Using Recon for Data Cleaning" (page 55 of 63)
PODS
2010
ACM
14 years 16 days ago
An optimal algorithm for the distinct elements problem
We give the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem begun by Flajolet...
Daniel M. Kane, Jelani Nelson, David P. Woodruff
ICFP
2006
ACM
14 years 7 months ago
Algebraic fusion of functions with an accumulating parameter and its improvement
We present a unifying solution to the problem of fusion of functions, where both the producer function and the consumer function have one accumulating parameter. The key idea in t...
Shin-ya Katsumata, Susumu Nishimura
MM
2009
ACM
14 years 2 months ago
Inferring semantic concepts from community-contributed images and noisy tags
In this paper, we exploit the problem of inferring images’ semantic concepts from community-contributed images and their associated noisy tags. To infer the concepts more accura...
Jinhui Tang, Shuicheng Yan, Richang Hong, Guo-Jun ...
SIGMOD
2010
ACM
14 years 8 days ago
Probabilistic string similarity joins
Edit distance based string similarity join is a fundamental operator in string databases. Increasingly, many applications in data cleaning, data integration, and scientific compu...
Jeffrey Jestes, Feifei Li, Zhepeng Yan, Ke Yi
CVPR
2009
IEEE
13 years 11 months ago
ImageNet: A large-scale hierarchical image database
The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images a...
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai...