Sciweavers

Search results for "Email data cleaning" (775 results)
LREC
2008
A Lightweight and Efficient Tool for Cleaning Web Pages
Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...
Stefan Evert
ICDE
2006
IEEE
Clean Answers over Dirty Databases: A Probabilistic Approach
The detection of duplicate tuples, corresponding to the same real-world entity, is an important task in data integration and cleaning. While many techniques exist to identify such...
Ariel Fuxman, Periklis Andritsos, Renée J. ...
DCC
2003
IEEE
PPM Model Cleaning
The Prediction by Partial Matching (PPM) algorithm uses a cumulative frequency count of input symbols in different contexts to estimate their probability distribution. Excellent c...
Milenko Drinic, Darko Kirovski, Miodrag Potkonjak
SDM
2007
SIAM
Preventing Information Leaks in Email
The widespread use of email has raised serious privacy concerns. A critical issue is how to prevent email information leaks, i.e., when a message is accidentally addressed to non-...
Vitor R. Carvalho, William W. Cohen
CHI
2010
ACM
Cleanly: trashducation urban system
Half the world’s population is expected to live in urban areas by 2020. The high human density and changes in people’s consumption habits result in an ever-increasing amount of...
Inbal Reif, Florian Alt, Juan David Hincapié...