Sciweavers

3280 search results - page 473 / 656
» MiTAP for real users, real data, real problems
WWW
2005
ACM
14 years 10 months ago
Duplicate detection in click streams
We consider the problem of finding duplicates in data streams. Duplicate detection in data streams is utilized in various applications including fraud detection. We develop a solu...
Ahmed Metwally, Divyakant Agrawal, Amr El Abbadi
KDD
2008
ACM
120 views, Data Mining
14 years 9 months ago
Entity categorization over large document collections
Extracting entities (such as people, movies) from documents and identifying the categories (such as painter, writer) they belong to enable structured querying and data analysis ov...
Arnd Christian König, Rares Vernica, Venkates...
KDD
2002
ACM
144 views, Data Mining
14 years 9 months ago
Efficiently mining frequent trees in a forest
Mining frequent trees is very useful in domains like bioinformatics, web mining, mining semi-structured data, and so on. We formulate the problem of mining (embedded) subtrees in ...
Mohammed Javeed Zaki
ICDM
2009
IEEE
111 views, Data Mining
14 years 4 months ago
A Game Theoretical Model for Adversarial Learning
It is now widely accepted that in many situations where classifiers are deployed, adversaries deliberately manipulate data in order to reduce the classifier’s accura...
Wei Liu, Sanjay Chawla
GFKL
2007
Springer
139 views, Data Mining
14 years 3 months ago
The Noise Component in Model-based Cluster Analysis
The so-called noise component was introduced by Banfield and Raftery (1993) to improve the robustness of cluster analysis based on the normal mixture model. The idea is to ad...
Christian Hennig, Pietro Coretto