Sciweavers

KDD
2010
ACM
Semi-supervised feature selection for graph classification
The problem of graph classification has attracted great interest in the last decade. Current research on graph classification assumes the existence of large amounts of labeled tra...
Xiangnan Kong, Philip S. Yu
KDD
2006
ACM
Model compression
Often the best performing supervised learning models are ensembles of hundreds or thousands of base-level classifiers. Unfortunately, the space required to store this many classif...
Cristian Bucila, Rich Caruana, Alexandru Niculescu...
KDD
2007
ACM
Efficient incremental constrained clustering
Clustering with constraints is an emerging area of data mining research. However, most work assumes that the constraints are given as one large batch. In this paper we explore the...
Ian Davidson, S. S. Ravi, Martin Ester
KDD
2002
ACM
Frequent term-based text clustering
Text clustering methods can be used to structure large sets of text or hypertext documents. The well-known methods of text clustering, however, do not really address the special p...
Florian Beil, Martin Ester, Xiaowei Xu
KDD
2008
ACM
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface
Matching records that refer to the same entity across databases is becoming an increasingly important part of many data mining projects, as often data from multiple sources needs ...
Peter Christen