Sciweavers

89 search results - page 4 / 18
» Exploiting Dataset Similarity for Distributed Mining
NIPS
2007
13 years 9 months ago
Density Estimation under Independent Similarly Distributed Sampling Assumptions
A method is proposed for semiparametric estimation where parametric and nonparametric criteria are exploited in density estimation and unsupervised learning. This is accomplished ...
Tony Jebara, Yingbo Song, Kapil Thadani
ADMA
2010
Springer
13 years 2 months ago
Exploiting Concept Clumping for Efficient Incremental E-Mail Categorization
We introduce a novel approach to incremental e-mail categorization based on identifying and exploiting "clumps" of messages that are classified similarly. Clumping reflec...
Alfred Krzywicki, Wayne Wobcke
KDD
2002
ACM
14 years 8 months ago
Learning to match and cluster large high-dimensional data sets for data integration
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
William W. Cohen, Jacob Richman
VLDB
2008
ACM
14 years 8 months ago
Tree-based partition querying: a methodology for computing medoids in large spatial datasets
Besides traditional domains (e.g., resource allocation, data mining applications), algorithms for medoid computation and related problems will play an important role in numerous e...
Kyriakos Mouratidis, Dimitris Papadias, Spiros Pap...
BMCBI
2008
13 years 7 months ago
Visualization of large influenza virus sequence datasets using adaptively aggregated trees with sampling-based subscale represen
Background: With the amount of influenza genome sequence data growing rapidly, researchers need machine assistance in selecting datasets and exploring the data. Enhanced visualiza...
Leonid Zaslavsky, Yiming Bao, Tatiana A. Tatusova