Sciweavers

808 search results - page 98 / 162
» Keyword-based document clustering
JCB
2007
Clustered Sequence Representation for Fast Homology Search
We present a novel approach to managing redundancy in sequence databanks such as GenBank. We store clusters of near-identical sequences as a representative union-sequence and a se...
Michael Cameron, Yaniv Bernstein, Hugh E. Williams
KDD
2008
ACM
SAIL: summation-based incremental learning for information-theoretic clustering
Information-theoretic clustering aims to exploit information theoretic measures as the clustering criteria. A common practice on this topic is so-called INFO-K-means, which perfor...
Junjie Wu, Hui Xiong, Jian Chen
WCRE
1999
IEEE
Experiments with Clustering as a Software Remodularization Method
As valuable software systems get old, reverse engineering becomes more and more important to the companies that have to maintain the code. Clustering is a key activity in reverse ...
Nicolas Anquetil, Timothy Lethbridge
SDM
2003
SIAM
Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data
The problem of finding clusters in data is challenging when clusters are of widely differing sizes, densities and shapes, and when the data contains large amounts of noise and out...
Levent Ertöz, Michael Steinbach, Vipin Kumar
SIGIR
2008
ACM
Enhancing text clustering by leveraging Wikipedia semantics
Most traditional text clustering methods are based on "bag of words" (BOW) representation based on frequency statistics in a set of documents. BOW, however, ignores the ...
Jian Hu, Lujun Fang, Yang Cao, Hua-Jun Zeng, Hua L...