Sciweavers

280 search results - page 11 / 56
» A Semi-Supervised Document Clustering Algorithm Based on EM
JMLR
2002
13 years 6 months ago
The Learning-Curve Sampling Method Applied to Model-Based Clustering
We examine the learning-curve sampling method, an approach for applying machine-learning algorithms to large data sets. The approach is based on the observation that the computatio...
Christopher Meek, Bo Thiesson, David Heckerman
DATAMINE
2006
13 years 6 months ago
Accelerated EM-based clustering of large data sets
Motivated by the poor performance (linear complexity) of the EM algorithm in clustering large data sets, and inspired by the successful accelerated versions of related algorithms l...
Jakob J. Verbeek, Jan Nunnink, Nikos A. Vlassis
SIGIR
1998
ACM
13 years 11 months ago
Web Document Clustering: A Feasibility Demonstration
Users of Web search engines are often forced to sift through the long ordered list of document “snippets” returned by the engines. The IR community has explored document cluste...
Oren Zamir, Oren Etzioni
WWW
2007
ACM
14 years 7 months ago
A new suffix tree similarity measure for document clustering
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on the suffix tree document model. By applying the new suffix tree ...
Hung Chim, Xiaotie Deng
ACL
2008
13 years 8 months ago
Inducing Gazetteers for Named Entity Recognition by Large-Scale Clustering of Dependency Relations
We propose using large-scale clustering of dependency relations between verbs and multiword nouns (MNs) to construct a gazetteer for named entity recognition (NER). Since dependen...
Jun'ichi Kazama, Kentaro Torisawa