Sciweavers

KDD
2002
ACM
Enhanced word clustering for hierarchical text classification
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
SIGMOD
2009
ACM
A comparison of approaches to large-scale data analysis
There is currently considerable enthusiasm around the MapReduce (MR) paradigm for large-scale data analysis [17]. Although the basic control flow of this framework has existed in ...
Andrew Pavlo, Erik Paulson, Alexander Rasin, Danie...
AUSAI
2005
Springer
Semantic Correlation Network Based Text Clustering
Text documents have sparse data spaces, and nearest neighbors may belong to different classes when using current existing proximity measures to describe the correlation ...
Shaoxu Song, Chunping Li
JCDL
2011
ACM
Comparative evaluation of text- and citation-based plagiarism detection approaches using GuttenPlag
Various approaches for plagiarism detection exist. All are based on more or less sophisticated text analysis methods such as string matching, fingerprinting or style comparison. I...
Bela Gipp, Norman Meuschke, Jöran Beel
IPM
2006
Document clustering using nonnegative matrix factorization
A methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection is presented. Textual data is encoded using a low rank no...
Farial Shahnaz, Michael W. Berry, V. Paul Pauca, R...