Sciweavers

» A Multi-level Approach for Document Clustering
DEXAW
2010
IEEE
Scalable Recursive Top-Down Hierarchical Clustering Approach with Implicit Model Selection for Textual Data Sets
Automatic generation of taxonomies can be useful for a wide area of applications. In our application scenario a topical hierarchy should be constructed reasonably fast from a large...
Markus Muhr, Vedran Sabol, Michael Granitzer
HT
2010
ACM
Citation based plagiarism detection: a new approach to identify plagiarized work language independently
This paper describes a new approach towards detecting plagiarism and scientific documents that have been read but not cited. In contrast to existing approaches, which analyze docu...
Bela Gipp, Jöran Beel
MMM
2011
Springer
Correlated PLSA for Image Clustering
Probabilistic Latent Semantic Analysis (PLSA) has become a popular topic model for image clustering. However, the traditional PLSA method considers each image (document) independen...
Peng Li, Jian Cheng, Zechao Li, Hanqing Lu
VLDB
2007
ACM
Measuring the Structural Similarity of Semistructured Documents Using Entropy
We propose a technique for measuring the structural similarity of semistructured documents based on entropy. After extracting the structural information from two documents we use ...
Sven Helmer
JCDL
2011
ACM
Comparative evaluation of text- and citation-based plagiarism detection approaches using GuttenPlag
Various approaches for plagiarism detection exist. All are based on more or less sophisticated text analysis methods such as string matching, fingerprinting or style comparison. I...
Bela Gipp, Norman Meuschke, Jöran Beel