Sciweavers

108 search results - page 18 / 22
» Ontologies Improve Text Document Clustering
EMNLP
2008
13 years 9 months ago
Who is Who and What is What: Experiments in Cross-Document Co-Reference
This paper describes a language-independent, scalable system for both challenges of cross-document co-reference: name variation and entity disambiguation. We provide system results...
Alex Baron, Marjorie Freedman
JCDL
2011
ACM
12 years 10 months ago
Comparative evaluation of text- and citation-based plagiarism detection approaches using GuttenPlag
Various approaches for plagiarism detection exist. All are based on more or less sophisticated text analysis methods such as string matching, fingerprinting or style comparison. I...
Bela Gipp, Norman Meuschke, Jöran Beel
CIKM
2010
Springer
13 years 6 months ago
Hypergraph-based multilevel matrix approximation for text information retrieval
In Latent Semantic Indexing (LSI), a collection of documents is often pre-processed to form a sparse term-document matrix, followed by a computation of a low-rank approximation to...
Haw-ren Fang, Yousef Saad
PKDD
2005
Springer
14 years 1 month ago
A Probabilistic Clustering-Projection Model for Discrete Data
For discrete co-occurrence data like documents and words, calculating optimal projections and clustering are two different but related tasks. The goal of projection is to find a ...
Shipeng Yu, Kai Yu, Volker Tresp, Hans-Peter Krieg...
IPM
2007
13 years 7 months ago
Patent document categorization based on semantic structural information
The number of patent documents is currently rising rapidly worldwide, creating the need for an automatic categorization system to replace time-consuming and labor-intensive manual...
Jae-Ho Kim, Key-Sun Choi