Sciweavers

103 search results - page 7 / 21
Models and Algorithms for Duplicate Document Detection
SIGIR
2004
ACM
14 years 23 days ago
Locality preserving indexing for document representation
Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Index...
Xiaofei He, Deng Cai, Haifeng Liu, Wei-Ying Ma
ECIR
2009
Springer
14 years 4 months ago
Topic and Trend Detection in Text Collections Using Latent Dirichlet Allocation
Algorithms that enable the process of automatically mining distinct topics in document collections have become increasingly important due to their applications in many fields and ...
Levent Bolelli, Seyda Ertekin, C. Lee Giles
BMCBI
2006
13 years 7 months ago
Statistical inference of chromosomal homology based on gene colinearity and applications to Arabidopsis and rice
Background: The identification of chromosomal homology will shed light on such mysteries of genome evolution as DNA duplication, rearrangement and loss. Several approaches have be...
Xiyin Wang, Xiaoli Shi, Zhe Li, Qihui Zhu, Lei Kon...
SPIRE
2004
Springer
14 years 21 days ago
Indexing Text Documents Based on Topic Identification
This work provides algorithms and heuristics to index text documents by determining important topics in the documents. To index text documents, the work provides algorithms to gene...
Manonton Butarbutar, Susan McRoy
METRICS
1999
IEEE
13 years 11 months ago
Measuring Clone Based Reengineering Opportunities
Code duplication, plausibly caused by copying source code and slightly modifying it, is often observed in large systems. Clone detection and documentation have been investigated b...
Magdalena Balazinska, Ettore Merlo, Michel Dagenai...