Sciweavers

» Document frequency and term specificity
SIGIR
2011
ACM
13 years 1 month ago
When documents are very long, BM25 fails!
We reveal that the Okapi BM25 retrieval function tends to overly penalize very long documents. To address this problem, we present a simple yet effective extension of BM25, namel...
Yuanhua Lv, ChengXiang Zhai
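The length penalty mentioned in this abstract can be illustrated with a minimal sketch of the standard Okapi BM25 per-term score. This is not the authors' extension, which the truncated abstract does not describe. The function name `bm25_term_score`, the defaults `k1=1.2` and `b=0.75`, and the non-negative IDF variant are assumptions chosen for illustration.

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Contribution of one query term to one document under standard BM25.

    Assumed, illustrative parameters: k1 and b use commonly quoted defaults,
    and the IDF adds 1 inside the log so it never goes negative.
    """
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: grows linearly with doc_len / avg_doc_len.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + norm)

# Same term frequency, very different document lengths.
short_doc = bm25_term_score(tf=1, doc_len=100, avg_doc_len=100, n_docs=1000, doc_freq=10)
long_doc = bm25_term_score(tf=1, doc_len=10000, avg_doc_len=100, n_docs=1000, doc_freq=10)
print(short_doc, long_doc)
```

With the term frequency held fixed, the normalization term in the denominator grows with document length, so the score of a matching term in a very long document shrinks toward zero, close to the score of a document that does not contain the term at all.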
LREC
2008
14 years 10 days ago
Experiments to Investigate the Connection between Case Distribution and Topical Relevance of Search Terms in an Information Retr
We have performed a set of experiments to investigate the utility of morphological analysis to improve retrieval of documents written in languages with relatively large morph...
Jussi Karlgren, Hercules Dalianis, Bart Jongejan
AUSDM
2008
Springer
14 years 27 days ago
Structure-Based Document Model with Discrete Wavelet Transforms and Its Application to Document Classification
Term signal is an existing text representation that depicts a term as a vector of frequencies of occurrences in a number of user-defined partitions of a document. Although term si...
Supphachai Thaicharoen, Tom Altman, Krzysztof J. C...
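The term signal representation described in this abstract (a term as a vector of occurrence counts across partitions of a document) can be sketched as follows. The helper name `term_signal` and the choice of equal-width partitions by token position are assumptions for illustration; the abstract only says the partitions are user-defined, and the wavelet step is not shown.

```python
def term_signal(tokens, term, n_partitions):
    """Count occurrences of `term` in each of `n_partitions` contiguous slices.

    Hypothetical helper: partitions are near-equal slices by token position,
    which is one possible reading of "user-defined partitions".
    """
    counts = [0] * n_partitions
    n = len(tokens)
    for i, tok in enumerate(tokens):
        if tok == term:
            # Map token position i to its partition index in [0, n_partitions).
            counts[i * n_partitions // n] += 1
    return counts

tokens = "a b a c a b b b".split()
print(term_signal(tokens, "a", 4))  # [1, 1, 1, 0]
print(term_signal(tokens, "b", 4))  # [1, 0, 1, 2]
```

Unlike a single term frequency, the vector keeps where in the document a term occurs: "a" is spread over the first three quarters, while "b" is concentrated at the end.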
LREC
2008
14 years 9 days ago
Chinese Term Extraction Based on Delimiters
Existing techniques extract term candidates by looking for internal and contextual information associated with domain specific terms. The algorithms always face the dilemma that f...
Yuhang Yang, Qin Lu, Tiejun Zhao
EMNLP
2008
14 years 10 days ago
Relative Rank Statistics for Dialog Analysis
We introduce the relative rank differential statistic, a non-parametric approach to document and dialog analysis based on word frequency rank-statistics. We also present a...
Juan Huerta