Sciweavers

» Pivoted Document Length Normalization
CICLING
2010
Springer
14 years 2 months ago
Word Length n-Grams for Text Re-use Detection
The automatic detection of shared content in written documents (which includes text reuse and its unacknowledged commitment, plagiarism) has become an important probl...
Alberto Barrón-Cedeño, Chiara Basile...
CIKM
2011
Springer
12 years 10 months ago
Lower-bounding term frequency normalization
In this paper, we reveal a common deficiency of the current retrieval models: the component of term frequency (TF) normalization by document length is not lower-bounded properly;...
Yuanhua Lv, ChengXiang Zhai
CIKM
2004
Springer
14 years 2 months ago
InfoAnalyzer: a computer-aided tool for building enterprise taxonomies
In this paper we study the problem of collecting training samples for building enterprise taxonomies. We develop a computer-aided tool named InfoAnalyzer, which can effectively as...
Li Zhang, Shixia Liu, Yue Pan, Liping Yang
ECIR
2003
Springer
14 years 7 days ago
Hierarchical Indexing and Flexible Element Retrieval for Structured Document
As more and more structured documents, such as SGML or XML documents, become available on the Web, there is a growing demand to develop effective structured document retrieval which...
Hang Cui, Ji-Rong Wen, Tat-Seng Chua
TREC
1997
14 years 6 days ago
Short Queries, Natural Language and Spoken Document Retrieval: Experiments at Glasgow University
This paper contains a description of the methodology and results of the three TREC submissions made by the Glasgow IR group (glair). In addition to submitting to the ad hoc task, ...
Fabio Crestani, Mark Sanderson, Marcos Theophylact...