Sciweavers

1018 search results - page 19 / 204
» Document Representation in Natural Language Text Retrieval
IR
2007
13 years 8 months ago
An empirical study of tokenization strategies for biomedical information retrieval
Due to the great variation of biological names in biomedical text, appropriate tokenization is an important preprocessing step for biomedical information retrieval. Despite its im...
Jing Jiang, ChengXiang Zhai
LREC
2010
13 years 10 months ago
Automatic Acquisition of Parallel Corpora from Websites with Dynamic Content
Parallel corpora are indispensable resources for a variety of multilingual natural language processing tasks. This paper presents a technique for fully automatic construction of c...
Yulia Tsvetkov, Shuly Wintner
NAACL
2007
13 years 9 months ago
Exploiting Event Semantics to Parse the Rhetorical Structure of Natural Language Text
Previous work on discourse parsing has mostly relied on surface syntactic and lexical features; the use of semantics is limited to shallow semantics. The goal of this thesis is to...
Rajen Subba
SIGIR
2006
ACM
14 years 2 months ago
Identifying comparative sentences in text documents
This paper studies the problem of identifying comparative sentences in text documents. The problem is related to but quite different from sentiment/opinion sentence identification...
Nitin Jindal, Bing Liu
ERCIMDL
1997
Springer
14 years 18 days ago
Modelling the Retrieval of Structured Documents Containing Texts and Images
We present a model for complex documents possibly consisting of a hierarchically structured set of images or texts. Documents are represented both at the form level (as s...
Carlo Meghini, Fabrizio Sebastiani, Umberto Stracc...