Sciweavers

WWW
2008
ACM
14 years 8 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
PKDD
2007
Springer
14 years 1 months ago
Site-Independent Template-Block Detection
Detection of template and noise blocks in web pages is an important step in improving the performance of information retrieval and content extraction. Of the many approaches propos...
Aleksander Kolcz, Wen-tau Yih
EACL
2003
ACL Anthology
13 years 9 months ago
An Integrated Term-Based Corpus Query System
In this paper we describe the X-TRACT workbench, which enables efficient term-based querying against a domain-specific literature corpus. Its main aim is to aid domain specialists ...
Kostas Manios, Goran Nenadic, Irena Spasic, Sophia...
WWW
2008
ACM
14 years 8 months ago
Topic modeling with network regularization
In this paper, we formally define the problem of topic modeling with network structure (TMN). We propose a novel solution to this problem, which regularizes a statistical topic mo...
Qiaozhu Mei, Deng Cai, Duo Zhang, ChengXiang Zhai
CLEF
2006
Springer
13 years 11 months ago
Vocabulary Reduction and Text Enrichment at WebCLEF
Nowadays, cross-lingual Information Retrieval (IR) is one of the greatest challenges to deal with. Besides, one of the most important issues in IR consists in the corpus vocabular...
Franco Rojas López, Héctor Jimé...