Sciweavers

1319 search results - page 16 / 264
» Using the Structure of HTML Documents to Improve Retrieval
WWW
2008
ACM
14 years 8 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying compound documents (cDocs) on the Web: groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
FTDCS
2003
IEEE
14 years 20 days ago
pFilter: Global Information Filtering and Dissemination Using Structured Overlay Networks
The exponential data growth rate of the Internet makes it increasingly difficult for people to find desired information in a timely fashion. Information filtering and dissemina...
Chunqiang Tang, Zhichen Xu
IPM
2008
13 years 7 months ago
Towards a unified approach to document similarity search using manifold-ranking of blocks
Document similarity search (i.e. query by example) aims to retrieve a ranked list of documents similar to a query document in a text corpus or on the Web. Most existing approaches...
Xiaojun Wan, Jianwu Yang, Jianguo Xiao
SIGIR
2008
ACM
13 years 7 months ago
Improving biomedical document retrieval using domain knowledge
Research articles typically introduce new results or findings and relate them to knowledge entities of immediate relevance. However, a large body of context knowledge related to t...
Shuguang Wang, Milos Hauskrecht
DIAL
2004
IEEE
13 years 11 months ago
Xed: A New Tool for eXtracting Hidden Structures from Electronic Documents
PDF has become a very common format for exchanging printable documents. Further, it can easily be generated from the major document formats, which makes a huge number of PDF documents...
Karim Hadjar, Maurizio Rigamonti, Denis Lalanne, R...