Sciweavers

1582 search results - page 180 / 317
» Digital Documents and Media
SIGIR
2011
ACM
14 years 9 months ago
Faster top-k document retrieval using block-max indexes
Large search engines process thousands of queries per second over billions of documents, making query processing a major performance bottleneck. An important class of optimization...
Shuai Ding, Torsten Suel
WWW
2008
ACM
16 years 6 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev
KDD
2007
ACM
16 years 6 months ago
XProj: a framework for projected structural clustering of XML documents
XML has become a popular method of data representation both on the web and in databases in recent years. One of the reasons for the popularity of XML has been its ability to encod...
Charu C. Aggarwal, Na Ta, Jianyong Wang, Jianhua F...
SIGIR
2005
ACM
15 years 11 months ago
Boosted decision trees for word recognition in handwritten document retrieval
Recognition and retrieval of historical handwritten material is an unsolved problem. We propose a novel approach to recognizing and retrieving handwritten manuscripts, based upon ...
Nicholas R. Howe, Toni M. Rath, R. Manmatha
CIKM
2004
ACM
15 years 11 months ago
Hierarchical document categorization with support vector machines
Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques ...
Lijuan Cai, Thomas Hofmann