The PDF format is commonly used for the exchange of documents on the Web, and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
The availability of large, heterogeneous repositories of electronic documents is increasing rapidly, and the need for flexible, sophisticated document manipulation tools is growi...
Floriana Esposito, Stefano Ferilli, Teresa Maria A...
This paper describes the current state of RUgle, a system for classifying and indexing papers made available on the World Wide Web, in a domain-independent and universal manner. B...
Search engine technology plays an important role in Web information retrieval. However, with the explosion of information on the Internet, traditional searching techniques cannot provide satisfa...
Baile Shi, Guoyu Hao, Hongtao Xu, Mei Wang, Qi Zha...
In this paper, we study query evaluation on Active XML documents (AXML for short), a new generation of XML documents that has recently gained popularity. AXML documents are XML do...