Sciweavers

ICADL
2005
Springer

Harvesting for Full-Text Retrieval

Abstract. We propose an approach to Distributed Information Retrieval based on the periodic and incremental centralisation of full-text indices of widely dispersed and autonomously managed content sources. Inspired by the success of the Open Archives Initiative’s protocol for metadata harvesting, the approach occupies middle ground between: (i) the crawling of content, and (ii) the distribution of retrieval. As in crawling, some data moves towards the retrieval process, but it is statistics about the content rather than the content itself. As in distributed retrieval, some processing is distributed along with the data, but it is indexing rather than retrieval itself. We show that the approach retains the good properties of centralised retrieval without giving up cost-effective resource pooling. We discuss the requirements associated with the approach and identify two strategies to deploy it on top of the OAI infrastructure.
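The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each source indexes its own documents and exposes only term statistics, and that a central service periodically pulls the statistics changed since its last visit. All names here (`Source`, `CentralIndex`, `harvest`, `harvest_from`) are hypothetical, datestamps are plain integers, and ranking by summed term frequency is a simple stand-in for a real retrieval model. The sketch does not use the OAI protocol itself.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Source:
    """A content source that indexes its own documents locally (hypothetical)."""
    name: str
    # doc_id -> (datestamp, term frequencies); datestamps are plain integers here
    records: dict = field(default_factory=dict)

    def add_document(self, doc_id: str, text: str, datestamp: int) -> None:
        # Indexing happens at the source: only term statistics are kept for export.
        self.records[doc_id] = (datestamp, Counter(text.lower().split()))

    def harvest(self, since: int) -> dict:
        # Incremental export: statistics for documents changed after `since`.
        return {d: tf for d, (ts, tf) in self.records.items() if ts > since}


class CentralIndex:
    """Central inverted index built from harvested statistics, not from content."""

    def __init__(self) -> None:
        self.postings = {}      # term -> {(source, doc_id): term frequency}
        self.doc_terms = {}     # (source, doc_id) -> terms indexed for that document
        self.last_harvest = {}  # source name -> datestamp of the last harvest

    def harvest_from(self, source: Source, now: int) -> int:
        since = self.last_harvest.get(source.name, -1)
        updates = source.harvest(since)
        for doc_id, tf in updates.items():
            key = (source.name, doc_id)
            # Drop stale postings when a document has been re-indexed.
            for term in self.doc_terms.get(key, ()):
                self.postings[term].pop(key, None)
            for term, freq in tf.items():
                self.postings.setdefault(term, {})[key] = freq
            self.doc_terms[key] = set(tf)
        self.last_harvest[source.name] = now
        return len(updates)

    def search(self, query: str) -> list:
        # Retrieval stays central: score by summed term frequency (assumption).
        scores = Counter()
        for term in query.lower().split():
            for key, freq in self.postings.get(term, {}).items():
                scores[key] += freq
        return [key for key, _ in scores.most_common()]
```

In this sketch a second call to `harvest_from` with no intervening changes transfers nothing, which is the incremental property the abstract refers to; queries are answered entirely from the central index without contacting the sources.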
Added 27 Jun 2010
Updated 27 Jun 2010
Type Conference
Year 2005
Where ICADL
Authors Fabio Simeoni, Murat Yakici, Steve Neely, Fabio Crestani