Improving collection selection with overlap awareness in P2P search engines



Collection selection has been a research issue for years. Typically, in related work, precomputed statistics are employed in order to estimate the expected result quality of each collection, and subsequently the collections are ranked accordingly. Our thesis is that this simple approach is insufficient for several applications in which the collections typically overlap. This is the case, for example, for the collections built by autonomous peers crawling the web. We argue for the extension of existing quality measures using estimators of mutual overlap among collections and present experiments in which this combination outperforms CORI, a popular approach based on quality estimation. We outline our prototype implementation of a P2P web search engine, coined MINERVA, that allows handling large amounts of data in a distributed and self-organizing manner. We conduct experiments which show that taking overlap into account during collection selection can drastically decrease the number o...
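
The idea of combining a quality score with an overlap estimate can be illustrated with a small sketch. This is not the method from the paper: the function name `select_collections`, the multiplicative quality-times-novelty score, and the use of exact document-ID sets in place of overlap estimators are all assumptions made purely for illustration.

```python
# Illustrative sketch only (assumed scoring rule, not the paper's method).
# Each collection is modeled as an exact set of document IDs; a real system
# would use compact overlap estimators instead of full sets.

def select_collections(collections, quality, k):
    """Greedily pick up to k collections, weighting quality by novelty.

    collections: dict mapping collection name -> set of document IDs
    quality:     dict mapping collection name -> quality score (hypothetical)
    """
    covered = set()          # documents already reachable via chosen collections
    chosen = []
    remaining = set(collections)

    def score(name):
        docs = collections[name]
        if not docs:
            return 0.0
        # Fraction of this collection's documents not yet covered.
        novelty = len(docs - covered) / len(docs)
        return quality[name] * novelty

    while remaining and len(chosen) < k:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=score)
        chosen.append(best)
        covered |= collections[best]
        remaining.remove(best)
    return chosen


peers = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},   # overlaps heavily with A
    "C": {6, 7},         # small but disjoint from A
}
quality = {"A": 0.9, "B": 0.8, "C": 0.5}

print(select_collections(peers, quality, 2))
```

In this toy setup a ranking by quality alone would pick A and then B, even though B contributes only one new document once A has been queried. The overlap-aware variant picks C second, since its documents are all new.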



Collection Selection | Combination Outperforms Cori | SIGIR 2005 | Web Search |


Added	26 Jun 2010
Updated	26 Jun 2010
Type	Conference
Year	2005
Where	SIGIR
Authors	Matthias Bender, Sebastian Michel, Peter Triantafillou, Gerhard Weikum, Christian Zimmer
