SIGIR
2004
ACM

On scaling latent semantic indexing for large peer-to-peer systems

The exponential growth of data demands scalable infrastructures capable of indexing and searching rich content such as text, music, and images. A promising direction is to combine information retrieval with peer-to-peer technology for scalability, fault-tolerance, and low administration cost. One pioneering work along this direction is pSearch [32, 33]. pSearch places documents onto a peer-to-peer overlay network according to semantic vectors produced using Latent Semantic Indexing (LSI). The search cost for a query is reduced since documents related to the query are likely to be co-located on a small number of nodes. Unfortunately, because of its reliance on LSI, pSearch also inherits the limitations of LSI. (1) When the corpus is large and heterogeneous, LSI’s retrieval quality is inferior to methods such as Okapi. (2) The Singular Value Decomposition (SVD) used in LSI is unscalable in terms of both memory consumption and computation time. This paper addresses the above limitations...
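As a rough illustration of the semantic vectors mentioned in the abstract, the following is a minimal sketch of textbook LSI: a truncated SVD of a term-by-document matrix, followed by cosine ranking in the reduced space. It is not pSearch's implementation or the method proposed in the paper. The function names (`lsi_fit`, `lsi_project`, `rank_by_cosine`), the choice of NumPy, and the toy matrix in the usage note are assumptions made for illustration only. The dense `np.linalg.svd` call used here is also the step whose memory and time costs grow quickly with corpus size, which is limitation (2) above.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """Truncated SVD of a term-by-document matrix (rows = terms, columns = documents).

    Returns U_k (terms x k) and one k-dimensional semantic vector per document.
    """
    A = np.asarray(term_doc, dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    # Document j in the reduced space is S_k * Vt_k[:, j], which equals U_k^T * A[:, j].
    doc_vecs = (s[:k, None] * Vt[:k, :]).T
    return U_k, doc_vecs

def lsi_project(U_k, term_vec):
    """Fold a query (or new document) term vector into the same reduced space."""
    return U_k.T @ np.asarray(term_vec, dtype=float)

def rank_by_cosine(query_vec, doc_vecs):
    """Return document indices sorted by decreasing cosine similarity, plus the scores."""
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = (doc_vecs @ query_vec) / np.where(norms == 0.0, 1.0, norms)
    return np.argsort(-sims), sims
```

With a hypothetical four-term, four-document matrix, `U_k, docs = lsi_fit(A, 2)` followed by `rank_by_cosine(lsi_project(U_k, q), docs)` ranks the documents that share a topic with the query `q` ahead of the rest. In a pSearch-like setting, vectors of this kind would then determine where a document is placed on the overlay; that mapping is not sketched here.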
Chunqiang Tang, Sandhya Dwarkadas, Zhichen Xu
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where SIGIR