

Focused Crawling Using Latent Semantic Indexing - An Application for Vertical Search Engines

Vertical search engines and web portals are gaining ground over general-purpose engines because of their limited size and their high precision within the domain they cover. The number of vertical portals has increased rapidly in recent years, making the importance of a topic-driven (focused) crawler evident. In this paper, we develop a latent semantic indexing classifier that combines link analysis with text content to retrieve and index domain-specific web documents. We compare its efficiency with that of other well-known web information retrieval techniques. Our implementation presents a different approach to focused crawling and aims to overcome the size limitations of the initial training data while maintaining a high recall/precision ratio.
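As a rough illustration of the latent semantic indexing idea mentioned in the abstract, the sketch below scores documents against a topic query in a reduced latent space. It is not the authors' classifier: it uses text terms only and leaves out the link-analysis component, and the function name `relevance_scores`, the toy vocabulary, and the choice of `k` are all assumptions made for this example.

```python
import numpy as np

def relevance_scores(term_doc, query, k):
    """Cosine similarity between a query and each document in a k-dim LSI space.

    term_doc: (terms x documents) count matrix (hypothetical toy data below).
    query:    term-count vector over the same vocabulary.
    k:        number of latent dimensions to keep (an assumed tuning parameter).
    """
    # Truncated SVD: term_doc ~= U_k * diag(s_k) * Vt_k
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k = U[:, :k], s[:k]
    doc_vecs = Vt[:k, :].T                # one row per document

    # Fold the query into the same space: q_hat = q^T U_k S_k^{-1}
    q_hat = (query @ U_k) / s_k

    # Cosine similarity, guarding against zero-length vectors
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat)
    norms[norms == 0] = 1.0
    return (doc_vecs @ q_hat) / norms

# Hypothetical vocabulary: rows are the terms
# ["crawler", "search", "index", "recipe", "oven"]; columns are four pages.
term_doc = np.array([
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)

# The query contains only "index", which page 1 never mentions.
query = np.array([0, 0, 1, 0, 0], dtype=float)
scores = relevance_scores(term_doc, query, k=2)
```

In this toy case the second page scores as on-topic even though it shares no term with the query, because it co-occurs with the same vocabulary as the first page. A focused crawler could, under these assumptions, use such scores to decide which fetched pages to keep and which of their outlinks to follow first.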
Added 27 Jun 2010
Updated 27 Jun 2010
Type Conference
Year 2005
Authors George Almpanidis, Constantine Kotropoulos, Ioannis Pitas