Focused Crawling Using Latent Semantic Indexing - An Application for Vertical Search Engines

Vertical search engines and web portals are gaining ground over the general-purpose engines due to their limited size and their high precision for the domain they cover. The number of vertical portals has rapidly increased over the last years, making the importance of a topic-driven (focused) crawler evident. In this paper, we develop a latent semantic indexing classifier that combines link analysis with text content in order to retrieve and index domain-specific web documents. We compare its efficiency with other well-known web information retrieval techniques. Our implementation presents a different approach to focused crawling and aims to overcome the size limitations of the initial training data while maintaining a high recall/precision ratio.
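The idea of a topic-driven crawler can be illustrated with a small best-first crawl, in which links found on relevant pages are fetched before links found on irrelevant ones. The sketch below is not the paper's method: it scores pages with plain bag-of-words cosine similarity as a stand-in for the latent semantic indexing classifier, and it ignores link analysis. The toy web `TOY_WEB`, the function names, and the topic string are all hypothetical.

```python
import heapq
import math
from collections import Counter

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "seed": ("search engines index web documents", ["a", "b"]),
    "a": ("vertical search engines crawl domain documents", ["c"]),
    "b": ("cooking recipes pasta sauce", ["d"]),
    "c": ("focused crawler retrieves domain documents for search", []),
    "d": ("pasta sauce tomato basil", []),
}


def bag_of_words(text):
    """Term-frequency vector of a text, as a Counter."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def focused_crawl(seed, topic, web, max_pages):
    """Best-first crawl: each link is queued with its parent's relevance.

    Stand-in scorer (assumption): cosine similarity to the topic string.
    Returns the visited pages as (url, score) pairs in fetch order.
    """
    topic_vec = bag_of_words(topic)
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    visited = []
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = web[url]
        score = cosine(bag_of_words(text), topic_vec)
        visited.append((url, score))
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score, link))
    return visited


if __name__ == "__main__":
    topic = "search engines domain documents crawler"
    for url, score in focused_crawl("seed", topic, TOY_WEB, max_pages=3):
        print(f"{url}: {score:.2f}")
```

With a budget of three pages, the crawl follows the on-topic chain and never fetches the cooking pages, which is the behaviour a vertical search engine wants from its crawler. Swapping `cosine` for a classifier trained in a reduced latent space would move the sketch closer to the approach the abstract describes.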

George Almpanidis, Constantine Kotropoulos, Ioannis Pitas

ERCIMDL 2005 | Latent Semantic Indexing | Vertical Portals | Vertical Search Engines |

» Crawling the web for structured documents

» Efficient search in large textual collections with redundancy

» Supporting Program Comprehension Using Semantic and Structural Information

» Learning to reduce the semantic gap in web image retrieval and annotation

» Mining the Web for Synonyms PMIIR versus LSA on TOEFL

» A new visual search interface for web browsing

Post Info

Added	27 Jun 2010
Updated	27 Jun 2010
Type	Conference
Year	2005
Where	ERCIMDL
Authors	George Almpanidis, Constantine Kotropoulos, Ioannis Pitas
