Focused Crawling Using Context Graphs

Keeping search engine indices current by exhaustive crawling is rapidly becoming impossible because of the increasing size and dynamic content of the web. Focused crawlers aim to search only the subset of the web related to a specific category, and offer a potential solution to the currency problem. The major problem in focused crawling is assigning credit appropriately to the different documents along a crawl path, so that short-term gains are not pursued at the expense of less obvious crawl paths that ultimately yield larger sets of valuable pages. To address this problem we present a focused crawling algorithm that builds a model of the context within which topically relevant pages occur on the web. This context model can capture the typical link hierarchies within which valuable pages occur, as well as model the content of documents that frequently co-occur with relevant pages. Our algorithm further leverages the existing capability of large search engines to provide partial r...
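
The credit-assignment idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract is truncated here and does not spell out how the context model is built or used, so everything below is an assumption made for illustration. In particular, `layered_crawl`, `fetch_links`, `classify_layer`, and the toy `LINKS` and `LAYER` tables are hypothetical names. The sketch assumes that a context model can assign each downloaded page a "layer", an estimate of how many links separate it from a relevant page, and that links found on pages in lower layers are followed first.

```python
import heapq
from itertools import count
from typing import Callable, Dict, Iterable, List, Set, Tuple


def layered_crawl(
    seeds: Iterable[str],
    fetch_links: Callable[[str], List[str]],
    classify_layer: Callable[[str], int],
    max_pages: int,
) -> List[str]:
    """Visit pages, preferring links found on pages with a low estimated layer.

    classify_layer stands in for a context model (an assumption): it returns
    the estimated number of links between a downloaded page and a relevant one.
    """
    tie_breaker = count()  # keeps insertion order among equal priorities
    frontier: List[Tuple[int, int, str]] = []
    seen: Set[str] = set()

    for url in seeds:
        if url not in seen:
            seen.add(url)
            heapq.heappush(frontier, (0, next(tie_breaker), url))

    visited: List[str] = []
    while frontier and len(visited) < max_pages:
        _, _, url = heapq.heappop(frontier)
        visited.append(url)
        # Out-links inherit the layer estimated for the page that contains them.
        layer = classify_layer(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (layer, next(tie_breaker), link))
    return visited


# Toy in-memory web (hypothetical data): "paper" is the only relevant page.
LINKS: Dict[str, List[str]] = {
    "seed": ["news", "dept"],
    "news": ["sports", "weather"],
    "dept": ["faculty"],
    "faculty": ["paper"],
    "sports": [],
    "weather": [],
    "paper": [],
}
# Hypothetical layer estimates; 9 marks pages the model considers off-topic.
LAYER: Dict[str, int] = {
    "paper": 0, "faculty": 1, "dept": 2, "seed": 3,
    "news": 9, "sports": 9, "weather": 9,
}

order = layered_crawl(["seed"], lambda u: LINKS[u], lambda u: LAYER[u], max_pages=5)
print(order)
```

In this toy run the crawler follows the chain through "dept" and "faculty" to reach "paper" before it spends any downloads on the links found on "news", even though "dept" and "faculty" are not themselves relevant. A crawler that scored pages only by their own relevance would have no reason to prefer that chain.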

Michelangelo Diligenti, Frans Coetzee, Steve Lawrence, C. Lee Giles, Marco Gori

Database | Exhaustive Crawling | Focused Crawling | Focused Crawling Algorithm | VLDB 2000 |

» Evaluation Methods for Focused Crawling

» Reinforcement Learning with Classifier Selection for Focused Crawling

» Focused crawling for both topical relevance and quality of medical information

» Focusing on novelty a crawling strategy to build diverse language models

» Line graph explorer scalable display of line graphs using FocusContext

» Context Modeling Context as a Dressing of a Focus

» Balloon Focus a Seamless MultiFocusContext Method for Treemaps

» Fillets Cues for Connections in FocusContext Views of GraphLike Diagrams

Post Info

Added	26 Aug 2010
Updated	26 Aug 2010
Type	Conference
Year	2000
Where	VLDB
Authors	Michelangelo Diligenti, Frans Coetzee, Steve Lawrence, C. Lee Giles, Marco Gori

Sciweavers
