
DMKD
2003
ACM

Deriving link-context from HTML tag tree

HTML anchors are often surrounded by text that seems to describe the destination page appropriately. The text surrounding a link, or the link-context, is used for a variety of tasks associated with Web information retrieval. These tasks can benefit from identifying regularities in the manner in which “good” contexts appear around links. In this paper, we describe a framework for conducting such a study. The framework serves as an evaluation platform for comparing various link-context derivation methods. We apply the framework to a sample of Web pages obtained from more than 10,000 different categories of the ODP. Our focus is on understanding the potential merits of using a Web page’s tag tree structure for deriving link-contexts. We find that good link-context can be associated with the tag tree hierarchy. Our results show that climbing up the tag tree when the link-context provided by greater depths is too short can provide better performance than some of the traditional techniques...
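The idea of climbing the tag tree when the context at a deeper node is too short can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `derive_link_context`, the word-count threshold `min_words`, and the sample page are all assumptions made for illustration, and the snippet is parsed as well-formed XML with the standard library, whereas real Web pages would usually need a tolerant HTML parser.

```python
import xml.etree.ElementTree as ET


def derive_link_context(root, anchor, min_words=5):
    """Return the text of the lowest ancestor of `anchor` (or the anchor
    itself) whose aggregated text has at least `min_words` words.

    `min_words` is a hypothetical threshold, not a value from the paper.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = anchor
    while True:
        # Aggregate all text under the current node and normalize whitespace.
        words = " ".join(node.itertext()).split()
        # Stop when the context is long enough or the root has been reached.
        if len(words) >= min_words or node not in parent_of:
            return " ".join(words)
        node = parent_of[node]  # climb one level up the tag tree


# Hypothetical sample page (well-formed so that ElementTree can parse it).
page = ET.fromstring(
    "<html><body>"
    "<p>Unrelated paragraph about the weather.</p>"
    "<div><p>Read the <a href='http://example.org/crawlers'>survey</a>"
    " on focused crawlers.</p></div>"
    "</body></html>"
)
anchor = page.find(".//a")
context = derive_link_context(page, anchor, min_words=5)
print(context)
```

In this sketch the anchor text alone is a single word, so the search moves up to the enclosing paragraph and stops there, leaving the unrelated sibling paragraph out of the context. A larger threshold would pull in more of the page, which is the trade-off the abstract alludes to.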
Gautam Pant
Added 05 Jul 2010
Updated 05 Jul 2010
Type Conference
Year 2003
Where DMKD
Authors Gautam Pant