

Metadata Propagation in the Web Using Co-Citations

Given the great heterogeneity of the World Wide Web, using metadata on the search engine side appears to be a promising approach to information retrieval. However, because manual annotation at Web scale is not feasible, this approach is rarely pursued. We propose a semi-automatic method for propagating metadata. In a first step, homogeneous corpora are extracted. We used the following properties in our study: the authority type, the site type, the information type, and the page type. This first step is carried out by a clustering that uses a similarity measure based on the co-citation frequency between pages. Given the cluster hierarchy, the second step selects a small number of documents to be manually annotated and propagates the given metadata values to the other documents belonging to the same cluster. A qualitative evaluation and a preliminary study of the scalability of this method are presented.
1 Context
None of the available search engines seems to take into accoun...
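The two-step method summarized in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not give the exact similarity formula, so the sketch normalizes co-citation counts Jaccard-style; it replaces the cluster hierarchy with a single cut of a single-link clustering at a hypothetical threshold; and it propagates the majority manual value within each cluster. The function names, page names, and label values are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cocitation_similarity(links):
    """links maps a citing page to the set of pages it links to.

    Returns {frozenset({a, b}): similarity}, where similarity is the number
    of pages citing both a and b divided by the number citing either one
    (an assumed Jaccard-style normalization of the co-citation frequency).
    """
    cited_by = defaultdict(set)
    for source, targets in links.items():
        for target in targets:
            cited_by[target].add(source)
    sim = {}
    for a, b in combinations(sorted(cited_by), 2):
        common = len(cited_by[a] & cited_by[b])
        if common:
            sim[frozenset((a, b))] = common / len(cited_by[a] | cited_by[b])
    return sim


def cluster(pages, sim, threshold):
    """Group pages linked by similarity >= threshold (single-link cut, union-find)."""
    parent = {p: p for p in pages}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for pair, value in sim.items():
        if value >= threshold:
            a, b = tuple(pair)
            parent[find(a)] = find(b)
    groups = defaultdict(set)
    for p in pages:
        groups[find(p)].add(p)
    return list(groups.values())


def propagate(clusters, manual_labels):
    """Copy the majority manual value of each cluster to its unlabeled pages."""
    labels = dict(manual_labels)
    for group in clusters:
        seen = Counter(manual_labels[p] for p in group if p in manual_labels)
        if seen:
            value = seen.most_common(1)[0][0]
            for p in group:
                labels.setdefault(p, value)
    return labels


# Hypothetical toy link graph: hub pages citing content pages.
links = {
    "hub1": {"a", "b"},
    "hub2": {"a", "b"},
    "hub3": {"c", "d"},
    "hub4": {"c", "d", "a"},
}
sim = cocitation_similarity(links)
clusters = cluster({"a", "b", "c", "d"}, sim, threshold=0.5)
# Only one page per cluster is labeled by hand (values are made up).
labels = propagate(clusters, {"a": "academic", "c": "commercial"})
```

In this toy graph, pages a and b are co-cited by two hubs and pages c and d by two others, so the cut yields two clusters and each hand-labeled page passes its value to its cluster neighbor. A real run would need the hierarchy-aware selection of documents that the abstract describes, which this sketch does not attempt.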
Added 28 Jun 2010
Updated 28 Jun 2010
Type Conference
Year 2005
Where WEBI
Authors Camille Prime-Claverie, Michel Beigbeder, Thierry Lafouge