Metadata Propagation in the Web Using Co-Citations



Given the large heterogeneity of the World Wide Web, using metadata on the search engine side seems to be a useful approach for information retrieval. However, because manual qualification at Web scale is not feasible, this approach is rarely pursued. We propose a semi-automatic method for propagating metadata. In a first step, homogeneous corpora are extracted. In our study we used the following properties: the authority type, the site type, the information type, and the page type. This first step is carried out by a clustering that uses a similarity measure based on the co-citation frequency between pages. Given the cluster hierarchy, the second step selects a reduced number of documents to be manually qualified and propagates the given metadata values to the other documents belonging to the same cluster. A qualitative evaluation and a preliminary study of the scalability of this method are presented.

1 Context

None of the available search engines seems to take into accoun...
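
The two-step method summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact similarity formula or clustering algorithm, so this example assumes a Jaccard-style normalization of the co-citation count and replaces the cluster hierarchy with a flat single-linkage grouping at a fixed threshold. The function names, page names, threshold, and the `site_type` values are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_similarity(links):
    """links maps each citing page to the set of pages it links to.

    Two pages are co-cited when a third page links to both. The score
    used here (shared citers / all citers) is an assumed normalization.
    """
    cited_by = defaultdict(set)
    for source, targets in links.items():
        for target in targets:
            cited_by[target].add(source)
    sims = {}
    for a, b in combinations(sorted(cited_by), 2):
        common = len(cited_by[a] & cited_by[b])
        if common:
            sims[(a, b)] = common / len(cited_by[a] | cited_by[b])
    return sims


def cluster(pages, sims, threshold):
    """Group pages whose similarity reaches the threshold (single linkage).

    A flat stand-in for the hierarchical clustering the abstract mentions.
    """
    parent = {p: p for p in pages}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for (a, b), score in sims.items():
        if score >= threshold:
            parent[find(a)] = find(b)
    groups = defaultdict(set)
    for p in pages:
        groups[find(p)].add(p)
    return list(groups.values())


def propagate(clusters, manual_labels):
    """Copy the metadata of a manually qualified page to its whole cluster."""
    labels = {}
    for group in clusters:
        qualified = [p for p in sorted(group) if p in manual_labels]
        if qualified:
            for p in group:
                labels[p] = manual_labels[qualified[0]]
    return labels


# Hypothetical link graph: four citing pages and the pages they link to.
links = {
    "hub1": {"univ-a", "univ-b"},
    "hub2": {"univ-a", "univ-b"},
    "hub3": {"shop-x", "shop-y"},
    "hub4": {"shop-x", "shop-y", "univ-b"},
}
sims = cocitation_similarity(links)
pages = {page for targets in links.values() for page in targets}
clusters = cluster(pages, sims, threshold=0.5)

# Only one page per cluster is qualified by hand; the rest inherit its value.
manual_labels = {"univ-a": {"site_type": "academic"},
                 "shop-x": {"site_type": "commercial"}}
labels = propagate(clusters, manual_labels)
```

With this toy graph, two manual qualifications are enough to label all four cited pages, which is the saving the semi-automatic approach aims for. Which documents to qualify within each cluster is a separate choice that this sketch does not address.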

Camille Prime-Claverie, Michel Beigbeder, Thierry Lafouge


Internet Technology | Metadata | Search Engines | WEBI 2005 | first Step


» Towards combining web classification and web information extraction a case study

» A Probabilistic Approach for Learning Folksonomies from Structured Data

» Improving application security with data flow assertions

» Spatial location and its relevance for terminological inferences in bioontologies

Post Info

Added	28 Jun 2010
Updated	28 Jun 2010
Type	Conference
Year	2005
Where	WEBI
Authors	Camille Prime-Claverie, Michel Beigbeder, Thierry Lafouge


Sciweavers
