Resolving Person Names in Web People Search

15 years 2 months ago

Disambiguating person names in a set of documents (such as a set of web pages returned in response to a person name query) is a key task for the presentation of results and the automatic profiling of experts. With largely unstructured documents and an unknown number of people sharing the same name, the problem presents many difficulties and challenges. This chapter treats person name disambiguation as a document clustering problem, where each document is assumed to represent a particular person. This leads to the person cluster hypothesis, which states that similar documents tend to represent the same person. Single Pass Clustering, k-Means Clustering, Agglomerative Clustering and Probabilistic Latent Semantic Analysis are employed and empirically evaluated in this context. On the SemEval 2007 Web People Search task, it is shown that the person cluster hypothesis holds reasonably well and that the Single Pass Clustering and Agglomerative Clustering methods provide the best performance.
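To make the clustering view concrete, the following is a minimal sketch of single pass clustering applied to short page snippets. It is not the authors' implementation: the bag-of-words representation, the cosine similarity measure, the summed-count centroid, and the threshold value of 0.3 are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Lowercased bag-of-words term counts (an assumed, deliberately simple representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass_cluster(docs, threshold=0.3):
    """Assign each document, in order, to its most similar existing cluster,
    or open a new cluster when no similarity reaches the (assumed) threshold.
    Returns a list of clusters, each a list of document indices."""
    clusters = []  # each entry: {"centroid": Counter, "members": [int, ...]}
    for index, text in enumerate(docs):
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(index)
            # Summed counts serve as the centroid; cosine ignores the scale.
            best["centroid"].update(vec)
        else:
            clusters.append({"centroid": Counter(vec), "members": [index]})
    return [cluster["members"] for cluster in clusters]


if __name__ == "__main__":
    pages = [
        "john smith professor of computer science at the university",
        "john smith guitarist in a rock band touring europe",
        "professor john smith teaches computer science courses",
        "rock band guitarist john smith releases new album",
    ]
    for members in single_pass_cluster(pages):
        print(members)
```

Under the person cluster hypothesis, each resulting cluster is read as one person. Because every document is processed exactly once, the outcome depends on document order and on the threshold, so both would need tuning on real data.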

Krisztian Balog, Leif Azzopardi, Maarten de Rijke

Document Clustering Problem | Internet Technology | Person Cluster Hypothesis | Single Pass Clustering | WWW 2008 |

Post Info

Added	21 Nov 2009
Updated	21 Nov 2009
Type	Conference
Year	2008
Where	WWW
Authors	Krisztian Balog, Leif Azzopardi, Maarten de Rijke
