Sciweavers

CICLING
2007
Springer

Unsupervised Discrimination of Person Names in Web Contexts

Ambiguous person names are a problem in many forms of written text, including text found on the Web. In this paper we explore the use of unsupervised clustering techniques to discriminate among entities named in Web pages. We examine three main issues via an extensive experimental study. First, we compare using a held-out set of training data for feature selection with using the data in which the ambiguous names occur. Second, we assess the impact of different measures of association for identifying lexical features. Third, we evaluate the success of different cluster stopping measures that automatically determine the number of clusters in the data.
Ted Pedersen, Anagha Kulkarni
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where CICLING
Authors Ted Pedersen, Anagha Kulkarni