Scalable Name Disambiguation using Multi-level Graph Partition

When non-unique values are used as the identifiers of entities, confusion can occur because of homonyms. In particular, when (part of) the "names" of entities are used as their identifiers, the problem is often referred to as the name disambiguation problem, where the goal is to sort out the erroneous entities caused by name homonyms (e.g., if only the last name is used as the identifier, one cannot distinguish "Vannevar Bush" from "George Bush"). In this paper, we study the scalability of the name disambiguation problem: how can it be resolved when (1) a small number of entities with large contents or (2) a large number of entities become indistinguishable due to homonyms? We first carefully examine two state-of-the-art solutions to the name disambiguation problem and point out their limitations with respect to scalability. Then, we adapt the multi-level graph partition technique to solve the large-scale name disambiguation problem. Our claim is empirically...

Byung-Won On, Dongwon Lee

Data Mining | Disambiguation Problem | Entities | Erroneous Entities | SDM 2007 |

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2007
Where	SDM
Authors	Byung-Won On, Dongwon Lee
