Resolving Identity Uncertainty with Learned Random Walks



A pervasive problem in large relational databases is identity uncertainty, which occurs when multiple entries in a database refer to the same underlying entity in the world. Relational databases exhibit rich graphical structure and are naturally modeled as graphs whose nodes represent entities and whose typed edges represent relations between them. We propose using random walk models to resolve identity uncertainty, since they have proven effective for finding points that lie close together in a network. Because not all types of relations are equally helpful in alleviating identity uncertainty, we develop a supervised approach to learning the usefulness of different database relations from a training set of database entries whose true identities are known. When tested on the task of resolving the identities of ambiguously named authors in bibliographical data, the learned random walk models yield performance superior to support vector machines, and to a related spectral clusterin...
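To make the idea concrete, here is a minimal sketch of a random walk over a graph with typed edges, where each relation type carries its own weight. It is not the authors' model: it uses a plain random walk with restart as a stand-in, the weights are set by hand rather than learned from labeled entries, and every name in it (`transition_probs`, `walk_with_restart`, the toy node labels, the weight values) is a hypothetical chosen for illustration.

```python
from collections import defaultdict


def transition_probs(edges, type_weights):
    """Build row-normalized transition probabilities from typed edges.

    edges: iterable of (u, v, edge_type), treated as undirected.
    type_weights: dict mapping edge_type -> non-negative weight.
    The weights are hand-picked assumptions here; the abstract describes
    learning the usefulness of each relation type from training data.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for u, v, etype in edges:
        w = type_weights[etype]
        weights[u][v] += w
        weights[v][u] += w
    probs = {}
    for u, nbrs in weights.items():
        total = sum(nbrs.values())
        probs[u] = {v: w / total for v, w in nbrs.items()} if total > 0 else {}
    return probs


def walk_with_restart(probs, start, restart=0.15, steps=50):
    """Approximate the visit distribution of a walk that restarts at `start`."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        nxt[start] += restart  # restart mass always returns to the start node
        for u, mass in dist.items():
            out = probs.get(u, {})
            if not out:
                # Node with no usable outgoing edges: send its mass back to start.
                nxt[start] += (1 - restart) * mass
                continue
            for v, p in out.items():
                nxt[v] += (1 - restart) * mass * p
        dist = dict(nxt)
    return dist


# Toy bibliographic graph: three ambiguous mentions of the same author name.
edges = [
    ("smith_1", "coauthor:lee", "coauthor"),
    ("smith_2", "coauthor:lee", "coauthor"),
    ("smith_1", "venue:ICDM", "venue"),
    ("smith_3", "venue:ICDM", "venue"),
]

# Assumed weights: a shared coauthor counts more than a shared venue.
type_weights = {"coauthor": 5.0, "venue": 1.0}
scores = walk_with_restart(transition_probs(edges, type_weights), "smith_1")

# Same graph with all relation types weighted equally, for comparison.
uniform = {"coauthor": 1.0, "venue": 1.0}
uniform_scores = walk_with_restart(transition_probs(edges, uniform), "smith_1")
```

With the assumed weights, the walk started at `smith_1` places more mass on `smith_2` (shared coauthor) than on `smith_3` (shared venue), while uniform weights leave the two tied. In this sketch the relation weights alone decide which candidate duplicate looks closer, which is the quantity the abstract proposes to learn from entries whose true identities are known.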

Ted Sandler, Lyle H. Ungar, Koby Crammer


Data Mining | ICDM 2009 | Identity Uncertainty | Random Walk Models | Relational Databases


Added	23 May 2010
Updated	23 May 2010
Type	Conference
Year	2009
Where	ICDM
Authors	Ted Sandler, Lyle H. Ungar, Koby Crammer
