On active learning of record matching packages

We consider the problem of learning a record matching package (classifier) in an active learning setting. In active learning, the learning algorithm picks the set of examples to be labeled, unlike the more traditional passive learning setting, where a user selects the labeled examples. Active learning is important for record matching because manually identifying a suitable set of labeled examples is difficult. Previous algorithms that use active learning for record matching have serious limitations: the packages they learn lack quality guarantees, and the algorithms do not scale to large input sizes. We present new algorithms for this problem that overcome these limitations. Our algorithms are fundamentally different from traditional active learning approaches and are designed from the ground up to exploit problem characteristics specific to record matching. We include a detailed experimental evaluation on real-world data demonstrating the effectiveness of our algorithms. Categories and Subjec...
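
To make the active/passive distinction in the abstract concrete, here is a minimal Python sketch of a generic active learner for record matching. It is not the paper's algorithm (the abstract says those differ from traditional approaches); it is a textbook-style illustration under strong assumptions. The names `jaccard`, `learn_threshold`, `is_match`, `oracle`, `PAIRS`, and `TRUE_MATCHES` are all hypothetical, the toy data is invented for illustration, and the sketch assumes that labels are monotone in the similarity score, which real data rarely satisfies exactly.

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def learn_threshold(pairs, oracle):
    """Actively learn a similarity threshold for matching.

    pairs  : unlabeled pool of (record_a, record_b) string pairs
    oracle : hypothetical labeling function; returns True for a match
    Returns (threshold, number_of_labels_requested).

    Assumption: if a pair is a match, every pair with higher similarity
    is also a match, so a binary search over the sorted pool suffices.
    """
    scored = sorted(pairs, key=lambda p: jaccard(*p))
    lo, hi = 0, len(scored)  # index of the first match lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1  # the learner, not the user, chooses this pair
        if oracle(scored[mid]):
            hi = mid
        else:
            lo = mid + 1
    # No match found in the pool: use a threshold nothing can reach.
    threshold = jaccard(*scored[lo]) if lo < len(scored) else float("inf")
    return threshold, queries


def is_match(pair, threshold: float) -> bool:
    """The learned 'package': declare a match at or above the threshold."""
    return jaccard(*pair) >= threshold


# Toy pool of record pairs (invented for illustration).
PAIRS = [
    ("john smith", "john smith"),
    ("acme corp inc", "acme corp"),
    ("data systems ltd", "data systems limited"),
    ("red blue", "green blue"),
    ("alice jones", "bob brown"),
    ("foo bar", "baz qux"),
    ("mary ann lee", "mary lee"),
    ("north west bank", "south east bank"),
]

# Ground truth that a human labeler would supply on request.
TRUE_MATCHES = {
    ("john smith", "john smith"),
    ("acme corp inc", "acme corp"),
    ("data systems ltd", "data systems limited"),
    ("mary ann lee", "mary lee"),
}


def oracle(pair) -> bool:
    """Stand-in for the human labeler."""
    return pair in TRUE_MATCHES


threshold, queries = learn_threshold(PAIRS, oracle)
print(f"threshold={threshold}, labels requested={queries} of {len(PAIRS)}")
```

In this sketch the learner requests labels only for the pairs it selects, so the number of labels grows logarithmically with the pool size rather than linearly. A passive learner would instead depend on whichever labeled pairs the user happened to provide.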

Arvind Arasu, Michaela Götz, Raghav Kaushik

Active Learning | Database | Record Matching | Record Matching Package | SIGMOD 2010

Post Info

Added	18 Jul 2010
Updated	18 Jul 2010
Type	Conference
Year	2010
Where	SIGMOD
Authors	Arvind Arasu, Michaela Götz, Raghav Kaushik
