

Streaming Cross Document Entity Coreference Resolution

Previous research in cross-document entity coreference has generally been restricted to the offline scenario where the set of documents is provided in advance. As a consequence, the dominant approach is based on greedy agglomerative clustering techniques that utilize pairwise vector comparisons and thus require O(n^2) space and time. In this paper we explore identifying coreferent entity mentions across documents in high-volume streaming text, including methods for utilizing orthographic and contextual information. We test our methods using several corpora to quantitatively measure both the efficacy and scalability of our streaming approach. We show that our approach scales to at least an order of magnitude larger data than previously reported methods.
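
To make the contrast with pairwise agglomerative clustering concrete, here is a minimal single-pass sketch in Python. It is not the authors' algorithm: the class StreamingClusterer, the character-trigram Jaccard score for orthographic similarity, the bag-of-words overlap for context, and the threshold and weight values are all assumptions chosen for illustration. Each incoming mention is compared only against the current cluster summaries rather than against every earlier mention, so no pairwise mention matrix is ever stored.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def char_ngrams(name: str, n: int = 3) -> set[str]:
    """Character n-grams of a lowercased name, padded with '#' at both ends."""
    padded = "#" * (n - 1) + name.lower() + "#" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class Cluster:
    """Running summary of one hypothesized entity (illustrative only)."""
    name_grams: set[str] = field(default_factory=set)
    context: set[str] = field(default_factory=set)
    mentions: list[str] = field(default_factory=list)


class StreamingClusterer:
    """Hypothetical one-pass clusterer; parameters are assumed, not from the paper."""

    def __init__(self, threshold: float = 0.5, name_weight: float = 0.7) -> None:
        self.threshold = threshold
        self.name_weight = name_weight
        self.clusters: list[Cluster] = []

    def add(self, name: str, context_words: list[str]) -> int:
        """Assign one mention to the best cluster or open a new one; return its id."""
        grams = char_ngrams(name)
        ctx = {w.lower() for w in context_words}

        # Compare against cluster summaries only, never against all past mentions.
        best_id, best_score = -1, 0.0
        for i, cluster in enumerate(self.clusters):
            score = (self.name_weight * jaccard(grams, cluster.name_grams)
                     + (1 - self.name_weight) * jaccard(ctx, cluster.context))
            if score > best_score:
                best_id, best_score = i, score

        if best_id < 0 or best_score < self.threshold:
            self.clusters.append(Cluster())
            best_id = len(self.clusters) - 1

        # Fold the mention into the chosen cluster's summary.
        chosen = self.clusters[best_id]
        chosen.name_grams |= grams
        chosen.context |= ctx
        chosen.mentions.append(name)
        return best_id


clusterer = StreamingClusterer()
first = clusterer.add("Barack Obama", ["president", "white", "house"])
second = clusterer.add("Barak Obama", ["president", "senate"])
third = clusterer.add("Paul McNamee", ["researcher"])
```

In this sketch the misspelled "Barak Obama" joins the cluster opened by "Barack Obama", while "Paul McNamee" opens a second cluster. The cost per mention grows with the number of clusters, which in the worst case equals the number of mentions seen so far; a system meant for high-volume streams would need an index or candidate filter on top of this, which the sketch omits.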
Delip Rao, Paul McNamee, Mark Dredze
Added 13 May 2011
Updated 13 May 2011
Type Journal
Year 2010