Contextual search and name disambiguation in email using graphs

Similarity measures for text have historically been an important tool for solving information retrieval problems. In many interesting settings, however, documents are often closely connected to other documents, as well as other non-textual objects: for instance, email messages are connected to other messages via header information. In this paper we consider extended similarity metrics for documents and other objects embedded in graphs, facilitated via a lazy graph walk. We provide a detailed instantiation of this framework for email data, where content, social networks and a timeline are integrated in a structural graph. The suggested framework is evaluated for two email-related problems: disambiguating names in email documents, and threading. We show that reranking schemes based on the graph-walk similarity measures often outperform baseline methods, and that further improvements can be obtained by use of appropriate learning methods.

Categories and Subject Descriptors H.3.3 [Informa...
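
The abstract names a lazy graph walk as the mechanism behind the extended similarity measure but does not spell out its details here. The following is a minimal sketch of the general idea, not the authors' implementation. The toy graph, the node names, the function `lazy_graph_walk`, the uniform choice among neighbours, the fixed step count, and the `stay_prob` value are all assumptions made for illustration.

```python
from collections import defaultdict


def lazy_graph_walk(edges, start, steps=3, stay_prob=0.5):
    """Return a node -> probability map after a lazy walk from `start`.

    edges: dict mapping each node to a list of neighbour nodes
           (a hypothetical toy schema, not the paper's graph schema).
    At every step the walker stays where it is with probability
    `stay_prob`; otherwise it moves to a neighbour picked uniformly
    at random (an assumption of this sketch).
    """
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, mass in dist.items():
            neighbours = edges.get(node, [])
            if not neighbours:
                # Dead end: all probability mass stays on the node.
                nxt[node] += mass
                continue
            nxt[node] += stay_prob * mass
            share = (1.0 - stay_prob) * mass / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        dist = dict(nxt)
    return dist


# Hypothetical email graph: messages linked to a person and a term.
toy_graph = {
    "msg1": ["alice", "term:budget"],
    "msg2": ["alice", "bob", "term:budget"],
    "alice": ["msg1", "msg2"],
    "bob": ["msg2"],
    "term:budget": ["msg1", "msg2"],
}

scores = lazy_graph_walk(toy_graph, "msg1", steps=2, stay_prob=0.5)
# Nodes ranked by how much probability mass the walk assigns to them.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this sketch the probability mass reaching a node serves as its similarity to the start node, so `msg2` receives a nonzero score from `msg1` purely through the shared person and term nodes, while `bob` is not reached within two steps.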

Einat Minkov, William W. Cohen, Andrew Y. Ng

Extended Similarity Metrics | Information Retrieval Problems | SIGIR 2006 | Similarity Measures

Post Info

Added	14 Jun 2010
Updated	14 Jun 2010
Type	Conference
Year	2006
Where	SIGIR
Authors	Einat Minkov, William W. Cohen, Andrew Y. Ng
