Weighted Proximity Best-Joins for Information Retrieval

We consider the problem of efficiently computing weighted proximity best-joins over multiple lists, with applications in information retrieval and extraction. We are given a multi-term query, and for each query term, a list of all its matches with scores, sorted by locations. The problem is to find the overall best matchset, consisting of one match from each list, such that the combined score according to a scoring function is maximized. We study three types of functions that consider both individual match scores and proximity of match locations in scoring a matchset. We present algorithms that exploit the properties of the scoring functions in order to achieve time complexities linear in the size of the match lists. Experiments show that these algorithms greatly outperform the naive algorithm based on taking the cross product of all match lists. Finally, we extend our algorithms for an alternative problem definition applicable to information extraction, where we need to find all good ...
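
To make the problem statement concrete, the following is a minimal sketch of the naive baseline the abstract mentions: enumerate the cross product of all match lists and keep the highest-scoring matchset. The `(location, score)` representation, the function names, and in particular the scoring function (sum of match scores minus the width of the window spanning the match locations) are illustrative assumptions, not the definitions used in the paper.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

# Assumed representation: a match is (location, score).
Match = Tuple[int, float]


def window_score(matchset: Sequence[Match]) -> float:
    """Hypothetical scoring function: total match score minus the
    width of the smallest window covering all match locations."""
    locations = [loc for loc, _ in matchset]
    total = sum(score for _, score in matchset)
    return total - (max(locations) - min(locations))


def naive_best_join(
    match_lists: List[List[Match]],
) -> Tuple[Optional[Tuple[Match, ...]], float]:
    """Try every combination of one match per list (cross product)
    and return the best matchset with its score."""
    best_set, best_score = None, float("-inf")
    for candidate in product(*match_lists):
        s = window_score(candidate)
        if s > best_score:
            best_set, best_score = candidate, s
    return best_set, best_score


# Two query terms, each with a list of matches sorted by location.
lists = [
    [(1, 5.0), (10, 9.0)],
    [(3, 4.0), (12, 4.0)],
]
best, score = naive_best_join(lists)
print(best, score)
```

The running time of this baseline grows with the product of the list lengths, which is the cost the paper's linear-time algorithms are designed to avoid.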

Database | ICDE 2009 | Individual Match Scores | Match Lists | Match Locations | Problem Definition Applicable | Scoring Functions |

Related Content

» Interactive Retrieval Using Weights

» Facetbased opinion retrieval from blogs

» Fast query execution for retrieval models based on pathconstrained random walks

» Proximity queries in large traffic networks

» University of Glasgow at TREC 2007 Experiments in Blog and Enterprise Tracks with Terrier

» Positional relevance model for pseudorelevance feedback

» Query Answering and Containment for Regular Path Queries under Distortions

Post Info

Added	20 Oct 2009
Updated	20 Oct 2009
Type	Conference
Year	2009
Where	ICDE
Authors	AnHai Doan, Haixun Wang, Hao He, Jun Yang 0001, Risi Thonangi
