
DEBU
2010


Weighted Set-Based String Similarity


Consider a universe of tokens, each of which is associated with a weight, and a database of strings, each represented as a subset of these tokens. Given a query string, also represented as a set of tokens, a weighted string similarity query identifies all strings in the database whose similarity to the query exceeds a user-specified threshold. Such queries are useful in applications like data cleaning and integration, where approximate matches must be found despite typographical mistakes, multiple formatting conventions, and data transformation errors. We show that this problem has semantic properties that can be exploited to design index structures that support very efficient algorithms for query answering.
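To make the query definition above concrete, here is a minimal Python sketch of a threshold query over token sets. The abstract names neither the similarity function nor the index design, so everything here is an assumption for illustration: weighted Jaccard as the measure, a plain inverted index for candidate generation, and the hypothetical names `weighted_jaccard`, `build_index`, and `similarity_query`. The sample tokens and weights are invented. It does not reproduce the pruning techniques the paper refers to.

```python
from collections import defaultdict

def weighted_jaccard(a, b, weight):
    """Assumed measure: weight of shared tokens over weight of all tokens."""
    inter = sum(weight[t] for t in a & b)
    union = sum(weight[t] for t in a | b)
    return inter / union if union else 0.0

def build_index(database):
    """Plain inverted index: token -> ids of the strings that contain it."""
    index = defaultdict(set)
    for sid, tokens in database.items():
        for t in tokens:
            index[t].add(sid)
    return index

def similarity_query(query, database, index, weight, threshold):
    """Return (id, score) pairs whose score is strictly above the threshold."""
    # Only strings sharing at least one token with the query can score > 0.
    candidates = set()
    for t in query:
        candidates |= index.get(t, set())
    results = []
    for sid in candidates:
        score = weighted_jaccard(query, database[sid], weight)
        if score > threshold:
            results.append((sid, score))
    # Highest similarity first.
    return sorted(results, key=lambda r: -r[1])

# Illustrative data only; tokens and weights are made up for this sketch.
weight = {"main": 1.0, "st": 0.5, "street": 0.5,
          "maine": 2.0, "florham": 3.0, "park": 1.0}
database = {
    1: {"main", "st", "florham"},
    2: {"main", "street", "park"},
    3: {"maine", "park"},
}
index = build_index(database)
query = {"main", "st", "park"}

matches = similarity_query(query, database, index, weight, 0.5)
print([sid for sid, _ in matches])
```

With these weights, only string 2 clears a threshold of 0.5: it shares `main` and `park` with the query (weight 2.0) out of a combined token weight of 3.0. Verifying every candidate in full is the naive baseline; an index that exploits the structure of the weights would aim to discard most candidates before this step.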

Marios Hadjieleftheriou, Divesh Srivastava



Post Info

Added	10 Dec 2010
Updated	10 Dec 2010
Type	Journal
Year	2010
Where	DEBU
Authors	Marios Hadjieleftheriou, Divesh Srivastava
