Sciweavers

516 search results - page 24 / 104
» Sets of k-Independent Strings
COCO
2004
Springer
79 views, Algorithms
14 years 29 days ago
On Pseudoentropy versus Compressibility
A source is compressible if we can efficiently compute short descriptions of strings in the support and efficiently recover the strings from the descriptions. A source has high ps...
Hoeteck Wee
ICDAR
1999
IEEE
13 years 12 months ago
Models and Algorithms for Duplicate Document Detection
This paper introduces a framework for clarifying and formalizing the duplicate document detection problem. Four distinct models are presented, each with a corresponding algorithm ...
Daniel P. Lopresti
ICDE
2009
IEEE
135 views, Database
14 years 9 months ago
Space-Constrained Gram-Based Indexing for Efficient Approximate String Search
Answering approximate queries on string collections is important in applications such as data cleaning, query relaxation, and spell checking, where inconsistencies and e...
Alexander Behm, Shengyue Ji, Chen Li, Jiaheng Lu
FOCS
2009
IEEE
13 years 11 months ago
Space-Efficient Framework for Top-k String Retrieval Problems
Given a set D = {d1, d2, ..., dD} of D strings of total length n, our task is to report the "most relevant" strings for a given query pattern P. This involves somewhat mo...
Wing-Kai Hon, Rahul Shah, Jeffrey Scott Vitter
SIGMOD
2008
ACM
142 views, Database
14 years 7 months ago
Cost-based variable-length-gram selection for string collections to support approximate queries efficiently
Approximate queries on a collection of strings are important in many applications such as record linkage, spell checking, and Web search, where inconsistencies and errors exist in...
Xiaochun Yang, Bin Wang, Chen Li