Given a set S of strings of equal length over an alphabet Σ, the closest string problem seeks a string over Σ whose maximum Hamming distance to any of the given strings is as s...
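To make the closest string definition concrete, here is a minimal brute-force sketch. It only illustrates the objective (minimizing the maximum Hamming distance) and is not the method of the abstract above; the function names `hamming`, `radius`, and `closest_string_bruteforce` are hypothetical, and the exhaustive search over all |Σ|^L candidates is feasible only for tiny inputs.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def radius(candidate, strings):
    """Maximum Hamming distance from candidate to any string in the set."""
    return max(hamming(candidate, s) for s in strings)


def closest_string_bruteforce(strings, alphabet):
    """Try every string of the right length over the alphabet (illustration only).

    Returns (best_candidate, its_radius). Ties are broken by enumeration order.
    """
    length = len(strings[0])
    candidates = ("".join(chars) for chars in product(alphabet, repeat=length))
    best = min(candidates, key=lambda c: radius(c, strings))
    return best, radius(best, strings)


if __name__ == "__main__":
    print(closest_string_bruteforce(["AAA", "AAB", "ABA"], "AB"))
```

For the three input strings in the usage line, no candidate can be at distance 0 from all of them, so the best achievable radius is 1.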
Similarity search and similarity join on strings are important for applications such as duplicate detection, error detection, data cleansing, or comparison of biological sequences....
We compare different statistical characterizations of a set of strings, for three different histogram-based distances. Given a distance, a set of strings may be characterized by it...
Kernel-based methods (such as k-nearest neighbors classifiers) for AI tasks translate the classification problem into a proximity search problem, in a space that is usu...
An approximate search query on a collection of strings finds those strings in the collection that are similar to a given query string, where similarity is defined using a given si...
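As a rough illustration of an approximate search query, the sketch below scans a collection and keeps the strings within a distance threshold of the query. The choice of Levenshtein edit distance as the similarity function is an assumption made here for the example (the abstract above does not say which function it uses), and the names `edit_distance` and `approximate_search` are hypothetical. A linear scan is the simplest baseline, not an indexed method.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def approximate_search(collection, query, max_distance):
    """Return the strings whose edit distance to query is at most max_distance."""
    return [s for s in collection if edit_distance(s, query) <= max_distance]


if __name__ == "__main__":
    words = ["kitten", "sitting", "mitten", "bitten"]
    print(approximate_search(words, "kitten", 1))
```

Each comparison costs time proportional to the product of the two string lengths, so this baseline mainly serves to pin down what a correct answer set looks like.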