To enable efficient similarity search in large databases, many indexing techniques use a linear transformation scheme to reduce dimensions and allow fast approximation. In this re...
We present a multi-dimensional indexing approach for fast sequence similarity search in DNA and protein databases. In particular, we propose effective transformations of subsequen...
Metric learning algorithms can provide useful distance functions for a variety of domains, and recent work has shown good accuracy for problems where the learner can access all di...
Prateek Jain, Brian Kulis, Inderjit S. Dhillon, Kr...
We study similarity queries for time series data where similarity is defined in terms of a set of linear transformations on the Fourier series representation of a sequence. We hav...
Identification of all objects in a dataset whose similarity is not less than a specified threshold is of major importance for management, search, and analysis of data. Set similari...