In this paper we discuss a rigorous foundation of similarity reasoning based on the concept of utility. If utility is formulated in mathematical terms, it can serve as a formal spe...
This paper presents a MapReduce algorithm for computing pairwise document similarity in large document collections. MapReduce is an attractive framework because it allows us to de...
Similarity metrics that are learned from labeled training data can be advantageous in terms of performance and/or efficiency. These learned metrics can then be used in conjuncti...
We study similarity queries for time series data where similarity is defined in terms of a set of linear transformations on the Fourier series representation of a sequence. We hav...
We propose a technique for measuring the structural similarity of semistructured documents based on entropy. After extracting the structural information from two documents, we use ...