Pattern matching for intelligence organizations is a challenging problem. The data sets are large and noisy, and there is a flexible and constantly changing notion of what consti...
Michael Wolverton, Pauline Berry, Ian W. Harrison,...
We introduce a class of geodesic distances and extend the K-means clustering algorithm to employ this distance metric. Empirically, we demonstrate that our geodesic K-means algori...
In data integration applications, a join matches elements that are common to two data sources. Often, however, elements are represented slightly different in each source, so an app...
There has been significant recent interest in sparse metric learning (SML) in which we simultaneously learn both a good distance metric and a low-dimensional representation. Unfor...
Abstract. A crucial problem in machine learning is to choose an appropriate representation of data, in a way that emphasizes the relations we are interested in. In many cases this ...