The proliferation of video content on the web makes similarity detection an indispensable tool in web data management, searching, and navigation. In this paper, we propose a numbe...
We address the issue of classifying complex data. We focus on three main sources of complexity, namely, the high dimensionality of the observed data, the dependencies between these...
We present a new L1-distance-based k-means clustering algorithm to address the challenge of clustering high-dimensional proportional vectors. The new algorithm explicitly incorpor...
Bonnie K. Ray, Hisashi Kashima, Jianying Hu, Monin...
There is an increasing quantity of data with uncertainty arising from applications such as sensor network measurements, record linkage, and the output of mining algorithms. This un...
Supporting top-k queries over distributed collections of schemaless XML data poses two challenges. While XML supports expressive query languages such as XPath and XQuery, these la...