The nearest neighbor classifier is a widely used and effective method for multi-class problems. However, it suffers from the curse of dimensionality in high dimensional spac...
Guo-Jun Zhang, Ji-Xiang Du, De-Shuang Huang, Tat-M...
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...
Most multimedia information retrieval systems use an indexing scheme to speed up similarity search. The index aims to discard large portions of the data collection at query time. ...
Mean shift clustering is a powerful unsupervised data analysis technique which does not require prior knowledge of the number of clusters, and does not constrain the shape of th...
We introduce a new approach for Clustering and Aggregating Relational Data (CARD). We assume that data is available in a relational form, where we only have information about the ...