In this paper, we present a general guideline to find a better distance measure for similarity estimation based on statistical analysis of distribution models and distance function...
Jie Yu, Jaume Amores, Nicu Sebe, Petia Radeva, Qi ...
Information-theoretic measures form a fundamental class of similarity measures for comparing clusterings, alongside the classes of pair-counting based and set-matching based meas...
Numerous measures are used for performance evaluation in machine learning. In predictive knowledge discovery, the most frequently used measure is classification accuracy. With new...
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...
A Hilbert space embedding for probability measures has recently been proposed, with applications including dimensionality reduction, homogeneity testing and independence testing. ...
Bharath K. Sriperumbudur, Arthur Gretton, Kenji Fu...