A critical problem in clustering research is the definition of a proper metric to measure distances between points. Semi-supervised clustering uses the information provided by the ...
The task in text retrieval is to find the subset of a collection of documents relevant to a user's information request, usually expressed as a set of words. Classically, docu...
The similarity join is an important database primitive which has been successfully applied to speed up applications such as similarity search, data analysis and data mining. The s...
Random projection (RP) is a common technique for dimensionality reduction under L2 norm for which many significant space embedding results have been demonstrated. However, many si...
Dimensionality reduction is an important preprocessing step in high-dimensional data analysis without losing intrinsic information. The problem of semi-supervised nonlinear dimensi...