In the k-medoid problem, given a dataset P, we are asked to choose k points in P as the medoids. The optimal medoid set minimizes the average Euclidean distance between the points ...
Many numerical algorithms are specified in terms of operations on vectors and matrices. Matrix operations can be executed extremely efficiently using specialized linear algebra k...
Binary classification is a core data mining task. For large datasets or real-time applications, desirable classifiers are accurate, fast, and need no parameter tuning. We presen...
Web image search has been explored and developed in academic as well as commercial areas for over a decade. To measure the similarity between Web images and user queries, most of ...
Ying Liu, Tao Qin, Tie-Yan Liu, Lei Zhang, Wei-Yin...
Abstract. Existing methods to text plagiarism analysis mainly base on “chunking”, a process of grouping a text into meaningful units each of which gets encoded by an integer nu...