Frequently the behaviour of an information system is functionally correct, but it does not meet some quality criteria, such as completeness, consistency, and usability. One way to...
A method is presented to partition a given set of data entries embedded in Euclidean space by recursively bisecting clusters into smaller ones. The initial set is subdivided into ...
In spite of the initialization problem, the Expectation-Maximization (EM) algorithm is widely used for estimating the parameters in several data mining related tasks. Most popular ...
Chandan K. Reddy, Hsiao-Dong Chiang, Bala Rajaratn...
The problem of record linkage focuses on determining whether two object descriptions refer to the same underlying entity. Addressing this problem effectively has many practical ap...
Document clustering has been used for better document retrieval, document browsing, and text mining in digital libraries. In this paper, we perform a comprehensive comparison study ...