In this paper a number of alternative strategies for distributed/parallel association rule mining are investigated. The methods examined make use of a data structure, the T-tree, ...
Top-k similarity joins have been extensively studied and used in a wide spectrum of applications such as information retrieval, decision making, spatial data analysis and dat...
This paper studies the problem of mining entity translations, specifically, mining English and Chinese name pairs. Existing efforts can be categorized into (a) a transliteration-bas...
Gae-won You, Seung-won Hwang, Young-In Song, Long ...
This paper shows how Wikipedia and the semantic knowledge it contains can be exploited for document clustering. We first create a concept-based document representation b...
Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...
Given a user-specified minimum correlation threshold and a market basket database with N items and T transactions, an all-strong-pairs correlation query finds all item pairs with...