Detection of near-duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
The entity resolution (ER) problem, which identifies duplicate entities that refer to the same real-world entity, is essential in many applications. In this paper, in particular,...
Byung-Won On, Ergin Elmacioglu, Dongwon Lee, Jaewo...
The web, with its rapid expansion, has become an excellent resource for gathering information and people’s opinions. A company owner wants to know who is the competitor, a...
Rui Li, Shenghua Bao, Jin Wang, Yuanjie Liu, Yong ...
The relationship between doctors and their patients is gaining more and more importance in health care provision. It determines compliance with the treatment and a part of t...
Discovery of association rules is an important problem in database mining. In this paper we present new algorithms for fast association mining, which scan the database only once, ...
Mohammed Javeed Zaki, Srinivasan Parthasarathy, Mi...