The problem of similarity search (query-by-content) has attracted much research interest. It is a difficult problem because of the inherently high dimensionality of the data. The ...
The emergence of the web and online computing applications gave rise to rich, large-scale social activity data. One of the principal challenges then is to build models and understandin...
Frequent Pattern Mining (FPM) is a very powerful paradigm for mining informative and useful patterns in massive, complex datasets. In this paper we propose the Data Mining Templat...
Mohammed Javeed Zaki, Nilanjana De, Feng Gao, Paol...
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
A large body of work has been devoted to identifying community structure in networks. A community is often thought of as a set of nodes that has more connections between its member...
Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, Mi...