A novel method, CLOSS, intended for textual databases is proposed. It successfully identifies misspelled string clusters, even if the cluster border is not prominent. The method ...
Kernel Miner is a new data-mining tool based on building the optimal decision forest. The tool won second place in the KDD'99 Classifier Learning Contest, August 1999. We des...
We consider the problem of detecting anomalies in high arity categorical datasets. In most applications, anomalies are defined as data points that are 'abnormal'. Quite ...
The explosion of Web opinion data has created a need for automatic tools to analyze and understand people’s sentiments toward different topics. In most sentiment analy...
The traditional association rule mining framework produces many redundant rules. The extent of redundancy is a lot larger than previously suspected. We present a new framework for...