Record linkage is the problem of identifying similar records across different data sources. The similarity between two records is defined based on domain-specific similarity functi...
This paper describes a new method for providing recommendations tailored to a user's preferences using text mining techniques and online technical specifications of products....
Alexander Yates, James Joseph, Ana-Maria Popescu, ...
Data cleaning is the process of correcting anomalies in a data source, that may for instance be due to typographical errors, or duplicate representations of an entity. It is a cruc...
Randomized Response techniques have been empirically investigated in privacy preserving association rule mining. However, previous research on privacy preserving market basket data...
We propose a new clustering algorithm, called SyMP, which is based on synchronization of pulse-coupled oscillators. SyMP represents each data point by an Integrate-and-Fire oscill...