Traditional content-based image retrieval (CBIR) systems often fail to meet a user's need due to the `semantic gap' between the extracted features of the systems and the...
This paper presents a framework for user-oriented text mining. It is then illustrated with an example of discovering knowledge from competitors’ websites. The knowledge to be di...
Discovering accurate and interesting classification rules is a significant task in the post-processing stage of a data mining (DM) process. Therefore, an optimization problem exis...
Heterogeneous and dirty data is abundant. It is stored under different, often opaque schemata, it represents identical real-world objects multiple times, causing duplicates, and ...
Alexander Bilke, Jens Bleiholder, Christoph Bö...
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...