The vast majority of work on skyline queries considers totally ordered domains, whereas in many applications some attributes are partially ordered, as for instance, domains of set ...
Usually, performance of classifiers is evaluated on real-world problems that mainly belong to public repositories. However, we ignore the inherent properties of these data and how...
Despite of the large number of algorithms developed for clustering, the study on comparing clustering results is limited. In this paper, we propose a measure for comparing cluster...
Deduplication is a key operation in integrating data from multiple sources. The main challenge in this task is designing a function that can resolve when a pair of records refer t...
The Web is constantly changing, but most tools used to access Web content deal only with what can be captured at a single instance in time. As a result, Web users may not have a g...