Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
With the increasing availability of spatial data in many applications, spatial clustering and outlier detection has received a lot of attention in the database and data mining comm...
We study the task of detecting the occurrence of objects in large image collections or in videos, a problem that combines aspects of content based image retrieval and object locali...
Partial replication is one type of optimization to speed up execution of queries submitted to large datasets. In partial replication, a portion of the dataset is extracted, re-org...
Abstract. Most real-world datasets are, to a certain degree, skewed. When considered that they are also large, they become the pinnacle challenge in data analysis. More importantly...