Web search logs contain extremely sensitive data, as evidenced by the recent AOL incident. However, storing and analyzing search logs can be very useful for many purposes (i.e. in...
Query optimization in RDF Stores is a challenging problem as SPARQL queries typically contain many more joins than equivalent relational plans, and hence lead to a large join orde...
In this paper we review the history of systems for managing “Big Data” as well as today’s activities and architectures from the (perhaps biased) perspective of three “data...
Cloud-based data management platforms often employ multitenant databases, where service providers achieve economies of scale by consolidating multiple tenants on shared servers. I...
Sean Kenneth Barker, Yun Chi, Hyun Jin Moon, Hakan...
In this paper, we study how to find maximal k-edge-connected subgraphs from a large graph. k-edge-connected subgraphs can be used to capture closely related vertices, and findin...
Increased availability of large repositories of chemical compounds has created new challenges and opportunities for the application of data-mining and indexing techniques to probl...
We present an effective optimization framework for general SQL-like map-reduce queries, which is based on a novel query algebra and uses a small number of higher-order physical ope...
In this paper we consider the problem of answering queries using views, with or without ontological constraints, which is important for data integration, query optimization, and d...
Record linkage analysis, which matches records referring to the same real-world entities from different data sets, is an important task in data integration. Uncertainty often exi...