Data integrated from multiple sources may contain inconsistencies that violate integrity constraints. The constraint repair problem attempts to find "low cost" changes t...
Philip Bohannon, Michael Flaster, Wenfei Fan, Raje...
We present a replication-based approach to fault-tolerant distributed stream processing in the face of node failures, network failures, and network partitions. Our approach aims t...
Magdalena Balazinska, Hari Balakrishnan, Samuel Ma...
We demonstrate the schema and ontology matching tool COMA++. It extends our previous prototype, COMA, which uses a composite approach to combine different match algorithms [3]. COMA+...
David Aumueller, Hong Hai Do, Sabine Massmann, Erh...
Borealis is a distributed stream processing engine that is being developed at Brandeis University, Brown University, and MIT. Borealis inherits core stream processing functionalit...
We present techniques for privacy-preserving computation of multidimensional aggregates on data partitioned across multiple clients. Data from different clients is perturbed (rand...
Rakesh Agrawal, Ramakrishnan Srikant, Dilys Thomas
Research on query optimization has focused almost exclusively on reducing query execution time, while important qualities such as consistency and predictability have largely been ...
A common problem in many types of databases is retrieving the most similar matches to a query object. Finding those matches in a large database can be too slow to be practical, es...
The description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with "messy" issues like heterogeneou...
Yong Zhao, James E. Dobson, Ian T. Foster, Luc Mor...
With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute s...