Abstract-- In recent years, uncertain data management applications have grown in importance because of the large number of hardware applications which measure data approximately. F...
We introduce Pulse, a framework for processing continuous queries over models of continuous-time data, which can compactly and accurately represent many real-world activities and p...
The randomized response (RR) technique is a promising technique to disguise private categorical data in Privacy-Preserving Data Mining (PPDM). Although a number of RR-based methods...
Sensors capable of sensing phenomena at high data rates on the order of tens to hundreds of thousands of samples per second are now widely deployed in many industrial, civil engine...
Lewis Girod, Yuan Mei, Ryan Newton, Stanislav Rost...
Mining frequent itemsets from data streams has proved to be very difficult because of computational complexity and the need for real-time response. In this paper, we introduce a no...
Abstract-- Privacy preservation in data mining demands protecting both input and output privacy. The former refers to sanitizing the raw data itself before performing mining. The l...
Self-managing solutions have recently attracted a lot of interest from the database community. The need for self-* properties is more evident in distributed applications comprising...
We study the following problem: how to efficiently find in a collection of strings those similar to a given query string? Various similarity functions can be used, such as edit dis...
Abstract-- Simultaneously clustering columns and rows (coclustering) of large data matrix is an important problem with wide applications, such as document mining, microarray analys...
Errors in estimating page counts can lead to poor choice of access methods and in turn to poor quality plans. Although there is past work in using execution feedback for accurate c...
Surajit Chaudhuri, Vivek R. Narasayya, Ravishankar...