Traditional approaches to rule-based information extraction (IE) have primarily been based on regular expression grammars. However, these grammar-based systems have difficulty scal...
Frederick Reiss, Sriram Raghavan, Rajasekar Krishn...
In this paper we present algorithms for building and maintaining efficient collection trees that provide the conduit to disseminate data required for processing monitoring queries...
Validation of multi-column schema matchings is essential for successful database integration. This task is especially difficult when the databases to be integrated contain little o...
Bing Tian Dai, Nick Koudas, Divesh Srivastava, Ant...
In recent years, uncertain data management applications have grown in importance because of the large number of hardware applications which measure data approximately. F...
We introduce Pulse, a framework for processing continuous queries over models of continuous-time data, which can compactly and accurately represent many real-world activities and p...
The randomized response (RR) technique is a promising technique to disguise private categorical data in Privacy-Preserving Data Mining (PPDM). Although a number of RR-based methods...
Sensors capable of sensing phenomena at high data rates on the order of tens to hundreds of thousands of samples per second are now widely deployed in many industrial, civil engine...
Lewis Girod, Yuan Mei, Ryan Newton, Stanislav Rost...
Mining frequent itemsets from data streams has proved to be very difficult because of computational complexity and the need for real-time response. In this paper, we introduce a no...
Privacy preservation in data mining demands protecting both input and output privacy. The former refers to sanitizing the raw data itself before performing mining. The l...
Self-managing solutions have recently attracted a lot of interest from the database community. The need for self-* properties is more evident in distributed applications comprising...