Abstract-- The rapid growth of Web communities has motivated many solutions for building community data portals. These solutions follow roughly two approaches. The first approach (...
Pedro DeRose, Xiaoyong Chai, Byron J. Gao, Warren ...
We study the scalable management of XML data in P2P networks based on distributed hash tables (DHTs). We identify performance limitations in this context, and propose an array of t...
We describe a machine-learning-based approach for extracting attribute labels from Web form interfaces. Having these labels is a requirement for several techniques that attempt to ...
Abstract-- Large graph datasets are common in many emerging database applications, and most notably in large-scale scientific applications. To fully exploit the wealth of informati...
Finding latent patterns in high dimensional data is an important research problem with numerous applications. Existing approaches can be summarized into 3 categories: feature selec...
In this paper, we propose the first formal privacy analysis of a data anonymization process known as the synthetic data generation, a technique becoming popular in the statistics c...
Ashwin Machanavajjhala, Daniel Kifer, John M. Abow...
Scientific and intelligence applications have special data handling needs. In these settings, data does not fit the standard model of short coded records that had dominated the dat...
Abstract-- Our demonstration introduces a novel system architecture which massively facilitates optimization in data stream management systems (DSMS). The basic idea is to decouple...
Many data integration solutions in the market today include tools for schema mapping, to help users visually relate elements of different schemas. Schema elements are connected wit...
Alessandro Raffio, Daniele Braga, Mauricio A. Hern...