on Uncertain Data (Extended Abstract). Ming Hua, Jian Pei, Wenjie Zhang, Xuemin Lin. Simon Fraser University, Canada; The University of New South Wales & NICTA. {mhua, jpei}@cs.sfu.c...
Developments in semantic search technology have motivated the need for efficient and scalable entity annotation techniques. We demonstrate RAD: a tool for Rapid Annotator Developme...
Most current information extraction (IE) approaches have considered only static text corpora, over which we typically have to apply IE only once. Many real-world text co...
Fei Chen, AnHai Doan, Jun Yang, Raghu Ra...
The rapid growth of Web communities has motivated many solutions for building community data portals. These solutions follow roughly two approaches. The first approach (...
Pedro DeRose, Xiaoyong Chai, Byron J. Gao, Warren ...
We study the scalable management of XML data in P2P networks based on distributed hash tables (DHTs). We identify performance limitations in this context, and propose an array of t...
We describe a machine-learning-based approach for extracting attribute labels from Web form interfaces. Having these labels is a requirement for several techniques that attempt to ...
Large graph datasets are common in many emerging database applications, and most notably in large-scale scientific applications. To fully exploit the wealth of informati...
Finding latent patterns in high-dimensional data is an important research problem with numerous applications. Existing approaches can be summarized into three categories: feature selec...
In this paper, we propose the first formal privacy analysis of a data anonymization process known as synthetic data generation, a technique becoming popular in the statistics c...
Ashwin Machanavajjhala, Daniel Kifer, John M. Abow...
Scientific and intelligence applications have special data handling needs. In these settings, data does not fit the standard model of short coded records that had dominated the dat...