Published data is prone to privacy attacks. Sanitization methods aim to prevent these attacks while maintaining usefulness of the data for legitimate users. Quantifying the trade-...
Recently, TRW fielded a prototype system for a government customer. It provides a wide range of capabilities, including data collection, hierarchical storage, automated distribution...
Predictive data mining typically relies on labeled data without exploiting a much larger amount of available unlabeled data. The goal of this paper is to show that using unlabeled...
Kang Peng, Slobodan Vucetic, Bo Han, Hongbo Xie, Z...
Validation of multi-column schema matchings is essential for successful database integration. This task is especially difficult when the databases to be integrated contain little o...
Bing Tian Dai, Nick Koudas, Divesh Srivastava, Ant...
Background: Whole genome association studies using highly dense single nucleotide polymorphisms (SNPs) are a set of methods to identify DNA markers associated with variation in a ...
Stephen J. Goodswen, Cedric Gondro, Nathan S. Wats...