We identify privacy risks associated with releasing network data sets and provide an algorithm that mitigates those risks. A network consists of entities connected by links repres...
Michael Hay, Gerome Miklau, David Jensen, Donald F...
Recently, a number of papers have been published showing the benefits of column stores over row stores. However, the research comparing the two in an "apples-to-apples" ...
The number of successful attacks on the Internet shows that it is very difficult to guarantee the security of online search engines. A breached server that is not detected in time...
We study selectivity estimation techniques for set similarity queries. A wide variety of similarity measures for sets have been proposed in the past. In this work we concentrate o...
Marios Hadjieleftheriou, Xiaohui Yu, Nick Koudas, ...
DBPubs is a system for effectively analyzing and exploring the content of database publications by combining keyword search with OLAP-style aggregations, navigation, and reporting...
Akanksha Baid, Andrey Balmin, Heasoo Hwang, Erik N...
The number of potentially-related data resources available for querying -- databases, data warehouses, virtual integrated schemas -- continues to grow rapidly. Perhaps no area has s...
Partha Pratim Talukdar, Marie Jacob, Muhammad Salm...
With the rise of XML, the database community has been challenged by semi-structured data processing. Since the data type behind XML is the tree, state-of-the-art RDBMSs have learn...
Sequence data is ubiquitous, and finding frequent sequences in a large database is one of the most common problems when analyzing sequence data. Unfortunately, many sources of seque...
Online communities like Flickr, del.icio.us and YouTube have established themselves as very popular and powerful services for publishing and searching contents, but also for ident...
Tom Crecelius, Mouna Kacimi, Sebastian Michel, Tho...