In this paper, we study the problem of auditing a batch of SQL queries: given a set of SQL queries that have been posed over a database, determine whether some subset of these queries have revealed private information about an individual or group of individuals. In [2], the authors studied the problem of determining whether any single SQL query in isolation revealed information forbidden by the database system’s data disclosure policies. In this paper, we extend this work to the problem of auditing a batch of SQL queries. We define two different notions of auditing - semantic auditing and syntactic auditing - and show that while syntactic auditing seems more desirable, it is in fact NP-hard to achieve. The problem of semantic auditing of a batch of SQL queries is, however, tractable and we give a polynomial time algorithm for this purpose.
Rajeev Motwani, Shubha U. Nabar, Dilys Thomas