Complex ad hoc join queries over enterprise databases are commonly used by business data analysts to understand and analyze a variety of enterprise-wide processes. However, effectively formulating such queries is a challenging task for human users, especially over databases that have large, heterogeneous schemas. In this paper, we propose a novel approach to automatically create join query recommendations based on input-output specifications (i.e., input tables on which selection conditions are imposed, and output tables whose attribute values must be in the result of the query). The recommended join query graph includes (i) “intermediate” tables, and (ii) join conditions that connect the input and output tables via the intermediate tables. Our method is based on analyzing an existing query log over the enterprise database. Borrowing from program slicing techniques, which extract parts of a program that affect the value of a given variable, we first extract “query slices”...
Xiaoyan Yang, Cecilia M. Procopiuc, Divesh Srivast
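
To make the input-output specification concrete, the following is a minimal sketch of how intermediate tables and join conditions might be recovered from a query log. It is not the query-slice method of the paper: it substitutes a plain breadth-first search over a join graph whose edges are the join conditions observed in the log. The table names, the log contents, and the functions `build_join_graph` and `recommend_join_path` are all hypothetical and chosen only for illustration.

```python
from collections import Counter, deque

# Hypothetical query log (an assumption, not data from the paper):
# each logged query is a list of (table_a, table_b, join_condition) triples.
QUERY_LOG = [
    [("customer", "orders", "customer.id = orders.customer_id")],
    [("orders", "shipment", "orders.id = shipment.order_id"),
     ("customer", "orders", "customer.id = orders.customer_id")],
    [("shipment", "carrier", "shipment.carrier_id = carrier.id")],
]


def build_join_graph(log):
    """Map each table to a Counter of (neighbor, condition) pairs seen in the log."""
    graph = {}
    for query in log:
        for a, b, cond in query:
            # Joins are symmetric, so record the edge in both directions.
            graph.setdefault(a, Counter())[(b, cond)] += 1
            graph.setdefault(b, Counter())[(a, cond)] += 1
    return graph


def recommend_join_path(graph, input_table, output_table):
    """Return the join conditions along a shortest chain of logged joins, or None."""
    queue = deque([(input_table, [])])
    seen = {input_table}
    while queue:
        table, path = queue.popleft()
        if table == output_table:
            return path
        # Among the neighbors of a table, try frequently logged joins first.
        for (nxt, cond), _count in graph.get(table, Counter()).most_common():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [cond]))
    return None


# Input table: customer (where selections would be imposed); output table: carrier.
path = recommend_join_path(build_join_graph(QUERY_LOG), "customer", "carrier")
for condition in path:
    print(condition)
```

In this toy log, `orders` and `shipment` play the role of the intermediate tables that connect the input table to the output table. A shortest-path search ignores which joins co-occur within the same logged query, which is the kind of context a slice-based analysis of the log is meant to preserve.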