Most complex software projects are compiled using a build tool (e.g. make), which runs commands in an order satisfying userdefined dependencies. Unfortunately, most build tools r...
We consider the problem of repairing unranked trees (e.g., XML documents) satisfying a given restriction specification R (e.g., a DTD) into unranked trees satisfying a given targ...
We demonstrate the merits of using inter-document similarities for federated search. Specifically, we study a resultsmerging method that utilizes information induced from cluster...
Modeling user browsing behavior is an active research area with tangible real-world applications, e.g., organizations can adapt their online presence to their visitors browsing be...
Ideally, students in K-12 grade levels can turn to book recommenders to locate books that match their interests. Existing book recommenders, however, fail to take into account the...
Methods that reduce the amount of labeled data needed for training have focused more on selecting which documents to label than on which queries should be labeled. One exception t...
Spectral clustering is a widely used method for organizing data that only relies on pairwise similarity measurements. This makes its application to non-vectorial data straightforw...
Fabian L. Wauthier, Nebojsa Jojic, Michael I. Jord...
We tackle the challenging problem of mining the simplest Boolean patterns from categorical datasets. Instead of complete enumeration, which is typically infeasible for this class ...
Given huge collections of time-evolving events such as web-click logs, which consist of multiple attributes (e.g., URL, userID, timestamp), how do we find patterns and trends? Ho...
Computational experiments have become an integral part of the scientific method, but reproducing, archiving, and querying them is still a challenge. The first barrier to a wider...