A fundamental task of data analysis is comprehending what distinguishes clusters found within the data. We present the problem of mining distinguishing sets which seeks to find s...
Abstract. This paper proposes to exploit content and usage information to rearrange an inverted index for a full-text IR system. The idea is to merge the entries of two frequently ...
Methods based on Boolean satisfiability (SAT) typically use a Conjunctive Normal Form (CNF) representation of the Boolean formula, and exploit the structure of the given problem ...
When classifying high-dimensional sequence data, traditional methods (e.g., HMMs, CRFs) may require large amounts of training data to avoid overfitting. In such cases dimensional...
— Redistricting is the process of dividing a geographic area into districts or zones. This process has been considered in the past as a problem that is computationally too comple...