Sciweavers

ICAI
2010

An Efficient Sequential Covering Algorithm for Explaining Subsets of Data

Given a subset of data that differs from the rest, a user often wants an explanation of why this is the case. For instance, in a database of flights, a user may want to understand why certain flights were very late. This paper presents ESCAPE, a sequential covering algorithm designed to generate explanations of subsets. The explanations take the form of rules in disjunctive normal form that describe the characteristics ({attribute, value} pairs) that differentiate the subset from the rest of the data. Our experiments demonstrate that ESCAPE discovers explanations that are both compact, in that just a few rules cover the subset, and specific, in that the rules cover the subset but not the rest of the data. Our experiments compare ESCAPE to RIPPER, a popular, traditional rule learning algorithm, and show that ESCAPE's rules yield better covering explanations. Further, ESCAPE was designed to be efficient, and we formally demonstrate that ESCAPE runs in loglinear time.
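The abstract does not spell out ESCAPE's procedure, so the following is only a minimal sketch of a generic greedy sequential covering loop, not the authors' algorithm, and it makes no attempt at the loglinear running time claimed above. The function names (matches, learn_one_rule, sequential_covering), the precision-based scoring, and the toy flight records are all assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, str]
Rule = Dict[str, str]  # one conjunction of attribute == value tests


def matches(rule: Rule, row: Row) -> bool:
    """True if the row satisfies every {attribute, value} test in the rule."""
    return all(row.get(attr) == val for attr, val in rule.items())


def learn_one_rule(pos: List[Row], neg: List[Row]) -> Rule:
    """Greedily add tests until no row from `neg` is covered (assumed heuristic)."""
    rule: Rule = {}
    cov_pos, cov_neg = pos, neg
    while cov_neg:
        # Candidate tests come from rows the rule still covers in the subset.
        candidates = {(a, v) for r in cov_pos for a, v in r.items() if a not in rule}
        if not candidates:
            break  # no attribute left to specialize on

        def score(test):
            a, v = test
            p = sum(1 for r in cov_pos if r.get(a) == v)
            n = sum(1 for r in cov_neg if r.get(a) == v)
            return (p / (p + n), p)  # precision first, then coverage

        # sorted() makes tie-breaking deterministic.
        a, v = max(sorted(candidates), key=score)
        rule[a] = v
        cov_pos = [r for r in cov_pos if r.get(a) == v]
        cov_neg = [r for r in cov_neg if r.get(a) == v]
    return rule


def sequential_covering(subset: List[Row], rest: List[Row]) -> List[Rule]:
    """Learn rules one at a time; the returned list is read as a disjunction."""
    rules: List[Rule] = []
    uncovered = list(subset)
    while uncovered:
        rule = learn_one_rule(uncovered, rest)
        rules.append(rule)
        # Remove the subset rows this rule explains, then repeat on the remainder.
        uncovered = [r for r in uncovered if not matches(rule, r)]
    return rules


# Hypothetical toy data: late flights (the subset) versus all other flights.
late = [
    {"airline": "A", "origin": "LAX", "weather": "storm"},
    {"airline": "B", "origin": "JFK", "weather": "storm"},
    {"airline": "C", "origin": "ORD", "weather": "clear"},
]
on_time = [
    {"airline": "A", "origin": "LAX", "weather": "clear"},
    {"airline": "B", "origin": "JFK", "weather": "clear"},
    {"airline": "A", "origin": "JFK", "weather": "clear"},
]

rules = sequential_covering(late, on_time)
print(rules)
```

On this toy data the sketch returns two rules, weather = storm and then airline = C. Together they cover every late flight and none of the others, which is the sense of "compact" and "specific" used in the abstract.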
Matthew Michelson, Sofus A. Macskassy
Added 11 Feb 2011
Updated 11 Feb 2011
Type Journal
Year 2010
Where ICAI
Authors Matthew Michelson, Sofus A. Macskassy