Clustering is a data mining problem which finds dense regions in a sparse multi-dimensional data set. The attribute values and ranges of these regions characterize the clusters. ...
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
The rapid globalization of Wikipedia is generating a parallel, multi-lingual corpus of unprecedented scale. Pages for the same topic in many different languages emerge both as a r...
Abstract. A new method is proposed for compiling causal independencies into Markov logic networks (MLNs). An MLN can be viewed as compactly representing a factorization of a joint ...
Sriraam Natarajan, Tushar Khot, Daniel Lowd, Prasa...
Discovering episodes, frequent sets of events from a sequence has been an active field in pattern mining. Traditionally, a level-wise approach is used to discover all frequent epis...