KDD
2003
ACM
Information-theoretic co-clustering
Two-dimensional contingency or co-occurrence tables arise frequently in important applications such as text, web-log and market-basket data analysis. A basic problem in contingenc...
Inderjit S. Dhillon, Subramanyam Mallela, Dharmend...
KDD
2003
ACM
Finding recent frequent itemsets adaptively over online data streams
A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Consequently, the knowledge embedded in a data stream is more likely to be c...
Joong Hyuk Chang, Won Suk Lee
KDD
2003
ACM
Translation-invariant mixture models for curve clustering
Darya Chudova, Scott Gaffney, Eric Mjolsness, Padh...
KDD
2003
ACM
Probabilistic discovery of time series motifs
Several important time series data mining problems reduce to the core task of finding approximately repeated subsequences in a longer time series. In an earlier work, we formalize...
Bill Yuan-chi Chiu, Eamonn J. Keogh, Stefano Lonar...
KDD
2003
ACM
Understanding captions in biomedical publications
From the standpoint of the automated extraction of scientific knowledge, an important but little-studied part of scientific publications are the figures and accompanying captions....
William W. Cohen, Richard C. Wang, Robert F. Murph...
KDD
2003
ACM
Style mining of electronic messages for multiple authorship discrimination: first results
This paper considers the use of computational stylistics for performing authorship attribution of electronic messages, addressing categorization problems with as many as 20 differ...
Shlomo Argamon, Marin Saric, Sterling Stuart Stein
KDD
2003
ACM
An adaptive nearest neighbor search for a parts acquisition ePortal
Rafael Alonso, Jeffrey A. Bloom, Hua Li, Chumki Ba...
KDD
2003
ACM
Towards systematic design of distance functions for data mining applications
Distance function computation is a key subtask in many data mining algorithms and applications. The most effective form of the distance function can only be expressed in the conte...
Charu C. Aggarwal
KDD
2003
ACM
Mining distance-based outliers in near linear time with randomization and a simple pruning rule
Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal o...
Stephen D. Bay, Mark Schwabacher