Sciweavers

ICDM
2005
IEEE
Finding Representative Set from Massive Data
In the information age, data is pervasive. In some applications, data explosion is a significant phenomenon. The massive data volume poses challenges to both human users and comp...
Feng Pan, Wei Wang 0010, Anthony K. H. Tung, Jiong...
ICDM
2005
IEEE
Balancing Exploration and Exploitation: A New Algorithm for Active Machine Learning
Active machine learning algorithms are used when large numbers of unlabeled examples are available and getting labels for them is costly (e.g. requiring consulting a human expert)...
Thomas Takeo Osugi, Kun Deng, Stephen D. Scott
ICDM
2005
IEEE
Mining Patterns That Respond to Actions
Data mining focuses on patterns that summarize the data. In this paper, we focus on mining patterns that could change the state by responding to opportunities of actions.
Yuelong Jiang, Ke Wang, Alexander Tuzhilin, Ada Wa...
ICDM
2005
IEEE
Average Number of Frequent (Closed) Patterns in Bernouilli and Markovian Databases
In data mining, enumerating the frequent or the closed patterns is often the first difficult task leading to association rule discovery. The number of these patterns represen...
Loïck Lhote, François Rioult, Arnaud S...
ICDM
2005
IEEE
An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation
Monotonicity is a simple yet significant qualitative characteristic. We consider the problem of segmenting an array into up to K segments. We want segments to be as monotonic as po...
Daniel Lemire, Martin Brooks, Yuhong Yan
ICDM
2005
IEEE
CLUGO: A Clustering Algorithm for Automated Functional Annotations Based on Gene Ontology
We address the issue of providing highly informative and comprehensive annotations using information revealed by the structured vocabularies of Gene Ontology (GO). For a target, a...
In-Yee Lee, Jan-Ming Ho, Ming-Syan Chen
ICDM
2005
IEEE
Partial Elastic Matching of Time Series
We consider the problem of elastic matching of time series. We propose an algorithm that determines a subsequence of a target time series that best matches a query series. In the ...
Longin Jan Latecki, Vasileios Megalooikonomou, Qia...
ICDM
2005
IEEE
Hierarchy-Regularized Latent Semantic Indexing
Organizing textual documents into a hierarchical taxonomy is a common practice in knowledge management. Besides textual features, the hierarchical structure of directories reflect...
Yi Huang, Kai Yu, Matthias Schubert, Shipeng Yu, V...
ICDM
2005
IEEE
Triple Jump Acceleration for the EM Algorithm
This paper presents the triple jump framework for accelerating the EM algorithm and other bound optimization methods. The idea is to extrapolate the third search point based on th...
Han-Shen Huang, Bou-Ho Yang, Chun-Nan Hsu
ICDM
2005
IEEE
On Learning Asymmetric Dissimilarity Measures
Many practical applications require distance measures to be asymmetric and context-sensitive. We introduce Context-sensitive Learnable Asymmetric Dissimilarity (CLAD) measure...
Krishna Kummamuru, Raghu Krishnapuram, Rakesh Agra...