In the information age, data is pervasive. In some applications, data explosion is a significant phenomenon. The massive data volume poses challenges to both human users and comp...
Feng Pan, Wei Wang 0010, Anthony K. H. Tung, Jiong...
Active machine learning algorithms are used when large numbers of unlabeled examples are available and getting labels for them is costly (e.g. requiring consulting a human expert)...
Data mining focuses on patterns that summarize the data. In this paper, we focus on mining patterns that could change the state by responding to opportunities of actions.
Yuelong Jiang, Ke Wang, Alexander Tuzhilin, Ada Wa...
In data mining, enumerating the frequent or closed patterns is often the first difficult task leading to association rule discovery. The number of these patterns represen...
Monotonicity is a simple yet significant qualitative characteristic. We consider the problem of segmenting an array into up to K segments. We want segments to be as monotonic as po...
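The segmentation problem stated in this abstract can be illustrated with a small dynamic program. This is only a sketch under stated assumptions, not the paper's algorithm: it assumes the quality of a segment is the L-infinity error of its best monotone fit (for a nondecreasing fit, half the largest drop between an earlier and a later value), and it minimizes the worst segment error. The names `monotone_error` and `segment` are hypothetical.

```python
from typing import List, Sequence, Tuple


def monotone_error(xs: Sequence[float]) -> float:
    """L-infinity error of the best monotone fit to xs (assumed cost)."""
    inc = dec = 0.0
    hi = lo = xs[0]
    for x in xs:
        hi = max(hi, x)
        lo = min(lo, x)
        inc = max(inc, (hi - x) / 2)  # half the largest drop: nondecreasing fit
        dec = max(dec, (x - lo) / 2)  # half the largest rise: nonincreasing fit
    return min(inc, dec)


def segment(xs: Sequence[float], k: int) -> Tuple[float, List[Tuple[int, int]]]:
    """Split xs into at most k contiguous segments, minimizing the worst
    per-segment monotone error. Returns (error, [(start, end), ...]) with
    half-open index ranges. Requires len(xs) >= 1 and k >= 1."""
    n = len(xs)
    inf = float("inf")
    # best[m][j]: smallest worst-segment error covering xs[:j] with m segments
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(1, n + 1):
            for i in range(m - 1, j):
                if best[m - 1][i] == inf:
                    continue
                cost = max(best[m - 1][i], monotone_error(xs[i:j]))
                if cost < best[m][j]:
                    best[m][j] = cost
                    cut[m][j] = i
    # Prefer the fewest segments among those reaching the minimal error.
    m_best = min(range(1, k + 1), key=lambda m: (best[m][n], m))
    bounds = []
    j = n
    for m in range(m_best, 0, -1):
        i = cut[m][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[m_best][n], bounds
```

For example, `segment([1, 2, 3, 2, 1], 2)` splits the array at the peak and reaches zero error, while a single segment cannot. The sketch recomputes segment costs and so runs in roughly O(K n^3) time; it favors clarity over speed.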
We address the issue of providing highly informative and comprehensive annotations using information revealed by the structured vocabularies of Gene Ontology (GO). For a target, a...
We consider the problem of elastic matching of time series. We propose an algorithm that determines a subsequence of a target time series that best matches a query series. In the ...
Longin Jan Latecki, Vasileios Megalooikonomou, Qia...
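The task described in the abstract above, finding the subsequence of a target series that best matches a query, can be sketched with subsequence dynamic time warping. This is a swapped-in, well-known technique used for illustration only; the abstract does not say which elastic matching method the authors use. The function name `subsequence_dtw` and the absolute-difference cost are assumptions.

```python
from typing import Sequence, Tuple


def subsequence_dtw(query: Sequence[float],
                    target: Sequence[float]) -> Tuple[float, int, int]:
    """Return (cost, start, end) such that target[start:end] is the
    subsequence with the smallest DTW cost against query."""
    n, m = len(query), len(target)
    inf = float("inf")
    # D[i][j]: best cost of aligning query[:i] with a target subsequence
    # that ends at target[j - 1]; start[i][j] is where that subsequence begins.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    start = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        D[0][j] = 0.0      # the match may begin anywhere in the target
        start[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - target[j - 1])
            candidates = [
                (D[i - 1][j - 1], start[i - 1][j - 1]),  # advance both series
                (D[i][j - 1], start[i][j - 1]),          # stretch the query point
            ]
            if i > 1:
                # stretch the target point (not allowed from the empty row)
                candidates.append((D[i - 1][j], start[i - 1][j]))
            prev, s = min(candidates)
            D[i][j] = cost + prev
            start[i][j] = s
    # The match may also end anywhere in the target.
    end = min(range(1, m + 1), key=lambda j: D[n][j])
    return D[n][end], start[n][end], end
```

For instance, matching the query `[1, 2, 3]` against the target `[0, 0, 1, 2, 3, 0, 0]` locates `target[2:5]` with zero cost. The sketch runs in O(nm) time and memory.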
Organizing textual documents into a hierarchical taxonomy is a common practice in knowledge management. Besides textual features, the hierarchical structure of directories reflect...
Yi Huang, Kai Yu, Matthias Schubert, Shipeng Yu, V...
This paper presents the triple jump framework for accelerating the EM algorithm and other bound optimization methods. The idea is to extrapolate the third search point based on th...
Many practical applications require distance measures to be asymmetric and context-sensitive. We introduce Context-sensitive Learnable Asymmetric Dissimilarity (CLAD) measure...