In this paper, we present a novel exploratory visual analytic system called TIARA (Text Insight via Automated Responsive Analytics), which combines text analytics and interactive ...
In plenty of scenarios, data can be represented as vectors mathematically abstracted as points in a Euclidean space. Because a great number of machine learning and data mining app...
Active learning (AL) is an increasingly popular strategy for mitigating the amount of labeled data required to train classifiers, thereby reducing annotator effort. We describe ...
Byron C. Wallace, Kevin Small, Carla E. Brodley, T...
Many real-world data mining tasks require the achievement of two distinct goals when applied to unseen data: first, to induce an accurate preference ranking, and second to give g...
Constrained clustering has been well-studied for algorithms like K-means and hierarchical agglomerative clustering. However, how to encode constraints into spectral clustering rem...
Many social Web sites allow users to annotate the content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured ...
Anon Plangprasopchok, Kristina Lerman, Lise Getoor
Internet users regularly have the need to find biographies and facts of people of interest. Wikipedia has become the first stop for celebrity biographies and facts. However, Wik...
Xiaojiang Liu, Zaiqing Nie, Nenghai Yu, Ji-Rong We...
The detection of repeated subsequences, time series motifs, is a problem which has been shown to have great utility for several higher-level data mining algorithms, including clas...
Recent research in privacy-preserving data mining (PPDM) has become increasingly popular due to the wide application of data mining and the increased concern regarding the protect...
Bin Yang, Hiroshi Nakagawa, Issei Sato, Jun Sakuma