Sciweavers

PAKDD
2009
ACM
Budget Semi-supervised Learning
In this paper we propose to study budget semi-supervised learning, i.e., semi-supervised learning with a resource budget, such as a limited memory insufficient to accommodate and/...
Zhi-Hua Zhou, Michael Ng, Qiao-Qiao She, Yuan Jian...
PAKDD
2009
ACM
Scalable Web Mining with Newistic
Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Ovidiu Dan, Horatiu Mocian
PAKDD
2009
ACM
Aggregated Subset Mining
The usual data mining setting uses the full amount of data to derive patterns for different purposes. Taking cues from machine learning techniques, we explore ways to divide the d...
Albrecht Zimmermann, Björn Bringmann
PAKDD
2009
ACM
Dynamic Exponential Family Matrix Factorization
We propose a new approach to modeling time-varying relational data such as e-mail transactions based on a dynamic extension of matrix factorization. To estimate effectiv...
Kohei Hayashi, Junichiro Hirayama, Shin Ishii
PAKDD
2009
ACM
Data Mining for Intrusion Detection: From Outliers to True Intrusions
Data mining for intrusion detection can be divided into several sub-topics, among which unsupervised clustering has controversial properties. Unsupervised clustering for intrusion...
Goverdhan Singh, Florent Masseglia, Céline ...
PAKDD
2009
ACM
A Multi-resolution Approach for Atypical Behaviour Mining
Atypical behaviours are the basis of valuable knowledge in domains related to security (e.g. fraud detection for credit card [1], cyber security [4] or safety of critical systems...
Alice Marascu, Florent Masseglia
PAKDD
2009
ACM
Clustering Documents Using a Wikipedia-Based Concept Representation
This paper shows how Wikipedia and the semantic knowledge it contains can be exploited for document clustering. We first create a concept-based document representation b...
Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...
PAKDD
2009
ACM
Outlier Detection in Axis-Parallel Subspaces of High Dimensional Data
Hans-Peter Kriegel, Peer Kröger, Erich Schube...
PAKDD
2009
ACM
On Link Privacy in Randomizing Social Networks
Many applications of social networks require relationship anonymity due to the sensitive, stigmatizing, or confidential nature of the relationship. Recent work showed that the simple ...
Xiaowei Ying, Xintao Wu
PAKDD
2009
ACM
Pairwise Constrained Clustering for Sparse and High Dimensional Feature Spaces
Clustering high dimensional data with sparse features is challenging because pairwise distances between data items are not informative in high dimensional space. To addre...
Su Yan, Hai Wang, Dongwon Lee, C. Lee Giles