Sciweavers

PAKDD
2009
ACM
80views Data Mining» more  PAKDD 2009»
14 years 6 months ago
Trace Mining from Distributed Assembly Databases for Causal Analysis
Shohei Hido, Hirofumi Matsuzawa, Fumihiko Kitayama...
Romanization of Thai Proper Names Based on Popularity of Usages
The lack of standards for Romanization of Thai proper names makes searching a challenging task. This is particularly important when searching for people-related documents ...
Akegapon Tangverapong, Atiwong Suchato, Proadpran ...
When does Co-training Work in Real Data?
Co-training, a paradigm of semi-supervised learning, may effectively alleviate the data scarcity problem (i.e., the lack of labeled examples) in supervised learning. The standard ...
Charles X. Ling, Jun Du, Zhi-Hua Zhou
Application-Independent Feature Construction from Noisy Samples
When training classifiers, the presence of noise can severely harm their performance. In this paper, we focus on “non-class” attribute noise and we consider how a frequent fault-t...
Dominique Gay, Nazha Selmaoui, Jean-Françoi...
Tree-Based Method for Classifying Websites Using Extended Hidden Markov Models
One important problem proposed recently in the field of web mining is the website classification problem. The complexity together with the necessity to have accurate and fast algorit...
Majid Yazdani, Milad Eftekhar, Hassan Abolhassani
Hot Item Detection in Uncertain Data
An object o of a database D is called a hot item if there is a sufficiently large population of other objects in D that are similar to o. In other words, hot items are ...
Thomas Bernecker, Hans-Peter Kriegel, Matthias Ren...
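The hot-item definition in the entry above can be illustrated with a minimal sketch for ordinary (certain) data. This is not the authors' method, which targets uncertain data. The function name `hot_items`, the Euclidean distance as the similarity measure, and the parameters `eps` (similarity radius) and `min_count` (required number of similar objects) are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def hot_items(points, eps, min_count):
    """Return the points that have at least `min_count` OTHER points
    within distance `eps` (hypothetical parameters, not from the paper)."""
    hot = []
    for i, p in enumerate(points):
        # Count the other objects in the database that are similar to p.
        neighbours = sum(
            1 for j, q in enumerate(points) if j != i and dist(p, q) <= eps
        )
        if neighbours >= min_count:
            hot.append(p)
    return hot


# Three points close together and one far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
result = hot_items(points, eps=0.5, min_count=2)
```

With these inputs, the three clustered points each have two similar neighbours and qualify as hot items, while the isolated point does not. The quadratic scan is chosen for clarity only.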
Spatial Weighting for Bag-of-Visual-Words and Its Application in Content-Based Image Retrieval
It is a challenging and important task to retrieve images from a large and highly varied image data set based on their visual contents. Problems like how to fill the semantic gap b...
Xin Chen, Xiaohua Hu, Xiajiong Shen
Budget Semi-supervised Learning
In this paper we propose to study budget semi-supervised learning, i.e., semi-supervised learning with a resource budget, such as a limited memory insufficient to accommodate and/...
Zhi-Hua Zhou, Michael Ng, Qiao-Qiao She, Yuan Jian...
Scalable Web Mining with Newistic
Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Ovidiu Dan, Horatiu Mocian
Aggregated Subset Mining
The usual data mining setting uses the full amount of data to derive patterns for different purposes. Taking cues from machine learning techniques, we explore ways to divide the d...
Albrecht Zimmermann, Björn Bringmann