Sciweavers

69 search results - page 12 / 14
» Parallel Mining of Outliers in Large Database
KDD
2007
ACM
Cleaning disguised missing data: a heuristic approach
In some applications such as filling in a customer information form on the web, some missing values may not be explicitly represented as such, but instead appear as potentially va...
Ming Hua, Jian Pei
KDD
2004
ACM
Clustering time series from ARMA models with clipped data
Clustering time series is a problem that has applications in a wide variety of fields, and has recently attracted a large amount of research. In this paper we focus on clustering...
Anthony J. Bagnall, Gareth J. Janacek
BMCBI
2005
ParPEST: a pipeline for EST data analysis based on parallel computing
Background: Expressed Sequence Tags (ESTs) are short and error-prone DNA sequences generated from the 5' and 3' ends of randomly selected cDNA clones. They provide an im...
Nunzio D'Agostino, Mario Aversano, Maria Luisa Chi...
KDD
2001
ACM
The distributed boosting algorithm
In this paper, we propose a general framework for distributed boosting intended for efficiently integrating specialized classifiers learned over very large and distributed homogeneo...
Aleksandar Lazarevic, Zoran Obradovic
VLDB
2005
ACM
Cache-conscious Frequent Pattern Mining on a Modern Processor
In this paper, we examine the performance of frequent pattern mining algorithms on a modern processor. A detailed performance study reveals that even the best frequent pattern min...
Amol Ghoting, Gregory Buehrer, Srinivasan Parthasa...