Sciweavers

» Feature Selection in Clustering Problems
DMKD
2004
ACM
139 views, Data Mining
14 years 2 months ago
Iterative record linkage for cleaning and integration
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...
Indrajit Bhattacharya, Lise Getoor
SIGMOD
2010
ACM
214 views, Database
14 years 1 months ago
ParaTimer: a progress indicator for MapReduce DAGs
Time-oriented progress estimation for parallel queries is a challenging problem that has received only limited attention. In this paper, we present ParaTimer, a new type of timere...
Kristi Morton, Magdalena Balazinska, Dan Grossman
NIPS
2007
13 years 10 months ago
Robust Regression with Twinned Gaussian Processes
We propose a Gaussian process (GP) framework for robust inference in which a GP prior on the mixing weights of a two-component noise model augments the standard process over laten...
Andrew Naish-Guzman, Sean B. Holden
BMCBI
2008
154 views
13 years 9 months ago
Bayesian models and meta analysis for multiple tissue gene expression data following corticosteroid administration
Background: This paper addresses key biological problems and statistical issues in the analysis of large gene expression data sets that describe systemic temporal response cascade...
Yulan Liang, Arpad Kelemen
BMCBI
2006
374 views
13 years 8 months ago
AMDA: an R package for the automated microarray data analysis
Background: Microarrays are routinely used to assess mRNA transcript levels on a genome-wide scale. A large number of microarray datasets are now available in several databases, and...
Mattia Pelizzola, Norman Pavelka, Maria Foti, Paol...