Sciweavers

» Fully distributed EM for very large datasets
PAKDD
2010
ACM
173 views | Data Mining
13 years 5 months ago
Distributed Knowledge Discovery with Non Linear Dimensionality Reduction
The results of data mining tasks are usually improved by reducing the dimensionality of the data. This improvement, however, is harder to achieve when the data lie on a non linear manifol...
Panagis Magdalinos, Michalis Vazirgiannis, Dialect...
SDM
2008
SIAM
177 views | Data Mining
13 years 9 months ago
Practical Private Computation and Zero-Knowledge Tools for Privacy-Preserving Distributed Data Mining
In this paper, we explore private computation built on vector addition and its applications in privacy-preserving data mining. Vector addition is a surprisingly general tool for imp...
Yitao Duan, John F. Canny
DBKDA
2010
IEEE
127 views | Database
13 years 6 months ago
Failure-Tolerant Transaction Routing at Large Scale
Emerging Web 2.0 applications such as virtual worlds or social networking websites strongly differ from usual OLTP applications. First, the transactions are encapsulated in an AP...
Idrissa Sarr, Hubert Naacke, Stéphane Gan&c...
VLDB
2002
ACM
154 views | Database
13 years 7 months ago
I/O-Conscious Data Preparation for Large-Scale Web Search Engines
Given that commercial search engines cover billions of web pages, efficiently managing the corresponding volumes of disk-resident data needed to answer user queries quickly is a f...
Maxim Lifantsev, Tzi-cker Chiueh
FOIKS
2008
Springer
14 years 4 months ago
Cost-minimising strategies for data labelling : optimal stopping and active learning
Supervised learning deals with the inference of a distribution over an output or label space $\mathcal{Y}$ conditioned on points in an observation space $\mathcal{X}$, given a training dataset $D$...
Christos Dimitrakakis, Christian Savu-Krohn