APPT
2005
Springer

Principal Component Analysis for Distributed Data Sets with Updating

Download www.math.cuhk.edu.hk

Identifying patterns in large data sets is a key requirement in data mining. A powerful technique for this purpose is principal component analysis (PCA). PCA-based clustering algorithms are effective when the data sets reside at a single location. In applications where large data sets are physically far apart, moving huge amounts of data to one location can be impractical or even impossible. A way around this problem was proposed in [10], where truncated singular value decompositions (SVDs) are computed locally and used to reduce communication costs. Unfortunately, truncated SVDs introduce local approximation errors that can accumulate and degrade the accuracy of the final PCA. In this paper, we introduce a new method that computes the PCA without incurring local approximation errors. In addition, we consider how to update the PCA when new data arrive at the various locations.
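The abstract does not describe the algorithm itself, so the following is only a minimal sketch of one standard way to obtain an exact PCA from distributed data, and should not be read as the authors' method. It assumes each site sends a small summary (row count, column mean, and the triangular R factor of its centered data) instead of raw rows, and that summaries are merged pairwise; the same merge step handles new data arriving later. The function names `local_summary`, `combine`, and `pca_from_summary` are hypothetical.

```python
from functools import reduce

import numpy as np


def local_summary(X):
    """Run at one site: row count, column mean, R factor of the centered rows."""
    n = X.shape[0]
    mean = X.mean(axis=0)
    # R satisfies R.T @ R == (X - mean).T @ (X - mean), the local scatter matrix.
    R = np.linalg.qr(X - mean, mode="r")
    return n, mean, R


def combine(a, b):
    """Merge two summaries into the summary of the pooled rows (no truncation)."""
    (na, ma, Ra), (nb, mb, Rb) = a, b
    n = na + nb
    mean = (na * ma + nb * mb) / n
    # Pooled scatter = scatter_a + scatter_b + (na*nb/n) * outer(ma - mb, ma - mb).
    shift = np.sqrt(na * nb / n) * (ma - mb)
    R = np.linalg.qr(np.vstack([Ra, Rb, shift[None, :]]), mode="r")
    return n, mean, R


def pca_from_summary(summary):
    """Return the global mean, component variances, and principal directions (rows)."""
    n, mean, R = summary
    _, sing, Vt = np.linalg.svd(R, full_matrices=False)
    return mean, sing**2 / (n - 1), Vt


# Illustrative data: three sites, then a later batch arriving as an update.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4)) @ rng.normal(size=(4, 4)) + 5.0
sites = [X[:10], X[10:35], X[35:50]]

summary = reduce(combine, (local_summary(S) for S in sites))
summary = combine(summary, local_summary(X[50:]))  # update with new rows
mean, variances, components = pca_from_summary(summary)
```

In this sketch each site communicates at most a p-by-p triangular factor plus a length-p mean for p variables, regardless of how many rows it holds, and the merged result agrees with a PCA of the pooled data up to rounding error and the sign of each component.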

Zheng-Jian Bai, Raymond H. Chan, Franklin T. Luk

APPT 2005 | Data Sets | Distributed And Parallel Programming | Large Data Sets | Local Approximation Errors

Related Content

» Updating mixture of principal components for error concealment

» Distributed Principal Component Analysis for Wireless Sensor Networks

» Data Acquisition through Joint Compressive Sensing and Principal Component Analysis

» An algorithm for the principal component analysis of large data sets

» Predicting Brain States from fMRI Data Incremental Functional Principal Component Regressi...

» Supervised principal component analysis for gene set enrichment of microarray data with co...

» Gene set analysis using principal components

» Handling of incomplete data sets using ICA and SOM in data mining

» Nonlinear Component Analysis for Large-Scale Data Set Using Fixed-Point Algorithm

Added	26 Jun 2010
Updated	26 Jun 2010
Type	Conference
Year	2005
Where	APPT
Authors	Zheng-Jian Bai, Raymond H. Chan, Franklin T. Luk