Parallelization and Characterization of Probabilistic Latent Semantic Analysis



Probabilistic Latent Semantic Analysis (PLSA) is one of the most popular statistical techniques for the analysis of two-mode and co-occurrence data. It has applications in information retrieval and filtering, natural language processing, machine learning from text, and other related areas. However, PLSA is rarely applied to large datasets because of its high computational complexity. This paper presents an optimized and parallelized implementation of PLSA that can process datasets of 10,000 documents in seconds. Compared to the baseline program, our parallelized program achieves a speedup of more than six on an eight-processor machine. A characterization of the parallel program is also presented. Performance analysis indicates that the program is memory intensive and that limited memory bandwidth is the bottleneck preventing better speedup.
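For readers unfamiliar with the cost structure behind these claims, the following is a minimal, serial sketch of the textbook EM iteration for PLSA. It is not the optimized or parallel implementation described in the paper. The function name `plsa_em`, its parameters, and the dense list-of-lists count matrix are assumptions made for illustration only.

```python
import math
import random


def plsa_em(counts, n_topics, n_iters=50, seed=0):
    """Fit PLSA by EM on a dense document-word count matrix (hypothetical helper)."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def random_dist(size):
        # Strictly positive random distribution that sums to 1.
        raw = [rng.random() + 1e-3 for _ in range(size)]
        total = sum(raw)
        return [v / total for v in raw]

    p_z = [1.0 / n_topics] * n_topics                        # P(z)
    p_d_z = [random_dist(n_docs) for _ in range(n_topics)]   # P(d|z)
    p_w_z = [random_dist(n_words) for _ in range(n_topics)]  # P(w|z)
    log_likelihoods = []

    for _ in range(n_iters):
        new_z = [0.0] * n_topics
        new_d_z = [[0.0] * n_docs for _ in range(n_topics)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        ll = 0.0
        # Every nonzero (document, word) cell is visited once per iteration,
        # and each visit reads and updates one entry per topic.
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z) P(d|z) P(w|z).
                joint = [p_z[z] * p_d_z[z][d] * p_w_z[z][w]
                         for z in range(n_topics)]
                p_dw = sum(joint)
                ll += n_dw * math.log(p_dw)
                for z in range(n_topics):
                    r = n_dw * joint[z] / p_dw  # expected count for topic z
                    new_z[z] += r
                    new_d_z[z][d] += r
                    new_w_z[z][w] += r
        # M-step: renormalize the accumulated expected counts.
        total = sum(new_z)
        for z in range(n_topics):
            if new_z[z] == 0.0:
                p_z[z] = 0.0  # topic received no mass; keep its old rows
                continue
            p_d_z[z] = [v / new_z[z] for v in new_d_z[z]]
            p_w_z[z] = [v / new_z[z] for v in new_w_z[z]]
            p_z[z] = new_z[z] / total
        log_likelihoods.append(ll)
    return p_z, p_d_z, p_w_z, log_likelihoods


if __name__ == "__main__":
    toy_counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
    p_z, p_d_z, p_w_z, lls = plsa_em(toy_counts, n_topics=2, n_iters=30)
    print("P(z):", p_z)
    print("final log-likelihood:", lls[-1])
```

Each iteration does work proportional to the number of nonzero counts times the number of topics, with little arithmetic per value read or written. That is at least consistent with the abstract's observation that the program is memory intensive.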

Chuntao Hong, Wenguang Chen, Weimin Zheng, Jiulong Shan, Yurong Chen, Yimin Zhang


Distributed And Parallel Computing | ICPP 2008 | Parallel Program | Popular Statistical Techniques | Probabilistic Latent Semantic Analysis


» Web usage mining based on probabilistic latent semantic analysis

» Collaborative filtering via gaussian probabilistic latent semantic analysis

» Discovering User Access Pattern Based on Probabilistic Latent Factor Model

» Latent Layout Analysis for Discovering Objects in Images

» Improving Probabilistic Latent Semantic Analysis with Principal Component Analysis

» Geolocated image analysis using latent representations

» Information retrieval based on collaborative filtering with latent interest semantic map

» Probabilistic Latent Semantic Indexing

Post Info

Added	30 May 2010
Updated	30 May 2010
Type	Conference
Year	2008
Where	ICPP
Authors	Chuntao Hong, Wenguang Chen, Weimin Zheng, Jiulong Shan, Yurong Chen, Yimin Zhang
