Sciweavers

PKDD
2005
Springer

A Probabilistic Clustering-Projection Model for Discrete Data

Download www.dbs.informatik.uni-muenchen.de

For discrete co-occurrence data like documents and words, calculating optimal projections and clustering are two different but related tasks. The goal of projection is to find a low-dimensional latent space for words, and clustering aims at grouping documents based on their feature representations. In general projection and clustering are studied independently, but they both represent the intrinsic structure of data and should reinforce each other. In this paper we introduce a probabilistic clustering-projection (PCP) model for discrete data, where they are both represented in a unified framework. Clustering is seen to be performed in the projected space, and projection explicitly considers clustering structure. Iterating the two operations turns out to be exactly the variational EM algorithm under Bayesian model inference, and thus is guaranteed to improve the data likelihood. The model is evaluated on two text data sets, both showing very encouraging results.
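The alternation described in the abstract can be illustrated with a rough sketch. This is not the PCP model or its variational EM updates, which the abstract does not spell out. It is a simplified stand-in that alternates a k-means-style clustering step in the projected space with a pLSA-style projection update, then pulls each document toward its cluster centre. The function name `alternate_cluster_project`, the `pull` parameter, and all default values are assumptions made for illustration, and unlike the method in the paper this heuristic carries no guarantee of improving the data likelihood.

```python
import numpy as np

def alternate_cluster_project(X, n_topics=2, n_clusters=2, n_iter=20,
                              pull=0.5, seed=0):
    """Heuristic alternation of clustering and projection (NOT the PCP model).

    X is a documents-by-words count matrix. Returns document coordinates in
    the latent space (theta), word distributions per latent dimension (beta),
    and the cluster labels from the last clustering step.
    """
    rng = np.random.default_rng(seed)
    D, V = X.shape
    beta = rng.dirichlet(np.ones(V), size=n_topics)        # K x V, rows sum to 1
    theta = rng.dirichlet(np.ones(n_topics), size=D)       # D x K, rows sum to 1
    centers = theta[rng.choice(D, n_clusters, replace=False)]  # M x K

    for _ in range(n_iter):
        # Clustering step: assign each document to the nearest centre
        # in the projected (latent) space, then recompute the centres.
        dist = ((theta[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(axis=1)
        for m in range(n_clusters):
            if np.any(labels == m):
                centers[m] = theta[labels == m].mean(axis=0)

        # Projection step: pLSA-style update of theta and beta.
        # r[d, v, k] is proportional to theta[d, k] * beta[k, v].
        r = theta[:, None, :] * beta.T[None, :, :]          # D x V x K
        r /= r.sum(-1, keepdims=True) + 1e-12
        weighted = X[:, :, None] * r                        # D x V x K

        theta = weighted.sum(axis=1) + 1e-12                # D x K
        theta /= theta.sum(axis=1, keepdims=True)
        # Let the projection see the clustering: shrink each document
        # toward the centre of the cluster it was assigned to.
        theta = (1 - pull) * theta + pull * centers[labels]

        beta = weighted.sum(axis=0).T + 1e-12               # K x V
        beta /= beta.sum(axis=1, keepdims=True)

    return theta, beta, labels

# Toy usage: four documents over a four-word vocabulary.
X = np.array([[5, 4, 0, 0],
              [4, 5, 0, 0],
              [0, 0, 5, 4],
              [0, 0, 4, 5]])
theta, beta, labels = alternate_cluster_project(X)
```

Setting `pull=0` in this sketch decouples the two steps, which corresponds to the independent treatment of projection and clustering that the abstract argues against.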

Shipeng Yu, Kai Yu, Volker Tresp, Hans-Peter Kriegel

Bayesian Model Inference | Discrete Co-occurrence Data | Low-dimensional Latent Space | PKDD 2005

Related Content

» GaP a factor model for discrete data

» Discrete Mixture Models for Unsupervised Image Segmentation

» Binomial Matrix Factorization for Discrete Collaborative Filtering

» The Orion Uncertain Data Management System

» Learning with Mixtures of Trees

» Perturb-and-MAP Random Fields Using Discrete Optimization to Learn and Sample from Energy Mod...

» Aggregate Queries for Discrete and Continuous Probabilistic XML

» A generative probabilistic approach to visualizing sets of symbolic sequences

» Probabilistic Models For Joint Clustering And TimeWarping Of Multidimensional Curves

Post Info

Added	28 Jun 2010
Updated	28 Jun 2010
Type	Conference
Year	2005
Where	PKDD
Authors	Shipeng Yu, Kai Yu, Volker Tresp, Hans-Peter Kriegel
