Hierarchy-Regularized Latent Semantic Indexing

Download www.dbs.informatik.uni-muenchen.de

Organizing textual documents into a hierarchical taxonomy is a common practice in knowledge management. Besides textual features, the hierarchical structure of directories reflects additional and important knowledge annotated by experts. It is generally desirable to incorporate this information into text mining processes. In this paper, we propose hierarchy-regularized latent semantic indexing, which encodes the hierarchy into a similarity graph of documents and then formulates an optimization problem that maps each document into a low-dimensional vector space. The new feature space preserves the intrinsic structure of the original taxonomy and thus provides a meaningful basis for various learning tasks such as visualization and classification. Our approach employs information about class proximity and class specificity, and can naturally cope with multi-labeled documents. Our empirical studies show very encouraging results on two real-world data sets, the new Reuters (RCV1) benchmark ...
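
The idea in the abstract (turn the taxonomy into a document similarity graph, then map documents into a low-dimensional space that respects it) can be illustrated with a small sketch. This is not the paper's formulation, which the abstract does not spell out. It assumes a Wu-Palmer-style path similarity as a stand-in for class proximity and specificity, takes the maximum over label pairs for multi-labeled documents, and embeds documents with the top eigenvectors of a convex combination of a cosine text kernel and the hierarchy graph. The names `class_similarity`, `hierarchy_graph`, `embed`, the weight `alpha`, and the toy data are all hypothetical.

```python
import numpy as np

def class_similarity(a, b):
    """Assumed proximity measure: 2 * depth(common prefix) / (depth(a) + depth(b)).
    Deeper (more specific) shared ancestors give a higher score."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return 2.0 * common / (len(a) + len(b))

def hierarchy_graph(doc_labels):
    """Document similarity graph W from category paths.
    Multi-labeled documents use the best-matching pair of labels."""
    n = len(doc_labels)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            W[i, j] = max(class_similarity(a, b)
                          for a in doc_labels[i] for b in doc_labels[j])
    return W

def embed(X, W, k=2, alpha=0.5):
    """Illustrative embedding only: top-k eigenvectors of a mix of the
    cosine text kernel and the hierarchy graph (alpha weights the graph)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    K = Xn @ Xn.T                      # text similarity
    M = (1.0 - alpha) * K + alpha * W  # hierarchy-aware similarity
    vals, vecs = np.linalg.eigh(M)     # M is symmetric
    top = np.argsort(vals)[::-1][:k]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

# Toy data (hypothetical): category paths from the root, and term counts.
doc_labels = [
    [("sports", "soccer")],
    [("sports", "tennis")],
    [("politics", "elections")],
    [("sports", "soccer"), ("politics", "elections")],  # multi-labeled
]
X = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [1.0, 0.0, 1.0, 1.0],
])
W = hierarchy_graph(doc_labels)
Y = embed(X, W, k=2, alpha=0.5)  # one 2-dimensional vector per document
```

With `alpha = 0` this reduces to a plain spectral embedding of the text kernel, so `alpha` controls how strongly the taxonomy shapes the resulting space.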

Yi Huang, Kai Yu, Matthias Schubert, Shipeng Yu, Volker Tresp, Hans-Peter Kriegel

Data Mining | Hierarchical Taxonomy | ICDM 2005 | Latent Semantic Indexing | Textual Documents |

» A Framework for Understanding Latent Semantic Indexing (LSI) Performance

» Automatic 3-Language Cross-Language Information Retrieval with Latent Semantic Indexing

» Incorporating Latent Semantic Indexing into Spectral Graph Transducer for Text Classificat...

» Supervised Latent Semantic Indexing Using Adaptive Sprinkling

» Understanding Latent Semantic Indexing: A Topological Structure Analysis Using Q-Analysis Me...

» On the Equivalence between Nonnegative Matrix Factorization and Probabilistic Latent Seman...

» Graph-Based Multilevel Dimensionality Reduction with Applications to Eigenfaces and Latent ...

» Understanding and Enhancing the Folding-In Method in Latent Semantic Indexing

Post Info

Added	24 Jun 2010
Updated	24 Jun 2010
Type	Conference
Year	2005
Where	ICDM
Authors	Yi Huang, Kai Yu, Matthias Schubert, Shipeng Yu, Volker Tresp, Hans-Peter Kriegel
