EMNLP
2009


Multilingual Spectral Clustering Using Document Similarity Propagation

We present a novel approach to multilingual document clustering that uses only comparable corpora to achieve cross-lingual semantic interoperability. The method models a document collection as a weighted graph, and supervisory information is given as sets of must-link constraints between documents in different languages. Recursive k-nearest neighbor similarity propagation is used to exploit this prior knowledge and merge the two language spaces. A spectral method is then applied to find the best cuts of the graph. Experimental results show that, with limited supervisory information, our method achieves promising clustering results. Furthermore, since the method does not need any language-dependent information, our algorithm can be applied to languages with different writing systems.

Dani Yogatama, Kumiko Tanaka-Ishii
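
The pipeline described in the abstract (weighted graph, cross-language must-link constraints, k-nearest neighbor similarity propagation, spectral cut) can be illustrated with a small sketch. This is not the authors' implementation: the propagation rule, the `decay` factor, the function names, and the toy similarity values below are all assumptions made for illustration, and the two-way cut uses the sign of the Fiedler vector of the normalized Laplacian as a stand-in for whatever spectral method the paper actually applies.

```python
import numpy as np

def knn(S, i, k):
    """Indices of the k most similar documents to i (positive similarity only)."""
    order = np.argsort(-S[i])
    return [int(j) for j in order[:k] if S[i, j] > 0]

def propagate(W, must_links, k=1, rounds=2, decay=0.5):
    """Hypothetical propagation rule (an assumption, not the paper's formula):
    a must-link (i, j) with strength s gives each neighbor pair (a, b), where
    a is a kNN of i and b is a kNN of j, the similarity
    s * decay * sim(i, a) * sim(j, b); newly linked pairs are expanded again
    in the next round."""
    base = W.astype(float).copy()   # original similarities, used for kNN lookup
    out = base.copy()
    frontier = [(i, j, 1.0) for i, j in must_links]
    for i, j, s in frontier:        # must-linked pairs get full similarity
        out[i, j] = out[j, i] = max(out[i, j], s)
    for _ in range(rounds):
        nxt = []
        for i, j, s in frontier:
            for a in knn(base, i, k):
                for b in knn(base, j, k):
                    v = s * decay * base[i, a] * base[j, b]
                    if v > out[a, b]:
                        out[a, b] = out[b, a] = v
                        nxt.append((a, b, v))
        frontier = nxt
    return out

def spectral_bipartition(W):
    """Two-way cut from the sign of the second-smallest eigenvector of the
    symmetric normalized Laplacian (assumes every node has positive degree)."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)     # eigenvalues returned in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)

# Toy data (made up): documents 0-3 in language A, 4-7 in language B,
# two topics per language, no cross-language similarity to start with.
langs = np.array([0, 0, 0, 0, 1, 1, 1, 1])
topics = np.array([0, 0, 1, 1, 0, 0, 1, 1])
same_lang = langs[:, None] == langs[None, :]
same_topic = topics[:, None] == topics[None, :]
W0 = np.where(same_lang, np.where(same_topic, 0.9, 0.1), 0.0)
np.fill_diagonal(W0, 0.0)

# Supervision: one must-link per topic between the two languages.
W = propagate(W0, must_links=[(0, 4), (2, 6)])
labels = spectral_bipartition(W)
print(labels)
```

In this toy setting the two must-links are enough for the cut to group documents by topic across both languages rather than by language. Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful.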

Achieves Promising Clustering | EMNLP 2009 | K-nearest Neighbor Similarity | Natural Language Processing | Supervisory Information

Related Content

» Multilingual document clusters discovery

» Clustering and Visualization in a Multilingual Multidocument Summarization System

» Exploiting multilingual nomenclatures and languageindependent text features as an interlin...

» Constrained spectral clustering through affinity propagation

» Spectral Clustering for a Large Data Set by Reducing the Similarity Matrix Size

» Comparison of Cluster Algorithms for the Analysis of Text Data Using Kolmogorov Complexity

» A LanguageIndependent Approach to Identify the Named Entities in UnderResourced Languages ...

» Navigating multilingual news collections using automatically extracted information

» Parallel Spectral Clustering in Distributed Systems

Post Info

Added	17 Feb 2011
Updated	17 Feb 2011
Type	Journal
Year	2009
Where	EMNLP
Authors	Dani Yogatama, Kumiko Tanaka-Ishii
