
RIAO 2004


Multilingual document clusters discovery



The Cross-Language Information Retrieval community has produced search engines for multilingual corpora and multilingual text categorization systems. In this paper, we focus on the multilingual cluster discovery problem, whose aim is to extract topic-related multilingual document clusters from a multilingual document collection in an unsupervised way. Our approach is based on a linguistic analysis of the documents that identifies relevant features for a vector representation of the documents, each language being associated with a different vector space. We propose a cross-lingual similarity measure for the documents, using bilingual dictionaries. A Shared Nearest Neighbor clustering algorithm is then used to build the clusters. We present an evaluation framework for this task, analyze and discuss the results we obtained, and propose directions for future work.

Benoît Mathieu, Romaric Besançon, Christian Fluhr
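
The pipeline outlined in the abstract (per-language term vectors, a dictionary-based cross-lingual similarity, then Shared Nearest Neighbor clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy French-English dictionary `FR_EN`, the sample documents `DOCS`, the even split of a term's weight across its translations, and the Jarvis-Patrick-style linking rule (mutual neighbors sharing at least `min_shared` entries) are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy bilingual dictionary (French -> English), illustration only.
FR_EN = {
    "élection": ["election", "vote"],
    "président": ["president"],
    "match": ["match", "game"],
    "football": ["football", "soccer"],
}

# Hypothetical documents: (language, term-weight vector in that language's space).
DOCS = [
    ("en", {"election": 1.0, "president": 1.0}),
    ("fr", {"élection": 1.0, "président": 1.0}),
    ("en", {"football": 1.0, "match": 1.0}),
    ("fr", {"football": 1.0, "match": 1.0}),
]

def translate_vector(vec, dictionary):
    """Map a vector into the target language's space, splitting each term's
    weight evenly across its translations (an assumed weighting scheme)."""
    out = Counter()
    for term, weight in vec.items():
        translations = dictionary.get(term, [])
        for t in translations:
            out[t] += weight / len(translations)
    return out

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def similarity(doc_a, doc_b):
    """Plain cosine within a language; across languages, translate the
    French vector into the English space first."""
    (lang_a, vec_a), (lang_b, vec_b) = doc_a, doc_b
    if lang_a != lang_b:
        if lang_a == "fr":
            vec_a = translate_vector(vec_a, FR_EN)
        else:
            vec_b = translate_vector(vec_b, FR_EN)
    return cosine(vec_a, vec_b)

def snn_clusters(docs, k=2, min_shared=2):
    """Jarvis-Patrick-style SNN clustering: link two documents when each is in
    the other's k-nearest-neighbor list (which includes the document itself)
    and the two lists share at least min_shared entries."""
    n = len(docs)
    knn = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (-similarity(docs[i], docs[j]), j))
        knn.append({i, *others[:k - 1]})

    parent = list(range(n))  # union-find over linked documents

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            mutual = j in knn[i] and i in knn[j]
            if mutual and len(knn[i] & knn[j]) >= min_shared:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

On the toy data, `snn_clusters(DOCS)` returns `[[0, 1], [2, 3]]`: the English and French election documents end up in one cluster and the two football documents in the other. Real collections would need the linguistic analysis and feature weighting the abstract mentions, which this sketch replaces with hand-written vectors.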


Related Content

» Multilingual Adaptive Search for Digital Libraries

» Exploiting multilingual nomenclatures and language-independent text features as an interlin...

» Multilingual Document Clustering Using Wikipedia as External Knowledge

» Clustering and Visualization in a Multilingual Multidocument Summarization System

» An Intelligent Multilingual Information Browsing and Retrieval System Using Information Ex...

» A Latent Semantic Indexing-based approach to multilingual document clustering

» A Language-Independent Approach to Identify the Named Entities in Under-Resourced Languages ...

» Multilingual Spectral Clustering Using Document Similarity Propagation

» Navigating multilingual news collections using automatically extracted information

Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2004
Where	RIAO
Authors	Benoît Mathieu, Romaric Besançon, Christian Fluhr
