Using KCCA for Japanese-English cross-language information retrieval and document classification

Download eprints.pascal-network.org

Kernel Canonical Correlation Analysis (KCCA) is a method for finding correlated linear relationships between two variables in a kernel-defined feature space. We study a machine learning algorithm based on KCCA for cross-language information retrieval and apply it to Japanese-English cross-language information retrieval. The results are encouraging and significantly better than those obtained by other state-of-the-art methods. Computational complexity is an important issue when applying KCCA to large datasets, as in information retrieval, and we experimentally evaluate several methods for alleviating this problem. We also investigate cross-language document classification using KCCA as well as other methods. Our results show that it is feasible to use a classifier learned in one language to classify documents in other languages.
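
To make the idea of correlating two views in a kernel-defined feature space concrete, here is a minimal NumPy sketch of one common regularized KCCA formulation. It is an illustration under stated assumptions, not necessarily the algorithm studied in the paper: the function names `center_kernel` and `kcca`, the regularization parameter `kappa`, the linear kernels, and the synthetic two-view data (standing in for paired Japanese and English document vectors) are all hypothetical choices made for this example.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix, i.e. subtract the feature-space mean."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kcca(Kx, Ky, kappa=1.0, n_components=2):
    """Regularized KCCA sketch (assumed formulation, not the paper's code).

    Solves (Kx + kappa*I)^-1 Ky (Ky + kappa*I)^-1 Kx alpha = rho^2 alpha
    and recovers beta = (Ky + kappa*I)^-1 Kx alpha / rho.
    """
    n = Kx.shape[0]
    I = np.eye(n)
    Kx, Ky = center_kernel(Kx), center_kernel(Ky)
    M = np.linalg.solve(Kx + kappa * I,
                        Ky @ np.linalg.solve(Ky + kappa * I, Kx))
    vals, vecs = np.linalg.eig(M)
    # Keep the leading eigenpairs; the eigenvalues estimate rho^2.
    order = np.argsort(-vals.real)[:n_components]
    rho = np.sqrt(np.clip(vals.real[order], 1e-12, 1.0))
    alpha = vecs[:, order].real
    beta = np.linalg.solve(Ky + kappa * I, Kx @ alpha) / rho
    return alpha, beta, rho

# Toy paired data: two "views" generated from a shared 2-d latent signal.
rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 2))
X = Z @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(50, 5))
Y = Z @ rng.normal(size=(2, 4)) + 0.01 * rng.normal(size=(50, 4))

Kx, Ky = X @ X.T, Y @ Y.T  # linear kernels, chosen only for simplicity
alpha, beta, rho = kcca(Kx, Ky, kappa=1.0, n_components=2)

# Project the training items from each view onto the first component.
px = center_kernel(Kx) @ alpha[:, 0]
py = center_kernel(Ky) @ beta[:, 0]
corr = np.corrcoef(px, py)[0, 1]
print("estimated canonical correlations:", rho)
print("correlation of first projections:", corr)
```

In a retrieval setting, a query in one language and the documents in the other would be projected into this shared space and compared there. Note that the eigenproblem above involves dense n-by-n kernel matrices, which illustrates why computational cost becomes a concern for large document collections.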

Yaoyong Li, John Shawe-Taylor

Cross-language Information Retrieval | JIIS 2006 | KCCA | Kernel Canonical Correlation

» GeoCLEF 2007 The CLEF 2007 CrossLanguage Geographic Information Retrieval Track Overview

» Improving Retrieval Effectiveness by Reranking Documents Based on Controlled Vocabulary

» Prior Art Search Using International Patent Classification Codes and AllClaimsQueries

Post Info

Added	13 Dec 2010
Updated	13 Dec 2010
Type	Journal
Year	2006
Where	JIIS
Authors	Yaoyong Li, John Shawe-Taylor

Sciweavers
