SIGIR
2008
ACM

Generating diverse katakana variants based on phonemic mapping

In Japanese, it is quite common for the same word to be written in several different ways. This is especially true for katakana words, which are typically used for transliterating foreign languages. This ambiguity becomes critical for automatic processing such as information retrieval (IR). To tackle this problem, we propose a simple but effective approach to generating katakana variants by considering the phonemic representation of a given word in its original language. The proposed approach is evaluated through an assessment of the variants it generates. The impact of the generated variants on IR is also studied in comparison to an existing approach that uses katakana rewriting rules.

Categories and Subject Descriptors H.3.3 [Information storage and retrieval]: Information Search and Retrieval--Query formulation; I.2.7 [Artificial intelligence]: Natural Language Processing--Language models
General Terms Algorithm, Experimentation, Languages
Keywords Information Retrieval, Katakana Variants...
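
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of generating katakana variants from a phonemic representation, not the authors' method. It assumes a hypothetical toy table, `PHONEME_TO_KATAKANA`, in which each phonemic unit has one or more plausible katakana renderings, and it enumerates every combination. The unit inventory, the table contents, and the function name `generate_variants` are all illustrative assumptions.

```python
from itertools import product

# Hypothetical toy mapping (an assumption for illustration only):
# each phonemic unit maps to one or more katakana renderings.
PHONEME_TO_KATAKANA = {
    "va": ["バ", "ヴァ"],  # /va/ is commonly written either way
    "i": ["イ"],
    "o": ["オ"],
    "ri": ["リ"],
    "N": ["ン"],
}


def generate_variants(phonemes):
    """Return every katakana string obtainable by picking one rendering per unit."""
    # Look up the candidate renderings for each phonemic unit, in order.
    candidates = [PHONEME_TO_KATAKANA[p] for p in phonemes]
    # The Cartesian product enumerates all combinations of the candidates.
    return ["".join(parts) for parts in product(*candidates)]


# Toy phonemic units for "violin" (hand-segmented, an assumption).
variants = generate_variants(["va", "i", "o", "ri", "N"])
print(variants)  # ['バイオリン', 'ヴァイオリン']
```

In an IR setting, variants produced this way could be added to a query as alternative terms. How the paper filters or ranks candidate variants is not stated in the abstract, so this sketch leaves that step out.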
Kazuhiro Seki, Hiroyuki Hattori, Kuniaki Uehara
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where SIGIR
Authors Kazuhiro Seki, Hiroyuki Hattori, Kuniaki Uehara