Sciweavers

INTERSPEECH
2010

Wiktionary as a source for automatic pronunciation extraction

13 years 5 months ago
Wiktionary as a source for automatic pronunciation extraction
In this paper, we analyze whether dictionaries from the World Wide Web which contain phonetic notations, may support the rapid creation of pronunciation dictionaries within the speech recognition and speech synthesis system building process. As a representative dictionary, we selected Wiktionary [1] since it is at hand in multiple languages and, in addition to the definitions of the words, many phonetic notations in terms of the International Phonetic Alphabet (IPA) are available. Given word lists in four languages English, French, German, and Spanish, we calculated the percentage of words with phonetic notations in Wiktionary. Furthermore, two quality checks were performed: First, we compared pronunciations from Wiktionary to pronunciations from dictionaries based on the GlobalPhone project, which had been created in a rule-based fashion and were manually cross-checked [2]. Second, we analyzed the impact of Wiktionary pronunciations on automatic speech recognition (ASR) systems. Fren...
Tim Schlippe, Sebastian Ochs, Tanja Schultz
Added 18 May 2011
Updated 18 May 2011
Type Journal
Year 2010
Where INTERSPEECH
Authors Tim Schlippe, Sebastian Ochs, Tanja Schultz
Comments (0)