Vocabulary independent spoken query: a case for subword units



In this work, we describe a subword unit approach for information retrieval of items by voice. An algorithm based on the minimum description length (MDL) principle converts an index written in terms of words into an index written in terms of phonetic subword units. A speech recognition engine that uses a language model and pronunciation dictionary built from such an inventory of subword units is completely independent of the information retrieval task. The recognition engine can remain fixed, making this approach ideal for resource-constrained systems. In addition, we demonstrate that recall at higher out-of-vocabulary (OOV) rates is much better for the subword unit system. On a music lyrics task at 80% OOV, the subword-based recall is 75.2%, compared to 47.4% for a word system. (Interspeech 2010)
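The abstract does not spell out the MDL algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a two-part code (bits to spell out the unit inventory plus bits to encode the index with those units), a greedy longest-match segmenter, and a unigram model over units. The phone symbols, the toy index, and the function names `segment` and `description_length` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def segment(phones, inventory):
    """Greedy longest-match segmentation of a phone sequence into units.

    Assumption: single phones are always allowed as a fallback, so every
    sequence can be segmented even with an empty inventory.
    """
    units, i = [], 0
    max_len = max((len(u) for u in inventory), default=1)
    while i < len(phones):
        for n in range(min(max_len, len(phones) - i), 0, -1):
            cand = tuple(phones[i:i + n])
            if n == 1 or cand in inventory:
                units.append(cand)
                i += n
                break
    return units

def description_length(corpus, inventory, n_phone_types):
    """Two-part code length in bits: inventory cost plus data cost."""
    tokens = [u for phones in corpus for u in segment(phones, inventory)]
    counts = Counter(tokens)
    total = sum(counts.values())
    # Data cost: -log2 p(unit) per token under a unigram estimate.
    data_bits = -sum(c * math.log2(c / total) for c in counts.values())
    # Model cost: spell out each unit actually used, phone by phone.
    bits_per_phone = math.log2(n_phone_types)
    model_bits = sum(len(u) * bits_per_phone for u in counts)
    return model_bits + data_bits

# Hypothetical toy index: three phone strings, each repeated ten times.
corpus = [("l", "ah", "v"), ("l", "ah", "v", "er"), ("g", "l", "ah", "v")] * 10
n_phones = len({p for seq in corpus for p in seq})

phones_only = description_length(corpus, set(), n_phones)
with_unit = description_length(corpus, {("l", "ah", "v")}, n_phones)
print(f"phones only: {phones_only:.1f} bits")
print(f"with unit l-ah-v: {with_unit:.1f} bits")
```

On this toy index, adding the recurring unit lowers the total description length, which is the kind of criterion an MDL-based search could use to decide whether a candidate subword unit is worth keeping.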

Evandro B. Gouvêa, Tony Ezzat


Electric Research Laboratories | INTERSPEECH 2010 | Mitsubishi Electric Research | Signal Processing | Subword Unit |


Post Info

Added	18 May 2011
Updated	18 May 2011
Type	Journal
Year	2010
Where	INTERSPEECH
Authors	Evandro B. Gouvêa, Tony Ezzat
