A comparison of grapheme and phoneme-based units for Spanish spoken term detection


The ever-increasing volume of audio data available online through the World Wide Web means that automatic methods for indexing and search are becoming essential. Hidden Markov model (HMM) keyword spotting and lattice search techniques are the two most common approaches used by such systems. In keyword spotting, models or templates are defined for each search term prior to accessing the speech and used to find matches. Lattice search (referred to as spoken term detection) uses a pre-indexing of speech data in terms of word or sub-word units, which can then quickly be searched for arbitrary terms without referring to the original audio. In both cases, the search term can be modelled in terms of sub-word units, typically phonemes. For in-vocabulary words (i.e. words that appear in the pronunciation dictionary), the letter-to-sound conversion systems are accepted to work well. However, for out-of-vocabulary (OOV) search terms, letter-to-sound conversion must be used to generate a pronunci...
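The pre-indexing idea described in the abstract can be illustrated with a toy sketch. This is not the paper's system: it assumes each utterance has already been decoded into a single sequence of grapheme units (a real spoken term detection system would search a lattice with timing and confidence scores), and the names `build_index`, `search`, and the example utterances are hypothetical.

```python
from collections import defaultdict


def build_index(utterances):
    """Map each sub-word unit to the (utterance id, position) pairs where it occurs."""
    index = defaultdict(list)
    for utt_id, units in utterances.items():
        for pos, unit in enumerate(units):
            index[unit].append((utt_id, pos))
    return index


def search(index, utterances, term_units):
    """Return (utterance id, start position) for every exact occurrence of term_units."""
    if not term_units:
        return []
    hits = []
    # Use the index to jump straight to candidate start positions,
    # then verify the rest of the term against the stored unit sequence.
    for utt_id, pos in index.get(term_units[0], []):
        if utterances[utt_id][pos:pos + len(term_units)] == term_units:
            hits.append((utt_id, pos))
    return hits


# Hypothetical 1-best grapheme decodings (assumption: no word boundaries).
utterances = {
    "utt1": list("lacasaroja"),
    "utt2": list("micasa"),
}
index = build_index(utterances)

# With grapheme units the search term needs no pronunciation dictionary:
# its spelling is already the unit sequence, which also covers OOV terms.
hits = search(index, utterances, list("casa"))
print(hits)
```

Because the term is matched against the index rather than the audio, any new term can be searched without re-processing the speech, which is the property the abstract attributes to lattice search.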

Javier Tejedor, Dong Wang, Joe Frankel, Simon King, José Colás


Search Term | SPEECH 2008 | Spoken Term Detection | Sub-word Units |


Post Info

Added	15 Dec 2010
Updated	15 Dec 2010
Type	Journal
Year	2008
Where	SPEECH
Authors	Javier Tejedor, Dong Wang, Joe Frankel, Simon King, José Colás
