Voice search of structured media data

This paper addresses the problem of using unstructured queries to search a structured database in voice search applications. Incorporating structural information from music metadata reduced the end-to-end search error by 15% on text queries and by up to 11% on spoken queries. Building on that, an HMM sequential rescoring model reduced the error rate by 28% on text queries and by up to 23% on spoken queries compared to the baseline system. Furthermore, a phonetic similarity model was introduced to compensate for speech recognition errors; it improved end-to-end search accuracy consistently across different levels of speech recognition accuracy.
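To make the idea of phonetic similarity concrete, here is a minimal sketch, not the model from the paper. It assumes that a misrecognized query and each catalog entry are available as phoneme sequences, and it scores them with a length-normalized edit distance. The function names, the tiny catalog, and the ARPAbet-style phoneme strings are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(
                prev[j] + 1,         # drop a phoneme from a
                curr[j - 1] + 1,     # insert a phoneme from b
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def phonetic_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Similarity in [0, 1]; 1.0 means identical phoneme sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def best_match(query: Sequence[str], catalog: Dict[str, List[str]]) -> str:
    """Return the catalog entry whose pronunciation is closest to the query."""
    return max(catalog, key=lambda name: phonetic_similarity(query, catalog[name]))


# Hypothetical catalog: artist name -> illustrative phoneme sequence.
CATALOG: Dict[str, List[str]] = {
    "Queen": ["K", "W", "IY", "N"],
    "Cream": ["K", "R", "IY", "M"],
    "Green Day": ["G", "R", "IY", "N", "D", "EY"],
}

# Hypothetical recognizer output: "queens" heard instead of "queen".
QUERY: List[str] = ["K", "W", "IY", "N", "Z"]

print(best_match(QUERY, CATALOG))
```

In this sketch a single extra phoneme still leaves the intended entry as the closest match, which is the kind of robustness to recognition errors the abstract refers to. A real system would likely weight substitutions by acoustic confusability rather than treating all errors equally; that refinement is outside this sketch.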

Young-In Song, Ye-Yi Wang, Yun-Cheng Ju, Mike Seltzer, Ivan Tashev, Alex Acero

ICASSP 2009 | Signal Processing | Spoken Queries | Text Queries | Unstructured Queries |

» Avaaj Otalo a field study of an interactive voice forum for small farmers in rural India

» MediaPick tangible semantic media retrieval system

» An Adaptive Index Structure for High-Dimensional Similarity Search

» MedSMan a streaming data management system over live multimedia

» Learning nonparametric models of pronunciation

» Extracting community structure through relational hypergraphs

» Search extension transforms Wiki into a relational system A case for flavonoid metabolite ...

» MetaFac community discovery via relational hypergraph factorization

Post Info

Added	21 May 2010
Updated	21 May 2010
Type	Conference
Year	2009
Where	ICASSP
Authors	Young-In Song, Ye-Yi Wang, Yun-Cheng Ju, Mike Seltzer, Ivan Tashev, Alex Acero
