Cache-based language model adaptation using visual attention for ASR in meeting scenarios


In a typical group meeting involving discussion and collaboration, people look at one another, at shared information resources such as presentation material, and also at nothing in particular. In this work we investigate whether the knowledge of what a person is looking at may improve the performance of Automatic Speech Recognition (ASR). A framework for cache Language Model (LM) adaptation is proposed with the cache based on a person's Visual Attention (VA) sequence. The framework attempts to measure the appropriateness of adaptation from VA sequence characteristics. Evaluation on the AMI Meeting corpus data shows reduced LM perplexity. This work demonstrates the potential for cache-based LM adaptation using VA information in large vocabulary ASR deployed in meeting scenarios.

Categories and Subject Descriptors: I.2.7 [Artificial Intelligence]: Natural Language Processing - language models

General Terms: Algorithms, Experimentation, Measurement, Performance
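
As a rough illustration of the cache LM idea mentioned in the abstract, the sketch below interpolates a fixed background unigram model with a small cache of recently observed words and compares perplexity before and after the cache is filled. It is not the authors' framework: the class name `CacheUnigramLM`, the interpolation weight `lam`, the cache size, the probability floor, and the idea of filling the cache with words from whatever the speaker is looking at (for example slide text) are all assumptions made for illustration.

```python
import math
from collections import deque


class CacheUnigramLM:
    """Toy unigram LM linearly interpolated with a word cache (illustrative only)."""

    def __init__(self, background, cache_size=50, lam=0.2, floor=1e-6):
        self.background = background           # dict: word -> background probability
        self.cache = deque(maxlen=cache_size)  # most recent words; oldest are dropped
        self.lam = lam                         # hypothetical weight given to the cache
        self.floor = floor                     # crude floor for words missing from background

    def observe(self, words):
        """Add words to the cache, e.g. text on the object currently being looked at."""
        self.cache.extend(words)

    def prob(self, word):
        p_bg = self.background.get(word, self.floor)
        if not self.cache:
            # No cache evidence yet: fall back to the background model alone.
            return p_bg
        p_cache = self.cache.count(word) / len(self.cache)
        return (1.0 - self.lam) * p_bg + self.lam * p_cache


def perplexity(lm, words):
    """Perplexity of a word sequence under the model."""
    log_sum = sum(math.log(lm.prob(w)) for w in words)
    return math.exp(-log_sum / len(words))


# Toy numbers, not taken from the AMI corpus.
background = {"the": 0.5, "remote": 0.01, "design": 0.01}
utterance = ["the", "remote", "design"]

lm = CacheUnigramLM(background)
before = perplexity(lm, utterance)
lm.observe(["remote", "control", "design", "remote"])  # hypothetical slide text
after = perplexity(lm, utterance)
print(before > after)
```

In this toy setting the cache raises the probability of words that also appear in the attended material, which lowers perplexity on the utterance. How the cache should be populated from a VA sequence, and when adaptation is appropriate, is exactly what the paper's framework addresses and is not modelled here.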

Neil Cooke, Martin J. Russell

Cache-based LM Adaptation | ICMI 2009 | Meeting Involving Discussion | VA Sequence Characteristics

Post Info

Added	25 Jul 2010
Updated	25 Jul 2010
Type	Conference
Year	2009
Where	ICMI
Authors	Neil Cooke, Martin J. Russell
