Automatic lecture transcription by exploiting presentation slide information for language model adaptation



This paper addresses language model adaptation for automatic lecture transcription by fully exploiting the presentation slides used in the lecture. Because the text in presentation slides is small in size and fragmentary in content, a robust adaptation scheme is pursued by focusing on keyword and topic information. Several methods are investigated and combined. First, global topic adaptation is conducted based on PLSA (Probabilistic Latent Semantic Analysis) using keywords appearing in all slides. Web text is also retrieved to augment the relevant text. Then, the local preference of keywords is reflected with a cache model by referring to the slide used during each utterance. Experimental evaluations on real lectures show that the proposed method, combining the global and local slide information, achieves a significant improvement in recognition accuracy, especially in the detection rate of content keywords.
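The local cache step described above can be pictured as a linear interpolation between a baseline word probability and a distribution estimated from the words on the currently displayed slide. The following is a minimal sketch under that assumption, not the authors' implementation: the function name `cache_adapted_prob`, the interpolation weight `lam`, and the simplification to unigram probabilities are all hypothetical choices made for illustration.

```python
from collections import Counter


def cache_adapted_prob(word, base_probs, slide_words, lam=0.2):
    """Interpolate a baseline unigram probability with a slide-based cache.

    word        -- the word whose probability is wanted
    base_probs  -- dict mapping words to baseline probabilities (assumed given)
    slide_words -- list of words on the slide shown during the utterance
    lam         -- hypothetical interpolation weight for the cache component
    """
    p_base = base_probs.get(word, 0.0)
    cache = Counter(slide_words)
    total = sum(cache.values())
    if total == 0:
        # No slide text available: fall back to the baseline model.
        return p_base
    # Relative frequency of the word on the current slide.
    p_cache = cache[word] / total
    return (1.0 - lam) * p_base + lam * p_cache


if __name__ == "__main__":
    base = {"the": 0.05, "model": 0.01, "plsa": 0.001}
    slide = ["plsa", "topic", "model", "plsa"]
    for w in ("plsa", "the"):
        print(w, cache_adapted_prob(w, base, slide))
```

In this sketch a keyword that appears on the current slide receives a boosted probability, while words absent from the slide are scaled down by the factor `1 - lam`. A real system would apply the same idea to n-gram probabilities and tune the weight on held-out data.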

Tatsuya Kawahara, Yusuke Nemoto, Yuya Akita
