Information Retrieval Test Collection for Searching Spontaneous Czech Speech



Abstract. This paper describes the design of the first large-scale IR test collection built for the Czech language. The creation of this collection also happens to be very challenging, as it is based on a continuous text stream from automatic transcription of spontaneous speech and thus lacks clearly defined document boundaries. All aspects of the collection building are presented, together with some general findings of initial experiments.

Pavel Ircing, Pavel Pecina, Douglas W. Oard, Jianqiang Wang, Ryen W. White, Jan Hoidekr


Continuous Text Stream | Signal Processing | Test Collection | TSD 2007 | First Large-scale IR


» Multimedia with a speech track: searching spontaneous conversational speech

» Spoken content retrieval: Searching spontaneous conversational speech

» Combining Multiple Models for Speech Information Retrieval

» Test Collections for Spoken Document Retrieval from Lecture Audio Data

» CLEF2006 CLSR at Maryland English and Czech

» Experiments for the Cross Language Speech Retrieval Task at CLEF 2006

» Combining LVCSR and vocabulary-independent ranked utterance retrieval for robust speech sea...

» CLEF 2007 Ad Hoc Track Overview

Post Info

Added	09 Jun 2010
Updated	09 Jun 2010
Type	Conference
Year	2007
Where	TSD
Authors	Pavel Ircing, Pavel Pecina, Douglas W. Oard, Jianqiang Wang, Ryen W. White, Jan Hoidekr
