
CLEF
2007
Springer

A Dirichlet-Smoothed Bigram Model for Retrieving Spontaneous Speech

We present two simple but effective smoothing techniques for the standard language model (LM) approach to information retrieval [12]. First, we extend the unigram Dirichlet smoothing technique popular in IR [17] to bigram modeling [16]. Second, we propose a method of collection expansion for more robust estimation of the LM prior, particularly intended for sparse collections. Retrieval experiments on the MALACH archive [9] of automatically transcribed and manually summarized spontaneous speech interviews demonstrate strong overall system performance and the relative contribution of our extensions.
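As a rough illustration of the smoothing idea named in the abstract, the sketch below implements the widely used unigram Dirichlet estimate, P(w|D) = (c(w;D) + mu * P(w|C)) / (|D| + mu), together with one plausible bigram analogue that backs off to the smoothed unigram. The bigram form, the function names, and the default mu are assumptions made for illustration; the abstract does not give the authors' exact formulation, and collection expansion is not shown.

```python
from collections import Counter


def dirichlet_unigram(word, doc_tokens, coll_tokens, mu=1000.0):
    """Unigram Dirichlet smoothing: (c(w;D) + mu * P(w|C)) / (|D| + mu).

    Assumes coll_tokens is non-empty.
    """
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(coll_tokens)
    # Maximum-likelihood collection model used as the prior.
    p_coll = coll_counts[word] / len(coll_tokens)
    return (doc_counts[word] + mu * p_coll) / (len(doc_tokens) + mu)


def dirichlet_bigram(prev, word, doc_tokens, coll_tokens, mu=1000.0):
    """Hypothetical bigram analogue (an assumption, not the paper's formula).

    The document bigram count is smoothed toward the Dirichlet-smoothed
    unigram probability of `word`, using the same pseudo-count mass mu.
    """
    bigram_counts = Counter(zip(doc_tokens, doc_tokens[1:]))
    # Number of bigrams in the document whose first word is `prev`.
    history_count = Counter(doc_tokens[:-1])[prev]
    prior = dirichlet_unigram(word, doc_tokens, coll_tokens, mu)
    return (bigram_counts[(prev, word)] + mu * prior) / (history_count + mu)


# Toy data, invented purely for illustration.
collection = "a b a c a b".split()
document = "a b a b".split()
vocab = sorted(set(collection))

unigram_total = sum(dirichlet_unigram(w, document, collection) for w in vocab)
bigram_total = sum(dirichlet_bigram("a", w, document, collection) for w in vocab)
```

When every document token also occurs in the collection, both estimates sum to one over the collection vocabulary, so unseen words and unseen bigrams receive nonzero probability without breaking normalization.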
Matthew Lease, Eugene Charniak
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where CLEF