

Sentence segmentation and punctuation recovery for spoken language translation

Sentence segmentation and punctuation recovery are critical components for effective spoken language translation (SLT). In this paper, we describe our recent work on sentence segmentation and punctuation recovery for three language pairs: English-to-Spanish, Arabic-to-English, and Chinese-to-English. We show that the proposed approach works equally well across these very different language pairs. Furthermore, we introduce two features, computed from the translation beam-search lattice, that indicate whether phrasal and target language model context is jeopardized when segmenting at a given word boundary. These features enable us to introduce short intra-sentence segments without degrading translation performance.
Added 30 May 2010
Updated 30 May 2010
Type Conference
Year 2008
Authors Matthias Paulik, Sharath Rao, Ian R. Lane, Stephan Vogel, Tanja Schultz