We compare machine learning approaches to sentence length reduction with a method that relies on hand-crafted deletion rules. The application is the automatic generation of subtitles for deaf and hearing-impaired people. We describe how we built the resources needed for this task: a parallel corpus of examples drawn from news broadcasts of the Flemish broadcasting corporation VRT, and a Dutch shallow parser based on the material of the Spoken Dutch Corpus (CGN). We evaluate the sentence simplifiers and discuss their performance.
Erik F. Tjong Kim Sang, Walter Daelemans, Anja H&o