This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization—a “parse-and-trim” approach and a...
David M. Zajic, Bonnie J. Dorr, Jimmy J. Lin, Rich...
Transaction services offered by public authorities vary from simple forms with few fields to multi-form compound documents with hundreds of input areas. In the latter case, field p...
Costas Vassilakis, George Lepouras, Stathis Rouvas...
Many diagrams contain compound objects composed of parts. We propose a recognition framework that learns parts in an unsupervised way, and requires training labels only for compou...
Mixed Raster Content (MRC) document compression is a well-documented standard. Its efficiency for representing sharp text and graphics over a background has been extensively p...
This paper presents two corpora produced within the RPM2 project: a multi-document summarization corpus and a sentence compression corpus. Both corpora are in French. The first on...