Mixed Raster Content (MRC) document compression is a well-documented standard. Its efficiency for representing sharp text and graphics over a background has been extensively p...
This paper presents two corpora produced within the RPM2 project: a multi-document summarization corpus and a sentence compression corpus. Both corpora are in French. The first on...
We present a domain-independent unsupervised topic segmentation approach based on hybrid document indexing. Lexical chains have been successfully employed to evaluate lexical cohe...