Annotating training data for event extraction is tedious and labor-intensive. Most current event extraction tasks rely on hundreds of annotated documents, but this is often not en...
This paper presents a new representation and evaluation procedure for page segmentation algorithms and analyzes six widely used layout analysis algorithms using the procedure. The ...
In order to achieve better scalability and reduce latency in handling user requests, many Web applications make extensive use of data replication through caches and Content Delive...
Bogdan C. Popescu, Maarten van Steen, Bruno Crispo...
This paper presents a model for summarizing multiple untranscribed spoken documents. Without assuming the availability of transcripts, the model modifies a recently proposed unsup...
Shrack is a peer-to-peer framework for document sharing and tracking. Shrack peers provide support to researchers in forming direct collaboration in autonomous sharing and keeping...
Hathai Tanta-ngai, Vlado Keselj, Evangelos E. Mili...