This paper presents a flexible and effective examplebased approach for labeling title pages which can be used for automated extraction of bibliographic data. The labels of intere...
Joost van Beusekom, Daniel Keysers, Faisal Shafait...
Performance evaluation for document image analysis and understanding is a recurring problem. Many groundtruthed document image databases are now used to evaluate general algorithm...
: Variable data documents can be considered as functions of their bindings to values, and this function could be arbitrarily complex to build strongly-customised but high-value doc...
In this paper we propose a technique for evaluating similarity of XML Schema fragments. Firstly, we define classes of structurally and semantically equivalent XSD constructs. Then...
As the world wide web transforms from a vehicle of information dissemination and e-commerce transactions into a writable nexus of human collaboration, the Web 2.0 technologies at ...
Malan is a MApping LANguage that allows the generation of transformation programs by specifying a schema mapping between a source and target data schema. By working at the schema ...
Different dialects of XML have emerged as ubiquitous document exchange formats. For effective collaboration based on such documents, the capability to propagate edit operations pe...
Digital Libraries are complex information systems that involve rich sets of digital objects and their respective metadata, along with multiple organizational structures and servic...
In this paper, we present an analysis based on linguistic and typographic features that allows for the identification of titles in web documents. We focus in particular on procedu...