Sciweavers

DOCENG
2007
ACM
14 years 15 days ago
XML version detection
The problem of version detection is critical in many important application scenarios, including software clone identification, Web page ranking, plagiarism detection, and peer-to-...
Deise de Brum Saccol, Nina Edelweiss, Renata de Ma...
DOCENG
2007
ACM
Mapping paradigm for document transformation
Since the advent of XML, the ability to transform documents using transformation languages such as XSLT has become an important challenge. However, writing a transformation script...
Arnaud Blouin, Olivier Beaudoux
DOCENG
2007
ACM
SALT: a semantic approach for generating document representations
The structure of a document has an important influence on the perception of its content. Considering scientific publications, we can affirm that by making use of the ordinary line...
Tudor Groza, Alexander Schutz, Siegfried Handschuh
DOCENG
2007
ACM
Logical document conversion: combining functional and formal knowledge
We present in this paper a method for document layout analysis based on identifying the function of document elements (what they do). This approach is orthogonal and complementary...
Hervé Déjean, Jean-Luc Meunier
DOCENG
2007
ACM
Declarative extensions of XML languages
We present a set of XML language extensions that bring notions from functional programming to web authors, extending the power of declarative modelling for the web. Our previous w...
Simon J. Thompson, Peter R. King, Patrick Schmitz
DOCENG
2007
ACM
A model for mapping between printed and digital document instances
The first steps towards bridging the paper-digital divide have been achieved with the development of a range of technologies that allow printed documents to be linked to digital c...
Nadir Weibel, Moira C. Norrie, Beat Signer
DOCENG
2007
ACM
Genre driven multimedia document production by means of incremental transformation
Genre, like layout, is an important factor in effective communication, and automated tools which assist in genre compliance are thus of considerable value. Genres are reusable met...
Marc Nanard, Jocelyne Nanard, Peter R. King, Ludov...
DOCENG
2007
ACM
Structure and content analysis for HTML medical articles: a hidden Markov model approach
We describe ongoing research on segmenting and labeling HTML medical journal articles. In contrast to existing approaches in which HTML tags usually serve as strong indicators, we...
Jie Zou, Daniel X. Le, George R. Thoma
DOCENG
2007
ACM
Speculative document evaluation
Optimisation of real-world Variable Data Printing (VDP) documents is a difficult problem because the interdependencies between layout functions may drastically reduce the number o...
Alexander J. Macdonald, David F. Brailsford, Steve...