This paper presents a unified framework for the evaluation of a range of structured document retrieval (SDR) approaches and tasks. The framework is based on a model of tree retrie...
Mir Sadek Ali, Mariano P. Consens, Gabriella Kazai...
Digitizing legacy documents and marking them up with XML is important for many scientific domains. However, creating comprehensive semantic markup of high quality is challenging. ...
We have developed a program called fiwalk which produces detailed XML describing all of the partitions and files on a hard drive or disk image, as well as any extractable metadat...
XML is becoming the standard representation format for metadata. Metadata for multimedia documents, as for instance MPEG-7, require approximate match search functionalities to be s...
Abstract. This paper explores the possibility of using a modified Expectation-Maximization algorithm to estimate parameters for a simple hierarchical generative model for XML retr...