This work explores the application of clustering methods for grouping structurally similar XML documents. Modeling the XML documents as rooted ordered labeled trees, we apply clust...
Theodore Dalamagas, Tao Cheng, Klaas-Jan Winkel, T...
Large XML data files, or XML databases, are now a common way to distribute scientific and bibliographic data, and storing such data efficiently is an important concern. A number o...
Sentence compression is the task of generating a short, grammatical sentence from an original sentence while retaining its important information. The existing methods of only removing the c...
This paper proposes a new method for document transformation using OCR to generate various XML documents from printed documents. The proposed method adopts a hierarchical transfor...
XML is the predominant format for representing structured information inside documents, but it stops at the level of files. This makes it hard to use XML-oriented tools to process...