More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. G...
This paper presents Multilingual Document Clustering (MDC) on comparable corpora. Wikipedia, a structured multilingual knowledge base, has been highly exploited in many monolingual...
Nowadays, structured data such as sales and business forms are stored in data warehouses for decision makers to use. Further, unstructured data such as emails, html texts, images,...
In real-world Digital Libraries, Artificial Intelligence techniques are essential for tackling the automatic document processing task with sufficient flexibility. The great variab...
W3C Recommendations XML Encryption and XML-Digital Signature can be used to protect the confidentiality of and provide assurances about the integrity of XML documents transmitted...