Syllable-based compression achieves sufficiently good results on medium-sized text documents. Since the majority of XML documents are of that size, we suppose that the syllable...
Existing efforts on XML internationalization and localization have focused on the contents of XML documents rather than on the meta presentations such as tags and attributes...
Yijun Yu, Jianguo Lu, Jing-Hao Xue, Yi Zhang, Weiw...
The difficulty with information retrieval for OCR documents lies in the fact that OCR documents contain a significant number of erroneous words and unfortunately most informat...
In this paper, we present and compare automatically generated titles for machine-translated documents using several different statistics-based methods. A Na
This paper presents an algorithm designed to segment and classify newspaper documents. A notable feature of this algorithm is the ability to detect lines in the document