This paper proposes a crossbar-like tree structure for use with Dynamic Markov Compression (DMC) to compress Chinese text files. DMC had previously been found...
Byte pair encoding (BPE) is a simple universal text compression scheme. Decompression is very fast and requires little working space. Moreover, it is easy to decompress an ar...
Classification of texts potentially containing a complex and specific terminology requires the use of learning methods that do not rely on extensive feature engineering. In this w...
The innate verbosity of the Extensible Markup Language remains one of its main weaknesses, especially for large XML documents. This problem can be solved with the a...
Przemyslaw Skibinski, Szymon Grabowski, Jakub Swac...