Approximate text search is a basic technique to handle recognized text that contains recognition errors. This paper proposes an approximate string search for recognized text using...
A large fraction of an XML document typically consists of text data. The XPath query language allows text search via the equal, contains, and starts-with predicates. Such predicate...
Diego Arroyuelo, Francisco Claude, Sebastian Manet...
We present a new statistical compression method, which we call Phrase Based Dense Code (PBDC), aimed at compressing large digital libraries. PBDC compresses the text collection to ...
Word-based compression over natural language text has shown to be a good choice to trade compression ratio and speed, obtaining compression ratios close to 30% and very fast decom...