Word-based Huffman coding has widespread use in information retrieval systems. Besides its compressing power, it also enables the implementation of both indexing and searching sch...
Recent advances in on-line data capturing technologies and its widespread deployment in devices like PDAs and notebook PCs is creating large amounts of handwritten data that need ...
Similarity search and similarity join on strings are important for applications such as duplicate detection, error detection, data cleansing, or comparison of biological sequences....
A large amount of handwritten documents exist in image form, as scanned documents. The supporting electronic media allows for better preservation, but to access their content they...
Antonio Clavelli, Luigi P. Cordella, Claudio De St...
XML is fast becoming the standard format to store, exchange and publish over the web, and is getting embedded in applications. Two challenges in handling XML are its size (the XML...
Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini...