We introduce the relative rank differential statistic, a non-parametric approach to document and dialog analysis based on word frequency rank-statistics. We also present a...
With the increasing popularity of digital cameras attached to various handheld devices, many new computational challenges have gained significance. One such problem is extractio...
Ujjwal Bhattacharya, Swapan K. Parui, Srikanta Mon...
Document-centric XML collections contain text-rich documents, marked up with XML tags. The tags add lightweight semantics to the text. Querying such collections calls for a hybrid...
While classic information retrieval methods return whole documents as a result of a query, many information demands would be better satisfied by fine-grained access inside the docu...
This paper presents a new approach to text processing, based on textemes. These are atomic text units generalising the concepts of character and glyph by merging them in a common ...