This paper suggests an alternative solution for the task of spoken document retrieval (SDR). The proposed system runs retrieval on multi-level transcriptions (word and phone) prod...
Shan Jin, Hemant Misra, Thomas Sikora, Joemon M. J...
The Mixed Raster Content (MRC) document compression standard (ITU-T T.44) specifies a multi-layer, multi-resolution representation of a compound document. The model is very efficie...
Repetition of layout structure is prevalent in document images. In document design, such repetition conveys the underlying logical and functional structure of the data. For exampl...
This paper explores correspondence and mixture topic modeling of documents tagged from two different perspectives. There has been ongoing work in topic modeling of documents with...
This paper describes a top-down word image generation model for holistic handwritten word recognition. To generate a word image, it uses likelihoods based, respectively, on a ling...