This paper presents a language identification technique that detects Latin-based languages of imaged documents without OCR. The proposed technique detects languages through the wo...
Document ranking is well known to be a crucial process in information retrieval (IR). It presents retrieved documents in order of their estimated degree of relevance to the query. ...
Over the last two centuries, reading styles have shifted away from the reading of documents from beginning to end and toward the skimming of documents in search of relevant inform...
Optimisation of real-world Variable Data Printing (VDP) documents is a difficult problem because the interdependencies between layout functions may drastically reduce the number o...
Alexander J. Macdonald, David F. Brailsford, Steve...
Text classification using positive and unlabeled data refers to the problem of building a text classifier using positive documents (P) of one class and unlabeled documents (U) of man...