Word segmentation is a critical stage toward word and character recognition, as well as word spotting, and mainly concerns two basic aspects: distance computation and gap classific...
This paper introduces a framework for clarifying and formalizing the duplicate document detection problem. Four distinct models are presented, each with a corresponding algorithm ...
Annotations are an important part of today’s digital libraries and Web information systems as an instrument for interactive knowledge creation. Annotation-based document retrieva...
Commercial, non-profit and public organizations are accumulating huge amounts of electronically available text documents. Although composed of unstructured texts, documents contai...
The QCS information retrieval (IR) system is presented as a tool for querying, clustering, and summarizing document sets. QCS has been developed as a modular development framework...
Daniel M. Dunlavy, John M. Conroy, Dianne P. O'Lea...