Search engine click logs provide an invaluable source of relevance information, but this information is biased because we do not know which documents from the result list the users have...
We investigate a representative case of sudden information need change of Web users. By analyzing search engine query logs, we show that the majority of queries submitted by users...
Document clustering has been used for better document retrieval and text mining. In this paper, we investigate if a biomedical ontology improves biomedical literature clustering ...
The problem addressed in this paper is the automatic extraction of names from a document image. Our approach relies on the combination of two complementary analyses. First, the ima...
This paper reports a document retrieval technique that retrieves machine-printed Latin-based document images through word shape coding. Adopting the idea of image annotation, a wo...
Named entities (e.g., "Kofi Annan", "Coca-Cola", "Second World War") are ubiquitous in web pages and other types of document and often provide a simpl...
Felix Weigel, Klaus U. Schulz, Levin Brunner, Edua...
Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel flocking-based approach for doc...
When a document is fed to a scanner for digitization, either mechanically or by a human operator, it suffers from some degree of skew or tilt. Skew angle detection is an important...
Aradhya V. N. Manjunath, G. Hemantha Kumar, P. Shi...
We address the problems of structuring and annotation of layout-oriented documents. We model the annotation problems as the collective classification on graph-like structures wit...
Mass digitization of document collections with further processing and semantic annotation is an increasing activity among libraries and archives at large for preservation, browsi...