This paper presents an enhanced algorithm for extracting text from degraded document images, based on probabilistic models. The observed document image is considered as a...
Management and retrieval of large volumes of text can be expensive in both space and time. Moreover, the range of document sizes in a large collection such as TREC presents difficu...
Alistair Moffat, Ron Sacks-Davis, Ross Wilkinson, ...
This paper describes the design and use of a synthetic Web proxy workload generator called ProWGen to investigate the sensitivity of Web proxy cache replacement policies to five se...
This paper presents a novel way of examining the accuracy of the evaluation measures commonly used in information retrieval experiments. It validates several of the rules-of-thum...
This paper describes an algorithm for detecting empty nodes in the Penn Treebank (Marcus et al., 1993), finding their antecedents, and assigning them function tags, without access...