A web-portal providing access to over 250.000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...
Results caching is an efficient technique for reducing the query processing load, hence it is commonly used in real search engines. This technique, however, bounds the maximum hit...
Search engine technology plays an important role in Web information retrieval. However, with Internet information explosion, traditional searching techniques cannot provide satisfa...
Baile Shi, Guoyu Hao, Hongtao Xu, Mei Wang, Qi Zha...
XML is a flexible and powerful tool that enables information and security sharing in heterogeneous environments. Scalable technologies are needed to effectively manage the growing...
Rebecca Cathey, Steven M. Beitzel, Eric C. Jensen,...
Abstract. The mathematical concept of document resemblance captures well the informal notion of syntactic similarity. The resemblance can be estimated using a fixed size “sketch...