Abstract. In contrast to the electronic document collections found in contemporary digital libraries, systems applied in a cultural domain have to satisfy specific requirements wit...
Current research in the field of automatic plagiarism detection for text documents focuses on algorithms that compare plagiarized documents against potential original documents. Th...
Large search engines process thousands of queries per second over billions of documents, making query processing a major performance bottleneck. An important class of optimization...
Web mining systems exploit the redundancy of data published on the Web to automatically extract information from existing web documents. The first step in the Information Extract...
Kostyantyn M. Shchekotykhin, Dietmar Jannach, Gerh...
In this paper, a new binarization algorithm for degraded document images is proposed. The method is based on positive and negative pixel energies using the Laplacian of an image...