This paper presents an end-to-end administrative document analysis system. This system uses case-based reasoning in order to process documents from known and unknown classes. For ...
abstract of invited paper Document management has many aspects, among them acquisition, storage, retrieval, presentation and processing of documents (work flow). These aspects will...
Proliferation of digital libraries plus availability of electronic documents from the Internet have created new challenges for computer science researchers and professionals. Docum...
We present a novel approach for classifying documents that combines different pieces of evidence (e.g., textual features of documents, links, and citations) transparently, through...
Adriano Veloso, Wagner Meira Jr., Marco Cristo, Ma...
The goal of distributed information retrieval is to support effective searching over multiple document collections. For efficiency, queries should be routed to only those collectio...
Abstract. In this paper, we present an approach for classifying documents based on the notion of a semantic similarity and the effective representation of the content of the docume...
The first steps towards bridging the paper-digital divide have been achieved with the development of a range of technologies that allow printed documents to be linked to digital c...
Abstract. The automatic detection of shared content in written documents –which includes text reuse and its unacknowledged commitment, plagiarism– has become an important probl...
According to the logical model of Information Retrieval (IR), the task of IR can be described as the extraction, from a given document base, of those documents d that, given a que...
Carlo Meghini, Fabrizio Sebastiani, Umberto Stracc...
Knowledge Discovery in Databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them. While most work on...