Documents often inherently contain many concepts reflecting specific and generic aspects. To automatically generate a short summary text of documents on similar topics, it is im...
Document registration is a problem where the image of a template document whose layout is known is registered with a test document image. Given the registration parameters, layout...
This work addresses the problem of document image analysis, and more particularly the topic of document structure recognition in old, damaged and handwritten documents. The goal of...
This paper reports on the development and application of strategies and tools for geographic information seeking and knowledge building that leverage unstructured text resources ...
Brian M. Tomaszewski, Justine Blanford, Kevin Ross...
Previous studies have demonstrated that document clustering performance can be improved significantly in lower dimensional linear subspaces. Recently, matrix factorization base...
The use of semantic information to improve IR is a long-standing goal. This paper presents a novel Document Expansion method based on a WordNet-based system to find related concep...
Cross Document Coreference (CDC) is the task of constructing the coreference chain for mentions of a person across a set of documents. This work offers a holistic view of using do...
Jian Huang, Pucktada Treeratpituk, Sarah M. T...
Because of the increasing amount of electronic data, designing efficient tools to retrieve and exploit documents is a major challenge. Current search engines suffer from two main d...
Sylvie Ranwez, Vincent Ranwez, Mohameth-Franç...
Structured Information Retrieval has been gaining a lot of interest in recent years, as this kind of information is becoming an invaluable asset for professional communities such as Sof...
In this paper, we introduce a visualization method that couples a trend chart with word clouds to illustrate temporal content evolutions in a set of documents. Specifically, we us...