Typographic and visual information is an integral part of textual documents. Most information extraction systems ignore most of this visual information, processing the text as a l...
Keyphrases provide semantic metadata that summarize and characterize documents. This paper describes Kea, an algorithm for automatically extracting keyphrases from text. Kea ident...
Ian H. Witten, Gordon W. Paynter, Eibe Frank, Carl...
Despite the huge amount of recent research efforts on entity resolution (matching) there has not yet been a comparative evaluation on the relative effectiveness and efficiency of ...
Abstract. This paper presents an architecture that enables the recognizer to learn incrementally and, thereby adapt to document image collections for performance improvement. We ar...
Abstract. For large state-space Markovian Decision Problems MonteCarlo planning is one of the few viable approaches to find near-optimal solutions. In this paper we introduce a new...