We present sppc, a high-performance system for intelligent text extraction and navigation from German free text documents. sppc consists of a set of domainindependent shallow core...
Named entities play an important role in Information Extraction. They represent unitary namable information within text. In this work, we focus on groups of named entities of the s...
Parallel corpora are indispensable resources for a variety of multilingual natural language processing tasks. This paper presents a technique for fully automatic construction of c...
A weakly-supervised extraction method identifies concepts within conceptual hierarchies, at the appropriate level of specificity (e.g., Bank vs. Institution), to which attribute...
This work applies boosted wrapper induction (BWI), a machine learning algorithm for information extraction from semi-structured documents, to the problem of named entity recogniti...