This paper presents a novel method for extracting information from collections of Web pages across different sites. Our method uses a standard wrapper induction algorithm and explo...
The entity resolution (ER) problem, which identifies duplicate entities that refer to the same real-world entity, is essential in many applications. In this paper, in particular,...
Byung-Won On, Ergin Elmacioglu, Dongwon Lee, Jaewo...
Tables are a universal idiom to present relational data. Billions of tables on Web pages express entity references, attributes and relationships. This representation of relational...
This paper presents a Named Entity Recognition (NER) method dedicated to processing speech transcriptions. The main principle behind this method is to collect in an unsupervised way ...
Background: The ability to distinguish between genes and proteins is essential for understanding biological text. Support Vector Machines (SVMs) have been proven to be very effici...
Tapio Pahikkala, Filip Ginter, Jorma Boberg, Jouni...