The paper introduces a new framework for feature learning in classification motivated by information theory. We first systematically study the information structure and present a n...
In this paper, we propose an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use t...
Berthier A. Ribeiro-Neto, Alberto H. F. Laender, A...
The Semantic Web promises to provide timely, targeted access to user-specified information online. Though standardized services exist for performing this work, specifying...
In this paper we discuss the possible application of new concepts in web content extraction: utility assessment, utility annealing, and dynamic aggregated document generation. Aft...
A variety of information extraction techniques rely on the fact that instances of the same relation are "distributionally similar," in that they tend to appear in simila...