In this paper we present the design, implementation and evaluation of SOBA, a system for ontology-based information extraction from heterogeneous data resources, including plain t...
Paul Buitelaar, Philipp Cimiano, Anette Frank, Mat...
Abstract. Biologists usually focus on only a small, individualized, subdomain of the huge domain of biology. With respect to their sub-domain, they often need data collected from v...
This paper proposes a method for finding related Web pages based on connectivity information of hyperlinks. As claimed by Kumar, a complete bipartite graph of Web pages can be reg...
We consider the problem of template-independent news extraction. The state-of-the-art news extraction method is based on template-level wrapper induction, which has two serious li...
Junfeng Wang, Xiaofei He, Can Wang, Jian Pei, Jiaj...