The World Wide Web can be viewed as a gigantic distributed database including millions of interconnected hosts, some of which publish information via web servers or peer-to-peer sys...
Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...
In this paper we present HyperJournal, an Open Source web application for publishing on-line Open Access scholarly journals. In the first part (sections 1-3) we briefly describe t...
Extracting semantic relations among entities is an important first step in various tasks in Web mining and natural language processing such as information extraction, relation de...
Our KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an autonomous, domain...
Oren Etzioni, Michael J. Cafarella, Doug Downey, A...