Unsupervised named-entity extraction from the Web: An experimental study

The KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an unsupervised, domain-independent, and scalable manner. The paper presents an overview of KNOWITALL's novel architecture and design principles, emphasizing its distinctive ability to extract information without any hand-labeled training examples. In its first major run, KNOWITALL extracted over 50,000 class instances, but suggested a challenge: How can we improve KNOWITALL's recall and extraction rate without sacrificing precision? This paper presents three distinct ways to address this challenge and evaluates their performance. Pattern Learning learns domain-specific extraction rules, which enable additional extractions. Subclass Extraction automatically identifies sub-classes in order to boost recall (e.g., "chemist" and "biologist" are identified as sub-classes of "scientist"). List Ex...

Oren Etzioni, Michael J. Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates

AI 2005 | Artificial Intelligence | Class Instances | Domain-specific Extraction Rules | Hand-labeled Training |

» Automatic extraction of clickable structured web contents for name entity queries

» WebSets extracting sets of entities from the web using unsupervised information extraction

» Learning to Extract Relations from the Web using Minimal Supervision

» URES an Unsupervised Web Relation Extraction System

» Exploiting web search to generate synonyms for entities

» Webderived resources for web information retrieval from conceptual hierarchies to attribut...

» Organizing and searching the world wide web of facts step two harnessing the wisdom of th...

Post Info

Added	15 Dec 2010
Updated	15 Dec 2010
Type	Journal
Year	2005
Where	AI
Authors	Oren Etzioni, Michael J. Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates
