Learning Ensembles of First-Order Clauses for Recall-Precision Curves: A Case Study in Biomedical Information Extraction



Many domains in the field of Inductive Logic Programming (ILP) involve highly unbalanced data. Our research has focused on Information Extraction (IE), a task that typically involves many more negative examples than positive examples. IE is the process of finding facts in unstructured text, such as biomedical journals, and putting those facts in an organized system. In particular, we have focused on learning to recognize instances of the protein-localization relationship in Medline abstracts. We view the problem as a machine-learning task: given positive and negative extractions from a training corpus of abstracts, learn a logical theory that performs well on a held-aside testing set. A common way to measure performance in these domains is to use precision and recall instead of simply using accuracy. We propose Gleaner, a randomized search method that collects good clauses from a broad spectrum of points along the recall dimension in recall-precision curves and employs an “at least N of th...
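To illustrate why the abstract prefers precision and recall over accuracy on unbalanced data, here is a minimal Python sketch. It is not taken from the paper: the function names and the toy data (5 positives among 100 examples) are assumptions made purely for illustration.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # true positives
    fp = sum(1 for p, a in pairs if p and not a)    # false positives
    fn = sum(1 for p, a in pairs if not p and a)    # false negatives
    # Define the ratio as 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(predicted, actual):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(1 for p, a in zip(predicted, actual) if p == a) / len(actual)


# Hypothetical unbalanced data set: 5 positives, 95 negatives.
actual = [True] * 5 + [False] * 95

# A trivial classifier that rejects everything looks good on accuracy
# but extracts nothing.
always_negative = [False] * 100
print(accuracy(always_negative, actual))          # 0.95
print(precision_recall(always_negative, actual))  # (0.0, 0.0)

# A classifier that finds 4 of the 5 positives at the cost of 4 false alarms.
some_hits = [True] * 4 + [False] + [True] * 4 + [False] * 91
print(accuracy(some_hits, actual))                # 0.95
print(precision_recall(some_hits, actual))        # (0.5, 0.8)
```

Both classifiers in this toy example reach the same accuracy, yet only precision and recall reveal that the first one never extracts a single fact.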

Mark Goadrich, Louis Oliphant, Jude W. Shavlik


ILP 2004 | Inductive Logic Programming | Many Domains | Standard Aleph Theories |


Post Info

Added	02 Jul 2010
Updated	02 Jul 2010
Type	Conference
Year	2004
Where	ILP
Authors	Mark Goadrich, Louis Oliphant, Jude W. Shavlik

