Gleaner: Creating ensembles of first-order clauses to improve recall-precision curves

Many domains in the field of Inductive Logic Programming (ILP) involve highly unbalanced data. A common way to measure performance in these domains is to use precision and recall instead of simply using accuracy. The goal of our research is to find new approaches within ILP particularly suited for large, highly skewed domains. We propose Gleaner, a randomized search method that collects good clauses from a broad spectrum of points along the recall dimension in recall-precision curves and employs an "at least L of these K clauses" thresholding method to combine sets of selected clauses. Our research focuses on Multi-Slot Information Extraction (IE), a task that typically involves many more negative examples than positive examples. We formulate this problem into a relational domain, using two large testbeds involving the extraction of important relations from the abstracts of biomedical journal articles. We compare Gleaner to ensembles of standard theories learned by Aleph, finding tha...
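The "at least L of these K clauses" combination rule mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: clauses are modeled here as plain Python predicates over dictionary-valued examples (an assumption made for illustration, since real ILP clauses are first-order rules), and the names at_least_l_of_k, precision_recall, and sweep_thresholds are hypothetical. Raising L makes the ensemble stricter, which typically trades recall for precision and so traces out points on a recall-precision curve.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for a first-order clause: a predicate over an example.
Example = Dict[str, bool]
Clause = Callable[[Example], bool]


def at_least_l_of_k(clauses: Sequence[Clause], example: Example, l: int) -> bool:
    """Label the example positive when at least `l` of the K clauses cover it."""
    covered = sum(1 for clause in clauses if clause(example))
    return covered >= l


def precision_recall(
    clauses: Sequence[Clause],
    examples: Sequence[Example],
    labels: Sequence[bool],
    l: int,
) -> Tuple[float, float]:
    """Compute (precision, recall) of the L-of-K ensemble on labeled examples."""
    tp = fp = fn = 0
    for example, is_positive in zip(examples, labels):
        predicted = at_least_l_of_k(clauses, example, l)
        if predicted and is_positive:
            tp += 1
        elif predicted and not is_positive:
            fp += 1
        elif not predicted and is_positive:
            fn += 1
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def sweep_thresholds(
    clauses: Sequence[Clause],
    examples: Sequence[Example],
    labels: Sequence[bool],
) -> List[Tuple[int, float, float]]:
    """Evaluate every threshold L = 1..K, giving one curve point per L."""
    return [
        (l, *precision_recall(clauses, examples, labels, l))
        for l in range(1, len(clauses) + 1)
    ]
```

Calling sweep_thresholds with K clauses returns K tuples of (L, precision, recall), one candidate point per threshold. How Gleaner selects the K clauses in the first place (its randomized search over recall ranges) is outside the scope of this sketch.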

Mark Goadrich, Louis Oliphant, Jude W. Shavlik

Inductive Logic Programming | Machine Learning | Many Domains | ML 2006 | Randomized Search Method

Added	14 Dec 2010
Updated	14 Dec 2010
Type	Journal
Year	2006
Where	ML
Authors	Mark Goadrich, Louis Oliphant, Jude W. Shavlik
