Learning Interestingness Measures in Terminology Extraction. A ROC-based approach


Abstract. In the field of Text Mining, a key phase in data preparation is concerned with the extraction of terms, i.e. collocations of words attached to specific concepts (e.g. Philosophy-Dissertation). In this paper, Term Extraction is formalized as a supervised learning task, extracting a ranking hypothesis from a set of terms labeled as relevant/irrelevant by the expert. This task is tackled using the evolutionary algorithm ROGER, optimizing the area under the ROC curve attached to a ranking hypothesis. Empirical validation on two real-world applications demonstrates outstanding improvements compared to state-of-the-art interestingness measures in Term Extraction. The approach is found robust across domains (Molecular Biology, Curriculum Vitæ) and languages (English, French).
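The optimization criterion named in the abstract, the area under the ROC curve of a ranking hypothesis, can be illustrated with a minimal sketch. This is not ROGER and not the authors' code: it only computes the criterion for a given scoring of candidate terms, using the standard Wilcoxon-Mann-Whitney formulation (the fraction of relevant/irrelevant pairs that the ranking orders correctly, with ties counted as one half). The function name `auc` and the toy scores and labels are assumptions made up for illustration.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Area under the ROC curve via the Wilcoxon-Mann-Whitney statistic.

    scores: one score per candidate term (higher means ranked earlier).
    labels: True if the expert marked the term relevant, False otherwise.
    """
    relevant = [s for s, y in zip(scores, labels) if y]
    irrelevant = [s for s, y in zip(scores, labels) if not y]
    if not relevant or not irrelevant:
        raise ValueError("need at least one relevant and one irrelevant term")

    # Count pairs (relevant, irrelevant) that the ranking orders correctly.
    correct = 0.0
    for r in relevant:
        for i in irrelevant:
            if r > i:
                correct += 1.0
            elif r == i:
                correct += 0.5  # a tie counts as half a correct pair
    return correct / (len(relevant) * len(irrelevant))


# Hypothetical toy data: five candidate terms scored by some measure,
# two of which the expert labeled relevant.
candidate_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
expert_labels = [True, False, True, False, False]
toy_auc = auc(candidate_scores, expert_labels)
print(f"AUC = {toy_auc:.3f}")
```

In this toy case five of the six relevant/irrelevant pairs are ordered correctly, so the value is 5/6. A learner of the kind described in the abstract would search for a scoring function that maximizes this quantity on the expert-labeled terms.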

Mathieu Roche, Jérôme Azé, Yves Kodratoff, Michèle Sebag


Artificial Intelligence | Ranking Hypothesis | ROCAI 2004 | Supervised Learning Task | Term Extraction


Added	02 Jul 2010
Updated	02 Jul 2010
Type	Conference
Year	2004
Where	ROCAI
Authors	Mathieu Roche, Jérôme Azé, Yves Kodratoff, Michèle Sebag
