Comparing Set-Covering Strategies for Optimal Corpus Design



This article addresses the problem of designing the linguistic content of a speech corpus. Depending on the target task, the phonological and linguistic content of the corpus is controlled by collecting a set of sentences that covers a preset description of phonological attributes while keeping the overall duration as small as possible. This goal is classically achieved with greedy algorithms, which do not, however, guarantee that the resulting cover is optimal. In recent work, a Lagrangian-based algorithm called LamSCP has been used to extract diphoneme coverings from a large French corpus, giving better results than a greedy algorithm. We continue the comparison of the two algorithms in terms of shortest duration, stability, and robustness by building multi-represented diphoneme or triphoneme coverings. These coverings, extracted from an English corpus, correspond to very large-scale optimization problems. For each experiment, LamSCP improves the greedy results from 3.9 to 9.7 per...
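To make the greedy baseline concrete, here is a minimal sketch of a duration-weighted greedy set cover with multi-representation. It is not the authors' implementation: the function name `greedy_cover`, the selection score (units still needed per second of speech), and the toy data are all assumptions made for illustration. LamSCP itself is not shown.

```python
from typing import Dict, List, Set


def greedy_cover(
    sentences: Dict[str, Set[str]],
    durations: Dict[str, float],
    required: int = 1,
) -> List[str]:
    """Greedily pick sentences until every unit (e.g. a diphoneme) appears in
    at least `required` chosen sentences, or in all sentences that contain it
    when fewer than `required` exist. Durations must be positive."""
    # Count how many sentences contain each unit.
    availability: Dict[str, int] = {}
    for units in sentences.values():
        for unit in units:
            availability[unit] = availability.get(unit, 0) + 1

    # Remaining demand per unit, capped by what the corpus can supply.
    need = {unit: min(required, count) for unit, count in availability.items()}

    chosen: List[str] = []
    remaining = set(sentences)

    while any(need.values()):
        # Assumed score: still-needed units covered per unit of duration.
        def score(sentence_id: str) -> float:
            gain = sum(1 for unit in sentences[sentence_id] if need[unit] > 0)
            return gain / durations[sentence_id]

        # Sorting makes tie-breaking deterministic (first id in sorted order).
        best = max(sorted(remaining), key=score)
        chosen.append(best)
        remaining.remove(best)
        for unit in sentences[best]:
            if need[unit] > 0:
                need[unit] -= 1

    return chosen


# Hypothetical toy corpus: sentence id -> set of diphonemes, plus durations in seconds.
toy_sentences = {
    "s1": {"a-b", "b-c"},
    "s2": {"b-c", "c-d"},
    "s3": {"a-b", "b-c", "c-d"},
}
toy_durations = {"s1": 1.0, "s2": 1.0, "s3": 2.5}

selection = greedy_cover(toy_sentences, toy_durations, required=1)
total_duration = sum(toy_durations[s] for s in selection)
```

Each step is locally best, which is why such a procedure carries no optimality guarantee: an early choice can force extra sentences later, and that gap is what a Lagrangian-based method aims to reduce.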

Jonathan Chevelu, Nelly Barbot, Olivier Boëff


Education | Greedy Algorithm | Linguistic Content | LREC 2008 | Speech Corpus |




Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LREC
Authors	Jonathan Chevelu, Nelly Barbot, Olivier Boëffard, Arnaud Delhay
