Algorithms for Minimum Risk Chunking

Abstract. Stochastic finite automata are useful for identifying substrings (chunks) within larger units of text. Relevant applications include tokenization, base-NP chunking, named entity recognition, and other information extraction tasks. For a given input string, a stochastic automaton represents a probability distribution over strings of labels encoding the location of chunks. For chunking and extraction tasks, the quality of predictions is evaluated in terms of precision and recall of the chunked/extracted phrases when compared against some gold standard. However, traditional methods for estimating the parameters of a stochastic finite automaton and for decoding the best hypothesis do not pay attention to the evaluation criterion, which we take to be the well-known F-measure. We are interested in methods that remedy this situation, both in training and decoding. Our main result is a novel algorithm for efficiently evaluating expected F-measure. We present the algorithm and disc...
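To make the evaluation criterion concrete, the following is a minimal sketch of chunk-level F-measure and of decoding by maximum expected F-measure. It is not the efficient algorithm the abstract refers to: it enumerates a tiny, explicitly listed distribution by brute force. The B/I/O label encoding, the function names (chunks, f_measure, expected_f, min_risk_decode), the toy probabilities, and the convention that two empty chunk sets score 1.0 are all assumptions made for illustration.

```python
def chunks(labels):
    """Return the set of (start, end) spans encoded by a B/I/O label string.

    Assumed encoding: B begins a chunk, I continues it, O is outside.
    An I with no open chunk is ignored.
    """
    spans, start = set(), None
    for i, lab in enumerate(labels):
        if lab != "I" and start is not None:  # B or O closes the open chunk
            spans.add((start, i))
            start = None
        if lab == "B":
            start = i
    if start is not None:  # chunk running to the end of the string
        spans.add((start, len(labels)))
    return spans


def f_measure(gold, pred, beta=1.0):
    """Chunk-level F_beta of a predicted labeling against a gold labeling."""
    g, p = chunks(gold), chunks(pred)
    denom = beta ** 2 * len(g) + len(p)
    if denom == 0:
        return 1.0  # assumed convention: no chunks on either side is a perfect match
    return (1 + beta ** 2) * len(g & p) / denom


def expected_f(dist, hyp, beta=1.0):
    """Expected F-measure of hyp when the true labeling is drawn from dist."""
    return sum(prob * f_measure(labels, hyp, beta) for labels, prob in dist.items())


def min_risk_decode(dist, beta=1.0):
    """Pick the labeling with the highest expected F-measure.

    Simplification: candidates are restricted to the support of dist.
    """
    return max(dist, key=lambda hyp: expected_f(dist, hyp, beta))


# Toy distribution over label strings for a two-token input (made-up numbers).
dist = {"OO": 0.40, "BO": 0.35, "BB": 0.25}
most_probable = max(dist, key=dist.get)
best_expected_f = min_risk_decode(dist)
```

In this toy distribution the most probable labeling is "OO", while the labeling with the highest expected F-measure is "BO", which shows how decoding for the evaluation criterion can differ from decoding for the most probable label string.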

Martin Jansche

Extraction Tasks | FSMNLP 2005 | Natural Language Processing | Stochastic Finite Automata | Stochastic Finite Automaton

Post Info

Added	27 Jun 2010
Updated	27 Jun 2010
Type	Conference
Year	2005
Where	FSMNLP
Authors	Martin Jansche
