PAC-Learnability of Probabilistic Deterministic Finite State Automata in Terms of Variation Distance



We consider the problem of PAC-learning distributions over strings, represented by probabilistic deterministic finite automata (PDFAs). PDFAs are a probabilistic model for generating strings of symbols, and have been used in speech and handwriting recognition and in bioinformatics. Recent work on learning PDFAs from random examples has used the KL-divergence as the error measure; here we use the variation distance. We build on recent work by Clark and Thollard, and show that using the variation distance both simplifies the algorithms and strengthens the results. In particular, with the variation distance we obtain polynomial sample-size bounds that are independent of the expected length of strings.
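As a rough illustration of the objects named in the abstract, the sketch below encodes two tiny PDFAs and compares the distributions they define. Everything in it is an assumption made for illustration, not taken from the paper: the dictionary encoding, the reserved end-of-string symbol "$", the function names `string_prob` and `truncated_l1`, and the example automata `PDFA_A` and `PDFA_B`. The distance is computed as a plain L1 sum over strings up to a fixed length; conventions differ on whether the variation distance carries a factor of 1/2, and no such factor is applied here.

```python
from itertools import product

# Hypothetical encoding: state -> {symbol: (probability, next_state)}.
# The reserved symbol "$" means "stop"; its next_state is None.
# Outgoing probabilities of each state sum to 1.
PDFA_A = {0: {"a": (0.5, 0), "$": (0.5, None)}}
PDFA_B = {0: {"a": (0.25, 0), "$": (0.75, None)}}


def string_prob(pdfa, start, s):
    """Probability that the PDFA emits exactly the symbols in s, then stops."""
    prob, state = 1.0, start
    for sym in s:
        if sym not in pdfa[state]:
            return 0.0  # no transition on this symbol
        p, state = pdfa[state][sym]
        prob *= p
    # Multiply by the probability of stopping in the final state.
    return prob * pdfa[state].get("$", (0.0, None))[0]


def truncated_l1(pdfa_a, pdfa_b, start, alphabet, max_len):
    """Sum of |P_a(s) - P_b(s)| over all strings s with len(s) <= max_len.

    Every term is non-negative, so this is a lower bound on the full L1 sum.
    """
    total = 0.0
    for n in range(max_len + 1):
        for s in product(alphabet, repeat=n):
            total += abs(string_prob(pdfa_a, start, s)
                         - string_prob(pdfa_b, start, s))
    return total


if __name__ == "__main__":
    print(string_prob(PDFA_A, 0, "aa"))
    print(truncated_l1(PDFA_A, PDFA_B, 0, ["a"], 10))
```

Because the enumeration grows exponentially with `max_len`, this is only practical for toy alphabets; it is meant to make the error measure concrete, not to suggest how a learning algorithm would estimate it.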

Nick Palmer, Paul W. Goldberg


ALT 2005 | Distance Allows Simplifications | Machine Learning | Polynomial Sample Size | Variation Distance


Added	14 Mar 2010
Updated	14 Mar 2010
Type	Conference
Year	2005
Where	ALT
Authors	Nick Palmer, Paul W. Goldberg


