We propose and analyze a distribution learning algorithm for a subclass of Acyclic Probabilistic Finite Automata (APFA). This subclass is characterized by a certain distinguishability property of the automata’s states. Though hardness results are known for learning distributions generated by general APFAs, we prove that our algorithm can indeed efficiently learn distributions generated by the subclass of APFAs we consider. In particular, we show that the KL-divergence between the distribution generated by the target source and the distribution generated by our hypothesis can be made small with high confidence in polynomial time. We present two applications of our algorithm. In the first, we show how to model cursively written letters. The resulting models are part of a complete cursive handwriting recognition system. In the second application, we demonstrate how APFAs can be used to build multiple-pronunciation models for spoken words. We evaluate the APFA-based pronunciation models on la...