In this paper, we propose a novel method for automatic segmentation of a Sanskrit string into different words. The input for our segmentizer is a Sanskrit string either encoded as...
We present the state of the art of a computational platform for the analysis of classical Sanskrit. The platform comprises modules for phonology, morphology, segmentation and shal...
Abstract. Language software applications encounter new words, e.g., acronyms, technical terminology, names or compounds of such words. In order to add new words to a lexicon, we ne...
We describe a Bayesian inference algorithm that can be used to train any cascade of weighted finite-state transducers on end-toend data. We also investigate the problem of automat...
David Chiang, Jonathan Graehl, Kevin Knight, Adam ...
In morphologically rich languages such as Arabic, the abundance of word forms resulting from increased morpheme combinations is significantly greater than for languages with fewer...