Long-span features, such as syntax, can improve language models for tasks such as speech recognition and machine translation. However, these language models can be difficult to u...
We present a directed Markov random field (MRF) model that combines n-gram models, probabilistic context-free grammars (PCFGs), and probabilistic latent semantic analysis (PLSA) fo...
Shaojun Wang, Shaomin Wang, Russell Greiner, Dale ...
There are many accurate methods for language identification of long text samples, but identification of very short strings still presents a challenge. This paper studies a languag...
User interfaces (UIs) are essential components of most software systems, and significantly affect the effectiveness of installed applications. In addition, UIs often represent a s...
The amount of transcribed Thai broadcast news text available for training a language model is still very limited compared to other major languages. Since the construction of a b...