Language modeling for an inflected language such as Arabic poses new challenges for speech recognition and machine translation due to its rich morphology. Rich morphology results i...
This paper presents our latest efforts toward LVCSR systems for five Eastern European languages such as Bulgarian, Croatian, Czech, Polish, and Russian using our Rapid Language Ad...
Ngoc Thang Vu, Tim Schlippe, Franziska Kraus, Tanj...
Speech recognition of inflectional and morphologically rich languages like Czech is currently quite a challenging task, because simple n-gram techniques are unable to capture impo...
Abstract. In this article, we propose the use of suffix arrays to efficiently implement n-gram language models with practically unlimited size n. This approach, which is used with ...
This paper investigates the role of language in accessing information on the Internet. We combined data about website visitors through log-file analysis with data about web-hosts ...