Lexical Attraction Models (LAMs) were first introduced by Deniz Yuret (Yuret 1998) to exemplify how an algorithm can learn word dependencies from raw text. His general thesis i...
This paper introduces new methods based on exponential families for modeling the correlations between words in text and speech. While previous work assumed the effects of word co-...
Semistatic word-based byte-oriented compression codes are known to be attractive alternatives for compressing natural language texts. With compression ratios around 30%, they allow di...
In this article, we study the differences between European languages using statistical and unsupervised methods. The analysis is conducted at different levels of languag...
Kimmo Kettunen, Markus Sadeniemi, Tiina Lindh-Knuu...