Documents often have inherently parallel structure: they may consist of a text and ries, or an abstract and a body, or parts presenting alternative views on the same problem. Reve...
In this paper, we describe our work on building a parallel treebank for a less studied and typologically dissimilar language pair, namely Swedish and Turkish. The treebank is a ba...
Abstract. This paper presents the evaluation methods and the preliminary results of a combined thematic segmentation of (a) meeting documents and (b) meeting speech transcript. Our ...
With the advent of XML we have seen a renewed interest in methods for computing the difference between trees. Methods that include heuristic elements play an important role in pr...
Tancred Lindholm, Jaakko Kangasharju, Sasu Tarkoma
We propose a language-independent method for the automatic extraction of transliteration pairs from parallel corpora. In contrast to previous work, our method uses no form of supe...