We report the results of an experiment to assess the ability of automated MT evaluation metrics to remain sensitive to variations in MT quality as the average quality of the compa...
This paper addresses the problem of synchronizing movie subtitles, which is necessary to improve alignment quality when building a parallel corpus out of translated subtitles. In ...
This paper describes a recently developed large Czech MWE database that currently contains 160 000 MWEs (treated as lexical units). It was compiled from various resources s...
This paper presents the automatic extension of Princeton WordNet with Named Entities (NEs). This new resource is called Named Entity WordNet. Our method maps the noun is-a hierarc...
The project presented here is part of a long-term research program aiming at a full lexicon grammar for Polish (SyntLex). The main concern of this project is computer-assisted a...
Grazyna Vetulani, Zygmunt Vetulani, Tomasz Obr&eci...
Linguists have long been producing grammatical descriptions of as yet undescribed languages. This is a time-consuming process, which has already adapted to improved technology for rec...
In this paper we discuss how linguistic and geographic distances can be related using a 3D visualization. We will convert linguistic data for locations along the German-Dutch bord...
Folkert de Vriend, Jan Pieter Kunst, Louis ten Bos...
The paper gives an overview of developments in Estonia in the field of Human Language Technologies. Despite the fact that Estonian is one of the smallest official languages...
We present the machine learning framework that we are developing to support explorative search for non-trivial linguistic configurations in low-density languages (langua...
Texts generated by automatic speech recognition (ASR) systems have some specificities, related to the idiosyncrasies of oral productions or the principles of ASR systems, that mak...