Corpora of sentences annotated with grammatical information have been deployed by extending the basic lexical and morphological data with increasingly complex information, such as...
In this paper we report a way of constructing a translation corpus that contains not only source and target texts, but also draft and final versions of target texts, through the transl...
Compilation of a 100-million-word balanced corpus, the Balanced Corpus of Contemporary Written Japanese (BCCWJ), is underway at the National Institute for Japanese Langua...
In this paper, we present several ways to measure and evaluate the annotation and annotators, proposed and used during the building of the Czech part of the Prague Czech-English D...
After a brief overview of the elements of modern grid computing, a number of common use-cases of natural language processing tasks running on the grid are presented, notably corpu...
Many contemporary language technology systems are characterized by long pipelines of tools with complex dependencies. Too often, these workflows are implemented by ad hoc scripts;...
This paper reports our experience integrating different resources and services into a grid environment. The use case we address implies the deployment of several NLP application...
In this paper we investigate the impact of data size on a Word Sense Disambiguation (WSD) task. We question the assumption that the knowledge acquisition bottleneck, which is known...
We will look at how maps can be integrated in research resources, such as language databases and language corpora. By using maps, search results can be illustrated in a way that i...
Janne Bondi Johannessen, Kristin Hagen, Anders N&o...