Abstract. In the paper we introduce the on-line captioning system developed by our teams and used by the Czech Television (CTV), the public service broadcaster in the Czech Republi...
Abstract. Maltparser is a contemporary dependency parsing machine learningbased system that shows great accuracy. However 90% for Labelled Attachment Score (LAS) seems to be a de f...
Abstract. We present a systematic comparison of preprocessing techniques for two language pairs: English-Czech and English-Hindi. The two target languages, although both belonging ...
The vocabulary used in speech usually consists of two types of words: a limited set of common words, shared across multiple documents, and a virtually unlimited set of rare words, ...
Stefan Kombrink, Mirko Hannemann, Lukas Burget, Hy...
This paper presents two different approaches to automatic captioning of geo-tagged images by summarizing multiple web-documents that contain information related to an image’s lo...
Abstract. Evaluation is one of the hardest tasks in automatic text summarization. It is perhaps even harder to determine how much a particular component of a summarization system c...
In this article we present a simple method to extract Spanish nouns with the linguistic property of “human” animacy. We describe a non-supervised method based on lexical patter...
In this paper, we study the use of heterogeneous data for training of acoustic models. In initial experiments, a significant drop of accuracy has been observed on in-domain test s...