The vocabulary used in speech usually consists of two types of words: a limited set of common words, shared across multiple documents, and a virtually unlimited set of rare words, ...
Stefan Kombrink, Mirko Hannemann, Lukas Burget, Hy...
This paper presents two different approaches to automatic captioning of geo-tagged images by summarizing multiple web-documents that contain information related to an image’s lo...
Abstract. Evaluation is one of the hardest tasks in automatic text summarization. It is perhaps even harder to determine how much a particular component of a summarization system c...
In this article we present a simple method to extract Spanish nouns with the linguistic property of “human” animacy. We describe a non-supervised method based on lexical patter...
In this paper, we study the use of heterogeneous data for training of acoustic models. In initial experiments, a significant drop of accuracy has been observed on in-domain test s...
Modern speech recognition applications are becoming very complex program packages. To understand the error behaviour of the ASR systems, a special diagnosis - a procedure or a tool...
Content-Based Copy Detection (CBCD) of digital videos is an important research field that aims at the identification of modified copies of an original clip, e.g., on the Intern...
Does there exist a compact set of visual topics in form of keyword clusters capable to represent all images visual content within an acceptable error? In this paper, we answer thi...
Bag-of-visual Words (BoW) image representation is getting popular in computer vision and multimedia communities. However, experiments show that the traditional BoW representation ...