We present the first known empirical study on speech summarization without lexical features for Mandarin broadcast news. We evaluate acoustic, lexical and structural features as ...
This paper presents empirical results that contradict the prevailing opinion that entity extraction is a boring solved problem. In particular, we consider data sets that resemble ...
This paper investigates the problem of bootstrapping a statistical dialogue manager without access to training data and proposes a new probabilistic agenda-based method for simula...
Jost Schatzmann, Blaise Thomson, Karl Weilhammer, ...
This paper describes the CMU/InterACT effort in developing an Arabic Automatic Speech Recognition (ASR) system for broadcast news and conversations within the GALE 2006 evaluation...
We present an approach to using multiple preprocessing schemes to improve statistical word alignments. We show a relative reduction of alignment error rate of about 38%.
The aim of this work is to show the ability of finite-state transducers to simultaneously translate speech into multiple languages. Our proposal deals with an extension of stocha...
We evaluate semantic relatedness measures on different German datasets showing that their performance depends on: (i) the definition of relatedness that was underlying the constr...
We describe our linguistic rule-based tagger IceTagger, and compare its tagging accuracy to the TnT tagger, a state-of-theart statistical tagger, when tagging Icelandic, a morphol...
Customization to specific domains of discourse and/or user requirements is one of the greatest challenges for today’s Information Extraction (IE) systems. While demonstrably eff...