This paper deals with the treatment of constructed neologisms in a machine translation system. It focuses on a particular issue in Romance languages: relational adjectives and the...
Word sketches are part of the Sketch Engine corpus query system. They represent automatic, corpus-derived summaries of the words' grammatical and collocational behaviour. Bes...
Kremena Ivanova, Ulrich Heid, Sabine Schulte im Wa...
In this paper, a large-scale real-world speech database is introduced along with other multimedia driving data. We designed a data collection vehicle equipped with various sensors...
Tokenization is one of the initial steps done for almost any text processing task. It is not particularly recognized as a challenging task for English monolingual systems but it r...
A core ontology is a mid-level ontology which bridges the gap between an upper ontology and a domain ontology. Automatic Chinese core ontology construction can help quickly model ...
Reported speech in the form of direct and indirect reported speech is an important indicator of evidentiality in traditional newspaper texts, but also increasingly in the new medi...
A number of forensic studies published during the last 50 years report that intoxication with alcohol influences speech in a way that is made manifest in certain features of the s...
Florian Schiel, Christian Heinrich, Sabine Barf&uu...
Large speech and text corpora are crucial to the development of a state-of-the-art speech recognition system. This paper reports on the construction and evaluation of the first Th...
This paper describes part of the corpus collection efforts underway in the EC funded Companions project. The Companions project is collecting substantial quantities of dialogue a ...
Yorick Wilks, David Benyon, Christopher Brewster, ...