Sciweavers

» Introducing the Enron Corpus
COLING
2000
13 years 9 months ago
A Method of Measuring Term Representativeness - Baseline Method Using Co-occurrence Distribution
This paper introduces a scheme, which we call the baseline method, to define a measure of term representativeness and measures defined by using the scheme. The representativeness ...
Toru Hisamitsu, Yoshiki Niwa, Jun-ichi Tsujii
SIGIR
2002
ACM
13 years 7 months ago
Liberal relevance criteria of TREC: counting on negligible documents?
Most test collections (like TREC and CLEF) for experimental research in information retrieval apply binary relevance assessments. This paper introduces a four-point relevance scal...
Eero Sormunen
EMNLP
2010
13 years 5 months ago
Two Decades of Unsupervised POS Induction: How Far Have We Come?
Part-of-speech (POS) induction is one of the most popular tasks in research on unsupervised NLP. Many different methods have been proposed, yet comparisons are difficult to make s...
Christos Christodoulopoulos, Sharon Goldwater, Mar...
IWSDS
2010
13 years 5 months ago
D3 Toolkit: A Development Toolkit for Daydreaming Spoken Dialog Systems
Recently, various data-driven spoken language technologies have been applied to spoken dialog system development. However, the high cost of maintaining the spoken dialog systems is one ...
Donghyeon Lee, Kyungduk Kim, Cheongjae Lee, Junhwi...
LREC
2008
13 years 9 months ago
Design and Recording of Czech Audio-Visual Database with Impaired Conditions for Continuous Speech Recognition
In this paper we discuss the design, acquisition and preprocessing of a Czech audio-visual speech corpus. The corpus is intended for training and testing of existing audio-visual ...
Jana Trojanová, Marek Hrúz, Pavel Ca...