Sciweavers

» Standardization of Speech Corpus
LREC
2008
13 years 10 months ago
Standardising Bilingual Lexical Resources According to the Lexicon Markup Framework
The Dutch HLT agency for language and speech technology (known as TST-centrale) at the Institute for Dutch Lexicology is responsible for the maintenance, distribution and accessib...
Isa Maks, Carole Tiberius, Remco van Veenendaal
ISDA
2008
IEEE
14 years 3 months ago
Compute the Term Contributed Frequency
In this paper, we propose an algorithm and data structure for computing the term contributed frequency (tcf) for all N-grams in a text corpus. Although term frequency is one of th...
Cheng-Lung Sung, Hsu-Chun Yen, Wen-Lian Hsu
ICTAI
2007
IEEE
14 years 3 months ago
On Evaluation Methodologies for Text Segmentation Algorithms
The WindowDiff evaluation measure [12] is becoming the standard criterion for evaluating text segmentation methods. Nevertheless, this metric is really not fair with regard to the...
Sylvain Lamprier, Tassadit Amghar, Bernard Levrat,...
SEMCO
2007
IEEE
14 years 2 months ago
Nested Named Entity Recognition in Historical Archive Text
This paper describes work on Named Entity Recognition (NER), in preparation for Relation Extraction (RE), on data from a historical archive organisation. As is often the case in t...
Kate Byrne
TSD
2007
Springer
14 years 2 months ago
Automatic Diacritic Restoration for Resource-Scarce Languages
The orthography of many resource-scarce languages includes diacritically marked characters. Falling outside the scope of the standard Latin encoding, these characters are...
Guy De Pauw, Peter W. Wagacha, Gilles-Maurice de S...