Sciweavers

CLIN
2000
Transforming a Chunker to a Parser
Ever since the landmark paper Ramshaw and Marcus (1995), machine learning systems have been used successfully for identifying base phrases (chunks), the bottom constituents of a p...
Erik F. Tjong Kim Sang
CLIN
2000
Proper Name Extraction from Non-Journalistic Texts
This paper discusses the influence of the corpus on the automatic identification of proper names in texts. Techniques developed for the newswire genre are generally not sufficient...
Thierry Poibeau, Leila Kosseim
CLIN
2000
Syntactic Annotation for the Spoken Dutch Corpus Project (CGN)
Of the ten million words of contemporary standard Dutch in the Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN), a selection of one million words of natural spoken language ...
Heleen Hoekstra, Michael Moortgat, Ineke Schuurman...
CLIN
2000
Alpino: Wide-coverage Computational Analysis of Dutch
Alpino is a wide-coverage computational analyzer of Dutch which aims at accurate, full parsing of unrestricted text. We describe the head-driven lexicalized grammar and the lexic...
Gosse Bouma, Gertjan van Noord, Rob Malouf
CLIN
2001
Multi-feature Error Detection in Spoken Dialogue Systems
The present paper evaluates the role selected features and feature combinations play for error detection in spoken dialogue systems. We investigate the relevance of various, readi...
Piroska Lendvai, Antal van den Bosch, Emiel Krahme...
CLIN
2001
The Alpino Dependency Treebank
In this paper we present the Alpino Dependency Treebank and the tools that we have developed to facilitate the annotation process. Annotation typically starts with parsing a sente...
Leonoor van der Beek, Gosse Bouma, Rob Malouf, Ger...
CLIN
2001
Memory-Based Phoneme-to-Grapheme Conversion
In this paper, we describe a method to enhance the readability of out-of-vocabulary items (OOVs) in the textual output in a large vocabulary continuous speech recognition system. ...
Bart Decadt, Jacques Duchateau, Walter Daelemans, ...
CLIN
2001
Applying Monte Carlo Techniques to Language Identification
Two major stages in language identification systems can be identified: the language modeling stage, where the distinctive features of languages are determined and stored in...
Arjen Poutsma
CLIN
2001
A Named Entity Recognition System for Dutch
We describe a Named Entity Recognition system for Dutch that combines gazetteers, handcrafted rules, and machine learning on the basis of seed material. We used gazetteers and a c...
Fien De Meulder, Walter Daelemans, Véroniqu...
CLIN
2001
Creating a Dutch Information Retrieval Test Corpus
This paper describes the first large-scale evaluation of information retrieval systems using Dutch documents and queries. We describe in detail the characteristics of the Dutch te...
Djoerd Hiemstra, David van Leeuwen