Sciweavers

LREC
2008
Using Movie Subtitles for Creating a Large-Scale Bilingual Corpora
This paper presents a method for compiling a large-scale bilingual corpus from a database of movie subtitles. To create the corpus, we propose an algorithm based on Gale and Churc...
Einav Itamar, Alon Itai
LREC
2008
Measures for Term and Sentence Relevances: an Evaluation for German
Terms, term relevances, and sentence relevances are concepts that figure in many NLP applications, such as Text Summarization. These concepts are implemented in various ways, thou...
Heike Bieler, Stefanie Dipper
LREC
2008
Frame Information Transfer from English to Italian
We describe an automatic projection algorithm for transferring frame-semantic information from English to Italian texts, as a first step towards the creation of Italian FrameNet. P...
Sara Tonelli, Emanuele Pianta
LREC
2008
JMWNL: an Extensible Multilingual Library for Accessing Wordnets in Different Languages
In this paper we present JMWNL, a multilingual extension of the JWNL java library, which was originally developed for accessing Princeton WordNet dictionaries. JMWNL broadens the ...
Maria Teresa Pazienza, Armando Stellato, Alexandra...
LREC
2008
A Real-World Emotional Speech Corpus for Modern Greek
The present paper deals with the design and the annotation of a Greek real-world emotional speech corpus. The speech data consist of recordings collected during the interaction of...
Theodoros Kostoulas, Todor Ganchev, Iosif Mporas, ...
LREC
2008
Subdomain Sensitive Statistical Parsing using Raw Corpora
Modern statistical parsers are trained on large annotated corpora (treebanks). These treebanks usually consist of sentences addressing different subdomains (e.g. sports, politics,...
Barbara Plank, Khalil Sima'an
LREC
2008
Word-Based or Morpheme-Based? Annotation Strategies for Modern Hebrew Clitics
Morphologically rich languages pose a challenge to the annotators of treebanks with respect to the status of orthographic (space-delimited) words in the syntactic parse trees. In s...
Reut Tsarfaty, Yoav Goldberg
LREC
2008
Extraction of Attribute Concepts from Japanese Adjectives
We describe various syntactic and semantic conditions for finding abstract nouns which refer to concepts of adjectives from a text, in an attempt to explore the creation of a thesaurus fr...
Kyoko Kanzaki, Francis Bond, Noriko Tomuro, Hitosh...
LREC
2008
Enhancing an English-Polish Electronic Dictionary for Multiword Expression Research
This paper describes a project aimed at converting a legacy representation of English idioms into an XML-based format. The project is set in the context of a large electronic Engl...
Piotr Banski, Radoslaw Moszczynski
LREC
2008
Automatic Rich Annotation of Large Corpus of Conversational transcribed speech: the Chunking Task of the EPAC Project
This paper describes the use of the CasSys platform in order to achieve the chunking of conversational speech transcripts by means of cascades of Unitex transducers. Our system is...
Jean-Yves Antoine, Abdenour Mokrane, Nathalie Frib...