Sciweavers

CICLING
2009
Springer
14 years 11 months ago
Exploiting Parallel Treebanks to Improve Phrase-Based Statistical Machine Translation
We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the ...
John Tinsley, Mary Hearne, Andy Way
CICLING
2009
Springer
14 years 11 months ago
Enriching Statistical Translation Models Using a Domain-Independent Multilingual Lexical Knowledge Base
This paper presents a method for improving phrase-based Statistical Machine Translation systems by enriching the original translation model with information derived from a multilin...
Miguel García, Jesús Giménez,...
CICLING
2009
Springer
14 years 11 months ago
Cross-Language Frame Semantics Transfer in Bilingual Corpora
Work on the transfer of semantic information across languages has recently been applied to the development of resources annotated with Frame information for different non-En...
Roberto Basili, Diego De Cao, Danilo Croce, Bonave...
CICLING
2009
Springer
14 years 11 months ago
Semi-supervised Clustering for Word Instances and Its Effect on Word Sense Disambiguation
We propose a supervised word sense disambiguation (WSD) system that uses features obtained from clustering results of word instances. Our approach is novel in that we employ semi-s...
Kazunari Sugiyama, Manabu Okumura
CICLING
2009
Springer
14 years 11 months ago
Business Specific Online Information Extraction from German Websites
This paper presents a system that uses the domain name of a German business website to locate its information pages (e.g. company profile, contact page, imprint) and then identifi...
Yeong Su Lee, Michaela Geierhos
CICLING
2009
Springer
14 years 11 months ago
Improved Unsupervised Name Discrimination with Very Wide Bigrams and Automatic Cluster Stopping
We cast name discrimination as a problem in clustering short contexts. Each occurrence of an ambiguous name is treated independently, and represented using second-order context vec...
Ted Pedersen
CICLING
2009
Springer
14 years 11 months ago
A Karaka Based Annotation Scheme for English
Ashwini Vaidya, Samar Husain, Prashanth Mannem, Di...
CICLING
2009
Springer
14 years 11 months ago
Reducing the Plagiarism Detection Search Space on the Basis of the Kullback-Leibler Distance
Automatic plagiarism detection considering a reference corpus compares a suspicious text to a set of original documents in order to relate the plagiarised fragments to th...
Alberto Barrón-Cedeño, Paolo Rosso, ...
CICLING
2009
Springer
14 years 11 months ago
A General Method for Transforming Standard Parsers into Error-Repair Parsers
A desirable property for any system dealing with unrestricted natural language text is robustness, the ability to analyze any input regardless of its grammaticality. In this paper ...
Carlos Gómez-Rodríguez, Miguel A. Al...
CICLING
2009
Springer
14 years 11 months ago
Guessers for Finite-State Transducer Lexicons
Language software applications encounter new words, e.g., acronyms, technical terminology, names or compounds of such words. In order to add new words to a lexicon, we ne...
Krister Lindén