Abstract. Previous work has shown that modeling relationships between articles of a regulation as vertices of a graph network works twice as well as traditional information ret...
Abstract. The Mongue-Elkan method is a general text string comparison method based on an internal character-based similarity measure (e.g. edit distance) combined with a token leve...
Sergio Jimenez, Claudia Becerra, Alexander F. Gelb...
Abstract. Automated language identification of written text is a well-established research domain that has received considerable attention in the past. By now, efficient and effecti...
The aim of this work is to evaluate the dependency-based annotation of EPEC (the Reference Corpus for the Processing of Basque) by means of an experiment: two annotators have synta...
Abstract. Like any other classification task, Word Sense Disambiguation requires a large number of training examples. These examples, which are easily obtained for most tasks,...
Abstract. One question that arises if we want to evolve generation techniques to accommodate Web ontologies is how to capture and expose the relevant ontology content to the user. ...
Abstract. In many contexts today, documents are available in a number of versions. In addition to explicit knowledge that can be queried/searched in documents, these documents also...
Where has the field been, and where is it going? It is relatively easy to know where we have been, but harder (and more valuable) to know where we are going. The title of this paper...
We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the ...
This paper presents a method for improving phrase-based Statistical Machine Translation systems by enriching the original translation model with information derived from a multilin...