Web count statistics gathered from search engines have been widely used as a resource in a variety of NLP tasks. For some tasks, however, the information they exploit is not fine-...
In this paper we describe ANNALIST (Annotation, Alignment and Scoring Tool), a scoring system for the evaluation of the output of semantic annotation systems. ANNALIST has been de...
George Demetriou, Robert J. Gaizauskas, Haotian Su...
Language models used in current automatic speech recognition systems are trained on general-purpose corpora and are therefore poorly suited to transcribing spoken documents dealing w...
This paper introduces a new architecture that aims at combining molecular biology data with information automatically extracted from scientific literature (using text mining techn...
This paper describes the methodology used to evaluate the TC-STAR speech-to-speech translation (SST) system and the results from the third year of the project. It fol...
This paper presents a general methodology for mapping EuroWordNets (Vossen, 1998) to the Suggested Upper Merged Ontology (SUMO; Niles and Pease, 2001), and we show its applicatio...
Open answers in questionnaires contain valuable information that is very time-consuming to analyze manually. We present a method for hypothesis generation from questionnaires base...
While the Web is facing interesting new changes in the way users access it, interact with it, and even participate in its growth, the most traditional applications dedicated to its fruition: ...
Maria Teresa Pazienza, Marco Pennacchiotti, Armand...
This paper discusses the problem of utilising multiply annotated data in training biomedical information extraction systems. Two corpora, annotated with entities and relations, an...