We describe a web application called "ANC2Go" that enables the user to select data from the Open American National Corpus (OANC) and the Manually Annotated Sub-corpus (M...
A SuperSense Tagger is a tool for the automatic analysis of texts that associates to each noun, verb, adjective and adverb a semantic category within a general taxonomy. The devel...
Giuseppe Attardi, Stefano Dei Rossi, Giulia Di Pie...
This article presents a new freely available trilingual corpus (Catalan, Spanish, English) that contains large portions of the Wikipedia and has been automatically enriched with l...
Samuel Reese, Gemma Boleda, Montse Cuadros, Llu&ia...
In this paper we present a partial dependency parser for Irish, in which Constraint Grammar (CG) rules are used to annotate dependency relations and grammatical functions in unres...
Questions are not asked in isolation. Their context, viz. the preceding interactions, might be of help to understand them and retrieve the correct answer. Previous research in Int...
Raffaella Bernardi, Manuel Kirschner, Zorana Ratko...
This paper describes the development of a structured document collection containing user-generated text and numerical metadata for exploring the exploitation of metadata in inform...
Walid Magdy, Jinming Min, Johannes Leveling, Garet...
This article describes an age-annotated database of German telephone speech. All in all 47 hours of prompted and free text was recorded, uttered by 954 paid participants in a styl...
Felix Burkhardt, Martin Eckert, Wiebke Johannsen, ...
Analysts in various domains, especially intelligence and financial, have to constantly extract useful knowledge from large amounts of unstructured or semi-structured data. Keyword...
Mithun Balakrishna, Dan I. Moldovan, Marta Tatu, M...
This paper describes a system for linking the thesaurus of the Netherlands Institute for Sound and Vision to English WordNet and dbpedia. The thesaurus contains subject (concept) ...