Part-of-speech or morphological tags are important means of annotation in a vast number of corpora. However, different sets of tags are used in different corpora, even for the sam...
This paper presents AnCora, a multilingual corpus annotated at different linguistic levels consisting of 500,000 words in Catalan (AnCora-Ca) and in Spanish (AnCora-Es). At presen...
The computational linguistics community in The Netherlands and Belgium has long recognized the dire need for a major reference corpus of written Dutch. In part to answer this need...
Nelleke Oostdijk, Martin Reynaert, Paola Monachesi...
We present a new coding mechanism, spatiotemporal coding, that allows coders to annotate points and regions in the video frame by drawing directly on the screen. Coders can not on...
Combinatorial Category Grammar is (CCG) a lexicalized grammar formalism which is expressed by syntactic category, a logical form representation. There are difficulties in represen...
The Semantic Web of the future will be characterized by using a very large number of ontologies embedded in ontology networks. It is important to provide strong methodological sup...
The orthographical complexities of Chinese, Japanese, Korean (CJK) and Arabic pose a special challenge to developers of NLP applications. These difficulties are exacerbated by the...
We implement several different methods for generating jokes in English. The common theme is to intentionally produce poor utterances by breaking Grice's maxims of conversatio...
We address the question of which syntactic representation is best suited for role-semantic analysis of English in the FrameNet paradigm. We compare systems based on dependencies a...