This paper describes past, ongoing and planned work on the collection and transcription of spoken language samples for all the South African official languages and as part of this...
Text clustering is potentially very useful for exploration of text sets that are too large to study manually. The success of such a tool depends on whether the results can be expl...
Folksonomies are unsystematic, unsophisticated collections of keywords that social bookmarking users associate with web content and, despite their inconsistency problems (typographic...
In this paper we describe a proof-of-concept for the bootstrapping of a Persian WordNet. This effort was motivated by previous work done at Stanford University on bootstrapping an...
The development of technologies to address machine translation and distillation of multilingual broadcast data depends heavily on the collection of large volumes of material from ...
This paper describes a Web service for accessing WordNet-type semantic lexicons. The central idea behind the service design is: given a query, the primary functionality of lexicon...
Savas Ali Bora, Yoshihiko Hayashi, Monica Monachin...
We investigate a number of approaches to generating Stanford Dependencies, a widely used semantically-oriented dependency representation. We examine algorithms specifically design...
Daniel Cer, Marie-Catherine de Marneffe, Daniel Ju...
One of the methods that has been proposed for dealing with real-word errors (errors that occur when a correctly spelled word is substituted for the one intended) is the "conf...
This paper proposes to introduce a novel reordering model in the open-source Moses toolkit. The main idea is to provide weighted reordering hypotheses to the SMT decoder. These hy...
This paper presents a corpus of annotated motion events and their event structure. We consider motion events triggered by a set of motion evoking words and contemplate both litera...
Kirk Roberts, Srikanth Gullapalli, Cosmin Adrian B...