We investigate the problem of learning a part-of-speech (POS) lexicon for a resource-poor language, dialectal Arabic. Developing a high-quality lexicon is often the first step tow...
Web extraction systems attempt to use the immense amount of unlabeled text in the Web in order to create large lists of entities and relations. Unlike traditional IE methods, the ...
We propose a conditional random fieldbased method for supertagging, and apply it to the task of learning new lexical items for HPSG-based precision grammars of English and Japanes...
This paper presents a method of automatically constructing information extraction patterns on predicate-argument structures (PASs) obtained by full parsing from a smaller training...
In this paper we describe and evaluate several statistical models for the task of realization ranking, i.e. the problem of discriminating between competing surface realizations ge...
We present an investigation of recently proposed character and word sequence kernels for the task of authorship attribution based on relatively short texts. Performance is compare...
In this paper we study the utility of discourse structure for spoken dialogue performance modeling. We experiment with various ways of exploiting the discourse structure: in isola...