In this paper, we develop an RST-style text-level discourse parser, based on the HILDA discourse parser (Hernault et al., 2010b). We significantly improve its tree-building step b...
This paper presents a two-step approach to compress spontaneous spoken utterances. In the first step, we use a sequence labeling method to determine if a word in the utterance ca...
Though polarity classification has been extensively explored at document level, there has been little work investigating feature design at sentence level. Due to the small number ...
In this paper, a new novelty detection approach based on the identification of sentence-level information patterns is proposed. First, ``novelty'' is redefined ba...
In this paper we present the prototype-based text matching methodology used in the Routing Sub-Task of the TREC 2001 Filtering Track. The methodology examines texts on word and senten...
Ari Visa, Jarmo Toivonen, Tomi Vesanen, Jarno M&au...
Recently, categorial grammar (CG) has attracted attention as a powerful grammar formalism. This paper aims to develop a framework for automatic CG tagging for Thai. We investigated two main algorithms...
The detection of new information in a document stream is an important component of many potential applications. In this work, a new novelty detection approach based on the identif...
Novelty detection is a difficult task, particularly at sentence level. Most of the approaches proposed in the past consist of re-ordering all sentences following their novelty sco...
In the European-funded project MIS, under the MLIS programme, the authors attempted a computer-driven translation package for tourism texts in five languages. It was believed such a packa...
In this paper we compare the robustness of several types of stylistic markers to help discriminate authorship at sentence level. We train an SVM-based classifier using each set of ...