A Study of Parentheticals in Discourse Corpora - Implications for NLG Systems



This paper presents a corpus study of parenthetical constructions in two corpora: the Penn Discourse Treebank (PDTB; PDTB Group, 2008) and the RST Discourse Treebank (Carlson et al., 2001). The motivation for the study is to better understand the rhetorical properties of parentheticals so that a natural language generation system can produce them as part of rhetorically well-formed output. We argue that there is a correlation between the syntactic and rhetorical types of parentheticals and establish two main categories: ELABORATION/EXPANSION-type NP-modifier parentheticals and NON-ELABORATION/EXPANSION-type VP- or S-modifier parentheticals. We show several strategies for extracting these from the two corpora and discuss how the seemingly contradictory results can be reconciled in light of the rhetorical and syntactic properties of parentheticals and the decisions taken in the annotation guidelines.

Eva Banik, Alan Lee


Discourse Treebank | Education | LREC 2008 | Penn Discourse Treebank | RST Discourse Treebank |


Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LREC
Authors	Eva Banik, Alan Lee

