Constructing the CODA Corpus: A Parallel Corpus of Monologues and Expository Dialogues

We describe the construction of the CODA corpus, a parallel corpus of monologues and expository dialogues. The dialogue part of the corpus consists of expository, i.e., information-delivering rather than dramatic, dialogues written by several acclaimed authors. The monologue part is a paraphrase of these dialogues in monologue form, written by a human annotator. The corpus was constructed as a resource for extracting rules for automated generation of dialogue from monologue. Using authored dialogues allows us to analyse the techniques used by accomplished writers for presenting information in the form of dialogue. The dialogues are annotated with dialogue acts and the monologues with rhetorical structure. We developed annotation and translation guidelines, together with a custom tool for carrying out translation, alignment and annotation.

Svetlana Stoyanchev, Paul Piwek

Dialogue | Education | Expository Dialogues | LREC 2010 | Monologues

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2010
Where	LREC
Authors	Svetlana Stoyanchev, Paul Piwek
