The Nijmegen Corpus of Casual French

This article describes the preparation, recording and orthographic transcription of a new speech corpus, the Nijmegen Corpus of Casual French (NCCFr). The corpus contains a total of over 36 hours of recordings of 46 French speakers engaged in conversations with friends. Casual speech was elicited during three different parts, which together provided around ninety minutes of speech from every pair of speakers. While Parts 1 and 2 did not require participants to perform any specific task, in Part 3 participants negotiated a common answer to general questions about society. Comparisons with the ESTER corpus of journalistic speech show that the two corpora contain speech of considerably different registers. A number of indicators of casualness, including swear words, casual words, verlan, disfluencies and word repetitions, are more frequent in the NCCFr than in the ESTER corpus, while the use of double negation, an indicator of formal speech, is less frequent. In general, these estimates ...

Casual Speech | ESTER Corpus | Nijmegen Corpus | Security Privacy | SPEECH 2010 |

Added	21 May 2011
Updated	21 May 2011
Type	Journal
Year	2010
Where	SPEECH
Authors	Francisco Torreira, Martine Adda-Decker, Mirjam Ernestus
