Quick Rich Transcriptions of Arabic Broadcast News Speech Data

This paper describes the collection and transcription of a large set of Arabic broadcast news speech data. In total, more than 2000 hours of data were transcribed. The transcription factor for the broadcast news data was reduced by using methods such as Quick Rich Transcription (QRTR) and by reducing the number of quality controls performed on the data. The data was collected from several Arabic TV and radio sources and covers both Modern Standard Arabic and dialectal Arabic. The orthographic transcriptions include segmentation, speaker turns, topics, sentence unit types and minimal noise mark-up. The transcripts were produced as part of the GALE project.

Chomicha Bendahman, Meghan Lammie Glenn, Djamel Mo

Education | LREC 2008 | Modern Standard Arabic | Quick Rich Transcription | Speech Data |

» First Broadcast News Transcription System for Khmer Language

» Web-assisted annotation, semantic indexing and search of television and radio news

» Complementary Video and Audio Analysis for Broadcast News Archives

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LREC
Authors	Chomicha Bendahman, Meghan Lammie Glenn, Djamel Mostefa, Niklas Paulsson, Stephanie Strassel
