
LREC
2008

The ATCOSIM Corpus of Non-Prompted Clean Air Traffic Control Speech

Air traffic control (ATC) is based on voice communication between pilots and controllers and uses a highly task- and domain-specific language. For this reason, spoken language technologies for ATC require domain-specific corpora, of which only a few exist to this day. The ATCOSIM Air Traffic Control Simulation Speech corpus is a speech database of non-prompted and clean ATC operator speech. It consists of ten hours of speech data, recorded in typical ATC control room conditions during ATC real-time simulations. The database includes orthographic transcriptions and additional information on speakers and recording sessions. The ATCOSIM corpus is publicly available and provided online free of charge. In this paper, we first give an overview of ATC-related corpora and their shortcomings. We then show the difficulties in obtaining operational ATC speech recordings and propose the use of existing ATC real-time simulations. We describe the recording, transcription, production...
Konrad Hofbauer, Stefan Petrik, Horst Hering
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where LREC
Authors Konrad Hofbauer, Stefan Petrik, Horst Hering