This paper presents the EPAC corpus which is composed by a set of 100 hours of conversational speech manually transcribed and by the outputs of automatic tools (automatic segmenta...
This paper describes a novel Bayesian approach to unsupervised topic segmentation. Unsupervised systems for this task are driven by lexical cohesion: the tendency of wellformed se...
The PASCAL Speech Separation Challenge (SSC) is based on a corpus of sentences from the Wall Street Journal task read by two speakers simultaneously and captured with two circular ...
John W. McDonough, Ken'ichi Kumatani, Tobias Gehri...
The second workshop on Searching Spontaneous Conversational Speech (SSCS 2008) was held in Singapore on July 24, 2008 in conjunction with the 31st Annual International ACM SIGIR C...
Language models used in current automatic speech recognition systems are trained on general-purpose corpora and are therefore not relevant to transcribe spoken documents dealing w...