Continuous speech input for ASR processing is usually presegmented into speech stretches by pauses. In this paper, we propose that smaller, prosodically defined units can be ident...
Yi-Fen Liu, Shu-Chuan Tseng, Jyh-Shing Roger Jang,...
In this paper, we adopt a direct modeling approach to utilize conversational gesture cues in detecting sentence boundaries, called SUs, in videotaped conversations. We treat the ...