Usable speech is a novel concept related to the co-channel speech problem. Co-channel speech occurs when more than one person is talking at the same time. The idea of usable speech is to identify and extract those portions of co-channel speech that are still useful for speech processing applications, such as speaker identification or speech recognition, which do not work in co-channel environments. Usable speech measures are features extracted from the co-channel signal to detect the presence of usable speech as well as co-channel (unusable) speech. Several usable speech measures are currently being developed; however, these measures detect only about 75% of the usable speech. To improve on this performance, nonlinear estimation and Bayesian classification are used to fuse the information in two recently proposed usable speech measures. Fusion resulted in a 15% increase in hits (usable speech frames detected) and a 37% decrease in false alarms.
Robert E. Yantorno, Brett Y. Smolenski