Humans' ability to understand and process the emotional content of speech remains unmatched by simulated intelligent agents. Beyond the linguistic content of speech lie the underlying prosodic features that humans understand naturally. The goal of emotional speech processing systems is to extract these so-called paralinguistic elements from human speech and classify them. Presented here is a proof-of-concept system designed to analyze speech in real time for coupled interactions with spiking neural models. Built on proven feature extraction algorithms, the resulting system provides two options for interfacing with running simulations on the NeoCortical Simulator. Basic tests using new recordings, as well as a subset of a published emotional speech database, were completed with promising results.
Corey M. Thibeault, Oscar Sessions, Philip H. Goodman