A major issue in evaluating speech enhancement and hearing compensation algorithms is to come up with a suitable metric that predicts intelligibility as judged by a human listener. Previous methods such as the widely used Speech Transmission Index (STI) fail to account for masking effects that arise from the highly nonlinear cochlear transfer function. We therefore propose a Neural Articulation Index (NAI) that estimates speech intelligibility from the instantaneous neural spike rate over time, produced when a signal is processed by an auditory neural model. By using a well developed model of the auditory periphery and detection theory we show that human perceptual discrimination closely matches the modeled distortion in the instantaneous spike rates of the auditory nerve. In highly rippled frequency transfer conditions the NAI’s prediction error is 8% versus the STI’s prediction error of 10.8%.
Jeff Bondy, Ian C. Bruce, Suzanna Becker, Simon Ha