We address the blind source separation (BSS) problem for convolutive mixtures. We use second-order statistics, assuming that the source signals are non-stationary and possibly also non-white. The proposed algorithm is based on joint diagonalization: we search for a single polynomial matrix that jointly diagonalizes a set of measured spatiotemporal correlation matrices. In contrast to most other algorithms based on similar concepts, we define the underlying cost function entirely in the time domain. Furthermore, we present an efficient implementation of the proposed algorithm based on fast convolution techniques.
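As a rough illustration of the two ingredients named above, and not the authors' implementation, the following sketch multiplies two polynomial (FIR) matrices directly in the time domain and again via FFT-based fast convolution, then measures how far a polynomial matrix is from diagonal. All function names and the array layout (lag index first, then rows and columns) are assumptions made for this example, and the squared off-diagonal energy is only one plausible choice of diagonality measure.

```python
import numpy as np

def polymat_conv_direct(W, R):
    """Time-domain product of two polynomial matrices (hypothetical helper).

    W has shape (Lw, n, m) and R has shape (Lr, m, p); the first axis is the lag.
    Returns an array of shape (Lw + Lr - 1, n, p).
    """
    Lw, Lr = W.shape[0], R.shape[0]
    out = np.zeros((Lw + Lr - 1, W.shape[1], R.shape[2]))
    for i in range(Lw):
        for j in range(Lr):
            out[i + j] += W[i] @ R[j]  # matrix product accumulated at lag i + j
    return out

def polymat_conv_fft(W, R):
    """Same product computed by fast convolution (hypothetical helper)."""
    n_out = W.shape[0] + R.shape[0] - 1
    n_fft = 1 << (n_out - 1).bit_length()  # next power of two >= n_out
    Wf = np.fft.rfft(W, n=n_fft, axis=0)   # zero-padded FFT along the lag axis
    Rf = np.fft.rfft(R, n=n_fft, axis=0)
    Yf = Wf @ Rf                           # one matrix product per frequency bin
    return np.fft.irfft(Yf, n=n_fft, axis=0)[:n_out]

def off_diagonal_energy(P):
    """Sum of squared off-diagonal entries over all lags (assumed measure).

    P has shape (L, n, n).
    """
    off = P.copy()
    idx = np.arange(P.shape[1])
    off[:, idx, idx] = 0.0                 # zero the diagonal at every lag
    return float(np.sum(off ** 2))
```

Zero-padding to at least `Lw + Lr - 1` points makes the circular convolution implied by the FFT coincide with the linear convolution, so both routines return the same result up to rounding error. The FFT route replaces the double loop over lags with one small matrix product per frequency bin.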