Multimodal information fusion using the iterative decoding algorithm and its application to audio-visual speech recognition