In this paper, we propose a multi-microphone method for jointly optimal estimation of the direction of arrival (DOA) and the source speech signal, based on a newly introduced expectation-maximization (EM) beamforming scheme. The method produces a posterior PDF for the DOA that is based only on the reliable part of the speech spectrum. Maximizing this posterior PDF over the DOA yields the maximum a posteriori (MAP) DOA estimate. After convergence, the source spectrum, estimated as a weighted sum in the Bayesian sense, is a maximum likelihood estimate (MLE). This estimate is a sufficient statistic for minimum mean square error (MMSE) optimal estimation by a subsequent single-channel MMSE filter.
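
The relationship between the DOA posterior, the MAP estimate, and the weighted-sum source estimate can be illustrated with a minimal single-pass sketch. It is not the EM beamformer proposed here: it assumes a uniform linear array with half-wavelength spacing, a single frequency bin, a flat prior over a grid of candidate DOAs, white Gaussian noise of known variance, and a source amplitude profiled out by its per-candidate least-squares estimate. It also assumes that the weighted sum runs over per-candidate beamformer outputs. The names `steering_vector`, `doa_posterior_and_source`, and `noise_var` are hypothetical.

```python
import cmath
import math


def steering_vector(theta_deg, n_mics, spacing_over_wavelength=0.5):
    """Far-field steering vector of a uniform linear array (assumed geometry)."""
    phase = 2.0 * math.pi * spacing_over_wavelength * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * m) for m in range(n_mics)]


def doa_posterior_and_source(frames, candidates, noise_var=0.1):
    """Grid posterior over DOA, MAP DOA, and posterior-weighted source estimate.

    frames     -- list of microphone observation vectors for one frequency bin
    candidates -- list of candidate DOAs in degrees (flat prior assumed)
    noise_var  -- assumed known white-noise variance (hypothetical parameter)
    """
    n_mics = len(frames[0])
    log_post = []
    beams = []  # beams[k][t]: beamformer output for candidate k at frame t
    for theta in candidates:
        a = steering_vector(theta, n_mics)
        log_lik = 0.0
        outputs = []
        for y in frames:
            # Least-squares source estimate for this candidate: a^H y / (a^H a)
            s = sum(ai.conjugate() * yi for ai, yi in zip(a, y)) / n_mics
            # Residual energy left unexplained by this candidate direction
            resid = sum(abs(yi - ai * s) ** 2 for ai, yi in zip(a, y))
            log_lik -= resid / noise_var
            outputs.append(s)
        log_post.append(log_lik)
        beams.append(outputs)

    # Normalize in a numerically stable way (subtract the peak before exp)
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    post = [w / total for w in weights]

    # MAP DOA estimate: the candidate with the largest posterior mass
    map_theta = candidates[post.index(max(post))]

    # Source estimate: posterior-weighted sum of the per-candidate outputs
    source = [
        sum(post[k] * beams[k][t] for k in range(len(candidates)))
        for t in range(len(frames))
    ]
    return post, map_theta, source
```

Calling `doa_posterior_and_source(frames, list(range(-80, 81, 10)))` returns the normalized posterior over the grid, the grid point that maximizes it, and one source estimate per frame. The grid stops short of ±90° because, at half-wavelength spacing, those two directions have identical steering vectors.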