In this work, we derive a new cepstrum-based maximum-likelihood fundamental frequency estimator that exploits the information from multiple microphones. The new approach reduces to a maximum search over the sum of the microphone cepstra. We compare it with a maximum search over the cepstrum of the output signal of a delay-and-sum beamformer and show that it outperforms the beamforming approach for all considered input signal-to-noise ratios. We then develop a general framework that includes the cepstral harmonics of the fundamental frequency, and extend the approach to a maximum a posteriori fundamental period tracker, which further improves the results and increases robustness in noisy environments.
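
To make the core idea concrete, the maximum search over the sum of the microphone cepstra can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation evaluated in this work: the function names `real_cepstrum` and `estimate_f0_summed_cepstra`, the Hann window, the FFT length, and the search range `f_min`/`f_max` are all hypothetical choices made for the example.

```python
import numpy as np


def real_cepstrum(frame, n_fft):
    """Real cepstrum of one frame: inverse FFT of the log-magnitude spectrum."""
    # Assumption: a Hann window is applied before the FFT.
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)), n_fft)
    # A small constant avoids log(0) for empty bins.
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    return np.fft.irfft(log_mag, n_fft)


def estimate_f0_summed_cepstra(frames, fs, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency from one frame per microphone.

    frames: array-like of shape (n_microphones, frame_length)
    fs:     sampling rate in Hz
    f_min, f_max: assumed search range in Hz (illustrative defaults)
    """
    frames = np.asarray(frames, dtype=float)
    n_fft = frames.shape[1]

    # Sum the per-microphone cepstra. No time alignment is needed here,
    # because the magnitude spectrum discards a pure delay.
    summed = np.zeros(n_fft)
    for channel in frames:
        summed += real_cepstrum(channel, n_fft)

    # Restrict the maximum search to quefrencies (in samples) that
    # correspond to the assumed fundamental frequency range.
    q_lo = int(np.floor(fs / f_max))
    q_hi = int(np.ceil(fs / f_min))
    peak = q_lo + int(np.argmax(summed[q_lo:q_hi + 1]))

    # The peak quefrency is the fundamental period in samples.
    return fs / peak
```

In this sketch, the beamforming baseline would instead time-align and average the microphone signals first and then run the same maximum search on a single cepstrum.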