Abstract. This paper documents the acoustic person tracking system developed by TUT, whose performance was assessed in the CLEAR 2007 evaluation. The system is designed to track a speaker's position in the meeting room domain using only audio data. In the CLEAR 2007 evaluation, the audio data consist of recordings from multiple microphone arrays; each meeting room is equipped with three to seven arrays. Speaker localization is performed by mapping pairwise cross-correlations of microphone signals into a three-dimensional likelihood field. This likelihood serves as source evidence for a particle filtering algorithm, and a point estimate of the speaker position is derived for each time frame from the resulting sequential process. Results indicate an 85 % localization success rate with an average accuracy of 15 cm.
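
The processing chain summarized above can be illustrated with a minimal sketch. It is not the evaluated implementation. It assumes PHAT-weighted circular cross-correlation, equal-length signal frames, a speed of sound of 343 m/s, a bootstrap particle filter with Gaussian random-walk motion, and a weighted-mean point estimate; none of these choices is stated in the abstract. The function names (gcc_phat, likelihood_field, particle_filter_step) and all parameters are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def gcc_phat(x, y, max_lag):
    """PHAT-weighted circular cross-correlation for lags -max_lag..max_lag.

    x and y must have equal length. The peak lag (in samples) approximates
    the arrival time at x minus the arrival time at y.
    """
    cross = np.fft.rfft(x) * np.conj(np.fft.rfft(y))
    cross /= np.abs(cross) + 1e-12  # keep phase only (PHAT weighting)
    cc = np.fft.irfft(cross, len(x))
    # Reorder so that index 0 corresponds to lag -max_lag.
    return np.concatenate((cc[-max_lag:], cc[: max_lag + 1]))


def likelihood_field(signals, mic_pos, points, fs):
    """Sum pairwise correlations at the lag each candidate point implies."""
    # Distance from every candidate point to every microphone: (P, M).
    dists = np.linalg.norm(points[:, None, :] - mic_pos[None, :, :], axis=2)
    n_mics = len(mic_pos)
    field = np.zeros(len(points))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            spacing = np.linalg.norm(mic_pos[i] - mic_pos[j])
            max_lag = int(np.ceil(spacing / SPEED_OF_SOUND * fs)) + 1
            cc = gcc_phat(signals[i], signals[j], max_lag)
            # Expected time difference of arrival, rounded to whole samples.
            lags = np.rint((dists[:, i] - dists[:, j]) / SPEED_OF_SOUND * fs)
            field += cc[lags.astype(int) + max_lag]
    return field


def particle_filter_step(particles, likelihood, rng, motion_std=0.05):
    """One predict / weight / estimate / resample cycle of a bootstrap filter."""
    moved = particles + rng.normal(0.0, motion_std, particles.shape)
    weights = np.maximum(likelihood(moved), 1e-12)  # clip negative evidence
    weights /= weights.sum()
    estimate = weights @ moved  # weighted-mean point estimate (assumption)
    keep = rng.choice(len(moved), size=len(moved), p=weights)
    return moved[keep], estimate
```

In this sketch, likelihood_field can be evaluated either on a regular grid of room coordinates or directly at the particle positions, in which case it plays the role of the likelihood callable passed to particle_filter_step. A raw PHAT correlation is sharply peaked, so a practical tracker would likely need some smoothing of the field before particles can follow it; that step is omitted here.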