In this paper, we show that the neuroimaging community's standard approach to fMRI time series alignment should be revisited to overcome the bias that activations induce in motion estimates. We propose a two-stage alignment. The first motion estimation is used to infer a mask of activated areas. The second motion estimation excludes these areas when computing the similarity measure. Experiments on simulated and actual time series show that this dedicated approach is more effective than standard robust similarity measures.
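
The two-stage idea can be illustrated with a minimal sketch on a 1D toy signal. Everything below is an assumption made for illustration and is not the method's actual implementation: the function names (`estimate_shift`, `two_stage_alignment`), the restriction to integer circular shifts, the least-squares similarity measure, the known task/rest paradigm, and the crude task-minus-rest threshold that stands in for a proper activation detection.

```python
import numpy as np

def estimate_shift(ref, frame, keep=None, max_shift=3):
    """Integer shift d minimising the mean squared difference between `ref`
    and `frame` rolled back by d, computed only on voxels where `keep` is True."""
    if keep is None:
        keep = np.ones(ref.shape, dtype=bool)
    candidates = range(-max_shift, max_shift + 1)
    costs = [np.mean((np.roll(frame, -d) - ref)[keep] ** 2) for d in candidates]
    return candidates[int(np.argmin(costs))]

def two_stage_alignment(ref, frames, task, max_shift=3):
    """Hypothetical two-stage alignment of `frames` onto `ref`.
    `task` is a boolean array marking frames acquired during stimulation."""
    # Stage 1: motion estimation using every voxel.
    shifts1 = [estimate_shift(ref, f, None, max_shift) for f in frames]
    realigned = np.stack([np.roll(f, -d) for f, d in zip(frames, shifts1)])
    # Assumed stand-in for activation detection: threshold the absolute
    # difference between the mean task frame and the mean rest frame.
    effect = np.abs(realigned[task].mean(axis=0) - realigned[~task].mean(axis=0))
    activated = effect > 0.5 * effect.max()
    # Stage 2: motion estimation that discards the activated voxels.
    shifts2 = [estimate_shift(ref, f, ~activated, max_shift) for f in frames]
    return shifts1, activated, shifts2

# Toy data: a random reference, an activated region that moves with the
# object, and a known circular shift applied to each frame.
rng = np.random.default_rng(0)
ref = rng.normal(size=64)
activation = np.zeros(64)
activation[20:26] = 0.5
task = np.array([False, True, False, True, False, True])
true_shifts = [0, 2, -1, 1, 0, -2]
frames = [np.roll(ref + activation * on, s) for on, s in zip(task, true_shifts)]

shifts1, activated, shifts2 = two_stage_alignment(ref, frames, task)
```

With integer shifts and a noise-free signal, this toy setting does not reproduce the activation-induced bias itself; it only shows how the mask inferred after the first estimation is fed into the second one.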