This paper proposes a unified approach for initializing, detecting, and tracking multiple moving objects. Object initialization is achieved through a novel seed selection procedure that is activated adaptively, depending on the quality of tracking, to select the best possible frames along the temporal direction for object detection. The expectation-maximization (EM) algorithm is then employed to robustly segment and detect multiple objects in a selected frame. Each detected object is represented by an appearance-based model, and a mean shift tracking procedure is adopted to track the target objects rapidly and effectively.
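
As a rough illustration of the mean shift step mentioned above, the following minimal sketch moves a rectangular window toward the local mode of a per-pixel weight map. It makes several assumptions that are not taken from the paper: the weight map (here a plain list of rows) is presumed to come from some appearance model, the kernel is uniform over the window, and the names `mean_shift`, `weights`, `center`, and `half_size` are hypothetical. The kernel and appearance model actually used may differ.

```python
def mean_shift(weights, center, half_size, max_iter=20, eps=0.5):
    """Move a rectangular window toward the local mode of a 2-D weight map.

    weights   -- list of rows; weights[y][x] >= 0 says how well pixel (x, y)
                 matches the target appearance model (assumed input)
    center    -- (x, y) starting window center
    half_size -- (half_width, half_height) of the window, in pixels
    """
    height, width = len(weights), len(weights[0])
    cx, cy = center
    hw, hh = half_size
    for _ in range(max_iter):
        # Clip the current window to the image bounds.
        x0 = max(0, int(round(cx)) - hw)
        x1 = min(width - 1, int(round(cx)) + hw)
        y0 = max(0, int(round(cy)) - hh)
        y1 = min(height - 1, int(round(cy)) + hh)

        # Weighted centroid of the pixels inside the window (uniform kernel).
        total = sx = sy = 0.0
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                w = weights[y][x]
                total += w
                sx += w * x
                sy += w * y
        if total == 0.0:
            break  # no support inside the window; keep the last estimate

        nx, ny = sx / total, sy / total
        shift = ((nx - cx) ** 2 + (ny - cy) ** 2) ** 0.5
        cx, cy = nx, ny
        if shift < eps:
            break  # converged: the window barely moved
    return cx, cy
```

In a tracker built along these lines, the converged center from one frame would typically serve as the starting center for the next frame, which is what keeps the per-frame search inexpensive.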