This paper addresses groupwise shape registration of raw edge sequences. Automatically extracted edge maps are treated as noisy observations of a deformable object's shape and are registered to one another; the results can be used to build statistical shape models without laborious manual labeling. Working with raw edges poses several challenges. To address them, we propose a novel spatio-temporal generative model that couples shape registration with trajectory tracking. The mean shape, consistent correspondences across the edge sequence, and the associated non-rigid transformations are jointly inferred within an EM framework. The algorithm is tested on real video sequences of a dancing ballerina, a talking face, and a walking person. The results are promising and indicate that the method is robust. Potential applications include statistical shape analysis, action recognition, and object tracking.
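To make the joint inference concrete, the following is a minimal sketch of EM-style groupwise registration, not the model proposed here. It assumes a much simpler setting: each frame is a 2D point set, the per-frame transformation is a pure translation rather than a non-rigid warp, there is no temporal trajectory model, and outliers are absorbed by a uniform component with weight `w`. The function name `em_groupwise_register` and all parameters are hypothetical.

```python
import numpy as np

def em_groupwise_register(frames, n_iter=50, sigma2=4.0, w=0.1):
    """Jointly estimate a mean shape, soft correspondences, and per-frame
    translations from a list of (N_t, 2) point arrays (illustrative only)."""
    mean = frames[0].astype(float).copy()   # assumption: initialise from the first frame
    K = mean.shape[0]
    shifts = np.zeros((len(frames), 2))     # one translation per frame
    for _ in range(n_iter):
        # E-step: soft correspondences between observed points and mean-shape points.
        resp = []
        for X, t in zip(frames, shifts):
            d2 = ((X[:, None, :] - (mean + t)[None, :, :]) ** 2).sum(-1)  # (N, K)
            g = np.exp(-d2 / (2.0 * sigma2))
            # Uniform outlier term (assumed form), scaled with the current variance.
            c = 2.0 * np.pi * sigma2 * w / (1.0 - w) * K / X.shape[0]
            resp.append(g / (g.sum(axis=1, keepdims=True) + c))
        # M-step (a): update each frame's translation given the correspondences.
        for i, (X, R) in enumerate(zip(frames, resp)):
            diff = X[:, None, :] - mean[None, :, :]                      # (N, K, 2)
            shifts[i] = (R[:, :, None] * diff).sum((0, 1)) / max(R.sum(), 1e-12)
        # M-step (b): update the mean shape from all back-transformed frames.
        num = np.zeros_like(mean)
        den = np.zeros(K)
        for X, R, t in zip(frames, resp, shifts):
            num += R.T @ (X - t)
            den += R.sum(axis=0)
        mean = num / np.maximum(den, 1e-12)[:, None]
        # M-step (c): update the isotropic variance, with a small floor.
        sq, tot = 0.0, 0.0
        for X, R, t in zip(frames, resp, shifts):
            d2 = ((X[:, None, :] - (mean + t)[None, :, :]) ** 2).sum(-1)
            sq += (R * d2).sum()
            tot += R.sum()
        sigma2 = max(sq / (2.0 * max(tot, 1e-12)), 1e-4)
    return mean, shifts, resp
```

Because the mean shape and the translations share a common offset, only the translations relative to a reference frame (for example `shifts - shifts[0]`) are meaningful in this sketch.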