This paper presents a solution to the problem of unsupervised classification of dynamic obstacles in urban environments. A track-based model is introduced for integrating 2D laser and vision information; it provides a robust spatio-temporal synthesis of the sensed moving obstacles and forms the basis for algorithms that perform unsupervised classification by clustering. The work makes several contributions toward accurate and efficient performance, first using laser tracks alone for classification and then incorporating visual tracks into the model. A procedure is proposed for accurate unsupervised classification of dynamic obstacles using a laser stamp representation of the tracks. Laser data is then integrated with visual information through a single-instance visual stamp representation, which is finally extended with a multiple-instance framework to deal robustly with the challenges of perception in real-world scenarios. The proposed algorithm...