This paper presents an efficient and scalable coding scheme for transmitting a stream of 3D models extracted from a video. As in classical model-based video coding, the geometry, connectivity, and texture of the 3D models have to be transmitted, as well as the camera position for each frame in the original video. The proposed method exploits the interrelations among these types of information instead of coding them independently, which allows a better prediction of the next 3D model in the stream. Scalability is achieved through the use of wavelet-based representations for both the texture and the geometry of the models. A consistent connectivity is built for all 3D models extracted from the video sequence, which allows a more compact representation and straightforward geometric morphing between successive models. Furthermore, this leads to a consistent wavelet decomposition for the 3D models in the stream. Quantitative and qualitative results for the proposed scheme are compare...