The JPEG 2000 image compression standard offers high compression efficiency as well as great flexibility in how content is accessed in terms of spatial location, quality level, and resolution. This paper explores how transmission systems conveying video surveillance sequences can benefit from this flexibility. Rather than transmitting each frame independently, as is generally done in the literature for JPEG 2000 based systems, we adopt a conditional replenishment scheme to exploit the temporal correlation of the video sequence. As a first contribution, we propose a rate-distortion optimal strategy to select the most profitable packets to transmit. As a second contribution, we provide the client with two references, the previous reconstructed frame and an estimate of the current scene background, which improves the performance of the transmission system.
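
To make the idea of selecting the most profitable packets concrete, the following is a minimal sketch of conditional replenishment under a byte budget. It is not the optimal strategy proposed in this paper: it uses a simple greedy heuristic that ranks candidates by distortion reduction per byte. The `Candidate` type, its fields, and the functions `best_reference` and `select_packets` are hypothetical names introduced for illustration, and the distortion values are assumed to be available at the server.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One refreshable unit of the frame (hypothetical structure)."""
    block: int           # index of the spatial unit, e.g. a precinct
    size_bytes: int      # cost of transmitting its packets
    d_prev: float        # distortion if the previous reconstructed frame is reused
    d_background: float  # distortion if the background estimate is reused
    d_refresh: float     # distortion if the packets are transmitted


def best_reference(c: Candidate) -> str:
    """Pick the reference that gives the lower distortion when nothing is sent."""
    return "previous" if c.d_prev <= c.d_background else "background"


def select_packets(candidates, budget_bytes):
    """Greedy selection: largest distortion reduction per byte first.

    Returns the list of chosen block indices and the bytes used.
    """
    scored = []
    for c in candidates:
        # Gain is measured against the better of the two references.
        gain = min(c.d_prev, c.d_background) - c.d_refresh
        if gain > 0 and c.size_bytes > 0:
            scored.append((gain / c.size_bytes, c))

    # Most profitable candidates (highest gain per byte) come first.
    scored.sort(key=lambda item: item[0], reverse=True)

    chosen, used = [], 0
    for _, c in scored:
        if used + c.size_bytes <= budget_bytes:
            chosen.append(c.block)
            used += c.size_bytes
    return chosen, used
```

Blocks that are not selected are reconstructed at the client from whichever reference `best_reference` indicates. A greedy ranking like this is only an approximation of a rate-distortion optimal allocation; it is shown here because it is short and easy to follow.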