This article presents a modular architecture for multicamera tracking in the context of sports broadcasting. For each video stream, a geometric module continuously estimates the image-to-model homography. A local-feature-based tracking module tracks the players in each view. A supervisor module collects, associates, and fuses the data provided by the tracking modules. The originality of the proposed system is threefold: first, it localizes the targets on the ground even with rotating and zooming cameras; second, it does not rely on background modeling techniques; and third, the local tracking can cope with severe occlusions. We present experimental results on raw TV-camera footage of a soccer game.
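
To illustrate the role of the image-to-model homography, the following minimal sketch shows how a per-frame 3x3 homography could map tracked image points (for example, a player's foot position in pixels) to ground-model coordinates. The function name `image_to_ground`, the matrix `H_frame`, and the sample points are hypothetical and chosen only for illustration; they are not taken from the system described here, which estimates the homography continuously for each stream.

```python
import numpy as np


def image_to_ground(H, pixels):
    """Map (N, 2) pixel coordinates to ground-model coordinates.

    H is a 3x3 image-to-model homography (assumed already estimated
    for the current frame). Points that map to infinity are not handled.
    """
    pixels = np.asarray(pixels, dtype=float).reshape(-1, 2)
    # Lift to homogeneous coordinates: (u, v) -> (u, v, 1).
    homogeneous = np.hstack([pixels, np.ones((pixels.shape[0], 1))])
    # Apply the homography to every point at once.
    mapped = homogeneous @ np.asarray(H, dtype=float).T
    # Divide by the third coordinate to return to the 2D model plane.
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical homography for one frame of one camera (illustrative values).
H_frame = np.array([[2.0, 0.0, 1.0],
                    [0.0, 2.0, 3.0],
                    [0.0, 0.0, 1.0]])

# Hypothetical foot positions of two tracked players, in pixels.
feet = [[1.0, 1.0], [0.0, 0.0]]
ground = image_to_ground(H_frame, feet)
```

Because the homography is re-estimated for every frame, the same mapping remains valid while the camera rotates and zooms; positions expressed in the common ground model are what a supervisor could then associate and fuse across views.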