In this paper, we present a system for the simultaneous tracking of multiple persons in a smart room using multiple cameras. Robust person tracks are created, continuously adapted, and deleted by fusing cues from foreground segmentation maps and various appearance-based object detectors. Tracking is performed using color histograms that are automatically filtered and adapted based on local image characteristics. Tracks from the individual 2D views are merged into 3D position estimates by a fusion algorithm based on triangulation error reduction. The approach allows us to robustly track moving, standing, or sitting persons in cluttered environments and to recover lost tracks at any point in the room. We also introduce a new set of metrics for measuring multiple object tracking performance. Our system achieves high tracking accuracy, with average position errors of less than 17 cm.
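As a minimal sketch of the kind of triangulation-error criterion such a 2D-to-3D fusion can minimize (the notation here is illustrative and not necessarily the system's exact formulation): given camera centers $c_i$ and unit viewing rays $\hat{d}_i$ obtained by back-projecting the 2D track positions, a fused 3D estimate $\hat{X}$ can be taken as the point minimizing the summed squared distances to all rays,
\[
\hat{X} = \arg\min_{X} \sum_i \left\| \left( I - \hat{d}_i \hat{d}_i^{\top} \right) \left( X - c_i \right) \right\|^2 ,
\]
which admits the closed-form least-squares solution
\[
\hat{X} = \left( \sum_i \left( I - \hat{d}_i \hat{d}_i^{\top} \right) \right)^{-1} \sum_i \left( I - \hat{d}_i \hat{d}_i^{\top} \right) c_i .
\]
The residual of this fit provides a per-combination triangulation error that can be used to accept, reweight, or reject associations between 2D tracks from different views.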