Vision is one of the most powerful sensory modalities in robotics, allowing operation in dynamic environments. One of our long-term research interests is mobile manipulation, where accurate localization of the target object is commonly required during task execution. Recently, a number of approaches have been proposed for real-time 3D tracking, and most of them utilize an edge (wireframe) model of the target. However, the use of an edge model causes significant problems in complex scenes due to occlusions and multiple edge responses, especially during initialization. In this paper, we propose a new tracking method based on the integration of model-based cues with automatically generated model-free cues, in order to improve tracking accuracy and to avoid the weaknesses of edge-based tracking. The integration is performed in a Kalman filter framework that operates in real time. Experimental evaluation shows that the inclusion of model-free cues offers superior performance compared to tracking with edge-based cues alone.
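To make the idea of cue integration concrete, the sketch below fuses two position measurements, one standing in for a model-based (edge) cue and one for a model-free cue, through successive updates of a standard Kalman filter. The 2D constant-velocity state, the noise covariances, and the class name CueFusionKF are illustrative assumptions, not the state or measurement models actually used in the paper.

```python
# Minimal sketch of fusing two measurement cues in a Kalman filter.
# The 2D constant-velocity state and all noise levels are assumptions
# for illustration, not the paper's actual formulation.
import numpy as np

class CueFusionKF:
    def __init__(self, dt=1.0 / 30.0):
        # State: [x, y, vx, vy] under a constant-velocity motion model.
        self.x = np.zeros(4)
        self.P = np.eye(4) * 1e2                      # state covariance
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = dt              # position += velocity * dt
        self.Q = np.eye(4) * 1e-2                     # process noise (assumed)
        self.H = np.array([[1.0, 0.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0, 0.0]])     # both cues observe position

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q

    def update(self, z, R):
        # Standard Kalman update for one measurement cue with noise covariance R.
        y = z - self.H @ self.x                       # innovation
        S = self.H @ self.P @ self.H.T + R
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

kf = CueFusionKF()
kf.predict()
# Each cue is applied as its own update; the edge-model cue is assumed
# less noisy than the model-free cue (e.g. tracked interest points).
kf.update(np.array([10.2, 5.1]), R=np.eye(2) * 0.5)   # model-based (edge) cue
kf.update(np.array([10.6, 4.8]), R=np.eye(2) * 2.0)   # model-free cue
print(kf.x[:2])  # fused position estimate
```

Because each cue enters through its own measurement update weighted by its covariance, a cue that becomes unreliable (e.g. an occluded edge) can simply be assigned a larger R or skipped for that frame without changing the rest of the filter.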