In this paper, we study the feasibility of SIFT features for object recognition and tracking within the framework of the IVSEE system design. The IVSEE system is intended to imitate the early functionalities of the human visual system in enclosed environments. Its goal is to perform basic object recognition and to determine object states and spatial interrelations, all in the service of purposive system behavior (e.g., object tracking). To implement this system, we turn to well-known, state-of-the-art techniques from the literature and choose SIFT features for the object extraction and recognition stages. We present experimental work carried out to determine the adequacy of these features for the system goals. The results confirm that SIFT features are a good implementation choice.