We present a study on the use of soft computing techniques for object tracking and segmentation in surveillance video clips. Conceptually, a population of artificial creatures "inhabits" our image sequences. The creatures explore the images looking for moving objects and learn the features of those objects, so that a tracked object can be distinguished from other moving objects in the scene. Their behaviour is controlled by neural networks evolved by an evolutionary algorithm, while the ability to learn is provided by a Self-Organizing Map trained during tracking. The performance of the population is evaluated on both artificial and real video sequences, and some results are discussed.
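
To make the learning component more concrete, the following is a minimal sketch of how a Self-Organizing Map might be trained online on feature vectors gathered while tracking, and then used to tell a familiar object from an unfamiliar one. It is an illustration under stated assumptions, not the implementation studied here: the class name `TinySOM`, the 3x3 grid, the learning rate, the neighbourhood width, and the 3-D colour-like features are all hypothetical choices.

```python
import math
import random


class TinySOM:
    """Minimal 2-D Self-Organizing Map over fixed-length feature vectors.

    Hypothetical illustration; grid size and parameters are assumptions.
    """

    def __init__(self, rows, cols, dim, seed=0):
        rng = random.Random(seed)
        self.rows, self.cols, self.dim = rows, cols, dim
        # One weight vector per grid node, initialised uniformly in [0, 1).
        self.weights = [[rng.random() for _ in range(dim)]
                        for _ in range(rows * cols)]

    @staticmethod
    def _dist2(w, x):
        # Squared Euclidean distance between a weight vector and a sample.
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

    def bmu(self, x):
        """Index of the best matching unit (closest weight vector) for x."""
        return min(range(len(self.weights)),
                   key=lambda i: self._dist2(self.weights[i], x))

    def update(self, x, lr=0.3, sigma=1.0):
        """One online step: pull the BMU and its grid neighbours toward x."""
        br, bc = divmod(self.bmu(x), self.cols)
        for i, w in enumerate(self.weights):
            r, c = divmod(i, self.cols)
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
            for k in range(self.dim):
                w[k] += lr * h * (x[k] - w[k])

    def quantization_error(self, x):
        """Distance from x to its BMU; a small value means x looks familiar."""
        return math.sqrt(self._dist2(self.weights[self.bmu(x)], x))


# Assumed toy features: the tracked object is reddish, a distractor is bluish.
rng = random.Random(1)
som = TinySOM(rows=3, cols=3, dim=3)
for _ in range(200):
    sample = [0.9 + rng.uniform(-0.05, 0.05),
              0.1 + rng.uniform(-0.05, 0.05),
              0.1 + rng.uniform(-0.05, 0.05)]
    som.update(sample)  # the map is trained while "tracking"

err_target = som.quantization_error([0.9, 0.1, 0.1])
err_other = som.quantization_error([0.1, 0.1, 0.9])
```

In this sketch the quantization error of the tracked object's features ends up smaller than that of the distractor, so comparing the error against a threshold is one plausible way for a creature to decide whether a moving region belongs to the object it has been following.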