In this paper, we describe a multi-modal ear and face biometric system. The system comprises two components: a 3D ear recognition component and a 2D face recognition component. For 3D ear recognition, a series of frames is extracted from a video clip, and the region of interest (i.e., the ear) in each frame is independently reconstructed in 3D using shape from shading. The resulting 3D models are then registered using the iterative closest point (ICP) algorithm. Each model in the series is considered in turn as a reference model, and the similarity between the reference model and every model in the series is calculated using a similarity cost function. Cross-validation is performed to assess the relative fidelity of each 3D model. The model that demonstrates the greatest overall similarity is deemed the most stable 3D model and is subsequently enrolled in the database. For 2D face recognition, a set of facial landmarks is extracted from frontal facial images using the Active ...
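
The selection of the most stable 3D ear model described above can be illustrated with a minimal sketch. It assumes the reconstructed models are already registered point clouds stored as N x 3 NumPy arrays. The similarity cost function is not specified in this passage, so `mean_nn_distance` is a hypothetical stand-in (mean nearest-neighbor distance), and `most_stable_model` is an illustrative name rather than part of the described system.

```python
import numpy as np


def mean_nn_distance(a, b):
    """Hypothetical cost: mean distance from each point in a to its nearest point in b."""
    # Pairwise Euclidean distances between all points of a and all points of b.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d.min(axis=1).mean()


def most_stable_model(models, cost=mean_nn_distance):
    """Treat each model as the reference in turn and pick the one with the lowest total cost."""
    n = len(models)
    costs = np.zeros((n, n))
    for i, ref in enumerate(models):
        for j, other in enumerate(models):
            if i != j:
                costs[i, j] = cost(ref, other)
    # Lowest summed cost corresponds to greatest overall similarity to the series.
    total = costs.sum(axis=1)
    return int(np.argmin(total)), costs


# Toy data (assumed): three copies of a sparse point set, shifted along x by 0, 1, and 2 units.
base = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]])
models = [base + np.array([s, 0.0, 0.0]) for s in (0.0, 1.0, 2.0)]
best_index, cost_matrix = most_stable_model(models)
print(best_index)
```

In this toy series the middle model is closest to the other two overall, so it would be the one enrolled. A practical system would substitute its own cost function and a spatial index for the brute-force distance computation.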