In this paper we describe a multi-modal ear and face biometric system. The system comprises two components: a 3D ear recognition component and a 2D face recognition component. For 3D ear recognition, a series of frames is extracted from a video clip, and the region of interest (i.e., the ear) in each frame is independently reconstructed in 3D using Shape From Shading. The resulting 3D models are then registered using the Iterative Closest Point (ICP) algorithm. We iteratively treat each model in the series as a reference model and compute its similarity to every model in the series using a similarity cost function. This cross validation assesses the relative fidelity of each 3D model: the model exhibiting the greatest overall similarity is deemed the most stable 3D model and is subsequently enrolled in the database. For 2D face recognition, a set of facial landmarks is extracted from frontal facial images using the Active Shape Model. The responses of the facial images to a series of Gabor filters are then computed at the landmark locations, and these Gabor features are stored in the database as the face model for recognition. The similarity between the Gabor features of a probe facial image and those of the reference models is used to determine the best match. The match scores of the ear and face recognition modalities are fused to boost the overall recognition rate of the system. Experiments were conducted using a gallery set of 402 video clips and a probe set of 60 video clips (images). A rank-one identification rate of 100% was achieved using the weighted-sum fusion technique.
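The selection of the most stable 3D ear model described above can be sketched as follows. This is an illustrative sketch only: the `similarity` function here is a hypothetical stand-in for the paper's ICP-based similarity cost, and the toy 1D "models" are not real 3D reconstructions.

```python
# Hypothetical sketch: treat each reconstructed model as the reference in
# turn, sum its similarity to every other model in the series, and keep the
# model with the greatest overall similarity (the "most stable" model).

def select_most_stable(models, similarity):
    """Return the index of the model most similar, on aggregate, to the rest."""
    best_idx, best_total = -1, float("-inf")
    for i, ref in enumerate(models):
        total = sum(similarity(ref, m) for j, m in enumerate(models) if j != i)
        if total > best_total:
            best_idx, best_total = i, total
    return best_idx

# Toy example: "models" are 1D point lists; similarity is the negative mean
# point-wise distance, so higher values mean more similar shapes.
def sim(a, b):
    return -sum(abs(x - y) for x, y in zip(a, b)) / len(a)

models = [[0.0, 1.0], [0.1, 1.1], [5.0, 6.0]]
idx = select_most_stable(models, sim)  # the middle model is closest to both
```

The quadratic all-pairs comparison is affordable here because only the handful of models reconstructed from one clip are compared at enrollment time.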
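The weighted-sum score fusion can be sketched as below. The min-max normalization step, the 0.6/0.4 weighting, and the function names are assumptions for illustration, not the paper's reported configuration.

```python
# Illustrative sketch of weighted-sum score-level fusion. The normalization
# scheme and modality weights are hypothetical choices, not the authors'.

def min_max_normalize(scores):
    """Map raw match scores to [0, 1] so the two modalities are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def weighted_sum_fusion(ear_scores, face_scores, w_ear=0.6, w_face=0.4):
    """Fuse per-gallery-subject match scores from the ear and face modalities."""
    ear_n = min_max_normalize(ear_scores)
    face_n = min_max_normalize(face_scores)
    return [w_ear * e + w_face * f for e, f in zip(ear_n, face_n)]

# Rank-one identification: the gallery subject with the highest fused score.
ear = [0.2, 0.9, 0.4]   # similarity of the probe ear to three gallery subjects
face = [0.3, 0.8, 0.5]  # similarity of the probe face to the same subjects
fused = weighted_sum_fusion(ear, face)
best = max(range(len(fused)), key=lambda i: fused[i])
```

Normalizing before fusing matters because the two modalities produce scores on different scales; without it, one modality can dominate the sum regardless of the weights.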