Multimodal biometrics recognition from facial video with missing modalities using deep learning

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


Biometric identification using multiple modalities has attracted the attention of many researchers because it produces more robust and trustworthy results than single-modality biometrics. In this paper, we present a novel multimodal recognition system that trains a deep learning network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Using the different modalities present in the facial video clips, i.e., left ear, left profile face, frontal face, right profile face, and right ear, we train supervised denoising auto-encoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality-specific sparse classifiers to perform the multimodal recognition. Moreover, the proposed technique proved robust when some of the above modalities were missing during testing. The proposed system has three main components: detection, which consists of modality-specific detectors that automatically detect images of the different modalities present in facial video clips; feature selection, which uses a network of supervised denoising sparse auto-encoders to capture discriminative representations that are robust to illumination and pose variations; and classification, which consists of a set of modality-specific sparse representation classifiers for unimodal recognition, followed by score-level fusion of the recognition results of the available modalities. Experiments conducted on the constrained facial video dataset (WVU) and the unconstrained facial video dataset (HONDA/UCSD) resulted in Rank-1 recognition rates of 99.17% and 97.14%, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, nonplanar movement, and pose variations present in the video clips, even when modalities are missing.
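
The abstract does not specify the fusion rule, so the following is only an illustrative sketch of score-level fusion over whichever modalities are available at test time. It assumes a mean (sum-rule) fusion of min-max normalized scores, and it assumes each modality-specific classifier returns one match score per enrolled subject, with higher meaning a better match. The function names (`min_max_normalize`, `fuse_scores`, `rank1`), the modality keys, and the score values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional


def min_max_normalize(scores: List[float]) -> List[float]:
    """Rescale one modality's scores to [0, 1] so modalities are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A constant score vector carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(per_modality: Dict[str, Optional[List[float]]]) -> List[float]:
    """Average normalized scores over the modalities that were detected.

    A value of None marks a missing modality; it is skipped rather than
    treated as a zero score (assumed behavior, for illustration only).
    """
    available = [s for s in per_modality.values() if s is not None]
    if not available:
        raise ValueError("at least one modality must be available")
    n_subjects = len(available[0])
    if any(len(s) != n_subjects for s in available):
        raise ValueError("all modalities must score the same set of subjects")
    normalized = [min_max_normalize(s) for s in available]
    # zip(*normalized) groups the scores of each subject across modalities.
    return [sum(col) / len(normalized) for col in zip(*normalized)]


def rank1(fused: List[float]) -> int:
    """Index of the subject with the highest fused score."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores for three enrolled subjects; three modalities missing.
scores = {
    "left_ear": [0.2, 0.9, 0.1],
    "left_profile": None,
    "frontal_face": [0.3, 0.8, 0.4],
    "right_profile": None,
    "right_ear": None,
}
fused = fuse_scores(scores)
predicted = rank1(fused)
```

Averaging only over the available modalities keeps the fused score on the same scale regardless of how many modalities were detected, which is one simple way to tolerate missing inputs; weighted or learned fusion rules are equally plausible alternatives.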

Original language: English (US)
Pages (from-to): 6-29
Number of pages: 24
Journal: Journal of Information Processing Systems
Issue number: 1
State: Published - Jan 1 2020


Keywords

  • Auto-encoder
  • Deep learning
  • Multimodal biometrics
  • Sparse classification

ASJC Scopus subject areas

  • Software
  • Information Systems

