IF-MCA: Importance Factor-based Multiple Correspondence Analysis for Multimedia Data Analytics

Yimin Yang, Samira Pouyanfar, Haiman Tian, Min Chen, Shu Ching Chen, Mei-Ling Shyu

Research output: Contribution to journal › Article

5 Scopus citations

Abstract

Multimedia concept detection is challenging because of the well-known class imbalance problem, in which data instances are distributed unevenly across classes. The problem becomes more pronounced when the minority class, which contains an extremely small proportion of the data, represents the concept of interest, as occurs in many real-world applications such as fraud in banking transactions and goal events in soccer videos. Traditional data mining approaches often have difficulty handling highly skewed data distributions. To address this issue, this paper proposes an Importance Factor-based Multiple Correspondence Analysis (IF-MCA) framework for imbalanced datasets. Specifically, a Hierarchical Information Gain Analysis (HIGA) method, inspired by the decision tree algorithm, is presented for critical feature selection and Importance Factor (IF) assignment. The derived IF is then incorporated into the Multiple Correspondence Analysis (MCA) algorithm for effective concept detection and retrieval. Comparison results for video concept detection on the disaster dataset and the soccer dataset demonstrate the effectiveness of the proposed framework.
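The abstract does not specify how HIGA computes or normalizes its scores, so the following is only a minimal sketch of the underlying idea: the standard decision-tree information gain of each nominal feature, turned into a per-feature weight. The function names, the sum-to-one normalization, and the toy data are all assumptions made for illustration; they are not the paper's HIGA procedure or its Importance Factor definition.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by partitioning the labels on a nominal feature."""
    total = len(labels)
    partitions = {}
    for value, label in zip(feature_values, labels):
        partitions.setdefault(value, []).append(label)
    conditional = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - conditional


def importance_weights(features, labels):
    """Hypothetical weighting (an assumption): each feature's gain over the total gain."""
    gains = {name: information_gain(vals, labels) for name, vals in features.items()}
    total_gain = sum(gains.values())
    if total_gain == 0:
        return {name: 0.0 for name in gains}
    return {name: g / total_gain for name, g in gains.items()}


# Toy imbalanced data, made up for illustration: 2 positives out of 8 instances.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
features = {
    "motion": ["high", "high", "low", "low", "low", "low", "low", "low"],
    "audio": ["loud", "quiet", "loud", "quiet", "loud", "quiet", "loud", "quiet"],
}
weights = importance_weights(features, labels)
```

In this toy example the hypothetical `motion` feature separates the two classes perfectly while `audio` carries no class information, so `motion` receives essentially all of the weight. A weight of this kind could then scale each feature's contribution in a downstream scoring step; how IF-MCA actually combines the IF with MCA is described in the paper itself, not here.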

Original language: English (US)
Journal: IEEE Transactions on Multimedia
State: Accepted/In press - Oct 5 2017

Keywords

  • Algorithm design and analysis
  • Data mining
  • Decision trees
  • Feature extraction
  • feature selection
  • Importance factor
  • information gain
  • Multimedia communication
  • multiple correspondence analysis (MCA)
  • Testing
  • Training

ASJC Scopus subject areas

  • Signal Processing
  • Media Technology
  • Computer Science Applications
  • Electrical and Electronic Engineering
