Deep Learning with MCA-based Instance Selection and Bootstrapping for Imbalanced Data Classification

Sheng Guan, Min Chen, Hsin Yu Ha, Shu Ching Chen, Mei-Ling Shyu, Chengde Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

In this paper, we propose an extended deep learning approach that incorporates instance selection and bootstrapping techniques for imbalanced data classification. In supervised learning, classification performance often deteriorates when the training set is imbalanced, that is, when at least one class has substantially fewer instances than the others. We propose to use the adaptive synthetic sampling approach (ADASYN) to generate synthetic instances for the minority class. A data pruning process based on multiple correspondence analysis (MCA) is then performed to identify the subset of synthetic instances that is most suitable to supplement the existing minority instances. This results in a more balanced training dataset, which is then bootstrapped and fed into convolutional neural networks (CNNs) for classification. Furthermore, we propose to use low-level features pre-processed by principal component analysis (PCA), instead of the commonly used raw signal data, as the input to the CNNs to reduce computation time. Experimental results on 54 TRECVID concepts with different levels of imbalance demonstrate the effectiveness of our framework in comparison with other state-of-the-art methods.
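The data-preparation stages named in the abstract (PCA on low-level features, ADASYN-style oversampling, pruning of synthetic instances, bootstrapping) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, parameters, and toy data are hypothetical, the oversampler is a simplified variant of ADASYN, and `prune_synthetic` is a distance-based placeholder that stands in for the MCA-based pruning, whose details the abstract does not give. The CNN classifier itself is omitted.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (computed via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def adasyn_like(X, y, minority=1, k=5, rng=None):
    """Simplified ADASYN: minority points with more majority-class
    neighbours receive more synthetic samples."""
    rng = np.random.default_rng(rng)
    X_min = X[y == minority]
    n_needed = int((y != minority).sum() - len(X_min))
    # Share of majority points among each minority point's k nearest neighbours.
    d_all = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=2)
    nn_all = np.argsort(d_all, axis=1)[:, 1:k + 1]  # index 0 is the point itself
    ratio = (y[nn_all] != minority).mean(axis=1)
    if ratio.sum() > 0:
        weights = ratio / ratio.sum()
    else:
        weights = np.full(len(X_min), 1.0 / len(X_min))
    counts = rng.multinomial(n_needed, weights)
    # Interpolate between each minority point and a random minority neighbour.
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    nn_min = np.argsort(d_min, axis=1)[:, 1:k + 1]
    synth = []
    for i, c in enumerate(counts):
        for _ in range(c):
            j = rng.choice(nn_min[i])
            synth.append(X_min[i] + rng.random() * (X_min[j] - X_min[i]))
    return np.array(synth).reshape(-1, X.shape[1])

def prune_synthetic(X_synth, X_min, keep_fraction=0.5):
    """Placeholder for the MCA-based pruning step (NOT the paper's criterion):
    keep the synthetic points closest to the minority-class centroid."""
    dist = np.linalg.norm(X_synth - X_min.mean(axis=0), axis=1)
    n_keep = int(len(X_synth) * keep_fraction)
    return X_synth[np.argsort(dist)[:n_keep]]

def bootstrap(X, y, rng=None):
    """Draw a bootstrap sample (sampling with replacement, same size)."""
    rng = np.random.default_rng(rng)
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]

# Toy imbalanced data: 200 majority vs. 20 minority instances, 10 features.
data_rng = np.random.default_rng(0)
X_raw = np.vstack([data_rng.normal(0, 1, (200, 10)),
                   data_rng.normal(2, 1, (20, 10))])
y = np.array([0] * 200 + [1] * 20)

X = pca_reduce(X_raw, n_components=4)
X_synth = adasyn_like(X, y, minority=1, k=5, rng=0)
X_keep = prune_synthetic(X_synth, X[y == 1], keep_fraction=0.5)
X_bal = np.vstack([X, X_keep])
y_bal = np.concatenate([y, np.ones(len(X_keep), dtype=int)])
X_boot, y_boot = bootstrap(X_bal, y_bal, rng=0)
```

In this sketch `X_boot` and `y_boot` would be the input to a classifier. The `keep_fraction` value is arbitrary; in the paper the choice of which synthetic instances to keep is made by the MCA-based process instead.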

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 288-295
Number of pages: 8
ISBN (Print): 9781509000890
DOI: https://doi.org/10.1109/CIC.2015.40
State: Published - Mar 1 2016
Event: 1st IEEE International Conference on Collaboration and Internet Computing, CIC 2015 - Hangzhou, China
Duration: Oct 28 2015 - Oct 30 2015

Other

Other: 1st IEEE International Conference on Collaboration and Internet Computing, CIC 2015
Country: China
City: Hangzhou
Period: 10/28/15 - 10/30/15

Keywords

  • Bootstrapping
  • Classification
  • Convolutional neural network (CNN)
  • Imbalanced data
  • Multiple correspondence analysis (MCA)
  • Supervised learning

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications

Cite this

Guan, S., Chen, M., Ha, H. Y., Chen, S. C., Shyu, M-L., & Zhang, C. (2016). Deep Learning with MCA-based Instance Selection and Bootstrapping for Imbalanced Data Classification. In Proceedings - 2015 IEEE Conference on Collaboration and Internet Computing, CIC 2015 (pp. 288-295). [7423094] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CIC.2015.40