Deep Learning for Imbalanced Multimedia Data Classification

Yilin Yan, Min Chen, Mei-Ling Shyu, Shu Ching Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

43 Citations (Scopus)

Abstract

Classification of imbalanced data is an important research problem because many real-world data sets have skewed class distributions, in which most data instances (examples) belong to one class and far fewer belong to the others. In many applications the minority instances represent the concept of interest (e.g., fraud in banking operations or abnormal cells in medical data), yet a classifier induced from an imbalanced data set tends to be biased towards the majority class and to show very poor classification accuracy on the minority class. Despite extensive research efforts, imbalanced data classification remains one of the most challenging problems in data mining and machine learning, especially for multimedia data. To tackle this challenge, we propose an extended deep learning approach for classifying skewed multimedia data sets. Specifically, we investigate the integration of bootstrapping methods with a state-of-the-art deep learning approach, Convolutional Neural Networks (CNNs), through extensive empirical studies. Because deep learning approaches such as CNNs are usually computationally expensive, we propose feeding low-level features to the CNNs and demonstrate that this is feasible, achieving promising performance while substantially reducing training time. The experimental results show the effectiveness of our framework in classifying severely imbalanced data in the TRECVID data set.
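
The abstract does not spell out the bootstrapping scheme, so the following is only a minimal sketch of one common way to pair bootstrapping with mini-batch training on skewed data: each batch draws the same number of indices from every class, with replacement, so the minority class is not swamped. The function name `balanced_bootstrap_batches`, its parameters, and the toy 990-to-10 label split are hypothetical illustrations, not details taken from the paper.

```python
import random
from collections import Counter


def balanced_bootstrap_batches(labels, batch_size, num_batches, seed=0):
    """Yield lists of sample indices with equal counts per class.

    Hypothetical helper: each class contributes batch_size // num_classes
    indices per batch, drawn with replacement, so minority samples are
    reused across (and possibly within) batches.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    per_class = batch_size // len(by_class)
    if per_class == 0:
        raise ValueError("batch_size must be at least the number of classes")

    for _ in range(num_batches):
        batch = []
        for indices in by_class.values():
            # Bootstrap step: sample with replacement from this class.
            batch.extend(rng.choices(indices, k=per_class))
        rng.shuffle(batch)
        yield batch


# Toy data (assumed): 990 majority samples (label 0), 10 minority samples (label 1).
labels = [0] * 990 + [1] * 10
batches = list(balanced_bootstrap_batches(labels, batch_size=32, num_batches=5))
counts = Counter(labels[i] for i in batches[0])
print(counts[0], counts[1])
```

Each index list could then be used to gather the corresponding low-level feature vectors for one training step. Sampling with replacement means the ten minority samples recur often, which is the intended trade-off of this kind of scheme.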

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE International Symposium on Multimedia, ISM 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 483-488
Number of pages: 6
ISBN (Print): 9781509003792
DOIs: https://doi.org/10.1109/ISM.2015.126
State: Published - Mar 25 2016
Event: 17th IEEE International Symposium on Multimedia, ISM 2015 - Miami, United States
Duration: Dec 14 2015 → Dec 16 2015

Other

Other: 17th IEEE International Symposium on Multimedia, ISM 2015
Country: United States
City: Miami
Period: 12/14/15 → 12/16/15

Fingerprint

Neural networks
Data mining
Learning systems
Classifiers
Deep learning

Keywords

  • classification
  • convolutional neural network (CNN)
  • deep learning
  • imbalanced data
  • semantic indexing

ASJC Scopus subject areas

  • Computer Science Applications
  • Hardware and Architecture
  • Software
  • Computer Networks and Communications

Cite this

Yan, Y., Chen, M., Shyu, M-L., & Chen, S. C. (2016). Deep Learning for Imbalanced Multimedia Data Classification. In Proceedings - 2015 IEEE International Symposium on Multimedia, ISM 2015 (pp. 483-488). [7442383] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISM.2015.126

Deep Learning for Imbalanced Multimedia Data Classification. / Yan, Yilin; Chen, Min; Shyu, Mei-Ling; Chen, Shu Ching.

Proceedings - 2015 IEEE International Symposium on Multimedia, ISM 2015. Institute of Electrical and Electronics Engineers Inc., 2016. p. 483-488 7442383.

Yan, Y, Chen, M, Shyu, M-L & Chen, SC 2016, Deep Learning for Imbalanced Multimedia Data Classification. in Proceedings - 2015 IEEE International Symposium on Multimedia, ISM 2015, 7442383, Institute of Electrical and Electronics Engineers Inc., pp. 483-488, 17th IEEE International Symposium on Multimedia, ISM 2015, Miami, United States, 12/14/15. https://doi.org/10.1109/ISM.2015.126
Yan Y, Chen M, Shyu M-L, Chen SC. Deep Learning for Imbalanced Multimedia Data Classification. In Proceedings - 2015 IEEE International Symposium on Multimedia, ISM 2015. Institute of Electrical and Electronics Engineers Inc. 2016. p. 483-488. 7442383 https://doi.org/10.1109/ISM.2015.126
Yan, Yilin ; Chen, Min ; Shyu, Mei-Ling ; Chen, Shu Ching. / Deep Learning for Imbalanced Multimedia Data Classification. Proceedings - 2015 IEEE International Symposium on Multimedia, ISM 2015. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 483-488
@inproceedings{7b8201e72a444df79a70c9ba78149c9f,
title = "Deep Learning for Imbalanced Multimedia Data Classification",
abstract = "Classification of imbalanced data is an important research problem as lots of real-world data sets have skewed class distributions in which the majority of data instances (examples) belong to one class and far fewer instances belong to others. While in many applications, the minority instances actually represent the concept of interest (e.g., fraud in banking operations, abnormal cell in medical data, etc.), a classifier induced from an imbalanced data set is more likely to be biased towards the majority class and show very poor classification accuracy on the minority class. Despite extensive research efforts, imbalanced data classification remains one of the most challenging problems in data mining and machine learning, especially for multimedia data. To tackle this challenge, in this paper, we propose an extended deep learning approach to achieve promising performance in classifying skewed multimedia data sets. Specifically, we investigate the integration of bootstrapping methods and a state-of-the-art deep learning approach, Convolutional Neural Networks (CNNs), with extensive empirical studies. Considering the fact that deep learning approaches such as CNNs are usually computationally expensive, we propose to feed low-level features to CNNs and prove its feasibility in achieving promising performance while saving a lot of training time. The experimental results show the effectiveness of our framework in classifying severely imbalanced data in the TRECVID data set.",
keywords = "classification, convolutional neural network (CNN), deep learning, imbalanced data, semantic indexing",
author = "Yilin Yan and Min Chen and Mei-Ling Shyu and Chen, {Shu Ching}",
year = "2016",
month = "3",
day = "25",
doi = "10.1109/ISM.2015.126",
language = "English (US)",
isbn = "9781509003792",
pages = "483--488",
booktitle = "Proceedings - 2015 IEEE International Symposium on Multimedia, ISM 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY - GEN

T1 - Deep Learning for Imbalanced Multimedia Data Classification

AU - Yan, Yilin

AU - Chen, Min

AU - Shyu, Mei-Ling

AU - Chen, Shu Ching

PY - 2016/3/25

Y1 - 2016/3/25

N2 - Classification of imbalanced data is an important research problem as lots of real-world data sets have skewed class distributions in which the majority of data instances (examples) belong to one class and far fewer instances belong to others. While in many applications, the minority instances actually represent the concept of interest (e.g., fraud in banking operations, abnormal cell in medical data, etc.), a classifier induced from an imbalanced data set is more likely to be biased towards the majority class and show very poor classification accuracy on the minority class. Despite extensive research efforts, imbalanced data classification remains one of the most challenging problems in data mining and machine learning, especially for multimedia data. To tackle this challenge, in this paper, we propose an extended deep learning approach to achieve promising performance in classifying skewed multimedia data sets. Specifically, we investigate the integration of bootstrapping methods and a state-of-the-art deep learning approach, Convolutional Neural Networks (CNNs), with extensive empirical studies. Considering the fact that deep learning approaches such as CNNs are usually computationally expensive, we propose to feed low-level features to CNNs and prove its feasibility in achieving promising performance while saving a lot of training time. The experimental results show the effectiveness of our framework in classifying severely imbalanced data in the TRECVID data set.

KW - classification

KW - convolutional neural network (CNN)

KW - deep learning

KW - imbalanced data

KW - semantic indexing

UR - http://www.scopus.com/inward/record.url?scp=84969645930&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84969645930&partnerID=8YFLogxK

U2 - 10.1109/ISM.2015.126

DO - 10.1109/ISM.2015.126

M3 - Conference contribution

AN - SCOPUS:84969645930

SN - 9781509003792

SP - 483

EP - 488

BT - Proceedings - 2015 IEEE International Symposium on Multimedia, ISM 2015

PB - Institute of Electrical and Electronics Engineers Inc.

ER -