Utilizing concept correlations for effective imbalanced data classification

Yilin Yan, Yang Liu, Mei-Ling Shyu, Min Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

18 Scopus citations

Abstract

Data imbalance is a common and challenging problem in data mining and machine learning, and it has attracted significant research effort. A data set is considered imbalanced when its data instances (samples) are far from uniformly distributed across the classes/categories, a situation that is very common in real-world data sets and that is likely to produce biased classification results. In this paper, a two-phase classification framework is proposed to make the classification of imbalanced data more accurate and stable. The proposed framework is based on the correlations generated between concepts. The general idea is to identify negative data instances that have certain positive correlations with data instances in the target concept in order to facilitate the classification task. Experimental comparisons with four existing approaches, using four different kinds of feature representations, show that the framework is effective in imbalanced data classification and is robust to the choice of feature descriptors.
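As a minimal illustration of the definition of imbalance given in the abstract (not of the proposed two-phase framework, whose details are not stated here), the following sketch counts the instances per class and reports the majority-to-minority ratio. The function name `imbalance_ratio`, the label names, and the 95/5 split are hypothetical and chosen only for the example.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return per-class counts and the majority/minority size ratio."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to measure imbalance")
    majority = max(counts.values())
    minority = min(counts.values())
    return counts, majority / minority


# Hypothetical data: 95 negative instances and 5 positive (target concept) instances.
labels = ["negative"] * 95 + ["positive"] * 5
counts, ratio = imbalance_ratio(labels)
print(dict(counts), ratio)
```

A ratio close to 1 indicates a roughly uniform class distribution; the larger the ratio, the more skewed the data set.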

Original language: English (US)
Title of host publication: Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration, IEEE IRI 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 561-568
Number of pages: 8
ISBN (Print): 9781479958801
DOIs: https://doi.org/10.1109/IRI.2014.7051939
State: Published - Feb 27 2014
Event: 15th IEEE International Conference on Information Reuse and Integration, IEEE IRI 2014 - San Francisco, United States
Duration: Aug 13 2014 → Aug 15 2014

Other

Other: 15th IEEE International Conference on Information Reuse and Integration, IEEE IRI 2014
Country: United States
City: San Francisco
Period: 8/13/14 → 8/15/14

Keywords

  • Classification
  • Correlation
  • Imbalanced data
  • Rare class mining
  • Skewed data

ASJC Scopus subject areas

  • Information Systems


Cite this

Yan, Y., Liu, Y., Shyu, M-L., & Chen, M. (2014). Utilizing concept correlations for effective imbalanced data classification. In Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration, IEEE IRI 2014 (pp. 561-568). [7051939] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IRI.2014.7051939