Correlation maximisation-based discretisation for supervised classification

Qiusha Zhu, Lin Lin, Mei Ling Shyu

Research output: Contribution to journal › Article

2 Scopus citations

Abstract

This paper proposes a novel supervised discretisation algorithm based on Correlation Maximisation (CM) using Multiple Correspondence Analysis (MCA). MCA is an effective technique for capturing the correlation between multiple variables. For each numeric feature, the proposed algorithm uses MCA to measure the correlations between feature intervals (items) and classes, and the set of cut-points yielding the maximum correlation is chosen as the discretisation scheme for that feature. The discretised feature therefore not only gives a concise summary of the original numeric feature but also carries the maximum correlation information for predicting class labels. Experiments compare the proposed algorithm with seven state-of-the-art supervised discretisation algorithms, using six well-known classifiers on 19 UCI data sets. The results show that the proposed algorithm automatically generates a set of feature intervals that yields the best classification results on average.
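As a rough illustration of the idea of choosing cut-points that maximise the correlation between intervals and classes, the sketch below greedily searches candidate cut-points for one numeric feature. It is not the authors' algorithm: instead of the paper's MCA-based correlation measure, it swaps in Cramér's V computed from the interval-by-class contingency table (the chi-square statistic divided by n is the total inertia of a simple correspondence analysis of that table). The function names `cramers_v` and `best_cut_points`, the `max_cuts` parameter, and the greedy search strategy are all assumptions made for this example.

```python
import numpy as np


def cramers_v(bins, labels):
    """Cramér's V between two categorical arrays (stand-in for the MCA measure)."""
    _, b = np.unique(np.asarray(bins).ravel(), return_inverse=True)
    _, c = np.unique(np.asarray(labels).ravel(), return_inverse=True)
    r, k = b.max() + 1, c.max() + 1
    if r < 2 or k < 2:
        return 0.0  # a single interval or a single class carries no association
    # Contingency table: rows are intervals, columns are classes.
    table = np.zeros((r, k))
    np.add.at(table, (b, c), 1)
    n = table.sum()
    expected = table.sum(axis=1, keepdims=True) * table.sum(axis=0, keepdims=True) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    return float(np.sqrt(chi2 / (n * min(r - 1, k - 1))))


def best_cut_points(x, y, max_cuts=3):
    """Greedily add cut-points while the interval/class association improves."""
    x = np.asarray(x, dtype=float)
    values = np.unique(x)
    # Candidate cut-points: midpoints between consecutive distinct values.
    candidates = list((values[:-1] + values[1:]) / 2)
    cuts, best = [], 0.0
    while candidates and len(cuts) < max_cuts:
        scored = [
            (cramers_v(np.digitize(x, sorted(cuts + [c])), y), c)
            for c in candidates
        ]
        score, cut = max(scored)
        if score <= best + 1e-12:
            break  # no candidate improves the association; stop adding cuts
        best = score
        cuts.append(cut)
        candidates.remove(cut)
    return sorted(float(c) for c in cuts), best


if __name__ == "__main__":
    x = [1, 2, 3, 10, 11, 12]
    y = [0, 0, 0, 1, 1, 1]
    print(best_cut_points(x, y))
```

A greedy search is used here only to keep the example short; the abstract does not say how the candidate cut-point sets are enumerated, so an exhaustive or otherwise different search may be closer to the published method.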

Original language: English (US)
Pages (from-to): 40-59
Number of pages: 20
Journal: International Journal of Business Intelligence and Data Mining
Volume: 7
Issue number: 1-2
State: Published - Aug 1 2012

Keywords

  • Correlation
  • Discretisation
  • MCA
  • Multiple correspondence analysis
  • Supervised classification

ASJC Scopus subject areas

  • Management Information Systems
  • Information Systems and Management
  • Statistics, Probability and Uncertainty
