Feature selection using correlation and reliability based scoring metric for video semantic detection

Qiusha Zhu, Lin Lin, Mei-Ling Shyu, Shu Ching Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Citations (Scopus)

Abstract

Content-based multimedia retrieval faces many challenges, such as the semantic gap, imbalanced data, and varying media quality. Feature selection plays an important role as a component of the retrieval process. Its aim is to identify a subset of features by removing irrelevant or redundant ones. An effective subset of features can not only improve model performance and reduce computational complexity, but also enhance semantic interpretability. To achieve these objectives, this paper proposes a novel metric that scores features for feature selection by integrating the correlation and reliability information between each feature and each class, obtained from Multiple Correspondence Analysis (MCA). Based on these scores, a ranked list of features can be generated, and different selection criteria can be adopted to select a subset of features. To evaluate the proposed framework, four other well-known feature selection methods, namely information gain, the chi-square measure, correlation-based feature selection, and Relief, are compared with the proposed method over five popular classifiers using the benchmark data from the TRECVID 2009 high-level feature extraction task. The results show that the proposed method outperforms the other methods in terms of classification accuracy, the size of the feature subspace, and the ability to capture semantic information.
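As a rough illustration of the score, rank, and select pipeline described in the abstract, the following Python sketch scores categorical (discretized) features with the chi-square statistic, ranks them, and keeps the top k. Chi-square is one of the baseline methods named in the abstract; it is used here only as a stand-in, because the MCA-based correlation and reliability metric is not specified in this record. The function names, the toy data, and the top-k criterion are all assumptions made for illustration.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Chi-square statistic between one categorical feature and the class labels."""
    n = len(labels)
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(feature_values, labels))
    score = 0.0
    for value, value_count in feature_counts.items():
        for label, label_count in label_counts.items():
            # Expected cell count if feature and class were independent.
            expected = value_count * label_count / n
            observed = joint_counts.get((value, label), 0)
            score += (observed - expected) ** 2 / expected
    return score


def rank_features(columns, labels):
    """Return (feature name, score) pairs sorted from highest to lowest score."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def select_top_k(ranked, k):
    """One possible selection criterion (an assumption): keep the k best-scored features."""
    return [name for name, _ in ranked[:k]]


# Hypothetical toy data: six samples, binary class, three categorical features.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "informative": ["a", "a", "a", "b", "b", "b"],  # perfectly aligned with the class
    "noisy": ["a", "b", "a", "b", "a", "b"],        # weakly related to the class
    "constant": ["a"] * 6,                          # carries no class information
}

ranked = rank_features(columns, labels)
selected = select_top_k(ranked, k=1)
print(ranked)
print(selected)
```

Other selection criteria, such as a score threshold, could replace the top-k rule without changing the scoring or ranking steps.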

Original language: English
Title of host publication: Proceedings - 2010 IEEE 4th International Conference on Semantic Computing, ICSC 2010
Pages: 462-469
Number of pages: 8
DOI: https://doi.org/10.1109/ICSC.2010.65
State: Published - Dec 1 2010
Event: 4th IEEE International Conference on Semantic Computing, ICSC 2010 - Pittsburgh, PA, United States
Duration: Sep 22 2010 to Sep 24 2010

Other

Other: 4th IEEE International Conference on Semantic Computing, ICSC 2010
Country: United States
City: Pittsburgh, PA
Period: 9/22/10 to 9/24/10

Fingerprint

Scoring
Feature Selection
Feature extraction
Semantics
Metric
Subset
Retrieval
Multiple Correspondence Analysis
Information Gain
Chi-square
Interpretability
Performance Model
Multimedia
Computational Complexity
Classifier
Integrate
Subspace
Benchmark

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Theoretical Computer Science

Cite this

Zhu, Q., Lin, L., Shyu, M-L., & Chen, S. C. (2010). Feature selection using correlation and reliability based scoring metric for video semantic detection. In Proceedings - 2010 IEEE 4th International Conference on Semantic Computing, ICSC 2010 (pp. 462-469). [5629038] https://doi.org/10.1109/ICSC.2010.65

@inproceedings{4e2b61cb809f4163b810f24751a442c9,
title = "Feature selection using correlation and reliability based scoring metric for video semantic detection",
author = "Qiusha Zhu and Lin Lin and Mei-Ling Shyu and Chen, {Shu Ching}",
year = "2010",
month = "12",
day = "1",
doi = "10.1109/ICSC.2010.65",
language = "English",
isbn = "9780769541549",
pages = "462--469",
booktitle = "Proceedings - 2010 IEEE 4th International Conference on Semantic Computing, ICSC 2010",

}

TY - GEN

T1 - Feature selection using correlation and reliability based scoring metric for video semantic detection

AU - Zhu, Qiusha

AU - Lin, Lin

AU - Shyu, Mei-Ling

AU - Chen, Shu Ching

PY - 2010/12/1

Y1 - 2010/12/1

UR - http://www.scopus.com/inward/record.url?scp=79952059916&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952059916&partnerID=8YFLogxK

U2 - 10.1109/ICSC.2010.65

DO - 10.1109/ICSC.2010.65

M3 - Conference contribution

AN - SCOPUS:79952059916

SN - 9780769541549

SP - 462

EP - 469

BT - Proceedings - 2010 IEEE 4th International Conference on Semantic Computing, ICSC 2010

ER -