Enhancing multimedia imbalanced concept detection using VIMP in Random Forests

Saad Sadiq, Yilin Yan, Mei-Ling Shyu, Shu Ching Chen, Hemant Ishwaran

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Recent developments in social media and cloud storage have led to exponential growth in the amount of multimedia data, which increases the complexity of managing, storing, indexing, and retrieving information from such big data. Many current content-based concept detection approaches fall short of bridging the semantic gap. To address this problem, a multi-stage random forest framework is proposed that generates predictor variables from multivariate regressions using variable importance (VIMP). By fine-tuning the forests and significantly reducing the number of predictor variables, the concept detection scores are evaluated when the concept of interest is rare and imbalanced, i.e., has little collaboration with other high-level concepts. Using classical multivariate statistics, estimating the value of one coordinate from the other coordinates standardizes the covariates and depends on the variance of the correlations rather than on the mean. Thus, the conditional dependence on the data being normally distributed is eliminated. Experimental results demonstrate that the proposed framework outperforms the compared approaches in terms of Mean Average Precision (MAP).
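The abstract reports results as Mean Average Precision (MAP) without stating which variant was used (for example, whether the ranking is cut off at a fixed depth). As a minimal sketch, assuming the standard uninterpolated definition over the full ranked list, the metric can be computed as follows. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Uninterpolated average precision for a single concept.

    scores: detection scores, higher means more confident.
    labels: 1 if the item truly contains the concept, else 0.
    """
    # Rank items from the highest to the lowest detection score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision at the rank of each relevant item.
            precision_sum += hits / rank
    # A concept with no positive items contributes 0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    per_concept: List[Tuple[Sequence[float], Sequence[int]]]
) -> float:
    """Mean of the per-concept average precision values."""
    aps = [average_precision(s, l) for s, l in per_concept]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data (hypothetical): one fairly common concept and one rare concept.
common = ([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
rare = ([0.2, 0.9, 0.4, 0.1, 0.3], [0, 0, 1, 0, 0])

ap_common = average_precision(*common)  # (1/1 + 2/3) / 2
ap_rare = average_precision(*rare)      # single positive at rank 2
map_value = mean_average_precision([common, rare])
```

Because each concept contributes equally to the mean, a rare concept with a single misranked positive pulls MAP down as much as a common one, which is why the metric is sensitive to performance on imbalanced concepts.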

Original language: English (US)
Title of host publication: Proceedings - 2016 IEEE 17th International Conference on Information Reuse and Integration, IRI 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 601-608
Number of pages: 8
ISBN (Electronic): 9781509032075
DOIs: https://doi.org/10.1109/IRI.2016.87
State: Published - 2016
Event: 17th IEEE International Conference on Information Reuse and Integration, IRI 2016 - Pittsburgh, United States
Duration: Jul 28 2016 to Jul 30 2016

Other

Other: 17th IEEE International Conference on Information Reuse and Integration, IRI 2016
Country: United States
City: Pittsburgh
Period: 7/28/16 to 7/30/16

Fingerprint

Tuning
Semantics
Statistics
Big data
Multimedia
Predictors

Keywords

  • Multimedia imbalanced concept detection
  • Multivariate regression
  • Random forests
  • Variable importance (VIMP)

ASJC Scopus subject areas

  • Information Systems
  • Information Systems and Management

Cite this

Sadiq, S., Yan, Y., Shyu, M-L., Chen, S. C., & Ishwaran, H. (2016). Enhancing multimedia imbalanced concept detection using VIMP in Random Forests. In Proceedings - 2016 IEEE 17th International Conference on Information Reuse and Integration, IRI 2016 (pp. 601-608). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IRI.2016.87

@inproceedings{716e4687c5234509aadcd9f8b211e7c4,
title = "Enhancing multimedia imbalanced concept detection using VIMP in Random Forests",
keywords = "Multimedia imbalanced concept detection, Multivariate regression, Random forests, Variable importance (VIMP)",
author = "Saad Sadiq and Yilin Yan and Mei-Ling Shyu and Chen, {Shu Ching} and Hemant Ishwaran",
year = "2016",
doi = "10.1109/IRI.2016.87",
language = "English (US)",
pages = "601--608",
booktitle = "Proceedings - 2016 IEEE 17th International Conference on Information Reuse and Integration, IRI 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc."
}

TY - GEN

T1 - Enhancing multimedia imbalanced concept detection using VIMP in Random Forests

AU - Sadiq, Saad

AU - Yan, Yilin

AU - Shyu, Mei-Ling

AU - Chen, Shu Ching

AU - Ishwaran, Hemant

PY - 2016

Y1 - 2016

KW - Multimedia imbalanced concept detection

KW - Multivariate regression

KW - Random forests

KW - Variable importance (VIMP)

UR - http://www.scopus.com/inward/record.url?scp=84991216144&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84991216144&partnerID=8YFLogxK

U2 - 10.1109/IRI.2016.87

DO - 10.1109/IRI.2016.87

M3 - Conference contribution

AN - SCOPUS:84991216144

SP - 601

EP - 608

BT - Proceedings - 2016 IEEE 17th International Conference on Information Reuse and Integration, IRI 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -