Negative-Based Sampling for Multimedia Retrieval

Hsin Yu Ha, Shu Ching Chen, Mei-Ling Shyu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Large volumes of multimedia data are now produced and shared around the world. One of the major challenges in identifying meaningful semantic concepts in such data is the data imbalance problem. Data imbalance occurs when the number of positive instances (i.e., instances that contain the target concept) is far smaller than the number of negative instances (i.e., instances that do not contain the target concept); in other words, the ratio of positive to negative instances is extremely low. The usual remedy is to rebalance the dataset by sampling or data pruning. In this paper, we propose a sampling method that consists of three stages: selecting features to identify the negative instances, producing negative ranking scores, and performing sampling. The method is compared with several existing methods on the TRECVID dataset and is shown to perform better.
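
The three stages named in the abstract can be illustrated with a small Python sketch. This is not the authors' method: the abstract does not say how features are selected (the keywords mention FC-MST), how the negative ranking scores are computed, or which negatives are retained. Everything below is a hypothetical stand-in chosen only to show the shape of such a pipeline: features are ranked by the gap between class means, instances are scored by squared distance from the positive centroid, and the negatives closest to the positives are kept. The function names and the `ratio` parameter are assumptions as well.

```python
from statistics import mean


def split_by_label(X, y):
    """Separate rows into positives (label 1) and negatives (label 0)."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return pos, neg


def select_features(X, y, k):
    """Stage 1 (stand-in): keep the k features whose class means differ most."""
    pos, neg = split_by_label(X, y)
    gaps = [
        abs(mean([r[j] for r in pos]) - mean([r[j] for r in neg]))
        for j in range(len(X[0]))
    ]
    return sorted(range(len(gaps)), key=lambda j: gaps[j], reverse=True)[:k]


def negative_scores(X, y, features):
    """Stage 2 (stand-in): squared distance from the positive centroid.

    A larger score means the instance looks more clearly negative.
    """
    pos, _ = split_by_label(X, y)
    centroid = [mean([r[j] for r in pos]) for j in features]
    return [sum((row[j] - c) ** 2 for j, c in zip(features, centroid)) for row in X]


def undersample(X, y, k_features, ratio=1.0):
    """Stage 3 (stand-in): keep all positives plus about ratio * #positives negatives.

    Keeping the lowest-scoring negatives (those nearest the positives) is a
    choice made for this sketch, not something stated in the abstract.
    """
    features = select_features(X, y, k_features)
    scores = negative_scores(X, y, features)
    pos_idx = [i for i, label in enumerate(y) if label == 1]
    neg_idx = [i for i, label in enumerate(y) if label == 0]
    n_keep = min(len(neg_idx), max(1, round(ratio * len(pos_idx))))
    kept_neg = sorted(neg_idx, key=lambda i: scores[i])[:n_keep]
    keep = sorted(pos_idx + kept_neg)
    return [X[i] for i in keep], [y[i] for i in keep]


if __name__ == "__main__":
    # Toy data: 2 positives and 6 negatives; only feature 0 separates the classes.
    X = [
        [1.0, 0.0, 5.0], [0.9, 1.0, 5.0],
        [0.8, 0.0, 5.0], [0.1, 1.0, 5.0], [0.0, 0.0, 5.0],
        [0.2, 1.0, 5.0], [0.7, 1.0, 5.0], [0.1, 0.0, 5.0],
    ]
    y = [1, 1, 0, 0, 0, 0, 0, 0]
    X_s, y_s = undersample(X, y, k_features=1, ratio=1.0)
    print(len(y), "instances before,", len(y_s), "after; labels:", y_s)
```

With `ratio=1.0` the sketch returns a dataset with equal numbers of positive and negative instances; a larger `ratio` retains more negatives.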

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE 16th International Conference on Information Reuse and Integration, IRI 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 64-71
Number of pages: 8
ISBN (Print): 9781467366564
DOIs: https://doi.org/10.1109/IRI.2015.20
State: Published - Oct 19 2015
Event: 16th IEEE International Conference on Information Reuse and Integration, IRI 2015 - San Francisco, United States
Duration: Aug 13 2015 → Aug 15 2015

Other

Other: 16th IEEE International Conference on Information Reuse and Integration, IRI 2015
Country: United States
City: San Francisco
Period: 8/13/15 → 8/15/15

Fingerprint

Sampling
Semantics
Multimedia
Imbalance
Sampling methods
High-tech
Rebalancing
Ranking
Lifestyle
Pruning

Keywords

  • FC-MST
  • Feature Selection
  • Multimedia
  • Sampling

ASJC Scopus subject areas

  • Information Systems
  • Information Systems and Management
  • Electrical and Electronic Engineering

Cite this

Ha, H. Y., Chen, S. C., & Shyu, M-L. (2015). Negative-Based Sampling for Multimedia Retrieval. In Proceedings - 2015 IEEE 16th International Conference on Information Reuse and Integration, IRI 2015 (pp. 64-71). [7300956] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IRI.2015.20

Negative-Based Sampling for Multimedia Retrieval. / Ha, Hsin Yu; Chen, Shu Ching; Shyu, Mei-Ling.

Proceedings - 2015 IEEE 16th International Conference on Information Reuse and Integration, IRI 2015. Institute of Electrical and Electronics Engineers Inc., 2015. p. 64-71 7300956.

Ha, HY, Chen, SC & Shyu, M-L 2015, Negative-Based Sampling for Multimedia Retrieval. in Proceedings - 2015 IEEE 16th International Conference on Information Reuse and Integration, IRI 2015, 7300956, Institute of Electrical and Electronics Engineers Inc., pp. 64-71, 16th IEEE International Conference on Information Reuse and Integration, IRI 2015, San Francisco, United States, 8/13/15. https://doi.org/10.1109/IRI.2015.20
Ha HY, Chen SC, Shyu M-L. Negative-Based Sampling for Multimedia Retrieval. In Proceedings - 2015 IEEE 16th International Conference on Information Reuse and Integration, IRI 2015. Institute of Electrical and Electronics Engineers Inc. 2015. p. 64-71. 7300956 https://doi.org/10.1109/IRI.2015.20
Ha, Hsin Yu ; Chen, Shu Ching ; Shyu, Mei-Ling. / Negative-Based Sampling for Multimedia Retrieval. Proceedings - 2015 IEEE 16th International Conference on Information Reuse and Integration, IRI 2015. Institute of Electrical and Electronics Engineers Inc., 2015. pp. 64-71
@inproceedings{37e6360e67d845b5be2cf81a05133e5e,
title = "Negative-Based Sampling for Multimedia Retrieval",
keywords = "FC-MST, Feature Selection, Multimedia, Sampling",
author = "Ha, {Hsin Yu} and Chen, {Shu Ching} and Shyu, Mei-Ling",
year = "2015",
month = "10",
day = "19",
doi = "10.1109/IRI.2015.20",
language = "English (US)",
isbn = "9781467366564",
pages = "64--71",
booktitle = "Proceedings - 2015 IEEE 16th International Conference on Information Reuse and Integration, IRI 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY - GEN
T1 - Negative-Based Sampling for Multimedia Retrieval
AU - Ha, Hsin Yu
AU - Chen, Shu Ching
AU - Shyu, Mei-Ling
PY - 2015/10/19
Y1 - 2015/10/19
KW - FC-MST
KW - Feature Selection
KW - Multimedia
KW - Sampling
UR - http://www.scopus.com/inward/record.url?scp=84959124515&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959124515&partnerID=8YFLogxK
U2 - 10.1109/IRI.2015.20
DO - 10.1109/IRI.2015.20
M3 - Conference contribution
SN - 9781467366564
SP - 64
EP - 71
BT - Proceedings - 2015 IEEE 16th International Conference on Information Reuse and Integration, IRI 2015
PB - Institute of Electrical and Electronics Engineers Inc.
ER -