Web media semantic concept retrieval via tag removal and model fusion

Chao Chen, Qiusha Zhu, Lin Lin, Mei-Ling Shyu

Research output: Contribution to journal › Article

30 Citations (Scopus)

Abstract

Multimedia data on social websites contain rich semantics and are often accompanied with user-defined tags. To enhance Web media semantic concept retrieval, the fusion of tag-based and content-based models can be used, though it is very challenging. In this article, a novel semantic concept retrieval framework that incorporates tag removal and model fusion is proposed to tackle such a challenge. Tags with useful information can facilitate media search, but they are often imprecise, which makes it important to apply noisy tag removal (by deleting uncorrelated tags) to improve the performance of semantic concept retrieval. Therefore, a multiple correspondence analysis (MCA)-based tag removal algorithm is proposed, which utilizes MCA's ability to capture the relationships among nominal features and identify representative and discriminative tags holding strong correlations with the target semantic concepts. To further improve the retrieval performance, a novel model fusion method is also proposed to combine ranking scores from both tag-based and content-based models, where the adjustment of ranking scores, the reliability of models, and the correlations between the intervals divided on the ranking scores and the semantic concepts are all considered. Comparative results with extensive experiments on the NUS-WIDE-LITE as well as the NUS-WIDE-270K benchmark datasets with 81 semantic concepts show that the proposed framework outperforms baseline results and the other comparison methods with each component being evaluated separately.
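
As a rough illustration of the two ideas named in the abstract (dropping tags that are uncorrelated with a target concept, then combining tag-based and content-based ranking scores), a minimal sketch follows. It is not the authors' method: the phi coefficient stands in for MCA, and a fixed reliability-weighted average stands in for the paper's fusion rule. All function names, the threshold, the reliability weights, and the sample data are hypothetical.

```python
import math
from typing import Dict, List


def phi(x: List[int], y: List[int]) -> float:
    """Phi coefficient between two binary (0/1) sequences of equal length."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom


def keep_correlated_tags(tags: Dict[str, List[int]], labels: List[int],
                         threshold: float) -> List[str]:
    """Keep tags whose absolute phi with the concept label meets the threshold.

    Assumption: a simple stand-in for the MCA-based removal in the abstract.
    """
    return [t for t, col in tags.items() if abs(phi(col, labels)) >= threshold]


def min_max(scores: List[float]) -> List[float]:
    """Rescale scores to [0, 1]; a constant list maps to 0.5 everywhere."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(tag_scores: List[float], content_scores: List[float],
                tag_reliability: float, content_reliability: float) -> List[float]:
    """Reliability-weighted average of min-max normalized ranking scores.

    Assumption: the reliabilities are given, e.g. validation-set precision.
    """
    if len(tag_scores) != len(content_scores):
        raise ValueError("score lists must have the same length")
    total = tag_reliability + content_reliability
    if total <= 0:
        raise ValueError("reliabilities must sum to a positive value")
    w_tag, w_content = tag_reliability / total, content_reliability / total
    return [w_tag * t + w_content * c
            for t, c in zip(min_max(tag_scores), min_max(content_scores))]


# Hypothetical toy data: four images, one target concept, two tags.
tags = {"sky": [1, 1, 0, 0], "2009": [1, 0, 1, 0]}
labels = [1, 1, 0, 0]
kept = keep_correlated_tags(tags, labels, threshold=0.5)
fused = fuse_scores([0.2, 0.8, 0.5], [10.0, 30.0, 20.0], 0.6, 0.4)
```

In this toy data the tag "2009" is statistically independent of the concept label and is filtered out, while the fused scores preserve the ordering on which both models agree.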

Original language: English
Article number: 61
Journal: ACM Transactions on Intelligent Systems and Technology
Volume: 4
Issue number: 4
DOI: 10.1145/2508037.2508042
State: Published - Oct 21 2013

Keywords

  • Model fusion
  • Multimedia semantic concept retrieval
  • Multiple correspondence analysis (MCA)
  • Noisy tag removal
  • Social tags

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Artificial Intelligence

Cite this

Web media semantic concept retrieval via tag removal and model fusion. / Chen, Chao; Zhu, Qiusha; Lin, Lin; Shyu, Mei-Ling.

In: ACM Transactions on Intelligent Systems and Technology, Vol. 4, No. 4, 61, 21.10.2013.


@article{61c991d2eb224e3c963eff65ef593bbc,
title = "Web media semantic concept retrieval via tag removal and model fusion",
abstract = "Multimedia data on social websites contain rich semantics and are often accompanied with user-defined tags. To enhance Web media semantic concept retrieval, the fusion of tag-based and content-based models can be used, though it is very challenging. In this article, a novel semantic concept retrieval framework that incorporates tag removal and model fusion is proposed to tackle such a challenge. Tags with useful information can facilitate media search, but they are often imprecise, which makes it important to apply noisy tag removal (by deleting uncorrelated tags) to improve the performance of semantic concept retrieval. Therefore, a multiple correspondence analysis (MCA)-based tag removal algorithm is proposed, which utilizes MCA's ability to capture the relationships among nominal features and identify representative and discriminative tags holding strong correlations with the target semantic concepts. To further improve the retrieval performance, a novel model fusion method is also proposed to combine ranking scores from both tag-based and content-based models, where the adjustment of ranking scores, the reliability of models, and the correlations between the intervals divided on the ranking scores and the semantic concepts are all considered. Comparative results with extensive experiments on the NUS-WIDE-LITE as well as the NUS-WIDE-270K benchmark datasets with 81 semantic concepts show that the proposed framework outperforms baseline results and the other comparison methods with each component being evaluated separately.",
keywords = "Model fusion, Multimedia semantic concept retrieval, Multiple correspondence analysis (MCA), Noisy tag removal, Social tags",
author = "Chao Chen and Qiusha Zhu and Lin Lin and Mei-Ling Shyu",
year = "2013",
month = "10",
day = "21",
doi = "10.1145/2508037.2508042",
language = "English",
volume = "4",
journal = "ACM Transactions on Intelligent Systems and Technology",
issn = "2157-6904",
publisher = "Association for Computing Machinery (ACM)",
number = "4",
}

TY - JOUR

T1 - Web media semantic concept retrieval via tag removal and model fusion

AU - Chen, Chao

AU - Zhu, Qiusha

AU - Lin, Lin

AU - Shyu, Mei-Ling

PY - 2013/10/21

Y1 - 2013/10/21

N2 - Multimedia data on social websites contain rich semantics and are often accompanied with user-defined tags. To enhance Web media semantic concept retrieval, the fusion of tag-based and content-based models can be used, though it is very challenging. In this article, a novel semantic concept retrieval framework that incorporates tag removal and model fusion is proposed to tackle such a challenge. Tags with useful information can facilitate media search, but they are often imprecise, which makes it important to apply noisy tag removal (by deleting uncorrelated tags) to improve the performance of semantic concept retrieval. Therefore, a multiple correspondence analysis (MCA)-based tag removal algorithm is proposed, which utilizes MCA's ability to capture the relationships among nominal features and identify representative and discriminative tags holding strong correlations with the target semantic concepts. To further improve the retrieval performance, a novel model fusion method is also proposed to combine ranking scores from both tag-based and content-based models, where the adjustment of ranking scores, the reliability of models, and the correlations between the intervals divided on the ranking scores and the semantic concepts are all considered. Comparative results with extensive experiments on the NUS-WIDE-LITE as well as the NUS-WIDE-270K benchmark datasets with 81 semantic concepts show that the proposed framework outperforms baseline results and the other comparison methods with each component being evaluated separately.

AB - Multimedia data on social websites contain rich semantics and are often accompanied with user-defined tags. To enhance Web media semantic concept retrieval, the fusion of tag-based and content-based models can be used, though it is very challenging. In this article, a novel semantic concept retrieval framework that incorporates tag removal and model fusion is proposed to tackle such a challenge. Tags with useful information can facilitate media search, but they are often imprecise, which makes it important to apply noisy tag removal (by deleting uncorrelated tags) to improve the performance of semantic concept retrieval. Therefore, a multiple correspondence analysis (MCA)-based tag removal algorithm is proposed, which utilizes MCA's ability to capture the relationships among nominal features and identify representative and discriminative tags holding strong correlations with the target semantic concepts. To further improve the retrieval performance, a novel model fusion method is also proposed to combine ranking scores from both tag-based and content-based models, where the adjustment of ranking scores, the reliability of models, and the correlations between the intervals divided on the ranking scores and the semantic concepts are all considered. Comparative results with extensive experiments on the NUS-WIDE-LITE as well as the NUS-WIDE-270K benchmark datasets with 81 semantic concepts show that the proposed framework outperforms baseline results and the other comparison methods with each component being evaluated separately.

KW - Model fusion

KW - Multimedia semantic concept retrieval

KW - Multiple correspondence analysis (MCA)

KW - Noisy tag removal

KW - Social tags

UR - http://www.scopus.com/inward/record.url?scp=84885653186&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84885653186&partnerID=8YFLogxK

U2 - 10.1145/2508037.2508042

DO - 10.1145/2508037.2508042

M3 - Article

AN - SCOPUS:84885653186

VL - 4

JO - ACM Transactions on Intelligent Systems and Technology

JF - ACM Transactions on Intelligent Systems and Technology

SN - 2157-6904

IS - 4

M1 - 61

ER -