Sparse Linear Integration of Content and Context Modalities for Semantic Concept Retrieval

Qiusha Zhu, Mei-Ling Shyu

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

The semantic gap between low-level visual features and high-level semantics is a well-known challenge in content-based multimedia information retrieval. With the rapid popularization of social media, which allows users to assign tags to describe images and videos, attention has naturally turned to exploiting these metadata to bridge the semantic gap. This paper proposes a sparse linear integration (SLI) model that integrates visual content and its associated metadata, referred to as the content and the context modalities, respectively, for semantic concept retrieval. An optimization problem is formulated to approximate an instance by a sparse linear combination of other instances while minimizing the difference between the instance and its approximation. The prediction score of a concept for a test instance measures how well the instance can be reconstructed from the positive instances of that concept. The SLI model is evaluated on two benchmark image data sets and their associated tags. Experimental results show promising performance compared with approaches based on a single modality and with approaches based on popular fusion methods.
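
The reconstruction idea in the abstract can be illustrated with a minimal sketch. This is not the paper's SLI formulation, which the abstract does not spell out. It assumes a generic L1-regularized least-squares objective solved with ISTA (iterative soft-thresholding), and it assumes each instance is a single feature vector, for example visual features and tag features concatenated. The names `soft_threshold`, `sparse_code`, `concept_score`, and the parameter `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_code(x, D, lam=0.05, n_iter=500):
    """Approximately minimize 0.5 * ||x - D @ w||^2 + lam * ||w||_1 with ISTA.

    x : (d,) instance to reconstruct
    D : (d, n) matrix whose columns are the other instances
    Assumption: this generic lasso objective stands in for the paper's
    optimization problem, which is not given in the abstract.
    """
    # Step size 1/L, where L is the Lipschitz constant of the smooth term
    # (squared spectral norm of D); guarded against an all-zero D.
    L = max(np.linalg.norm(D, 2) ** 2, 1e-12)
    w = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ w - x)          # gradient of the least-squares term
        w = soft_threshold(w - grad / L, lam / L)
    return w


def concept_score(x, positives, lam=0.05):
    """Hypothetical score: negative reconstruction error of x from the
    positive instances of one concept (rows of `positives`).
    Higher (closer to zero) means x is reconstructed better."""
    D = positives.T
    w = sparse_code(x, D, lam)
    return -np.linalg.norm(x - D @ w)
```

Under these assumptions, a test instance would be scored against each concept by calling `concept_score` with that concept's positive instances, and concepts would be ranked by the resulting scores.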

Original language: English (US)
Pages (from-to): 152-160
Number of pages: 9
Journal: IEEE Transactions on Emerging Topics in Computing
Volume: 3
Issue number: 2
DOI: 10.1109/TETC.2014.2384992
State: Published - Jun 1 2015

Fingerprint

Semantics
Metadata
Information retrieval
Fusion reactions

Keywords

  • Multimodal integration
  • Semantic concept retrieval
  • Sparse linear methods

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science Applications
  • Human-Computer Interaction
  • Information Systems

Cite this

Sparse Linear Integration of Content and Context Modalities for Semantic Concept Retrieval. / Zhu, Qiusha; Shyu, Mei-Ling.

In: IEEE Transactions on Emerging Topics in Computing, Vol. 3, No. 2, 01.06.2015, p. 152-160.

@article{4ac2325e88944590be4996920c7e3fb3,
title = "Sparse Linear Integration of Content and Context Modalities for Semantic Concept Retrieval",
abstract = "The semantic gap between low-level visual features and high-level semantics is a well-known challenge in content-based multimedia information retrieval. With the rapid popularization of social media, which allows users to assign tags to describe images and videos, attention is naturally drawn to take advantage of these metadata in order to bridge the semantic gap. This paper proposes a sparse linear integration (SLI) model that focuses on integrating visual content and its associated metadata, which are referred to as the content and the context modalities, respectively, for semantic concept retrieval. An optimization problem is formulated to approximate an instance using a sparse linear combination of other instances and minimize the difference between them. The prediction score of a concept for a test instance measures how well it can be reconstructed by the positive instances of that concept. Two benchmark image data sets and their associated tags are used to evaluate the SLI model. Experimental results show promising performance by comparing with the approaches based on a single modality and approaches based on popular fusion methods.",
keywords = "Multimodal integration, Semantic concept retrieval, Sparse linear methods",
author = "Qiusha Zhu and Mei-Ling Shyu",
year = "2015",
month = "6",
day = "1",
doi = "10.1109/TETC.2014.2384992",
language = "English (US)",
volume = "3",
pages = "152--160",
journal = "IEEE Transactions on Emerging Topics in Computing",
issn = "2168-6750",
publisher = "IEEE Computer Society",
number = "2",
}

TY - JOUR

T1 - Sparse Linear Integration of Content and Context Modalities for Semantic Concept Retrieval

AU - Zhu, Qiusha

AU - Shyu, Mei-Ling

PY - 2015/6/1

Y1 - 2015/6/1

N2 - The semantic gap between low-level visual features and high-level semantics is a well-known challenge in content-based multimedia information retrieval. With the rapid popularization of social media, which allows users to assign tags to describe images and videos, attention is naturally drawn to take advantage of these metadata in order to bridge the semantic gap. This paper proposes a sparse linear integration (SLI) model that focuses on integrating visual content and its associated metadata, which are referred to as the content and the context modalities, respectively, for semantic concept retrieval. An optimization problem is formulated to approximate an instance using a sparse linear combination of other instances and minimize the difference between them. The prediction score of a concept for a test instance measures how well it can be reconstructed by the positive instances of that concept. Two benchmark image data sets and their associated tags are used to evaluate the SLI model. Experimental results show promising performance by comparing with the approaches based on a single modality and approaches based on popular fusion methods.

KW - Multimodal integration

KW - Semantic concept retrieval

KW - Sparse linear methods

UR - http://www.scopus.com/inward/record.url?scp=84933040972&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84933040972&partnerID=8YFLogxK

U2 - 10.1109/TETC.2014.2384992

DO - 10.1109/TETC.2014.2384992

M3 - Article

AN - SCOPUS:84933040972

VL - 3

SP - 152

EP - 160

JO - IEEE Transactions on Emerging Topics in Computing

JF - IEEE Transactions on Emerging Topics in Computing

SN - 2168-6750

IS - 2

ER -