Sparse Linear Integration of Content and Context Modalities for Semantic Concept Retrieval

Qiusha Zhu, Mei-Ling Shyu

Research output: Contribution to journal › Article

14 Scopus citations

Abstract

The semantic gap between low-level visual features and high-level semantics is a well-known challenge in content-based multimedia information retrieval. With the rapid popularization of social media, which allows users to assign tags to describe images and videos, attention is naturally drawn to taking advantage of these metadata in order to bridge the semantic gap. This paper proposes a sparse linear integration (SLI) model that focuses on integrating visual content and its associated metadata, referred to as the content and the context modalities, respectively, for semantic concept retrieval. An optimization problem is formulated to approximate an instance using a sparse linear combination of other instances, minimizing the difference between them. The prediction score of a concept for a test instance measures how well the instance can be reconstructed by the positive instances of that concept. Two benchmark image data sets and their associated tags are used to evaluate the SLI model. Experimental results show promising performance compared with approaches based on a single modality and approaches based on popular fusion methods.
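The core idea in the abstract, reconstructing a test instance as a sparse linear combination of training instances and scoring a concept by the weight placed on its positive instances, can be sketched as follows. This is an illustrative stand-in using a standard l1-regularized least-squares solver (ISTA), not the paper's exact SLI formulation; all function names and parameters here are hypothetical.

```python
import numpy as np

def sparse_reconstruction_weights(x, D, lam=0.1, n_iter=500):
    """Approximate x as a sparse linear combination D @ w of the
    training instances (columns of D), by minimizing
    0.5 * ||x - D w||^2 + lam * ||w||_1 with ISTA.
    Illustrative sketch, not the paper's exact optimization."""
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient.
    L = np.linalg.norm(D, 2) ** 2
    w = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ w - x)        # gradient of the squared-error term
        w = w - grad / L                # gradient step
        # Soft-thresholding enforces sparsity in the combination weights.
        w = np.sign(w) * np.maximum(np.abs(w) - lam / L, 0.0)
    return w

def concept_score(x, instances, labels, lam=0.1):
    """Score a concept for test instance x as the total reconstruction
    weight assigned to that concept's positive training instances."""
    D = instances.T                     # columns are training instances
    w = sparse_reconstruction_weights(x, D, lam)
    return w[labels == 1].sum()
```

In this sketch, a test instance that is well reconstructed by a concept's positive instances receives a high score for that concept; in the actual SLI model, content and context modalities are integrated within the optimization rather than handled as a single feature matrix.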

Original language: English (US)
Pages (from-to): 152-160
Number of pages: 9
Journal: IEEE Transactions on Emerging Topics in Computing
Volume: 3
Issue number: 2
DOIs
State: Published - Jun 1 2015

Keywords

  • Multimodal integration
  • Semantic concept retrieval
  • Sparse linear methods

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science Applications
  • Human-Computer Interaction
  • Information Systems
