Sparse Linear Integration of Content and Context Modalities for Semantic Concept Retrieval

Qiusha Zhu, Mei Ling Shyu

Research output: Contribution to journal › Article › peer-review

14 Scopus citations


The semantic gap between low-level visual features and high-level semantics is a well-known challenge in content-based multimedia information retrieval. With the rapid popularization of social media, which allows users to assign tags describing images and videos, it is natural to take advantage of this metadata to bridge the semantic gap. This paper proposes a sparse linear integration (SLI) model that integrates visual content and its associated metadata, referred to as the content and context modalities, respectively, for semantic concept retrieval. An optimization problem is formulated to approximate an instance by a sparse linear combination of other instances while minimizing the difference between them. The prediction score of a concept for a test instance measures how well that instance can be reconstructed by the positive instances of the concept. Two benchmark image data sets and their associated tags are used to evaluate the SLI model. Experimental results show promising performance compared with approaches based on a single modality and with popular fusion methods.
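The reconstruction-based scoring idea in the abstract can be illustrated with a small sketch. This is not the paper's exact formulation; it is a minimal, hypothetical example that approximates a test instance as a sparse linear combination of training instances (here via ISTA, an iterative soft-thresholding solver for an L1-penalized least-squares problem) and scores a concept by how much of the reconstruction weight falls on that concept's positive instances. All function names and parameters (`lam`, `n_iter`) are illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, lam):
    # Elementwise shrinkage operator used by ISTA for the L1 penalty.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sparse_coeffs(D, y, lam=0.1, n_iter=500):
    """Approximate y as D @ a with an L1 penalty on a (ISTA sketch).

    D : (d, n) matrix whose columns are training-instance feature vectors
        (e.g., concatenated content and context features).
    y : (d,) feature vector of the test instance.
    Returns a sparse coefficient vector a of shape (n,).
    """
    # Step size from the Lipschitz constant of the least-squares gradient.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)          # gradient of 0.5 * ||D a - y||^2
        a = soft_threshold(a - grad / L, lam / L)
    return a

def concept_score(a, labels):
    """Fraction of reconstruction weight carried by positive instances."""
    pos = np.abs(a[labels == 1]).sum()
    tot = np.abs(a).sum() + 1e-12          # guard against an all-zero a
    return pos / tot
```

A test instance that is well reconstructed by a concept's positive instances receives a score near 1, while one reconstructed mostly from negatives scores near 0; ranking test instances by this score yields the retrieval result.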

Original language: English (US)
Pages (from-to): 152-160
Number of pages: 9
Journal: IEEE Transactions on Emerging Topics in Computing
Issue number: 2
State: Published - Jun 1 2015


Keywords

  • Multimodal integration
  • Semantic concept retrieval
  • Sparse linear methods

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications

