An efficient deep residual-inception network for multimedia classification

Samira Pouyanfar, Shu Ching Chen, Mei-Ling Shyu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

Deep learning has led to many breakthroughs in machine perception and data mining. Although deep learning has advanced substantially in image recognition and natural language processing, very little work has been done in video analysis and semantic event detection. Very deep inception and residual networks yielded promising results in the 2014 and 2015 ILSVRC challenges, respectively. The question now is whether these architectures are applicable to, and computationally reasonable for, a variety of multimedia datasets. To answer this question, an efficient and lightweight deep convolutional network is proposed in this paper. The network is carefully designed to reduce the depth and width of state-of-the-art networks while maintaining high performance. It combines the traditional convolutional architecture with residual connections and very light inception modules. Experimental results demonstrate that the proposed network not only accelerates the training procedure but also improves performance in different multimedia classification tasks.
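The abstract does not give layer sizes, so the following is only a rough sketch of why a "light" inception module can be cheaper than a plain convolution while still fitting a residual connection. All channel counts (64, 32, 16) and the helper names `conv_params`, `light_inception_params`, and `residual_add` are hypothetical and are not taken from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a single k x k convolution layer."""
    return k * k * c_in * c_out + c_out


def light_inception_params(c_in, c_1x1, c_reduce, c_3x3):
    """Hypothetical two-branch module: a 1x1 branch, and a 1x1 reduction
    followed by a 3x3 convolution. Branch outputs are concatenated."""
    branch_a = conv_params(c_in, c_1x1, 1)
    branch_b = conv_params(c_in, c_reduce, 1) + conv_params(c_reduce, c_3x3, 3)
    return branch_a + branch_b


def residual_add(x, fx):
    """Identity shortcut y = x + F(x), applied element-wise."""
    return [a + b for a, b in zip(x, fx)]


# Assumed sizes: 64 input channels and 64 output channels in both cases,
# so the module output can be added back to its input.
plain = conv_params(64, 64, 3)
light = light_inception_params(64, 32, 16, 32)

print("plain 3x3 conv parameters:", plain)
print("light inception parameters:", light)
print("residual sum:", residual_add([1.0, 2.0], [0.5, -0.5]))
```

With these assumed sizes the two-branch module needs roughly a fifth of the parameters of the plain 3x3 layer, which is the general kind of width reduction the abstract describes; the actual savings in the paper depend on its own layer configuration.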

Original language: English (US)
Title of host publication: 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Publisher: IEEE Computer Society
Pages: 373-378
Number of pages: 6
ISBN (Electronic): 9781509060672
DOI: https://doi.org/10.1109/ICME.2017.8019447
State: Published - Aug 28 2017
Event: 2017 IEEE International Conference on Multimedia and Expo, ICME 2017 - Hong Kong, Hong Kong
Duration: Jul 10 2017 to Jul 14 2017

Other

Other: 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Country: Hong Kong
City: Hong Kong
Period: 7/10/17 to 7/14/17

Fingerprint

Image recognition
Data mining
Semantics
Processing
Deep learning

Keywords

  • Convolutional neural network
  • Deep learning
  • Multimedia classification
  • Residual-Inception
  • Video event detection

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

Cite this

Pouyanfar, S., Chen, S. C., & Shyu, M-L. (2017). An efficient deep residual-inception network for multimedia classification. In 2017 IEEE International Conference on Multimedia and Expo, ICME 2017 (pp. 373-378). [8019447] IEEE Computer Society. https://doi.org/10.1109/ICME.2017.8019447
