TY - GEN
T1 - An efficient deep residual-inception network for multimedia classification
AU - Pouyanfar, Samira
AU - Chen, Shu-Ching
AU - Shyu, Mei-Ling
N1 - Funding Information:
For Shu-Ching Chen, this research is partially supported by NSF CNS-1461926.
PY - 2017/8/28
Y1 - 2017/8/28
N2 - Deep learning has led to many breakthroughs in machine perception and data mining. Although deep learning has produced substantial advances in image recognition and natural language processing, very little work has been done in video analysis and semantic event detection. Very deep inception and residual networks yielded promising results in the 2014 and 2015 ILSVRC challenges, respectively. The question now is whether these architectures are applicable to, and computationally reasonable on, a variety of multimedia datasets. To answer this question, an efficient and lightweight deep convolutional network is proposed in this paper. This network is carefully designed to reduce the depth and width of the state-of-the-art networks while maintaining high performance. The proposed deep network combines a traditional convolutional architecture with residual connections and very light inception modules. Experimental results demonstrate that the proposed network not only accelerates training but also improves performance on different multimedia classification tasks.
AB - Deep learning has led to many breakthroughs in machine perception and data mining. Although deep learning has produced substantial advances in image recognition and natural language processing, very little work has been done in video analysis and semantic event detection. Very deep inception and residual networks yielded promising results in the 2014 and 2015 ILSVRC challenges, respectively. The question now is whether these architectures are applicable to, and computationally reasonable on, a variety of multimedia datasets. To answer this question, an efficient and lightweight deep convolutional network is proposed in this paper. This network is carefully designed to reduce the depth and width of the state-of-the-art networks while maintaining high performance. The proposed deep network combines a traditional convolutional architecture with residual connections and very light inception modules. Experimental results demonstrate that the proposed network not only accelerates training but also improves performance on different multimedia classification tasks.
KW - Convolutional neural network
KW - Deep learning
KW - Multimedia classification
KW - Residual-Inception
KW - Video event detection
UR - http://www.scopus.com/inward/record.url?scp=85030239495&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85030239495&partnerID=8YFLogxK
U2 - 10.1109/ICME.2017.8019447
DO - 10.1109/ICME.2017.8019447
M3 - Conference contribution
AN - SCOPUS:85030239495
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
SP - 373
EP - 378
BT - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
PB - IEEE Computer Society
T2 - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Y2 - 10 July 2017 through 14 July 2017
ER -