Multigap: Multi-pooled inception network with text augmentation for aesthetic prediction of photographs

Citation

Hii, Yong Lian and See, John and Kairanbay, Magzhan and Wong, Lai Kuan (2017) Multigap: Multi-pooled inception network with text augmentation for aesthetic prediction of photographs. In: 2017 IEEE International Conference on Image Processing (ICIP), 17-20 Sept. 2017, Beijing, China.

Text
22.pdf - Published Version
Restricted to Repository staff only

Abstract

With the advent of deep learning, convolutional neural networks have solved many imaging problems to a large extent. However, it remains to be seen if the image “bottleneck” can be unplugged by harnessing complementary sources of data. In this paper, we present a new approach to image aesthetic evaluation that learns both visual and textual features simultaneously. Our network extracts visual features by appending global average pooling blocks on multiple inception modules (MultiGAP), while textual features from associated user comments are learned from a recurrent neural network. Experimental results show that the proposed method is capable of achieving state-of-the-art performance on the AVA / AVAComments datasets. We also demonstrate the capability of our approach in visualizing aesthetic activations.
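
The multi-pooling idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which inception modules are tapped, what their output shapes are, or how the visual and textual features are combined, so the function names (`global_average_pool`, `multi_gap`) and the toy feature maps below are assumptions made purely for illustration. The sketch shows only the general technique: average each channel of a module's output over its spatial grid, then concatenate the pooled vectors from several modules into one visual descriptor.

```python
from typing import List

# A feature map is indexed as [channel][row][column].
FeatureMap = List[List[List[float]]]


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Reduce each channel's H x W grid to its mean value."""
    pooled = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def multi_gap(feature_maps: List[FeatureMap]) -> List[float]:
    """Pool the output of every module and concatenate the results."""
    descriptor: List[float] = []
    for fmap in feature_maps:
        descriptor.extend(global_average_pool(fmap))
    return descriptor


# Toy stand-ins for the outputs of two inception modules (hypothetical shapes).
module_a = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]  # 2 channels, 2x2
module_b = [[[4.0]]]  # 1 channel, 1x1

visual_features = multi_gap([module_a, module_b])
print(visual_features)
```

Because each module contributes one value per channel regardless of its spatial resolution, modules at different depths can be combined without resizing their outputs.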

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: learning (artificial intelligence)
Subjects: Q Science > QH Natural history
Divisions: Faculty of Computing and Informatics (FCI)
Depositing User: Ms Rosnani Abd Wahab
Date Deposited: 30 Mar 2021 19:08
Last Modified: 30 Mar 2021 19:08
URI: http://shdl.mmu.edu.my/id/eprint/7568