Citation
Hii, Yong Lian and See, John and Kairanbay, Magzhan and Wong, Lai Kuan (2017) MultiGAP: Multi-pooled inception network with text augmentation for aesthetic prediction of photographs. In: 2017 IEEE International Conference on Image Processing (ICIP), 17-20 Sept. 2017, Beijing, China.
Text
22.pdf - Published Version (1MB). Restricted to repository staff only.
Abstract
With the advent of deep learning, convolutional neural networks have solved many imaging problems to a large extent. However, it remains to be seen if the image “bottleneck” can be unplugged by harnessing complementary sources of data. In this paper, we present a new approach to image aesthetic evaluation that learns both visual and textual features simultaneously. Our network extracts visual features by appending global average pooling blocks on multiple inception modules (MultiGAP), while textual features from associated user comments are learned from a recurrent neural network. Experimental results show that the proposed method is capable of achieving state-of-the-art performance on the AVA / AVAComments datasets. We also demonstrate the capability of our approach in visualizing aesthetic activations.
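The multi-pooling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, feature maps are plain nested lists (channels x height x width) rather than real inception-module outputs, the textual vector stands in for whatever the recurrent network would produce, and fusing the two modalities by concatenation is an assumption, since the abstract does not say how they are combined.

```python
from typing import List

# A feature map is indexed as [channel][row][column].
FeatureMap = List[List[List[float]]]


def global_average_pool(feature_map: FeatureMap) -> List[float]:
    """Reduce each channel's H x W grid to its mean, giving one value per channel."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def multi_gap_features(module_outputs: List[FeatureMap]) -> List[float]:
    """Pool the output of every module and concatenate the pooled vectors."""
    features: List[float] = []
    for feature_map in module_outputs:
        features.extend(global_average_pool(feature_map))
    return features


def fuse(visual: List[float], textual: List[float]) -> List[float]:
    """Join visual and textual features by concatenation (assumed fusion scheme)."""
    return visual + textual
```

Because each module is pooled to one value per channel, modules with different spatial resolutions can still be combined into a single fixed-length visual vector.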
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Uncontrolled Keywords: | learning (artificial intelligence) |
Subjects: | Q Science > QH Natural history |
Divisions: | Faculty of Computing and Informatics (FCI) |
Depositing User: | Ms Rosnani Abd Wahab |
Date Deposited: | 30 Mar 2021 19:08 |
Last Modified: | 30 Mar 2021 19:08 |
URI: | http://shdl.mmu.edu.my/id/eprint/7568 |