SACIC: A semantics-aware convolutional image captioner using multi-level pervasive attention
Sandeep Narayan Parameswaran,
Published in Springer
2019
Volume: 11955 LNCS
   
Pages: 64 - 76
Abstract
Attention mechanisms, alongside encoder-decoder architectures, have become integral components for solving the image captioning problem. The attention mechanism recombines an encoding of the image depending on the state of the decoder to generate the caption sequence. The decoder is predominantly recurrent in nature. In contrast, we propose a novel network possessing attention-like properties that are pervasive through its layers, by utilizing a convolutional neural network (CNN) to refine and combine representations at multiple levels of the architecture for captioning images. We also enable the model to use explicit higher-level semantic information obtained by performing panoptic segmentation on the image. The attention capability of the model is visually demonstrated, and an experimental evaluation is shown on the MS-COCO dataset. We show that the approach is more robust and efficient, and yields better performance than the state-of-the-art architectures for image captioning. © Springer Nature Switzerland AG 2019.
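For readers unfamiliar with the conventional mechanism the abstract contrasts against, the following is a minimal sketch of generic dot-product soft attention, in which image-region vectors are recombined according to the current decoder state. It is not the SACIC architecture or the authors' code; the function name `soft_attention`, the toy vectors, and the choice of dot-product scoring are all assumptions made for illustration.

```python
import math

def soft_attention(regions, state):
    """Recombine image-region vectors, weighted by similarity to the decoder state.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Dot-product score between each region vector and the decoder state.
    scores = [sum(r * s for r, s in zip(region, state)) for region in regions]
    # Numerically stable softmax turns the scores into weights that sum to 1.
    top = max(scores)
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The context vector is the weighted sum of the region vectors.
    dim = len(regions[0])
    context = [sum(w * region[d] for w, region in zip(weights, regions))
               for d in range(dim)]
    return weights, context

# Toy example: three 2-dimensional "image regions" and one decoder state.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = soft_attention(regions, state=[2.0, 0.0])
```

In a recurrent captioner this step would typically be repeated at every decoding step with an updated state, whereas the abstract describes attention-like behaviour spread through the layers of a convolutional network.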
About the journal
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer
ISSN: 0302-9743
Open Access: No
Concepts (18)
  •  Computer vision
  •  Convolution
  •  Decoding
  •  Deep learning
  •  Deep neural networks
  •  Multilayer neural networks
  •  Network architecture
  •  Semantics
  •  Signal encoding
  •  Attention mechanisms
  •  Convolutional neural network
  •  Encoder-decoder architecture
  •  Experimental evaluation
  •  Image captioning
  •  Integral components
  •  Semantic information
  •  State of the art
  •  Image segmentation