Domain-specific semantics guided approach to video captioning
, Hemalatha M.
Published in Institute of Electrical and Electronics Engineers Inc.
2020
Pages: 1576 - 1585
Abstract
In video captioning, the description of a video usually depends on the domain to which the video belongs. Videos typically belong to a wide range of domains such as sports, music, news, and cooking, and in many cases a video can be associated with more than one domain. In this paper, we propose an approach to video captioning that uses domain-specific decoders. We build a domain classifier to estimate the probabilities of a video belonging to different domains. For each video, we identify the top-k domains based on the estimated probabilities. Each video in the training data set is then used to train the domain-specific decoders for the top-k labels obtained from the domain classifier. The domain-specific decoders use domain-specific semantic tags to generate captions. The proposed approach uses Temporal VLAD to preprocess the features extracted from 2D-CNN and 3D-CNN models. The preprocessed features provide a better feature representation of the videos. The effectiveness of the proposed approach is demonstrated through experimental studies on the Microsoft Video Description (MSVD) corpus and the MSR-VTT dataset. © 2020 IEEE.
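The top-k sharing step described in the abstract can be illustrated with a minimal sketch. It assumes the domain classifier returns one probability per domain for each video. The function names, video ids, domain labels, and probability values below are hypothetical and are not taken from the paper, and the decoders themselves are not modelled.

```python
from collections import defaultdict


def top_k_domains(probs, k):
    """Return the k domain names with the highest estimated probability."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def assign_to_decoders(classifier_outputs, k):
    """Map each domain to the ids of the videos that train its decoder.

    A video is shared among the decoders of all of its top-k domains.
    """
    training_sets = defaultdict(list)
    for video_id, probs in classifier_outputs.items():
        for domain in top_k_domains(probs, k):
            training_sets[domain].append(video_id)
    return dict(training_sets)


# Hypothetical classifier outputs (illustrative values only).
classifier_outputs = {
    "vid_001": {"sports": 0.70, "news": 0.20, "music": 0.06, "cooking": 0.04},
    "vid_002": {"sports": 0.05, "news": 0.10, "music": 0.60, "cooking": 0.25},
}

training_sets = assign_to_decoders(classifier_outputs, k=2)
for domain, videos in training_sets.items():
    print(domain, videos)
```

With k=2, each video contributes to the training sets of two decoders, so a video that spans, say, sports and news informs both domain-specific decoders.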