Content Based Image Retrieval (CBIR) systems retrieve relevant images from a database based on the content of the query. Most CBIR systems take a query image as input and retrieve similar images from a gallery, based on global features (such as texture, shape, and color) extracted from the image. An image database can be queried in several ways, including by text, image, and sketch. However, traditional methodologies support only one of these domains at a time. There is a need to bridge the gap between different domains (sketch and image) to enable a multi-modal CBIR system. In this work, we propose a novel bimodal query-based retrieval framework, which can take inputs from both the sketch and image domains. The proposed framework aims to reduce the domain gap by learning a mapping function using Generative Adversarial Networks (GANs) and supervised deep domain adaptation techniques. Extensive experimentation and comparison with several baselines on two popular sketch datasets (Sketchy and TU-Berlin) show the effectiveness of the proposed framework. © 2018 IEEE.
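
The abstract above mentions learning a mapping function between the sketch and image domains with GANs, and the paper title names Cycle-GAN. As a rough illustration only, the sketch below computes a cycle-consistency term of the kind used in cycle-consistent adversarial training: a sample translated to the other domain and back should reconstruct itself. The function names, the toy linear mappings, and the 2-D feature vectors are all assumptions made for this example; this is not the authors' implementation, and it omits the adversarial and supervised domain adaptation losses.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(sketches, images, g, f):
    """Average L1 reconstruction error over both translation cycles.

    g: hypothetical sketch -> image mapping
    f: hypothetical image -> sketch mapping
    """
    # sketch -> image -> sketch should give back the original sketch
    forward = sum(l1_distance(f(g(s)), s) for s in sketches) / len(sketches)
    # image -> sketch -> image should give back the original image
    backward = sum(l1_distance(g(f(i)), i) for i in images) / len(images)
    return forward + backward


# Toy mappings on 2-D feature vectors (assumptions for illustration only).
def to_image(v):
    return [2.0 * x for x in v]


def to_sketch(v):
    return [0.5 * x for x in v]


sketches = [[1.0, 2.0], [3.0, 4.0]]
images = [[2.0, 4.0], [6.0, 8.0]]

# The two toy mappings are exact inverses, so the loss is zero.
loss = cycle_consistency_loss(sketches, images, to_image, to_sketch)
```

In a trained model, `g` and `f` would be neural generators and this term would be minimized jointly with the other objectives, so that a sketch query can be mapped into the image domain before retrieval.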

Girraj Pahariya

Content based retrieval

Query processing

Textures

ADVERSARIAL NETWORKS

Content based image retrieval

CONTENT BASED IMAGE RETRIEVAL (CBIR) SYSTEM

Different domains

domain adaptation

GLOBAL FEATURE

MAPPING FUNCTIONS

RETRIEVAL FRAMEWORKS

Search engines


IIT Madras

Bi-Modal Content Based Image Retrieval using Multi-class Cycle-GAN

2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018

Most approaches for object recognition (OR) use a single feature descriptor to identify the object class from a query image. However, particularly under variations in appearance, scale and illumination, the performance of a feature varies not only with the class but also with the query sample. We propose a biologically inspired framework for OR using concepts from feature integration theory (FIT). Our model uses a hierarchy of visual features for OR. The key components of the proposed approach are: (i) SALCUT, an unsupervised segmentation method for salient object localization; (ii) optimal feature selection, which identifies appropriate features for each class, at each level of the feature hierarchy, for a test instance; (iii) feature combination, which happens at higher levels of the feature hierarchy if the features selected at the lower level are unable to classify a test instance. Our method outperforms several state-of-the-art techniques when validated on two real-world datasets. © 2014 IEEE.
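
The fallback behaviour described in the abstract above, where combined features are tried only when lower-level features cannot classify a test instance, can be pictured with the minimal sketch below. The confidence threshold, the two toy classifiers, and the dictionary-based samples are assumptions introduced for illustration; the paper's actual feature selection and combination rules are not given in this record.

```python
def classify_hierarchically(sample, levels, threshold=0.8):
    """Try each level in order and stop at the first confident prediction.

    levels: list of callables, each returning (label, confidence).
    Returns (label, index of the level that produced it).
    """
    if not levels:
        raise ValueError("at least one level is required")
    label = None
    for depth, classifier in enumerate(levels):
        label, confidence = classifier(sample)
        if confidence >= threshold:
            return label, depth
    # No level was confident: fall back to the last level's answer.
    return label, len(levels) - 1


# Hypothetical level 0: a single feature (color) only.
def color_only(sample):
    if sample["color"] == "red":
        return "apple", 0.55  # red is ambiguous, so confidence is low
    return "banana", 0.90


# Hypothetical level 1: a combination of two features (color and shape).
def color_and_shape(sample):
    if sample["color"] == "red" and sample["shape"] == "round":
        return "apple", 0.95
    return "unknown", 0.50


levels = [color_only, color_and_shape]
easy = classify_hierarchically({"color": "yellow", "shape": "long"}, levels)
hard = classify_hierarchically({"color": "red", "shape": "round"}, levels)
```

Here the yellow sample is resolved at level 0, while the red sample is passed up to the combined-feature level because the single-feature classifier is not confident enough.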

2014 IEEE International Conference on Image Processing, ICIP 2014

Hierarchy of visual features for object recognition

Content Based Image Retrieval (CBIR) techniques retrieve similar digital images from a large database. As the user often does not provide any clue (indication) of the region of interest in a query image, most CBIR methods rely on a representation of the global content of the image. The desired content in an image is often localized (e.g. a car appearing salient in a street) rather than holistic, which calls for an object-centric CBIR. We propose a biologically inspired framework, WOW ("What" Object is "Where"), for this purpose. The design of the WOW framework is motivated by the cognitive model of human visual perception and feature integration theory (FIT). The key contributions of the proposed approach are: (i) a feedback mechanism between the Recognition ("What") and Localization ("Where") modules (both supervised), for a cohesive decision based on mutual consensus; (ii) a hierarchy of visual features (based on FIT) for an efficient recognition task. Integrating information from the two channels ("What" and "Where") in an iterative feedback mechanism helps to filter erroneous content in the outputs of the individual modules. Finally, using a similarity criterion based on HOG features (spatially localized by WOW) for matching, our system effectively retrieves a set of rank-ordered samples from the gallery. Experiments on various real-life datasets (including PASCAL) demonstrate the superior performance of the proposed method. Copyright 2014 ACM.
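
The final retrieval step described above, matching descriptors and returning a rank-ordered set from the gallery, might look roughly like the following. Cosine similarity is an assumption chosen for this sketch (the abstract only says the criterion is based on HOG features), and the short vectors and gallery names are made-up stand-ins for precomputed descriptors of the localized object regions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_gallery(query_descriptor, gallery, top_k=3):
    """Return the names of the top_k gallery items, most similar first."""
    scored = [
        (cosine_similarity(query_descriptor, descriptor), name)
        for name, descriptor in gallery.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical precomputed descriptors (stand-ins for HOG vectors).
gallery = {
    "car_01": [0.9, 0.1, 0.0],
    "car_02": [0.8, 0.2, 0.1],
    "tree_01": [0.0, 0.2, 0.9],
}
ranking = rank_gallery([1.0, 0.1, 0.0], gallery, top_k=2)
```

With these toy descriptors, the two car entries rank above the tree entry for a car-like query.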

ACM International Conference Proceeding Series

Revealing what to extract from where, for object-centric content based image retrieval (CBIR)

2017 IEEE International Conference on Computer Vision (ICCV)

Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks

Unified Deep Supervised Domain Adaptation and Generalization

StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks

Journal	2018 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2018
Publisher	Institute of Electrical and Electronics Engineers Inc.
Open Access	No