Albert Gordo, José A. Rodriguez, Florent Perronnin, Ernest Valveny
IEEE Conference on Computer Vision and Pattern Recognition, Providence, Rhode Island, USA, June 18-20, 2012.
In this article, we focus on the problem of large-scale
instance-level image retrieval. For efficiency reasons, it is
common to represent an image by a fixed-length descriptor
which is subsequently encoded into a small number of bits.
We note that most encoding techniques include an unsupervised
dimensionality reduction step. Our goal in this work
is to learn a better subspace in a supervised manner. In particular,
we raise the following question: “can category-level
labels be used to learn such a subspace?”
To answer this question, we experiment with four learning
techniques: the first one is based on a metric learning
framework, the second one on attribute representations,
the third one on Canonical Correlation Analysis (CCA) and
the fourth one on Joint Subspace and Classifier Learning
(JSCL). While the first three approaches have been applied
in the past to the image retrieval problem, we believe we
are the first to show the usefulness of JSCL in this context.
In our experiments, we use ImageNet as a source of
category-level labels and report retrieval results on two
standard datasets: INRIA Holidays and the University of
Kentucky benchmark. Our experimental study shows that
metric learning and attributes do not lead to any significant
improvement in retrieval accuracy, as opposed to CCA and
JSCL. As an example, we report on Holidays an increase in
accuracy from 39.3% to 48.6% with 32-dimensional representations.
Overall, JSCL is shown to yield the best results.