Florent Perronnin, Jorge Sanchez
IEEE Computer Vision and Pattern Recognition (CVPR), Colorado Springs, USA, June 21-23, 2011
We consider image classification on a large scale, i.e., when millions of images are involved. First, we study image classification accuracy as a function of the signature dimensionality and the training set size. We show experimentally that the larger the training set, the more high-dimensional signatures make a difference. Second, we explore data compression on very large signatures (on the order of 10^5 dimensions). We show how the gain in storage can be traded against a loss in accuracy and/or an increase in CPU cost. We experiment with two lossy compression strategies: a dimensionality reduction technique known as the hash kernel and an encoding technique based on product quantizers. We report results on very large databases showing that we can reduce the storage of our signatures by a factor of 64 to 128 with little loss in accuracy.
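
The two compression strategies named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`hash_kernel`, `train_pq`, `pq_encode`, `pq_decode`) are hypothetical, seeded random arrays stand in for real hash functions, the codebooks are learned with a plain Lloyd-style k-means, and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def hash_kernel(x, out_dim, seed=0):
    """Reduce a 1-D signature to out_dim dimensions with a signed hash.

    Assumption: seeded random arrays play the role of the bucket hash h(i)
    and the sign hash xi(i); a real system would use actual hash functions.
    """
    rng = np.random.default_rng(seed)
    bucket = rng.integers(0, out_dim, size=x.shape[0])  # h(i) in [0, out_dim)
    sign = rng.choice([-1.0, 1.0], size=x.shape[0])     # xi(i) in {-1, +1}
    # out[j] = sum of sign[i] * x[i] over all i with bucket[i] == j
    return np.bincount(bucket, weights=sign * x, minlength=out_dim)


def train_pq(X, n_sub, n_centroids=256, n_iter=10, seed=0):
    """Learn one k-means codebook per sub-vector (plain Lloyd iterations).

    Assumes X.shape[1] is divisible by n_sub, X.shape[0] >= n_centroids,
    and n_centroids <= 256 so that each code fits in one byte.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    sub = d // n_sub
    codebooks = []
    for s in range(n_sub):
        block = X[:, s * sub:(s + 1) * sub]
        centroids = block[rng.choice(n, n_centroids, replace=False)].copy()
        for _ in range(n_iter):
            dist = ((block[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
            assign = dist.argmin(1)
            for k in range(n_centroids):
                members = block[assign == k]
                if len(members):  # keep the old centroid if a cell is empty
                    centroids[k] = members.mean(0)
        codebooks.append(centroids)
    return np.stack(codebooks)  # shape: (n_sub, n_centroids, sub)


def pq_encode(X, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    n_sub, _, sub = codebooks.shape
    codes = np.empty((X.shape[0], n_sub), dtype=np.uint8)
    for s in range(n_sub):
        block = X[:, s * sub:(s + 1) * sub]
        dist = ((block[:, None, :] - codebooks[s][None, :, :]) ** 2).sum(-1)
        codes[:, s] = dist.argmin(1)
    return codes


def pq_decode(codes, codebooks):
    """Lossy reconstruction: concatenate the selected centroids."""
    parts = [codebooks[s][codes[:, s]] for s in range(codebooks.shape[0])]
    return np.concatenate(parts, axis=1)
```

In this sketch the storage gain of the product quantizer is the ratio between the raw signature size and the code size: a float32 signature stored as one byte per 8-dimensional sub-vector shrinks by a factor of 32, and longer sub-vectors or fewer bits per code would be needed to reach larger factors. The hash kernel instead trades dimensionality directly, at the cost of collisions between input dimensions.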