Florent Perronnin, Yan Liu
CVPR 2009 (Computer Vision and Pattern Recognition), Miami, Florida, USA, June 13-20, 2009
A state-of-the-art approach to measure the similarity of two images is to model each image by a continuous distribution, generally a Gaussian mixture model (GMM), and to compute a probabilistic similarity between the GMMs.
One limitation of traditional measures such as the Kullback-Leibler (KL) divergence and the Probability Product Kernel (PPK) is that they measure a global match of distributions.
This paper introduces a novel image representation. We propose to approximate an image, modeled by a GMM, as a convex combination of K reference image GMMs, and then to describe the image as the K-dimensional vector of mixture weights. The computed weights encode a similarity that favors local matches (i.e., matches of individual Gaussians) and is therefore fundamentally different from the KL or PPK. Although the computation of the mixture weights is a convex optimization problem, its direct optimization is difficult. We propose two approximate optimization algorithms: the first one based on traditional sampling methods, the second one based on a variational bound approximation of the true objective function.
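To make the sampling-based idea concrete, here is a minimal sketch and not the authors' implementation. It assumes the weights are chosen to maximize the expected log-likelihood of the image GMM under the convex combination of reference GMMs (equivalently, to minimize a KL divergence); the abstract does not state the objective, so this is an assumption. The expectation is replaced by an average over samples drawn from the image GMM, and the weights are fitted with standard EM updates for mixture weights. The 1-D GMMs and all names (`gmm_pdf`, `gmm_sample`, `estimate_mixture_weights`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def gauss_pdf(x, mean, std):
    """Density of a 1-D Gaussian, broadcast over x and components."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def gmm_pdf(x, weights, means, stds):
    """Density of a 1-D GMM evaluated at each point of x."""
    x = np.asarray(x, dtype=float)[:, None]
    return (weights * gauss_pdf(x, means, stds)).sum(axis=1)

def gmm_sample(rng, n, weights, means, stds):
    """Draw n samples from a 1-D GMM."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[comp], stds[comp])

def estimate_mixture_weights(samples, reference_gmms, n_iter=200):
    """EM for the weights of a convex combination of fixed reference GMMs.

    Assumed objective: (1/N) * sum_i log sum_k w_k p_k(x_i), with x_i drawn
    from the image GMM and w constrained to the probability simplex.
    """
    # P[i, k] = p_k(x_i): density of reference GMM k at sample i.
    P = np.stack([gmm_pdf(samples, *g) for g in reference_gmms], axis=1)
    K = P.shape[1]
    w = np.full(K, 1.0 / K)  # start from uniform weights
    for _ in range(n_iter):
        resp = w * P                               # unnormalized responsibilities
        resp /= resp.sum(axis=1, keepdims=True)    # E-step
        w = resp.mean(axis=0)                      # M-step (stays on the simplex)
    return w

# Three hypothetical reference "image" GMMs, each given as (weights, means, stds).
refs = [
    (np.array([0.5, 0.5]), np.array([-4.0, -2.0]), np.array([0.5, 0.5])),
    (np.array([0.5, 0.5]), np.array([0.0, 1.0]), np.array([0.5, 0.5])),
    (np.array([0.5, 0.5]), np.array([4.0, 6.0]), np.array([0.5, 0.5])),
]

# Toy query image GMM: an exact 0.7 / 0.3 blend of references 0 and 2.
query = (
    np.concatenate([0.7 * refs[0][0], 0.3 * refs[2][0]]),
    np.concatenate([refs[0][1], refs[2][1]]),
    np.concatenate([refs[0][2], refs[2][2]]),
)

rng = np.random.default_rng(0)
samples = gmm_sample(rng, 5000, *query)
w = estimate_mixture_weights(samples, refs)
print(w)  # K-dimensional descriptor of the query image
```

In this toy setup the recovered vector `w` plays the role of the K-dimensional image descriptor: references that share individual Gaussians with the query receive weight, while a reference with no locally matching Gaussian receives almost none.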
We apply this novel representation to the image categorization problem and compare its performance to traditional kernel-based methods. We demonstrate on the PASCAL VOC 2007