Guillaume Bouchard, Bill Triggs
Appeared in Computer Vision and Pattern Recognition, Volume 1, pp. 710-715.
We propose a generative model that codes the geometry and appearance of generic visual object categories as a loose hierarchy of parts, with probabilistic spatial relations linking parts to subparts, soft assignment of subparts to parts, and scale-invariant key-point based local features at the lowest level of the hierarchy. The method is designed to efficiently handle categories containing hundreds of redundant local features, such as those returned by current key-point detectors. This robustness allows it to outperform constellation-style models despite their stronger spatial models. The model is initialized by robust bottom-up voting over location-scale pyramids, and optimized by Expectation Maximization. Training is rapid, and objects do not need to be marked in the training images. Experiments on several popular datasets show the method's ability to capture complex natural object classes.
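The abstract mentions soft assignment of subparts to parts, optimized by Expectation Maximization. As a minimal sketch of that general idea (not the paper's actual hierarchical model), the following toy example runs EM on a two-component Gaussian mixture: the E-step softly assigns each observation to each component, and the M-step re-estimates the component parameters from those soft assignments.

```python
import numpy as np

# Toy illustration only: EM with soft assignment of 1-D points
# ("subparts") to two Gaussian components ("parts"). The data,
# initialization, and iteration count are all assumptions.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 100),
                       rng.normal(5.0, 1.0, 100)])

mu = np.array([-1.0, 1.0])      # assumed initial means
sigma = np.array([1.0, 1.0])    # assumed initial std deviations
pi = np.array([0.5, 0.5])       # assumed initial mixing weights

for _ in range(50):
    # E-step: responsibilities, i.e. soft assignment of each point to each part
    dens = (pi / (sigma * np.sqrt(2 * np.pi))
            * np.exp(-0.5 * ((data[:, None] - mu) / sigma) ** 2))
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate part parameters from the soft assignments
    nk = resp.sum(axis=0)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(data)

print(np.round(np.sort(mu), 1))  # means recovered near the true cluster centres
```

The paper's model replaces these 1-D Gaussians with a hierarchy of spatial relations over location and scale, but the alternation between soft assignment and parameter re-estimation is the same EM pattern.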