José A. Rodriguez, Diane Larlus
ICCV 2013, Sydney, Australia, December 3-6, 2013
We tackle the detection of prominent objects in images
as a retrieval task: given a global image descriptor, we find
the most similar images in an annotated dataset, and transfer
the object bounding boxes. We refer to this approach
as data-driven detection (DDD), which is an alternative to
sliding windows. Previous works have used similar notions
but with task-independent similarities and representations,
i.e. they were not tailored to the end-goal of localization.
This article makes two contributions: (i) a metric learning
algorithm and (ii) a representation of images as object
probability maps, both of which are optimized for detection. We
show experimentally that these two contributions are crucial
to DDD, do not require costly additional operations,
and in some cases yield results comparable to or better than
state-of-the-art detectors despite conceptual simplicity and
increased speed. As an application of prominent object
detection, we improve fine-grained categorization by pre-cropping images with the proposed approach.
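
The retrieval-and-transfer idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `transfer_boxes`, the use of cosine similarity (the paper learns a task-specific metric instead), the averaging of the retrieved boxes, and the toy descriptors are all assumptions made for illustration.

```python
import numpy as np

def transfer_boxes(query, descriptors, boxes, k=3):
    """Predict a box for `query` from its k most similar annotated images.

    query:       (D,) global descriptor of the test image
    descriptors: (N, D) global descriptors of the annotated images
    boxes:       (N, 4) annotated boxes as (x1, y1, x2, y2), normalized to [0, 1]
    """
    # L2-normalize so that the dot product equals cosine similarity.
    # Assumption: cosine similarity stands in for the learned metric.
    q = query / np.linalg.norm(query)
    d = descriptors / np.linalg.norm(descriptors, axis=1, keepdims=True)
    sims = d @ q

    # Indices of the k most similar annotated images, best first.
    top = np.argsort(-sims)[:k]

    # Assumption: the retrieved boxes are combined by plain averaging.
    return boxes[top].mean(axis=0), top

# Toy annotated set: three images with 2-D descriptors and one box each.
descriptors = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
boxes = np.array([
    [0.1, 0.1, 0.5, 0.5],
    [0.2, 0.2, 0.6, 0.6],
    [0.5, 0.5, 0.9, 0.9],
])

query = np.array([1.0, 0.05])
pred_box, neighbors = transfer_boxes(query, descriptors, boxes, k=2)
```

In this toy example the query is closest to the first two annotated images, so the predicted box is the average of their boxes. No window scanning is involved: the cost is one similarity computation per annotated image.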