David Novotny, Diane Larlus, Andrea Vedaldi
ECCV, Amsterdam, The Netherlands, October 11-14, 2016.
While recent research in image understanding has often focused on recognizing more types of objects, understanding more about the objects is just as important. Recognizing object parts and attributes has been studied extensively before, yet learning a large space of such concepts remains elusive due to the high cost of providing detailed object annotations for supervision. The key contribution of this paper is an algorithm to learn the nameable parts of objects automatically, from images obtained by querying Web search engines. The key challenge is the high level of noise in the annotations; to address it, we propose a new unified embedding space in which the appearance and geometry of objects and their semantic parts are represented uniformly. Geometric relationships are induced in a soft manner by a rich set of non-semantic mid-level anchors, bridging the gap between semantic and non-semantic parts. We also show that the resulting embedding provides a visually intuitive mechanism to navigate the learned concepts and their corresponding images.