Speaker: Vittorio Ferrari, Associate Professor at the University of Edinburgh, Edinburgh, U.K.

Abstract: ImageNet is a large hierarchical database of object classes containing 15 million images. Unfortunately, only a small fraction of them are manually annotated with bounding-boxes, and none with pixelwise segmentations. This prevents useful developments, such as learning object detectors for thousands of classes. Our goal is to automatically populate ImageNet with many more bounding-boxes and segmentations, by leveraging existing manual annotations and by transferring knowledge between classes across the semantic hierarchy of ImageNet. To be useful as training data, these auto-annotations need to be of high quality; otherwise, models trained on them will fit their errors. This requires an auto-annotation method capable of self-assessment, i.e. estimating the quality of its own output. Self-assessment allows the method to return only accurate annotations and automatically discard the rest. In this talk I will give an overview of our research on ImageNet auto-annotation, with particular focus on a recently developed technique capable of self-assessment. I will present results on half a million images, covering more than 500 object classes. These auto-annotated bounding-boxes and segmentations are available for download at our website.