Monday, 29th May at 2:00 PM

Speaker: Svetlana Lazebnik, Associate Professor at the University of Illinois at Urbana-Champaign, Urbana, IL, U.S.A.

Abstract: Numerous real-world tasks can benefit from practical systems that can identify objects in scenes based on language and understand language grounded in visual context. This presentation will focus on my group's recent work on developing systems for jointly modeling images and language. I will talk about neural models for learning cross-modal embeddings for text-to-image and image-to-text search, and about the challenging task of grounding, or localizing, textual mentions of entities in an image. Finally, I will discuss applications of our models to automatic image description and visual question answering.
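As a rough illustration of the cross-modal retrieval setting the abstract mentions (not the speaker's actual models), the sketch below uses linear projections, random here but learned in practice, to map image and text feature vectors into a shared embedding space, where cosine similarity ranks candidate images for a text query. All dimensions and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative only: random projections stand in for learned two-branch networks.
D_IMG, D_TXT, D_JOINT = 512, 300, 128
W_img = rng.normal(size=(D_IMG, D_JOINT)) / np.sqrt(D_IMG)
W_txt = rng.normal(size=(D_TXT, D_JOINT)) / np.sqrt(D_TXT)

def embed(x, W):
    """Project a feature vector into the joint space and L2-normalize it."""
    z = x @ W
    return z / np.linalg.norm(z)

# Text-to-image search: rank image features by cosine similarity to the query.
images = [rng.normal(size=D_IMG) for _ in range(5)]
query = rng.normal(size=D_TXT)

q = embed(query, W_txt)
scores = [float(q @ embed(im, W_img)) for im in images]
best = int(np.argmax(scores))
```

In a trained system the projections would be optimized (e.g. with a ranking loss) so that matching image-caption pairs score higher than mismatched ones; the same shared space supports image-to-text search by swapping the roles of query and candidates.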