Towards joint understanding of images and language

Published by NAVER LABS Europe on 24 May 2017

Monday, 29th May at 2:00 PM

Speaker: Svetlana Lazebnik, associate professor at the University of Illinois at Urbana-Champaign, Urbana, IL, U.S.A.

Abstract: Numerous real-world tasks can benefit from practical systems that can identify objects in scenes based on language and understand language grounded in visual context. This presentation will focus on my group’s recent work on developing systems for jointly modeling images and language. I will talk about neural models for learning cross-modal embeddings for text-to-image and image-to-text search, and about the challenging task of grounding, or localizing, textual mentions of entities in an image. Finally, I will discuss applications of our models to automatic image description and visual question answering.
