Olivier Pietquin, Research Director in the SequeL team, INRIA, Lille, France

Abstract: Modern interactive systems are mostly built by aggregating Machine Learning (ML) modules trained on batch data (e.g. automatic speech, gesture or emotion recognition, language understanding or generation, and text-to-speech synthesis). Yet closing the interaction loop with ML-based techniques is still an open issue, because keeping the human in the learning loop raises new challenges for ML, such as non-stationarity, subjective evaluation, risk aversion and safe exploration. In this talk, the speaker will discuss a Reinforcement Learning (RL) approach to this problem and show how some of these challenges can be addressed with direct and inverse RL methods in the context of Markov Decision Processes and Stochastic Games.