Virtual Informal Systems Seminar (VISS)
Centre for Intelligent Machines (CIM) and Groupe d'Etudes et de Recherche en Analyse des Decisions (GERAD)

Reinforcement Learning with Robustness and Safety Guarantees

Dileep Kalathil
Department of Electrical and Computer Engineering, Texas A&M University

October 30, 2020 at 11:30 AM


Reinforcement learning (RL) is the class of machine learning methods that addresses the problem of learning to control unknown dynamical systems. RL has recently achieved remarkable success in applications such as game playing and robotics. However, most of these successes are limited to highly structured or simulated environments. When applied to real-world systems, RL algorithms face two fundamental sources of fragility. First, the parameters of the real-world system can differ substantially from the nominal values used to train RL algorithms. Second, the control policy for any real-world system must satisfy necessary safety criteria to avoid undesirable outcomes. Most deep RL algorithms overlook these fundamental challenges, which often results in learned policies that perform poorly in real-world settings. We address these issues in two steps. First, we propose a robust reinforcement learning algorithm to train policies that account for possible parameter mismatches between the simulated system and the real-world system. Second, we develop a safe reinforcement learning algorithm to learn policies such that the frequency of visiting undesirable states and taking expensive actions satisfies the safety constraints.


Dileep Kalathil is an Assistant Professor in the Department of Electrical and Computer Engineering at Texas A&M University. His main research area is reinforcement learning, with applications in cyber-physical systems, intelligent transportation systems, and power systems. In particular, his research addresses three fundamental problems in RL: (i) How to develop data-efficient RL algorithms? (ii) How to develop safe and robust RL algorithms? and (iii) How to develop scalable multi-agent RL algorithms? Before joining TAMU, he was a postdoctoral researcher in the EECS department at UC Berkeley. He received his PhD from the University of Southern California (USC) in 2014, where he won the best PhD Dissertation Prize in the Department of Electrical Engineering. He received an M.Tech. from IIT Madras, where he won the award for the best academic performance in the Electrical Engineering Department.