This video is part of the Deep Reinforcement Learning Summit, San Francisco, 2019 Event.

Rewards, Resets, Exploration: Bottlenecks For Scaling Deep RL In Robotics

Deep RL for practical applications such as robotics has been seen as a great challenge. However, recent successes of sample-efficient deep RL algorithms in real-world learning and robust Sim2Real transfer suggest that the main bottlenecks for scaling deep RL in robotics lie elsewhere. In this talk, I'll focus on three critical components required for massively scaling deep RL in simulation or the real world: rewards, resets, and exploration. I'll discuss our recent work on universal reward definitions through natural language; concurrent learning of a reset policy for safe and continual learning; and efficient exploration through goal-driven or empowerment-based action abstractions. I'll end the talk by highlighting future directions and other challenges toward enabling robots to be as diversely functional as humans.

Shane Gu, Research Scientist at Google Brain

Shane Gu is a Research Scientist at Google Brain, where he mainly works on problems in deep learning, reinforcement learning, robotics, and probabilistic machine learning. His recent research focuses on sample-efficient RL methods that could scale to solve difficult continuous control problems in the real world; this work has been covered by the Google Research blog and MIT Technology Review. He completed his PhD in Machine Learning at the University of Cambridge and the Max Planck Institute for Intelligent Systems in Tübingen, where he was co-supervised by Richard E. Turner, Zoubin Ghahramani, and Bernhard Schölkopf. During his PhD, he also collaborated closely with Sergey Levine at UC Berkeley/Google Brain and Timothy Lillicrap at DeepMind. He holds a B.A.Sc. in Engineering Science from the University of Toronto, where he wrote his thesis with Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms.
