Abstract

We propose Bayesian Inverse Reinforcement Learning with Failure (BIRLF), which makes use of failed demonstrations, often ignored or filtered out by previous methods due to the difficulty of incorporating them alongside successful ones. Specifically, we leverage halfspaces derived from policy optimality conditions to incorporate failed demonstrations under the Bayesian Inverse Reinforcement Learning (BIRL) framework. In the continuous control setting, the reward function and policy are learned in an alternating manner, and both are estimated by function approximators to ensure sufficient expressive power. Our approach is formulated as a model-free Inverse Reinforcement Learning (IRL) method that naturally accommodates complex environments with continuous state and action spaces. In experiments, we evaluate the proposed method on a virtual grasping task and observe a significant performance boost over existing methods.
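
The alternating reward/policy loop and the halfspace-style use of failed demonstrations described above can be sketched in Python as follows. This is a minimal, hypothetical illustration, not the released implementation: it assumes linear reward features and random stand-in trajectories, and every name (phi, traj_features, w, eta) is illustrative only.

# Sketch of the alternating reward/policy update, with failed demonstrations
# acting as halfspace-style constraints. Illustrative only; not the paper's code.
import numpy as np

rng = np.random.default_rng(0)
dim_s, dim_a = 3, 2

def phi(state, action):
    # Hypothetical feature map for a (state, action) pair.
    return np.concatenate([state, action, [1.0]])

def traj_features(traj):
    # Sum of features along a trajectory of (state, action) pairs.
    return sum(phi(s, a) for s, a in traj)

def random_traj(T=5):
    # Random stand-in for a demonstrated or rolled-out trajectory.
    return [(rng.normal(size=dim_s), rng.normal(size=dim_a)) for _ in range(T)]

success_demos = [random_traj() for _ in range(4)]
failed_demos = [random_traj() for _ in range(4)]

w = np.zeros(dim_s + dim_a + 1)   # linear reward weights: r(s, a) = w . phi(s, a)
eta = 0.05                        # step size

for it in range(200):
    # Policy step (placeholder): sample rollouts under the current policy.
    # A real implementation would run an RL update here.
    policy_trajs = [random_traj() for _ in range(4)]

    # Reward step: push the reward so successful demos score higher than
    # policy rollouts, while failed demos define a halfspace the reward
    # should stay on the correct side of (hedged reading of the abstract).
    f_success = np.mean([traj_features(t) for t in success_demos], axis=0)
    f_policy = np.mean([traj_features(t) for t in policy_trajs], axis=0)
    f_failed = np.mean([traj_features(t) for t in failed_demos], axis=0)

    grad = f_success - f_policy                 # pull toward successful demos
    if w @ (f_failed - f_policy) > 0:           # failed demos currently look too good
        grad -= f_failed - f_policy             # push the reward away from them
    w += eta * grad

print("learned reward weights:", np.round(w, 3))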

Paper

Learning Virtual Grasp with Failed Demonstrations via Bayesian Inverse Reinforcement Learning
Xu Xie*, Changyang Li*, Chi Zhang, Yixin Zhu, Song-Chun Zhu
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019
(* indicates equal contribution.)
Paper / Demo

Team

Xu Xie1,2

Changyang Li1

Chi Zhang1,2

Yixin Zhu1,2

Song-Chun Zhu1,2

1 UCLA Center for Vision, Cognition, Learning and Autonomy

2 International Center for AI and Robot Autonomy (CARA)

Code

View on Github

Bibtex

@inproceedings{xie2019vrgrasp,
  author    = {Xie, Xu and Li, Changyang and Zhang, Chi and Zhu, Yixin and Zhu, Song-Chun},
  title     = {Learning Virtual Grasp with Failed Demonstrations via Bayesian Inverse Reinforcement Learning},
  booktitle = {2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year      = {2019}
}