Reinforcement learning with temporal logic rewards

Li, Xiao; Vasile, Cristian-Ioan; Belta, Calin

oai:open.bu.edu:2144/29608

Authors: Xiao Li, Cristian-Ioan Vasile, Calin Belta
Publication date: 1 January 2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Abstract

Reinforcement learning (RL) depends critically on the choice of reward functions used to capture the desired behavior and constraints of a robot. Usually, these are handcrafted by an expert designer and represent heuristics for relatively simple tasks. Real-world applications typically involve more complex tasks with rich temporal and logical structure. In this paper we take advantage of the expressive power of temporal logic (TL) to specify complex rules the robot should follow, and incorporate domain knowledge into learning. We propose Truncated Linear Temporal Logic (TLTL) as a specification language that is arguably well suited for robotics applications. We show in simulated trials that learning is faster and policies obtained using the proposed approach outperform the ones learned using heuristic rewards in terms of the robustness degree, i.e., how well the tasks are satisfied. Furthermore, we demonstrate the proposed RL approach in a toast-placing task learned by a Baxter robot.

This work is partially supported by the ONR under grant N00014-14-1-0554 and by the NSF under grants NRI-1426907 and CMMI-1400167.
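
As a rough illustration of the robustness degree mentioned in the abstract, the sketch below scores a 2-D trajectory against the task "eventually reach the goal and always avoid the obstacle". It assumes the max/min quantitative semantics commonly used for temporal logics; it is not the paper's TLTL definition, and all names (`eventually`, `always`, `task_robustness`, the radii) are hypothetical. A positive score means the task is satisfied, and a larger score means it is satisfied with more margin, which is what would make such a score usable as an RL reward.

```python
import math
from typing import Callable, Sequence, Tuple

State = Tuple[float, float]            # hypothetical 2-D robot position
Predicate = Callable[[State], float]   # > 0 when the predicate holds


def distance(a: State, b: State) -> float:
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def eventually(pred: Predicate, traj: Sequence[State]) -> float:
    """Robustness of 'pred holds at some step': best value along the trajectory."""
    return max(pred(s) for s in traj)


def always(pred: Predicate, traj: Sequence[State]) -> float:
    """Robustness of 'pred holds at every step': worst value along the trajectory."""
    return min(pred(s) for s in traj)


def task_robustness(
    traj: Sequence[State],
    goal: State,
    obstacle: State,
    goal_radius: float = 0.5,       # assumed tolerance around the goal
    obstacle_radius: float = 1.0,   # assumed keep-out radius
) -> float:
    """Robustness of (eventually reach goal) AND (always avoid obstacle).

    Conjunction is scored as the minimum of its two operands.
    """
    reach_goal = lambda s: goal_radius - distance(s, goal)
    avoid_obstacle = lambda s: distance(s, obstacle) - obstacle_radius
    return min(eventually(reach_goal, traj), always(avoid_obstacle, traj))


if __name__ == "__main__":
    good = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    bad = [(0.0, 0.0), (1.0, 2.0)]  # runs into the obstacle, never reaches the goal
    print(task_robustness(good, goal=(2.0, 0.0), obstacle=(1.0, 2.0)))
    print(task_robustness(bad, goal=(2.0, 0.0), obstacle=(1.0, 2.0)))
```

In an episodic RL setting, a score like this would be computed once per rollout and used as the return, in place of a handcrafted per-step heuristic.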

Boston University Institutional Repository (OpenBU)

Last time updated on 09/07/2019

This paper was published in Boston University Institutional Repository (OpenBU).
