Robust satisfaction of temporal logic specifications via reinforcement learning

Jones, Austin; Aksaray, Derya; Kong, Zhaodan; Schwager, Mac; Belta, Calin

oai:open.bu.edu:2144/29609

Authors: Austin Jones, Derya Aksaray, Zhaodan Kong, Mac Schwager, Calin Belta
Publication date: 1 January 2015

Abstract

We consider the problem of steering a system with unknown, stochastic dynamics to satisfy a rich, temporally layered task given as a signal temporal logic formula. We represent the system as a finite-memory Markov decision process with unknown transition probabilities whose states are built from a partition of the state space. We present provably convergent reinforcement learning algorithms to maximize the probability of satisfying a given specification and to maximize the average expected robustness, i.e., a measure of how strongly the formula is satisfied. Robustness allows us to quantify progress towards satisfying a given specification. We demonstrate via a pair of robot navigation simulation case studies that, because it quantifies progress towards satisfaction, reinforcement learning with robustness maximization performs better than probability maximization in terms of both probability of satisfaction and expected robustness with a small number of training examples.
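To make the notion of robustness in the abstract concrete, the following is a minimal sketch of the standard quantitative semantics for two simple signal temporal logic patterns over a sampled one-dimensional trajectory. The function names, the example trajectory, and the thresholds are illustrative assumptions; this record does not give the paper's own definitions or implementation.

```python
from typing import Sequence


def robustness_eventually_above(
    signal: Sequence[float], threshold: float, start: int, end: int
) -> float:
    """Robustness of "eventually in [start, end], x > threshold".

    Standard semantics: the best (largest) margin x - threshold in the window.
    Hypothetical helper, not taken from the paper.
    """
    return max(x - threshold for x in signal[start:end + 1])


def robustness_always_below(
    signal: Sequence[float], threshold: float, start: int, end: int
) -> float:
    """Robustness of "always in [start, end], x < threshold".

    Standard semantics: the worst (smallest) margin threshold - x in the window.
    Hypothetical helper, not taken from the paper.
    """
    return min(threshold - x for x in signal[start:end + 1])


# Assumed example trajectory sampled at integer time steps.
trajectory = [0.0, 1.0, 2.5, 4.0, 3.0]

reach = robustness_eventually_above(trajectory, 3.5, 0, 4)  # 0.5
safe = robustness_always_below(trajectory, 5.0, 0, 4)       # 1.0

# A conjunction of two formulas takes the minimum of their robustness values.
both = min(reach, safe)                                      # 0.5
print(reach, safe, both)
```

A positive value means the formula is satisfied with that much margin and a negative value means it is violated, which is what lets a learner measure partial progress rather than only success or failure.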

Boston University Institutional Repository (OpenBU)

Last updated on 09/07/2019

This paper was published in Boston University Institutional Repository (OpenBU).
