
Influencing Exploration in Actor-Critic Reinforcement Learning Algorithms

Abstract

Reinforcement Learning (RL) is a subset of machine learning concerned with goal-directed learning and optimal decision making. RL agents learn from a reward signal discovered through trial and error in complex, uncertain environments, with the goal of maximizing cumulative reward. RL approaches must scale as they are applied to more complex environments with extremely large state spaces. Inefficient exploration methods cannot sufficiently cover such environments in a reasonable amount of time, so optimal policies go undiscovered and RL agents fail to solve the environment. This thesis proposes a novel variant of the Advantage Actor-Critic (A2C) algorithm. The variant is validated against two state-of-the-art RL algorithms, Deep Q-Network (DQN) and A2C, across six Atari 2600 games of varying difficulty. The experimental results are competitive with the state of the art while achieving lower variance and faster learning. Additionally, the thesis introduces a metric that objectively quantifies the difficulty of any Markovian environment with respect to the exploratory capacity of RL agents.
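For context, the sketch below illustrates the standard A2C objective that the thesis builds on, including the entropy bonus that is the usual exploration knob in actor-critic methods. It is a minimal generic illustration, not the thesis's proposed variant; the network architecture, coefficient values, and function names are placeholders chosen for this example.

import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    """Shared-body actor-critic network (sizes are illustrative)."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)  # actor: action logits
        self.value_head = nn.Linear(hidden, 1)           # critic: state value V(s)

    def forward(self, obs: torch.Tensor):
        h = self.body(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)

def a2c_loss(model, obs, actions, returns, value_coef=0.5, entropy_coef=0.01):
    """Standard A2C loss on a batch of transitions.

    `returns` are (bootstrapped) discounted returns; coefficients are
    typical defaults, not values taken from the thesis.
    """
    logits, values = model(obs)
    dist = torch.distributions.Categorical(logits=logits)
    advantages = returns - values.detach()          # A(s,a) = R - V(s)
    policy_loss = -(dist.log_prob(actions) * advantages).mean()
    value_loss = (returns - values).pow(2).mean()   # critic regression loss
    entropy = dist.entropy().mean()                 # higher entropy = broader exploration
    return policy_loss + value_coef * value_loss - entropy_coef * entropy

The entropy term is subtracted from the loss, so maximizing it discourages the policy from collapsing prematurely onto a single action; tuning entropy_coef is one simple way exploration is influenced in baseline A2C, which is the behavior the thesis's variant aims to improve on.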

This paper was published in DigitalCommons@CalPoly.
