
Influencing Exploration in Actor-Critic Reinforcement Learning Algorithms

Abstract

Reinforcement Learning (RL) is a subset of machine learning concerned with goal-directed learning and optimal decision making. RL agents learn from a reward signal discovered through trial and error in complex, uncertain environments, with the goal of maximizing cumulative reward. RL approaches must scale as they are applied to more complex environments with extremely large state spaces. Inefficient exploration methods cannot sufficiently cover such environments in a reasonable amount of time, so optimal policies go undiscovered and RL agents fail to solve the environment. This thesis proposes a novel variant of the Advantage Actor-Critic (A2C) algorithm. The variant is validated against two state-of-the-art RL algorithms, Deep Q-Network (DQN) and A2C, across six Atari 2600 games of varying difficulty. The experimental results are competitive with the state of the art while achieving lower variance and faster learning. Additionally, the thesis introduces a metric that objectively quantifies the difficulty of any Markovian environment with respect to the exploratory capacity of RL agents.
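For context, the sketch below illustrates the standard A2C objective that the thesis builds on, including the entropy bonus that is the usual exploration knob in actor-critic methods. It is a minimal generic illustration, not the thesis's proposed variant; the network architecture, coefficient values, and function names are placeholders chosen for this example.

import torch
import torch.nn as nn

class ActorCritic(nn.Module):
    """Shared-body actor-critic network (sizes are illustrative)."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)  # actor: action logits
        self.value_head = nn.Linear(hidden, 1)           # critic: state value V(s)

    def forward(self, obs: torch.Tensor):
        h = self.body(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)

def a2c_loss(model, obs, actions, returns, value_coef=0.5, entropy_coef=0.01):
    """Standard A2C loss on a batch of transitions.

    `returns` are (bootstrapped) discounted returns; coefficients are
    typical defaults, not values taken from the thesis.
    """
    logits, values = model(obs)
    dist = torch.distributions.Categorical(logits=logits)
    advantages = returns - values.detach()          # A(s,a) = R - V(s)
    policy_loss = -(dist.log_prob(actions) * advantages).mean()
    value_loss = (returns - values).pow(2).mean()   # critic regression loss
    entropy = dist.entropy().mean()                 # higher entropy = broader exploration
    return policy_loss + value_coef * value_loss - entropy_coef * entropy

The entropy term is subtracted from the loss, so maximizing it discourages the policy from collapsing prematurely onto a single action; tuning entropy_coef is one simple way exploration is influenced in baseline A2C, which is the behavior the thesis's variant aims to improve on.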

This paper was published in DigitalCommons@CalPoly.
