On learning history based policies for controlling Markov decision processes

Patil, Gandharv; Mahajan, Aditya; Precup, Doina

arXiv:2211.03011 (cs)

[Submitted on 6 Nov 2022]

Abstract: Reinforcement learning (RL) folklore suggests that history-based function approximation methods, such as recurrent neural nets or history-based state abstraction, perform better than their memory-less counterparts, due to the fact that function approximation in Markov decision processes (MDP) can be viewed as inducing a partially observable MDP. However, there has been little formal analysis of such history-based algorithms, as most existing frameworks focus exclusively on memory-less features. In this paper, we introduce a theoretical framework for studying the behaviour of RL algorithms that learn to control an MDP using history-based feature abstraction mappings. Furthermore, we use this framework to design a practical RL algorithm and we numerically evaluate its effectiveness on a set of continuous control tasks.
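
The claim that function approximation can induce partial observability can be illustrated with a toy example. The sketch below is not taken from the paper: the three-state cyclic MDP, the feature map `PHI`, and the helpers `rollout` and `successors` are all hypothetical names chosen for illustration. It aggregates two states into one feature and checks whether the next feature is determined by the current feature alone or only by a short feature history.

```python
from collections import defaultdict

# Hypothetical toy MDP (an assumption for illustration, not from the paper):
# three states visited in a deterministic cycle 0 -> 1 -> 2 -> 0.
def next_state(s):
    return (s + 1) % 3

# Hypothetical feature map that aliases states 0 and 1 to the same feature.
PHI = {0: "a", 1: "a", 2: "b"}

def rollout(start, steps):
    """Return the sequence of features observed along a trajectory."""
    s = start
    feats = []
    for _ in range(steps):
        feats.append(PHI[s])
        s = next_state(s)
    return feats

def successors(feats, k):
    """Map each length-k feature window to the set of features seen right after it."""
    table = defaultdict(set)
    for t in range(k - 1, len(feats) - 1):
        window = tuple(feats[t - k + 1 : t + 1])
        table[window].add(feats[t + 1])
    return dict(table)

feats = rollout(0, 12)
memoryless = successors(feats, 1)  # predict from the current feature only
history2 = successors(feats, 2)    # predict from the last two features

print(memoryless)
print(history2)
```

In this toy case the feature "a" alone is followed by more than one possible next feature, so the feature process is not Markov even though the underlying state process is, while every window of two features has a unique successor. Real problems are stochastic and continuous, so this only conveys the intuition behind the abstract's remark.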

Subjects:	Machine Learning (cs.LG); Systems and Control (eess.SY); Machine Learning (stat.ML)
Cite as:	arXiv:2211.03011 [cs.LG]
	(or arXiv:2211.03011v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2211.03011

Submission history

From: Gandharv Patil
[v1] Sun, 6 Nov 2022 02:47:55 UTC (9,855 KB)
