Bayesian Reinforcement Learning in Markovian and non-Markovian Tasks

Text available via DOI:

https://doi.org/10.1109/SSCI.2015.91
Final published version

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Adnane Ez-Zizi
Simon Farrell
David Stuart Leslie

Publication date	7/12/2015
Host publication	Computational Intelligence, 2015 IEEE Symposium Series on
Place of Publication	Cape Town
Publisher	IEEE
Pages	579-586
Number of pages	7
ISBN (print)	9781479975600
Original language	English

Abstract

We present a Bayesian reinforcement learning model with a working memory module which can solve some non-Markovian decision processes. The model is tested, and compared against SARSA (lambda), on a standard working-memory task from the psychology literature. Our method uses the Kalman temporal difference framework, and its extension to stochastic state transitions, to give posterior distributions over state-action values. This framework provides a natural mechanism for using reward information to update more than the current state-action pair, and thus negates the use of eligibility traces. Furthermore, the existence of full posterior distributions allows the use of Thompson sampling for action selection, which in turn removes the need to choose an appropriately parameterised action-selection method.
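The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tabular, linear-Gaussian simplification of Kalman temporal differences with a SARSA-style observation model, and it omits the working memory module and the extension to stochastic transitions. The class name `TabularKalmanTD` and the parameters `prior_var`, `obs_var` and `evolution_var` are hypothetical, chosen only for illustration.

```python
import numpy as np


class TabularKalmanTD:
    """Illustrative tabular Kalman TD agent (an assumed simplification)."""

    def __init__(self, n_states, n_actions, gamma=0.9, prior_var=1.0,
                 obs_var=1.0, evolution_var=0.0, seed=0):
        self.n_states = n_states
        self.n_actions = n_actions
        self.gamma = gamma
        self.obs_var = obs_var              # assumed reward-noise variance
        self.evolution_var = evolution_var  # assumed random-walk variance
        n = n_states * n_actions
        self.mean = np.zeros(n)             # posterior mean of the Q-values
        self.cov = prior_var * np.eye(n)    # posterior covariance
        self.rng = np.random.default_rng(seed)

    def _index(self, s, a):
        return s * self.n_actions + a

    def select_action(self, s):
        # Thompson sampling: draw one Q-table from the posterior and act
        # greedily with respect to that draw.
        sample = self.rng.multivariate_normal(self.mean, self.cov)
        start = self._index(s, 0)
        return int(np.argmax(sample[start:start + self.n_actions]))

    def update(self, s, a, r, s_next=None, a_next=None):
        # Observation model: r = Q(s, a) - gamma * Q(s', a') + noise.
        # Pass s_next=None for a terminal transition.
        n = self.mean.size
        h = np.zeros(n)
        h[self._index(s, a)] = 1.0
        if s_next is not None:
            h[self._index(s_next, a_next)] -= self.gamma
        cov = self.cov + self.evolution_var * np.eye(n)  # prediction step
        ph = cov @ h
        gain = ph / (h @ ph + self.obs_var)              # Kalman gain
        self.mean = self.mean + gain * (r - h @ self.mean)
        self.cov = cov - np.outer(gain, ph)


agent = TabularKalmanTD(n_states=2, n_actions=2)
agent.update(0, 0, r=0.0, s_next=1, a_next=0)  # unrewarded transition
agent.update(1, 0, r=1.0)                      # rewarded terminal step
action = agent.select_action(0)
```

In this sketch the first update leaves the means at zero but correlates the values of (0, 0) and (1, 0) in the covariance. The later reward at (1, 0) therefore also raises the posterior mean of (0, 0), which is the sense in which reward information reaches more than the current state-action pair without eligibility traces. Action selection needs no exploration parameter, because the spread of the posterior itself drives exploration.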
