Reinforcement learning under uncertainty: expected versus unexpected uncertainty and state versus reward uncertainty

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Reinforcement learning under uncertainty: expected versus unexpected uncertainty and state versus reward uncertainty. / Ez-Zizi, Adnane; Farrell, Simon; Leslie, David et al.
In: Computational Brain and Behavior, Vol. 6, No. 4, 01.12.2023, p. 626-650.

Harvard

Ez-Zizi, A, Farrell, S, Leslie, D, Malhotra, G & Ludwig, CJH 2023, 'Reinforcement learning under uncertainty: expected versus unexpected uncertainty and state versus reward uncertainty', Computational Brain and Behavior, vol. 6, no. 4, pp. 626-650. https://doi.org/10.1007/s42113-022-00165-y

APA

Ez-Zizi, A., Farrell, S., Leslie, D., Malhotra, G., & Ludwig, C. J. H. (2023). Reinforcement learning under uncertainty: expected versus unexpected uncertainty and state versus reward uncertainty. Computational Brain and Behavior, 6(4), 626-650. https://doi.org/10.1007/s42113-022-00165-y

Vancouver

Ez-Zizi A, Farrell S, Leslie D, Malhotra G, Ludwig CJH. Reinforcement learning under uncertainty: expected versus unexpected uncertainty and state versus reward uncertainty. Computational Brain and Behavior. 2023 Dec 1;6(4):626-650. Epub 2023 Mar 20. doi: 10.1007/s42113-022-00165-y

Author

Ez-Zizi, Adnane; Farrell, Simon; Leslie, David et al. / Reinforcement learning under uncertainty: expected versus unexpected uncertainty and state versus reward uncertainty. In: Computational Brain and Behavior. 2023; Vol. 6, No. 4, pp. 626-650.

BibTeX

@article{e1d4ab02bf294e3e8321fce6b8351fe8,
title = "Reinforcement learning under uncertainty: expected versus unexpected uncertainty and state versus reward uncertainty",
abstract = "Two prominent types of uncertainty that have been studied extensively are expected and unexpected uncertainty. Studies suggest that humans are capable of learning from reward under both expected and unexpected uncertainty when the source of variability is the reward. How do people learn when the source of uncertainty is the environment{\textquoteright}s state and the rewards themselves are deterministic? How does their learning compare with the case of reward uncertainty? The present study addressed these questions using behavioural experimentation and computational modelling. Experiment 1 showed that human subjects were generally able to use reward feedback to successfully learn the task rules under state uncertainty, and were able to detect a non-signalled reversal of stimulus-response contingencies. Experiment 2, which combined all four types of uncertainties—expected versus unexpected uncertainty, and state versus reward uncertainty—highlighted key similarities and differences in learning between state and reward uncertainties. We found that subjects performed significantly better in the state uncertainty condition, primarily because they explored less and improved their state disambiguation. We also show that a simple reinforcement learning mechanism that ignores state uncertainty and updates the state-action value of only the identified state accounted for the behavioural data better than both a Bayesian reinforcement learning model that keeps track of belief states and a model that acts based on sampling from past experiences. Our findings suggest a common mechanism supports reward-based learning under state and reward uncertainty.",
keywords = "Bayesian reinforcement learning, Expected and unexpected uncertainty, Reinforcement learning, Sampling-based learning",
author = "Ez-Zizi, Adnane and Farrell, Simon and Leslie, David and Malhotra, Gaurav and Ludwig, {Casimir J. H.}",
year = "2023",
month = dec,
day = "1",
doi = "10.1007/s42113-022-00165-y",
language = "English",
volume = "6",
pages = "626--650",
journal = "Computational Brain and Behavior",
issn = "2522-087X",
publisher = "Springer",
number = "4",
}

RIS

TY  - JOUR
T1  - Reinforcement learning under uncertainty
T2  - expected versus unexpected uncertainty and state versus reward uncertainty
AU  - Ez-Zizi, Adnane
AU  - Farrell, Simon
AU  - Leslie, David
AU  - Malhotra, Gaurav
AU  - Ludwig, Casimir J. H.
PY  - 2023/12/1
Y1  - 2023/12/1
N2  - Two prominent types of uncertainty that have been studied extensively are expected and unexpected uncertainty. Studies suggest that humans are capable of learning from reward under both expected and unexpected uncertainty when the source of variability is the reward. How do people learn when the source of uncertainty is the environment’s state and the rewards themselves are deterministic? How does their learning compare with the case of reward uncertainty? The present study addressed these questions using behavioural experimentation and computational modelling. Experiment 1 showed that human subjects were generally able to use reward feedback to successfully learn the task rules under state uncertainty, and were able to detect a non-signalled reversal of stimulus-response contingencies. Experiment 2, which combined all four types of uncertainties—expected versus unexpected uncertainty, and state versus reward uncertainty—highlighted key similarities and differences in learning between state and reward uncertainties. We found that subjects performed significantly better in the state uncertainty condition, primarily because they explored less and improved their state disambiguation. We also show that a simple reinforcement learning mechanism that ignores state uncertainty and updates the state-action value of only the identified state accounted for the behavioural data better than both a Bayesian reinforcement learning model that keeps track of belief states and a model that acts based on sampling from past experiences. Our findings suggest a common mechanism supports reward-based learning under state and reward uncertainty.
AB  - Two prominent types of uncertainty that have been studied extensively are expected and unexpected uncertainty. Studies suggest that humans are capable of learning from reward under both expected and unexpected uncertainty when the source of variability is the reward. How do people learn when the source of uncertainty is the environment’s state and the rewards themselves are deterministic? How does their learning compare with the case of reward uncertainty? The present study addressed these questions using behavioural experimentation and computational modelling. Experiment 1 showed that human subjects were generally able to use reward feedback to successfully learn the task rules under state uncertainty, and were able to detect a non-signalled reversal of stimulus-response contingencies. Experiment 2, which combined all four types of uncertainties—expected versus unexpected uncertainty, and state versus reward uncertainty—highlighted key similarities and differences in learning between state and reward uncertainties. We found that subjects performed significantly better in the state uncertainty condition, primarily because they explored less and improved their state disambiguation. We also show that a simple reinforcement learning mechanism that ignores state uncertainty and updates the state-action value of only the identified state accounted for the behavioural data better than both a Bayesian reinforcement learning model that keeps track of belief states and a model that acts based on sampling from past experiences. Our findings suggest a common mechanism supports reward-based learning under state and reward uncertainty.
KW  - Bayesian reinforcement learning
KW  - Expected and unexpected uncertainty
KW  - Reinforcement learning
KW  - Sampling-based learning
U2  - 10.1007/s42113-022-00165-y
DO  - 10.1007/s42113-022-00165-y
M3  - Journal article
VL  - 6
SP  - 626
EP  - 650
JO  - Computational Brain and Behavior
JF  - Computational Brain and Behavior
SN  - 2522-087X
IS  - 4
ER  -
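
Illustrative code sketch

The abstract contrasts two candidate learning mechanisms: a simple reinforcement learning rule that commits to a single identified state and updates only that state's action value, and a Bayesian reinforcement learning rule that maintains a belief state over the candidate states. The Python sketch below is a minimal illustration of that contrast in a two-state, two-action task with deterministic rewards (the state-uncertainty case). Every specific here (the state and action counts, the learning rate ALPHA, the softmax temperature BETA, the 70/30 belief split, and all function names) is an assumption made for illustration, not the parameterisation or model specification reported in the paper.

import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 2, 2
ALPHA = 0.1   # learning rate (hypothetical value, not from the paper)
BETA = 5.0    # softmax inverse temperature (hypothetical value)

def softmax(q, beta=BETA):
    """Turn a row of action values into choice probabilities."""
    e = np.exp(beta * (q - q.max()))
    return e / e.sum()

def update_identified_state(Q, belief, action, reward):
    """Simple rule: commit to the most probable state and update only
    that state's action value, ignoring state uncertainty."""
    s = int(np.argmax(belief))
    Q[s, action] += ALPHA * (reward - Q[s, action])

def update_belief_weighted(Q, belief, action, reward):
    """Bayesian-style rule: spread the prediction-error update over all
    candidate states in proportion to the belief state."""
    for s in range(N_STATES):
        Q[s, action] += ALPHA * belief[s] * (reward - Q[s, action])

# Hypothetical task: state 0 rewards action 0, state 1 rewards action 1,
# and rewards are deterministic (the state-uncertainty case).
Q = np.zeros((N_STATES, N_ACTIONS))
for trial in range(200):
    true_state = rng.integers(N_STATES)
    belief = np.full(N_STATES, 0.3 / (N_STATES - 1))  # noisy state evidence
    belief[true_state] = 0.7
    action = rng.choice(N_ACTIONS, p=softmax(Q[int(np.argmax(belief))]))
    reward = 1.0 if action == true_state else 0.0
    update_identified_state(Q, belief, action, reward)

print(np.round(Q, 2))  # diagonal entries dominate once the rules are learned

Swapping update_identified_state for update_belief_weighted in the loop gives the belief-state alternative; the paper reports that the simpler, state-committing rule accounted for the behavioural data better.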