Rights statement: ©2015 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Accepted author manuscript, 277 KB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License
Final published version, 323 KB, PDF document
Available under license: CC BY
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Mixed-strategy learning with continuous action sets
AU - Perkins, Steven
AU - Mertikopoulos, Panayotis
AU - Leslie, David Stuart
N1 - ©2015 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
PY - 2017/1
Y1 - 2017/1
N2 - Motivated by the recent applications of game-theoretical learning to the design of distributed control systems, we study a class of control problems that can be formulated as potential games with continuous action sets. We propose an actor-critic reinforcement learning algorithm that adapts mixed strategies over continuous action spaces. To analyse the algorithm we extend the theory of finite-dimensional two-timescale stochastic approximation to a Banach space setting, and prove that the continuous dynamics of the process converge to equilibrium in the case of potential games. These results combine to give a provably convergent learning algorithm in which players do not need to keep track of the controls selected by other agents.
AB - Motivated by the recent applications of game-theoretical learning to the design of distributed control systems, we study a class of control problems that can be formulated as potential games with continuous action sets. We propose an actor-critic reinforcement learning algorithm that adapts mixed strategies over continuous action spaces. To analyse the algorithm we extend the theory of finite-dimensional two-timescale stochastic approximation to a Banach space setting, and prove that the continuous dynamics of the process converge to equilibrium in the case of potential games. These results combine to give a provably convergent learning algorithm in which players do not need to keep track of the controls selected by other agents.
U2 - 10.1109/TAC.2015.2511930
DO - 10.1109/TAC.2015.2511930
M3 - Journal article
VL - 62
SP - 379
EP - 384
JO - IEEE Transactions on Automatic Control
JF - IEEE Transactions on Automatic Control
SN - 0018-9286
IS - 1
ER -