Electronic data

  • 2022GaoMRes

    Final published version, 20.8 MB, PDF document

    Available under license: CC BY-SA: Creative Commons Attribution-ShareAlike 4.0 International License


Partially Observable Monte Carlo Planning in Public Game Tree for complex imperfect information game King UP

Research output: Thesis › Master's Thesis

Published

Standard

Partially Observable Monte Carlo Planning in Public Game Tree for complex imperfect information game King UP. / Peng, Gao.
Lancaster University, 2022. 54 p.

Vancouver

Peng G. Partially Observable Monte Carlo Planning in Public Game Tree for complex imperfect information game King UP. Lancaster University, 2022. 54 p. doi: 10.17635/lancaster/thesis/1737

BibTeX

@mastersthesis{04392ec02ddd4871b9b2763aeeeb6eea,
title = "Partially Observable Monte Carlo Planning in Public Game Tree for complex imperfect information game King UP",
abstract = "Game AI has long been a challenging area to which researchers devote much effort. Games can be divided, by whether players can fully observe the state of the game, into two types: perfect information games and imperfect information games. While perfect information games such as Go and chess have been studied extensively, research on imperfect information games has largely been limited to poker and dice games. The board game King Up is an imperfect information game with more complicated rules and attributes, such as partial observation, multiple players, partly shared goals, mixed strategies (players temporarily cooperate and compete for goals), and a mixture of dynamic and static play (vote and move). Until now, King Up has not been solved by existing methods. Extending Partially Observable Monte Carlo Planning (POMCP), we propose a novel algorithm, POMCP in a Public Game Tree, to solve the problem, combining techniques from Counterfactual Regret Minimization (CFR) with POMCP. The public game tree is a data structure that stores beliefs over partially observable information and can be applied to most multi-player imperfect information games. Inspired by CFR, we propose local search and decision estimation processes to compute the value function over correlated states and actions, and we implement two decision estimation methods: a deterministic policy based on the Upper Confidence Bound and a stochastic policy based on Regret Matching. In our experiments, we evaluate the convergence of the algorithm, assess its performance in tournaments between the different decision estimation methods, and verify its generalization on Rock-Paper-Scissors. Finally, we revisit the motivation, insights, and contributions. Our algorithm requires little prior knowledge and performs well in complex, multi-player imperfect information games such as King Up and in mixed game types (dynamic and static).",
author = "Gao Peng",
year = "2022",
doi = "10.17635/lancaster/thesis/1737",
language = "English",
publisher = "Lancaster University",
school = "Lancaster University",

}

RIS

TY - THES

T1 - Partially Observable Monte Carlo Planning in Public Game Tree for complex imperfect information game King UP

AU - Peng, Gao

PY - 2022

Y1 - 2022

N2 - Game AI has long been a challenging area to which researchers devote much effort. Games can be divided, by whether players can fully observe the state of the game, into two types: perfect information games and imperfect information games. While perfect information games such as Go and chess have been studied extensively, research on imperfect information games has largely been limited to poker and dice games. The board game King Up is an imperfect information game with more complicated rules and attributes, such as partial observation, multiple players, partly shared goals, mixed strategies (players temporarily cooperate and compete for goals), and a mixture of dynamic and static play (vote and move). Until now, King Up has not been solved by existing methods. Extending Partially Observable Monte Carlo Planning (POMCP), we propose a novel algorithm, POMCP in a Public Game Tree, to solve the problem, combining techniques from Counterfactual Regret Minimization (CFR) with POMCP. The public game tree is a data structure that stores beliefs over partially observable information and can be applied to most multi-player imperfect information games. Inspired by CFR, we propose local search and decision estimation processes to compute the value function over correlated states and actions, and we implement two decision estimation methods: a deterministic policy based on the Upper Confidence Bound and a stochastic policy based on Regret Matching. In our experiments, we evaluate the convergence of the algorithm, assess its performance in tournaments between the different decision estimation methods, and verify its generalization on Rock-Paper-Scissors. Finally, we revisit the motivation, insights, and contributions. Our algorithm requires little prior knowledge and performs well in complex, multi-player imperfect information games such as King Up and in mixed game types (dynamic and static).

AB - Game AI has long been a challenging area to which researchers devote much effort. Games can be divided, by whether players can fully observe the state of the game, into two types: perfect information games and imperfect information games. While perfect information games such as Go and chess have been studied extensively, research on imperfect information games has largely been limited to poker and dice games. The board game King Up is an imperfect information game with more complicated rules and attributes, such as partial observation, multiple players, partly shared goals, mixed strategies (players temporarily cooperate and compete for goals), and a mixture of dynamic and static play (vote and move). Until now, King Up has not been solved by existing methods. Extending Partially Observable Monte Carlo Planning (POMCP), we propose a novel algorithm, POMCP in a Public Game Tree, to solve the problem, combining techniques from Counterfactual Regret Minimization (CFR) with POMCP. The public game tree is a data structure that stores beliefs over partially observable information and can be applied to most multi-player imperfect information games. Inspired by CFR, we propose local search and decision estimation processes to compute the value function over correlated states and actions, and we implement two decision estimation methods: a deterministic policy based on the Upper Confidence Bound and a stochastic policy based on Regret Matching. In our experiments, we evaluate the convergence of the algorithm, assess its performance in tournaments between the different decision estimation methods, and verify its generalization on Rock-Paper-Scissors. Finally, we revisit the motivation, insights, and contributions. Our algorithm requires little prior knowledge and performs well in complex, multi-player imperfect information games such as King Up and in mixed game types (dynamic and static).

U2 - 10.17635/lancaster/thesis/1737

DO - 10.17635/lancaster/thesis/1737

M3 - Master's Thesis

PB - Lancaster University

ER -
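The abstract names Regret Matching as one of the thesis's two decision estimation methods and Rock-Paper-Scissors as its generalization test. The sketch below is not the thesis code; it is a minimal illustration of standard regret matching via self-play on Rock-Paper-Scissors, where the time-averaged strategy approaches the uniform Nash equilibrium (1/3, 1/3, 1/3). All names (`train`, `strategy`, etc.) are illustrative.

```python
import random

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def payoff(a, b):
    """Payoff to the player choosing a against b: +1 win, -1 loss, 0 tie."""
    if a == b:
        return 0
    return 1 if (a - b) % 3 == 1 else -1

def strategy(regrets):
    """Regret matching: play actions in proportion to positive regret."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    if total > 0:
        return [p / total for p in pos]
    return [1.0 / ACTIONS] * ACTIONS  # no positive regret: play uniformly

def sample(probs):
    """Draw an action index from a probability vector."""
    r, cum = random.random(), 0.0
    for a, p in enumerate(probs):
        cum += p
        if r < cum:
            return a
    return ACTIONS - 1

def train(iters=50000, seed=0):
    """Self-play regret matching; returns player 0's average strategy."""
    random.seed(seed)
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strat_sum = [0.0] * ACTIONS
    for _ in range(iters):
        strats = [strategy(regrets[0]), strategy(regrets[1])]
        acts = [sample(strats[0]), sample(strats[1])]
        for p in range(2):
            me, opp = acts[p], acts[1 - p]
            realized = payoff(me, opp)
            # Regret of each alternative action against the opponent's play.
            for a in range(ACTIONS):
                regrets[p][a] += payoff(a, opp) - realized
        for a in range(ACTIONS):
            strat_sum[a] += strats[0][a]
    return [s / iters for s in strat_sum]
```

Note that it is the *average* strategy, not the current one, that converges: the per-iteration strategy keeps cycling, which is why the accumulation into `strat_sum` is essential.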