
Electronic data

  • 2022GaoMRes

    Final published version, 20.8 MB, PDF document

    Available under license: CC BY-SA: Creative Commons Attribution-ShareAlike 4.0 International License


Partially Observable Monte Carlo Planning in Public Game Tree for complex imperfect information game King UP

Research output: Thesis › Master's Thesis

Published
  • Gao Peng
Publication date: 2022
Number of pages: 54
Qualification: Masters by Research
Awarding Institution
Supervisors/Advisors
Publisher
  • Lancaster University
Original language: English

Abstract

Game AI has long been a challenging area that has attracted considerable human effort. Games can be divided into two types according to whether players can fully observe the game state: perfect-information games and imperfect-information games. While perfect-information games such as Go and chess have been studied extensively, research on imperfect-information games has largely been limited to poker and dice games. The board game King UP is an imperfect-information game with more complicated rules and attributes, including partial observation, multiple players, partly shared goals, mixed strategies (players temporarily cooperate and compete for goals), and a mixture of dynamic and static play (voting and moving). Until now, King UP has not been solved by existing methods.

Extending Partially Observable Monte Carlo Planning (POMCP), we propose a novel algorithm, POMCP in a Public Game Tree, to solve this problem; it combines techniques from Counterfactual Regret Minimization (CFR) with POMCP. The public game tree is a data structure that stores beliefs over partially observable information and can be applied to most multi-player imperfect-information games. Inspired by CFR, we propose local search and decision-estimation processes to compute the value function over correlated states and actions. We also implement two decision-estimation methods: a deterministic policy based on the Upper Confidence Bound and a stochastic policy based on Regret Matching.
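As a rough illustration of the two decision-estimation rules named above, here is a minimal Python sketch; it is a hypothetical example, not the thesis's implementation, and the Node structure and function names are assumptions made for clarity.

    # Hypothetical sketch: deterministic selection via UCB1 and a stochastic
    # policy via Regret Matching. Not taken from the thesis code.
    import math
    import random
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        visits: int = 0        # times this child was selected
        value: float = 0.0     # accumulated return for this child
        children: list = field(default_factory=list)

    def ucb_select(node, c=1.41):
        """Deterministic policy: choose the child with the best UCB1 score."""
        total = sum(child.visits for child in node.children)
        def score(child):
            if child.visits == 0:
                return float("inf")  # always try unvisited actions first
            exploit = child.value / child.visits
            explore = c * math.sqrt(math.log(total) / child.visits)
            return exploit + explore
        return max(node.children, key=score)

    def regret_matching(cumulative_regrets):
        """Stochastic policy: mix actions in proportion to positive regret."""
        positive = [max(r, 0.0) for r in cumulative_regrets]
        total = sum(positive)
        if total <= 0.0:
            # no action has positive regret: fall back to a uniform mixture
            return [1.0 / len(positive)] * len(positive)
        return [r / total for r in positive]

    # Example: sample an action from a Regret-Matching policy
    policy = regret_matching([4.0, -2.0, 1.0])   # -> [0.8, 0.0, 0.2]
    action = random.choices(range(3), weights=policy)[0]

The UCB rule always picks the single highest-scoring action, whereas Regret Matching returns a probability distribution, which is the usual reason the former is described as deterministic and the latter as stochastic.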

In our experiments, we evaluate the convergence of the algorithm. We also assess its performance in further experiments, such as tournaments between the different decision-estimation methods, and verify its generalization on Rock-Paper-Scissors.

Finally, we revisit the motivation, insights, and contributions. Our algorithm works with less prior knowledge and performs well in complex, multi-player imperfect-information games such as King UP, including games that mix dynamic and static play.