Automated planning in repeated adversarial games

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Automated planning in repeated adversarial games. / De Cote, Enrique Munoz; Chapman, Archie C.; Sykulski, Adam M. et al.
Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010. 2010. p. 376-383.

Harvard

De Cote, EM, Chapman, AC, Sykulski, AM & Jennings, NR 2010, Automated planning in repeated adversarial games. in Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010. pp. 376-383, 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010, Catalina Island, CA, United States, 8/07/10.

APA

De Cote, E. M., Chapman, A. C., Sykulski, A. M., & Jennings, N. R. (2010). Automated planning in repeated adversarial games. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010 (pp. 376-383).

Vancouver

De Cote EM, Chapman AC, Sykulski AM, Jennings NR. Automated planning in repeated adversarial games. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010. 2010. p. 376-383.

Author

De Cote, Enrique Munoz ; Chapman, Archie C. ; Sykulski, Adam M. et al. / Automated planning in repeated adversarial games. Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010. 2010. pp. 376-383

Bibtex

@inproceedings{1f09e7ab2a6f4a0b9f081f1278bf011e,
title = "Automated planning in repeated adversarial games",
abstract = "Game theory's prescriptive power typically relies on full rationality and/or self-play interactions. In contrast, this work sets aside these fundamental premises and focuses instead on heterogeneous autonomous interactions between two or more agents. Specifically, we introduce a new and concise representation for repeated adversarial (constant-sum) games that highlight the necessary features that enable an automated planing agent to reason about how to score above the game's Nash equilibrium, when facing heterogeneous adversaries. To this end, we present TeamUP, a model-based RL algorithm designed for learning and planning such an abstraction. In essence, it is somewhat similar to R-max with a cleverly engineered reward shaping that treats exploration as an adversarial optimization problem. In practice, it attempts to find an ally with which to tacitly collude (in more than two-player games) and then collaborates on a joint plan of actions that can consistently score a high utility in adversarial repeated games. We use the inaugural Lemonade Stand Game Tournament1 to demonstrate the effectiveness of our approach, and find that TeamUP is the best performing agent, demoting the Tournament's actual winning strategy into second place. In our experimental analysis, we show hat our strategy successfully and consistently builds collaborations with many different heterogeneous (and sometimes very sophisticated) adversaries.",
author = "{De Cote}, {Enrique Munoz} and Chapman, {Archie C.} and Sykulski, {Adam M.} and Jennings, {Nicholas R.}",
year = "2010",
language = "English",
isbn = "9780974903965",
pages = "376--383",
booktitle = "Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010",
note = "26th Conference on Uncertainty in Artificial Intelligence, UAI 2010 ; Conference date: 08-07-2010 Through 11-07-2010",

}

RIS

TY - GEN

T1 - Automated planning in repeated adversarial games

AU - De Cote, Enrique Munoz

AU - Chapman, Archie C.

AU - Sykulski, Adam M.

AU - Jennings, Nicholas R.

PY - 2010

Y1 - 2010

N2 - Game theory's prescriptive power typically relies on full rationality and/or self-play interactions. In contrast, this work sets aside these fundamental premises and focuses instead on heterogeneous autonomous interactions between two or more agents. Specifically, we introduce a new and concise representation for repeated adversarial (constant-sum) games that highlights the features necessary to enable an automated planning agent to reason about how to score above the game's Nash equilibrium when facing heterogeneous adversaries. To this end, we present TeamUP, a model-based RL algorithm designed for learning and planning with such an abstraction. In essence, it is similar to R-max, with a cleverly engineered reward shaping that treats exploration as an adversarial optimization problem. In practice, it attempts to find an ally with which to tacitly collude (in games with more than two players) and then collaborates on a joint plan of actions that can consistently score a high utility in adversarial repeated games. We use the inaugural Lemonade Stand Game Tournament to demonstrate the effectiveness of our approach, and find that TeamUP is the best-performing agent, demoting the Tournament's actual winning strategy to second place. In our experimental analysis, we show that our strategy successfully and consistently builds collaborations with many different heterogeneous (and sometimes very sophisticated) adversaries.

AB - Game theory's prescriptive power typically relies on full rationality and/or self-play interactions. In contrast, this work sets aside these fundamental premises and focuses instead on heterogeneous autonomous interactions between two or more agents. Specifically, we introduce a new and concise representation for repeated adversarial (constant-sum) games that highlights the features necessary to enable an automated planning agent to reason about how to score above the game's Nash equilibrium when facing heterogeneous adversaries. To this end, we present TeamUP, a model-based RL algorithm designed for learning and planning with such an abstraction. In essence, it is similar to R-max, with a cleverly engineered reward shaping that treats exploration as an adversarial optimization problem. In practice, it attempts to find an ally with which to tacitly collude (in games with more than two players) and then collaborates on a joint plan of actions that can consistently score a high utility in adversarial repeated games. We use the inaugural Lemonade Stand Game Tournament to demonstrate the effectiveness of our approach, and find that TeamUP is the best-performing agent, demoting the Tournament's actual winning strategy to second place. In our experimental analysis, we show that our strategy successfully and consistently builds collaborations with many different heterogeneous (and sometimes very sophisticated) adversaries.

M3 - Conference contribution/Paper

AN - SCOPUS:80053159020

SN - 9780974903965

SP - 376

EP - 383

BT - Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010

T2 - 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010

Y2 - 8 July 2010 through 11 July 2010

ER -
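
As background for the abstract's comparison to R-max: the sketch below illustrates the core R-max idea (treating insufficiently visited state-action pairs optimistically, so that exploration falls out of planning), assuming a finite state/action environment exposed through a hypothetical step(s, a) -> (next_state, reward) interface. All identifiers here (rmax_policy, step, m, rmax) are illustrative assumptions, not the authors' TeamUP code; per the abstract, TeamUP replaces this plain optimism bonus with a reward shaping that treats exploration as an adversarial optimization problem.

    from collections import defaultdict

    def rmax_policy(states, actions, step, start, episodes=50, horizon=30,
                    rmax=1.0, m=5, gamma=0.95, sweeps=60):
        """step(s, a) -> (s_next, reward); returns a greedy policy dict."""
        counts = defaultdict(int)                      # visits to (s, a)
        trans = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s': visits}
        rew = defaultdict(float)                       # summed reward per (s, a)
        V = {s: 0.0 for s in states}

        def q(s, a):
            n = counts[(s, a)]
            if n < m:  # "unknown" pair: optimistic value drives exploration
                return rmax / (1.0 - gamma)
            avg_r = rew[(s, a)] / n
            return avg_r + gamma * sum(c / n * V[s2]
                                       for s2, c in trans[(s, a)].items())

        for _ in range(episodes):
            # Replan on the current learned model via value iteration; the
            # original R-max replans only when a pair newly becomes "known".
            for _ in range(sweeps):
                for s in states:
                    V[s] = max(q(s, a) for a in actions)
            s = start
            for _ in range(horizon):
                a = max(actions, key=lambda a: q(s, a))
                s2, r = step(s, a)
                counts[(s, a)] += 1
                trans[(s, a)][s2] += 1
                rew[(s, a)] += r
                s = s2
        return {s: max(actions, key=lambda a: q(s, a)) for s in states}

In this simplified loop, playing a repeated game against fixed opponents would mean encoding their observed recent play into the state passed to step; the paper's concise game representation is aimed precisely at keeping such a state space compact.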