Enhancing deep reinforcement learning for scale flexibility in real-time strategy games

Computing and Communications

Associated organisational unit

Artificial Intelligence

Electronic data

ENTCOM__Scale_invariant_Reinforcement_learning
Accepted author manuscript, 1.73 MB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Text available via DOI:

https://doi.org/10.1016/j.entcom.2024.100843
Final published version

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Enhancing deep reinforcement learning for scale flexibility in real-time strategy games. / Lemos, Marcelo Luiz Harry Diniz; Vieira, Ronaldo e Silva; Tavares, Anderson Rocha et al.
In: Entertainment Computing, Vol. 52, 100843, 31.01.2025.

Harvard

Lemos, MLHD, Vieira, RES, Tavares, AR, Marcolino, LS & Chaimowicz, L 2025, 'Enhancing deep reinforcement learning for scale flexibility in real-time strategy games', Entertainment Computing, vol. 52, 100843. https://doi.org/10.1016/j.entcom.2024.100843

APA

Lemos, M. L. H. D., Vieira, R. E. S., Tavares, A. R., Marcolino, L. S., & Chaimowicz, L. (2025). Enhancing deep reinforcement learning for scale flexibility in real-time strategy games. Entertainment Computing, 52, Article 100843. https://doi.org/10.1016/j.entcom.2024.100843

Vancouver

Lemos MLHD, Vieira RES, Tavares AR, Marcolino LS, Chaimowicz L. Enhancing deep reinforcement learning for scale flexibility in real-time strategy games. Entertainment Computing. 2025 Jan 31;52:100843. Epub 2024 Aug 7. doi: 10.1016/j.entcom.2024.100843

Author

Lemos, Marcelo Luiz Harry Diniz ; Vieira, Ronaldo e Silva ; Tavares, Anderson Rocha et al. / Enhancing deep reinforcement learning for scale flexibility in real-time strategy games. In: Entertainment Computing. 2025 ; Vol. 52.

Bibtex

@article{b5d4c447fffc4f5b8b7106ea8fdc51fd,

title = "Enhancing deep reinforcement learning for scale flexibility in real-time strategy games",

abstract = "Real-time strategy (RTS) games present a unique challenge for AI agents due to the combination of several fundamental AI problems. While Deep Reinforcement Learning (DRL) has shown promise in the development of autonomous agents for the genre, existing architectures often struggle with games featuring maps of varying dimensions. This limitation hinders the agent{\textquoteright}s ability to generalize its learned strategies across different scenarios. This paper proposes a novel approach that overcomes this problem by incorporating Spatial Pyramid Pooling (SPP) within a DRL framework. We leverage the GridNet architecture{\textquoteright}s encoder–decoder structure and integrate an SPP layer into the critic network of the Proximal Policy Optimization (PPO) algorithm. This SPP layer dynamically generates a standardized representation of the game state, regardless of the initial observation size. This allows the agent to effectively adapt its decision-making process to any map configuration. Our evaluations demonstrate that the proposed method significantly enhances the model{\textquoteright}s flexibility and efficiency in training agents for various RTS game scenarios, albeit with some discernible limitations when applied to very small maps. This approach paves the way for more robust and adaptable AI agents capable of excelling in sequential decision problems with variable-size observations.",

author = "Lemos, {Marcelo Luiz Harry Diniz} and Vieira, {Ronaldo e Silva} and Tavares, {Anderson Rocha} and Marcolino, {Leandro Soriano} and Chaimowicz, Luiz",

year = "2025",

month = jan,

day = "31",

doi = "10.1016/j.entcom.2024.100843",

language = "English",

volume = "52",

journal = "Entertainment Computing",

issn = "1875-9521",

publisher = "Elsevier",

}

RIS

TY - JOUR

T1 - Enhancing deep reinforcement learning for scale flexibility in real-time strategy games

AU - Lemos, Marcelo Luiz Harry Diniz

AU - Vieira, Ronaldo e Silva

AU - Tavares, Anderson Rocha

AU - Marcolino, Leandro Soriano

AU - Chaimowicz, Luiz

PY - 2025/1/31

Y1 - 2025/1/31

N2 - Real-time strategy (RTS) games present a unique challenge for AI agents due to the combination of several fundamental AI problems. While Deep Reinforcement Learning (DRL) has shown promise in the development of autonomous agents for the genre, existing architectures often struggle with games featuring maps of varying dimensions. This limitation hinders the agent’s ability to generalize its learned strategies across different scenarios. This paper proposes a novel approach that overcomes this problem by incorporating Spatial Pyramid Pooling (SPP) within a DRL framework. We leverage the GridNet architecture’s encoder–decoder structure and integrate an SPP layer into the critic network of the Proximal Policy Optimization (PPO) algorithm. This SPP layer dynamically generates a standardized representation of the game state, regardless of the initial observation size. This allows the agent to effectively adapt its decision-making process to any map configuration. Our evaluations demonstrate that the proposed method significantly enhances the model’s flexibility and efficiency in training agents for various RTS game scenarios, albeit with some discernible limitations when applied to very small maps. This approach paves the way for more robust and adaptable AI agents capable of excelling in sequential decision problems with variable-size observations.

AB - Real-time strategy (RTS) games present a unique challenge for AI agents due to the combination of several fundamental AI problems. While Deep Reinforcement Learning (DRL) has shown promise in the development of autonomous agents for the genre, existing architectures often struggle with games featuring maps of varying dimensions. This limitation hinders the agent’s ability to generalize its learned strategies across different scenarios. This paper proposes a novel approach that overcomes this problem by incorporating Spatial Pyramid Pooling (SPP) within a DRL framework. We leverage the GridNet architecture’s encoder–decoder structure and integrate an SPP layer into the critic network of the Proximal Policy Optimization (PPO) algorithm. This SPP layer dynamically generates a standardized representation of the game state, regardless of the initial observation size. This allows the agent to effectively adapt its decision-making process to any map configuration. Our evaluations demonstrate that the proposed method significantly enhances the model’s flexibility and efficiency in training agents for various RTS game scenarios, albeit with some discernible limitations when applied to very small maps. This approach paves the way for more robust and adaptable AI agents capable of excelling in sequential decision problems with variable-size observations.

U2 - 10.1016/j.entcom.2024.100843

DO - 10.1016/j.entcom.2024.100843

M3 - Journal article

VL - 52

JO - Entertainment Computing

JF - Entertainment Computing

SN - 1875-9521

M1 - 100843

ER -
