Accepted author manuscript, 1.44 MB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
TY - GEN
T1 - An Online Incremental Learning Approach for Configuring Multi-arm Bandits Algorithms
AU - Alsomali, Mohammad
AU - Rodrigues-Filho, Roberto
AU - Soriano Marcolino, Leandro
AU - Porter, Barry
PY - 2024/7/4
Y1 - 2024/7/4
N2 - This paper introduces Dynamic Bayesian Optimisation for Multi-Arm Bandits (DBO-MAB), an algorithm that dynamically adapts hyperparameters of multi-arm bandit algorithms using incremental Bayesian optimisation. DBO-MAB addresses the challenge of tuning hyperparameters in uncertain and dynamic environments, particularly for applications like web server optimisation. It uses a dynamic range adjustment approach based on the interquartile mean (IQM) of observed rewards to focus the search space on promising regions. Evaluated across diverse static and dynamic environments, DBO-MAB outperforms state-of-the-art algorithms such as Bootstrapped UCB and f-Discounted-Sliding-Window Thompson Sampling, reducing average response time by ≈ 55%.
AB - This paper introduces Dynamic Bayesian Optimisation for Multi-Arm Bandits (DBO-MAB), an algorithm that dynamically adapts hyperparameters of multi-arm bandit algorithms using incremental Bayesian optimisation. DBO-MAB addresses the challenge of tuning hyperparameters in uncertain and dynamic environments, particularly for applications like web server optimisation. It uses a dynamic range adjustment approach based on the interquartile mean (IQM) of observed rewards to focus the search space on promising regions. Evaluated across diverse static and dynamic environments, DBO-MAB outperforms state-of-the-art algorithms such as Bootstrapped UCB and f-Discounted-Sliding-Window Thompson Sampling, reducing average response time by ≈ 55%.
M3 - Conference contribution/Paper
BT - ECAI
ER -