An Online Incremental Learning Approach for Configuring Multi-arm Bandits Algorithms

Computing and Communications

Associated organisational units

Electronic data

m2033
Accepted author manuscript, 1.44 MB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License


Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Forthcoming


Publication date	4/07/2024
Host publication	ECAI: European Conference On Artificial Intelligence
Number of pages	8
Original language	English

Abstract

This paper introduces Dynamic Bayesian Optimisation for Multi-Arm Bandits (DBO-MAB), an algorithm that dynamically adapts hyperparameters of multi-arm bandit algorithms using incremental Bayesian optimisation. DBO-MAB addresses the challenge of tuning hyperparameters in uncertain and dynamic environments, particularly for applications like web server optimisation. It uses a dynamic range adjustment approach based on the interquartile mean (IQM) of observed rewards to focus the search space on promising regions. Evaluated across diverse static and dynamic environments, DBO-MAB outperforms state-of-the-art algorithms such as Bootstrapped UCB and f-Discounted-Sliding-Window Thompson Sampling, reducing average response time by ≈ 55%.
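
As a rough illustration of the IQM-based range adjustment mentioned in the abstract, the sketch below computes the interquartile mean of observed rewards and narrows a one-dimensional hyperparameter range to the values whose reward reached it. This is not the paper's implementation: the function names `interquartile_mean` and `refocus_range`, the rule of keeping values at or above the IQM, and the assumption that a higher reward is better are all hypothetical choices made for this example.

```python
from statistics import mean


def interquartile_mean(values):
    """Mean of the middle half of the values (lowest and highest quarters dropped)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")
    k = n // 4  # number of samples trimmed from each tail
    return mean(ordered[k:n - k])


def refocus_range(trials, bounds):
    """Narrow a search range to hyperparameter values with reward >= IQM.

    trials: list of (hyperparameter_value, reward) pairs (hypothetical layout).
    bounds: (low, high) limits of the current search range.
    Assumes a higher reward is better; this rule is an illustrative guess,
    not the adjustment rule used by DBO-MAB.
    """
    iqm = interquartile_mean([reward for _, reward in trials])
    promising = [x for x, reward in trials if reward >= iqm]
    if not promising:
        return bounds
    low, high = bounds
    # Clamp the new range so it never grows beyond the original bounds.
    return max(low, min(promising)), min(high, max(promising))


trials = [(0.1, 1), (0.2, 2), (0.3, 3), (0.4, 4),
          (0.5, 5), (0.6, 6), (0.7, 7), (0.8, 100)]
new_bounds = refocus_range(trials, (0.0, 1.0))
```

Because the IQM discards the extreme quarters before averaging, a single outlying reward (100 in the example) does not move the threshold, which is one plausible reason to prefer it over a plain mean when rewards are noisy.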
