Electronic data

  • m2033

    Accepted author manuscript, 1.44 MB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

An Online Incremental Learning Approach for Configuring Multi-arm Bandits Algorithms

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

Forthcoming
Publication date: 4/07/2024
Host publication: ECAI: European Conference On Artificial Intelligence
Number of pages: 8
Original language: English

Abstract

This paper introduces Dynamic Bayesian Optimisation
for Multi-Arm Bandits (DBO-MAB), an algorithm that dynamically
adapts hyperparameters of multi-arm bandit algorithms using incremental
Bayesian optimisation. DBO-MAB addresses the challenge
of tuning hyperparameters in uncertain and dynamic environments,
particularly for applications like web server optimisation. It uses a
dynamic range adjustment approach based on the interquartile mean
(IQM) of observed rewards to focus the search space on promising
regions. Evaluated across diverse static and dynamic environments,
DBO-MAB outperforms state-of-the-art algorithms such as Bootstrapped
UCB and f-Discounted-Sliding-Window Thompson Sampling,
reducing average response time by ≈ 55%.
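
As a rough illustration of the IQM-based range adjustment mentioned in the abstract, the sketch below computes the interquartile mean of observed rewards and then narrows a one-dimensional hyperparameter range to the values whose reward reached that IQM. The abstract does not specify the actual update rule, so the function names (`interquartile_mean`, `narrow_range`), the thresholding step, and the convention that higher reward is better are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def interquartile_mean(values):
    """Mean of the middle 50% of the data (lowest and highest quarters dropped)."""
    x = np.sort(np.asarray(values, dtype=float))
    if x.size == 0:
        raise ValueError("interquartile_mean needs at least one value")
    k = x.size // 4  # number of points trimmed from each end
    return float(x[k:x.size - k].mean())


def narrow_range(params, rewards, bounds):
    """Hypothetical rule: shrink `bounds` to the span of hyperparameter
    values whose observed reward is at least the IQM of all rewards.
    This rule is an assumption, not taken from the paper."""
    params = np.asarray(params, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    threshold = interquartile_mean(rewards)
    good = params[rewards >= threshold]
    if good.size == 0:  # guard against floating-point edge cases
        return bounds
    low = max(bounds[0], float(good.min()))
    high = min(bounds[1], float(good.max()))
    return (low, high)


# Example: eight trials of one hyperparameter in [0, 1]; the last reward is an outlier.
trial_params = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
trial_rewards = [1, 2, 3, 4, 5, 6, 7, 100]
iqm = interquartile_mean(trial_rewards)
new_bounds = narrow_range(trial_params, trial_rewards, (0.0, 1.0))
```

Because the IQM discards the extreme quarters, the single outlying reward of 100 does not shift the threshold the way a plain mean would, which is one plausible reason to prefer it when rewards are noisy.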