BinaryBandit - Research Portal | Lancaster University

Management Science

Electronic data

Jacko2019_binarybandit_wp
Final published version, 650 KB, PDF document

Keywords

Multi-armed bandit problem, Design of sequential experiments, Bayesian decision theory, Dynamic programming, Index rules, Response-adaptive randomization, Julia programming language

View graph of relations

BinaryBandit: An Efficient Julia Package for Optimization and Evaluation of the Finite-Horizon Bandit Problem with Binary Responses

Research output: Working paper

Published

Peter Jacko

More...

Publication date	1/08/2019
Place of Publication	Lancaster
Publisher	Lancaster University Management School
Number of pages	13
<mark>Original language</mark>	English

Publication series

Name	Management Science Working Paper
Publisher	Lancaster University Management School
No.	4
Volume	2019

Abstract

Variants of the multi-armed bandit problem for design of sequential experiments have been studied in several disciplines for almost a century, but the performance evaluation of proposed designs or finding a Bayes-optimal design over the finite horizon has resisted derivation of a closed formulae. Computational optimization and evaluation is thus the only possible approach. The BinaryBandit package in Julia programming language has been developed to provide such framework with a number of designs, easily extendable to add new designs. The package is based on the use an efficient implementation of backward recursion which gives accurate (up to computer accuracy) evaluation for small and moderate horizons. For instance, on a standard laptop or desktop computer, the Bayes-optimal design for the two-armed problem can be computed for offline use or evaluated in online fashion in a few minutes (horizon around $ 1,000 $), in a few hours (horizon around $ 2,000 $), or in a few days (horizon around $ 4,000 $). 32GB of RAM allows storing (e.g., for offline use) of the whole design up to horizon around $ 1440 $; when its storing is not needed (e.g., for Bayesian evaluation or for calculation of the initial action) it allows up to horizon around $ 4440 $. These problems are significantly larger than what has been reported in the literature, since moderate and large horizons have only been evaluated by simulation, trading-off accuracy. This paper describes the details of the backward recursion implementation and
gives an example of the package usage.

Research

Electronic data

Keywords

BinaryBandit: An Efficient Julia Package for Optimization and Evaluation of the Finite-Horizon Bandit Problem with Binary Responses

Publication series

Abstract

Quick Links

Connect With Us

Faculties & Depts

Contact Us