Home > Research > Publications & Outputs > BinaryBandit

Electronic data

View graph of relations

BinaryBandit: An Efficient Julia Package for Optimization and Evaluation of the Finite-Horizon Bandit Problem with Binary Responses

Research output: Working paper

Published
NullPointerException

Abstract

Variants of the multi-armed bandit problem for design of sequential experiments have been studied in several disciplines for almost a century, but the performance evaluation of proposed designs or finding a Bayes-optimal design over the finite horizon has resisted derivation of a closed formulae. Computational optimization and evaluation is thus the only possible approach. The BinaryBandit package in Julia programming language has been developed to provide such framework with a number of designs, easily extendable to add new designs. The package is based on the use an efficient implementation of backward recursion which gives accurate (up to computer accuracy) evaluation for small and moderate horizons. For instance, on a standard laptop or desktop computer, the Bayes-optimal design for the two-armed problem can be computed for offline use or evaluated in online fashion in a few minutes (horizon around $ 1,000 $), in a few hours (horizon around $ 2,000 $), or in a few days (horizon around $ 4,000 $). 32GB of RAM allows storing (e.g., for offline use) of the whole design up to horizon around $ 1440 $; when its storing is not needed (e.g., for Bayesian evaluation or for calculation of the initial action) it allows up to horizon around $ 4440 $. These problems are significantly larger than what has been reported in the literature, since moderate and large horizons have only been evaluated by simulation, trading-off accuracy. This paper describes the details of the backward recursion implementation and
gives an example of the package usage.