BinaryBandit - Research Portal | Lancaster University

Management Science

Electronic data

Jacko2019_binarybandit_wp
Final published version, 650 KB, PDF document

Keywords

Multi-armed bandit problem, Design of sequential experiments, Bayesian decision theory, Dynamic programming, Index rules, Response-adaptive randomization, Julia programming language

View graph of relations

BinaryBandit: An Efficient Julia Package for Optimization and Evaluation of the Finite-Horizon Bandit Problem with Binary Responses

Research output: Working paper

Published

Standard

BinaryBandit: An Efficient Julia Package for Optimization and Evaluation of the Finite-Horizon Bandit Problem with Binary Responses. / Jacko, Peter.
Lancaster: Lancaster University Management School, 2019. (Management Science Working Paper; Vol. 2019, No. 4).

Research output: Working paper

Bibtex

@techreport{ed7f4439832e424fbca2035c5361145c,

title = "BinaryBandit: An Efficient Julia Package for Optimization and Evaluation of the Finite-Horizon Bandit Problem with Binary Responses",

abstract = "Variants of the multi-armed bandit problem for design of sequential experiments have been studied in several disciplines for almost a century, but the performance evaluation of proposed designs or finding a Bayes-optimal design over the finite horizon has resisted derivation of a closed formulae. Computational optimization and evaluation is thus the only possible approach. The BinaryBandit package in Julia programming language has been developed to provide such framework with a number of designs, easily extendable to add new designs. The package is based on the use an efficient implementation of backward recursion which gives accurate (up to computer accuracy) evaluation for small and moderate horizons. For instance, on a standard laptop or desktop computer, the Bayes-optimal design for the two-armed problem can be computed for offline use or evaluated in online fashion in a few minutes (horizon around $ 1,000 $), in a few hours (horizon around $ 2,000 $), or in a few days (horizon around $ 4,000 $). 32GB of RAM allows storing (e.g., for offline use) of the whole design up to horizon around $ 1440 $; when its storing is not needed (e.g., for Bayesian evaluation or for calculation of the initial action) it allows up to horizon around $ 4440 $. These problems are significantly larger than what has been reported in the literature, since moderate and large horizons have only been evaluated by simulation, trading-off accuracy. This paper describes the details of the backward recursion implementation andgives an example of the package usage.",

keywords = "Multi-armed bandit problem, Design of sequential experiments, Bayesian decision theory, Dynamic programming, Index rules, Response-adaptive randomization, Julia programming language",

author = "Peter Jacko",

year = "2019",

month = aug,

day = "1",

language = "English",

series = "Management Science Working Paper",

publisher = "Lancaster University Management School",

number = "4",

type = "WorkingPaper",

institution = "Lancaster University Management School",

}

RIS

TY - UNPB

T1 - BinaryBandit

T2 - An Efficient Julia Package for Optimization and Evaluation of the Finite-Horizon Bandit Problem with Binary Responses

AU - Jacko, Peter

PY - 2019/8/1

Y1 - 2019/8/1

N2 - Variants of the multi-armed bandit problem for design of sequential experiments have been studied in several disciplines for almost a century, but the performance evaluation of proposed designs or finding a Bayes-optimal design over the finite horizon has resisted derivation of a closed formulae. Computational optimization and evaluation is thus the only possible approach. The BinaryBandit package in Julia programming language has been developed to provide such framework with a number of designs, easily extendable to add new designs. The package is based on the use an efficient implementation of backward recursion which gives accurate (up to computer accuracy) evaluation for small and moderate horizons. For instance, on a standard laptop or desktop computer, the Bayes-optimal design for the two-armed problem can be computed for offline use or evaluated in online fashion in a few minutes (horizon around $ 1,000 $), in a few hours (horizon around $ 2,000 $), or in a few days (horizon around $ 4,000 $). 32GB of RAM allows storing (e.g., for offline use) of the whole design up to horizon around $ 1440 $; when its storing is not needed (e.g., for Bayesian evaluation or for calculation of the initial action) it allows up to horizon around $ 4440 $. These problems are significantly larger than what has been reported in the literature, since moderate and large horizons have only been evaluated by simulation, trading-off accuracy. This paper describes the details of the backward recursion implementation andgives an example of the package usage.

AB - Variants of the multi-armed bandit problem for design of sequential experiments have been studied in several disciplines for almost a century, but the performance evaluation of proposed designs or finding a Bayes-optimal design over the finite horizon has resisted derivation of a closed formulae. Computational optimization and evaluation is thus the only possible approach. The BinaryBandit package in Julia programming language has been developed to provide such framework with a number of designs, easily extendable to add new designs. The package is based on the use an efficient implementation of backward recursion which gives accurate (up to computer accuracy) evaluation for small and moderate horizons. For instance, on a standard laptop or desktop computer, the Bayes-optimal design for the two-armed problem can be computed for offline use or evaluated in online fashion in a few minutes (horizon around $ 1,000 $), in a few hours (horizon around $ 2,000 $), or in a few days (horizon around $ 4,000 $). 32GB of RAM allows storing (e.g., for offline use) of the whole design up to horizon around $ 1440 $; when its storing is not needed (e.g., for Bayesian evaluation or for calculation of the initial action) it allows up to horizon around $ 4440 $. These problems are significantly larger than what has been reported in the literature, since moderate and large horizons have only been evaluated by simulation, trading-off accuracy. This paper describes the details of the backward recursion implementation andgives an example of the package usage.

KW - Multi-armed bandit problem

KW - Design of sequential experiments

KW - Bayesian decision theory

KW - Dynamic programming

KW - Index rules

KW - Response-adaptive randomization

KW - Julia programming language

M3 - Working paper

T3 - Management Science Working Paper

BT - BinaryBandit

PB - Lancaster University Management School

CY - Lancaster

ER -

Research

Electronic data

Keywords

BinaryBandit: An Efficient Julia Package for Optimization and Evaluation of the Finite-Horizon Bandit Problem with Binary Responses

Standard

Harvard

APA

Vancouver

Author

Bibtex

RIS

Quick Links

Connect With Us

Faculties & Depts

Contact Us