Bandit Procedures for Designing Patient-Centric Clinical Trials

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Chapter

Published
Publication date: 21/09/2022
Host publication: Springer Series in Supply Chain Management
Editors: Xi Chen, Stefanus Jasin, Cong Shi
Place of publication: London
Publisher: Springer Nature
Pages: 365-389
Number of pages: 25
Edition: 1
ISBN (electronic): 9783031019265
ISBN (print): 9783031019258
Original language: English

Publication series

Name: Springer Series in Supply Chain Management
Volume: 18
ISSN (print): 2365-6395
ISSN (electronic): 2365-6409

Abstract

Multi-armed bandit problems (MABPs) are a special type of optimal control problem that has been studied in operations research, statistics, machine learning, economics, and other fields. The framework is well suited to modelling resource allocation under uncertainty in a wide variety of contexts. Across the theoretical literature, the use of bandit models to optimally design clinical trials is one of the most typical motivating applications, where "optimally" refers to designing so-called patient-centric trials, which take into account the benefit of the in-trial patients and are therefore considered by some researchers to be more ethical. Nevertheless, the resulting theory has had little influence on the actual design of clinical trials. In contrast to similar learning problems arising, for instance, in digital marketing, where interventions can be tested on millions of users at negligible cost, clinical trials are about "small data": recruiting patients is remarkably expensive and, in many cases, ethically challenging. In this book chapter, we review a variety of operations research and machine learning approaches that lead to algorithms to "solve" the finite-horizon MABP, and then interpret them in the context of designing clinical trials. Because of the focus on small sample sizes, we do not resort to approximating the binomial distribution by the normal distribution, a common practice for large samples adopted either "for simplicity" or "for ease of computation". Solving a MABP essentially means deriving a response-adaptive procedure for allocating patients to arms in a finite-sample experiment with no early stopping. We evaluate and compare the performance of these procedures, including the traditional and still dominant clinical trial design choice: equal fixed randomization.
Our results illustrate how bandit approaches could offer significant advantages, mainly by allocating more patients to better interventions, but still pose important inferential challenges, particularly their lower statistical power, their potential for bias in estimation, and questions about the existence of closed-form test distributions or asymptotic theory. We illustrate some promising modifications to bandit procedures that address the power and bias issues, and we reflect on the open challenges that remain for an increased uptake of bandit models in clinical trials.
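To make the notion of a response-adaptive allocation procedure concrete, the sketch below simulates one well-known bandit rule, Thompson sampling, on a two-armed trial with binary (Bernoulli) patient outcomes, a fixed finite horizon, and no early stopping. This is a minimal illustration of the problem class the abstract describes, not an algorithm taken from the chapter; the arm success probabilities and horizon are arbitrary assumptions chosen for the example.

```python
import random

def thompson_trial(p_arms, horizon, seed=0):
    """Simulate a response-adaptive trial using Thompson sampling.

    Each arm gets a Beta(1, 1) prior on its success probability; each
    simulated patient is allocated to the arm whose posterior draw is
    largest, and the posterior is updated with the observed outcome.
    Returns per-arm allocation counts and per-arm success counts.
    """
    rng = random.Random(seed)
    k = len(p_arms)
    alpha = [1] * k   # posterior Beta alpha (successes + 1)
    beta = [1] * k    # posterior Beta beta (failures + 1)
    counts = [0] * k
    successes = [0] * k
    for _ in range(horizon):
        # Draw a plausible success probability for each arm from its posterior
        draws = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: draws[i])
        # Simulate the allocated patient's binary response
        outcome = rng.random() < p_arms[arm]
        counts[arm] += 1
        successes[arm] += outcome
        if outcome:
            alpha[arm] += 1
        else:
            beta[arm] += 1
    return counts, successes

# A trial of 200 patients with true success rates 0.3 and 0.6:
counts, successes = thompson_trial([0.3, 0.6], horizon=200)
print("allocations per arm:", counts)
print("successes per arm:", successes)
```

Unlike equal fixed randomization, which would allocate roughly 100 patients to each arm, this procedure tends to shift allocation toward the better arm as evidence accumulates, which is precisely the patient-benefit property (and the source of the inferential complications) discussed in the abstract.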

Bibliographic note

Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.