
Discounted multi-armed bandit problems on a collection of machines with varying speeds

Research output: Contribution to Journal/Magazine > Journal article > peer-review

Published

Standard

Discounted multi-armed bandit problems on a collection of machines with varying speeds. / Glazebrook, Kevin; Dunn, R. T.
In: Mathematics of Operations Research, Vol. 29, No. 2, 2004, p. 266-279.

Vancouver

Glazebrook K, Dunn RT. Discounted multi-armed bandit problems on a collection of machines with varying speeds. Mathematics of Operations Research. 2004;29(2):266-279. doi: 10.1287/moor.1030.0068

Author

Glazebrook, Kevin; Dunn, R. T. / Discounted multi-armed bandit problems on a collection of machines with varying speeds. In: Mathematics of Operations Research. 2004; Vol. 29, No. 2. pp. 266-279.

Bibtex

@article{2b7e177123ed4755bf98b7bf6f2bfb2e,
title = "Discounted multi-armed bandit problems on a collection of machines with varying speeds",
abstract = "This paper is the first to consider general multiarmed bandit problems on parallel machines working at different speeds. Block allocation policies make a once-for-all allocation of bandits to machines at time zero. In this class we describe how to achieve Blackwell optimality under given conditions. The block allocation policy identified allocates the bandits with the largest guaranteed reward rates to the machines operating at greatest speed. This policy is shown to be average-reward optimal in the class of general (nonanticipative, nonidling) policies.",
keywords = "average reward optimality, Blackwell optimality, Gittins index, multiarmed bandit, sensitive discount optimality",
author = "Glazebrook, Kevin and Dunn, R. T.",
note = "RAE_import_type : Journal article RAE_uoa_type : Statistics and Operational Research",
year = "2004",
doi = "10.1287/moor.1030.0068",
language = "English",
volume = "29",
pages = "266--279",
journal = "Mathematics of Operations Research",
issn = "0364-765X",
publisher = "INFORMS Inst. for Operations Res. and the Management Sciences",
number = "2",
}

RIS

TY  - JOUR
T1  - Discounted multi-armed bandit problems on a collection of machines with varying speeds
AU  - Glazebrook, Kevin
AU  - Dunn, R. T.
N1  - RAE_import_type : Journal article RAE_uoa_type : Statistics and Operational Research
PY  - 2004
Y1  - 2004
N2  - This paper is the first to consider general multiarmed bandit problems on parallel machines working at different speeds. Block allocation policies make a once-for-all allocation of bandits to machines at time zero. In this class we describe how to achieve Blackwell optimality under given conditions. The block allocation policy identified allocates the bandits with the largest guaranteed reward rates to the machines operating at greatest speed. This policy is shown to be average-reward optimal in the class of general (nonanticipative, nonidling) policies.
AB  - This paper is the first to consider general multiarmed bandit problems on parallel machines working at different speeds. Block allocation policies make a once-for-all allocation of bandits to machines at time zero. In this class we describe how to achieve Blackwell optimality under given conditions. The block allocation policy identified allocates the bandits with the largest guaranteed reward rates to the machines operating at greatest speed. This policy is shown to be average-reward optimal in the class of general (nonanticipative, nonidling) policies.
KW  - average reward optimality
KW  - Blackwell optimality
KW  - Gittins index
KW  - multiarmed bandit
KW  - sensitive discount optimality
U2  - 10.1287/moor.1030.0068
DO  - 10.1287/moor.1030.0068
M3  - Journal article
VL  - 29
SP  - 266
EP  - 279
JO  - Mathematics of Operations Research
JF  - Mathematics of Operations Research
SN  - 0364-765X
IS  - 2
ER  - 