Prospects for bandit solutions in sensor management

Management Science

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Prospects for bandit solutions in sensor management. / Pavlidis, N; Adams, N M; Nicholson, M et al.
In: The Computer Journal, Vol. 53, No. 9, 2010, p. 1370-1383.

Harvard

Pavlidis, N, Adams, NM, Nicholson, M & Hand, DJ 2010, 'Prospects for bandit solutions in sensor management', The Computer Journal, vol. 53, no. 9, pp. 1370-1383. https://doi.org/10.1093/comjnl/bxp122

APA

Pavlidis, N., Adams, N. M., Nicholson, M., & Hand, D. J. (2010). Prospects for bandit solutions in sensor management. The Computer Journal, 53(9), 1370-1383. https://doi.org/10.1093/comjnl/bxp122

Vancouver

Pavlidis N, Adams NM, Nicholson M, Hand DJ. Prospects for bandit solutions in sensor management. The Computer Journal. 2010;53(9):1370-1383. doi: 10.1093/comjnl/bxp122

Author

Pavlidis, N ; Adams, N M ; Nicholson, M et al. / Prospects for bandit solutions in sensor management. In: The Computer Journal. 2010 ; Vol. 53, No. 9. pp. 1370-1383.

Bibtex

@article{79120b4cb9e849fd80bc076ce16efac7,

title = "Prospects for bandit solutions in sensor management",

abstract = "Sensor management in information-rich and dynamic environments can be posed as a sequential action selection problem with side information. To study such problems we employ the dynamic multi-armed bandit with covariates framework. In this generalization of the multi-armed bandit, the expected rewards are time-varying linear functions of the covariate vector. The learning goal is to associate the covariate with the optimal action at each instance, essentially learning to partition the covariate space adaptively. Applications of sensor management are frequently in environments in which the precise nature of the dynamics is unknown. In such settings, the sensor manager tracks the evolving environment by observing only the covariates and the consequences of the selected actions. This creates difficulties not encountered in static problems, and changes the exploitation–exploration dilemma. We study the relationship between the different factors of the problem and provide interesting insights. The impact of the environment dynamics on the action selection problem is influenced by the covariate dimensionality. We present the surprising result that strategies that perform very little or no exploration perform surprisingly well in dynamic environments.",

author = "Pavlidis, N and Adams, {N M} and Nicholson, M and Hand, {D J}",

year = "2010",

doi = "10.1093/comjnl/bxp122",

language = "English",

volume = "53",

pages = "1370--1383",

journal = "The Computer Journal",

issn = "0010-4620",

publisher = "Oxford University Press",

number = "9",

}

RIS

TY - JOUR

T1 - Prospects for bandit solutions in sensor management

AU - Pavlidis, N

AU - Adams, N M

AU - Nicholson, M

AU - Hand, D J

PY - 2010

Y1 - 2010

N2 - Sensor management in information-rich and dynamic environments can be posed as a sequential action selection problem with side information. To study such problems we employ the dynamic multi-armed bandit with covariates framework. In this generalization of the multi-armed bandit, the expected rewards are time-varying linear functions of the covariate vector. The learning goal is to associate the covariate with the optimal action at each instance, essentially learning to partition the covariate space adaptively. Applications of sensor management are frequently in environments in which the precise nature of the dynamics is unknown. In such settings, the sensor manager tracks the evolving environment by observing only the covariates and the consequences of the selected actions. This creates difficulties not encountered in static problems, and changes the exploitation–exploration dilemma. We study the relationship between the different factors of the problem and provide interesting insights. The impact of the environment dynamics on the action selection problem is influenced by the covariate dimensionality. We present the surprising result that strategies that perform very little or no exploration perform surprisingly well in dynamic environments.

AB - Sensor management in information-rich and dynamic environments can be posed as a sequential action selection problem with side information. To study such problems we employ the dynamic multi-armed bandit with covariates framework. In this generalization of the multi-armed bandit, the expected rewards are time-varying linear functions of the covariate vector. The learning goal is to associate the covariate with the optimal action at each instance, essentially learning to partition the covariate space adaptively. Applications of sensor management are frequently in environments in which the precise nature of the dynamics is unknown. In such settings, the sensor manager tracks the evolving environment by observing only the covariates and the consequences of the selected actions. This creates difficulties not encountered in static problems, and changes the exploitation–exploration dilemma. We study the relationship between the different factors of the problem and provide interesting insights. The impact of the environment dynamics on the action selection problem is influenced by the covariate dimensionality. We present the surprising result that strategies that perform very little or no exploration perform surprisingly well in dynamic environments.

U2 - 10.1093/comjnl/bxp122

DO - 10.1093/comjnl/bxp122

M3 - Journal article

VL - 53

SP - 1370

EP - 1383

JO - The Computer Journal

JF - The Computer Journal

SN - 0010-4620

IS - 9

ER -
