Home > Research > Publications & Outputs > Poisson process bandits

Electronic data

  • 2020grantphd

    Final published version, 1.93 MB, PDF document

    Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

Text available via DOI:

View graph of relations

Poisson process bandits: Sequential models and algorithms for maximising the detection of point data

Research output: ThesisDoctoral Thesis

Published

Standard

Poisson process bandits : Sequential models and algorithms for maximising the detection of point data. / Grant, James.

Lancaster University, 2020. 271 p.

Research output: ThesisDoctoral Thesis

Harvard

APA

Vancouver

Author

Bibtex

@phdthesis{b90f8d33761847cb8e4ad2e5321f9e38,
title = "Poisson process bandits: Sequential models and algorithms for maximising the detection of point data",
abstract = "In numerous settings in areas as diverse as security, ecology, astronomy, and logistics, it is desirable to optimally deploy a limited resource to observe events, which may be modelled as point data arising according to a Non-homogeneous Poisson process. Increasingly, thanks to developments in mobile and adaptive technologies, it is possible to update a deployment of such resourceand gather feedback on the quality of multiple actions. Such a capability presents the opportunity to learn, and with it a classic problem in operations research and machine learning - the explorationexploitation dilemma. To perform optimally, how should investigative choices which explore thevalue of poorly understood actions and optimising choices which choose actions known to be of a high value be balanced? Effective techniques exist to resolve this dilemma in simpler settings, but the Poisson process data brings new challenges.In this thesis, effective solution methods for the problem of sequentially deploying resource are developed, via a combination of efficient inference schemes, bespoke optimisation approaches, and advanced sequential decision-making strategies. Furthermore, extensive theoretical work providesstrong guarantees on the performance of the proposed solution methods and an understanding of the challenges of this problem and more complex extensions.In particular, Upper Confidence Bound and Thompson Sampling (TS) approaches are derived for combinatorial and continuum-armed bandit versions of the problem, with accompanying analysis displaying that the regret of the approaches is of optimal order. A broader understanding of the performance of TS based on non-parametric models for smooth reward functions is developed, and new posterior contraction results for the Gaussian Cox Process, a popular Bayesian non-parametric model of point data, are derived. These results point to effective strategies for more challenging variants of the event detection problem, and more generally advance the understanding of bandit decision-making with complex data structures.",
author = "James Grant",
year = "2020",
month = feb,
day = "17",
doi = "10.17635/lancaster/thesis/871",
language = "English",
publisher = "Lancaster University",
school = "Lancaster University",

}

RIS

TY - THES

T1 - Poisson process bandits

T2 - Sequential models and algorithms for maximising the detection of point data

AU - Grant, James

PY - 2020/2/17

Y1 - 2020/2/17

N2 - In numerous settings in areas as diverse as security, ecology, astronomy, and logistics, it is desirable to optimally deploy a limited resource to observe events, which may be modelled as point data arising according to a Non-homogeneous Poisson process. Increasingly, thanks to developments in mobile and adaptive technologies, it is possible to update a deployment of such resourceand gather feedback on the quality of multiple actions. Such a capability presents the opportunity to learn, and with it a classic problem in operations research and machine learning - the explorationexploitation dilemma. To perform optimally, how should investigative choices which explore thevalue of poorly understood actions and optimising choices which choose actions known to be of a high value be balanced? Effective techniques exist to resolve this dilemma in simpler settings, but the Poisson process data brings new challenges.In this thesis, effective solution methods for the problem of sequentially deploying resource are developed, via a combination of efficient inference schemes, bespoke optimisation approaches, and advanced sequential decision-making strategies. Furthermore, extensive theoretical work providesstrong guarantees on the performance of the proposed solution methods and an understanding of the challenges of this problem and more complex extensions.In particular, Upper Confidence Bound and Thompson Sampling (TS) approaches are derived for combinatorial and continuum-armed bandit versions of the problem, with accompanying analysis displaying that the regret of the approaches is of optimal order. A broader understanding of the performance of TS based on non-parametric models for smooth reward functions is developed, and new posterior contraction results for the Gaussian Cox Process, a popular Bayesian non-parametric model of point data, are derived. These results point to effective strategies for more challenging variants of the event detection problem, and more generally advance the understanding of bandit decision-making with complex data structures.

AB - In numerous settings in areas as diverse as security, ecology, astronomy, and logistics, it is desirable to optimally deploy a limited resource to observe events, which may be modelled as point data arising according to a Non-homogeneous Poisson process. Increasingly, thanks to developments in mobile and adaptive technologies, it is possible to update a deployment of such resourceand gather feedback on the quality of multiple actions. Such a capability presents the opportunity to learn, and with it a classic problem in operations research and machine learning - the explorationexploitation dilemma. To perform optimally, how should investigative choices which explore thevalue of poorly understood actions and optimising choices which choose actions known to be of a high value be balanced? Effective techniques exist to resolve this dilemma in simpler settings, but the Poisson process data brings new challenges.In this thesis, effective solution methods for the problem of sequentially deploying resource are developed, via a combination of efficient inference schemes, bespoke optimisation approaches, and advanced sequential decision-making strategies. Furthermore, extensive theoretical work providesstrong guarantees on the performance of the proposed solution methods and an understanding of the challenges of this problem and more complex extensions.In particular, Upper Confidence Bound and Thompson Sampling (TS) approaches are derived for combinatorial and continuum-armed bandit versions of the problem, with accompanying analysis displaying that the regret of the approaches is of optimal order. A broader understanding of the performance of TS based on non-parametric models for smooth reward functions is developed, and new posterior contraction results for the Gaussian Cox Process, a popular Bayesian non-parametric model of point data, are derived. These results point to effective strategies for more challenging variants of the event detection problem, and more generally advance the understanding of bandit decision-making with complex data structures.

U2 - 10.17635/lancaster/thesis/871

DO - 10.17635/lancaster/thesis/871

M3 - Doctoral Thesis

PB - Lancaster University

ER -