
Electronic data

  • 2025ArabzadehPhD

    Final published version, 1.52 MB, PDF document

    Embargo ends: 1/06/27

    Available under license: CC BY: Creative Commons Attribution 4.0 International License


Robust and personalised online learning

Research output: Thesis › Doctoral Thesis

Published
Publication date: 2025
Number of pages: 169
Qualification: PhD
Awarding Institution
Supervisors/Advisors
Thesis sponsors
  • UKRI EPSRC
Publisher
  • Lancaster University
Original language: English

Abstract

Over the past decade, multi-armed bandits have attracted significant attention from the online learning and machine learning communities, owing to their broad applicability in both theory and practice. From recommendation systems to adaptive control, the bandit framework effectively tackles the exploration-exploitation trade-off by modelling sequential decision-making problems in a mathematically tractable manner. Meanwhile, the rapid growth of data generated by smartphones, edge devices, and networked sensors has created an urgent need for private and decentralised solutions. Federated learning meets this need by enabling collaborative model training without centralising raw user data, thereby preserving user privacy and mitigating the risks associated with data transfer.
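As a concrete illustration of the exploration-exploitation trade-off mentioned above, the sketch below implements the standard UCB1 rule for the classical finite-armed case. It is background only, not an algorithm from the thesis, and the Bernoulli reward setup is a hypothetical example.

import math
import random

def ucb1(num_arms, horizon, pull):
    # Minimal UCB1: play each arm once, then repeatedly pick the arm with the
    # highest empirical mean plus an exploration bonus (optimism).
    counts = [0] * num_arms      # times each arm was played
    means = [0.0] * num_arms     # empirical mean reward per arm
    for t in range(1, horizon + 1):
        if t <= num_arms:        # initial round: try every arm once
            arm = t - 1
        else:                    # exploit the mean, explore via the confidence radius
            arm = max(range(num_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = pull(arm)       # observe only the chosen arm's reward
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return means, counts

# Hypothetical usage: three Bernoulli arms with unknown success probabilities.
probs = [0.2, 0.5, 0.7]
means, counts = ucb1(3, 2000, lambda a: 1.0 if random.random() < probs[a] else 0.0)
print(counts)  # the best arm (index 2) should accumulate most of the pulls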
In this thesis, we integrate advanced online learning solutions into a federated environment, focusing on the X-armed bandit problem, a generalisation of the multi-armed bandit to continuous action spaces. We present two major lines of work: one addressing the personalisation challenge by adapting to heterogeneous user distributions, and another ensuring robustness in the face of corrupted or adversarial clients. Our solution employs an optimistic, phase-based approach that enhances efficiency, supported by confidence bounds that guarantee reliable performance. Beyond the federated setting, we also introduce a corruption-robust solution for the centralised version of X-armed bandits, providing theoretical guarantees on performance under adversarial perturbations. Rigorous theoretical analyses confirm the effectiveness of our methods and offer insights into robust, privacy-aware sequential decision-making in distributed environments.
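To make the phase-based, confidence-bound idea concrete, here is a minimal, hypothetical sketch of phased elimination over a discretised continuous action space on [0, 1]. It is not the federated, personalised, or corruption-robust algorithm developed in the thesis; the cell count, phase schedule, and reward function are illustrative assumptions only.

import math
import random

def phased_elimination(pull, num_cells=16, num_phases=6):
    # Discretise the action space [0, 1] into cells and keep a set of active
    # cells; each phase plays every active cell equally, then eliminates cells
    # whose upper confidence bound falls below the best lower confidence bound.
    cells = [(i + 0.5) / num_cells for i in range(num_cells)]  # cell centres
    active = list(range(num_cells))
    means = [0.0] * num_cells
    counts = [0] * num_cells
    for phase in range(1, num_phases + 1):
        for a in active:
            for _ in range(2 ** phase):            # more pulls in later phases
                r = pull(cells[a])
                counts[a] += 1
                means[a] += (r - means[a]) / counts[a]
        # confidence radius shrinks as a cell accumulates samples
        radius = {a: math.sqrt(math.log(num_cells * num_phases) / counts[a]) for a in active}
        best_lcb = max(means[a] - radius[a] for a in active)
        active = [a for a in active if means[a] + radius[a] >= best_lcb]
    best = max(active, key=lambda a: means[a])
    return cells[best]

# Hypothetical usage: noisy rewards peaking at x = 0.7.
print(phased_elimination(lambda x: 1.0 - abs(x - 0.7) + random.gauss(0.0, 0.1)))

Playing in phases and discarding provably suboptimal regions between phases is one common way to keep communication and analysis tractable; the thesis develops this line of reasoning in the federated and adversarially corrupted settings.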