
Electronic data

  • icc_final

    Rights statement: ©2019 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

    Accepted author manuscript, 220 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Links

Text available via DOI:


Anti-Intelligent UAV Jamming Strategy via Deep Q-Networks

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published
Publication date: 20/05/2019
Host publication: 2019 IEEE International Conference on Communications, ICC 2019 - Proceedings
Publisher: IEEE
Number of pages: 6
ISBN (electronic): 9781538680889
ISBN (print): 9781538680896
Original language: English

Publication series

Name: IEEE International Conference on Communications
Volume: 2019-May
ISSN (Print): 1550-3607

Abstract

Downlink communications are vulnerable to intelligent unmanned aerial vehicle (UAV) jamming attacks, in which the jammer can learn the optimal attack strategy in complex communication environments. In this paper, we propose an anti-intelligent UAV jamming strategy in which the mobile users learn the optimal defense strategy to avoid jamming. Specifically, the UAV jammer acts as the leader and the users act as followers. The problem is formulated as a Stackelberg dynamic game consisting of a leader sub-game and a followers' sub-game. Since the UAV jammer has only incomplete channel state information (CSI) of the users, we model the leader sub-game as a partially observable Markov decision process (POMDP) and obtain the optimal jamming trajectory via deep recurrent Q-networks (DRQN) in three-dimensional space. We model the followers' sub-game as a Markov decision process (MDP), and the optimal communication trajectory is learned via deep Q-networks (DQN) in two-dimensional space. We prove the existence of the Stackelberg equilibrium. Simulations show that the proposed strategy outperforms the benchmark strategies.
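The abstract describes a two-level learning setup: the jammer (leader) solves a POMDP with a deep recurrent Q-network over a 3-D trajectory, while each user (follower) solves an MDP with a plain DQN over a 2-D trajectory. The sketch below illustrates only the follower side, under assumed details (PyTorch, a small 2-D action grid, an assumed observation layout and reward) that are not taken from the paper; it is not the authors' implementation. The leader's DRQN side would differ mainly by adding a recurrent layer (e.g., an LSTM) so the agent can carry a belief over the partially observed CSI.

```python
# Minimal illustrative sketch of a DQN follower agent choosing 2-D movement actions.
# Grid, observation layout, reward, and network widths are assumptions for illustration.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]   # move up/down/right/left or stay


class QNet(nn.Module):
    """Q-network: user position + estimated jammer position -> one Q-value per action."""
    def __init__(self, obs_dim=4, n_actions=len(ACTIONS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)


def act_eps_greedy(qnet, obs, eps=0.1):
    """Pick a 2-D movement action, exploring with probability eps."""
    if random.random() < eps:
        return random.randrange(len(ACTIONS))
    with torch.no_grad():
        return int(qnet(torch.tensor(obs, dtype=torch.float32)).argmax())


def dqn_update(qnet, target, buffer, optimizer, batch_size=32, gamma=0.95):
    """One DQN step: sample (obs, action, reward, next_obs, done) transitions
    from the replay buffer and regress Q(obs, action) onto the TD target."""
    if len(buffer) < batch_size:
        return
    obs, act, rew, nxt, done = map(torch.tensor, zip(*random.sample(buffer, batch_size)))
    q = qnet(obs.float()).gather(1, act.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        td_target = rew.float() + gamma * target(nxt.float()).max(1).values * (1 - done.float())
    loss = nn.functional.mse_loss(q, td_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


qnet, target = QNet(), QNet()
target.load_state_dict(qnet.state_dict())
optimizer = optim.Adam(qnet.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)   # transitions collected by interacting with the jamming environment
```

In this sketch the reward would be derived from the user's achieved SINR or rate under the jammer's current position, which is what ties the follower's learning to the leader's jamming trajectory in the Stackelberg formulation.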
