Optimization for Interval Type-2 Polynomial Fuzzy Systems: A Deep Reinforcement Learning Approach

Research output: Contribution to Journal/Magazine › Journal article › peer-review

  • Bo Xiao
  • Hak-Keung Lam
  • Chengbin Xuan
  • Ziwei Wang
  • Eric Yeatman
Journal publication date: 7/07/2022
Journal: IEEE Transactions on Artificial Intelligence
Number of pages: 12
Publication status: E-pub ahead of print
Early online date: 7/07/22
Original language: English

Abstract

It is known that interval type-2 (IT2) fuzzy controllers are superior to their type-1 counterparts in terms of robustness, flexibility, etc. However, how to conduct type reduction optimally while accounting for system stability under the fuzzy-model-based (FMB) control framework is still an open problem. To address this issue, we present a new approach that combines membership-function-dependent (MFD) and deep reinforcement learning (DRL) techniques. In the proposed approach, the type reduction of the IT2 membership functions of the fuzzy controller is completed while the control performance is being optimized. Another fundamental issue is that the stability conditions must hold under different type-reduction methods. It is tedious and impractical to re-derive the stability conditions for each type-reduction method, as there are infinitely many possibilities. It is more practical to guarantee that the stability conditions hold during type reduction than to re-derive them; to this end, the MFD approach is proposed together with the imperfect premise matching (IPM) concept. Thanks to the unique merit of the MFD approach, the stability conditions are guaranteed to be valid for all embedded type-1 membership functions within the footprint of uncertainty (FOU). During the control process, the state transitions, together with a properly engineered cost/reward function, can be used to approximate the deterministic policy gradient, which optimizes the acting policy and thereby improves the control performance by determining the grades of the IT2 membership functions of the fuzzy controller. A detailed simulation example is provided to verify the merits of the proposed approach.
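To make the type-reduction idea concrete, the sketch below shows the standard notion of an embedded type-1 membership grade inside a FOU: a convex combination of the lower and upper membership bounds, weighted by a coefficient in [0, 1]. The Gaussian bound shapes and the function names are illustrative assumptions, not the paper's actual membership functions; in the paper's setting the combination coefficient would be produced by the DRL policy rather than chosen by hand.

```python
import math

def lower_mf(x):
    # Lower bound of the FOU (assumed Gaussian shape, for illustration only)
    return 0.8 * math.exp(-x**2 / 2.0)

def upper_mf(x):
    # Upper bound of the FOU (assumed wider Gaussian, for illustration only)
    return math.exp(-x**2 / 8.0)

def embedded_mf(x, lam):
    """Embedded type-1 membership grade: a convex combination of the
    FOU's lower and upper bounds, parameterized by lam in [0, 1].
    In the DRL-based scheme, lam plays the role of the quantity the
    learned policy adjusts to tune the controller's membership grades."""
    lam = min(max(lam, 0.0), 1.0)  # clamp so the grade stays inside the FOU
    return lam * upper_mf(x) + (1.0 - lam) * lower_mf(x)

# Any lam in [0, 1] yields a grade between the two bounds, which is why
# MFD stability conditions derived over the whole FOU remain valid for
# every embedded type-1 membership function the policy can select.
x = 0.5
w = embedded_mf(x, 0.3)
print(lower_mf(x) <= w <= upper_mf(x))  # True
```

Because the clamp keeps the combination convex, the DRL policy can freely explore different type reductions without ever leaving the region where the MFD stability guarantees were established.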