
Fully unsupervised fault detection and identification based on recursive density estimation and self-evolving cloud-based classifier

Research output: Contribution to Journal/Magazine, Journal article, peer-reviewed

<mark>Journal publication date</mark>: 20/02/2015
Issue number: A
Number of pages: 15
Pages (from-to): 289-303
Publication Status: Published
<mark>Original language</mark>: English


In this paper, we propose a two-stage algorithm for real-time fault detection and identification in industrial plants. Our proposal is based on the analysis of selected features using recursive density estimation and a new evolving classifier algorithm. More specifically, the detection stage is based on the concept of density in the data space, which is not the same as the probability density function but is a very useful measure for abnormality/outlier detection. This density can be expressed by a Cauchy function and can be calculated recursively, which makes it efficient in both memory and computation and, therefore, applicable to on-line applications. The identification/diagnosis stage is based on a self-developing (evolving) fuzzy-rule-based classifier system proposed in this paper, called AutoClass. An important property of AutoClass is that it can start learning “from scratch”, in a fully unsupervised manner: neither the fuzzy rules nor the number of classes needs to be pre-specified, and the number of classes may grow as the online learning process adds new class labels. If an initial rule base exists, AutoClass can evolve/develop it further based on newly arrived faulty-state data. To validate our proposal, we present experimental results from a didactic level-control process, where control and error signals are used as features for the fault detection and identification system. The approach is nevertheless generic, and the number of features can be large because the methodology is computationally lean: it requires neither covariance or more complex calculations nor the storage of old data. The obtained results are significantly better than those of the traditional approaches.
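The recursive, Cauchy-type density described for the detection stage can be illustrated with a minimal sketch. The abstract does not state the update equations, so the code below assumes one formulation commonly used for recursive density estimation: a running mean of the samples and a running mean of their squared norms, combined as D = 1 / (1 + ||x - mean||^2 + mean_sq_norm - ||mean||^2). The class name `RecursiveDensity`, the method `update`, and the sample values are hypothetical, chosen only for illustration, and may differ from the paper's implementation.

```python
import numpy as np


class RecursiveDensity:
    """Illustrative recursive density estimator (assumed formulation, not the paper's code)."""

    def __init__(self, n_features):
        self.k = 0                          # number of samples seen so far
        self.mean = np.zeros(n_features)    # running mean of the samples
        self.mean_sq_norm = 0.0             # running mean of squared sample norms

    def update(self, x):
        """Absorb one sample and return its density, a value in (0, 1]."""
        x = np.asarray(x, dtype=float)
        self.k += 1
        w = 1.0 / self.k
        # Recursive updates: only the two running statistics are stored, never old samples.
        self.mean = (1.0 - w) * self.mean + w * x
        self.mean_sq_norm = (1.0 - w) * self.mean_sq_norm + w * float(x @ x)
        diff = x - self.mean
        dist_sq = float(diff @ diff)
        # Scalar spread of the data; clamped at zero to guard against rounding error.
        spread = max(self.mean_sq_norm - float(self.mean @ self.mean), 0.0)
        # Cauchy-type kernel: close to 1 near the bulk of the data, small for outliers.
        return 1.0 / (1.0 + dist_sq + spread)


# Hypothetical two-feature stream (for example, control and error signals).
estimator = RecursiveDensity(n_features=2)
stream = [[1.0, 0.0], [1.1, 0.1], [0.9, -0.1], [1.0, 0.05], [5.0, 5.0]]
densities = [estimator.update(sample) for sample in stream]
```

In this sketch a sharp drop in the returned density, as for the last sample, would flag a candidate fault. How such a drop is thresholded and passed on to the identification stage is not specified in the abstract and is left open here.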