Home > Research > Publications & Outputs > A predictive fault-tolerance framework for IoT ...

Electronic data

Text available via DOI:

View graph of relations

A predictive fault-tolerance framework for IoT systems

Research output: ThesisDoctoral Thesis

Published

Standard

A predictive fault-tolerance framework for IoT systems. / Power, Alexander.
Lancaster University, 2020. 211 p.

Research output: ThesisDoctoral Thesis

Harvard

APA

Power, A. (2020). A predictive fault-tolerance framework for IoT systems. [Doctoral Thesis, Lancaster University]. Lancaster University. https://doi.org/10.17635/lancaster/thesis/1063

Vancouver

Power A. A predictive fault-tolerance framework for IoT systems. Lancaster University, 2020. 211 p. doi: 10.17635/lancaster/thesis/1063

Author

Bibtex

@phdthesis{eba6f7414f5941edb814d1d4d9ba6d10,
title = "A predictive fault-tolerance framework for IoT systems",
abstract = "As Internet of Things (IoT) systems scale, attributes such as availability, reliability, safety, maintainability, security, and performance become increasingly more important. A key challenge to realise IoT is how to provide a dependable infrastructure for the billions of expected IoT devices. A dependable IoT system is one that can defensibly be trusted to deliver its intended service within a given time period.To define a FT-support solution that is applicable to all IoT systems, it is important that error definition is a generic, language-agnostic process, so that FT can be applied as a software pattern. It must also be interoperable, so that FT support can be easily 'plugged into' any existing IoT system, which is facilitated by an adherence to standards and protocols. Lastly, it is important that FT support is, itself, fault tolerant, so that it can be depended on to provide correct support for IoT systems.The work in this thesis considers how real-time and historical data analysis techniques can be combined to monitor an IoT environment and analyse its short- and long-term data to make the system as resilient to failure as possible. Specifically, complex event processing (CEP) is proposed for real-time error detection based on the analysis of stream data in an IoT system, where errors are defined as nondeterministic finite automata (NFA). For long-term error analysis, machine learning (ML) is proposed to predict when an error is likely to occur and mitigate imminent system faults based on previous experience of erroneous system behaviour in the IoT system.The contribution is threefold: (1) a language-agnostic approach to error definition using NFAs, designed to provide 'FT as a service' for easy deployment and integration into existing IoT systems; (2) an implementation of NFAs on a bespoke CEP system, BoboCEP, that provides distributed, resilient event processing at the network edge via active replication; and (3) a ML approach to intelligent FT that can learn from system errors over time to ensure correct long-term FT support. The proposed solution was evaluated using two vertical-farming testbeds and a dataset from a real-world vertical farm. Results showed that the proposed solution could detect and predict the successful detection and recovery of erroneous system behaviours. A performance analysis of BoboCEP was conducted with favourable results.",
author = "Alexander Power",
year = "2020",
month = aug,
day = "11",
doi = "10.17635/lancaster/thesis/1063",
language = "English",
publisher = "Lancaster University",
school = "Lancaster University",

}

RIS

TY - BOOK

T1 - A predictive fault-tolerance framework for IoT systems

AU - Power, Alexander

PY - 2020/8/11

Y1 - 2020/8/11

N2 - As Internet of Things (IoT) systems scale, attributes such as availability, reliability, safety, maintainability, security, and performance become increasingly more important. A key challenge to realise IoT is how to provide a dependable infrastructure for the billions of expected IoT devices. A dependable IoT system is one that can defensibly be trusted to deliver its intended service within a given time period.To define a FT-support solution that is applicable to all IoT systems, it is important that error definition is a generic, language-agnostic process, so that FT can be applied as a software pattern. It must also be interoperable, so that FT support can be easily 'plugged into' any existing IoT system, which is facilitated by an adherence to standards and protocols. Lastly, it is important that FT support is, itself, fault tolerant, so that it can be depended on to provide correct support for IoT systems.The work in this thesis considers how real-time and historical data analysis techniques can be combined to monitor an IoT environment and analyse its short- and long-term data to make the system as resilient to failure as possible. Specifically, complex event processing (CEP) is proposed for real-time error detection based on the analysis of stream data in an IoT system, where errors are defined as nondeterministic finite automata (NFA). For long-term error analysis, machine learning (ML) is proposed to predict when an error is likely to occur and mitigate imminent system faults based on previous experience of erroneous system behaviour in the IoT system.The contribution is threefold: (1) a language-agnostic approach to error definition using NFAs, designed to provide 'FT as a service' for easy deployment and integration into existing IoT systems; (2) an implementation of NFAs on a bespoke CEP system, BoboCEP, that provides distributed, resilient event processing at the network edge via active replication; and (3) a ML approach to intelligent FT that can learn from system errors over time to ensure correct long-term FT support. The proposed solution was evaluated using two vertical-farming testbeds and a dataset from a real-world vertical farm. Results showed that the proposed solution could detect and predict the successful detection and recovery of erroneous system behaviours. A performance analysis of BoboCEP was conducted with favourable results.

AB - As Internet of Things (IoT) systems scale, attributes such as availability, reliability, safety, maintainability, security, and performance become increasingly more important. A key challenge to realise IoT is how to provide a dependable infrastructure for the billions of expected IoT devices. A dependable IoT system is one that can defensibly be trusted to deliver its intended service within a given time period.To define a FT-support solution that is applicable to all IoT systems, it is important that error definition is a generic, language-agnostic process, so that FT can be applied as a software pattern. It must also be interoperable, so that FT support can be easily 'plugged into' any existing IoT system, which is facilitated by an adherence to standards and protocols. Lastly, it is important that FT support is, itself, fault tolerant, so that it can be depended on to provide correct support for IoT systems.The work in this thesis considers how real-time and historical data analysis techniques can be combined to monitor an IoT environment and analyse its short- and long-term data to make the system as resilient to failure as possible. Specifically, complex event processing (CEP) is proposed for real-time error detection based on the analysis of stream data in an IoT system, where errors are defined as nondeterministic finite automata (NFA). For long-term error analysis, machine learning (ML) is proposed to predict when an error is likely to occur and mitigate imminent system faults based on previous experience of erroneous system behaviour in the IoT system.The contribution is threefold: (1) a language-agnostic approach to error definition using NFAs, designed to provide 'FT as a service' for easy deployment and integration into existing IoT systems; (2) an implementation of NFAs on a bespoke CEP system, BoboCEP, that provides distributed, resilient event processing at the network edge via active replication; and (3) a ML approach to intelligent FT that can learn from system errors over time to ensure correct long-term FT support. The proposed solution was evaluated using two vertical-farming testbeds and a dataset from a real-world vertical farm. Results showed that the proposed solution could detect and predict the successful detection and recovery of erroneous system behaviours. A performance analysis of BoboCEP was conducted with favourable results.

U2 - 10.17635/lancaster/thesis/1063

DO - 10.17635/lancaster/thesis/1063

M3 - Doctoral Thesis

PB - Lancaster University

ER -