The customizable fault/error model for dependable distributed systems

Research output: Contribution to Journal/Magazine > Journal article > peer-review

Published

Standard

The customizable fault/error model for dependable distributed systems. / Walter, C.J.; Suri, Neeraj.
In: Theoretical Computer Science, Vol. 290, No. 2, 02.01.2003, p. 1223-1251.

Harvard

Walter, CJ & Suri, N 2003, 'The customizable fault/error model for dependable distributed systems', Theoretical Computer Science, vol. 290, no. 2, pp. 1223-1251. https://doi.org/10.1016/S0304-3975(01)00203-1

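APA

Walter, C. J., & Suri, N. (2003). The customizable fault/error model for dependable distributed systems. Theoretical Computer Science, 290(2), 1223-1251. https://doi.org/10.1016/S0304-3975(01)00203-1
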
Vancouver

Walter CJ, Suri N. The customizable fault/error model for dependable distributed systems. Theoretical Computer Science. 2003 Jan 2;290(2):1223-1251. doi: 10.1016/S0304-3975(01)00203-1

Author

Walter, C.J. ; Suri, Neeraj. / The customizable fault/error model for dependable distributed systems. In: Theoretical Computer Science. 2003 ; Vol. 290, No. 2. pp. 1223-1251.

Bibtex

@article{a99e39680911427c8200fb23b13d6848,
title = "The customizable fault/error model for dependable distributed systems",
abstract = "Dependability is a qualitative term referring to a system's ability to meet its service requirements in the presence of faults. The types and number of faults covered by a system play a primary role in determining the level of dependability which that system can potentially provide. Given the variety and multiplicity of fault types, to simplify the design process, the system algorithm design often focuses on specific fault types, resulting in either over-optimistic (all faults permanent) or over-pessimistic (all faults malicious) dependable system designs. A more practical and realistic approach is to recognize that faults of varied severity levels and of differing occurrence probabilities may appear as combinations rather than the assumed single fault type occurrences. The ability to allow the user to select/customize a particular combination of fault types of varied severity characterizes the proposed customizable fault/error model (CFEM). The CFEM organizes diverse fault categories into a cohesive framework by classifying faults according to the effect they have on the required system services rather than by targeting the source of the fault condition. In this paper, we develop (a) the complete framework for the CFEM fault classification, (b) the voting functions applicable under the CFEM, and (c) the fundamental distributed services of consensus and convergence under the CFEM on which dependable distributed functionality can be supported. {\textcopyright} 2002 Elsevier Science B.V. All rights reserved.",
keywords = "Dependability, Distributed systems, Error classification, Fault modeling, Computer system recovery, Error analysis, Fault tolerant computer systems, Probability, Systems analysis, Theorem proving, Dependable distributed systems, Distributed computer systems",
author = "C.J. Walter and Neeraj Suri",
year = "2003",
month = jan,
day = "2",
doi = "10.1016/S0304-3975(01)00203-1",
language = "English",
volume = "290",
pages = "1223--1251",
journal = "Theoretical Computer Science",
issn = "0304-3975",
publisher = "Elsevier",
number = "2",
}

RIS

TY - JOUR
T1 - The customizable fault/error model for dependable distributed systems
AU - Walter, C.J.
AU - Suri, Neeraj
PY - 2003/1/2
Y1 - 2003/1/2
AB - Dependability is a qualitative term referring to a system's ability to meet its service requirements in the presence of faults. The types and number of faults covered by a system play a primary role in determining the level of dependability which that system can potentially provide. Given the variety and multiplicity of fault types, to simplify the design process, the system algorithm design often focuses on specific fault types, resulting in either over-optimistic (all faults permanent) or over-pessimistic (all faults malicious) dependable system designs. A more practical and realistic approach is to recognize that faults of varied severity levels and of differing occurrence probabilities may appear as combinations rather than the assumed single fault type occurrences. The ability to allow the user to select/customize a particular combination of fault types of varied severity characterizes the proposed customizable fault/error model (CFEM). The CFEM organizes diverse fault categories into a cohesive framework by classifying faults according to the effect they have on the required system services rather than by targeting the source of the fault condition. In this paper, we develop (a) the complete framework for the CFEM fault classification, (b) the voting functions applicable under the CFEM, and (c) the fundamental distributed services of consensus and convergence under the CFEM on which dependable distributed functionality can be supported. © 2002 Elsevier Science B.V. All rights reserved.
KW - Dependability
KW - Distributed systems
KW - Error classification
KW - Fault modeling
KW - Computer system recovery
KW - Error analysis
KW - Fault tolerant computer systems
KW - Probability
KW - Systems analysis
KW - Theorem proving
KW - Dependable distributed systems
KW - Distributed computer systems
U2 - 10.1016/S0304-3975(01)00203-1
DO - 10.1016/S0304-3975(01)00203-1
M3 - Journal article
VL - 290
SP - 1223
EP - 1251
JO - Theoretical Computer Science
JF - Theoretical Computer Science
SN - 0304-3975
IS - 2
ER -