Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Application-level diagnostic and membership protocols for generic time-triggered systems
AU - Serafini, M.
AU - Bokor, P.
AU - Suri, Neeraj
AU - Vinter, J.
AU - Ademaj, A.
AU - Brandstätter, W.
AU - Tagliabo, F.
AU - Koch, J.
PY - 2011/3/1
Y1 - 2011/3/1
N2 - We present online tunable diagnostic and membership protocols for generic time-triggered (TT) systems to detect crashes, send/receive omission faults, and network partitions. Compared to existing diagnostic and membership protocols for TT systems, our protocols do not rely on the single-fault assumption and also tolerate non-fail-silent (Byzantine) faults. They run at the application level and can be added on top of any TT system (possibly as a middleware component) without requiring modifications at the system level. The information on detected faults is accumulated using a penalty/reward algorithm to handle transient faults. After a fault is detected, the likelihood of node isolation can be adapted to different system configurations, including configurations where functions with different criticality levels are integrated. All protocols are formally verified using model checking. Using actual automotive and aerospace parameters, we also experimentally demonstrate the transient fault handling capabilities of the protocols. © 2011 IEEE.
AB - We present online tunable diagnostic and membership protocols for generic time-triggered (TT) systems to detect crashes, send/receive omission faults, and network partitions. Compared to existing diagnostic and membership protocols for TT systems, our protocols do not rely on the single-fault assumption and also tolerate non-fail-silent (Byzantine) faults. They run at the application level and can be added on top of any TT system (possibly as a middleware component) without requiring modifications at the system level. The information on detected faults is accumulated using a penalty/reward algorithm to handle transient faults. After a fault is detected, the likelihood of node isolation can be adapted to different system configurations, including configurations where functions with different criticality levels are integrated. All protocols are formally verified using model checking. Using actual automotive and aerospace parameters, we also experimentally demonstrate the transient fault handling capabilities of the protocols. © 2011 IEEE.
KW - Diagnosis
KW - membership
KW - time-triggered systems
KW - transient faults
KW - Application level
KW - Membership protocols
KW - Middleware components
KW - Network partitions
KW - Node isolation
KW - System configurations
KW - System levels
KW - Time triggered
KW - Time-triggered systems
KW - TT systems
KW - Fault tree analysis
KW - Middleware
KW - Model checking
KW - Online systems
KW - Quality assurance
KW - Reliability
KW - Network protocols
U2 - 10.1109/TDSC.2010.23
DO - 10.1109/TDSC.2010.23
M3 - Journal article
VL - 8
SP - 177
EP - 193
JO - IEEE Transactions on Dependable and Secure Computing
JF - IEEE Transactions on Dependable and Secure Computing
SN - 1545-5971
IS - 2
ER -