Resources: BW - Chapter 2 - Reliability and fault tolerance.pdf


Fault tolerance relies on redundancy: redundant components are introduced into a system so that faults can be detected and tolerated.


Acceptance Test Failure Modes

Levels of fault tolerance in a system:

  • Full fault tolerance (most safety-critical systems)
    • The system continues to operate in the presence of faults, albeit for a limited period, with no significant loss of functionality or performance.
  • Graceful degradation / Fail soft
    • The system continues to operate in the presence of errors, accepting a partial degradation of functionality or performance during recovery or repair.
  • Fail safe
    • The system maintains its integrity while accepting a temporary halt in its operation.
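
The three levels can be read as a fallback chain. The sketch below is only an illustration, not something taken from the chapter: the names `Level`, `ComponentFault`, `respond`, `broken`, `full_service` and `reduced_service` are all hypothetical. It assumes a fault shows up as an exception, that two redundant replicas offer the full service, and that an optional reduced service exists.

```python
from enum import Enum


class Level(Enum):
    FULL = "full fault tolerance"
    DEGRADED = "graceful degradation"
    FAIL_SAFE = "fail safe"


class ComponentFault(Exception):
    """Raised by a (hypothetical) component that has failed."""


def respond(primary, backup, degraded):
    """Try each option in turn and report the level of service delivered.

    primary, backup: callables offering the full service (redundant replicas).
    degraded: callable offering a reduced service, or None if there is none.
    Returns (level, result); result is None when the system halts safely.
    """
    for replica in (primary, backup):
        try:
            # Full service delivered, possibly by the redundant replica.
            return Level.FULL, replica()
        except ComponentFault:
            continue  # fault detected; try the next replica
    if degraded is not None:
        try:
            # Partial loss of functionality is accepted during repair.
            return Level.DEGRADED, degraded()
        except ComponentFault:
            pass
    # Nothing usable is left: halt, but keep the system's integrity.
    return Level.FAIL_SAFE, None


# Hypothetical components used only to exercise the sketch.
def broken():
    raise ComponentFault("component failed")


def full_service():
    return "full-rate control"


def reduced_service():
    return "reduced-rate control"


if __name__ == "__main__":
    print(respond(broken, full_service, reduced_service))
    print(respond(broken, broken, reduced_service))
    print(respond(broken, broken, None))
```

With one failed replica the caller still gets the full service. With both replicas failed it gets the reduced service. With nothing left it gets a safe halt, which a real system would map to a defined safe state rather than a `None` result.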

Two discussed techniques: