A fault-tolerant software strategy for digital systems
Techniques developed for producing fault-tolerant software are described. Tolerance is required because of the impossibility of defining fault-free software. Faults are caused by humans and can appear anywhere in the software life cycle. Tolerance is effected through error detection, damage assessment, recovery, and fault treatment, followed by return of the system to service. Multiversion software comprises two or more versions of the software yielding solutions which are examined by a decision algorithm. Errors can also be detected by extrapolation from previous results or by the acceptability of results. Violations of timing specifications can reveal errors, or the system can roll back to an error-free state when a defect is detected. The software, when used in flight control systems, must not impinge on time-critical responses. Efforts are still needed to reduce the costs of developing fault-tolerant systems.
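The multiversion and acceptability ideas summarized above can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `majority_vote` and `run_multiversion`, the numeric tolerance, and the choice of a strict-majority vote as the decision algorithm are all assumptions made for illustration. Rollback is modeled simply as returning the last known-good value.

```python
from typing import Callable, Optional, Sequence


def majority_vote(results: Sequence[float], n_versions: int,
                  tolerance: float = 1e-6) -> Optional[float]:
    """Hypothetical decision algorithm: return a value that a strict
    majority of all versions agree on (within tolerance), else None."""
    for candidate in results:
        agreeing = sum(1 for r in results if abs(r - candidate) <= tolerance)
        if agreeing * 2 > n_versions:
            return candidate
    return None


def run_multiversion(versions: Sequence[Callable[[float], float]],
                     x: float,
                     acceptable: Callable[[float], bool],
                     last_good: float) -> float:
    """Run every version, vote on the results, apply an acceptance test,
    and fall back to the last known-good value if either check fails."""
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            # A version that raises is treated as a detected error and skipped.
            continue

    value = majority_vote(results, len(versions))
    if value is not None and acceptable(value):
        return value
    # No majority, or the result is not acceptable: roll back (assumed policy).
    return last_good


# Three independently written "versions" of x squared; the third is faulty.
versions = [lambda x: x * x, lambda x: x ** 2, lambda x: x + 1]
result = run_multiversion(versions, 3.0,
                          acceptable=lambda v: 0.0 <= v <= 100.0,
                          last_good=0.0)
```

In this sketch the faulty third version is outvoted by the two that agree. A real flight control system would also need to bound the execution time of each version, which this example does not attempt.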
Document ID
19850035682
Acquisition Source
Legacy CDMS
Document Type
Conference Paper
Authors
Hitt, E. F. (Battelle Columbus Laboratories, Columbus, OH, United States)
Webb, J. J. (Battelle Columbus Laboratories, Columbus, OH, United States)