NTRS - NASA Technical Reports Server

Measurement and analysis of operating system fault tolerance

This paper demonstrates a methodology to model and evaluate the fault tolerance characteristics of operational software. The methodology is illustrated through case studies on three different operating systems: the Tandem GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Measurements are made on these systems for substantial periods to collect software error and recovery data. In addition to investigating basic dependability characteristics such as major software problems and error distributions, we develop two levels of models to describe error and recovery processes inside an operating system and on multiple instances of an operating system running in a distributed environment. Based on the models, reward analysis is conducted to evaluate the loss of service due to software errors and the effect of the fault-tolerance techniques implemented in the systems. Software error correlation in multicomputer systems is also investigated.
Document ID
19930003352
Acquisition Source
Legacy CDMS
Document Type
Contractor Report (CR)
Authors
Lee, I.
(Illinois Univ. Urbana-Champaign, IL, United States)
Tang, D.
(Illinois Univ. Urbana-Champaign, IL, United States)
Iyer, R. K.
(Illinois Univ. Urbana-Champaign, IL, United States)
Date Acquired
September 6, 2013
Publication Date
October 1, 1992
Subject Category
Computer Operations And Hardware
Report/Patent Number
NAS 1.26:190973
UILU-ENG-92-2240
NASA-CR-190973
CRHC-92-22
Accession Number
93N12540
Funding Number(s)
CONTRACT_GRANT: NAG1-613
CONTRACT_GRANT: N00014-91-J-1116
Distribution Limits
Public
Copyright
Work of the US Gov. Public Use Permitted.