NTRS - NASA Technical Reports Server

Redundancy management for efficient fault recovery in NASA's distributed computing system
The management of redundancy in computer systems was studied, and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management through efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources used for embedding the computational graphs of tasks in the system architecture and for reconfiguring these tasks after a failure has occurred. Computational structures represented by a path and by a complete binary tree were considered, and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of the Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance.
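As an illustration of what embedding a path in a hypercube means, the following minimal sketch maps the tasks of a path onto hypercube nodes using the standard reflected Gray code, so that consecutive tasks land on directly connected nodes. This is a textbook construction chosen for illustration; the record does not state which embedding algorithms the report actually uses, and the function names here are hypothetical.

```python
def gray_code(i: int) -> int:
    """Return the i-th reflected Gray code word."""
    return i ^ (i >> 1)


def embed_path_in_hypercube(dim: int) -> list[int]:
    """Map path task k (0 <= k < 2**dim) to a hypercube node label.

    Assumption: tasks form a simple path and the target is a
    fault-free dim-dimensional hypercube whose nodes are labeled
    with dim-bit integers.
    """
    return [gray_code(k) for k in range(2 ** dim)]


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two node labels differ."""
    return bin(a ^ b).count("1")


def has_dilation_one(placement: list[int]) -> bool:
    """True if every pair of consecutive tasks sits on adjacent nodes."""
    return all(
        hamming_distance(u, v) == 1
        for u, v in zip(placement, placement[1:])
    )


if __name__ == "__main__":
    placement = embed_path_in_hypercube(3)
    print(placement)
    print(has_dilation_one(placement))
```

Two hypercube nodes are adjacent exactly when their labels differ in one bit, which is the property the check relies on. Reconfiguration after a node failure is not covered by this sketch.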
Document ID
19910008298
Acquisition Source
Legacy CDMS
Document Type
Contractor Report (CR)
Authors
Malek, Miroslaw
(Texas Univ. Austin, TX, United States)
Pandya, Mihir
(Texas Univ. Austin, TX, United States)
Yau, Kitty
(Texas Univ. Austin, TX, United States)
Date Acquired
September 6, 2013
Publication Date
February 15, 1991
Subject Category
Computer Programming And Software
Report/Patent Number
NAS 1.26:187879
NASA-CR-187879
Accession Number
91N17611
Funding Number(s)
CONTRACT_GRANT: NAG9-351
Distribution Limits
Public
Copyright
Work of the US Gov. Public Use Permitted.