NTRS - NASA Technical Reports Server

Expert System for UNIX System Reliability and Availability Enhancement
Highly reliable and available systems are critical to the airline industry. However, most off-the-shelf computer operating systems and hardware do not have built-in fault-tolerant mechanisms; the UNIX workstation is one example. In this research effort, we have developed a rule-based Expert System (ES) to monitor, command, and control a UNIX workstation system with hot-standby redundancy. The ES on each workstation acts as an on-line system administrator to diagnose, report, correct, and prevent certain types of hardware and software failures. If a primary station is approaching failure, the ES coordinates the switch-over to a hot-standby secondary workstation. The goal is to discover and solve certain fatal problems early enough to prevent complete system failure, and thereby to enhance system reliability and availability. Test results show that the ES can diagnose all targeted fault scenarios and take the desired actions consistently, regardless of the sequence of the faults. The ES can perform designated system administration tasks about ten times faster than an experienced human operator. Compared with a single workstation system, the downtime of our hot-standby redundancy system is predicted to be reduced by more than 50 percent when the ES is used to command and control the system.
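The abstract does not give the rule base or the switch-over protocol, so the following is only a minimal sketch of how a rule-based monitor with hot-standby switch-over might be organized. Every name, metric, and threshold here (`Rule`, `RULES`, `evaluate`, `active_station`, `disk_used_pct`, `load_avg`, `missed_heartbeats`) is a hypothetical placeholder, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical metric snapshot; the keys and thresholds below are
# illustrative assumptions, not values from the paper.
Metrics = Dict[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[Metrics], bool]  # True when the fault is detected
    action: str                           # "report", "correct", or "switch_over"


RULES: List[Rule] = [
    Rule("disk_nearly_full", lambda m: m["disk_used_pct"] >= 90.0, "correct"),
    Rule("high_load", lambda m: m["load_avg"] >= 8.0, "report"),
    Rule("heartbeat_lost", lambda m: m["missed_heartbeats"] >= 3, "switch_over"),
]


def evaluate(metrics: Metrics, rules: List[Rule] = RULES) -> List[Tuple[str, str]]:
    """Return (rule name, action) for every rule whose condition holds."""
    return [(r.name, r.action) for r in rules if r.condition(metrics)]


def active_station(metrics: Metrics, primary: str = "primary",
                   standby: str = "standby") -> str:
    """Pick the station that should be active after evaluating the rules."""
    fired = evaluate(metrics)
    if any(action == "switch_over" for _, action in fired):
        return standby
    return primary
```

In this sketch each rule is checked against a single snapshot of the metrics, so the set of rules that fire does not depend on the order in which the faults appeared. That is one simple way to obtain sequence-independent behavior of the kind the abstract reports; the paper's actual mechanism may differ.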
Document ID
19930016399
Acquisition Source
Legacy CDMS
Document Type
Conference Paper
Authors
Xu, Catherine Q.
(Aeronautical Radio, Inc. Annapolis, MD, United States)
Date Acquired
September 6, 2013
Publication Date
February 1, 1993
Publication Information
Publication: NASA, Washington, Technology 2002: The Third National Technology Transfer Conference and Exposition, Volume 1
Subject Category
Quality Assurance And Reliability
Accession Number
93N25588
Distribution Limits
Public
Copyright
Work of the US Gov. Public Use Permitted.