NTRS - NASA Technical Reports Server

Health Monitor for Multitasking, Safety-Critical, Real-Time Software

The Health Manager can detect bad health before a failure occurs by periodically monitoring the application software: it checks for code corruption and sanity-checks each critical data value prior to use. A processor's memory can fail and corrupt the software, or the software can accidentally write to the wrong address and overwrite the executing code. This innovation continuously calculates a checksum of the software load to detect corrupted code, which allows a system to detect a failure before it happens.

The innovation monitors each software task (thread), so that if any task reports "bad health" or does not report to the Health Manager, the system is declared bad. The Health Manager reports overall system health to the outside world by outputting a square-wave signal. If the square wave stops, either the system health is bad or the system is hung and cannot report. Either way, "bad health" can be detected, whether caused by an error, corrupted data, or a hung processor.

A separate Health Monitor Task is started and runs periodically in a loop that starts and stops by pending on a semaphore. Each monitored task registers with the Health Manager, which maintains a count for the task. The registering task must indicate whether it will run more or less often than the Health Manager. If the task runs more often than the Health Manager, the monitored task calls a health function that increments the count and verifies that it did not exceed the max-count. When the periodic Health Manager runs, it verifies that the count did not exceed the max-count and zeroes it. If the task runs less often than the Health Manager, the roles are reversed: the periodic Health Manager increments the count and the monitored task zeroes it, and both the Health Manager and the monitored task verify that the count did not exceed the max-count.
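The counting scheme described above can be illustrated with a small single-threaded sketch. It is not the flight implementation: the class and method names (`HealthManager`, `MonitoredTask`, `register`, `report`, `run_cycle`) are hypothetical, CRC-32 stands in for the unspecified checksum, a byte buffer stands in for the software load, and a toggled integer stands in for the square-wave output. Treating a zero count as a missed report from a faster-running task is also an assumption of this sketch; the brief only states that the count is checked against the max-count.

```python
import zlib
from dataclasses import dataclass


@dataclass
class MonitoredTask:
    """Per-task record kept by the manager (hypothetical layout)."""
    max_count: int
    runs_faster: bool      # True if the task runs more often than the manager
    count: int = 0
    healthy: bool = True


class HealthManager:
    def __init__(self, code_image):
        # Assumption: the software load is modelled as a byte buffer and
        # CRC-32 is used as the checksum; the brief names neither.
        self.code_image = code_image
        self.expected_crc = zlib.crc32(code_image)
        self.tasks = {}
        self.system_healthy = True
        self.output_level = 0  # stand-in for the square-wave output pin

    def register(self, name, max_count, runs_faster):
        self.tasks[name] = MonitoredTask(max_count, runs_faster)

    def report(self, name, healthy=True):
        """Health function called by a monitored task each time it runs."""
        task = self.tasks[name]
        if not healthy:
            task.healthy = False
        if task.runs_faster:
            # Faster task: the task increments, the manager zeroes.
            task.count += 1
            if task.count > task.max_count:
                task.healthy = False
        else:
            # Slower task: the manager increments, the task zeroes.
            if task.count > task.max_count:
                task.healthy = False
            task.count = 0

    def run_cycle(self):
        """One periodic pass of the manager; returns overall health."""
        if zlib.crc32(self.code_image) != self.expected_crc:
            self.system_healthy = False  # code corruption detected
        for task in self.tasks.values():
            if task.runs_faster:
                # Assumption: a zero count means the task never reported.
                if task.count == 0 or task.count > task.max_count:
                    task.healthy = False
                task.count = 0
            else:
                task.count += 1
                if task.count > task.max_count:
                    task.healthy = False
            if not task.healthy:
                self.system_healthy = False
        # Bad health is latched: once set, the output stops toggling.
        if self.system_healthy:
            self.output_level ^= 1
        return self.system_healthy
```

In this sketch a caller would register each task once, have each task call `report` on every run, and call `run_cycle` at the manager's period. An external observer only needs to watch `output_level`: as long as it keeps alternating the system is considered healthy, and once any check fails it stops changing. A real system would run the manager in its own task and protect the counts against concurrent access, which this sketch omits.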
Document ID: 20110012599
Acquisition Source: Kennedy Space Center
Document Type: Other - NASA Tech Brief
Authors: Zoerner, Roger (NASA Kennedy Space Center Cocoa Beach, FL, United States)
Date Acquired: August 25, 2013
Publication Date: June 1, 2011
Publication Information: NASA Tech Briefs, June 2011
Subject Category: Man/System Technology And Life Support
Report/Patent Number: KSC-12809
Distribution Limits: Public
Copyright: Public Use Permitted.