Lazy checkpoint coordination for bounding rollback propagation
Independent checkpointing allows maximum process autonomy but suffers from potential domino effects. Coordinated checkpointing eliminates the domino effect by sacrificing a certain degree of process autonomy. In this paper, we propose the technique of lazy checkpoint coordination which preserves process autonomy while employing communication-induced checkpoint coordination for bounding rollback propagation. The introduction of the notion of laziness allows a flexible trade-off between the cost for checkpoint coordination and the average rollback distance. Worst-case overhead analysis provides a means for estimating the extra checkpoint overhead. Communication trace-driven simulation for several parallel programs is used to evaluate the benefits of the proposed scheme for real applications.
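The abstract does not spell out the coordination rule, so the following is only a minimal sketch of how a laziness parameter could trade coordination cost against rollback distance. It assumes that each message piggybacks the sender's checkpoint index and that a receiver takes an induced checkpoint only when the sender has reached a later group of `laziness` checkpoints. The names `Process`, `laziness`, `receive`, and `induced_count` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Process:
    laziness: int      # assumed parameter Z >= 1; larger means lazier coordination
    index: int = 0     # sequence number of the latest checkpoint
    induced: int = 0   # number of extra (communication-induced) checkpoints

    def basic_checkpoint(self) -> None:
        """Autonomous checkpoint taken on the process's own schedule."""
        self.index += 1

    def send(self) -> int:
        """Return the checkpoint index piggybacked on an outgoing message."""
        return self.index

    def receive(self, piggybacked: int) -> bool:
        """Assumed rule: checkpoint only if the sender is in a later group of Z."""
        sender_group = piggybacked // self.laziness
        if sender_group > self.index // self.laziness:
            # Jump to the first index of the sender's group before delivery.
            self.index = sender_group * self.laziness
            self.induced += 1
            return True
        return False


def induced_count(laziness: int, sender_checkpoints: int = 8) -> int:
    """Count induced checkpoints at a receiver that never checkpoints on its own."""
    sender, receiver = Process(laziness), Process(laziness)
    for _ in range(sender_checkpoints):
        sender.basic_checkpoint()
        receiver.receive(sender.send())
    return receiver.induced


if __name__ == "__main__":
    for z in (1, 2, 4, 16):
        print(z, induced_count(z))
```

Under these assumptions, a laziness of 1 forces the receiver to follow every sender checkpoint, while larger values induce fewer checkpoints at the price of coordinated checkpoints being spaced further apart, which is the kind of trade-off the abstract describes.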
Document ID
19940034721
Acquisition Source
Legacy CDMS
Document Type
Conference Paper
Authors
Wang, Yi-Min (NASA Langley Research Center Hampton, VA, United States)
Fuchs, W. K. (Illinois Univ. Urbana, United States)
Date Acquired
August 16, 2013
Publication Date
October 1, 1993
Subject Category
Computer Programming And Software
Meeting Information
Meeting: Symposium on Reliable Distributed Systems