No cache-based techniques for roll-forward fault recovery exist at present. A split-cache approach is proposed that provides efficient support for checkpointing and roll-forward fault recovery in distributed systems. This approach obviates the need for discrete stable storage or explicit synchronization among the processors. Stability of the checkpoint intervals is used as a driver for real-time operations.