Abstract:
A programming method and structure for operating a computing system to restart an entire subsystem, or a subset of that subsystem, to an operable state following a total interruption (system failure or termination, whether normal or abnormal). The subsystem isolates inoperable resources while permitting the others to resume. It does so by independently maintaining two structures: a first structure that records the completion state of each resource manager's recovery responsibility with respect to each interrupted work unit, and a second structure that records the operational state and recovery log scope of interest of each resource manager. The completion state can be influenced by whether a resource manager is started and, if it is restarted, by the presence or absence of the resource subset required to accomplish the work unit recovery.
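
The two structures described above can be illustrated with a minimal sketch. It is not the claimed implementation, only one possible reading of the abstract. All names here (`Completion`, `ResourceManagerStatus`, `restart`, `log_scope_start`, and the `required` mapping) are hypothetical and chosen for illustration. The sketch assumes a work unit's recovery by a given resource manager completes only if that manager has been started and every resource it needs for that work unit is available.

```python
from dataclasses import dataclass, field
from enum import Enum


class Completion(Enum):
    PENDING = "pending"  # recovery responsibility not yet discharged
    DONE = "done"        # recovery responsibility discharged


@dataclass
class ResourceManagerStatus:
    """Entry in the (hypothetical) second structure: one per resource manager."""
    started: bool                 # operational state after the restart
    log_scope_start: int          # assumed form of the recovery log scope of interest
    available_resources: set = field(default_factory=set)


def restart(work_unit_table, rm_table, required):
    """Advance completion states in the first structure.

    work_unit_table: {(work_unit, rm_name): Completion}  (first structure)
    rm_table:        {rm_name: ResourceManagerStatus}    (second structure)
    required:        {(work_unit, rm_name): set of resource names}
    """
    for (work_unit, rm_name), state in list(work_unit_table.items()):
        if state is Completion.DONE:
            continue
        rm = rm_table.get(rm_name)
        if rm is None or not rm.started:
            continue  # manager not started: its work stays pending, others proceed
        needed = required.get((work_unit, rm_name), set())
        if needed <= rm.available_resources:
            work_unit_table[(work_unit, rm_name)] = Completion.DONE
        # otherwise a required resource is inoperable: leave the entry pending
    return work_unit_table


if __name__ == "__main__":
    table = {
        ("wu1", "db"): Completion.PENDING,
        ("wu1", "queue"): Completion.PENDING,
        ("wu2", "db"): Completion.PENDING,
    }
    managers = {
        "db": ResourceManagerStatus(True, 100, {"tableA"}),
        "queue": ResourceManagerStatus(False, 40),
    }
    needs = {
        ("wu1", "db"): {"tableA"},
        ("wu1", "queue"): {"q1"},
        ("wu2", "db"): {"tableB"},
    }
    for key, state in restart(table, managers, needs).items():
        print(key, state.value)
```

In this sketch, entries left as `PENDING` stand for the isolated work: the rest of the subsystem resumes, and a later call to `restart` with updated manager status can complete them.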