The architecture for a shared-memory CPU is described. The CPU allows for parallelism down to the level of single instructions and is tolerant of memory latency. All executable instructions and memory accesses are time-stamped. The TimeWarp algorithm is used for managing synchronisation. This algorithm is optimistic and requires that all computations can be rolled back. The basic functions required for implementing the control and memory system used by TimeWarp are described. The memory model presented to the programmer is a single linear address space modified by a single thread of control. Thus, at the software level there is no need for explicit synchronising actions when accessing memory. The physical implementation, however, consists of multiple CPUs, each with its own cache and local memory, and each simultaneously executing multiple threads of control.
John G. Cleary, Murray Pearson, Husam Kinawi
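To make the optimistic synchronisation idea concrete, the following is a minimal sketch of the rollback mechanism that TimeWarp-style execution relies on: operations are executed eagerly in timestamp order, state is checkpointed before each one, and an arriving operation with an earlier timestamp than the local virtual time (a "straggler") forces the process to roll back and re-execute. The class name TimeWarpProcess, the integer state, and the event representation are illustrative assumptions, not details taken from the paper or from its hardware implementation.

import bisect

class TimeWarpProcess:
    """Toy optimistic process: executes timestamped events eagerly and
    rolls back when a straggler (an event older than local virtual time)
    arrives. Purely illustrative; not the paper's hardware mechanism."""

    def __init__(self):
        self.lvt = 0          # local virtual time
        self.state = 0        # trivial process state (an integer)
        self.processed = []   # history of (timestamp, saved_state, delta)
        self.pending = []     # events not yet executed, sorted by timestamp

    def receive(self, timestamp, delta):
        # A straggler forces a rollback to just before its timestamp.
        if timestamp < self.lvt:
            self.rollback(timestamp)
        bisect.insort(self.pending, (timestamp, delta))

    def rollback(self, timestamp):
        # Restore the last state saved before 'timestamp' and requeue
        # the rolled-back events for re-execution.
        while self.processed and self.processed[-1][0] >= timestamp:
            ts, saved_state, delta = self.processed.pop()
            self.state = saved_state
            bisect.insort(self.pending, (ts, delta))
        self.lvt = self.processed[-1][0] if self.processed else 0

    def run(self):
        # Optimistically execute everything currently pending.
        while self.pending:
            ts, delta = self.pending.pop(0)
            self.processed.append((ts, self.state, delta))  # checkpoint
            self.state += delta                             # "execute"
            self.lvt = ts
        return self.state

# Usage: events at timestamps 10 and 20 execute optimistically; the late
# event at timestamp 15 triggers a rollback and correct re-execution.
p = TimeWarpProcess()
p.receive(10, 1); p.receive(20, 2)
p.run()               # lvt is now 20, state is 3
p.receive(15, 5)      # straggler: rolls back the timestamp-20 event
print(p.run())        # prints 8, as if events ran in timestamp order

In the architecture described above this bookkeeping is performed by the hardware control and memory system rather than by software, which is what lets the programmer see a single linear address space with a single thread of control.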