This paper studies the performance of a concurrent multithreaded architectural model called superthreading [15]. The model integrates optimizing compilation techniques with run-time hardware support to exploit both thread-level and instruction-level parallelism, whereas existing superscalar processors exploit only instruction-level parallelism. The superthreaded architecture uses a thread-pipelining execution model to increase the overlap between threads and to facilitate the enforcement of data dependences between threads through compiler-directed, hardware-supported, thread-level control speculation and run-time data dependence checking. We evaluate the performance of the superthreaded processor with a detailed trace-driven simulator. Our results show that the superthreaded execution model can obtain good performance by exploiting both thread-level and instruction-level parallelism in programs. We also study the design parameters of its main system components, such as th...