The shading processors in graphics hardware are becoming increasingly general-purpose. We test, through simulation and benchmarking, the potential performance impact of replacing these processors with a fully generalpurpose parallel processor, without the fixed-function graphics hardware legacy of current graphics processing units (GPUs). The representative general-purpose processor we test against is XMT (for eXplicit Multi-Threading1 ), a PRAM-like single-chip parallel architecture. Performance is compared for two characteristic shaders running in a fragment-limited GPU benchmark harness and on a cycle-accurate XMT simulator. The general-purpose processor is found to be significantly faster at a compute-only shader, but slower on a memory bound texture shader. Finally we analyze the design tradeoffs that would allow combining the best of both worlds: (i) a competitive XMT texture shader, with (ii) a general-purpose easy-to-program XMT many-core approach that scales up or down to t...
Thomas M. DuBois, Bryant Lee, Yi Wang, Marc Olano,