The real time 3D graphics becomes one of the attractive applications for 3G wireless terminals although their battery lifetime and memory bandwidth limit the system resources for graphics processing. Instead of using the dedicated hardware engine with complex functions, we propose an efficient hardware architecture of low power vertex shader with programmability. Our architecture includes the following three features: I) a fixed-point SIMD datapath to exploit parallelism in vertex processing while keeping the power consumption low, II) a multithreaded coprocessor interface to decrease unwanted stalls between the main processor and the vertex shader, reducing power consumption by instruction-level power management, III) a programmable vertex engine to increases the datapath throughput by concurrent operations with main processor. Simulation results show that full 3D geometry pipeline can be performed at 7.2M vertices/sec with 115mW power consumption for polygons using the OpenGL lighti...