Efficient VLSI implementation of multiple-input multiple-output (MIMO) detectors plays an important role in the practical realization of MIMO communication systems. However, most high-performance MIMO detection algorithms developed so far lack the operational parallelism and regularity that are essential for high-throughput, low-power VLSI implementations. In this paper, following the theme of parallelism- and regularity-driven algorithm design, we propose hard-output and soft-output MIMO detection algorithms that offer high operational parallelism and a regular, static data flow structure with fixed detection delay. In addition to these properties, which are desirable for VLSI implementation, the proposed algorithms achieve superior detection performance, as demonstrated by simulations.