As the VLSI technology advances continuously, ASIC can easily achieve the required performance and most of them are actually over-designed. Thus, architecture shrinking is inevitable in optimal designs especially when supply voltages are getting lower. However, conventional designs starting from minimization of algorithmic operations (e.g. multiply) may not always lead to optimal architectures, for the wires and the interconnection complexity significantly grow and have become predominant. This paper explores algorithms and architectures for the 2-D transform in H.264/AVC, of which the operations are very simple (i.e. only shift and add). We have shown that fewer operations do not always result in more compact designs. In our experiments with the UMC 0.18µm CMOS technology, the most straightforward matrix multiplication without separable 2-D operation or any fast algorithm has the best area efficiency for D1-size (720×480) video at 30fps. It saves 48%, 34%, and 16% silicon area of t...