Data encryption and decryption are common operations in secure network-based applications. To keep pace with the input data rate in such applications, encryption and decryption must run in real time. For example, high-speed encryption and decryption are crucial when multimedia data is streamed. In this paper, we propose a new approach to parallelizing the AES-CTR algorithm that extends the size of the block encrypted at one time across unit block boundaries. The proposed approach yields significant performance improvements on both a general-purpose multi-core processor and a Graphics Processing Unit (GPU), platforms that have recently become popular. In particular, the performance improvement on the GPU is dramatic: close to 9 times faster than the original coarse-grain parallelization approach, mainly thanks to the “multi-core” nature of the GPU architecture.

Keywords-AES; multi-core; GPU; parallelization