Optical flow estimation infers a dense, pixel-wise correspondence field between two images or video frames. It is widely used in video processing and computer vision applications, including motion-compensated frame processing, temporal feature extraction, stereo disparity computation, and the understanding of scene context, dynamics, and behavior. Dense optical flow estimation is computationally demanding. Fortunately, many optical flow estimation algorithms are embarrassingly parallel and can be accelerated efficiently on GPUs. In this work, we discuss a massively multi-threaded GPU implementation of the anisotropic Huber-L1 optical flow estimation algorithm using the OpenCL framework, which achieves per-frame execution-time speed-up factors of up to almost 300×. We present the overall algorithm flow, GPU-specific implementation details, and performance results.