Parallelization, performance analysis, and algorithm consideration of Hough transform on chip multiprocessors