A video from a moving camera observes different scene areas a different number of times. We can construct an attention map of the scene by bringing the frames to a common reference and counting the number of frames that observed each scene point. Several representations can be constructed from this map. Its base gives the scene mosaic. Super-resolved images of parts of the scene can be obtained from a subset of the observations or video frames. Mosaicing can be combined with super-resolution by using all observations, but the magnification factor will then vary across the scene according to the attention each area received: the height of the attention map indicates the amount of super-resolution at that scene point. In this paper, we modify the traditional super-resolution framework to generate a varying-resolution image for panning cameras. The varying-resolution image uses all useful data available in a video. We introduce the concept of attention-based super-resolution and give ...
Dileep Vaka, P. J. Narayanan, C. V. Jawahar
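
The attention-map construction described in the abstract (register the frames to a common reference, then count how many frames observed each scene point) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each frame's registration is already known as a 3x3 homography into the reference canvas, that all frames share one size, and that a point counts as observed when it falls inside the frame bounds. The names `attention_map` and `translation` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def translation(tx, ty):
    """Hypothetical helper: homography for a pure shift (a simple panning model)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])


def attention_map(homographies, frame_shape, canvas_shape):
    """Count, for every canvas pixel, how many frames observed it.

    homographies: 3x3 arrays mapping frame (x, y, 1) to canvas coordinates
                  (assumed to come from a separate registration step).
    frame_shape:  (height, width) shared by all frames.
    canvas_shape: (height, width) of the common reference.
    """
    fh, fw = frame_shape
    ch, cw = canvas_shape
    ys, xs = np.mgrid[0:ch, 0:cw]
    # Homogeneous coordinates of every canvas pixel, shape 3 x N.
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(ch * cw)]).astype(float)
    counts = np.zeros(ch * cw, dtype=np.int64)
    eps = 1e-9  # tolerance for pixels that land exactly on the frame border
    for H in homographies:
        # Map canvas pixels back into this frame's coordinate system.
        src = np.linalg.inv(np.asarray(H, dtype=float)) @ pts
        w = np.where(np.abs(src[2]) > 1e-12, src[2], np.nan)
        x, y = src[0] / w, src[1] / w
        # NaN comparisons are False, so degenerate points are never counted.
        inside = (x >= -eps) & (x <= fw - 1 + eps) & (y >= -eps) & (y <= fh - 1 + eps)
        counts += inside
    return counts.reshape(ch, cw)


# Three 4x4 frames panning one pixel to the right per frame.
pan = [translation(tx, 0) for tx in (0, 1, 2)]
attention = attention_map(pan, frame_shape=(4, 4), canvas_shape=(4, 6))
print(attention[0].tolist())  # [1, 2, 3, 3, 2, 1]
```

In this toy pan the middle columns are seen by all three frames and the edges by only one, so the map is tallest in the centre. The region where the count is nonzero corresponds to the mosaic footprint (the base of the map). How the counts translate into a magnification factor is not specified in the abstract, so the sketch leaves that step out.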