In this paper we introduce a framework for automated text recognition in images. We first describe a simple but efficient text detection and recognition method, based on the analysis of Maximally Stable Extremal Regions (MSERs) and simple template matching, which provides initial character recognition results. The main emphasis of the paper is on a novel method that exploits contextual information to improve these initial results. We propose to analyze the results of web search engine queries on two levels of detail, both of which significantly improve the overall text recognition performance. Experimental evaluations on reference data sets show that, even when based on a low-quality single-character recognition method, the proposed web search engine extension yields reasonable text recognition results.
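To make the idea of contextual correction concrete, the following is a minimal illustrative sketch and not the method evaluated in this paper. It assumes that the character recognizer returns several scored candidates per character position, and that some function reports how many search results a candidate word receives. The names `candidate_words`, `rerank`, `hit_count`, and `weight`, the log-based scoring rule, and the toy hit counts are all assumptions introduced for illustration; no real search engine API is called.

```python
from itertools import product
from math import log


def candidate_words(char_candidates, top_k=2):
    """Enumerate words built from the top_k candidates at each position.

    char_candidates: one list per character position, each holding
    (character, score) pairs with score in (0, 1]. Hypothetical format.
    """
    pruned = [
        sorted(position, key=lambda cs: cs[1], reverse=True)[:top_k]
        for position in char_candidates
    ]
    for combo in product(*pruned):
        word = "".join(char for char, _ in combo)
        visual = sum(log(score) for _, score in combo)  # log of score product
        yield word, visual


def rerank(char_candidates, hit_count, weight=1.0, top_k=2):
    """Order candidate words by visual score plus a weighted context score.

    hit_count: callable mapping a word to a non-negative result count.
    Here it is a stand-in for a web search query (assumption).
    """
    scored = []
    for word, visual in candidate_words(char_candidates, top_k):
        context = log(1 + hit_count(word))
        scored.append((visual + weight * context, word))
    scored.sort(reverse=True)
    return [word for _, word in scored]


# Toy data: made-up hit counts and made-up per-character scores.
fake_hits = {"STOP": 1_000_000, "ST0P": 120}
chars = [
    [("S", 0.9), ("5", 0.4)],
    [("T", 0.8), ("I", 0.3)],
    [("0", 0.6), ("O", 0.5)],  # the recognizer slightly prefers "0" over "O"
    [("P", 0.9), ("R", 0.2)],
]
best = rerank(chars, lambda w: fake_hits.get(w, 0))[0]
```

In this toy setting the purely visual ranking prefers "ST0P", while adding the context term moves "STOP" to the top, which is the kind of correction contextual information is meant to provide.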