We have proposed a complete system for text detection and localization in gray scale scene images. A boosting framework integrating feature and weak classifier selection based on computational complexity is proposed to construct efficient text detectors. The proposed scheme uses a small set of heterogeneous features which are spatially combined to build a large set of features. A neural network based localizer learns necessary rules for localization. The evaluation is done on the challenging ICDAR 2003 robust reading and text locating database. The results are encouraging and our system can localize text of various font sizes and styles in complex background.