The paper presents a method for efficient text detection in unconstrained environments, based on image features derived from connected components and on a classification architecture implementing a focus-of-attention approach. The main application motivating the work is container code detection, with the final goal of checking the composition of freight trains. Although the method is strongly influenced by this application, experimental evidence speaks in favour of its generality: we present results on container codes, on car plate images, and on the ICDAR benchmark dataset.
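To illustrate the kind of connected-component features the detection stage relies on, the following is a minimal sketch, not the paper's implementation: it assumes OpenCV, a grayscale input image, and simple geometric descriptors (bounding box, aspect ratio, fill ratio) as candidate-region features for a downstream focus-of-attention classifier.

```python
# Minimal illustrative sketch (assumed pipeline, not the authors' code):
# extract connected components from a binarized image and compute simple
# per-component features usable as candidate text regions.
import cv2
import numpy as np

def component_features(gray: np.ndarray):
    """Return one feature tuple per connected component of a binarized image."""
    # Adaptive threshold to separate glyph-like strokes from the background.
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 31, 15)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    features = []
    for i in range(1, n):          # label 0 is the background component
        x, y, w, h, area = stats[i]
        if area < 20:              # discard tiny noise components
            continue
        aspect = w / float(h)      # character-like components have bounded aspect
        fill = area / float(w * h) # fraction of the bounding box covered by ink
        features.append((x, y, w, h, aspect, fill))
    return features
```

Such features would then be scored by a cascade of classifiers so that cheap tests prune most non-text regions before the more expensive ones run, which is the general idea behind a focus-of-attention architecture.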