Much past research on finding text in natural scenes uses bottom-up grouping processes to detect candidate text features as a first processing step. While such grouping procedures are a fast and efficient way of extracting the parts of an image most likely to contain text, they still produce a large number of false positives that must be pruned away before the surviving candidates are passed to OCR. We argue that a natural framework for pruning these false positive text features is figure-ground segmentation, which we implement with a graphical model, specifically a Markov random field (MRF). The graphical model is "data-driven" in that the nodes of the graph correspond to the candidate text features. Since each node has only two possible states (figure and ground), and since the connectivity of the graphical model is sparse, we can perform rapid inference on the graph using belief propagation. We show promising results on a variety of urban and indoor scene images containing signs, demonstrating the feasibility of our approach.
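
The abstract does not spell out the inference routine, but the ingredients it names (binary figure/ground states, sparse connectivity, belief propagation) admit a compact illustration. Below is a minimal sum-product loopy belief propagation sketch on a binary pairwise MRF; the potentials, edge list, and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def loopy_bp(unary, edges, pairwise, n_iters=20):
    """Sum-product loopy BP on a binary pairwise MRF (illustrative sketch).

    unary:    (N, 2) node potentials, states 0=ground, 1=figure
    edges:    list of (i, j) pairs giving the sparse connectivity
    pairwise: (2, 2) compatibility matrix shared by all edges
    Returns an (N, 2) array of approximate marginal beliefs.
    """
    N = unary.shape[0]
    # One message per directed edge, initialized uniform.
    msgs = {(i, j): np.ones(2) for (i, j) in edges}
    msgs.update({(j, i): np.ones(2) for (i, j) in edges})

    # Neighbor lookup for the sparse graph.
    nbrs = {k: [] for k in range(N)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in msgs:
            # Combine node i's unary potential with all incoming
            # messages except the one from the target node j.
            h = unary[i].copy()
            for k in nbrs[i]:
                if k != j:
                    h *= msgs[(k, i)]
            # Sum over node i's states: m(s_j) = sum_{s_i} psi(s_i, s_j) h(s_i)
            m = pairwise.T @ h
            new_msgs[(i, j)] = m / m.sum()  # normalize for numerical stability
        msgs = new_msgs

    # Beliefs: unary potential times all incoming messages, normalized.
    beliefs = unary.copy()
    for i in range(N):
        for k in nbrs[i]:
            beliefs[i] *= msgs[(k, i)]
    return beliefs / beliefs.sum(axis=1, keepdims=True)


# Toy example: three candidate text features in a chain, with a confident
# figure node, an ambiguous node, and a confident ground node.
unary = np.array([[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]])
edges = [(0, 1), (1, 2)]
smooth = np.array([[0.8, 0.2], [0.2, 0.8]])  # neighbors prefer the same label
print(loopy_bp(unary, edges, smooth))
```

In this toy chain the smoothness potential lets confident neighbors pull the ambiguous middle node toward their own labels; on sparse graphs of the kind described here, a modest number of message-passing sweeps is usually enough for the beliefs to settle.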