COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images

Download vision.cornell.edu

This paper describes the COCO-Text dataset. In recent years, large-scale datasets like SUN and ImageNet have driven the advancement of scene understanding and object recognition. The goal of COCO-Text is to advance the state of the art in text detection and recognition in natural images. The dataset is based on the MS COCO dataset, which contains images of complex everyday scenes. The images were not collected with text in mind and thus contain a broad variety of text instances. To reflect the diversity of text in natural scenes, we annotate text with (a) location in terms of a bounding box, (b) fine-grained classification into machine-printed text and handwritten text, (c) classification into legible and illegible text, (d) script of the text and (e) transcriptions of legible text. The dataset contains over 173k text annotations in over 63k images. We provide a statistical analysis of the accuracy of our annotations. In addition, we present an analysis of three leading state-of-the-art phot...
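
The five annotation attributes (a) to (e) listed in the abstract can be pictured as one record per text instance. The following is only a minimal sketch under stated assumptions: the names TextAnnotation, TextClass, image_id and legible_transcriptions, the (x, y, width, height) box layout, and the example script value are all hypothetical and are not taken from the dataset's actual file format or API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class TextClass(Enum):
    """Attribute (b): fine-grained class of the text instance."""
    MACHINE_PRINTED = "machine printed"
    HANDWRITTEN = "handwritten"


@dataclass(frozen=True)
class TextAnnotation:
    """Hypothetical record for one annotated text instance."""
    image_id: int                            # assumed link to the source image
    bbox: Tuple[float, float, float, float]  # (a) assumed (x, y, width, height)
    text_class: TextClass                    # (b) machine printed vs. handwritten
    legible: bool                            # (c) legible vs. illegible
    script: str                              # (d) script of the text
    transcription: Optional[str] = None      # (e) present only for legible text

    def __post_init__(self) -> None:
        # The abstract states that transcriptions are given for legible text,
        # so this sketch rejects a transcription on an illegible instance.
        if not self.legible and self.transcription is not None:
            raise ValueError("only legible text carries a transcription")


def legible_transcriptions(annotations: List[TextAnnotation]) -> List[str]:
    """Collect the transcriptions of all legible, transcribed instances."""
    return [
        a.transcription
        for a in annotations
        if a.legible and a.transcription is not None
    ]


# Made-up sample records, for illustration only.
sample = [
    TextAnnotation(1, (10.0, 20.0, 50.0, 15.0), TextClass.MACHINE_PRINTED,
                   True, "english", "STOP"),
    TextAnnotation(1, (80.0, 40.0, 30.0, 12.0), TextClass.HANDWRITTEN,
                   False, "english"),
]
print(legible_transcriptions(sample))
```

Keeping the transcription optional mirrors the abstract's wording that only legible text is transcribed; a recognition benchmark would then evaluate on the legible subset, while a detection benchmark could use every bounding box.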

Andreas Veit, Tomas Matera, Lukas Neumann, Jiri Matas, Serge J. Belongie

CORR 2016 | Education |

» Detecting texts of arbitrary orientations in natural images

» Text detection and restoration in natural scene images

» Using contours to detect and localize junctions in natural images

» ICDAR 2003 Robust Reading Competitions

» Simplex Distributions for Embedding Data Matrices over Time

Post Info

Added	01 Apr 2016
Updated	01 Apr 2016
Type	Journal
Year	2016
Where	CORR
Authors	Andreas Veit, Tomas Matera, Lukas Neumann, Jiri Matas, Serge J. Belongie
