Chargrid-OCR: End-to-end Trainable Optical Character Recognition for
Printed Documents using Instance Segmentation
release_ehc3wyggt5cojib4mdgan2w7x4
by
Christian Reisswig, Anoop R Katti, Marco Spinaci, Johannes Höhne
2019
Abstract
We present an end-to-end trainable approach for Optical Character Recognition
(OCR) on printed documents. Specifically, we propose a model that predicts a) a
two-dimensional character grid (chargrid) representation of a document
image as a semantic segmentation task and b) character boxes for delineating
character instances as an object detection task. For training the model, we
build two large-scale datasets without resorting to any manual annotation:
synthetic documents with clean labels and real documents with noisy labels. We
demonstrate experimentally that our method, trained on the combination of these
datasets, (i) outperforms previous state-of-the-art approaches in accuracy, (ii)
is easily parallelizable on GPU and is therefore significantly faster, and
(iii) is easy to train and adapt to a new domain.
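
To make the chargrid idea concrete, here is a minimal sketch of the target representation: a two-dimensional grid in which every pixel covered by a character carries that character's class index and all other pixels are background. This is not the authors' implementation. In the paper the model predicts this grid from a document image, whereas the sketch simply rasterizes already-known character boxes. The function name `build_chargrid`, the box tuple layout, the lowercase vocabulary, and the use of index 0 for background are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout: (character, x0, y0, x1, y1), with x1 and y1 exclusive.
CharBox = Tuple[str, int, int, int, int]


def build_chargrid(height: int, width: int, boxes: List[CharBox],
                   vocab: str = "abcdefghijklmnopqrstuvwxyz") -> List[List[int]]:
    """Rasterize character boxes into a 2D grid of class indices (0 = background)."""
    # Assumed encoding: class index is the position in `vocab` plus one.
    char_to_index = {ch: i + 1 for i, ch in enumerate(vocab)}
    grid = [[0] * width for _ in range(height)]
    for ch, x0, y0, x1, y1 in boxes:
        # Characters outside the vocabulary are treated as background.
        index = char_to_index.get(ch, 0)
        # Clip each box to the grid so out-of-range coordinates are ignored.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                grid[y][x] = index
    return grid


# Two characters side by side on a 3x6 pixel "page".
boxes = [("a", 0, 0, 2, 2), ("b", 3, 0, 5, 2)]
grid = build_chargrid(3, 6, boxes)
for row in grid:
    print(row)
```

The per-pixel class indices correspond to the semantic segmentation output described in the abstract, while the box coordinates correspond to the object detection output that separates individual character instances.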
Archived Files and Locations
application/pdf 695.2 kB
file_htqni3wks5bohlr4xnku5x37wa
arxiv.org (repository), web.archive.org (webarchive)
1909.04469v2