Learning to Predict More Accurate Text Instances for Scene Text
Detection
by
XiaoQian Li, Jie Liu, ShuWu Zhang, GuiXuan Zhang
2019
Abstract
At present, multi-oriented text detection methods based on deep neural
networks have achieved promising performance on various benchmarks.
Nevertheless, there are still some difficulties for arbitrary shape text
detection, especially for a simple and proper representation of arbitrary shape
text instances. In this paper, a pixel-based text detector is proposed to
facilitate the representation and prediction of text instances with arbitrary
shapes in a simple manner. Firstly, to alleviate the effect of the target
vertex sorting and achieve the direct regression of arbitrary shape text
instances, the starting-point-independent coordinates regression loss is
proposed. Furthermore, to predict more accurate text instances, the text
instance accuracy loss is proposed as an auxiliary task to refine the predicted
coordinates under the guidance of IoU. To evaluate the effectiveness of our
detector, extensive experiments have been carried out on public benchmarks. On
the ICDAR 2015 Incidental Scene Text benchmark, our method achieves an
F-measure of 86.5%, and on the Total-Text benchmark it achieves an F-measure of
84.8%. The results show that our method reaches state-of-the-art performance.
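
The abstract names a starting-point-independent coordinates regression loss but does not give its formula. The following is only a minimal sketch of one way such a loss could work, assuming it compares the predicted vertices against every cyclic reordering of the target vertices and keeps the smallest distance, so that the choice of the first vertex does not change the loss. The function names (`mean_l1`, `start_point_independent_l1`) and the use of an L1 distance are illustrative assumptions, not the authors' definitions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mean_l1(pred: List[Point], target: List[Point]) -> float:
    """Mean per-vertex L1 distance between two equally long vertex lists."""
    total = sum(abs(px - tx) + abs(py - ty)
                for (px, py), (tx, ty) in zip(pred, target))
    return total / len(target)


def start_point_independent_l1(pred: List[Point], target: List[Point]) -> float:
    """Hypothetical loss: minimum mean L1 distance over all cyclic shifts
    of the target vertices (assumed form, not taken from the paper)."""
    if not target or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equally long")
    n = len(target)
    # Try every possible starting vertex of the target polygon.
    return min(mean_l1(pred, target[k:] + target[:k]) for k in range(n))


# A unit square, and the same square listed from a different starting vertex.
target = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
pred = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]

plain = mean_l1(pred, target)                        # penalizes the ordering
shifted = start_point_independent_l1(pred, target)   # ignores the ordering
```

In this example the two vertex lists describe the same square, so the order-sensitive distance is nonzero while the shift-invariant version is zero, which illustrates why a fixed vertex-sorting rule can distort a direct coordinate regression.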
arXiv: 1911.07423v1