Text-DIAE: Degradation Invariant Autoencoders for Text Recognition and Document Enhancement

by Mohamed Ali Souibgui, Sanket Biswas, Andres Mafla, Ali Furkan Biten, Alicia Fornés, Yousri Kessentini, Josep Lladós, Lluis Gomez, Dimosthenis Karatzas

Released as an article.

2022  

Abstract

In this work, we propose Text-Degradation Invariant Auto Encoder (Text-DIAE), aimed at solving two tasks: text recognition (handwritten or scene text) and document image enhancement. We define three pretext tasks as learning objectives to be optimized during pre-training without the use of labelled data. Each of the pretext objectives is specifically tailored for the final downstream tasks. We conduct several ablation experiments that show the importance of each degradation for a specific domain. Exhaustive experimentation shows that our method does not have the limitations of previous state-of-the-art approaches based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state-of-the-art significantly in existing supervised and self-supervised settings in handwritten and scene text recognition and document image enhancement. Our code and trained models will be made publicly available at <http://Upon_Acceptance>.

Archived Files and Locations

application/pdf  19.7 MB
file_gykqt3ayebbfngttk3ximnhbbe
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2022-03-16
Version   v3
Language   en
arXiv  2203.04814v3
Work Entity
access all versions, variants, and formats of this work (e.g., pre-prints)
Catalog Record
Revision: 3d3a097f-f9e6-424d-843b-57f7c8e8083f
API URL: JSON