Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting
by
Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song
2021
Abstract
Self-supervised learning has gained prominence due to its efficacy at
learning powerful representations from unlabelled data that achieve excellent
performance on many challenging downstream tasks. However, supervision-free
pre-text tasks are challenging to design and usually modality specific.
Although there is a rich literature of self-supervised methods for either
spatial (such as images) or temporal data (sound or text) modalities, a common
pre-text task that benefits both modalities is largely missing. In this paper,
we are interested in defining a self-supervised pre-text task for sketches and
handwriting data. This data is uniquely characterised by its existence in dual
modalities of rasterized images and vector coordinate sequences. We address and
exploit this dual representation by proposing two novel cross-modal translation
pre-text tasks for self-supervised feature learning: Vectorization and
Rasterization. Vectorization learns to map image space to vector coordinates
and rasterization maps vector coordinates to image space. We show that our
learned encoder modules benefit both raster-based and vector-based downstream
approaches to analysing hand-drawn data. Empirical evidence shows that our
novel pre-text tasks surpass existing single and multi-modal self-supervision
methods.
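To make the two modalities concrete, the following minimal sketch renders a vector coordinate sequence into a raster image. It is an illustration only, not the learned rasterization model described in the abstract: the function name `rasterize`, the convention that coordinates lie in [0, 1], and the binary 8x8 output grid are all assumptions made for this example.

```python
import numpy as np


def rasterize(strokes, size=32):
    """Render strokes into a binary image (hypothetical helper).

    Each stroke is an (N, 2) array of (x, y) points, assumed to lie in [0, 1].
    """
    image = np.zeros((size, size), dtype=np.uint8)
    for stroke in strokes:
        # Scale normalised coordinates to pixel coordinates.
        pts = np.asarray(stroke, dtype=float) * (size - 1)
        if len(pts) == 1:
            x, y = np.rint(pts[0]).astype(int)
            image[y, x] = 1
            continue
        for (x0, y0), (x1, y1) in zip(pts[:-1], pts[1:]):
            # Sample enough points along the segment to leave no gaps.
            n = int(max(abs(x1 - x0), abs(y1 - y0))) + 2
            xs = np.rint(np.linspace(x0, x1, n)).astype(int)
            ys = np.rint(np.linspace(y0, y1, n)).astype(int)
            image[ys, xs] = 1
    return image


# A single closed stroke tracing the unit square.
square = [np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 0.0]])]
img = rasterize(square, size=8)
print(img)
```

A fixed renderer like this covers only the vector-to-raster direction. The reverse direction, recovering an ordered coordinate sequence from pixels, has no such closed-form procedure, which is one reason it is natural to pose as a learning task.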
arXiv: 2103.13716v1