Measuring the Biases and Effectiveness of Content-Style Disentanglement
by
Xiao Liu, Spyridon Thermos, Gabriele Valvano, Agisilaos Chartsias, Alison O'Neil, Sotirios A. Tsaftaris
2021
Abstract
A recent spate of state-of-the-art semi-supervised and unsupervised solutions
disentangle and encode image "content" into a spatial tensor and image
appearance or "style" into a vector, to achieve good performance in spatially
equivariant tasks (e.g. image-to-image translation). To achieve this, they
employ different model design, learning objective, and data biases. While
considerable effort has been made to measure disentanglement in vector
representations, and assess its impact on task performance, such analysis for
(spatial) content-style disentanglement is lacking. In this paper, we conduct
an empirical study to investigate the role of different biases in content-style
disentanglement settings and unveil the relationship between the degree of
disentanglement and task performance. In particular, we: (i) identify key
design choices and learning constraints for three popular content-style
disentanglement models; (ii) relax or remove such constraints in an ablation
fashion; and (iii) use two metrics to measure the degree of disentanglement
and assess its effect on the performance of each task. Our experiments reveal
that there is a "sweet spot" between disentanglement, task performance and,
surprisingly, content interpretability, suggesting that blindly forcing higher
disentanglement can hurt model performance and the semanticness of the content
factors. Our findings, as well as the task-independent metrics used, can guide
the design and selection of new models for tasks where content-style
representations are useful.
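The abstract does not name its two metrics, so the following is only an illustrative sketch, not the authors' method. It assumes a batch of content tensors of shape (N, C, H, W) and style vectors of shape (N, D), and uses empirical distance correlation (a standard dependence statistic, chosen here as an assumption) to score how strongly the two representations depend on each other. Lower values suggest more disentanglement. All function and variable names are hypothetical.

```python
import numpy as np


def _centered_distances(x):
    """Double-centered pairwise Euclidean distance matrix for rows of x (n, d)."""
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    row_mean = d.mean(axis=1, keepdims=True)
    col_mean = d.mean(axis=0, keepdims=True)
    return d - row_mean - col_mean + d.mean()


def distance_correlation(a, b):
    """Biased (V-statistic) sample distance correlation between two batches.

    a, b: arrays whose first axis indexes the same N samples; all remaining
    axes are flattened, so a spatial content tensor and a style vector can be
    compared directly. Returns a value in [0, 1].
    """
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    if a.shape[0] != b.shape[0]:
        raise ValueError("a and b must contain the same number of samples")
    A = _centered_distances(a)
    B = _centered_distances(b)
    dcov2 = (A * B).mean()          # squared distance covariance
    dvar_a = (A * A).mean()         # squared distance variance of a
    dvar_b = (B * B).mean()         # squared distance variance of b
    denom = np.sqrt(dvar_a * dvar_b)
    if denom == 0.0:                # a constant representation carries no signal
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    style = rng.normal(size=(64, 8))            # hypothetical style vectors
    content = rng.normal(size=(64, 4, 8, 8))    # hypothetical content tensors
    print("content vs. style:", distance_correlation(content, style))
```

This biased estimator is inflated when the feature dimension is large relative to the batch size, so scores are best compared across models evaluated with the same batch size and representation shapes rather than read as absolute values.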
arXiv: 2008.12378v4