Zhang, et al.. Visual Commonsense in Pretrained Unimodal and Multimodal Models. 4 May 2022.