Eloff, et al. Multimodal One-shot Learning of Speech and Images. IEEE, 2019, doi:10.1109/icassp.2019.8683587.