Touvron, Cord, Douze, Massa, Sablayrolles, Jégou, 2021. Training data-efficient image transformers & distillation through attention.