N, 2021. Using Large Pre-Trained Models with Cross-Modal Attention for Multi-Modal Emotion Recognition.