Multi-Grained Visual Pivot-Guided Multi-Modal Neural Machine Translation with Text-Aware Cross-Modal Contrastive Disentangling
Multi-Scale Vision-Centric Multi-Modal Neural Machine Translation: Text-Aware Cross-Modality Contrastive Decoupling

Academic Background

Multi-Modal Neural Machine Translation (MNMT) aims to incorporate language-independent visual information into text to enhance machine translation performance. However, due to the significant modal differences betw...