Highlights from Machine Translation and Multilinguality in April 2024
Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation
Folks from the University of the Basque Country prepared an English-Spanish dataset for natural langauge inference (i.e., deciding if sentences follow from each other, are in contradiction, or have nothing to do with each other) with metaphorical expressions. Unlike the standard version of this task (XNLI), which does not use figurative language, there is a large gap between in-language training and language transfer. (Transfer means that we finetune a multilingual model in one language, but then we apply the model in another language). Also, there is a gap between metaphorical and non-metaphorical sentences. I would not necessarily expect this because a pre-trained LM is trained on real-world texts full of metaphors; some are frozen, and some are made