Gaussian Multi-head Attention for Simultaneous Machine Translation
Source code for our ACL 2022 paper “Gaussian Multi-head Attention for Simultaneous Machine Translation” (PDF)
Our method is implemented on top of the open-source toolkit Fairseq.
The core code of Gaussian Multi-head Attention is in `fairseq/modules/gaussian_multihead_attention.py`.
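For intuition only, here is a minimal sketch of the general idea of Gaussian-biased attention, under our assumption (not necessarily the paper's exact formulation) that a Gaussian prior centered on a predicted source position is added to the attention logits before the softmax. All names below are illustrative and not taken from the repo:

```python
import torch
import torch.nn.functional as F

def gaussian_biased_attention(q, k, v, center, sigma):
    """Toy single-head attention with a Gaussian positional prior.

    q: (tgt_len, d); k, v: (src_len, d)
    center: (tgt_len,) predicted source positions, one per target step
    sigma: (tgt_len,) standard deviations of the Gaussian priors
    """
    # Standard scaled dot-product scores: (tgt_len, src_len)
    scores = (q @ k.transpose(0, 1)) / q.size(-1) ** 0.5
    # Log-density of a Gaussian centered at `center`, over source positions
    positions = torch.arange(k.size(0), dtype=q.dtype)
    bias = -((positions.unsqueeze(0) - center.unsqueeze(1)) ** 2) / (
        2 * sigma.unsqueeze(1) ** 2
    )
    # Adding the log-prior to the logits is equivalent to multiplying the
    # attention weights by the Gaussian prior before renormalization
    return F.softmax(scores + bias, dim=-1) @ v

# Toy usage: 5 target steps attending over 7 source positions
q, k, v = torch.randn(5, 64), torch.randn(7, 64), torch.randn(7, 64)
out = gaussian_biased_attention(
    q, k, v,
    center=torch.tensor([0.0, 1.0, 2.0, 4.0, 6.0]),
    sigma=torch.full((5,), 1.5),
)
```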
Requirements and Installation
- Python version = 3.6
- PyTorch version = 1.7
- Install fairseq:
```bash
git clone https://github.com/ictnlp/GMA.git
cd GMA
pip install --editable ./
```
Quick Start
Data Pre-processing
We use the IWSLT15 English-Vietnamese (download here) and WMT15 German-English (download here) datasets, and apply BPE with 32K merge operations to WMT15 German-English via `subword_nmt/apply_bpe.py`.
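For reference, a typical subword-nmt workflow for learning and applying the 32K merges might look like the following; the file names are placeholders, and the repo may ship its own scripts for this step:

```bash
# Learn 32K BPE merges on the concatenated training data (placeholder paths)
python subword_nmt/learn_bpe.py -s 32000 < train.de-en > codes.bpe
# Apply the learned merges to each side of the corpus
python subword_nmt/apply_bpe.py -c codes.bpe < train.de > train.bpe.de
python subword_nmt/apply_bpe.py -c codes.bpe < train.en > train.bpe.en
```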
Then, we process the data into the fairseq format:
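A typical binarization command might look like the sketch below; the split prefixes and destination directory are assumptions, so adjust them to your data layout:

```bash
# Binarize the BPE-processed De-En data into fairseq's binary format
fairseq-preprocess --source-lang de --target-lang en \
    --trainpref train.bpe --validpref valid.bpe --testpref test.bpe \
    --destdir data-bin/wmt15_de_en --workers 8
```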