Issue #126 – Learning Feature Weights for Denoising Parallel Corpora

15 Apr 21
Author: Dr. Patrik Lambert, Senior Machine Translation Scientist @ Iconic

Introduction

Large web-crawled parallel corpora are a very useful source of data for improving neural machine translation (NMT) engines. However, their effectiveness is reduced by the large amount of noise they usually contain. As early as issue #2 of this series, we pointed out that NMT is particularly sensitive to noise in the training data. In issue […]
