Issue #13 – Evaluation of Neural MT Architectures
11 Oct 2018
Author: Raj Nath Patel, Machine Translation Scientist @ Iconic
What are the different approaches to Neural MT? Since its relatively recent advent, the underlying technology has been based on one of three main architectures:
- Recurrent Neural Networks (RNN)
- Convolutional Neural Networks (CNN)
- Self-Attention Networks (Transformer)
For various language pairs, non-recurrent architectures (CNN and Transformer) have outperformed RNNs, but there has been no solid explanation as to why. In this post, we'll evaluate these architectures on their ability to model more complex linguistic phenomena, such as long-range dependencies, and try to understand if and how this contributes to better performance.
Contrastive Evaluation of Machine Translation
BLEU is the standard metric used to evaluate translation quality (we won't open that Pandora's box today!), but it can't explicitly evaluate a translation with respect to a specific linguistic phenomenon, e.g. subject-verb agreement or word sense disambiguation (both of which are handled implicitly during machine translation). In the literature, contrastive translations are the most common approach used to measure a model's accuracy with respect to such phenomena.
Contrastive translations are created by introducing a specific type of error or noise into a human (reference) translation. The model is then scored on whether it assigns a higher probability to the correct translation than to the corrupted one, as in the sketch below.
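As an illustration only, here is a minimal sketch of contrastive scoring using a pretrained English-German model from Hugging Face Transformers. The checkpoint name, sentence pair, and the specific agreement error are assumptions for the example, not taken from the evaluations discussed in this post.

```python
# Minimal sketch of contrastive evaluation: compare the model's score for a
# correct reference translation against a minimally corrupted (contrastive) one.
import torch
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # assumed checkpoint for illustration
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name).eval()

def score(source: str, target: str) -> float:
    """Return the model's total log-probability of `target` given `source`."""
    inputs = tokenizer(source, return_tensors="pt")
    labels = tokenizer(text_target=target, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(**inputs, labels=labels).loss  # mean per-token NLL
    return -loss.item() * labels.size(1)            # convert to total log-prob

source = "The idea that he proposed yesterday is interesting."
reference = "Die Idee, die er gestern vorgeschlagen hat, ist interessant."
# Contrastive variant: the singular verb "ist" is replaced by plural "sind",
# breaking long-range subject-verb agreement.
contrastive = "Die Idee, die er gestern vorgeschlagen hat, sind interessant."

print("Model prefers the correct translation:",
      score(source, reference) > score(source, contrastive))
```

Accuracy on a test suite is then simply the fraction of sentence pairs for which the model ranks the correct translation above its contrastive counterpart.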