Issue #62 – Domain Differential Adaptation for Neural MT
28 Nov 2019
Author: Raj Patel, Machine Translation Scientist @ Iconic
Neural MT models are data-hungry and domain-sensitive, and it is nearly impossible to obtain a good amount of training data (>1M segments) for every domain we are interested in. One common strategy is to align the statistics of the source and target domains, but the drawback of this approach is that the statistics of different domains are inherently divergent, and smoothing over them does not always yield optimal performance. In this post we discuss Domain Differential Adaptation (DDA), proposed by Dou et al. (2019), which embraces these differences instead of smoothing over them.
Domain Differential Adaptation
In the DDA method, the domain difference is captured by two language models (LMs), trained respectively on in-domain (LM-in) and out-of-domain (LM-out) monolingual data. We then adapt the NMT model trained on out-of-domain data (NMT-out) so that it approximates the NMT model that would be trained on in-domain parallel data (NMT-in), without using any in-domain parallel data. In the paper, the authors propose two approaches under the overall umbrella of the DDA framework:
- Shallow Adaptation: Given the output distributions of NMT-out, LM-in and LM-out, their scores are combined directly at decoding time, so that words favoured by the in-domain LM over the out-of-domain LM are promoted (a rough sketch of this idea follows below).
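To make the shallow-adaptation intuition concrete, here is a minimal sketch of how one decoding step could re-score the vocabulary: the out-of-domain NMT log-probabilities are shifted by the difference between the in-domain and out-of-domain LM log-probabilities. This is not the authors' implementation; the function name `shallow_adapt_step`, the interpolation weight `beta`, and the toy vocabulary are illustrative assumptions.

```python
import numpy as np

def shallow_adapt_step(nmt_out_logp, lm_in_logp, lm_out_logp, beta=1.0):
    """Illustrative re-scoring of one decoding step (not the paper's exact formula):
    boost tokens that the in-domain LM prefers over the out-of-domain LM,
    then renormalise into a proper log-probability distribution."""
    scores = nmt_out_logp + beta * (lm_in_logp - lm_out_logp)
    return scores - np.logaddexp.reduce(scores)  # log-softmax over the vocabulary

# Toy 5-token vocabulary; all vectors are log-probabilities for the next target token.
vocab = ["the", "patient", "dose", "game", "</s>"]
nmt_out = np.log([0.15, 0.05, 0.05, 0.65, 0.10])  # NMT-out (e.g. trained on news) prefers "game"
lm_in   = np.log([0.30, 0.30, 0.25, 0.05, 0.10])  # LM-in: in-domain (e.g. medical) monolingual data
lm_out  = np.log([0.35, 0.05, 0.05, 0.45, 0.10])  # LM-out: out-of-domain monolingual data

adapted = shallow_adapt_step(nmt_out, lm_in, lm_out)
print(vocab[int(np.argmax(nmt_out))])   # "game"    - the unadapted choice
print(vocab[int(np.argmax(adapted))])   # "patient" - the domain-adapted choice
```

The intuition the sketch captures is that a word scored much higher by LM-in than by LM-out is likely domain-specific vocabulary that NMT-out under-estimates, so its decoding score is raised accordingly.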