Issue #2 – Data Cleaning for Neural MT

25 Jul 18 | Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic

“Garbage in, garbage out” – noisy data is a big problem for all machine learning tasks, and MT is no different. By noisy data, we mean bad alignments, poor translations, misspellings, and other inconsistencies in the data used to train the systems. Statistical MT systems are more robust, and can cope with up to 10% noise in the training data without […]


Issue #1 – Scaling Neural MT

18 Jul 18 | Author: Dr. Rohit Gupta, Sr. Machine Translation Scientist @ Iconic

Training a neural machine translation engine is a time-consuming task. It typically takes days or even weeks, even when running on powerful GPUs. Reducing this time is a priority for any neural MT developer. In this post we explore recent work (Ott et al., 2018) in which the authors, without compromising translation quality, speed up training 4.9 times on […]
