Issue #2 – Data Cleaning for Neural MT
25 Jul18 Issue #2 – Data Cleaning for Neural MT Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic “Garbage in, Garbage out” – noisy data is a big problem for all machine learning tasks, and MT is no different. By noisy data, we mean bad alignments, poor translations, misspellings, and other inconsistencies in the data used to train the systems. Statistical MT systems are more robust, and can cope with up to 10% noise in the training data without […]
Read more