Machine Translation Weekly 70: Loss Masking instead of Data Filtering
This week, I will have a closer look at a recent pre-print that introduces an alternative to parallel data filtering for machine translation training. The pre-print, titled Gradient-guided Loss Masking for Neural Machine Translation, comes from CMU and Google. Training data cleanliness is a surprisingly important factor in machine translation quality. A large part of the data we use for training comes from crawling the Internet, so there is no guarantee of its quality. On the other hand, […]
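The excerpt ends before the method itself is described, but the general idea behind gradient-guided loss masking can be sketched. Instead of filtering noisy sentence pairs out of the corpus before training, the loss of suspect examples is masked on the fly, for instance based on whether their gradient agrees with a gradient computed on a small trusted batch. The following is a minimal sketch under that assumption, not the paper's exact algorithm; `loss_fn` (returning per-example losses) and the batch objects are hypothetical placeholders.

```python
# Hypothetical sketch of loss masking as an alternative to data filtering:
# noisy examples stay in the batch, but their loss is zeroed out when their
# gradient points away from the gradient of a small trusted ("clean") batch.
# This illustrates the general idea only, not the pre-print's exact method.

import torch


def flat_grad(loss, params):
    """Flatten the gradient of `loss` w.r.t. `params` into a single vector."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])


def masked_training_step(model, loss_fn, noisy_batch, clean_batch):
    # `loss_fn(model, batch)` is assumed to return one loss per sentence pair.
    params = [p for p in model.parameters() if p.requires_grad]

    # Reference gradient from the trusted batch.
    clean_loss = loss_fn(model, clean_batch).mean()
    g_clean = flat_grad(clean_loss, params)

    # Per-example losses on the (possibly noisy) training batch.
    per_example_loss = loss_fn(model, noisy_batch)  # shape: (batch_size,)

    # Mask examples whose gradient disagrees (negative dot product)
    # with the clean-batch gradient.
    mask = torch.ones_like(per_example_loss)
    for i, loss_i in enumerate(per_example_loss):
        g_i = flat_grad(loss_i, params)
        if torch.dot(g_i, g_clean) < 0:
            mask[i] = 0.0

    # Train only on unmasked examples; unlike data filtering, nothing is
    # discarded from the corpus, and the decision can change as training
    # progresses.
    final_loss = (mask * per_example_loss).sum() / mask.sum().clamp(min=1)
    final_loss.backward()
    return final_loss
```

Compared to offline filtering, a scheme like this can keep sentence pairs that are only noisy for the current state of the model, at the cost of extra gradient computations per batch.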