Neural Machine Translation

Machine Translation Weekly 51: Machine Translation without Embeddings

Over the few years in which neural models have been the state of the art in machine translation, the architectures have become quite standardized. There is a vocabulary of several thousand discrete input/output units. As the first step, the inputs are represented by static embeddings, which are encoded into a contextualized vector representation. That representation is used as a sort of working memory by the decoder, which typically has a similar architecture to the encoder and generates the output left-to-right. In most cases, the […]
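
As a concrete illustration of this standard setup (not of the embedding-free model the post goes on to discuss), here is a minimal sketch using PyTorch's built-in nn.Transformer; the vocabulary size, model dimensions, and sentence lengths are illustrative assumptions, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

VOCAB = 8000     # a vocabulary of several thousand discrete units (illustrative size)
D_MODEL = 256

class TinyNMT(nn.Module):
    """Standard encoder-decoder Transformer: static embeddings -> contextualized
    encoder states -> autoregressive left-to-right decoder."""
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(VOCAB, D_MODEL)   # static input embeddings
        self.tgt_emb = nn.Embedding(VOCAB, D_MODEL)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=4,
            num_encoder_layers=3, num_decoder_layers=3,
            batch_first=True)
        self.out_proj = nn.Linear(D_MODEL, VOCAB)     # logits over output units

    def forward(self, src_ids, tgt_ids):
        # Causal mask so the decoder only attends to already generated words.
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        hidden = self.transformer(
            self.src_emb(src_ids), self.tgt_emb(tgt_ids), tgt_mask=tgt_mask)
        return self.out_proj(hidden)

model = TinyNMT()
src = torch.randint(0, VOCAB, (2, 7))    # batch of 2 source sentences, 7 tokens each
tgt = torch.randint(0, VOCAB, (2, 5))    # decoder input (shifted target)
print(model(src, tgt).shape)             # torch.Size([2, 5, 8000])
```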

Read more

Machine Translation Weekly 52: Human Parity in Machine Translation

This week I am going to have a look at a paper by my former colleagues from Prague, “Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals,” which was published in Nature Communications. The paper systematically compares machine translation quality with human translation quality, the main criterion being human judgment of the translations. Already in 2016, Google announced that it had almost reached human parity on its internal test sets. However, these results were […]

Read more

Machine Translation Weekly 53: Code-Switching Pre-training for NMT

After a short break, MT Weekly is here again, and today I will talk about the paper “CSP: Code-Switching Pre-training for Neural Machine Translation” that will appear at this year’s virtual EMNLP. The paper proposes a new and surprisingly elegant way of monolingual pre-training for both supervised and unsupervised neural machine translation. The idea is quite simple. The model they use is the standard Transformer; all the magic is in how the model is trained. First, it is pre-trained on synthetic […]
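
Since the excerpt is cut off at the synthetic data, here is only a toy sketch of what building code-switched pre-training examples could look like: words of a monolingual sentence are replaced with translations from a bilingual lexicon. The dictionary, replacement rate, and sentence below are made-up placeholders, not the paper's actual lexicon or training objective.

```python
import random

# Toy bilingual lexicon; in the paper the lexicon is induced automatically,
# the entries here are made up purely for illustration.
LEXICON = {"cat": "Katze", "sat": "sass", "mat": "Matte"}

def code_switch(tokens, lexicon, rate=0.5, seed=0):
    """Replace a fraction of the words with their lexicon translations,
    producing synthetic code-switched text for pre-training."""
    rng = random.Random(seed)
    return [lexicon[t] if t in lexicon and rng.random() < rate else t
            for t in tokens]

sentence = "the cat sat on the mat".split()
print(" ".join(code_switch(sentence, LEXICON)))
# e.g. "the Katze sat on the Matte"
```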

Read more

Machine Translation Weekly 54: Nearest Neighbor MT

This week, I will discuss Nearest Neighbor Machine Translation, a paper from this year’s ICML that takes advantage of the overlooked representation-learning capabilities of machine translation models. The paper’s idea is pretty simple and is basically the same as in the previous work on nearest neighbor language models. The paper implicitly argues (or at least I think it does) that the final softmax layer of MT models is too simplifying and thus poses a sort of information bottleneck, even […]
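
As a rough illustration of the retrieval idea inherited from nearest neighbor language models, here is a numpy sketch: decoder states seen on the training data are stored together with the word that followed them, and at inference time a distribution built from the retrieved neighbors is interpolated with the model's softmax. The dimensions, temperature, and interpolation weight are placeholder values, not the paper's settings.

```python
import numpy as np

VOCAB, DIM, K = 100, 16, 4
rng = np.random.default_rng(0)

# Datastore built offline: one (decoder state, next word) pair per target token.
keys = rng.normal(size=(1000, DIM))          # decoder hidden states
values = rng.integers(0, VOCAB, size=1000)   # the words that followed those states

def knn_distribution(query, temperature=10.0):
    """Turn the k nearest stored states into a distribution over next words."""
    dists = np.sum((keys - query) ** 2, axis=1)
    nearest = np.argsort(dists)[:K]
    weights = np.exp(-dists[nearest] / temperature)
    weights /= weights.sum()
    p = np.zeros(VOCAB)
    for idx, w in zip(nearest, weights):
        p[values[idx]] += w
    return p

query = rng.normal(size=DIM)                 # current decoder hidden state
p_model = np.full(VOCAB, 1.0 / VOCAB)        # stand-in for the model's softmax output
lam = 0.5                                    # interpolation weight
p_final = lam * knn_distribution(query) + (1 - lam) * p_model
print(p_final.argmax(), round(p_final.sum(), 3))   # predicted word id, sums to 1.0
```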

Read more

Neural Networks and Machine Translation

This article originally appeared in last year’s December issue of the magazine Rozhledy matematicko-fyzikální. What is machine translation? When people hear machine translation, most of them probably think of Google Translate, and most of them have probably also tried out how it works. Those who use the translator more often may have noticed that roughly three years ago the quality of the translations the service provides improved dramatically. The reason was a change in the technology the translation is built on: translation based on statistical methods was replaced by neural networks. Many people will probably also be surprised that the translator […]

Read more

Machine Translation Weekly 44: Tangled up in BLEU (and not blue)

For quite a while, machine translation has been approached as a behaviorist simulation. You don’t know what a good translation is? It does not matter; you can just simulate what humans do. You don’t know how to measure whether something is a good translation? It does not matter; again, you can simulate what humans do. Things seem easy. We learn how to translate from tons of training data that were translated by humans. When we want to measure how well the […]
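
In practice, the “simulate what humans do again” part of evaluation means reference-based metrics such as BLEU, the metric the post goes on to discuss. A minimal example of computing it with the sacrebleu library, on made-up sentences, looks like this:

```python
import sacrebleu

# System outputs and one set of human reference translations (made-up examples).
hypotheses = ["the cat sat on the mat", "there is a dog in the garden"]
references = [["the cat is sitting on the mat", "a dog is in the garden"]]

# Corpus-level BLEU: n-gram overlap with the human references,
# used as a proxy for human judgment of translation quality.
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")
```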

Read more

Machine Translation Weekly 45: Deep Encoder, Shallow Decoder, and the Fall of Non-autoregressive models

Researchers concerned with machine translation speed have invented several methods that are supposed to significantly speed up translation while keeping as much as possible of the translation quality of state-of-the-art models. The methods are usually based on generating as many words as possible in parallel. State-of-the-art models do not generate in parallel; they are autoregressive, which means that they generate words one by one and condition the decision about each next word on the previously generated words. On the […]
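
To make the contrast concrete, here is a minimal sketch of the greedy autoregressive loop described above; non-autoregressive models instead emit all output positions in a single forward pass. The dummy model and vocabulary size are placeholders, just to make the loop runnable.

```python
import torch

def greedy_autoregressive_decode(model, src_ids, bos_id=1, eos_id=2, max_len=20):
    """Left-to-right decoding: generate one word per step, each step conditioned
    on all previously generated words, so the steps cannot run in parallel."""
    tgt_ids = torch.tensor([[bos_id]])
    for _ in range(max_len):
        logits = model(src_ids, tgt_ids)           # shape (1, tgt_len, vocab)
        next_id = int(logits[0, -1].argmax())      # greedy pick of the next word
        tgt_ids = torch.cat([tgt_ids, torch.tensor([[next_id]])], dim=1)
        if next_id == eos_id:
            break
    return tgt_ids[0, 1:].tolist()

# Stand-in "model": random logits over a vocabulary of 10 words, just to make the
# loop runnable; a real system would plug in a trained encoder-decoder here.
dummy_model = lambda src, tgt: torch.randn(1, tgt.size(1), 10)
print(greedy_autoregressive_decode(dummy_model, torch.tensor([[3, 4, 5]])))
```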

Read more

Machine Translation Weekly 43: Dynamic Programming Encoding

One of the narratives people (including me) love to associate with neural machine translation is that we got rid of all linguistic assumptions about the text and let the neural networks learn in their own way, independent of what people think about language. It sounds cool; it almost has a science-fiction feeling. What I think we really do is move our assumptions about language from the hard constraints of discrete representations into the soft constraints of inductive biases that we […]

Read more