Machine Translation Weekly 46: The News GPT-3 has for Machine Translation

Back in 2013, a friend of mine enthusiastically told me how excited he was about deep learning democratizing AI (while, by the way, saying it was not relevant for NLP at all): there was no need for large CPU clusters; all you needed was to buy a gaming PC, start training models, and publish ground-breaking papers. Now it is 2020 and there is GPT-3… A few weeks ago, OpenAI published a pre-print about their giant language model that they call GPT-3. It was […]

Read more

Machine Translation Weekly 47: Notes from the ACL

In this extremely long post, I will not focus on one paper as I usually do, but will instead share my brief but still infinitely long notes from this year’s ACL. Many people have already commented on the virtual format of the conference. I will spare you that and rather talk about the content of the conference, including a list of short summaries of papers. Focus on Evaluation Many papers commented on how we evaluate our models and many of […]

Read more

Machine Translation Weekly 48: MARGE

This week, I will comment on a recent pre-print by Facebook AI titled Pre-training via Paraphrasing. The paper introduces a model called MARGE (indeed, they want to say it belongs to the same family as BART by Facebook) that uses a clever form of de-noising as a training objective for learning the representation. Most of the currently used pre-trained models are based on some form of de-noising: we add some noise to the input and want the model to get rid of it […]
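Just to make the de-noising idea concrete, here is a generic masking sketch (made-up names and rates, not MARGE's actual paraphrasing objective): corrupt a sentence, then train the model to reconstruct the original.

```python
import random

MASK = "<mask>"

def add_noise(tokens, mask_prob=0.15, seed=None):
    """Corrupt the input by replacing a random subset of tokens with <mask>.

    A de-noising model is trained on pairs (noisy input, original sentence):
    its job is to undo the corruption.
    """
    rng = random.Random(seed)
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]

sentence = "the cat sat on the mat".split()
noisy = add_noise(sentence, mask_prob=0.3, seed=0)
print(noisy)     # the corrupted encoder input
print(sentence)  # the reconstruction target the decoder should produce
```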

Read more

Machine Translation Weekly 49: Paraphrasing using multilingual MT

It is a well-known fact that when you have a hammer, everything looks like a nail. It is a less-known fact that when you have a sequence-to-sequence model, everything looks like machine translation. One example of this thinking is the paper Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity recently uploaded to arXiv by researchers from Johns Hopkins University. The paper approaches the task of paraphrase generation, i.e., for a source sentence, they want […]

Read more

Machine Translation Weekly 50: Language-Agnostic Multilingual Representations

Pre-trained multilingual representations promise to make the current best NLP models available even for low-resource languages. With a truly language-neutral pre-trained multilingual representation, we could train a task-specific model for English (or another language with available training data), and such a model would work for all languages the representation model covers. (Except that by doing so, the models might transfer Western values into low-resource language applications.) There are several multilingual contextual embedding models (such as multilingual BERT or […]

Read more

Machine Translation Weekly 51: Machine Translation without Embeddings

Over the few years during which neural models have been the state of the art in machine translation, the architectures have become quite standardized. There is a vocabulary of several thousand discrete input/output units. As the first step, the inputs are represented by static embeddings, which get encoded into a contextualized vector representation. This representation is used as a sort of working memory by the decoder, which typically has a similar architecture to the encoder and generates the output left-to-right. In most cases, the […]
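For readers less familiar with this pipeline, the "static embeddings" step is nothing more than a table lookup; a minimal sketch (the vocabulary size and dimension below are just typical illustrative values):

```python
import numpy as np

VOCAB_SIZE, EMB_DIM = 32000, 512  # illustrative subword vocabulary size and dimension

# One learned vector per discrete input/output unit.
embedding_table = np.random.normal(size=(VOCAB_SIZE, EMB_DIM)).astype(np.float32)

def embed(token_ids):
    """Static embedding lookup: each token id selects one row of the table.

    The encoder then turns this sequence of context-independent vectors into
    contextualized representations that the decoder attends to while it
    generates the output left-to-right.
    """
    return embedding_table[np.asarray(token_ids)]

print(embed([17, 4096, 250]).shape)  # (3, 512)
```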

Read more

Machine Translation Weekly 52: Human Parity in Machine Translation

This week, I am going to have a look at a paper by my former colleagues from Prague, “Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals”, which was published in Nature Communications. The paper systematically studies machine translation quality compared to human translation quality, with human judgment of the translations as the main criterion. Already in 2016, Google announced almost reaching human parity on their internal test sets. However, these results were […]

Read more

Machine Translation Weekly 53: Code-Switching Pre-training for NMT

After a short break, MT Weekly is here again, and today I will talk about the paper “CSP: Code-Switching Pre-training for Neural Machine Translation” that will appear at this year’s virtual EMNLP. The paper proposes a new and surprisingly elegant way of monolingual pre-training for both supervised and unsupervised neural machine translation. The idea is quite simple. The model they use is the standard Transformer; all the magic is in how the model is trained. First, it is pre-trained on synthetic […]
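The code-switching trick can be sketched roughly like this (the toy lexicon and replacement rate are made up for illustration, and this is not the paper's exact procedure): some source words are swapped for their translations, and the model pre-trains on the resulting mixed-language sentences.

```python
import random

# Toy English-German lexicon, invented for illustration only.
LEXICON = {"house": "Haus", "green": "grün", "is": "ist"}

def code_switch(tokens, lexicon, replace_prob=0.25, seed=None):
    """Replace a random subset of source words with their translations.

    The code-switched sentence serves as the monolingual pre-training input,
    so the model learns cross-lingual correspondences without parallel data.
    """
    rng = random.Random(seed)
    return [lexicon[tok] if tok in lexicon and rng.random() < replace_prob else tok
            for tok in tokens]

print(code_switch("the house is green".split(), LEXICON, replace_prob=0.5, seed=1))
```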

Read more

Machine Translation Weekly 54: Nearest Neighbor MT

This week, I will discuss Nearest Neighbor Machine Translation, a paper from this year’s ICML that takes advantage of the overlooked representation-learning capabilities of machine translation models. The paper’s idea is pretty simple and is basically the same as in the previous work on nearest neighbor language models. The paper implicitly argues (or at least I think it does) that the final softmax layer of MT models is too simplistic and thus poses a sort of information bottleneck, even […]
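As a back-of-the-envelope sketch of the nearest-neighbour mechanism (function names, the distance metric, and the interpolation weight below are illustrative rather than the paper's exact choices): decoder states are matched against a datastore of (hidden state, next target token) pairs, and the retrieved neighbours are mixed with the ordinary softmax distribution.

```python
import numpy as np

def knn_distribution(query, keys, values, vocab_size, k=4, temperature=10.0):
    """Turn the k nearest datastore entries into a distribution over the vocabulary.

    keys   : (N, d) decoder hidden states recorded while force-decoding training data
    values : (N,)   the target token that followed each recorded state
    query  : (d,)   decoder hidden state at the current decoding step
    """
    dists = np.linalg.norm(keys - query, axis=1)     # L2 distance to every stored state
    nearest = np.argsort(dists)[:k]
    weights = np.exp(-dists[nearest] / temperature)  # closer neighbours vote more
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        p_knn[values[idx]] += w                      # votes land on the retrieved tokens
    return p_knn

def interpolate(p_model, p_knn, lam=0.5):
    """Final next-token distribution: a mixture of the model softmax and the kNN votes."""
    return lam * p_knn + (1.0 - lam) * p_model

# Tiny smoke test with random data.
rng = np.random.default_rng(0)
keys, values = rng.normal(size=(100, 8)), rng.integers(0, 20, size=100)
p = interpolate(np.full(20, 0.05), knn_distribution(rng.normal(size=8), keys, values, 20))
print(p.sum())  # ≈ 1.0
```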

Read more

Neural Networks and Machine Translation

The article originally appeared in last year’s December issue of the journal Rozhledy matematicko-fyzikální. What is machine translation When people hear “machine translation”, most of them probably picture Google Translate, and most of them have probably also tried out how it works. Those who use the translator more often may have noticed that roughly three years ago the quality of the translations the service provides improved dramatically. The reason was a change in the underlying technology: translation based on statistical methods was replaced by neural networks. Many people will probably also be surprised that the translator […]

Read more