Neural Machine Translation

Machine Translation Weekly 57: Document-level MT with Context Masking

This week, I am going to discuss the paper “Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation” by authors from Alibaba Group. The preprint appeared on arXiv a month ago, and the paper will be presented at this year’s EMNLP. Incorporating document-level context is one of the biggest challenges in current machine translation. There are several reasons for this. One is the lack of document-level training data, which is partially caused by […]

Machine Translation Weekly 56: Beam Search and Models’ Surprisal

Last year, an EMNLP paper, “On NMT Search Errors and Model Errors: Cat Got Your Tongue?” (which I discussed in MT Weekly 20), showed a mind-blowing property of neural machine translation models: the most probable target sentence is not necessarily the best target sentence. In NMT, we model the target sentence probability, which is factorized using the chain rule into conditional token probabilities. We can imagine the target sentence generation like this: The model estimates the probability of the […]

Machine Translation Weekly 55: Social Polarization Seen through Word Embeddings

This week, I am going to have a closer look at a paper that creatively uses methods for bilingual word embeddings for social media analysis. The paper’s preprint was uploaded last week on arXiv. The title is “We Don’t Speak the Same Language: Interpreting Polarization through Machine Translation,” and most of the authors are from CMU in Pittsburgh. The paper’s central assumption is that the polarization of different opinion groups, especially in the USA, has gone so far that some words have totally […]

Machine Translation Weekly 46: The News GPT-3 has for Machine Translation

Back in 2013, a friend of mine enthusiastically told me how excited he was about deep learning democratizing AI (and I was saying it was not relevant for NLP at all): there was no need for large CPU clusters, all you needed was to buy a gaming PC and start training models and publishing ground-breaking papers. Now, it is 2020 and there is GPT-3… Some weeks ago, OpenAI published a pre-print about their giant language model that they call GPT-3. It was […]

Machine Translation Weekly 47: Notes from the ACL

In this extremely long post, I will not focus on one paper as I usually do, but will instead share my brief, but still infinitely long, notes from this year’s ACL. Many people have already commented on the virtual format of the conference. I will spare you that and rather talk about the content of the conference, including a list of short summaries of papers. Focus on Evaluation: Many papers commented on how we evaluate our models and many of […]

Machine Translation Weekly 48: MARGE

This week, I will comment on a recent pre-print by Facebook AI titled “Pre-training via Paraphrasing.” The paper introduces a model called MARGE (indeed, they want to say it belongs to the same family as BART by Facebook) that uses a clever way of denoising as a training objective for the representation. Most of the currently used pre-trained models are based on some form of denoising. We sample some noise in the input and want the model to get rid of it […]

Machine Translation Weekly 49: Paraphrasing using multilingual MT

It is a well-known fact that when you have a hammer, everything looks like a nail. It is a less-known fact that when you have a sequence-to-sequence model, everything looks like machine translation. One example of this thinking is the paper “Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity,” recently uploaded to arXiv by researchers from Johns Hopkins University. The paper approaches the task of paraphrase generation, i.e., for a source sentence, they want […]

Machine Translation Weekly 50: Language-Agnostic Multilingual Representations

Pre-trained multilingual representations promise to make the current best NLP models available even for low-resource languages. With a truly language-neutral pre-trained multilingual representation, we could train a task-specific model for English (or another language with available training data) and such a model would work for all languages the representation model can work with. (Except that by doing so, the models might transfer Western values into low-resource language applications.) There are several multilingual contextual embedding models (such as multilingual BERT or […]

Machine Translation Weekly 51: Machine Translation without Embeddings

Over the few years that neural models have been the state of the art in machine translation, the architectures have become quite standardized. There is a vocabulary of several thousand discrete input/output units. As the first step, the inputs are represented by static embeddings, which get encoded into a contextualized vector representation. It is used as a sort of working memory by the decoder, which typically has a similar architecture to the encoder and generates the output left-to-right. In most cases, the […]

Machine Translation Weekly 52: Human Parity in Machine Translation

This week, I am going to have a look at a paper by my former colleagues from Prague, “Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals,” that was published in Nature Communications. The paper systematically studies machine translation quality compared to human translation quality, with the main criterion being human judgment of the translations. Already in 2016, Google announced almost reaching human parity on their internal test sets. However, these results were […]
