Neural Machine Translation

Machine Translation Weekly 59: Notes from EMNLP 2020

Another large NLP conference that had to take place in a virtual environment, EMNLP 2020, is over, and here are my notes from the conference. The ACL in the summer had most Q&A sessions on Zoom, which meant most of the authors waited forever for someone to take the courage to enter the room. EMNLP sort of simulated the standard conference format, which hopefully reduced the communication barrier. There were public Q&A sessions with short presentations and poster sessions in […]

Read more

Machine Translation Weekly 58: Poisoning machine translation

Today, I am going to talk about a topic that is rather unknown to me: the safety and vulnerability of machine translation. I will comment on the paper Targeted Poisoning Attacks on Black-Box Neural Machine Translation by authors from the University of Melbourne and Facebook AI. The main issue that makes machine-translation users vulnerable is that they typically do not understand the target language and have no other choice than to trust the system that the target-language output is adequate. Most […]

Read more

Machine Translation Weekly 57: Document-level MT with Context Masking

This week, I am going to discuss the paper “Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation” by authors from Alibaba Group. The preprint of the paper appeared a month ago on arXiv and will be presented at this year’s EMNLP. Including document-level context in machine translation is one of the biggest challenges of current machine translation. There are several reasons for this. One is the lack of document-level training data, which is partially caused by […]

Read more

Machine Translation Weekly 56: Beam Search and Models’ Surprisal

Last year, an EMNLP paper “On NMT Search Errors and Model Errors: Cat Got Your Tongue?” (which I discussed in MT Weekly 20) showed a mind-blowing property of neural machine translation models: the most probable target sentence is not necessarily the best target sentence. In NMT, we model the target sentence probability, which is factorized using the chain rule into conditional token probabilities. We can imagine the target sentence generation like this: The model estimates the probability of the […]
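As a reminder of the factorization the excerpt refers to (standard NMT notation, not anything specific to the paper): for a source sentence $x$ and a target sentence $y = (y_1, \dots, y_T)$,

$$ P(y \mid x) = \prod_{t=1}^{T} P(y_t \mid y_{<t}, x), $$

and beam search only approximately searches for the $y$ that maximizes this product, which is where the search-error vs. model-error distinction comes from.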

Read more

Machine Translation Weekly 55: Social Polarization Seen through Word Embeddings

This week, I am going to have a closer look at a paper that creatively uses methods for bilingual word embeddings for social media analysis. The paper’s preprint was uploaded last week on arXiv. The title is “We Don’t Speak the Same Language: Interpreting Polarization through Machine Translation,” and most of the authors are from CMU in Pittsburgh. The paper’s central assumption is that the polarization of different opinion groups, especially in the USA, went so far that some words have totally […]

Read more

Machine Translation Weekly 46: The News GPT-3 has for Machine Translation

Back in 2013, a friend of mine enthusiastically told me how excited he was about deep learning democratizing AI (while, by the way, saying it was not relevant for NLP at all): there was no need for large CPU clusters; all you needed was to buy a gaming PC, start training models, and publish ground-breaking papers. Now, it is 2020 and there is GPT-3… Some weeks ago, OpenAI published a pre-print about their giant language model that they call GPT-3. It was […]

Read more

Machine Translation Weekly 47: Notes from the ACL

In this extremely long post, I will not focus on one paper as I usually do, but instead will share my brief, but still infinitely long, notes from this year’s ACL. Many people have already commented on the virtual format of the conference. I will spare you that and rather talk about the content of the conference, including a list of short summaries of papers. Focus on Evaluation: Many papers commented on how we evaluate our models, and many of […]

Read more

Machine Translation Weekly 48: MARGE

This week, I will comment on a recent pre-print by Facebook AI titled Pre-training via Paraphrasing. The paper introduces a model called MARGE (indeed, the name is meant to suggest it belongs to the same family as Facebook’s BART) that uses a clever form of denoising as a training objective for the representation. Most of the currently used pre-trained models are based on some kind of denoising: we add some noise to the input and want the model to get rid of it […]
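For readers less familiar with this recipe, here is a minimal sketch of a generic “corrupt, then reconstruct” objective (my own illustration, assuming BERT-style masking and BART-style deletion; MARGE itself instead reconstructs a document from retrieved related documents, not from a corrupted copy of itself):

```python
import random

def add_noise(tokens, mask_token="<mask>", mask_prob=0.15, drop_prob=0.1, seed=None):
    """Corrupt a token sequence for a generic denoising objective.

    Illustrative only: masking plus deletion. This is not MARGE's actual
    objective, which reconstructs a document from retrieved related documents.
    """
    rng = random.Random(seed)
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < drop_prob:
            continue                  # drop the token entirely
        if r < drop_prob + mask_prob:
            noisy.append(mask_token)  # replace the token with a mask symbol
        else:
            noisy.append(tok)         # keep the token unchanged
    return noisy

# A denoising model is then trained to map the noisy sequence back to the original.
print(add_noise("the cat sat on the mat".split(), seed=0))
```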

Read more

Machine Translation Weekly 49: Paraphrasing using multilingual MT

It is a well-known fact that when you have a hammer, everything looks like a nail. It is a less-known fact that when you have a sequence-to-sequence model, everything looks like machine translation. One example of this thinking is the paper Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity recently uploaded to arXiv by researchers from Johns Hopkins University. The paper approaches the task of paraphrase generation, i.e., for a source sentence, they want […]

Read more

Machine Translation Weekly 50: Language-Agnostic Multilingual Representations

Pre-trained multilingual representations promise to make the current best NLP models available even for low-resource languages. With a truly language-neutral pre-trained multilingual representation, we could train a task-specific model for English (or another language with available training data), and such a model would work for all languages the representation model covers. (Except that by doing so, the models might transfer Western values into low-resource language applications.) There are several multilingual contextual embedding models (such as multilingual BERT or […]
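A hedged sketch of what this zero-shot transfer looks like in practice (my own illustration with multilingual BERT, mean pooling, and a simple linear classifier; the model name and the pooling choice are assumptions, not taken from the post):

```python
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

# Assumption: multilingual BERT as the shared encoder (any multilingual model would do).
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(sentences):
    """Mean-pooled contextual embeddings used as fixed sentence vectors."""
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state   # (batch, seq_len, dim)
    mask = enc["attention_mask"].unsqueeze(-1)      # zero out padding positions
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Train a task-specific classifier on English data only...
english_texts = ["I loved this movie.", "This film was terrible."]
labels = [1, 0]
clf = LogisticRegression().fit(embed(english_texts), labels)

# ...and apply it directly to a language the classifier never saw during training.
print(clf.predict(embed(["Dieser Film war schrecklich."])))
```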

Read more