Neural Machine Translation

Few words on Natural Language Processing and User Autonomy

As natural language processing (NLP) finds its way from university labs and becomes a crucial element of many user-facing technologies (machine translation, search, language-model-based assistants), people start to get concerned about the ethics of this technology. When people talk about NLP ethics, the main topics are: biases that the models get from training data, replication of toxic behavior found on the Internet, underrepresentation of already underprivileged groups, and differences in technology availability between the Global North and the Global South. Now, […]

Highlights from Machine Translation and Multilinguality in March 2023

Here is what I found the most interesting in MT and multilinguality in March. I only feature two papers (both from Microsoft, by coincidence), not because there were too few on arXiv, but because I did not manage to read that much this month. In this paper, folks from Microsoft in India experiment with zero-shot crosslingual transfer for classification. They use a multi-task learning setup. Besides performing the task in the source language, they teach the model using a two-player game. […]

March 2, 2023 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in February 2023

There were plenty of interesting pre-prints on arXiv in February. Here is a brief summary of three that I think are cool but could get lost in the hundreds of papers that went public. The unreasonable effectiveness of few-shot learning for machine translation: Folks from Google experimented with few-shot MT based on language models. Instead of using one of the cool huge language models we all know, they train their own smaller ones. They prepare specific bi- and tri-lingual LMs (8B parameters; […]

February 20, 2023 Neural Machine Translation (NMT), NMT

Questions and answers about ChatGPT and large language models

There’s been a lot of media coverage of ChatGPT and language models lately, and I feel like not everything is being said quite right. That’s why I have prepared some questions and answers that hopefully help clarify what they are talking about. Questions: What is a (large) language model? What is ChatGPT? Are GPT-3.5 and ChatGPT the best things out there? Are there any available alternatives, ideally open source? How can ChatGPT speak multiple languages? Does it use machine translation? […]

February 7, 2023 Neural Machine Translation (NMT), NMT

Questions and answers about ChatGPT and large language models (in Czech)

Lately, the media have been writing quite often about ChatGPT and language models, and I feel that not everything is being said quite right. That is why I have prepared several questions and answers that will hopefully help clarify what is actually being talked about. Questions: What is a (large) language model? What is ChatGPT? Are GPT-3.5 and ChatGPT the best there is? Are there any available alternatives, ideally open source? How come ChatGPT can speak Czech? Does it use machine translation? Where does the knowledge that ChatGPT has come from? […]

February 6, 2023 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in December 2022 and January 2023

Here is what I found interesting on arXiv in December 2022 and January 2023. At the beginning of January, there were relatively few new pre-prints in general. But now it is gaining momentum again, with more papers appearing every day. BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting. In this paper, folks from the Big Science Workshop elaborate on how to add language support to the already trained BLOOM model. They tried two approaches: MAD-X (clever stuff with adapters, […]

January 19, 2023 Neural Machine Translation (NMT), NMT

Why don’t people use character-level MT? – One year later

In this post, I comment on our (i.e., myself, Helmut Schmid and Alex Fraser) year-old paper “Why don’t people use character-level machine translation,” published in Findings of ACL 2022. Here, I will (besides briefly summarizing the paper’s main message) mostly comment on what I learned while working on the one-year-later perspective, focusing more on what I would do differently now. If you are interested in the exact research content, read the paper or watch a 5-minute presentation. Paper TL;DR Doing […]

December 21, 2022 Neural Machine Translation (NMT), NMT

Notes from EMNLP 2022

Last week I was at EMNLP in Abu Dhabi. Besides losing my passport and figuring out what to do on such an occasion (many thanks to the personnel of the Czech embassy in Abu Dhabi), I had plenty of interesting conversations and saw many interesting posters. When I was at my first NLP conference 8 years ago, I was amazed by the papers presented at the conference and returned with a long list of ideas of what I should try […]

December 2, 2022 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in November 2022

Here are my monthly highlights from papers on machine translation and multilinguality that appeared on arXiv in November 2022. A preprint with 19 authors from 13 institutions presents something like the T0 model, but instead of starting with the (more or less) monolingual T5 model, they use multilingual BLOOM and mT5 and call the resulting models BLOOMZ and mT0. The main idea is finetuning the underlying model (or the foundation model?) on as many tasks as possible so that the model […]

November 6, 2022 Neural Machine Translation (NMT), NMT

Highlights from Machine Translation and Multilinguality in October 2022

Here are my monthly highlights from papers on machine translation and multilinguality that appeared on arXiv, many of them preprints from the upcoming EMNLP conference. Folks from Amazon published a pre-print that introduces a simple method for making pre-trained multilingual representations more robust to noisy inputs. It is a very straightforward approach: they sample typos based on Wikipedia logs and use those during model training. In addition, they add a contrastive loss that forces the noisy versions of sentences […]
