Issue #17 – Speeding up Neural MT
15 Nov 2018
Author: Raj Nath Patel, Machine Translation Scientist @ Iconic
For all the benefits Neural MT has brought in terms of translation quality, producing output quickly and efficiently is still a challenge for developers. All things being equal, Neural MT is slower than its statistical counterpart. This is particularly the case when running translation on standard processors (CPUs), which is still common practice, as opposed to faster and more powerful (but also more expensive) graphics processors (GPUs).
The slowest component in the Neural MT process is searching for the most likely translation in the neural network, commonly known as decoding. In this article, we take a look at the decoding process of Neural MT and dive into a few recent pieces of work focused on speeding up this step.
Search Algorithms in Neural MT
For Neural MT, a widely used architecture is the attention-based encoder-decoder framework. The encoder encodes the source sentence into a representation (the context vector), and the decoder uses this to generate the target translation word by word. In the simplest form, the translation of each word is generated by drawing a probability distribution over all possible target words and picking the most probable one.
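To make this concrete, here is a minimal sketch in Python/NumPy of that simplest form of decoding (greedy search). The decoder_step function, the context argument, and the token IDs are hypothetical stand-ins for a real decoder network; the point is only the per-word loop: compute a distribution over the target vocabulary, pick the most probable word, and feed it back in.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax: turns raw scores into a
    # probability distribution over the target vocabulary.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def greedy_decode(decoder_step, context, bos_id, eos_id, max_len=50):
    """Greedy decoding: at each step, keep only the single most
    probable word.

    `decoder_step(context, prev_token)` is a stand-in for the decoder
    network; it returns unnormalised scores (logits) over the
    target vocabulary, conditioned on the encoder's context vector
    and the previously generated word.
    """
    output = []
    token = bos_id
    for _ in range(max_len):
        logits = decoder_step(context, token)
        probs = softmax(logits)        # distribution over all target words
        token = int(np.argmax(probs))  # greedy choice: most probable word
        if token == eos_id:
            break
        output.append(token)
    return output

# Toy usage with a random "decoder" (illustration only).
rng = np.random.default_rng(0)
vocab_size = 10
dummy_step = lambda ctx, tok: rng.normal(size=vocab_size)
print(greedy_decode(dummy_step, context=None, bos_id=0, eos_id=1))
```

Because each step depends on the previously generated word, this loop is inherently sequential, which is one reason decoding dominates translation time.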