Building Transformer Models with Attention Crash Course. Build a Neural Machine Translator in 12 Days

The Transformer is a recent breakthrough in neural machine translation. Natural languages are complicated: a word in one language can be translated into multiple words in another, depending on the context. But what exactly a context is, and how to teach a computer to understand it, was a hard problem to solve. The invention of the attention mechanism solved the problem of how to encode context into a word, or in other words, how to represent a word and its context together in a numerical vector. The Transformer takes this one level higher: it lets us build a neural network for natural language translation using only the attention mechanism, with no recurrent structure. This not only simplifies the network but also allows its computation to be parallelized, since the model no longer has to process a sentence one word at a time.
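To make the idea concrete, here is a minimal sketch of scaled dot-product attention, the building block the Transformer relies on, in plain NumPy. The function name, array names, and toy shapes are illustrative assumptions, not code from the course; the point is only that each output vector is a context-aware mixture of the others.

```python
import numpy as np

def scaled_dot_product_attention(query, key, value):
    """Minimal scaled dot-product attention (illustrative sketch).

    query: (seq_len_q, d_k), key: (seq_len_k, d_k), value: (seq_len_k, d_v).
    Returns an array of shape (seq_len_q, d_v) in which each row is a
    weighted blend of the value rows, i.e. a word vector plus its context.
    """
    d_k = query.shape[-1]
    # Similarity of every query position to every key position
    scores = query @ key.T / np.sqrt(d_k)
    # Softmax over the key axis turns similarities into mixing weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is a context-dependent mixture of the values
    return weights @ value

# Toy example: a "sentence" of 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)  # self-attention
print(out.shape)  # (4, 8): each token now carries information about its context
```

Passing the same array as query, key, and value, as in the last lines, is self-attention: every word attends to every other word in the same sentence, which is exactly how the Transformer encodes context without any recurrence.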