
Understanding Transformers Part 1: How Transformers Understand Word Order
In this article, we will explore transformers. We will work on the same problem as before: translating a simple English sentence into Spanish using a transformer-based neural network.

Since a transformer is a type of neural network, and neural networks operate only on numerical data, the first step is to convert words into numbers. There are several ways to do this, but the most commonly used method in modern neural networks is word embedding. Word embeddings represent each word as a vector of numbers, capturing the meaning of words and the relationships between them.

Before going deeper into the transformer architecture, let us first understand positional encoding. This is the technique transformers use to keep track of the order of words in a sentence. Unlike traditional sequence models, transformers do not process words one at a time; because of this, they need an additional way to encode each word's position.
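As a concrete illustration of positional encoding, here is a minimal sketch of the sinusoidal scheme popularized by the original transformer paper. The article has not yet specified which encoding it will use, so treat the formula below (sine for even dimensions, cosine for odd ones, with a 10000-based frequency) as one common choice rather than the article's own method:

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding (one common scheme, assumed here).

    For position pos and dimension index i:
      PE[pos, 2i]   = sin(pos / 10000**(2i / d_model))
      PE[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
    Returns a seq_len x d_model table as a list of lists.
    """
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)      # even dimensions use sine
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd dimensions use cosine

    return pe

pe = positional_encoding(seq_len=4, d_model=8)
# Position 0 gives sin(0)=0 and cos(0)=1 in alternation:
print(pe[0])  # → [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```

In practice, each row of this table is simply added to the embedding vector of the word at that position, so the same word carries different numbers depending on where it appears in the sentence.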

