Understanding Transformers Part 3: How Transformers Combine Meaning and Position


via Dev.to, by Rijul Rajesh

In the previous article, we learned how positional encoding is generated using sine and cosine waves. Now we will apply those values to each word in the sentence.

Applying Positional Encoding to All Words

To get the positional values for the second word, we take the y-axis values from each curve at the x-axis position corresponding to the second word. To get the positional values for the third word, we follow the same process.

Positional Values for Each Word

By doing this for every word, we get a set of positional values for each one. Each word now has its own unique sequence of positional values.

Combining Embeddings with Positional Encoding

The next step is to add these positional values to the word embeddings. After this addition, each word embedding contains both:

- semantic meaning (from the embeddings)
- positional information (from the positional encoding)

So for the sentence "Jack eats burger", we now have embeddings that also capture word order.

What Happens When We Change Word Order
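The steps above can be sketched in a few lines of NumPy. The sinusoidal formula is the standard one from the original Transformer paper (sine on even dimensions, cosine on odd); the three 4-dimensional embedding vectors for "Jack eats burger" are invented purely for illustration, not real trained embeddings.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding: each position gets a unique
    # vector of sine/cosine values sampled from curves of different
    # frequencies (even dims use sin, odd dims use cos).
    pos = np.arange(seq_len)[:, np.newaxis]    # shape (seq_len, 1)
    i = np.arange(d_model)[np.newaxis, :]      # shape (1, d_model)
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])
    pe[:, 1::2] = np.cos(angle[:, 1::2])
    return pe

# Toy embeddings for "Jack eats burger" (3 words, 4 dimensions),
# made up for this example.
embeddings = np.array([
    [0.2, 0.5, 0.1, 0.9],   # Jack
    [0.7, 0.3, 0.8, 0.4],   # eats
    [0.6, 0.1, 0.2, 0.5],   # burger
])

pe = positional_encoding(seq_len=3, d_model=4)

# Element-wise addition: each row now carries both semantic meaning
# (from the embedding) and positional information (from the encoding).
combined = embeddings + pe
```

Note that position 0 always contributes the vector [0, 1, 0, 1, ...] (sin 0 = 0, cos 0 = 1), while later positions contribute different values, which is what makes each word's combined vector order-sensitive.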

Continue reading on Dev.to
