Understanding Transformers Part 4: Introduction to Self-Attention

by Rijul Rajesh, via Dev.to

In the previous article, we learned how word embeddings and positional encoding are combined to represent both meaning and position. Let’s return to our example of translating the English sentence “Let’s go”: we compute the positional encoding for both words and add those positional values to their word embeddings.

Understanding Relationships Between Words

Now let’s explore how a transformer keeps track of relationships between words. Consider the sentence:

“The pizza came out of the oven and it tasted good.”

The word “it” could refer to pizza, or it could potentially refer to oven. It is important that the transformer correctly associates “it” with “pizza”.

Self-Attention

Transformers use a mechanism called self-attention to handle this. Self-attention helps the model determine how each word relates to every other word in the sentence, including itself. Once these relationships are calculated, they are used to determine how each word is represented. For example, if “it” is
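The relationship scores described above can be sketched numerically. Below is a minimal scaled dot-product self-attention sketch in NumPy; the embeddings and projection matrices are random stand-ins (a real transformer would use trained embeddings plus positional encodings, and learned query/key/value weights), so the particular weights are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

tokens = ["The", "pizza", "came", "out", "of", "the", "oven",
          "and", "it", "tasted", "good"]
d = 8                                   # embedding size (assumed for this sketch)
X = rng.normal(size=(len(tokens), d))   # stand-in word embeddings

# Query, key, and value projections (learned in a real model; random here)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# Each word's query is compared against every word's key (including its own)
scores = Q @ K.T / np.sqrt(d)           # shape: (11, 11)

# Softmax turns each row of scores into attention weights that sum to 1
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)

# Each word's new representation is a weighted mix of all value vectors
output = weights @ V                    # shape: (11, 8)
```

The row of `weights` for “it” tells you how strongly “it” attends to every other word; with trained parameters, the weight on “pizza” would dominate, and the output vector for “it” would be pulled toward pizza’s representation.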

Continue reading on Dev.to
