Transformer - Encoder Deep Dive - Part 3: What is Self-Attention
Recap

Embedding: "The", "dog", "bit", "the", "man" each have a unique semantic identity.
Positional Encoding: Each word now knows exactly where it sits in the sentence.

Wait... what exactly is the Encoder's job (Part 2)?

The sole purpose of the Encoder is to understand context. Take the example "The dog bit the man" and look at the word "bit". On its own, "bit" could mean:

- A small piece of something (a "bit" of chocolate).
- The past tense of bite (the action).
- A digital 0 or 1 (a computer "bit").

The Encoder doesn't know which meaning applies until it pays attention to the words around it and associates "bit" with them. Before that happens, the words are like strangers in an elevator: they are standing near each other, but they aren't talking.

What exactly is "Self-Attention"?

Self: The model is looking at the same sentence it is currently processing. It isn't looking at a dictionary or a translation yet; it's just looking at its own words.
Attention: The model decides which other words in that sentence are relevant to each word it is processing.
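To make this concrete, here is a minimal sketch of self-attention with NumPy. The word list, dimension, and random projection matrices are all illustrative assumptions, not the article's code: each word's vector produces a query, a key, and a value, and the softmax over query-key scores decides how much each word "listens to" every other word in the same sentence.

```python
import numpy as np

# Toy self-attention sketch (assumed shapes and random weights, for illustration only).
np.random.seed(0)
d = 8
words = ["The", "dog", "bit", "the", "man"]
X = np.random.randn(len(words), d)  # stand-in for embeddings + positional encodings

# Hypothetical learned projections for queries, keys, and values.
Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# Scaled dot-product scores: every word scores every other word in the sentence.
scores = Q @ K.T / np.sqrt(d)

# Row-wise softmax turns scores into attention weights that sum to 1.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

# Each output row is a context-aware blend of all the value vectors.
out = weights @ V

print(weights[2].round(2))  # how strongly "bit" attends to each of the five words
```

Row 2 of `weights` is the interesting part for the running example: with trained (rather than random) projections, "bit" would put most of its weight on "dog" and "man", which is exactly the association that disambiguates it.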

