
Understanding Transformers Part 2: Positional Encoding with Sine and Cosine
In the previous article, we converted words into embeddings. Now let’s see how transformers add position information to those numbers.

The values that represent word order in a transformer come from a family of sine and cosine waves. Each curve is responsible for generating the position values for one specific dimension of the word embedding.

Understanding the Idea

Think of each embedding dimension as getting its value from a different wave. For example:

- The green curve provides the positional values for the first embedding dimension of every word. For the first word in the sentence, which lies at the far left of the graph (position 0 on the x-axis), the value taken from the green curve is 0 (the y-axis value at that position).
- The orange curve provides the positional values for the second embedding dimension. At the same position (the first word), the value from the orange curve is 1.
- The blue curve provides the positional values for the third embedding dimension. For the first word, the value is 0.

This alternating 0, 1, 0 pattern at position 0 is no accident: even-numbered dimensions draw their values from sine curves (and sin(0) = 0), while odd-numbered dimensions draw theirs from cosine curves (and cos(0) = 1). A minimal sketch of this computation follows below.
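To make the idea concrete, here is a short NumPy sketch of the standard sinusoidal positional encoding from the "Attention Is All You Need" paper, which is the scheme the curves above illustrate. The function name positional_encoding and the parameters max_len and d_model are illustrative choices, not names from this article:

```python
import numpy as np

def positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding ("Attention Is All You Need").

    Returns an array of shape (max_len, d_model) where row `pos` holds
    the position values added to the embedding of the word at position `pos`.
    """
    positions = np.arange(max_len)[:, np.newaxis]          # (max_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]         # even dimension indices
    angle_rates = 1.0 / np.power(10000.0, dims / d_model)  # one frequency per curve
    angles = positions * angle_rates                       # (max_len, d_model / 2)

    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions: sine curves, so 0 at position 0
    pe[:, 1::2] = np.cos(angles)  # odd dimensions: cosine curves, so 1 at position 0
    return pe

# First word (position 0): the alternating 0, 1, 0, ... values described above.
print(positional_encoding(max_len=4, d_model=6)[0])  # -> [0. 1. 0. 1. 0. 1.]
```

Each column of the returned array corresponds to one of the colored curves: column 0 is the green (sine) curve, column 1 the orange (cosine) curve, and so on, with the frequency decreasing as the dimension index grows.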

