
Understanding Transformers Part 1: How Transformers Understand Word Order
In this article, we will explore transformers. We will work on the same problem as before: translating a simple English sentence into Spanish using a transformer-based neural network.

Since a transformer is a type of neural network, and neural networks operate only on numerical data, the first step is to convert words into numbers. There are several ways to do this, but the most commonly used method in modern neural networks is word embedding. Word embeddings represent each word as a vector of numbers, capturing the meaning of words and the relationships between them.

Before going deeper into the transformer architecture, let us first understand positional encoding. This is the technique transformers use to keep track of the order of words in a sentence. Unlike traditional sequence models, transformers do not process words one at a time; because of this, they need an additional way to encode each word's position.
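As a concrete illustration of positional encoding, here is a minimal sketch of the sinusoidal scheme popularized by the original transformer paper. The article has not yet specified which encoding it will use, so treat the formula below (sine for even dimensions, cosine for odd ones, with a 10000-based frequency) as one common choice rather than the article's own method:

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding (one common scheme, assumed here).

    For position pos and dimension index i:
      PE[pos, 2i]   = sin(pos / 10000**(2i / d_model))
      PE[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
    Returns a seq_len x d_model table as a list of lists.
    """
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)      # even dimensions use sine
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd dimensions use cosine

    return pe

pe = positional_encoding(seq_len=4, d_model=8)
# Position 0 gives sin(0)=0 and cos(0)=1 in alternation:
print(pe[0])  # → [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```

In practice, each row of this table is simply added to the embedding vector of the word at that position, so the same word carries different numbers depending on where it appears in the sentence.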

