
Building makemore Part 4: Becoming a Backprop Ninja

via Andrej Karpathy • 3y ago

We take the 2-layer MLP (with BatchNorm) from the previous video and backpropagate through it manually, without using PyTorch autograd's loss.backward(): through the cross-entropy loss, the 2nd linear layer, the tanh, the batchnorm, the 1st linear layer, and the embedding table. Along the way we build a strong intuitive understanding of how gradients flow backwards through the compute graph, at the level of efficient tensors rather than individual scalars as in micrograd. This builds competence and intuition around how neural nets are optimized, and sets you up to more confidently innovate on and debug modern neural networks.

I recommend you work through the exercise yourself, with the video in tandem: whenever you get stuck, unpause the video and see me give away the answer. This video is not intended to be simply watched. The exercise is here: https://colab.research.google.com/drive/1WV2oi2fh9XXyldh02wupFQX0wh5ZC-z-?usp=sharing

Links:
- makemore on github:
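To give a flavor of what the exercise involves, here is a minimal sketch (not the lecture's notebook code; the variable names h, W2, b2, Yb and the shapes are illustrative assumptions) of the kind of check it asks for: run the last linear layer and the cross-entropy loss forward, derive the gradients by hand, and compare them against what loss.backward() produces.

import torch
import torch.nn.functional as F

torch.manual_seed(42)
n, n_hidden, vocab_size = 32, 64, 27                      # batch size, hidden units, characters
h  = torch.randn(n, n_hidden, requires_grad=True)         # tanh activations from the 1st layer
W2 = torch.randn(n_hidden, vocab_size, requires_grad=True)
b2 = torch.randn(vocab_size, requires_grad=True)
Yb = torch.randint(0, vocab_size, (n,))                   # target character indices

logits = h @ W2 + b2                                      # 2nd linear layer
loss = F.cross_entropy(logits, Yb)                        # mean-reduced cross entropy
loss.backward()                                           # autograd reference gradients

with torch.no_grad():
    # d(loss)/d(logits): softmax(logits), minus 1 at the correct class, divided by batch size
    dlogits = F.softmax(logits, dim=1)
    dlogits[range(n), Yb] -= 1.0
    dlogits /= n
    # backprop through logits = h @ W2 + b2
    dh  = dlogits @ W2.T
    dW2 = h.T @ dlogits
    db2 = dlogits.sum(0)

for name, manual, auto in [("h", dh, h.grad), ("W2", dW2, W2.grad), ("b2", db2, b2.grad)]:
    print(f"{name}: matches autograd = {torch.allclose(manual, auto, atol=1e-6)}")

The full exercise extends the same pattern backwards through tanh, batchnorm, the 1st linear layer, and the embedding table, checking each manual gradient against autograd in the same way.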

Watch on YouTube

Related Articles

  • Why Degrees Don’t Make Developers (Continuously Delivered • 2w ago)
  • When you write your tests TOO LATE... #softwareengineering (Continuously Delivered • 3w ago)
  • "Hello police? I'd like to report a journalism." (Benn Jordan • 1mo ago)
  • Traditional X-Mas Stream (Yannic Kilcher • 1mo ago)
  • Praxis meets AI Agents (Dev.to • 26m ago)
