I Couldn't Quite Understand Transformers Until I Built One

By Jem Herbert-Rice, via Dev.to Tutorial

The Problem

I read "Attention Is All You Need" a couple of times and watched a few hours' worth of YouTube (thanks 3Blue1Brown and Andrej Karpathy!) to try to wrap my head around multi-head attention and transformers. However, it wasn't quite clicking. So I built a visualiser where I could watch it happen myself, and it turned out to be pretty useful! There's something about wading through all the error messages to get to a functional final product that is the most satisfying feeling in the world.

The Solution

Seeing how attention works in such a plain format was the 'A-Ha!' moment, and I thought it might help some others as well. So I fleshed it out a little more, added some trained models and causal masking options, added some user-friendly features, and bundled it all together in a Streamlit app. So now, you can try my Transformer Attention Visualiser right here! Try it here →

What It Does

It lets you:

- Build sentences (Mad Libs style or custom input)
- Watch attention patterns
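For context, the mechanism the visualiser animates is scaled dot-product attention, optionally with the causal mask mentioned above. The sketch below is not the app's actual code, just a minimal NumPy illustration of what the attention-weight matrix being visualised looks like; the function name and toy shapes are made up for this example.

```python
import numpy as np

def attention(q, k, v, causal=False):
    """Scaled dot-product attention. With causal=True, each position
    may only attend to itself and earlier positions."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # (seq, seq) similarity scores
    if causal:
        # mask out future positions with -inf before the softmax
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    # softmax over the key dimension gives the attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# toy example: 3 tokens with 4-dimensional embeddings, self-attention
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out, w = attention(x, x, x, causal=True)
# with the causal mask, row i of w has zero weight on columns j > i
```

The `weights` matrix returned here (rows summing to 1, upper triangle zeroed when masked) is exactly the kind of grid an attention visualiser renders as a heatmap.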

Continue reading on Dev.to Tutorial
