[Learning notes] reading "Attention is all you need" paper
**Abstract**

- **Encoder**: the part we studied with RNNs that reads the input sequence and "digests" it; it is responsible for creating and updating the hidden state. In a way it's like a person reading something and keeping the "gist" of it in mind.
- **Decoder**: the part responsible for taking that "gist" (the hidden state, the mathematical vector) and using it to produce an output.
- **Attention mechanism**: plays a role similar to the hidden state in an RNN, I guess.
- **"Dispensing with"**: getting rid of.
- **Pros of transformers**:
  - Better results
  - Parallelization
  - Less time to train
- **"Speedometer" reading for AI translation ability, BLEU (Bilingual Evaluation Understudy)**: a mathematical formula used as a metric to grade a machine's translation by comparing it with a translation written by a professional (a human, ofc).
  - 0.0 means there was no match and the model produced a horribly wrong result.
  - 100.0 (or 1.0) would mean a perfect match, but ofc this can never really be the case, since we can say the exact same thing using lots of different wordings.
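The encoder-decoder idea in the notes above can be sketched as a toy scalar RNN. Everything here is made up for illustration (the weights, the scalar hidden state); the real architecture uses vectors and learned matrices, but the loop structure is the same: the encoder folds the whole input into one hidden state, and the decoder unrolls that state into outputs.

```python
import math

# Made-up scalar "weights" for a toy RNN; a real model learns matrices.
W_in, W_h, W_out = 0.7, 0.3, 1.2

def encode(sequence):
    """Read the input step by step and 'digest' it into one hidden state."""
    h = 0.0
    for x in sequence:
        # New hidden state = function of previous hidden state + current input.
        h = math.tanh(W_in * x + W_h * h)
    return h  # the "gist" of the whole sequence

def decode(h, steps):
    """Unroll the digested 'gist' into an output sequence."""
    outputs = []
    for _ in range(steps):
        y = math.tanh(W_out * h)
        outputs.append(y)
        h = math.tanh(W_h * h + W_in * y)  # feed the output back in
    return outputs
```

Note how the single number `h` is the only thing the decoder ever sees: that bottleneck is exactly what the attention mechanism was invented to relieve.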
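The BLEU score described above can be sketched in a few lines. This is a simplified single-sentence, single-reference version (the paper reports corpus-level BLEU, and real implementations add smoothing), but it shows the two ingredients: clipped n-gram precision and a brevity penalty.

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU; returns a score in [0, 1]."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(candidate[i:i + n])
                              for i in range(len(candidate) - n + 1))
        ref_ngrams = Counter(tuple(reference[i:i + n])
                             for i in range(len(reference) - n + 1))
        # "Clipped" counts: a candidate n-gram only gets credit up to the
        # number of times it appears in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions...
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # ...scaled by a brevity penalty that punishes too-short candidates.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * geo_mean
```

A candidate identical to the reference scores 1.0 (i.e. 100.0 on the 0-100 scale); any paraphrase, even a perfectly good one, scores less, which is exactly why a perfect score never happens in practice.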



