Let's reproduce GPT-2 (124M)
Article · Machine Learning · via Andrej Karpathy

We reproduce GPT-2 (124M) from scratch. This video covers the whole process: first we build the GPT-2 network, then we optimize its training to be...

Andrej Karpathy · 1y ago
Let's build the GPT Tokenizer
Article · Machine Learning · via Andrej Karpathy

The Tokenizer is a necessary and pervasive component of Large Language Models (LLMs), where it translates between strings and tokens (text chunks). To...

Andrej Karpathy · 2y ago
Building makemore Part 2: MLP
Article · Machine Learning · via Andrej Karpathy

We implement a multilayer perceptron (MLP) character-level language model. In this video we also introduce many basics of machine learning (e.g. model...

Andrej Karpathy · 3y ago
