The LLM Journey (Part 4): Training GPT-2 and the Compute Gold Rush

In previous parts of The LLM Journey, we’ve covered: Part 1: How raw internet text becomes tokens. Part 2: How neural networks learn to predict the next token. Part 3: […]
The LLM Journey (Part 3): From Training to Inference

In Part 2, we unpacked how large language models (LLMs) learn during training: billions of tokens fed into neural networks, shaping parameters that capture patterns of human language. But once […]