
What If GPT Didn’t “Learn”, It Just Found a Winning Lottery Ticket?
I used to say it confidently: “The model learned this.” It felt obvious. We initialize weights. We run gradient descent. We minimize loss. The network learns. End of story.

But then I came across a research paper that genuinely disturbed that simple narrative. Reading late one night, I stumbled upon a paper by Jonathan Frankle and Michael Carbin. At first, it looked like just another pruning paper. But the core claim made me stop and reread it twice. It suggested something radical: what if neural networks don’t build intelligence from scratch during training? What if, hidden inside a randomly initialized network, there already exists a smaller subnetwork capable of solving the task, and training merely discovers it?

That idea is known as the Lottery Ticket Hypothesis. And if it’s even partially true, then gradient descent isn’t constructing intelligence; it’s searching for a winning ticket inside structured randomness.

Before we go further, let’s unpack what that actually means.
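The procedure behind the hypothesis is iterative magnitude pruning with weight rewinding: train the network, prune the smallest-magnitude weights, reset the survivors to their original random initialization, and repeat. Here is a minimal sketch of that loop on a toy linear-regression task; the task, sizes, and pruning schedule are illustrative choices, not the paper’s actual experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task (illustrative): recover a sparse linear map y = X @ w_true.
n, d = 200, 20
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:4] = [2.0, -1.5, 1.0, 0.5]   # only 4 of 20 weights matter
y = X @ w_true

def train(w0, mask, steps=500, lr=0.1):
    """Gradient descent on MSE, keeping pruned weights frozen at zero."""
    w = w0 * mask
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n
        w = (w - lr * grad) * mask
    return w

w_init = rng.normal(size=d)          # the random initialization: "the lottery"
mask = np.ones(d)

# Iterative magnitude pruning: train, drop the smallest surviving weights,
# then rewind the survivors to their ORIGINAL initial values and repeat.
for _ in range(2):
    w_trained = train(w_init, mask)
    alive = np.flatnonzero(mask)
    k = int(0.5 * alive.size)        # prune 50% of the remaining weights
    drop = alive[np.argsort(np.abs(w_trained[alive]))[:k]]
    mask[drop] = 0.0

# Retrain the sparse "winning ticket" from its original initialization.
w_ticket = train(w_init, mask)
loss = np.mean((X @ w_ticket - y) ** 2)
print(f"surviving weights: {int(mask.sum())}/{d}, final MSE: {loss:.2e}")
```

The crucial detail is the rewind: the subnetwork is retrained from its *original* random values, not from the trained ones. The hypothesis is that this sparse subnetwork still trains to full accuracy, which is exactly what the toy run shows at 25% density.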



