Video summary
Dario Amodei on the state of AI scaling
In this Dwarkesh Patel conversation, Dario Amodei reflects on how AI progress has evolved over the last three years. He says the core scaling story has held up, with both pre-training and RL showing continued gains as models train on broader data for longer. He also frames current systems as partway between human learning and evolution, and argues that generalization emerges from scale rather than from teaching every skill directly. The excerpt centers on his view that AI may be approaching the end of its exponential phase, while still leaving room for major near-term gains in verifiable tasks like coding.
Scaling laws still seem intact
Explains why pre-training and reinforcement learning both appear to follow log-linear scaling, with broader data and longer training leading to better generalization.
A new way to think about learning
Argues that today’s models sit between human learning and evolution, with in-context learning and training playing different roles along that spectrum.
The end of the exponential?
Says the most surprising shift is how little the public recognizes how close AI may be to the end of its exponential growth phase.
Topics
AI scaling laws
Amodei describes how both pre-training and RL appear to keep improving log-linearly as data, compute, and training time increase.
How models learn
He argues that current models can be understood as sitting between evolution and human learning, with in-context learning filling a different role than training.
The end of the exponential
He says the biggest surprise is the lack of public awareness about how close AI may be to the end of its exponential growth.
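As a rough illustration of what "log-linear" means in the scaling topic above: under such a trend, each tenfold increase in an input like compute adds roughly the same fixed increment to a performance measure. The sketch below assumes that simple form. The function names, the constants, and the compute values are hypothetical and are not taken from the conversation.

```python
import math

# Hypothetical constants chosen only for illustration; they do not
# come from the conversation or from any measured model.
INTERCEPT = 10.0
SLOPE = 5.0  # score gained per 10x increase in compute


def log_linear_score(compute: float, intercept: float = INTERCEPT,
                     slope: float = SLOPE) -> float:
    """Return a score that grows linearly in log10(compute)."""
    if compute <= 0:
        raise ValueError("compute must be positive")
    return intercept + slope * math.log10(compute)


def gains_per_step(computes: list[float]) -> list[float]:
    """Return the score gain between consecutive compute budgets."""
    scores = [log_linear_score(c) for c in computes]
    return [later - earlier for earlier, later in zip(scores, scores[1:])]


# Each budget is 10x the previous one, so every gain equals SLOPE
# (up to floating-point rounding).
gains = gains_per_step([1e20, 1e21, 1e22, 1e23])
```

Under this assumed form, gains per tenfold step stay constant rather than shrinking or growing, which is one way to read a straight line on a plot with a logarithmic x-axis.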
Sample transcript excerpt
So again, this is situated between evolution and human learning. But once you learn all those skills, you have them. Just like with pre-training, just how the models know more, if I look at a pre-trained model, it knows more about the history of samurai in Japan than I do. It knows more about baseball than I do. It knows more about low-pass filters and electronics, all of these things. Its knowledge is way broader than mine. So I think even just that may get us to the point where the models are better at everything.
We also have, again, just with scaling the kind of existing setup, the in-context learning.