An audio version of my blog post, Thoughts on AI progress (Dec 2025)


Dwarkesh Podcast

2025-12-24

12 min

Episode description

Read the essay here. Timestamps 00:00:00 What are we scaling? 00:03:11 The value of human labor 00:05:04 Economic diffusion lag is cope 00:06:34 Goal-post shifting is justified 00:08:23 RL scaling 00:09:18 Broadly deployed intelligence explosion Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe

Episode transcript

  • I'm confused why some people have super short timelines,

  • yet at the same time are bullish on scaling up reinforcement learning atop LLMs.

  • If we're actually close to a human-like learner,

  • then this whole approach of training on verifiable outcomes is doomed.

  • Now,

  • currently the labs are trying to bake in a bunch of skills into these models through mid-training.

  • There's an entire supply chain of companies that are building RL environments,

  • which teach the model how to navigate a web browser or use Excel to build financial models.

  • Now, either these models will soon learn on the job in a self-directed way,

  • which will make all this pre-baking pointless.

  • Or they won't, which means that AGI is not imminent.

  • Humans don't have to go through a special training phase where they need to rehearse every single piece of software that they might ever need to use on the job.

  • Beren Millidge made an interesting point about this in a recent blog post he wrote.

  • He writes, quote, When we see frontier models improving at various benchmarks,

  • we should think not just about the increased scale and the clever ML research ideas,

  • but the billions of dollars that are paid to PhDs, MDs,

  • and other experts to write questions and provide example answers and reasoning targeting these precise capabilities.

  • You can see this tension most vividly in robotics.

  • In some fundamental sense, robotics is an algorithms problem, not a hardware or data problem.

  • With very little training, a human can learn how to teleoperate current hardware to do useful work.