2024-08-22
2 hours 30 minutes
Today, I'm chatting with Joe Carlsmith.
He's a philosopher, and in my opinion a capital-G Great philosopher.
And you can find his essays at joecarlsmith.com.
So we have a GPT-4, and it doesn't seem like a paper clipper kind of thing.
It understands human values.
In fact, you can have it explain: why is being a paper clipper bad?
Or, like, just tell me your opinions about being a paper clipper.
Or like, explain why the galaxy shouldn't be turned into paper clips.
Okay, so.
What is happening such that dot, dot, dot,
we have a system that takes over and converts the world into something valueless?
One thing I'll just say off the bat is:
when I'm thinking about misaligned AIs, or the type that I'm worried about,
I'm thinking about AIs that have a relatively specific set of properties related to agency, planning, and a kind of awareness and understanding of the world.
One is this capacity to plan,
to make relatively sophisticated plans on the basis of models of the world,
where those plans are being evaluated according to criteria.
That planning capability needs to be driving the model's behavior.
So there are models that are, in some sense, capable of planning,
but it's not like when they give output,