Joe Carlsmith — Preventing an AI takeover


Dwarkesh Podcast

2024-08-22

2 hours 30 minutes

Episode description

Chatted with Joe Carlsmith about whether we can trust power/techno-capital, how to not end up like Stalin in our urge to control the future, gentleness towards the artificial Other, and much more.

Check out Joe's sequence on Otherness and Control in the Age of AGI here. Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

Sponsors:
- Bland.ai is an AI agent that automates phone calls in any language, 24/7. Their technology uses "conversational pathways" for accurate, versatile communication across sales, operations, and customer support. You can try Bland yourself by calling 415-549-9654. Enterprises can get exclusive access to their advanced model at bland.ai/dwarkesh.
- Stripe is financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue.

If you’re interested in advertising on the podcast, check out this page.

Timestamps:
(00:00:00) - Understanding the Basic Alignment Story
(00:44:04) - Monkeys Inventing Humans
(00:46:43) - Nietzsche, C.S. Lewis, and AI
(01:22:51) - How should we treat AIs
(01:52:33) - Balancing Being a Humanist and a Scholar
(02:05:02) - Explore/exploit tradeoffs and AI

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe

Episode transcript

  • Today, I'm chatting with Joe Carlsmith.

  • He's a philosopher, and in my opinion a capital-G Great philosopher.

  • And you can find his essays at joecarlsmith.com.

  • So we have GPT-4, and it doesn't seem like a paper clipper kind of thing.

  • It understands human values.

  • In fact, you can have it explain, like, why is being a paper clipper bad?

  • Or, like, just tell me your opinions about being a paper clipper.

  • Or, like, explain why the galaxy shouldn't be turned into paper clips.

  • Okay, so.

  • What is happening such that...

  • we have a system that takes over and converts the world into something valueless?

  • One thing I'll just say off the bat is like,

  • when I'm thinking about misaligned AIs, or at least the type that I'm worried about,

  • I'm thinking about AIs that have a relatively specific set of properties related to agency and planning and kind of awareness and understanding of the world.

  • One is this capacity to plan,

  • to make relatively sophisticated plans on the basis of models of the world,

  • where those plans are evaluated according to some criteria.

  • That planning capability needs to be driving the model's behavior.

  • So there are models that are, in some sense, capable of planning,

  • but it's not like when they give output,