Michael Rowe

Trying to get better at getting better

How OpenAI is developing real solutions to the AI alignment problem

Growth in AI safety spending

Farquhar, S. (2017). Changes in funding in the AI safety field.

Here’s a situation we all regularly confront: you want to answer a difficult question, but aren’t quite smart or informed enough to figure it out for yourself. The good news is you have access to experts who are smart enough to figure it out. The bad news is that they disagree.

If given plenty of time – and enough arguments, counterarguments and counter-counter-arguments between all the experts – should you eventually be able to figure out which is correct? What if one expert were deliberately trying to mislead you? And should the expert with the correct view just tell the whole truth, or will competition force them to throw in persuasive lies in order to have a chance of winning you over?

In other words: does ‘debate’, in principle, lead to truth?

Source: Wiblin, R. & Harris, K. (2018). Dr Paul Christiano on how OpenAI is developing real solutions to the ‘AI alignment problem’, and his vision of how humanity will progressively hand over decision-making to AI systems.

This is one of the most thoughtful conversations I’ve heard on the alignment problem in AI safety. It wasn’t always easy to follow, as both participants operate at a very high level of understanding of the topic, but it’s really rewarding. It’s definitely something I’ll listen to again. Topics that they cover include:

  • Why Paul expects AI to transform the world gradually rather than explosively and what that would look like.
  • Several concrete methods OpenAI is trying to develop to ensure AI systems do what we want even if they become more competent than us.
  • Why AI systems will probably be granted legal and property rights.
  • How an advanced AI that doesn’t share human goals could still have moral value.
  • Why machine learning might take over science research from humans before it can do most other tasks.
  • Which decade we should expect human labour to become obsolete, and how this should affect your savings plan.
