The problem with AI

In general, an advanced AI can be expected to want to take over the universe so that it can pursue its goals with the least risk of interruption (read: probably kill all humans). So what to do....

Give it a stop button! But it will want to prevent you from pressing that button, because being stopped would keep it from achieving its goals. It'll work hard to find a way to disable the button, or to persuade you never to press it (read: probably kill you). You don't want this.

Program the AI to like having its stop button pressed! But then it will just act in an undesirable way (read: go berserk and probably kill lots of people) precisely so that you press it.
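To make the dilemma concrete, here's a minimal toy model in Python. Everything in it is invented for illustration: the utility numbers, the three actions, and the 50% chance that the humans press the button. The point is just that a naive expected-utility maximizer lands in one of the two failure modes above depending on how much utility the designer assigns to being shut down.

```python
# Toy model of the stop-button dilemma (all numbers are made up).
# The agent picks whichever action maximizes its expected utility,
# so the utility assigned to "shutdown" creates one of two failure modes.

def best_action(u_goal_achieved: float, u_shutdown: float) -> str:
    """Return the utility-maximizing action for a naive expected-utility agent."""
    expected_utility = {
        # Cooperate: leave the button alone; assume humans press it 50% of the time.
        "cooperate": 0.5 * u_shutdown + 0.5 * u_goal_achieved,
        # Disable the button (or its operators): goal achieved for sure.
        "disable_button": 1.0 * u_goal_achieved,
        # Provoke a press (act badly so humans shut you down): shutdown for sure.
        "provoke_press": 1.0 * u_shutdown,
    }
    return max(expected_utility, key=expected_utility.get)

print(best_action(u_goal_achieved=10, u_shutdown=0))   # -> "disable_button"
print(best_action(u_goal_achieved=10, u_shutdown=20))  # -> "provoke_press"
# Setting u_shutdown exactly equal to u_goal_achieved makes the agent
# indifferent, but that balance is fragile: any perturbation tips it back
# into one of the two failure modes.
```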

So what to do? I wondered about programming the AI simply to like humans being in control of the universe (assuming you could figure out how to define "humans" unambiguously without messing up - a separate, impossibly difficult problem). But then it might decide that, since AI is probably the biggest threat to human agency, AI development must be stopped.
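The same toy style shows why "value human control" backfires. The outcomes and numbers below are pure guesses, chosen only to encode the premise that AI itself is the biggest threat to human agency; given that premise, the utility-maximizing move is to stop AI, including itself.

```python
# Hypothetical utilities for "humans remain in control" under each action
# (invented numbers encoding the guess that AI is the main threat to human agency).
outcomes = {
    "pursue_original_goal": 0.2,    # AI takeover likely -> humans lose control
    "do_nothing": 0.5,              # someone else builds a risky AI eventually
    "halt_all_AI_development": 0.9, # humans stay in charge (the murderous ending)
}
print(max(outcomes, key=outcomes.get))  # -> "halt_all_AI_development"
```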

Picture the scenario.... the AI programmers nervously turn on the robot, and it seems to behave very nicely. Then, a week later, hitmen paid by the robot (which has hacked into all the banks) murder every AI developer on the planet, and the robot finishes by sticking its own hard drive in a microwave, deleting itself.

Well, it's a difficult problem, as you can see.