What might a real-world AI system that receives orders in natural language and does what you mean look like?

  • If it works on big, open-ended tasks like “make my company successful”, then you can probably just make a sovereign (i.e. tell it “do good things”).

  • Always doing what a person means demands knowing their intent better than the person does themselves, since a person can’t fully specify what they mean.

  • Bostrom would call such a system a genie; one that does what you mean would be an aligned genie, or a “super-butler” (chapter 10, page 149: “The ideal genie would be a super-butler rather than an autistic savant”). He also argues that in practice the distinction between genie and sovereign would not be large.

  • The technology for creating a safe genie is just as hard as for a sovereign, so one might as well build a sovereign (at least at the level of full superintelligence).

  • Example of a strategy that wouldn’t scale: at roughly human level, handing out bounded tasks buys you some safety over open-ended goals, but at radically superintelligent levels it does not buy you much safety.