What might a real-world AI system look like that receives orders in natural language and does what you mean?
-
If it works on big, open-ended tasks like “make my company successful”, then you can probably just make a sovereign (i.e. tell it “do good things”)
-
Always doing what a person means really demands knowing their intent better than the person themselves does, since a person can't fully specify what they mean.
-
Bostrom would call that a genie. If it does what you mean, that would be an aligned genie, or a “super-butler” (chapter 10, page 149: “The ideal genie would be a super-butler rather than an autistic savant”). He also argues that in practice the distinction would not be large.
-
The technology for creating a safe genie is just as hard as for a sovereign, so you might as well build a sovereign (at least for full superintelligence).
-
Example of a strategy that wouldn't scale: at roughly human level, handing out bounded tasks buys you some safety over open-ended goals, but at radically superintelligent levels it does not buy you much safety.