Would we know if an AGI was misaligned?
Can we ever be sure that an AI is aligned?
Deceptive Alignment