To understand the existential risks posed by AI, it helps to understand the details of how AI works and, on a more abstract level, the dynamics that could create these risks if AI is scaled up to intelligence far above the human level. Various concepts have been defined to help us think about such risks.
Acting strategically to reduce the risk requires models of how the future of advanced AI will play out. This involves answering questions like when advanced AI will arrive, how fast the transition to superintelligence would be, and what a superintelligence would be capable of.
A key part of mitigating AI risk is aligning AI with human intentions. There is a range of approaches to this: for example, methods like adversarial training and learning from human feedback have made current AI systems more likely to produce the kinds of outputs intended by their designers. However, these methods have weaknesses, and the safety methods used on existing models may not generalize to future contexts. There are various proposals for scaling safety techniques as capabilities increase, as well as attempts to investigate the alignment problem at a more fundamental level.
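To make "learning from human feedback" concrete, the following is a minimal sketch of one ingredient of such pipelines: fitting a reward model to pairwise human preferences via the Bradley-Terry formulation. It assumes a linear reward model over hypothetical response features and simulated preference labels; real systems train neural reward models and then optimize a policy against them.

```python
import math
import random

# Hypothetical sketch: learn a linear reward model from pairwise preferences
# (Bradley-Terry), one ingredient of learning-from-human-feedback pipelines.

def score(w, x):
    """Scalar reward: dot product of weights and response features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def update(w, preferred, rejected, lr=0.1):
    """One gradient ascent step on the Bradley-Terry log-likelihood:
    P(preferred beats rejected) = sigmoid(score(preferred) - score(rejected))."""
    margin = score(w, preferred) - score(w, rejected)
    p = 1.0 / (1.0 + math.exp(-margin))
    step = lr * (1.0 - p)  # larger update when the model is more surprised
    return [wi + step * (a - b) for wi, a, b in zip(w, preferred, rejected)]

random.seed(0)
# Hypothetical 3-dimensional response features; the simulated "human"
# always prefers the response with the higher value in dimension 0.
w = [0.0, 0.0, 0.0]
for _ in range(200):
    a = [random.random() for _ in range(3)]
    b = [random.random() for _ in range(3)]
    preferred, rejected = (a, b) if a[0] > b[0] else (b, a)
    w = update(w, preferred, rejected)

print(w[0] > w[1] and w[0] > w[2])  # → True: the reward tracks the preference
```

The weakness the paragraph alludes to is visible even here: the learned reward only captures the preference signal present in the comparisons, so behavior outside that distribution is unconstrained.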
Where technical alignment is about the design of AI systems, AI governance is about human decision-making around those systems. It focuses on designing 1) policies for safe AI development and deployment, for both current and future AI systems, and 2) incentive and enforcement mechanisms to ensure that relevant actors, such as governments and AI developers, follow those policies.