At a high level, what is the challenge of AI alignment?
Nick Bostrom writes that we’re facing the challenge of “Philosophy With A Deadline”.
Many problems relevant to AI alignment are problems philosophers have been dealing with for centuries. To what degree is meaning inherent in language, versus something that requires external context? How do we translate between the logic of formal systems and normal ambiguous human speech? Can morality be reduced to a set of ironclad rules, and if not, how do we know what it is at all?
Existing attempts to answer these questions (from Aristotle, Kant, Mill, Wittgenstein, Quine, and others) may help people understand these issues better, but they are not formal enough to be implemented in computer code. Just as a good textbook can help an American learn Chinese but cannot be encoded into machine language to make a Chinese-speaking computer, the philosophies that help humans are only a starting point for the project of building computers that understand us and share our values.
The field of AI alignment combines formal logic, mathematics, computer science, cognitive science, and philosophy in order to advance that project.
This is the philosophy; the other half of Bostrom’s formulation is the deadline. Traditional philosophy has been going on for almost three thousand years; AI alignment must be solved before the development of superintelligent AI, an event which may be anywhere from years to centuries away.
If the alignment problem isn’t adequately addressed by then, we are likely to see poorly aligned superintelligences that are unintentionally hostile to the human race, with some of the catastrophic outcomes mentioned above. This is why so many experts are urging quick action to bring AI alignment research up to an adequate level.
If it turns out that superintelligence is centuries away and such research is premature, little will have been lost. But if our projections prove too optimistic and superintelligence is imminent, then doing such research now rather than later becomes vital.