What are "coherence theorems" and what do they tell us about AI?

Coherence theorems (a subset of selection theorems) are defined in various ways, and their relevance for AI is contested. Here’s one way of defining coherence theorems:

“Unless an agent can be represented as maximizing expected utility, that agent is liable to pursue strategies that are dominated by some other available strategy.”

On this reading, there are no coherence theorems: theorems that derive conclusions about agents acting in line with expected utility maximization do so only by assuming certain things about the agent’s preferences, such as completeness and transitivity. That said, there are theorems which show that an agent whose preferences over lotteries satisfy certain constraints will either be representable as maximizing expected utility, or else there will be some available strategy which is better for it, relative to its own values. Such theorems simply require assumptions that may be more or less plausible for future AI systems.
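To make the idea of a dominated strategy concrete, here is a minimal toy sketch of the classic "money pump" against an agent whose strict preferences form a cycle (a violation of transitivity). It is an illustration only, not a proof of any theorem mentioned here. The item names, the fee, and the helper functions are hypothetical, and the sketch assumes the agent evaluates each offered trade in isolation rather than planning ahead.

```python
# Toy money pump: an agent with cyclic strict preferences A > B > C > A
# pays a small fee for every swap it prefers, and ends up holding its
# original item with less money than if it had refused every trade.

# Hypothetical preference relation: (x, y) means "x is strictly preferred to y".
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}

FEE = 1  # assumed amount the agent is willing to pay for a preferred swap


def accepts_trade(current, offered):
    """The agent swaps (and pays FEE) whenever it strictly prefers the offer."""
    return (offered, current) in PREFERS


def run_money_pump(start_item, start_money, offers):
    """Present the offers in order and return the agent's final (item, money)."""
    item, money = start_item, start_money
    for offered in offers:
        if accepts_trade(item, offered):
            item, money = offered, money - FEE
    return item, money


if __name__ == "__main__":
    # Starting with A: offer C (C > A), then B (B > C), then A (A > B).
    print(run_money_pump("A", 10, ["C", "B", "A"]))  # ('A', 7)
    # Refusing every trade would have left the agent with ('A', 10).
```

The strategy "accept each locally preferred swap" is dominated here by "refuse all trades", which yields the same item and more money. Whether a realistic agent would actually fall into such a sequence depends on assumptions like the myopic trade-by-trade evaluation used in this sketch.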

Many people are worried that attempts to construct powerful agents (at least in cases where AI alignment research is conducted without a very precise understanding of agentic cognition) will lead to AIs that behave a lot like expected utility maximizers. In turn, this worry relies on the view that the axioms used to prove ‘coherence theorems’ are likely to hold for powerful AI systems that arise in the future. Related discussions can be found below:

  • John Wentworth offers a proposal for how agents with incomplete preferences should make decisions.

  • There is a discussion here on the degree to which we should expect future AI systems to have complete preferences. This is important because completeness is one of the more contested axioms used to prove coherence theorems, and is often thought to be required in order to justify the more plausible constraint of transitivity.

    • A related theorem applying to agents with incomplete preferences is provided in this paper.
  • Background reading on the subject can be found in the following two articles, and this (open access) book.