What is a utility function?

3 min read

A utility function assigns numerical values ("utilities") to outcomes, in such a way that outcomes with higher utilities are absolutely always preferred to outcomes with lower utilities, with no exceptions; the lack of exploitable holes in the preference ordering is necessary for the definition and separates utility from mere reward.

Utility Functions do not work very well in practice for individual humans. Human drives are not coherent nor is there any reason to think they would converge to a utility-function-grade level of reliability (Thou Art Godshatter), and even people with a strong interest in the concept have trouble working out what their utility function actually is even slightly (Post Your Utility Function). Furthermore, humans appear to calculate reward and loss separately - adding one to the other does not predict their behavior accurately, and thus human reward is not human utility. This makes humans highly exploitable - and in fact, not being exploitable would be a minimum requirement in order to qualify as having a coherent utility function.

pjeby posits humans' difficulty in understanding their own utility functions as the root of akrasia.

However, utility functions can be a useful model for dealing with humans in groups, e.g. in economics.

The VNM Theorem tag is likely to be a strict subtag of the Utility Functions tag, because the VNM theorem establishes when preferences can be represented by a utility function, but a post discussing utility functions may or may not discuss the VNM theorem/axioms.

Because utility functions arise from VNM rationality, they may still be of note in understanding intelligent systems even when the system does not explicitly store a utility function anywhere, since reducing exploitable error rate should eventually converge to utility-function-like guarantees.

Utility Functions