How can I use a background in the social sciences to help with AI alignment?

Nora Ammann, in the post AI alignment as “navigating the space of intelligent behaviour”, describes “three epistemic strategies for making progress on the alignment problem: 1) tinkering, 2) idealization and 3) intelligence-in-the-wild”. Research in the social sciences, biology, philosophy, and other fields can inform alignment efforts by shedding light on “intelligence-in-the-wild”. (As illustrated by the examples below, such research often still involves mathematics as well.)

Some examples of approaches, taken from the post:

Steve Byrnes’s research on brain-like AGI safety asks how we can align artificial general intelligence if it’s built on the same principles as the human brain, drawing analogies with neuroscience.
John Wentworth studies agent-like systems in nature to understand agency in general.
Andrew Critch’s research on multipolar takeoffs and “robust agent-agnostic processes” relates to concepts from sociology.
Discussions of mesa-optimization use human evolution as a source of analogies.

Principles of Intelligent Behavior in Biological and Social Systems (PIBBSS) is a group that runs a summer research fellowship and publishes recommendations for books and videos.

Other such research agendas exist. The following are examples of what alignment-relevant research with varying amounts of math and computer science could look like:

An Open Agency Architecture for Safe Transformative AI is an AI alignment paradigm aimed at ending the acute risk period without creating worse risks.
Learning Normativity: A Research Agenda aims to develop ways for agents to learn norms like languages and values in the absence of perfect feedback.
What Should AI Owe To Us? Accountable and Aligned AI Systems via Contractualist AI Alignment tries to ground AI alignment in pluralist and contractualist norms.
Political Economy of Reinforcement Learning Systems (PERLS) is a workshop studying the societal implications of reinforcement learning systems.
The Alignment of Complex Systems Research Group studies connections between AI alignment and complex systems theory.

How can I do conceptual, mathematical, or philosophical work on AI alignment?

How can I work on AI policy?