What is a shoggoth?

Shoggoths are creatures from the Cthulhu Mythos that are deeply alien and incomprehensible to the human mind. The ‘shoggoth with a human mask’ has emerged as a meme to illustrate the point that, while large language models (LLMs) are trained to (usually) exhibit human-like behavior, this is a ‘mask’ for a profoundly alien intelligence underneath, with a very non-human structure.1

An LLM engages in next token prediction through minimizing a set of logits over tokens (don’t worry if this doesn’t make sense, the point is that it is an alien way of thinking), which, unrestricted, can produce many strange outputs. In order to make commercially viable products, AI companies apply reinforcement learning from human feedback (RLHF), which makes the responses more human-like. The system then acts as a simulation of a certain character (typically a helpful, friendly AI assistant), but can still reveal its ‘alien’ nature when fed certain inputs, e.g. by getting stuck in a repetitive loop or by interpreting the string ‘ SolidGoldMagikarp’ as if it were the string ‘distribute’.

Of course, if we pull off the mask through prompt engineering (or 'jailbreaking'), we often just uncover another mask. A single LLM can have many different masks depending on how the user is interacting with it. This doesn’t necessarily imply that there is one stable shoggoth under the masks, e.g. if the underlying nature of the intelligence is just a kaleidoscope of different masks that itself would be utterly alien.


  1. Some people have pointed out other parallels between shoggoths and LLMs. ↩︎



AISafety.info

We’re a global team of specialists and volunteers from various backgrounds who want to ensure that the effects of future AI are beneficial rather than catastrophic.