Where can I find videos about AI safety?
Also available at aisafety.video
Top recommendation: AI Safety Intro Video playlist
Generally good sources
Channels
- Robert Miles AI Safety (and Rob's videos on Computerphile)
- Centre for the Governance of AI (GovAI)
- Apart Research
- Neel Nanda
- Center for AI Safety
- Towards Data Science
- The Inside View
- Future of Life Institute
- SERI
- CERI
- AI Safety Talks (and its playlists)
- AI Safety Reading Group
- Mechanistic Interpretability
- Intro to ML Safety
- AGI safety talks from AGISF
- AISS discussion days and AISS YouTube
- Victoria Krakovna AI talks
- Jack Parker
- PIBBSS
- MIRI
- The Future Society
- Cooperative AI Foundation
- Quantified Uncertainty Research Institute
- Robin Hanson AI Risk Conversations
- Singular Learning Theory Summit
- Relevant, but less focused on AI existential risk: Rational Animations, Centre for Effective Altruism, Future of Humanity Institute, Center for Security and Emerging Technology, CSER, SSC meetups, Foresight Institute, Science, Technology & the Future, Berkman Klein Center, Schwartz Reisman Institute, Stanford HAI, Carper AI, Lex Fridman, Digital Humanism, Cognitive Revolution Podcast, NIST AI Metrology Colloquia Series
- AI content without much AI safety: AI Explained, Andrej Karpathy, John Tan Chong Min, Edan Meyer, Yannic Kilcher, Mutual Information, Computerphile, CodeEmporium, sentdex, nPlan, Jay Alammar, Assembly AI, Aleksa Gordić, Simons Institute, 2 Minute Papers, Machine Learning Street Talk, ColdFusion, HuggingFace, AI Coffee Break, Alex Smola, Welcome AI Overlords, Valence Discovery, The Alan Turing Institute, Jordan Harrod, Cambridge Ellis Unit, Weights & Biases, UCL CSML Seminar Series, Harvard Medical AI, IARAI, Alfredo Canziani, Andreas Geiger, CMU AI Seminar, Jeremy Howard, Google Research, AI for Good, IPAM UCLA, One world theoretical ML, What's AI, Stanford MedAI, MILA neural scaling seminars, Digital Engine (sometimes misleading), Steve Brunton, PyTorch, What's AI by Louis Bouchard, Eye on AI, Super Data Science, Waterloo AI, Matt Wolfe, DeepLearningAI, TechTechPotato, Asianometry
- Other languages: Karl Olsberg (German)
Lists
- AI Alignment YouTube Playlists – an excellent resource, with both slide-light (reordered) and slide-heavy playlists.
- the gears to ascension lists many channels for understanding current capabilities trends
- AI Safety Support "Lots of Links": Videos
- A ranked list of all EA-relevant documentaries, movies, and TV | Brian Tan on EAF (“AI Safety / Risks” section)
- San Francisco Alignment Workshop 2023
- Towards AGI: Scaling, Alignment & Emergent Behaviors in Neural Nets
Specific suggestions
Note that:
- I haven’t watched all of these videos. Feel free to comment with more recommendations!
- This list focuses on videos rather than podcasts, though a few podcast episodes appear below. See this page for dedicated AI safety podcasts.
Introductory
- See also AI safety intros for readings
- Could AI wipe out humanity? | Most pressing problems
- AI on the Hill: Why artificial intelligence is a public safety issue (Jeremie Harris)
- Hassabis, Altman and AGI Labs Unite - AI Extinction Risk Statement [ft. Sutskever, Hinton + Voyager]
- AI and Evolution (Dan Hendrycks)
- Vael Gates: Researcher Perceptions of Current and Future AI
- Intro to AI Safety, Remastered (Rob Miles)
- Connor Leahy, AI Fire Alarm, AI Alignment & AGI Fire Alarm - Connor Leahy (ML Street Talk), and Connor Leahy on AI Safety and Why the World is Fragile
- Positive Outcomes for AI | Nate Soares | Talks at Google
- Eliezer Yudkowsky – AI Alignment: Why It's Hard, and Where to Start and Sam Harris 2018 - IS vs OUGHT, Robots of The Future Might Deceive Us with Eliezer Yudkowsky (full transcript here) and 159 - We’re All Gonna Die with Eliezer Yudkowsky
- Brian Christian and Ben Garfinkel and Richard Ngo and Paul Christiano on the 80,000 Hours Podcast
- Richard Ngo and Paul Christiano on AXRP
- Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe
- Jeremie Harris - TDS Podcast Finale: The future of AI, and the risks that come with it
- Rohin Shah on the State of AGI Safety Research in 2021 and AI Alignment: An Introduction | Rohin Shah | EAGxOxford 22
- The Alignment Problem: Machine Learning and Human Values with Brian Christian Q&A section
- David Krueger AI Safety and Alignment- Part 1 and part 2
- A Response to Steven Pinker on AI
- X-Risk Overview (Dan Hendrycks)
- Provably Beneficial AI | Stuart Russell
- Stuart Russell's BBC Reith lecture audio
- Myths and Facts About Superintelligent AI (Max Tegmark + minutephysics)
- What happens when our computers get smarter than we are? | Nick Bostrom
- Risks from Advanced AI with Jakub Kraus and AI safety intro talk
- What is the alignment problem? (Samuel Albanie)
- SaTML 2023 - Jacob Steinhardt - Aligning ML Systems with Human Intent
- How We Prevent the AI’s from Killing us with Paul Christiano
Landscape
- Current work in AI alignment | Paul Christiano | EA Global: San Francisco 2019
- Paradigms of AI alignment: components and enablers | Victoria Krakovna | EAGxVirtual 2022
- How to build a safe advanced AI (Evan Hubinger) | What's up in AI safety? (Asya Bergal)
- Rohin Shah on the State of AGI Safety Research in 2021
Inner alignment
- Risks from Learned Optimization: Evan Hubinger at MLAB2
- The OTHER AI Alignment Problem: Mesa-Optimizers and Inner Alignment
- Deceptive Misaligned Mesa-Optimisers? It's More Likely Than You Think...
- We Were Right! Real Inner Misalignment
Outer alignment
- 9 Examples of Specification Gaming
- How to Keep Improving When You're Better Than Any Teacher - Iterated Distillation and Amplification
- AI Toy Control Problem (Stuart Armstrong)
- AIS via Debate (Joe Collman)
- Another Outer Alignment Failure Story
Agent foundations
- AI & Logical Induction - Computerphile
- Intro to Agent Foundations (Understanding Infra-Bayesianism Part 4)
- Scott Garrabrant – Finite Factored Sets
- EC'21 Tutorial: Designing Agents' Preferences, Beliefs, and Identities (Part 3) and part 4 from FOCAL at CMU
Interpretability
- See the interpretability playground
- What is mechanistic interpretability? Neel Nanda explains.
- A Walkthrough of Toy Models of Superposition w/ Jess Smith
- ROME: Locating and Editing Factual Associations in GPT (Paper Explained & Author Interview)
- Feature Visualization & The OpenAI microscope and Building Blocks of AI Interpretability | Two Minute Papers #234
- ICLR 2022 Keynote: Been Kim
- 25. Interpretability (MIT 6.S897 Machine Learning for Healthcare, Spring 2019)
- Cohere For AI - Community Talks - Catherine Olsson on Mechanistic Interpretability: Getting Started
- Transformer Circuit Videos + YouTube playlist
- A Walkthrough of A Mathematical Framework for Transformer Circuits
- A Walkthrough of Interpretability in the Wild Part 1/2: Overview (w/ authors Kevin, Arthur, Alex) and part 2
- Intro talk for the interpretability hackathon
- Reliable and Interpretable Artificial Intelligence -- Lecture 1 (Introduction)
- Chris Olah on the 80,000 Hours Podcast
Organizations
- AI alignment and Redwood Research | Buck Shlegeris (CTO)
- Daniela and Dario Amodei on Anthropic
- Alignment Research Center - Q&A with Mark Xu
- Training machine learning (ML) systems to answer open-ended questions | Andreas Stuhlmuller + Amanda Ngo, Ought | Automating Complex Reasoning (Ought)
Individual researchers
- Peter Railton - A World of Natural and Artificial Agents in a Shared Environment
- Provably Beneficial AI and the Problem of Control and Human-compatible artificial intelligence - Stuart Russell, University of California (Stuart Russell)
- Victoria Krakovna–AGI Ruin, Sharp Left Turn, Paradigms of AI Alignment
- David Krueger—AI Alignment, David Krueger: Existential Safety, Alignment, and Specification Problems
- Holden Karnofsky - Transformative AI & Most Important Century
- Katja Grace—Slowing Down AI, Forecasting AI Risk
- Shahar Avin–AI Governance
- Markus Anderljung–Regulating Advanced AI
- A Conversation with John Wentworth
- Connor Leahy | Promising Paths to Alignment
- Prosaic Intent Alignment (Paul Christiano)
- Timelines for Transformative AI and Language Model Alignment | Ajeya Cotra
- AI Research Considerations for Existential Safety (Andrew Critch)
- A Conversation with Vanessa Kosoy
- Causal foundations for safe AGI - Tom Everitt (DeepMind)
- Differential Progress in Cooperative AI: Motivation and Measurement (Jesse Clifton and Sammy Martin)
- Open-source learning: A bargaining approach | Jesse Clifton | EA Global: London 2019
- Jan Leike - AI alignment at OpenAI
- AGISF - Research questions for the most important century - Holden Karnofsky
- Scott Aaronson Talks AI Safety
- BI 151 Steve Byrnes: Brain-like AGI Safety
- Stuart Armstrong - How Could We Align AI?
- SDS 597: A.I. Policy at OpenAI — with Miles Brundage
- Owain Evans - Predicting the future of AI
- Ethan Perez | Discovering language model behaviors with model-written evaluations
- Helen Toner - The strategic and security implications of AI
Reasoning about future AI
- Optimal Policies Tend To Seek Power (Alex Turner at NeurIPS 2021)
- AI "Stop Button" Problem - Computerphile
- Intelligence and Stupidity: The Orthogonality Thesis
- Why Would AI Want to do Bad Things? Instrumental Convergence
- Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
- Existential Risk from Power-Seeking AI (Joe Carlsmith)
Frontier AI regulation
- Frontier AI regulation: Preparing for the future beyond ChatGPT
- GovAI Webinar: How Should Frontier AI Models be Regulated?
International AI governance
US AI Policy
Compute governance
Hardware supply chain
- GTC March 2024 Keynote with NVIDIA CEO Jensen Huang
- Notes on AI Hardware - Benjamin Spector | Stanford MLSys #88
- “The Decision of the Century”: Choosing EUV Lithography, Intel & AMD: The First 30 Years, A Brief History of Semiconductor Packaging, What Goes On Inside a Semiconductor Wafer Fab, and many more from Asianometry
- The race for semiconductor supremacy | FT Film
- EUV lithography systems (scroll to “How does EUV work?”)
- The AI Hardware Show 2023, Episode 1: TPU, A100, AIU, BR100, MI250X
- All about AI Accelerators: GPU, TPU, Dataflow, Near-Memory, Optical, Neuromorphic & more (w/ Author)
- Semiconductors: Everything You Wanted to Know
- ASML's Secret: An exclusive view from inside the global semiconductor giant | VPRO Documentary
- Semiconductor Expert Reveals Why US Export Controls Have Failed
- Nvidia Part III: The Dawn of the AI Era (2022-2023) (Audio)
- From Sand to Silicon: The Making of a Microchip | Intel
- How ASML, TSMC And Intel Dominate The Chip Market | CNBC Marathon
- How Microchips Are Made - Manufacturing of a Semiconductor
- I Can Die Now. - Intel Fab Tour!
- Chip Manufacturing - How are Microchips made? | Infineon
- Stanford Seminar - Nvidia’s H100 GPU
- Quick Tour of NVIDIA DGX H100
- How We DESTROYED the NVIDIA H100 GPU: The ULTIMATE Comino Tear Down! COMINO H100 WATERBLOCK TEASER
Misc AI governance
- AI Ethics Seminar with Matthijs Maas - Pausing AI & Technological Restraint - April 25, 2023
- Markus Anderljung Regulating increasingly advanced AI some hypotheses
- Sam Altman and William G. Gale discuss Taxation Solutions for Advanced AI
- Paul Scharre & Helen Toner on AI Capabilities & the Nature of Warfare
- Why governing AI is our opportunity to shape the long-term future? | Jade Leung | TEDxWarwickSalon + Priorities in AGI governance research | Jade Leung | EA Global: SF 22
- An Introduction to AI Governance | Ben Garfinkel | EAGxVirtual 2022
- The Windfall Clause: Sharing the benefits of advanced AI | Cullen O’Keefe + Sharing the Benefits of AI: The Windfall Clause (Rob Miles)
- Margaret Roberts & Jeffrey Ding: Censorship’s Implications for Artificial Intelligence
- Preparing for AI: risks and opportunities | Allan Dafoe | EAG 2017 London + AI Strategy, Policy, and Governance | Allan Dafoe
- Simeon Campos–Short Timelines, AI Governance, Field Building
- More than Deepfakes (Katerina Sedova and John Bansemer)
- AI and the Development, Displacement, or Destruction of the Global Legal Order (Matthijs Maas)
- Future-proofing AI Governance | The Athens Roundtable on AI and the Rule of Law 2022
- Having Our Cake and Eating It Too with Amanda Askell (covers incentives in AI development)
Ethics
- Shelly Kagan - The Moral Claims of AI
- 2022 Annual Uehiro Lectures in Practical Ethics, 'Ethics and Artificial Intelligence’
Career planning
- How I think students should orient to AI safety | Buck Shlegeris | EA Student Summit 2020
- AGISF - Careers in AI Alignment and Governance - Alex Lawsen
- Catherine Olsson & Daniel Ziegler on the 80,000 Hours Podcast
- AI Safety Careers | Rohin Shah, Lewis Hammond and Jamie Bernardi | EAGxOxford 22
- Early-Career Opportunities in AI Governance | Lennart Heim, Caroline Jeanmaire | EAGxOxford 22
- Artificial Intelligence Career Stories | EA Student Summit 2020
Forecasting
- Will AI end everything? A guide to guessing | Katja Grace | EAG Bay Area 23
- Neural Scaling Laws and GPT-3 (Jared Kaplan)
- Why and How of Scaling Large Language Models | Nicholas Joseph
- Jack Clark Presenting the 2022 AI Index Report
- Reasons you might think human level AI soon is unlikely | Asya Bergal | EAGxVirtual 2020
- Existential Risk Pessimism and the Time of Perils | David Thorstad | EAGxOxford 22
- Alex Lawsen—Forecasting AI Progress and Alex Lawsen forecasting videos
- Betting on AI is like betting on semiconductors in the 70's | Danny Hernandez | EA Global: SF 22 + Danny Hernandez on the 80,000 Hours Podcast
- AI safety | Katja Grace | EA Global: San Francisco 2017
- Experts' Predictions about the Future of AI
- Economic Growth in the Long Run: Artificial Intelligence Explosion or an Empty Planet?
- Katja Grace—Slowing Down AI, Forecasting AI Risk
- Sam Bowman - Are we under-hyping AI?
- Moore's Law, exponential growth, and extrapolation! (Steve Brunton)
Capabilities
- Satya Nadella Full Keynote Microsoft Ignite 2022 with Sam Altman, start at 12:25
- Competition-Level Code Generation with AlphaCode (Paper Review)
- It’s Time to Pay Attention to A.I. (ChatGPT and Beyond)
- The text-to-image revolution, explained + How the World Cup’s AI instant replay works (Vox)
- Creating a Space Game with OpenAI Codex
How AI works
- [1hr Talk] Intro to Large Language Models
- AGISF Week 0 Intro to ML
- Stanford CS25 - Transformers United
- Transformers, explained: Understand the model behind GPT, BERT, and T5; Transformers for beginners | What are they and how do they work
- Reinforcement learning playlist (Steve Brunton)
- How AI Image Generators Work (Stable Diffusion / Dall-E) - Computerphile + Stable Diffusion in Code (AI Image Generation) - Computerphile
- Attention in Neural Networks
- What is a transformer? + Implementing GPT-2 from scratch (Neel Nanda)
- The spelled-out intro to neural networks and backpropagation: building micrograd
- Neural Networks (3Blue1Brown)
- DeepMind x UCL RL Lecture Series - Introduction to Reinforcement Learning [1/13]
- Generative Adversarial Networks (GANs) - Computerphile
- Deep Learning for Computer Vision (Justin Johnson) lecture videos
- CS25 I Stanford Seminar - Transformers United: DL Models that have revolutionized NLP, CV, RL
- Broderick: Machine Learning, MIT 6.036 Fall 2020 + course page
- DEEP LEARNING - DS-GA 1008 · Spring 2020 · NYU
- Practical Deep Learning course | fast.ai
- Hugging Face NLP Course
- Transformer Attention - AI Safety at UCLA
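
Several of the transformer videos above center on the attention mechanism. As a rough companion to them, here is a minimal sketch of single-head scaled dot-product attention in plain NumPy (no masking, no learned projections, and illustrative shapes chosen arbitrarily):

```python
import numpy as np

def attention(Q, K, V):
    """Q, K, V: (seq_len, d) arrays. Returns one output row per query."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # similarity of each query to each key
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys: rows sum to 1
    return weights @ V                            # each output mixes the value rows

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Because the softmax weights form a convex combination, each output row is an average of value rows, weighted by query-key similarity; the full transformer stacks this with learned projections and multiple heads, as the lectures above explain.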
China
- Re-deciphering China’s AI dream | Jeffrey Ding | EA Global: London 2019
- Sino-Western cooperation in AI safety | Brian Tse | EA Global: San Francisco 2019
- China's Long-Term Investments in AI Growth
- Why China is losing the microchip war (Vox)
Rationality
- Effective behavior change | Spencer Greenberg | EA Global: San Francisco 2019
- Making high impact decisions | Anna Edmonds | EA Global: SF 22
- Decision-making workshop: learn how to make better decisions | Spencer Greenberg
- Decoupling: a technique for reducing bias | David Manley | EA Student Summit 2020
- Rationality & Alignment | Ruby Bloom | EA Global: SF 22
Debates / discussions between people with different perspectives
Misc
- Slaughterbots + Why We Should Ban Lethal Autonomous Weapons + A.I. Is Making it Easier to Kill (You). Here’s How. | NYT
- Tobias Baumann on Artificial Sentience and Reducing the Risk of Astronomical Suffering
- The Doomsday Argument | PBS Space Time
- How will AI change the world?
- Forming your own views on AI safety (without stress!) | Neel Nanda | EA Global: SF 22 – also see Neel's presentation slides and "Inside Views Resources" doc
- Applied Linear Algebra Lectures (John Wentworth)
- AI alignment, philosophical pluralism, and the relevance of non-Western philosophy | Tan Zhi Xuan
- Anders Sandberg on Information Hazards
- Reframing superintelligence | Eric Drexler | EA Global: London 2018
- Robin Hanson | The Age of Em
- AlphaGo - The Movie | Full award-winning documentary
- Should We Build Superintelligence?
- Moloch section of Liv Boeree interview with Lex Fridman (and Ginsberg)
- Recordings of talks from the Japan AI Alignment Conference
- ChatGPT in Context. Part 1 - The Transformer, a Revolution in Computation (Piero Scaruffi)
- Is AI (ChatGPT, etc.) Sentient? A Perspective from Early Buddhist Psychology
Watching videos in a group
Discussion prompts
- Paul Christiano on AI alignment - discussion + Paul Christiano alignment chart
- Allan Dafoe on AI strategy, policy, and governance - discussion
- Vael Gates: Researcher Perceptions of Current and Future AI - discussion
- Sam Harris and Eliezer Yudkowsky on “AI: Racing Toward the Brink” - discussion
Higher-level meeting tips
- Show the video → people discuss afterwards with prompts
- Active learning techniques: here
- You can skip around through parts of the video!
- See “Discussion Groups” from the EA Groups Resource Center
- Make the official meeting end after 1 hour so people are free to leave, but give people the option to linger for longer and continue their discussion.
- You can also do readings instead of videos, similar to this. Or play around with a model (e.g. test out hypotheses about how a language model works).
- Try to keep the video portion to under 20 minutes unless the video is really interesting.
- For a short video you could watch one of Apart’s ML + AI safety updates. Some of these contain many topics, so people can discuss what they find interesting.