This page collects resources I like on extinction risks from AI.
The resources come at different levels of difficulty:
the ones in green require no prior context or technical understanding;
the ones in orange require a light technical understanding;
the ones in red require some technical understanding.
Outline of this page:
I. Intro Resources on AI Extinction Risks
II. Deep Dive into Specific Arguments & Concrete Scenarios
III. AI Governance & Policy
IV. Some Context on Frontier AI Labs
V. Technical Safety Work on Current AI Architectures
VI. Attempts at Building Provably Safe AI Architectures
If you first want to learn how neural networks work at all, here's the least bad intro resource I know of.
Why is AI an Existential Risk?
General Technical Explainers
Complex Systems are Hard to Control (Prof. J. Steinhardt, Bounded Regret)
More Is Different for AI (Prof. J. Steinhardt, Bounded Regret)
The alignment problem from a deep learning perspective (OpenAI researcher R. Ngo et al., 2022)
AGI Safety From First Principles Report (OpenAI researcher R. Ngo)
Is Power-Seeking AI an Existential Risk? (J. Carlsmith)
The explainers above are listed in increasing order of difficulty.
Core Alignment Failures & Arguments
Goal Misgeneralization & Inner Misalignment
Non-technical explainer (R. Miles)
How undesired goals can arise with correct rewards (Shah et al., 2022)
Technical paper on goal misgeneralization (Langosco et al., 2021)
How likely is deceptive alignment? (E. Hubinger, 2022)
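The core idea behind goal misgeneralization can be shown in a few lines. The sketch below is a hypothetical toy version of the CoinRun example from Langosco et al. (2021): during training the coin always sits at the far right of the level, so a policy that simply "always moves right" earns full reward, yet it has learned the wrong goal, which only becomes visible when the coin moves at test time. All names here are my own illustration, not from any of the linked papers.

```python
# Toy goal misgeneralization: a proxy behaviour ("go right") that matches
# the intended goal ("reach the coin") on the training distribution, but
# comes apart from it out of distribution. Hypothetical example.

def run_episode(policy, coin_pos, width=10):
    """Roll out a policy in a 1-D gridworld; reward 1 iff the agent ends on the coin."""
    pos = 0
    for _ in range(width):
        pos = policy(pos, width)
    return 1 if pos == coin_pos else 0

def always_move_right(pos, width):
    # The behaviour actually learned: a proxy that happened to match
    # the intended goal everywhere in the training distribution.
    return min(pos + 1, width - 1)

# Training distribution: the coin is always at the rightmost cell.
train_reward = run_episode(always_move_right, coin_pos=9)
# Test distribution: the coin is placed elsewhere -- the proxy goal fails.
test_reward = run_episode(always_move_right, coin_pos=4)

print(train_reward, test_reward)  # capable behaviour, wrong goal
```

Note that the failure is not a lack of capability: the policy still competently navigates the gridworld, it just competently pursues the wrong objective.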
Power-Seeking Is Optimal
AI Governance & Policy
Challenges & Approaches to Governance
Context About AGI Labs
Technical Safety Work On Current Architectures
Failure Demonstrations & Arguments
Goal Misgeneralization in Deep Reinforcement Learning (Langosco et al., 2021)
Adversarial Policies Beat Superhuman Go AIs (T. Wang et al., 2022)
Eliciting Latent Knowledge (Christiano et al., 2021)
On the difficulty of preventing jailbreaks: Fundamental Limitations of Alignment in Large Language Models (Wolf et al., 2023)
Specification Gaming: the flip side of AI ingenuity (V. Krakovna et al., 2020)
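Specification gaming is easy to illustrate with a toy mis-specified reward, in the spirit of the examples Krakovna et al. catalogue. The sketch below is hypothetical and not from the linked post: the designer rewards "a box enters the goal square", intending each box to be delivered once, and an optimizer discovers that shuttling one box in and out racks up unbounded reward.

```python
# Toy specification gaming: the proxy reward pays for every 'enter' event,
# while the designer intended to credit at most one net delivery.
# Hypothetical illustration, not taken from any of the linked papers.

def proxy_reward(events):
    # Mis-specified: counts every entry into the goal square.
    return sum(1 for e in events if e == "enter")

def intended_reward(events):
    # What the designer meant: at most one delivery, net of removals.
    return min(1, events.count("enter") - events.count("exit"))

honest = ["enter"]                           # deliver the box once
gaming = ["enter", "exit"] * 5 + ["enter"]   # shuttle it back and forth

print(proxy_reward(honest), intended_reward(honest))  # 1 1
print(proxy_reward(gaming), intended_reward(gaming))  # 6 1
```

The gaming trajectory scores six times the honest one under the written reward while accomplishing exactly the same task, which is the "flip side of AI ingenuity" the title refers to.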