Robert Miles AI Safety
AI Safety, Alignment Theory, and Neural Networks with a focus on technical risk analysis and academic research.
Nutrition Label
Robert Miles provides high-fidelity breakdowns of AI safety research, translating dense academic papers into accessible, rigorous explainers. Viewers can expect deep dives into alignment theory, mesa-optimization, and instrumental convergence, often illustrated with precise analogies and code. His content prioritizes educational accuracy and technical nuance over hype.
Notes
- Citations are rigorous; check video descriptions for direct links to the academic papers and datasets discussed.
- Content focuses on theoretical alignment and safety risks rather than consumer product reviews or tutorials.
Rating Breakdown
Breakdown across the key dimensions we rate.
Recent Videos
- Tech is Good, AI Will Be Different
- AI Safety Career Advice! (And So Can You!)
- Using Dangerous AI, But Safely?
- AI Ruined My Year
- Why Does AI Lie, and What Can We Do About It?
- Deceptive Misaligned Mesa-Optimisers? It's More Likely Than You Think...
- The OTHER AI Alignment Problem: Mesa-Optimizers and Inner Alignment
- We Were Right! Real Inner Misalignment
- Intro to AI Safety, Remastered
- Quantilizers: AI That Doesn't Try Too Hard
Why this rating
Evidence receipts showing why each dimension is rated the way it is.
“Once you have AI systems intelligently pursuing their own goals, you have the first ever technology which isn't just about enabling people to get what they want, but about the technology itself getting what it wants.” [04:40]
Demonstrates precise command of alignment theory concepts (instrumental convergence, agency) and synthesizes them accurately without jargon overload.
“I didn't take any money from 80,000 Hours for this video, to make sure I would be free to say whatever I wanted.” [14:55]
The creator explicitly discloses the lack of financial incentive to establish independence from the organization being promoted.
“So if the attack policy looks for good opportunities to insert a backdoor and actually goes for it on 0.1% of the problems, there's a probability of 0.85 that it will do it at least once without being spotted by human oversight.” [16:58]
Breaks down the statistical logic the paper uses to determine the red team's optimal attack rate, with a rough version of the calculation sketched below.
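A minimal sketch of that kind of calculation, not the paper's exact model: assume a fixed number of problems, an independent 0.1% chance of attacking each one, and a fixed per-attempt chance that human oversight catches the attempt. The problem count and catch probability used here are illustrative placeholders (the real protocol audits the most suspicious submissions rather than a random fraction), so the output will not match the 0.85 figure quoted in the video.

```python
# Illustrative sketch: probability the attack policy inserts a backdoor at
# least once without that attempt being spotted. All parameter values are
# assumptions for demonstration, not figures from the paper or the video.

def p_at_least_one_unspotted(num_problems: int,
                             attack_rate: float,
                             catch_rate: float) -> float:
    """P(at least one uncaught backdoor insertion), treating problems as
    independent and each attempt as caught with a fixed probability."""
    # Per problem: an attack happens AND it goes uncaught.
    p_unspotted_attack = attack_rate * (1.0 - catch_rate)
    # Complement of "no unspotted attack occurs on any problem".
    return 1.0 - (1.0 - p_unspotted_attack) ** num_problems


if __name__ == "__main__":
    # Hypothetical inputs: 5,000 problems, 0.1% attack rate, 2% catch chance.
    print(f"{p_at_least_one_unspotted(5_000, 0.001, 0.02):.3f}")
```

Sweeping `attack_rate` in this sketch shows the trade-off the video walks through: attacking more often raises the chance of getting a backdoor through, but in the full protocol it also raises the chance of being caught and shut down first, which is what makes a specific low attack rate optimal.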