Mechanistic Interpretability 1 0 Hackathon Neel Nanda

Media Summary: This is a talk I gave to my MATS 9.0 training scholars about the big picture of mech interp: as of Oct 2025, what had changed? This is a talk I gave to my MATS scholars, with a stylised history of the field of ... How can we reverse engineer what a neural network is doing? In this IASEAI '25 session, An Introduction to ...


When Anthropic tested Claude Sonnet 4.5 for alignment, the model appeared perfectly behaved, but it turned out the model had ... Clipped from episode 19 of AXRP. Transcript of that episode: ...

A talk I gave to my MATS 9.0 training program about reasoning model ... Warning: This is an ad-libbed talk, and I'm sure I got some facts wrong. This is a talk I gave to my MATS 9.0 training program on ...