Posts

Introspective interpretability

Language models, world models, and human model-building

Notes on teaching GPT-3 adding numbers