A Three-Phase Factual Recall Circuit in Gemma-2B and Gemma-12B-IT
Activation patching reveals how facts are stored, routed, and read out across transformer layers, and why the residual stream does most of the work
The post A Three-Phase Factual Recall Circuit in Gemma-2B and Gemma-12B-IT appeared first on Towards Data Science.
Read the full article on the original site.
Read Full Article