See the ICML 2024 tutorial for a project overview
SSRN paper: https://ssrn.com/abstract=5240330 (v2.0, last updated Dec 9, 2025)
Authors: Zeyuan Allen-Zhu
Code release: see Part 4.2
The term Canon Layers was jointly conceived and designed by ZA and Xiaoli Xu
@inproceedings{Allenzhu2025-canon,
  author    = {{Allen-Zhu}, Zeyuan},
  title     = {{Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers}},
  year      = {2025},
  booktitle = {Proceedings of the 39th Conference on Neural Information Processing Systems},
  series    = {NeurIPS~'25},
  note      = {Full version available at \url{https://ssrn.com/abstract=5240330}}
}