See the ICML 2024 tutorial for a project overview.
SSRN paper: https://ssrn.com/abstract=5240330 (v1.1, last updated May 19, 2025)
Alternative download: http://zeyuan.allen-zhu.com/paper/2025-canon.pdf
Authors: Zeyuan Allen-Zhu
v2 is in progress, with many new results, larger-scale real-life experiments, and a code release; stay tuned.
@article{Allen2025-canon,
  author  = {{Allen-Zhu}, Zeyuan},
  title   = {{Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers}},
  year    = {2025},
  month   = may,
  journal = {SSRN Electronic Journal},
  note    = {\url{https://ssrn.com/abstract=5240330}}
}