Physics of Language Models: Part 2.1,
Grade-School Math and the Hidden Reasoning Process

arXiv paper: https://arxiv.org/abs/2407.20311  (last update: July 2024)
Authors: Tian Ye, Zicheng Xu, Yuanzhi Li and Zeyuan Allen-Zhu

Code Release - 12/15/2024: We believe in the importance of sharing our codebase for the iGSM data generation pipeline. The release is pending legal review, and we kindly ask for your patience while that completes. As a small team, we are not at the front of the review queue, so we have to wait. In the meantime, our paper includes the complete pseudocode for the data generation process. Thank you for your understanding.


@article{YXLA2024-gsm1,
  author = {Ye, Tian and Xu, Zicheng and Li, Yuanzhi and {Allen-Zhu}, Zeyuan},
  title = {{Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process}},
  journal = {ArXiv e-prints},
  year = 2024,
  month = jul,
  volume = {abs/2407.20311},
  note = {Full version available at \url{http://arxiv.org/abs/2407.20311}}
}