Build a Reasoning Model (From Scratch)
Book companion hub
The best way to understand how reasoning techniques work is by coding them from scratch. Build a Reasoning Model (From Scratch) starts from a pretrained LLM and walks through evaluation, inference-time scaling, reinforcement learning, and distillation in Python and PyTorch.
Study Guide
Start with the chapter
Read the chapter first so the reasoning technique, evaluation setup, and training objective are clear before coding.
Open book page
Build alongside the book chapter
Retype and run the code after reading each chapter for the best (but most time-intensive) learning experience. Otherwise, execute the notebooks cell by cell and edit small parts when you want to explore an idea. (If you are interested, I share more tips on reading technical books here.)
Open code repository
Use the exercises as the check
Try the exercises at the end of each chapter before looking at the solutions. They help you check whether you understood the chapter's implementation.
Feedback
See the book page for more testimonials.
"One of the best resources I've seen on reasoning models."
"The most important topic in modern AI taught in the best way possible: by building it from the ground up."
"This book doesn't just explain reasoning models, it equips you to build, test, and truly understand them from first principles."
"By building reasoning models from scratch, you gain a level of understanding that papers alone cannot provide."
Chapter Map
| Chapter | Description | Code |
| --- | --- | --- |
| Chapter 1 | Conceptual map of reasoning models and the implementation path. | Open Chapter 1 code |
| Chapter 2 | Text generation with a pretrained LLM before adding reasoning-specific techniques. | Open Chapter 2 code |
| Chapter 3 | Evaluating reasoning models with parsing, verifiers, and benchmark-style checks. | Open Chapter 3 code |
| Chapter 4 | Inference-time scaling methods such as best-of-N and majority voting. | Open Chapter 4 code |
| Chapter 5 | Self-refinement and additional inference-time scaling experiments. | Open Chapter 5 code |
| Chapter 6 | Training reasoning models with reinforcement learning and verifiable rewards. | Open Chapter 6 code |
| Chapter 7 | Improving policy optimization and GRPO-style reinforcement learning workflows. | Open Chapter 7 code |
| Chapter 8 | Distilling reasoning models and generating reasoning distillation data. | Open Chapter 8 code |
| Appendix A | References and further reading for the reasoning-model chapters. | Open repository root |
| Appendix B | Exercise solutions, with code notebooks in the corresponding chapter folders. | Open repository root |
| Appendix C | Qwen3 LLM source code for connecting the book implementation to a modern open-weight model. | Open Appendix C code |
| Appendix D | Using larger LLMs in the reasoning workflow. | Open Appendix D code |
| Appendix E | Batching and throughput-oriented execution. | Open Appendix E code |
| Appendix F | Common approaches to LLM evaluation. | Open Appendix F code |
| Appendix G | Building a chat interface for the reasoning model workflow. | Open Appendix G code |
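To give a flavor of the inference-time scaling methods covered in Chapters 4 and 5, here is a minimal sketch of majority voting (self-consistency): sample several answers to the same prompt and return the most common one. The `samples` list below is a hypothetical set of model outputs, not from the book's code.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer among sampled completions."""
    counts = Counter(answers)
    answer, _ = counts.most_common(1)[0]
    return answer

# Hypothetical example: five sampled answers to the same math problem
samples = ["72", "68", "72", "72", "68"]
print(majority_vote(samples))  # -> 72
```

In practice, the answers would come from sampling a model multiple times at a nonzero temperature; the book's notebooks implement this end to end.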
Related Reasoning Articles
These articles are complementary resources for the book. Use them for additional context around reasoning models, inference-time scaling, and reinforcement learning, while keeping the book chapters as the main path.
Understanding Reasoning LLMs
A high-level map of what reasoning models are and how to think about their behavior.
Read the article
Inference-Time Scaling Categories
A practical taxonomy for methods that spend more compute during generation.
Read the article
Reinforcement Learning for LLM Reasoning
An overview of RL-based reasoning training, including why verifiable tasks matter.
Read the article
Reasoning and Inference Scaling
A broader look at test-time compute and reasoning-model behavior in current systems.
Read the article
First Look at Reasoning From Scratch
A preview of the book's motivation, scope, and implementation-first direction.
Read the article
DGX Spark and Local PyTorch Workflows
Notes on local hardware experiments, including reasoning-from-scratch development context.
Read the article
Where to Go Next
After finishing the book, use these links to connect the reasoning workflow back to LLM fundamentals, architecture references, and related follow-up articles.
Build a Large Language Model (From Scratch)
Go here if you want to start with the bottom-up path: architecture, text embeddings, attention, pretraining, and finetuning.
Open LLM hub
LLM Architecture Gallery
Use the gallery to connect the reasoning workflow to the model families and architecture choices behind current LLMs.
Open gallery
Inference-Time Scaling Categories
Read this after Chapters 4 and 5 to place best-of-N, majority voting, and self-refinement in a broader taxonomy.
Read scaling overview
Reinforcement Learning for LLM Reasoning
Use this with Chapters 6 and 7 for more context on RLVR, verifiable rewards, and reasoning-model training.
Read RL overview