Book companion hub

The best way to understand how reasoning techniques work is by coding them from scratch. Build a Reasoning Model (From Scratch) starts from a pretrained LLM and walks through evaluation, inference-time scaling, reinforcement learning, and distillation in Python and PyTorch.

Build a Reasoning Model From Scratch book cover

GitHub repository

High-level mental model for building a reasoning model from scratch
The book's working path: start from a pretrained LLM, evaluate reasoning behavior, add inference-time scaling, reinforcement learning, and distillation.

Study Guide

1. Read

Start with the chapter

Read the chapter first so the reasoning technique, evaluation setup, and training objective are clear before coding.

Open book page
2. Code

Build alongside the book chapter

For the best (but most time-intensive) learning experience, retype and run the code after reading each chapter. Otherwise, execute the notebooks cell by cell and edit small parts when you want to explore an idea. (I share some more tips on reading technical books here, if you are interested.)

Open code repository
3. Exercises

Use the exercises as the check

Try the exercises at the end of each chapter before looking at the solutions. They are a good way to check whether you understood the chapter's implementation.

Feedback

See the book page for more testimonials.

"One of the best resources I've seen on reasoning models."

Ivan Leo, Member of Technical Staff, Google DeepMind

"The most important topic in modern AI taught in the best way possible: by building it from the ground up."

Logan Thorneloe, Software Engineer, Google and author of AI for Software Engineers

"This book doesn't just explain reasoning models, it equips you to build, test, and truly understand them from first principles."

Aman Chadha, Senior Staff Tech Lead / Senior Manager, Google DeepMind

"By building reasoning models from scratch, you gain a level of understanding that papers alone cannot provide."

Omar Sanseviero, Developer Experience Lead, Google DeepMind

Chapter Map

Chapter 1 Conceptual map of reasoning models and the implementation path. Open Chapter 1 code
Chapter 2 Text generation with a pretrained LLM before adding reasoning-specific techniques. Open Chapter 2 code
Chapter 3 Evaluating reasoning models with parsing, verifiers, and benchmark-style checks. Open Chapter 3 code
Chapter 4 Inference-time scaling methods such as best-of-N and majority voting. Open Chapter 4 code
Chapter 5 Self-refinement and additional inference-time scaling experiments. Open Chapter 5 code
Chapter 6 Training reasoning models with reinforcement learning and verifiable rewards. Open Chapter 6 code
Chapter 7 Improving policy optimization and GRPO-style reinforcement learning workflows. Open Chapter 7 code
Chapter 8 Distilling reasoning models and generating reasoning distillation data. Open Chapter 8 code
Appendix A References and further reading for the reasoning-model chapters. Open repository root
Appendix B Exercise solutions, with code notebooks in the corresponding chapter folders. Open repository root
Appendix C Qwen3 LLM source code for connecting the book implementation to a modern open-weight model. Open Appendix C code
Appendix D Using larger LLMs in the reasoning workflow. Open Appendix D code
Appendix E Batching and throughput-oriented execution. Open Appendix E code
Appendix F Common approaches to LLM evaluation. Open Appendix F code
Appendix G Building a chat interface for the reasoning model workflow. Open Appendix G code
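To give a flavor of the inference-time scaling methods covered in Chapters 4 and 5, here is a minimal sketch of majority voting (self-consistency): sample several answers and return the most common one. The `generate_answer` stub below stands in for a real LLM sampling call and is purely illustrative, not the book's implementation.

```python
from collections import Counter

def generate_answer(prompt, i):
    # Stand-in for sampling from an LLM; pretend the model
    # returned these five answers across sampling runs.
    samples = ["72", "72", "68", "72", "70"]
    return samples[i % len(samples)]

def majority_vote(prompt, n_samples=5):
    # Sample n answers and return the most frequent one,
    # along with its vote count.
    answers = [generate_answer(prompt, i) for i in range(n_samples)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count

answer, votes = majority_vote("What is 8 * 9?", n_samples=5)
print(answer, votes)  # → 72 3
```

Swapping the stub for real temperature-based sampling, and replacing the vote with a scoring function, turns this same loop into best-of-N selection.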

Related Reasoning Articles

These articles are complementary resources for the book. Use them for additional context around reasoning models, inference-time scaling, and reinforcement learning, while keeping the book chapters as the main path.

Understanding Reasoning LLMs

A high-level map of what reasoning models are and how to think about their behavior.

Read the article

Inference-Time Scaling Categories

A practical taxonomy for methods that spend more compute during generation.

Read the article

Reinforcement Learning for LLM Reasoning

An overview of RL-based reasoning training, including why verifiable tasks matter.

Read the article
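The core idea behind verifiable rewards can be shown in a few lines: instead of training a learned reward model, the reward comes from checking the model's final answer against a known ground truth. The regex-based answer extraction below is a simplified assumption for illustration, not the book's exact implementation.

```python
import re

def extract_final_answer(completion):
    # Pull the last number from the model's output; a deliberately
    # simplified extraction rule assumed for this sketch.
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else None

def verifiable_reward(completion, ground_truth):
    # Binary reward: 1.0 if the extracted answer matches the known
    # correct answer, else 0.0. No learned reward model is needed.
    answer = extract_final_answer(completion)
    return 1.0 if answer == str(ground_truth) else 0.0

print(verifiable_reward("... so the answer is 42", 42))  # → 1.0
print(verifiable_reward("... I think it is 41", 42))     # → 0.0
```

Because the check is automatic and unambiguous, this kind of reward scales to large RL training runs on math and code tasks, which is why verifiable tasks matter so much for reasoning training.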

Reasoning and Inference Scaling

A broader look at test-time compute and reasoning-model behavior in current systems.

Read the article

First Look at Reasoning From Scratch

A preview of the book's motivation, scope, and implementation-first direction.

Read the article

DGX Spark and Local PyTorch Workflows

Notes on local hardware experiments, including reasoning-from-scratch development context.

Read the article

Where to Go Next

After finishing the book, use these links to connect the reasoning workflow back to LLM fundamentals, architecture references, and related follow-up articles.

Foundation

Build a Large Language Model (From Scratch)

Go here if you want the bottom-up path through architecture, text embeddings, attention, pretraining, and finetuning first.

Open LLM hub
Reference

LLM Architecture Gallery

Use the gallery to connect the reasoning workflow to the model families and architecture choices behind current LLMs.

Open gallery
Bonus material

Inference-Time Scaling Categories

Read this after Chapters 4 and 5 to place best-of-N, majority voting, and self-refinement in a broader taxonomy.

Read scaling overview
Bonus material

Reinforcement Learning for LLM Reasoning

Use this with Chapters 6 and 7 for more context on RLVR, verifiable rewards, and reasoning-model training.

Read RL overview