Machine Learning FAQ
Why can a model have low training loss but still generate poor text?
A model can have low training loss but still generate poor text because low loss only means it predicts the next token well under the conditions it was trained in. That is not the same as producing good multi-sentence outputs in open-ended use.
The training objective is local: for each position, predict the next token. But good text quality is global. It depends on coherence, factuality, structure, instruction-following, and avoiding drift over many steps.
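To make the locality concrete, here is a minimal sketch of the pretraining objective in PyTorch, with invented toy tensors. Each position is graded independently against a single reference token; nothing in the loss looks at the output as a whole.

```python
import torch
import torch.nn.functional as F

# Toy setup: one sequence, 4 positions, vocabulary of 10 tokens.
# logits[t] are the model's scores for the token at position t+1.
logits = torch.randn(4, 10)            # model outputs (one row per position)
targets = torch.tensor([3, 7, 1, 4])   # the actual next tokens

# The pretraining loss is just the average next-token cross-entropy.
# Nothing here scores coherence, factuality, or instruction-following;
# each position is judged independently against one "correct" token.
loss = F.cross_entropy(logits, targets)
print(loss.item())
```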

There are several common reasons for this mismatch.
Overfitting
The model may drive training loss down by memorizing its training data while generalizing poorly to new text. In that case, validation loss matters more than training loss.
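The standard way to catch this is to track held-out loss alongside training loss. A rough sketch of the bookkeeping, assuming hypothetical `model`, `train_batches`, and `val_batches` objects with the obvious shapes:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_loss(model, batches):
    # Average next-token cross-entropy over (inputs, targets) batches.
    model.eval()
    losses = [F.cross_entropy(model(x), y).item() for x, y in batches]
    return sum(losses) / len(losses)

train_loss = mean_loss(model, train_batches)  # hypothetical data
val_loss = mean_loss(model, val_batches)

# The overfitting signature: training loss keeps falling while
# held-out loss stalls or rises, so the gap widens over training.
print(f"train {train_loss:.3f}  val {val_loss:.3f}  gap {val_loss - train_loss:.3f}")
```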
One-step versus rollout quality
During training, the model always sees correct prefixes. During generation, it conditions on its own outputs, so small mistakes can compound over many steps even if one-step prediction loss was low. This mismatch between train-time and generation-time conditioning is often called exposure bias.
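The contrast is easy to see in code. A sketch, assuming a hypothetical causal LM `model` that maps a 1-D tensor of token ids to per-position next-token logits:

```python
import torch
import torch.nn.functional as F

def teacher_forced_loss(model, tokens):
    # Training: every prediction is conditioned on the TRUE prefix.
    logits = model(tokens[:-1])              # (T-1, vocab)
    return F.cross_entropy(logits, tokens[1:])

def rollout(model, prompt, steps):
    # Generation: each prediction is conditioned on the model's OWN
    # previous outputs, so one bad sample shifts every later step.
    tokens = prompt.clone()
    for _ in range(steps):
        next_logits = model(tokens)[-1]      # logits for the next token
        next_token = torch.multinomial(next_logits.softmax(-1), 1)
        tokens = torch.cat([tokens, next_token])
    return tokens
```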
Objective mismatch
Cross-entropy does not directly optimize for truthfulness, helpfulness, style, or task completion. A model can be statistically good at continuation without being especially useful to a human.
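A toy numeric illustration of the mismatch, with invented probabilities: if a fluent false continuation and a fluent true one are equally probable under the model, cross-entropy scores them identically.

```python
import math

def mean_nll(probs):
    # Per-token negative log-likelihood, averaged over the sequence.
    return -sum(math.log(p) for p in probs) / len(probs)

# Invented probabilities the model assigns to each reference token of
# two continuations of "The capital of France is ...":
helpful = [0.6, 0.5, 0.9]          # "... Paris."  (true)
confident_wrong = [0.6, 0.5, 0.9]  # "... Lyon."   (false but fluent)

# Identical probabilities mean identical loss; cross-entropy never
# sees which continuation is actually true or useful.
print(mean_nll(helpful), mean_nll(confident_wrong))
```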
Decoding choices
Bad sampling settings can make a decent model look poor: too high a temperature produces incoherent rambling, while greedy decoding often gets stuck in repetitive loops. Generation quality depends on both the trained model and the decoding policy.
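As an illustration, here is a common temperature-plus-top-k sampler in sketch form (the default values are arbitrary, not recommendations):

```python
import torch

def sample_next(logits, temperature=0.8, top_k=50):
    # Temperature rescales the logits: higher values flatten the
    # distribution (more random), lower values sharpen it (more greedy).
    logits = logits / max(temperature, 1e-6)
    # Top-k truncation: mask everything outside the k best tokens,
    # which avoids sampling from the noisy low-probability tail.
    if top_k is not None:
        kth = torch.topk(logits, top_k).values[-1]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)
```

The same trained weights can look dramatically different under, say, temperature 0.7 with top-k truncation versus unfiltered high-temperature sampling.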

This is one reason later repo chapters move beyond plain pretraining. Finetuning, instruction tuning, preference tuning, and better evaluation all exist because low next-token loss is not enough by itself.
Low loss is still valuable: it usually means the model learned something real. But it is only one part of the story.
In short, a model can have low training loss and still generate poor text because next-token prediction measures local fit to training data, while real generation quality depends on generalization, long-horizon behavior, decoding, and downstream alignment with what users actually want.