In LoRA, rank and alpha are the two most important adapter hyperparameters.

Rank controls the inner dimension of the low-rank update; in practical terms, it sets how much extra capacity the adapter has.

  • lower rank means fewer trainable parameters and lower cost
  • higher rank means more expressive adapters, but less parameter efficiency

The basic LoRA layer in the repo is built from two small matrices, and the rank determines the size of the bottleneck between them.
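To make that bottleneck concrete, here is a tiny sketch of the parameter savings, assuming a PyTorch-style layer; the dimensions, the rank value, and the names lora_A / lora_B are illustrative rather than the repo's exact code.

```python
import torch
import torch.nn as nn

in_features, out_features = 4096, 4096
rank = 8  # hypothetical rank, chosen only for illustration

# The two small matrices that form the low-rank bottleneck:
# A projects the input down to the rank dimension, B projects back up.
lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
lora_B = nn.Parameter(torch.zeros(out_features, rank))

full_params = in_features * out_features        # 16,777,216 in the full weight
lora_params = lora_A.numel() + lora_B.numel()   # 65,536 trainable adapter params
print(f"full: {full_params:,}  lora (r={rank}): {lora_params:,}")
```

Doubling the rank roughly doubles the adapter's parameter count, which is why lower ranks are the cheaper option and higher ranks buy expressiveness at the cost of parameter efficiency.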

Alpha controls the scale of the LoRA update relative to the frozen base layer. In the repo code, the LoRA contribution is scaled by alpha / rank, which helps make runs across different ranks easier to compare without constantly retuning everything else.

That means alpha is not about capacity in the same way rank is. It is more about how strongly the adapter update influences the final layer output.
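A small numeric illustration of that scaling, assuming the alpha / rank convention described above; the specific rank and alpha values here are made up for the example.

```python
# Effective multiplier applied to the low-rank update:
#   output = W x + (alpha / rank) * B @ (A @ x)
for rank, alpha in [(4, 16), (8, 16), (16, 16)]:
    print(f"rank={rank:2d} alpha={alpha}  scale={alpha / rank}")
# rank= 4 alpha=16  scale=4.0
# rank= 8 alpha=16  scale=2.0
# rank=16 alpha=16  scale=1.0
```

Because the multiplier shrinks as rank grows, the overall strength of the update stays in a similar range across ranks when alpha is held fixed.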

In practice:

  • if rank is too low, the adapter may underfit
  • if alpha is too low, the adapter may have too little effect
  • if alpha is too high, the adapter can dominate the base layer's output and training may become less well behaved

The repo’s wrapped linear layer view makes it clear that LoRA behavior depends both on the structure of the adapter and on how strongly its output is added back onto the frozen base layer’s output.
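A minimal sketch of such a wrapper, assuming a PyTorch-style implementation; the class name LoRALinear, the initialization, and the dimensions are illustrative and not the repo's actual code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal sketch: a frozen linear layer plus a scaled low-rank update."""
    def __init__(self, base: nn.Linear, rank: int, alpha: float):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # the base layer stays frozen
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank            # alpha sets the strength of the update

    def forward(self, x):
        # Frozen path plus the adapter's contribution, scaled and added back in.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(512, 512), rank=8, alpha=16)
out = layer(torch.randn(2, 512))
print(out.shape)  # torch.Size([2, 512])
```

Rank shows up in the shapes of lora_A and lora_B, while alpha only appears in the scale applied before the update is added to the frozen output, which is the structure-versus-strength split the next list summarizes.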

So rank and alpha work together:

  • rank decides how much structure the adapter can learn
  • alpha decides how much that learned structure changes the layer output

There is no single universally best pair. Smaller tasks and tighter hardware budgets often work with lower ranks, while harder adaptations may benefit from larger ranks and a well-chosen alpha.

In short, rank determines the capacity and size of a LoRA adapter, while alpha determines the strength of its contribution, and practical LoRA tuning is largely about choosing a good balance between those two.