What is a large language model?

Short explanation of LLMs and how they relate to language modeling.

What is grouped-query attention?

Practical explanation of grouped-query attention in transformer LLMs.

How do we evaluate LLM outputs?

Overview of approaches for evaluating generated LLM answers.

What is byte pair encoding?

Tokenization concept used by many language models.

Inference cache used to speed up autoregressive transformer decoding.

Machine Learning FAQ

It is always a pleasure to engage in discussions about machine learning. Below, I have collected some of the most frequently asked questions that I have answered via email or on social media platforms, in the hope that they are useful to others!

Looking for quick definitions? See the LLM Glossary for concise explanations of key terms like GQA, KV cache, RoPE, LoRA, MoE, and more.

Browse articles by topic: LLM Articles, PyTorch, Machine Learning, Best Articles

The only thing to do with good advice is to pass it on. It is never of any use to oneself.
— Oscar Wilde

General Questions About Machine Learning and Data Science

Questions About the Machine Learning Field

Questions About Machine Learning Concepts and Statistics

Activation Functions

Cost/Loss Functions and Optimization

Deployment and Production

What is the difference between stateful and stateless training?

Regression Analysis

Tree-based Models

Model Evaluation

Logistic Regression

Neural Networks and Deep Learning

Convolutional Neural Networks

Preprocessing, Feature Selection, and Feature Extraction

Naive Bayes

Other

Programming Languages and Libraries for Data Science and Machine Learning

Large Language Models (LLMs)

Foundations

Pretraining and Generation

Attention, Transformers, and Context

Finetuning and Adaptation

Alignment and Evaluation

Architectures and Model Families

Efficiency and Deployment

Go deeper

These FAQ answers are intentionally concise. For longer implementations, research notes, and tutorials, the articles below go into more detail.

LLM Articles: Finetuning, attention, tokenization, architecture, and evaluation (all on-domain posts).

ML Fundamentals: Model evaluation, PCA, LDA, Naive Bayes, and classical ML implementations.

Best Articles by Topic: A curated, non-chronological guide to the strongest starting points.

LLM Glossary: Concise definitions of GQA, KV cache, RoPE, LoRA, MoE, and 50+ more terms.