Hello, I'm Sebastian Raschka, PhD
I am an LLM Research Engineer with over a decade of experience in artificial intelligence. My work bridges academia and industry, including roles as a senior engineer at Lightning AI and as a statistics professor at the University of Wisconsin-Madison.
I am also the author of Build a Large Language Model (From Scratch).
My expertise lies in LLM research and the development of high-performance AI systems, with a deep focus on practical, code-driven implementations. (For my most up-to-date CV details, please visit my LinkedIn profile.)
Recent Notes and Blog Entries
Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention
May 16, 2026
From Gemma 4 to DeepSeek V4, How New Open-Weight LLMs Are Reducing Long-Context Costs
My Workflow for Understanding LLM Architectures
Apr 18, 2026
A learning-oriented workflow for understanding new open-weight model releases
Apr 4, 2026
How coding agents use tools, memory, and repo context to make LLMs work better in practice
A Visual Guide to Attention Variants in Modern LLMs
Mar 22, 2026
From MHA and GQA to MLA, sparse attention, and hybrid architectures
Mar 14, 2026
I put together a new LLM Architecture Gallery that collects the architecture figures from my recent comparison articles in one place, together with compact fact sheets and links.