Build a Large Language Model (From Scratch)


ISBN-13: 978-1633437166

Description

In Build a Large Language Model (from Scratch), you’ll discover how LLMs work from the inside out. In this book, I’ll guide you step by step through creating your own LLM, explaining each stage with clear text, diagrams, and examples.

The method described in this book for training and developing your own small-but-functional model for educational purposes mirrors the approach used in creating large-scale foundational models such as those behind ChatGPT. The book uses Python and PyTorch for all its coding examples.

Reviews

I got a serious closeup look at what goes on inside an LLM. Every step of the way, the book surprised with great detail, reiteration, recap and very manageable chunks to internalize the ideas.

–Via Ganapathy Subramaniam, Gen AI developer

‘Build a Large Language Model from Scratch’ by Sebastian Raschka @rasbt has been an invaluable resource for me, connecting many dots and sparking numerous ‘aha’ moments.

This book comes highly recommended for gaining a hands-on understanding of large language models.

–Via Faisal Alsrheed, AI researcher

While learning a new concept, I have always felt more confident about my understanding of the concept if I’m able to code it myself from scratch. Most tutorials tend to cover the high-level concept and leave out the minor details, and the absence of these details is acutely felt when you try to put these concepts into code. That’s why I really appreciate Sebastian Raschka, PhD’s latest book, Build a Large Language Model (from scratch).

At a time when most LLM implementations tend to use high-level packages (transformers, timm), it’s really refreshing to see the progressive development of an LLM by coding the core building blocks using basic PyTorch elements. It also makes you appreciate how some of the core building blocks of SOTA LLMs can be distilled down to relatively simple concepts.

–Roshan Santhosh, Data Scientist at Meta

Ultimate hands-on guide to build foundational models. This is the book you want to buy if you want to go deep.

–Antonio Gulli, Google Sr Director

It is a great book. I learned many things that were not clear to me. I highly recommend this book.

–Via Tae-Wan Kim, Professor, Seoul National University


A high-level, no-code overview that explains the development of an LLM, featuring numerous figures from the book, which itself focuses on the underlying code that implements these processes:





Machine Learning Q and AI


  • ISBN-10: 1718503768
  • ISBN-13: 978-1718503762
  • Paperback: 264 pages
  • No Starch Press (March 2024)

Description

If you’re ready to venture beyond introductory concepts and dig deeper into machine learning, deep learning, and AI, the question-and-answer format of Machine Learning Q and AI will make things fast and easy for you, without a lot of mucking about.

Each brief, self-contained chapter journeys through a fundamental question in AI, unraveling it with clear explanations, diagrams, and exercises. Topics include:

  • Multi-GPU training paradigms
  • Finetuning transformers
  • Differences between encoder- and decoder-style LLMs
  • Concepts behind vision transformers
  • Confidence intervals for ML
  • And many more!

This book is a fully edited and revised version of Machine Learning Q and AI, which was available on Leanpub.

Reviews

“Sebastian has a gift for distilling complex, AI-related topics into practical takeaways that can be understood by anyone. His new book, Machine Learning Q and AI, is another great resource for AI practitioners of any level.”
–Cameron R. Wolfe, Writer of Deep (Learning) Focus

“Sebastian uniquely combines academic depth, engineering agility, and the ability to demystify complex ideas. He can go deep into any theoretical topics, experiment to validate new ideas, then explain them all to you in simple words. If you’re starting your journey into machine learning, Sebastian is your guide.”
–Chip Huyen, Author of Designing Machine Learning Systems

“One could hardly ask for a better guide than Sebastian, who is, without exaggeration, the best machine learning educator currently in the field. On each page, Sebastian not only imparts his extensive knowledge but also shares the passion and curiosity that mark true expertise.”
–Chris Albon, Director of Machine Learning, The Wikimedia Foundation




Machine Learning with PyTorch and Scikit-Learn


  • ISBN-10: 1801819319
  • ISBN-13: 978-1801819312
  • Paperback: 770 pages
  • Packt Publishing Ltd. (February 25, 2022)

About this book

Initially, this project started as the 4th edition of Python Machine Learning. However, after putting so much passion and hard work into the changes and new topics, we thought it deserved a new title. So, what’s new? There are many changes and additions, including the switch from TensorFlow to PyTorch, new chapters on graph neural networks and transformers, a new section on gradient boosting, and many more that I will detail in a separate blog post.

For those who are interested in knowing what this book covers in general, I’d describe it as a comprehensive resource on the fundamental concepts of machine learning and deep learning. The first half of the book introduces readers to machine learning using scikit-learn, the de facto approach for working with tabular datasets. The second half focuses on deep learning, including applications to natural language processing and computer vision. While basic knowledge of Python is required, this book takes readers from understanding machine learning from the ground up to training advanced deep learning models by the end.

Reviews

“I’m confident that you will find this book invaluable both as a broad overview of the exciting field of machine learning and as a treasure of practical insights. I hope it inspires you to apply machine learning for the greater good in your problem area, whatever it might be.”

–Dmytro Dzhulgakov, PyTorch Core Maintainer

“This 700-page book covers most of today’s widely used machine learning algorithms, and will be especially useful to anybody who wants to understand modern machine learning through examples of working code. It covers a variety of approaches, from basic algorithms such as logistic regression to very recent topics in deep learning such as BERT and GPT language models and generative adversarial networks. The book provides examples of nearly every algorithm it discusses in the convenient form of downloadable Jupyter notebooks that provide both code and access to datasets. Importantly, the book also provides clear instructions on how to download and start using state-of-the-art software packages that take advantage of GPU processors, including PyTorch and Google Colab.”

–Tom M. Mitchell, professor, founder and former Chair of the Machine Learning Department at Carnegie Mellon University (CMU)

More information

Translations


  • Japanese ISBN-13: 978-4295015581
  • Serbian ISBN-13: 978-8673105772
  • Spanish ISBN-13: 978-8426735737
  • Korean ISBN-13: 979-1140707362






Older Books

You can find a list of all my books here.