Build a Large Language Model (From Scratch)


In Build a Large Language Model (from Scratch), you’ll discover how LLMs work from the inside out. In this book, I’ll guide you step by step through creating your own LLM, explaining each stage with clear text, diagrams, and examples.

The method described in this book for training and developing your own small-but-functional model for educational purposes mirrors the approach used in creating large-scale foundational models such as those behind ChatGPT. The book uses Python and PyTorch for all its coding examples.




Machine Learning Q and AI


If you’ve locked down the basics of machine learning and AI and want a fun way to address lingering knowledge gaps, Machine Learning Q and AI is for you. This rapid-fire series of short chapters addresses 30 essential questions in the field, helping you stay current on the latest technologies you can implement in your own work.

Each of the 30 short chapters of Machine Learning Q and AI asks and answers a central question, with diagrams to explain new concepts and ample references for further reading. Topics include:

  • Multi-GPU training paradigms
  • Finetuning transformers
  • Differences between encoder- and decoder-style LLMs
  • Concepts behind vision transformers
  • Confidence intervals for ML
  • And many more!

This book is a fully edited and revised version of Machine Learning Q and AI, which was available on Leanpub.

Reviews

“One could hardly ask for a better guide than Sebastian, who is, without exaggeration, the best machine learning educator currently in the field. On each page, Sebastian not only imparts his extensive knowledge but also shares the passion and curiosity that mark true expertise.”
– Chris Albon, Director of Machine Learning, The Wikimedia Foundation




Machine Learning with PyTorch and Scikit-Learn


ISBN-10: 1801819319 ISBN-13: 978-1801819312 Paperback: 770 pages Packt Publishing Ltd. (February 25, 2022)

About this book

This project started as the 4th edition of Python Machine Learning. However, after putting so much passion and hard work into the changes and new topics, we thought it deserved a new title. So, what’s new? There is a lot of new content, including the switch from TensorFlow to PyTorch, new chapters on graph neural networks and transformers, a new section on gradient boosting, and much more that I will detail in a separate blog post. For those who are interested in knowing what this book covers in general, I’d describe it as a comprehensive resource on the fundamental concepts of machine learning and deep learning. The first half of the book introduces readers to machine learning using scikit-learn, the de facto approach for working with tabular datasets. The second half focuses on deep learning, including applications to natural language processing and computer vision. While basic knowledge of Python is required, this book takes readers on a journey from understanding machine learning from the ground up to training advanced deep learning models by the end of the book.

Praise

“I’m confident that you will find this book invaluable both as a broad overview of the exciting field of machine learning and as a treasure of practical insights. I hope it inspires you to apply machine learning for the greater good in your problem area, whatever it might be.”

Dmytro Dzhulgakov, PyTorch Core Maintainer

“This 700-page book covers most of today’s widely used machine learning algorithms, and will be especially useful to anybody who wants to understand modern machine learning through examples of working code. It covers a variety of approaches, from basic algorithms such as logistic regression to very recent topics in deep learning such as BERT and GPT language models and generative adversarial networks. The book provides examples of nearly every algorithm it discusses in the convenient form of downloadable Jupyter notebooks that provide both code and access to datasets. Importantly, the book also provides clear instructions on how to download and start using state-of-the-art software packages that take advantage of GPU processors, including PyTorch and Google Colab.”

Tom M. Mitchell, professor, founder and former Chair of the Machine Learning Department at Carnegie Mellon University (CMU)

More info

Translations

Machine Learning with PyTorch and Scikit-Learn in Serbian

Python Machine Learning Japanese

Python Machine Learning Spanish

  • Serbian ISBN-13: 978-8673105772
  • Japanese ISBN-13: 978-4295015581
  • Spanish ISBN-13: 978-8426735737






Older Books

Python Machine Learning, 3rd Edition


ISBN-10: 1789955750 ISBN-13: 978-1789955750 Paperback: 770 pages; ebook available in Kindle format, Epub, PDF Packt Publishing Ltd. (December 12th, 2019)

What’s new in this third edition? Many readers have told us how much they love the first 12 chapters of the book as a comprehensive introduction to machine learning and Python’s scientific computing stack. To keep these chapters relevant and to improve the explanations based on reader feedback, we updated them to support the latest versions of NumPy, SciPy, and scikit-learn.

One of the most exciting events in the deep learning world was the release of TensorFlow 2. Consequently, all the TensorFlow-related deep learning chapters have received a big overhaul. Since TensorFlow 2 introduced many new features and fundamental changes, we rewrote these chapters from scratch. Furthermore, we added a new chapter on Generative Adversarial Networks, which are one of the hottest topics in deep learning research, as well as a comprehensive introduction to reinforcement learning based on numerous requests from readers.

Reviews

Translations

Python Machine Learning Italian

Python Machine Learning German

Python Machine Learning Serbian

Python Machine Learning Korean

Python Machine Learning Japanese

  • Italian ISBN-13: 978-8850335244 (Machine Learning con Python – nuova edizione)
  • Serbian ISBN-13: 978-8673105499 (Python mašinsko učenje)
  • Korean ISBN-13: TBD (머신 러닝 교과서 with 파이썬, 사이킷런, 텐서플로)
  • German ISBN-13: 978-3747502136 (Machine Learning mit Python und Keras, TensorFlow 2 und Scikit-learn: Das umfassende Praxis-Handbuch für Data Science, Deep Learning und Predictive Analytics)




Python Machine Learning, 2nd Edition


ISBN-10: 1787125939 ISBN-13: 978-1787125933 Paperback: 622 pages; ebook available in Kindle format, Epub, PDF Packt Publishing Ltd. (September 20th, 2017)

From the back cover:

Machine learning is eating the software world, and now deep learning is extending machine learning. This second edition of Sebastian Raschka’s bestselling book, Python Machine Learning, is now thoroughly updated using the latest Python open source libraries, so that you can understand and work at the cutting-edge of machine learning, neural networks, and deep learning.

This highly acclaimed book has been modernized to include the popular TensorFlow deep learning library, essential coverage of the Keras neural network library, and the latest scikit-learn machine learning library updates. The result is a new edition of this classic book at the cutting edge of deep learning and machine learning.

If you’re new to machine learning, you’ll find that this edition offers the techniques you need to create machine learning and deep learning applications. Raschka and Mirjalili introduce you to machine learning and deep learning algorithms from scratch, and if you read the first edition of this book, you’ll be delighted to find a new balance of classical and modern ideas.

Translations


  • Japanese ISBN-13: 978-4295003373
  • Spanish ISBN-13: 978-8426727206
  • German ISBN-13: 978-3958457331
  • Korean ISBN-13: 979-1160507966
  • Polish ISBN-13: 978-8328351219



Python Machine Learning, 1st Edition

This book will teach you the fundamentals of machine learning and how to utilize these in real-world applications using Python. Step-by-step, you will expand your skill set with the best practices for transforming raw data into useful information, developing learning algorithms efficiently, and evaluating results.

What you can expect are 400 pages rich in useful material: just about everything you need to know to get started with machine learning. My mission was to not treat algorithms as a black box, to provide the necessary math intuition in the most accessible way, and to provide code examples that put the learned material into action.

Knowledge is gained by learning, the key is our enthusiasm, but the true mastery of skills can only be achieved by practice.

The focus of this book is to help you understand machine learning concepts and algorithms. We will implement algorithms from scratch in Python and NumPy to complement our learning experience, go over many examples using scikit-learn for our own convenience, and optimize our code via Theano and Keras for neural network training on GPUs.

ISBN-10: 1783555130 ISBN-13: 978-1783555130 Paperback: 454 pages, ebook Packt Publishing Ltd. (September 24th, 2015)

Sebastian Raschka’s new book, Python Machine Learning, has just been released. I got a chance to read a review copy and it’s just as I expected - really great! It’s well organized, super easy to follow, and it not only offers a good foundation for smart, non-experts, practitioners will get some ideas and learn new tricks here as well. – Lon Riesberg at Data Elixir

Superb job! Thus far, for me it seems to have hit the right balance of theory and practice…math and code! – Brian Thomas

I’ve read (virtually) every Machine Learning title based around scikit-learn and this is hands-down the best one out there. – Jason Wolosonovich

If you need help to decide whether this book is for you, check out some of the “longer” reviews linked below. (If you wrote a review, please let me know, and I’d be happy to add it to the list).

Sebastian Raschka created an amazing machine learning tutorial which combines theory with practice. The book explains machine learning from a theoretical perspective and has tons of coded examples to show how you would actually use the machine learning technique. It can be read by a beginner or advanced programmer.

Translations

Python Machine Learning German

Python Machine Learning Japanese

Python Machine Learning Italian

Python Machine Learning Korean

Python Machine Learning Chinese (traditional)

Python Machine Learning Chinese (simple)

Python Machine Learning Russian

Python Machine Learning Polish

  • German ISBN-13: 978-3958454224
  • Japanese ISBN-13: 978-4844380603
  • Italian ISBN-13: 978-8850333974
  • Chinese (traditional) ISBN-13: 978-9864341405
  • Chinese (simple) ISBN-13: 978-7111558804
  • Korean ISBN-13: 979-1187497035
  • Russian ISBN-13: 978-5970604090
  • Polish ISBN-13: 978-8328336131



Heat Maps in R: How-To

Sebastian Raschka

We are living in the information age, where huge amounts of data are readily available to everyone. In my book, I provide a practical, hands-on approach to creating heat maps using the free and probably most popular statistical software package: R. Don’t worry, I already did the hard work for you and provide all the code you’ll need to create great heat maps from your data. Detailed information on each approach makes this book a valuable resource for beginners as well as experienced users of R.

My honest opinion: This book is a couple of years old by now, and many new packages have been developed in R since then. Although this book contains a little bit more than “just” heat maps, maybe one of my blog articles is already sufficient to get you started.

ISBN-10: 1782165649 ISBN-13: 978-1782165644 Paperback: 72 pages, ebook Packt Publishing (June, 2013)




Book Chapters

Python Interviews: Discussions with Python Experts

Sebastian Raschka Python Interviews

A book featuring 20 interviews with Python experts from a diverse set of fields. In my interview, I answer Mike’s questions on the use of Python for data science, machine learning, and AI.

ISBN-13: 978-1788399081 ASIN: B0793XYQGZ Packt Publishing (February, 2018)











Computational Drug Discovery and Design 2nd Edition: Techniques for Developing Reliable Machine Learning Classifiers Applied to Understanding and Predicting Protein:Protein Interaction Hot Spots.

Sebastian Raschka Computational Bio and ML

This book aims to provide protocols for the use of bioinformatics tools in drug discovery and design. With my co-authors, I contributed a chapter on using machine learning to predict protein hot-spots:

  • Jiaxing Chen, Leslie A. Kuhn, and Sebastian Raschka (2023) Computational Drug Discovery and Design: Techniques for Developing Reliable Machine Learning Classifiers Applied to Understanding and Predicting Protein:Protein Interaction Hot Spots.

ISBN 978-1-0716-3440-0 (Hardcover) ISBN 978-1-0716-3441-7 (eBook) Springer Nature / Humana Press




Computational Drug Discovery and Design: Automated Inference of Chemical Group Discriminants of Biological Activity from Virtual Screening Data

Sebastian Raschka Computational Bio and ML

This book aims to provide protocols for the use of bioinformatics tools in drug discovery and design. With my co-authors, I contributed a chapter on using machine learning to assess the importance of chemical groups in biological activity datasets:

  • Raschka, Sebastian, Leslie A. Kuhn, Anne M. Scott, and Weiming Li (2018) Computational Drug Discovery and Design: Automated Inference of Chemical Group Discriminants of Biological Activity from Virtual Screening Data.

ISBN 978-1-4939-7755-0 (Hardcover) ISBN 978-1-4939-7756-7 (eBook) Springer Nature / Humana Press