Understanding Reasoning LLMs

Understanding Reasoning LLMs

In this article, I will describe the four main approaches to building reasoning models, or how we can enhance LLMs with reasoning capabilities. I hope this provides valuable insights and helps you navigate the rapidly evolving literature and hype surrounding this topic.

Noteworthy LLM Research Papers of 2024

Noteworthy LLM Research Papers of 2024

This article covers 12 influential AI research papers of 2024, ranging from mixture-of-experts models to new LLM scaling laws for precision.

Implementing A Byte Pair Encoding (BPE) Tokenizer From Scratch

Implementing A Byte Pair Encoding (BPE) Tokenizer From Scratch

This is a standalone notebook implementing the popular byte pair encoding (BPE) tokenization algorithm, which is used in models like GPT-2 to GPT-4, Llama 3, etc., from scratch for educational purposes."

LLM Research Papers: The 2024 List

LLM Research Papers: The 2024 List

I want to share my running bookmark list of many fascinating (mostly LLM-related) papers I stumbled upon in 2024. It's just a list, but maybe it will come in handy for those who are interested in finding some gems to read for the holidays.

Understanding Multimodal LLMs

Understanding Multimodal LLMs

There has been a lot of new research on the multimodal LLM front, including the latest Llama 3.2 vision models, which employ diverse architectural strategies to integrate various data types like text and images. For instance, The decoder-only method uses a single stack of decoder blocks to process all modalities sequentially. On the other hand, cross-attention methods (for example, used in Llama 3.2) involve separate encoders for different modalities with a cross-attention layer that allows these encoders to interact. This article explains how these different types of multimodal LLMs function. Additionally, I will review and summarize roughly a dozen other recent multimodal papers and models published in recent weeks to compare their approaches.