North Mini Code is a new open-weight model by Cohere for agentic coding tasks.

According to the release post, it is a 30B-parameter Mixture-of-Experts model with 3B active parameters per token, available under Apache 2.0. Architecturally, the interesting part is the 30B-A3B tradeoff (30B total parameters, about 3B active per token), with 128 experts, 8 of them active per token, and interleaved sliding-window and global attention layers.
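To make the sparsity concrete, here is a minimal sketch of generic top-k expert routing for a single token. Only the numbers 128, 8, 30B, and 3B come from the release post. The routing scheme itself (softmax over the selected logits) and the function name `route_token` are my assumptions for illustration, not a description of Cohere's implementation.

```python
import math
import random

NUM_EXPERTS = 128  # from the release post
TOP_K = 8          # active experts per token, from the release post


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token and normalize their weights.

    Generic top-k routing (an assumption): the release post does not
    state the exact routing scheme used by the model.
    """
    # Indices of the top_k largest router logits.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Softmax over the selected logits only (shifted by the max for stability).
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Random logits stand in for a learned router's output.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
assignment = route_token(logits)

print(len(assignment))      # 8 experts used for this token
print(TOP_K / NUM_EXPERTS)  # 0.0625 of the experts are active
print(3 / 30)               # 0.1 of the parameters are active
```

The active-parameter fraction (about 10%) is higher than the active-expert fraction (6.25%), presumably because attention layers, embeddings, and any other non-expert weights run for every token.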

The important detail is the evaluation setup. The release emphasizes agentic coding, where the model has to work inside a tool loop instead of only returning a code answer for a prompt:

  1. On Terminal-Bench, the model has to use a terminal, inspect the environment, run commands, read outputs, and continue from the observed state.
  2. On SWE-Bench, the model works on GitHub-style software issues. It has to understand the repository, find relevant files, make a patch, and pass tests.
  3. SciCode and LiveCodeBench are closer to traditional code-generation benchmarks. They still require reasoning, but the interaction loop is much shorter.
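The difference between a tool loop and a single-shot code answer can be sketched in a few lines. This is a toy illustration, not any benchmark's actual harness: `run_agent`, `fake_shell`, `scripted_policy`, and the canned command outputs are all hypothetical, and the scripted policy stands in for a real model.

```python
def run_agent(policy, run_tool, task, max_steps=10):
    """Alternate between model decisions and tool observations.

    Returns (patch, history). patch is None if the step budget runs out.
    """
    history = [("task", task)]
    for _ in range(max_steps):
        action = policy(history)
        if action["type"] == "submit":
            return action["patch"], history
        observation = run_tool(action["cmd"])
        history.append(("action", action["cmd"]))
        history.append(("observation", observation))
    return None, history


# Hypothetical environment: canned outputs instead of a real terminal.
FAKE_OUTPUTS = {
    "ls": "calc.py  test_calc.py",
    "pytest -q": "1 failed: test_add expected 3, got -1",
    "cat calc.py": "def add(a, b):\n    return a - b",
}


def fake_shell(cmd):
    return FAKE_OUTPUTS.get(cmd, f"command not found: {cmd}")


def scripted_policy(history):
    """Stand-in for a model: inspect, run tests, read the file, then patch."""
    seen = [item for kind, item in history if kind == "action"]
    for cmd in ("ls", "pytest -q", "cat calc.py"):
        if cmd not in seen:
            return {"type": "run", "cmd": cmd}
    return {"type": "submit", "patch": "return a + b"}


patch, history = run_agent(scripted_policy, fake_shell, "Fix the failing test")
print(patch)         # return a + b
print(len(history))  # 7: the task plus three action/observation pairs
```

With `max_steps=2` the same policy never reaches the submit step and `run_agent` returns `None`, which is a small illustration of how a step or time budget alone can flip a task from pass to fail.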

That focus on agentic coding is probably why North Mini Code looks far ahead of Gemma 4 on the workflow-heavy rows (Terminal-Bench and SWE-Bench) in the benchmark table. On the more traditional code-generation rows it is still competitive, although not quite at Qwen3.6 level.

As usual, I would treat these as release-time benchmark numbers as of June 12, 2026. For agentic coding, harness details, tool APIs, timeouts, and prompt templates can move results substantially.
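One practical way to act on this is to store the harness settings next to every score and only compare numbers whose settings match. The sketch below is a hypothetical convention of mine, not something from the release post. `HarnessConfig`, its fields, and all the example values are placeholders.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class HarnessConfig:
    """Hypothetical record of the settings that can move agentic results."""
    benchmark: str
    tool_api: str
    timeout_s: int
    max_steps: int
    prompt_template: str

    def fingerprint(self):
        # Stable short hash over all fields, so any change is visible.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


def comparable(a, b):
    """Two scores are like-for-like only if their harness settings match."""
    return a.fingerprint() == b.fingerprint()


# Placeholder values, not taken from any real evaluation.
base = HarnessConfig("example-bench", "shell-v1", 600, 50, "You are a coding agent.")
longer_timeout = replace(base, timeout_s=1800)

print(comparable(base, base))            # True
print(comparable(base, longer_timeout))  # False
```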

Figure: North Mini Code architecture and benchmark overview. Composite figure from the original Substack note, summarizing the North Mini Code architecture and a release-time benchmark snapshot.

Source: lightly edited website version of my Substack note.