North Mini Code and Agentic Coding Benchmarks
North Mini Code is a new open-weight model by Cohere for agentic coding tasks.
According to the release post, it is a 30B-parameter Mixture-of-Experts model with 3B active parameters, available under Apache 2.0. Architecturally, the interesting part is the 30B-A3B tradeoff (30B total, 3B active per token), built from 128 experts with 8 active per token, plus interleaved sliding-window and global attention.
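To make the "128 experts, 8 active per token" part concrete, here is a minimal sketch of generic top-k expert routing for a single token. Only the two numbers come from the release post. The function name `route_token`, the random router logits, and the choice to apply the softmax over the selected experts only are my assumptions for illustration, not a description of how North Mini Code actually routes.

```python
import math
import random

NUM_EXPERTS = 128  # from the release post
TOP_K = 8          # active experts per token, from the release post


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k experts of one token.

    Generic top-k routing sketch; the normalization scheme is an assumption.
    """
    # Indices of the top_k largest router logits.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:top_k]
    # Softmax over the selected logits only, so the weights sum to 1.
    max_logit = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - max_logit) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Stand-in router output for one token (random, purely illustrative).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routing = route_token(logits)

for expert, weight in routing:
    print(f"expert {expert:3d}  weight {weight:.3f}")
```

Note that 8 of 128 experts is 6.25% of the expert pool, while 3B of 30B is 10% of the parameters. The two ratios need not match, since parts of a model such as attention and embeddings typically run for every token regardless of routing.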
The important detail is the evaluation setup. The release emphasizes agentic coding, where the model has to work inside a tool loop instead of only returning a code answer for a prompt:
- On Terminal-Bench, the model has to use a terminal, inspect the environment, run commands, read outputs, and continue from the observed state.
- On SWE-Bench, the model works on GitHub-style software issues. It has to understand the repository, find relevant files, make a patch, and pass tests.
- SciCode and LiveCodeBench are closer to traditional code-generation benchmarks. They still require reasoning, but the interaction loop is much shorter.
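The difference between a tool loop and a single-shot answer is easier to see in code. The sketch below is a toy loop under my own assumptions: `run_agent_loop`, `scripted_policy`, `fake_execute`, and the `FAKE_ENV` table are all hypothetical stand-ins, not the Terminal-Bench or SWE-Bench harness. A real harness would call a model where the policy is and a sandboxed terminal where the fake environment is.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    command: str      # what the agent decided to run
    observation: str  # what the environment returned


def run_agent_loop(policy: Callable[[List[Step]], Optional[str]],
                   execute: Callable[[str], str],
                   max_steps: int = 10) -> List[Step]:
    """Alternate between choosing a command and observing its output."""
    history: List[Step] = []
    for _ in range(max_steps):
        command = policy(history)  # decide from the observed state so far
        if command is None:        # the policy signals that it is done
            break
        history.append(Step(command, execute(command)))
    return history


# Hypothetical canned environment: command -> output.
FAKE_ENV = {"ls": "app.py test_app.py", "pytest -q": "1 passed"}


def fake_execute(command: str) -> str:
    return FAKE_ENV.get(command, "command not found")


def scripted_policy(history: List[Step]) -> Optional[str]:
    """Stand-in for the model: inspect first, then run tests if they exist."""
    if not history:
        return "ls"
    last = history[-1]
    if last.command == "ls" and "test_app.py" in last.observation:
        return "pytest -q"
    return None


trace = run_agent_loop(scripted_policy, fake_execute)
for step in trace:
    print(f"$ {step.command}\n{step.observation}")
```

The point of the sketch is that the second command depends on the output of the first. A single-shot code-generation benchmark has no such dependency, which is why the interaction loop there is much shorter.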
That focus on agentic coding is probably why North Mini Code looks far ahead of Gemma 4 on the workflow-heavy rows in the table. The more traditional code-generation rows are still competitive, although not quite at Qwen3.6 level.
As usual, I would treat these as release-time benchmark numbers as of June 12, 2026. For agentic coding, harness details, tool APIs, timeouts, and prompt templates can move results substantially.
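One way to keep that caveat honest in practice is to store the harness settings next to every score and only compare numbers whose settings match. This is a sketch of that bookkeeping, not anything from the release post: the class names, field names, and all values below are hypothetical placeholders, and the scores are dummy zeros rather than real results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarnessConfig:
    harness: str          # which agent scaffold ran the model
    tool_api: str         # which tool interface the model was given
    timeout_s: int        # per-task time limit in seconds
    prompt_template: str  # identifier of the system/task prompt used


@dataclass(frozen=True)
class Result:
    model: str
    benchmark: str
    score: float
    config: HarnessConfig


def comparable(a: Result, b: Result) -> bool:
    """Two scores are directly comparable only on the same benchmark and setup."""
    return a.benchmark == b.benchmark and a.config == b.config


# Placeholder entries: the names and the 0.0 scores are made up for illustration.
short_timeout = HarnessConfig("harness-x", "bash-tool-v1", 600, "template-1")
long_timeout = HarnessConfig("harness-x", "bash-tool-v1", 1800, "template-1")

a = Result("model-a", "Terminal-Bench", 0.0, short_timeout)
b = Result("model-b", "Terminal-Bench", 0.0, long_timeout)
c = Result("model-c", "Terminal-Bench", 0.0, short_timeout)

print(comparable(a, b))  # differing timeouts
print(comparable(a, c))  # identical setup
```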
Source: lightly edited website version of my Substack note.
