338 Downloads Updated 8 hours ago
ollama run north-mini-code-1.0
North Mini Code is the first model in Cohere’s new family of models, and is specifically designed and trained for agentic software engineering tasks.
On Artificial Analysis’ Coding Index, North Mini Code scores 33.4, outperforming similarly sized open models like Qwen3.5 (35B-A3B), Gemma 4 (26B-A4B), and Devstral Small 2 (24B), as well as substantially larger models including Nemotron 3 Super (120B-A12B), Mistral Small 4 (119B-A6B), and Devstral 2 (123B).
North Mini Code is a decoder-only Transformer-based sparse Mixture-of-Experts model. It interleaves sliding-window attention (with RoPE) and global attention (with no positional embeddings) in a 3:1 ratio. The feed-forward block is an MoE block with 128 experts, 8 of which are activated per token, each using SwiGLU activation. The router applies a sigmoid activation before top-k selection, and a single dense layer precedes the sparse layers.
North Mini Code is trained for tool use and agentic coding, and supports interleaved thinking — it works best with thinking enabled. For best performance, pass model-generated thinking content forward to subsequent agentic steps and chat turns. Tool descriptions are best provided as JSON schema.
North Mini Code is released under the Apache 2.0 license, and also requires adhering to Cohere Lab’s Acceptable Use Policy.