How does MiMo Code compare to Claude Code?

According to Xiaomi (self-reported, no independent verification), MiMo Code outperforms Claude Code on SWE-bench Verified (82% vs 79%), SWE-bench Pro (62% vs 55%), and Terminal-Bench 2.0 (73% vs 69%). In a double-blind test with 576 developers, it achieved a win rate above 65% on tasks with 200+ steps, but roughly tied (~50/50) on shorter tasks. Note that the comparison covers only Claude Code (Sonnet 4.6): Codex CLI + GPT-5.5 scores 82.2% on Terminal-Bench 2.0, about 9 points above MiMo Code.

What are the caveats and risks of MiMo Code?

There are four main caveats. First: all benchmarks are self-reported by Xiaomi, with no third-party verification. Second: the comparison covers only Claude Code and leaves out Codex CLI + GPT-5.5, which scores about 9 points higher on Terminal-Bench 2.0. Third: MiMo Auto is free for a limited time and routes code through Xiaomi servers; Xiaomi is a Chinese company subject to Chinese law, which includes obligations to cooperate with government authorities. Fourth: it is V0.1.0, the first public release, and local deployment of V2.5-Pro requires ~8x H200 GPUs (~600GB+ VRAM in FP8).

Xiaomi MiMo Code: The Open-Source Coding Agent That Challenges Claude Code

What is MiMo Code and how does it work?

MiMo Code is an open-source coding agent (MIT license) released by Xiaomi on June 10, 2026. It is a fork of OpenCode that runs natively in the terminal with full tool use (files, bash, Git, LSP, MCP), a subagent system with parallel execution, persistent memory via SQLite FTS5 with checkpoint and self-maintenance (dream/distill), Max Mode with parallel best-of-N (N=5), and Compose Mode for specs-driven development with built-in skills (planning, TDD, code review, debugging). The underlying model is MiMo-V2.5-Pro: 1.02T total parameters, 42B active per inference, MoE with hybrid attention and 1M token context.

What does harness > model mean and why does it matter?

The real innovation in MiMo Code is not the model — it is the harness (the orchestration architecture around the model). According to Xiaomi (self-reported, no independent verification), even using the same MiMo-V2.5-Pro model, the MiMo Code harness scores ~5 points above the Claude Code harness on SWE-bench Pro. The difference lies in the memory architecture: checkpoint-writer, context rebuild, and dream/distill give an advantage on long tasks. It is the same lesson from SantanderAI: architecture matters more than raw model capability. The model is a commodity; the harness is the differentiator.

Xiaomi — the same company that makes your Redmi phone and your air purifier — released an open-source coding agent that challenges Claude Code. MiMo Code. MIT license. 10.8K stars in 16 days. Fork of OpenCode. We analyzed the numbers, the caveats, and what this actually means for AI engineering.

What is MiMo Code

According to Xiaomi, MiMo Code is a terminal-native coding agent with full tool use: files, bash, Git, LSP, and MCP. It has a subagent system with parallel execution, persistent memory via SQLite FTS5 with checkpoint and self-maintenance (dream/distill), Max Mode with parallel best-of-N (N=5), and Compose Mode for specs-driven development with built-in skills (planning, TDD, code review, debugging). Voice input via TenVAD + MiMo ASR.
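To make "parallel best-of-N" concrete, here is a minimal sketch of the general technique, not MiMo Code's actual implementation. The names `best_of_n`, `fake_generate`, and `fake_score` are hypothetical; a real harness would call the model in place of `fake_generate` and run tests or a verifier in place of `fake_score`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def best_of_n(
    task: str,
    generate: Callable[[str, int], str],
    score: Callable[[str], float],
    n: int = 5,
) -> Tuple[str, float]:
    """Run `generate` n times in parallel and return the highest-scoring candidate."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Each attempt gets its own seed so the candidates can differ.
        candidates: List[str] = list(
            pool.map(lambda seed: generate(task, seed), range(n))
        )
    scored = [(candidate, score(candidate)) for candidate in candidates]
    # max() keeps the first maximum, so ties resolve to the lowest seed.
    return max(scored, key=lambda pair: pair[1])


# Hypothetical stand-ins for the model call and the verifier.
def fake_generate(task: str, seed: int) -> str:
    return f"patch-{seed} for {task}"


def fake_score(candidate: str) -> float:
    # Pretend the candidate from seed 3 passes the most tests.
    return 1.0 if candidate.startswith("patch-3") else 0.5


if __name__ == "__main__":
    best, value = best_of_n("fix flaky test", fake_generate, fake_score)
    print(best, value)
```

The trade-off is straightforward: N parallel attempts cost roughly N times the compute of one attempt, and the gain depends entirely on how well the scoring step separates good candidates from bad ones.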

The underlying model is MiMo-V2.5-Pro: 1.02T total parameters, 42B active per inference, MoE with hybrid attention (Sliding Window + Global), 1M token context, 3 MTP layers for speculative decoding (~3x speedup). Pre-trained on 27 trillion tokens. Native FP8.
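As a quick sanity check on what the MoE figures imply, the snippet below computes the share of parameters that is active per inference. Both inputs are the numbers quoted above; the helper name `active_fraction` is ours.

```python
TOTAL_PARAMS = 1.02e12  # 1.02T total parameters, as quoted above
ACTIVE_PARAMS = 42e9    # 42B active per inference, as quoted above


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used for a single inference step."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share: {fraction:.1%}")
```

In other words, only about 4% of the weights do work on any given step, which is why a trillion-parameter MoE can have per-step compute closer to that of a much smaller dense model while still needing memory for all of its weights.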

Bonus: MiMo-7B-RL (7.8B dense) matches o1-mini on math/code benchmarks. A 7B model matching a model 20x larger — according to Xiaomi (self-reported, no independent verification).

The benchmarks — and why you should read them with caution

According to Xiaomi (self-reported, no independent verification):

SWE-bench Verified: 82% vs Claude Code's 79% (+3 points)
SWE-bench Pro: 62% vs Claude Code's 55% (+7 points)
Terminal-Bench 2.0: 73% vs Claude Code's 69% (+4 points)
Double-blind test with 576 developers: >65% win rate on tasks with 200+ steps; roughly 50/50 below 200 steps.

The numbers are impressive. But they are all self-reported by Xiaomi with zero independent verification. We repeat this because it is fundamental: there is no third-party audit, no published reproduction, no peer review. It is the equivalent of a company publishing its own Net Promoter Score (NPS): it might be true, but you have no way of checking.

And the comparison is only vs Claude Code (Sonnet 4.6). Codex CLI + GPT-5.5 scores 82.2% on Terminal-Bench 2.0 — 9 points above MiMo Code. When the comparison field widens, the narrative changes.
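To see how the picture shifts once a third agent enters the comparison, the sketch below recomputes the gaps from the figures quoted in this article. The scores are copied as quoted, not independently checked, and the helper `gaps_vs` is ours.

```python
from typing import Dict

# Scores in percent, exactly as quoted in this article (not independently verified).
SCORES: Dict[str, Dict[str, float]] = {
    "SWE-bench Verified": {"MiMo Code": 82.0, "Claude Code": 79.0},
    "SWE-bench Pro": {"MiMo Code": 62.0, "Claude Code": 55.0},
    "Terminal-Bench 2.0": {
        "MiMo Code": 73.0,
        "Claude Code": 69.0,
        "Codex CLI + GPT-5.5": 82.2,
    },
}


def gaps_vs(reference: str, scores: Dict[str, Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Return each agent's score minus the reference agent's, in percentage points."""
    gaps: Dict[str, Dict[str, float]] = {}
    for benchmark, by_agent in scores.items():
        base = by_agent[reference]
        gaps[benchmark] = {
            agent: round(value - base, 1)
            for agent, value in by_agent.items()
            if agent != reference
        }
    return gaps


if __name__ == "__main__":
    for benchmark, deltas in gaps_vs("MiMo Code", SCORES).items():
        for agent, delta in deltas.items():
            print(f"{benchmark}: {agent} {delta:+.1f} points vs MiMo Code")
```

Negative values mean MiMo Code is ahead and positive values mean it is behind. Only Terminal-Bench 2.0 has a Codex figure here, so the other two benchmarks say nothing about that matchup.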

Harness > model — the real innovation

The real innovation in MiMo Code is not the model. It is the harness — the orchestration architecture around the model. According to Xiaomi (self-reported, no independent verification), even using the same MiMo-V2.5-Pro model, the MiMo Code harness scores ~5 points above the Claude Code harness on SWE-bench Pro.

The difference lies in the memory architecture: checkpoint-writer saves context at regular intervals, context rebuild reconstructs context when the token limit approaches, and dream/distill compresses and consolidates memories between sessions. This gives an advantage on long tasks — exactly where the double-blind test showed >65% win rate.
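The checkpoint and rebuild steps can be illustrated with a toy version built on Python's standard `sqlite3` module and an FTS5 table. This is a sketch of the general idea under our own assumptions, not MiMo Code's schema or code: the class `AgentMemory`, the table layout, and the character budget (standing in for a token budget) are all hypothetical, and it assumes the local SQLite build includes FTS5. The dream/distill step is left out because it would need a summarizer.

```python
import sqlite3
from typing import List


class AgentMemory:
    """Toy persistent memory: periodic checkpoints stored in an FTS5 index."""

    def __init__(self, path: str = ":memory:", checkpoint_every: int = 3) -> None:
        self.db = sqlite3.connect(path)
        # FTS5 virtual table; `step` is stored but not indexed for search.
        self.db.execute(
            "CREATE VIRTUAL TABLE IF NOT EXISTS memory "
            "USING fts5(step UNINDEXED, note)"
        )
        self.checkpoint_every = checkpoint_every
        self.pending: List[str] = []
        self.step = 0

    def record(self, note: str) -> None:
        """Buffer a note and write a checkpoint every `checkpoint_every` steps."""
        self.step += 1
        self.pending.append(note)
        if self.step % self.checkpoint_every == 0:
            self.checkpoint()

    def checkpoint(self) -> None:
        """Flush the buffered notes into the index as a single row."""
        if not self.pending:
            return
        self.db.execute(
            "INSERT INTO memory (step, note) VALUES (?, ?)",
            (self.step, " | ".join(self.pending)),
        )
        self.db.commit()
        self.pending.clear()

    def rebuild_context(self, query: str, max_chars: int = 200) -> str:
        """Fetch the best-matching checkpoints, stopping at the size budget."""
        rows = self.db.execute(
            "SELECT note FROM memory WHERE memory MATCH ? ORDER BY rank",
            (query,),
        ).fetchall()
        context: List[str] = []
        used = 0
        for (note,) in rows:
            if used + len(note) > max_chars:
                break
            context.append(note)
            used += len(note)
        return "\n".join(context)


if __name__ == "__main__":
    mem = AgentMemory(checkpoint_every=2)
    for note in [
        "opened parser.py",
        "parser fails on empty input",
        "edited README",
        "ran lint",
    ]:
        mem.record(note)
    print(mem.rebuild_context("parser"))
```

The point of the design is that the agent does not need to keep its whole history in the prompt: it writes notes out as it goes, then pulls back only what matches the current problem when the context has to be rebuilt.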

It is the same lesson from SantanderAI: architecture matters more than raw model capability. The model is a commodity; the harness is the differentiator. When SantanderAI open-sourced its AI stack, the signal was that the infrastructure layer is becoming commoditized. MiMo Code reinforces this: if the same model scores differently depending on the harness, the value is not in the model — it is in the orchestration.

The caveats that matter

Four caveats that do not appear in the README:

Self-reported benchmarks: zero independent verification. Xiaomi may have selected favorable tasks, calibrated hyperparameters specifically for benchmarks, or simply reported the best runs. Without reproducibility, the numbers are indicative, not conclusive.
Limited comparison: only vs Claude Code. Codex CLI + GPT-5.5 scores 82.2% on Terminal-Bench 2.0 — 9 points higher. If the comparison were vs Codex, the narrative would be different.
MiMo Auto and Chinese law: MiMo Auto is free for a limited time and routes code through Xiaomi servers. Xiaomi is a Chinese company subject to Chinese law — this includes obligations to cooperate with government authorities. For proprietary code, customer data, and trade secrets, this is a real risk. Local deployment is the alternative, but requires ~8x H200 GPUs (~600GB+ VRAM in FP8).
V0.1.0: this is the first public release. Software at this stage typically has bugs, unstable APIs, and incomplete documentation. It should not be treated as production-ready without extensive validation.
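For a sense of scale on the local-deployment requirement, here is a back-of-the-envelope estimate of weight memory alone. It assumes every one of the 1.02T parameters is held in memory at one byte each (FP8) and uses the H200's 141 GB of memory per GPU; it ignores KV cache, activations, and runtime overhead, so treat it as a rough check rather than a sizing guide.

```python
TOTAL_PARAMS = 1.02e12    # total parameters, as quoted in this article
BYTES_PER_PARAM_FP8 = 1   # FP8 stores one byte per parameter
H200_MEMORY_GB = 141      # memory capacity of one NVIDIA H200
GPU_COUNT = 8             # the "~8x H200" quoted in this article

# Assumption: all weights are resident in GPU memory at FP8.
weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM_FP8 / 1e9
cluster_gb = GPU_COUNT * H200_MEMORY_GB
headroom_gb = cluster_gb - weights_gb

print(f"Weights at FP8:     {weights_gb:,.0f} GB")
print(f"8x H200 capacity:   {cluster_gb:,.0f} GB")
print(f"Left for the rest:  {headroom_gb:,.0f} GB")
```

Under these assumptions the weights alone come to about 1,020 GB, which is well above the ~600GB floor quoted above and close to what eight H200s provide. The quoted figure's derivation is not given, so the two numbers may rest on different assumptions.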

The pattern: SantanderAI → Xiaomi → who is next?

The pattern is clear: open-source from unexpected sources is redefining AI infrastructure. First Santander — a European bank open-sourcing its AI governance stack. Now Xiaomi — a Chinese phone manufacturer releasing a coding agent that challenges the market leader.

What do these two have in common? They are not AI companies. They are companies that depend on AI and decided that the infrastructure layer is not a competitive differentiator. Santander opened guardrails and bridges because every bank needs them. Xiaomi opened a coding agent because the model is a commodity — the harness is what matters.

At Tech86, we see this pattern accelerating. Companies in non-AI sectors will increasingly open-source AI infrastructure. The next one could be an automaker, a retailer, or a logistics company. When infrastructure is open-source, differentiation migrates to the application — and that is where we help companies compete: adopting the open-source layer and differentiating on integration, calibration, and operations.

