Alibaba's 27B Model Outguns Its Own Massive MoE on Coding
Qwen3.6-27B, a dense 27B-parameter model, reportedly beats Alibaba's own Qwen3.5-397B-A17B on major coding benchmarks.
Alibaba just dropped Qwen3.6-27B, an open-weight dense language model packing 27 billion parameters. The headline claim: it surpasses the company's own Qwen3.5-397B-A17B — a far larger Mixture-of-Experts model with 397B total parameters and 17B active — on major coding benchmarks.
That's a dense model punching well above its weight class against an MoE architecture designed to be more efficient at scale. On paper the dense model actually activates more parameters per token (27B versus the MoE's 17B active), but the MoE has roughly fifteen times the total capacity to draw on. A 27B model beating it on coding tasks signals serious architectural or training improvements under the hood.
Qwen3.6-27B is available as open weights on Hugging Face, ModelScope, and through the Qwen team's Discord. It's a straight dense transformer: no sparse routing tricks, no conditional computation, just 27B parameters doing the work.
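If you want to try it locally, the usual Hugging Face transformers loading flow should apply. The sketch below is a minimal example, not an official quickstart: the repo ID `Qwen/Qwen3.6-27B` and the chat-template usage are assumptions modeled on how earlier Qwen releases were published, so check the official model card for exact names and recommended generation settings.

```python
# Minimal sketch: loading an open-weight Qwen release with Hugging Face transformers.
# The repo ID below is an assumption based on earlier Qwen releases, not a confirmed name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3.6-27B"  # hypothetical repo ID; verify on the model card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native dtype (typically bf16)
    device_map="auto",    # shard the weights across available GPUs
)

messages = [{"role": "user", "content": "Write a Python function that merges two sorted lists."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```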
For developers and researchers, this is a practical win. Smaller dense models are easier to deploy, quantize, and reason about than massive MoE setups, which need expert-routing machinery and far more total memory even though they activate fewer parameters per token.
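A rough back-of-the-envelope comparison makes the deployment gap concrete. The figures below use only the parameter counts quoted above and assume 16-bit weights; they ignore KV cache, activations, and quantization, so treat them as order-of-magnitude estimates rather than serving requirements.

```python
# Back-of-the-envelope weight-memory estimate at 16-bit precision (2 bytes per parameter).
# Uses only the parameter counts quoted in the article; real deployments also need
# memory for the KV cache and activations, and quantization shrinks these numbers.
BYTES_PER_PARAM = 2  # bf16 / fp16

def weight_gb(params_billion: float) -> float:
    return params_billion * 1e9 * BYTES_PER_PARAM / 1e9  # gigabytes of raw weights

dense_27b  = weight_gb(27)    # ~54 GB: fits on a single 80 GB GPU
moe_total  = weight_gb(397)   # ~794 GB: every expert must be resident to serve
moe_active = weight_gb(17)    # ~34 GB of weights touched per token

print(f"Qwen3.6-27B weights:         {dense_27b:.0f} GB")
print(f"Qwen3.5-397B total weights:  {moe_total:.0f} GB")
print(f"Qwen3.5-397B active / token: {moe_active:.0f} GB")
```

Even though the MoE touches fewer weights per token, serving it means keeping all 397B parameters in fast memory, which is the practical reason a strong dense 27B is attractive.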